Workflow
evaluation
X @Easy
Easy· 2025-06-26 13:33
Holy shit the dollar is speed running to 0. If we actually get 7 rate cuts, this thing will be like 85 if we're lucky. https://t.co/Hh3KypOkfJ ...
Getting Started with LangSmith (5/6): Automations & Online Evaluation
LangChain· 2025-06-25 01:12
Hi, today we're going to talk about automations and online evaluations in LangSmith. Automations are powerful rules you can configure to run over every trace sent to your production application. Online evaluations are a type of automation that helps you measure metrics on your application's output. Unlike offline evaluations, which we covered in a previous video, online evaluations are run on live production user interactions, not on a curated dataset. So let's see how to set up automations. We'll start by ret ...
Getting Started with LangSmith (3/6): Datasets & Evaluations
LangChain· 2025-06-25 01:05
- Code: https://github.com/xuro-langchain/eli5 - Learn more about LangSmith: https://www.langchain.com/langsmith/?utm_medium=social&utm_source=youtube&utm_campaign=q2-2025_onboarding-videos_co - Get started with LangSmith for free: https://smith.langchain.com/ - Docs: https://docs.smith.langchain.com/ ...
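Vote | National Top 10 High-End Works for the First Half of 2025
CRIC Real Estate Research (克而瑞地产研究)· 2025-06-19 09:16
Starting June 17, public online voting opens for the projects shortlisted for the National Top 10 Works of the first half of 2025. The H1 2025 product evaluation of Chinese real estate developers has entered its competitive stage, and the public online voting channel is officially open from June 17 until 12:00 noon on June 23. During this period, a two-day expert review of the shortlisted projects takes place on June 17-18. Vote for the projects you consider exemplars of high-end, affordable-luxury, and quality housing, and pick your own idea of a "good home". To help voters learn about the projects in detail, the shortlisted projects will also be showcased together online once again. Answering the call of the "good homes" strategy, developers have been actively building product strength, working to solve residential pain points and raise the quality of housing products. Since 2018, CRIC has run its "Product Power 100" evaluation of Chinese developers for eight consecutive years. Voting rules: the high-end, affordable-luxury, and quality categories share a single voting channel; each WeChat ID may vote only once; each person may select at most ten projects per category; the final count is the cumulative total of valid votes from all participants. The 2025 product-strength evaluation is now underway, and a new "China Good Homes" evaluation has been added alongside the "National Top 10 Works". Following the organizing committee's preliminary review, the shortlist for the first-half "Top 10 Works" has been produced and has moved on to expert review and public voting. Shortlisted projects will go through expert review, public voting, and the evaluation model, and at the end of June the "National Top 10 High-End/Affordable-Luxury/Quality Works" and "China Good ...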
No Code LangSmith Evaluations
LangChain· 2025-06-18 15:10
Hey, this is Lance from LangChain. Evaluations are one of the most important ways to build effective agents. And we wanted to lower the barrier to entry so that anyone, not just developers, can very easily run evaluations on agents that you're building. So, we've recently added the ability to run evaluations on LangGraph agents directly from LangGraph Studio. This is an agent, Open Deep Research, that we've developed over the past few months. It's a very popular repo and many people use it. Some of the people tha ...
Scite Expands Extensive Publisher Partnership Network With American Society For Microbiology Indexing Agreement
Prnewswire· 2025-06-18 12:00
Core Insights
- Research Solutions' Scite platform has signed an indexing agreement with the American Society for Microbiology (ASM), enhancing its position in the competitive AI research landscape [1][2]
- The partnership adds ASM's extensive portfolio of peer-reviewed journals to Scite's network, which now includes over 30 major publishers and provides access to more than 1.3 billion indexed citations [2][5]
- Scite's Smart Citations technology offers deeper insights into citation patterns, allowing researchers to understand how microbiology research is referenced and utilized [3][5]
Company Overview
- Research Solutions is a vertical SaaS and AI company that simplifies research workflows for academic institutions and life science companies globally, combining AI-powered tools with access to both open access and paywalled research [7]
- The company emphasizes ethical sourcing of information through direct partnerships with publishers, distinguishing itself from competitors that scrape content [5]
Industry Context
- The indexing agreement with ASM reflects the growing importance of citation intelligence in modern research evaluation, providing a richer framework for understanding scholarly impact [4][5]
- ASM, established in 1899, is a leading organization in microbiological research, advocating for open science and evidence-based public policies [8]
Databricks CEO on evaluating AI agents
CNBC Television· 2025-06-12 14:45
What is a bottleneck perhaps that CEOs, CIOs, executives aren't talking enough about? Yeah, I would say this thing that we call evaluations or benchmarks. Like, it doesn't matter if the agent can crush it at programming contests or, you know, do Math Olympiad really well and it's smarter than us in math. We want it to do a specific job at the company. How do we know how it's doing? That's called evaluations or benchmarks. So that's what we focused on when we launched Agent Bricks, and it's a way to do agent learni ...
Vortex Energy Announces Plans to Complete an Ambient Noise Tomography Survey at the Robinsons River Salt Project
Globenewswire· 2025-06-12 12:00
Vortex Energy Plans Geophysical Survey at Robinsons River Salt Project to Assess Hydrogen Storage Potential
VANCOUVER, British Columbia, June 12, 2025 (GLOBE NEWSWIRE) -- Vortex Energy Corp. (CSE: VRTX) (OTC: VTECF) (FSE: AA3) (“Vortex” or the “Company”) is pleased to announce plans for a detailed geophysical survey at its flagship Robinsons River Salt Project (the “Project”) in Newfoundland. This marks the start of summer exploration activities that are focused on evaluating the project’s suitability for la ...
2025: What Opportunities Remain in the AI Agent Space?
Hu Xiu· 2025-05-26 08:16
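Since the start of 2025, the development of AI Agents has clearly accelerated. On May 6, OpenAI announced it would acquire Windsurf for US$3 billion; Anysphere, the parent company of the coding tool Cursor, raised US$900 million at a valuation of US$9 billion; Manus, billed as China's first general-purpose AI Agent, raised US$75 million in May in a round led by the veteran Silicon Valley venture capital firm Benchmark; and OpenAI launched Operator, which can use a browser autonomously, in January, followed in February by Deep Research, which focuses on complex tasks. Both products drew attention quickly after launch, and many users have since become heavy users. In this article we discuss: which key capabilities underpinned the technical leap in Agents? Which type of Agent is most likely to become the general-purpose Agent of the future? And what opportunities remain for ordinary entrepreneurs in the Agent space? We invited Tao Fangbo, founder of MindVerse (心识宇宙), and AI product manager Kolento Hou to talk about the core technology behind AI Agents, their experience using popular products, startup opportunities and challenges, and where AI Agents are headed. The following are selected highlights from the conversation: 1. The RTF-driven Agent boom. Hongjun: To start, could our two guests share how often they have been using Agents lately ...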
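50-Year Deadlock Broken! MIT's Latest Proof: For Algorithms, a Little Memory Beats a Lot of Time
Synced (机器之心)· 2025-05-25 03:51
Selected from Quanta Magazine. Author: Ben Brubaker. Compiled by Synced. Most of us have been through this: a program is running and the computer suddenly freezes, so that at best you recover your files and at worst you recreate them from scratch; or your phone keeps popping up "out of memory" warnings, forcing you to reluctantly delete treasured photos or apps. These everyday annoyances all point to two fundamental elements of the computing world: time and space. Time and space (also called memory) are the two most basic resources in computation: any algorithm needs a certain amount of time to execute and occupies a certain amount of space to store data while it runs. For some tasks, the previously known algorithms required space roughly proportional to their running time, and researchers long assumed this could not be improved. New research by MIT theoretical computer scientist Ryan Williams establishes a mathematical procedure that can transform any algorithm, regardless of the task it performs, into a form that uses significantly less space, proving that a small amount of computational memory (space) is in theory more valuable than a large amount of computational time. This overturns what computer scientists had believed for nearly 50 years. 50 years of exploration and bottlenecks: in 1965, Juris Hartmanis and Richard Stearns co-published two groundbreaking papers that were the first to treat "time" (Time) and "space" (Spa ...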