Workflow
LLM
Tracing Claude Code to LangSmith
LangChain· 2025-12-19 21:05
Are you curious about what Claude Code is doing behind the scenes? Or do you want observability into the critical workflows that you've set up with Claude Code? Hey, I'm Tanish from LangChain, and we built a Claude Code to LangSmith integration so that you can see each step that Claude Code takes, whether that be an LLM call or a tool call. It's pretty fascinating to see the entire trace, so I want to show you what this looks like. I have a project here. It's a very, very simple agent that I built with ...
Three Weeks Before Leaving, LeCun Issues a Final Warning: Silicon Valley Has Fallen into a Collective Delusion
36Kr· 2025-12-16 07:11
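LeCun is done holding back. Three weeks before he packs up and leaves Meta, he has delivered a parting slap to all of Silicon Valley: the large models you believe in are all a bubble and will never lead to AGI! Large models are a dead end that cannot lead to AGI! In an in-depth interview released today, LeCun said bluntly: the so-called path to ASI, which means continually training LLMs, feeding them more synthetic data, hiring thousands of people to "discipline" the system in post-training, and coming up with a few new RL tricks, is in my view complete nonsense! This path simply does not work, and it will never succeed. After 12 years at the company, the 65-year-old Turing Award winner, now about to storm out of Meta, has become increasingly strident in his public statements. In the nearly two-hour conversation, LeCun's points were incisive: Silicon Valley's obsession with endlessly scaling up LLMs is a dead end, and the hardest problem in AI is that it remains at the level of "cat and dog" intelligence rather than human level. He is now staking his lifelong academic reputation on an entirely different path for AI: "world models". In the interview, LeCun also described the "world model" being built by his startup AMI (Advanced Machine Intelligence), which makes predictions in an abstract representation space rather than simply producing pixel-level output. A few days earlier, in a heated debate with Google DeepMind's Adam Brown, LeCun made his classic argument again: LLMs are not that ...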
Insurers and AI, a systemic risk
Freakonometrics· 2025-11-25 05:00
Core Viewpoint - Major insurers are retreating from providing coverage for risks associated with artificial intelligence due to the potential for multibillion-dollar claims and systemic risk posed by correlated losses across multiple incidents [1][2][12] Group 1: Insurers' Response to AI Risks - Insurers like AIG, Great American, and WR Berkley are introducing explicit exclusions for AI-related risks, particularly concerning agents and language models [1] - The potential losses related to AI could reach several hundreds of millions of dollars, with the primary concern being the possibility of simultaneous, massive losses that cannot be mutualized [1][2] Group 2: Systemic Risk and Interconnectedness - The interconnected nature of AI systems creates a breeding ground for contagion, where a single error can propagate rapidly across a network, affecting thousands of users simultaneously [5][10] - Financial systems exhibit a "robust-yet-fragile" dynamic, where they can withstand numerous shocks but may collapse suddenly when a specific shock travels through interconnected channels [3][4] Group 3: Challenges in Insurability - Insurability relies on the law of large numbers, which requires events to be independent; however, cyber risks and generative AI create environments where losses are highly correlated and difficult to attribute [6][8] - Generative AI amplifies the structural fragility of cyber insurance, as a single defect or vulnerability can lead to widespread, identical losses across an entire sector [7][8] Group 4: Legal and Regulatory Implications - The issue of "AI liability" remains largely unexplored, with significant contractual asymmetry where AI providers limit their liability and transfer risk to users [19][20] - This creates a regulatory gap, a contractual gap, and an insurance gap, leading to a legal systemic risk characterized by diffuse responsibility and concentrated dependency [23]
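The point about the law of large numbers in the summary above can be made concrete with a short calculation. The sketch below is illustrative only and is not taken from the article: it assumes n identical policies whose losses each have standard deviation `sigma` and share one pairwise correlation `rho`. Under that assumption the variance of the average loss is sigma^2 * (rho + (1 - rho) / n), so pooling more policies cannot push the risk below sigma * sqrt(rho).

```python
import math


def std_of_average_loss(n: int, sigma: float, rho: float) -> float:
    """Standard deviation of the average loss across n policies.

    Assumes (for illustration) equal variances sigma**2 and a single
    pairwise correlation rho between any two policies' losses.
    """
    variance = sigma**2 * (rho + (1.0 - rho) / n)
    return math.sqrt(variance)


if __name__ == "__main__":
    # Independent losses shrink toward zero as the pool grows;
    # correlated losses level off at sigma * sqrt(rho).
    for n in (10, 1_000, 100_000):
        independent = std_of_average_loss(n, sigma=1.0, rho=0.0)
        correlated = std_of_average_loss(n, sigma=1.0, rho=0.3)
        print(f"n={n:>7}  independent={independent:.4f}  correlated={correlated:.4f}")
```

The values of `sigma` and `rho` here are arbitrary; the floor under the correlated case is what makes such losses hard to mutualize.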
From Stateless Nightmares to Durable Agents — Samuel Colvin, Pydantic
AI Engineer· 2025-11-24 20:16
Pydantic AI Products & Features - Pydantic AI supports Temporal and other durable execution frameworks, with ongoing efforts to integrate more workflow orchestration backends [1] - Pydantic AI offers tools for building AI agents, including the ability to perform web searches and analyze data [11][41] - Pydantic AI's Temporal agent handles the IO needed to call an LLM, including tool calls, by turning them into activities [16] - Pydantic AI is developing a gateway for buying inference from various models, including observability features [61] Temporal & Durable Execution - Temporal is highlighted as a leading solution for durable execution, crucial for long-running workflows where progress preservation is essential [2] - Temporal records every activity and its inputs/outputs, enabling rerun from any point by plugging in the answers [15] - Temporal enables the resumption of workflows without adding resume code to the agent code [29] - Temporal's retry logic handles runtime errors and ensures continuous operation [22][25] Deep Research & Agent Architecture - Deep research is presented as analogous to a 20 questions game, with web search or RAG as intermediate steps [11] - The company is shifting towards viewing agents as micro-tasks that form larger autonomous task completion systems [40] - A deep research agent can be composed of multiple specialized agents, such as a plan agent, a search agent, and an analysis agent [41] Evaluation & Performance - Pydantic AI evals are used to compare the performance of different models, considering factors like cost, speed, and accuracy [33] - Gemini was initially found to be faster and cheaper, but was later discovered to sometimes invent incorrect answers [33][35]
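The idea of recording every activity result and replaying it on rerun can be shown with a toy journal. This is a minimal sketch under stated assumptions, not Temporal's or Pydantic AI's API: `Journal`, `call_llm`, `web_search`, and `workflow` are hypothetical names, results are keyed only by call order, and the "LLM" and "search" are stubs. It also assumes the workflow issues activities in the same order on every run.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Callable

# Counts how often the (stubbed) side-effecting activities really execute.
calls = {"llm": 0, "search": 0}


class Journal:
    """Hypothetical append-only record of activity results, keyed by call order."""

    def __init__(self, path: Path) -> None:
        self.path = path
        self.results: list[Any] = json.loads(path.read_text()) if path.exists() else []
        self.cursor = 0

    def run_activity(self, fn: Callable[..., Any], *args: Any) -> Any:
        if self.cursor < len(self.results):
            # Replay: reuse the recorded answer instead of redoing the IO.
            result = self.results[self.cursor]
        else:
            result = fn(*args)
            self.results.append(result)
            self.path.write_text(json.dumps(self.results))
        self.cursor += 1
        return result


def call_llm(prompt: str) -> str:
    """Stub standing in for a real LLM call."""
    calls["llm"] += 1
    return f"plan for {prompt}"


def web_search(query: str) -> str:
    """Stub standing in for a real web search."""
    calls["search"] += 1
    return f"results for {query}"


def workflow(journal: Journal, question: str) -> str:
    # The workflow body contains no resume logic; the journal supplies it.
    plan = journal.run_activity(call_llm, question)
    return journal.run_activity(web_search, plan)


def demo() -> tuple[str, str, dict[str, int]]:
    calls.update(llm=0, search=0)
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "journal.json"
        first = workflow(Journal(path), "durable agents")
        second = workflow(Journal(path), "durable agents")  # simulated restart
    return first, second, dict(calls)


if __name__ == "__main__":
    print(demo())
```

The second run returns the same answer without invoking either stub again, which is the property the summary describes as resuming without adding resume code to the agent. A production system would also record inputs, handle retries, and enforce deterministic workflow code.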
Ice Cold, Zen-Like Investing With Alex King
Seeking Alpha· 2025-10-26 20:00
AI and Technology Sector - The AI demand cycle is still in its early stages, with significant growth in GPU shipments, server shipments, and data center builds expected to continue [6][7][8] - Many large companies are adopting AI but are struggling with use cases and understanding the true economics of implementation [8][9] - The current excitement around AI may lead to a "trough of disillusionment," where valuations could drop as reality catches up with expectations [12][14] - Nvidia's valuation is considered reasonable based on its growth and margins, but concerns exist about potential competition affecting its market share [15][26] Semiconductor Industry - The semiconductor sector has seen a significant run-up in prices, with the SOXX ETF moving from 148 to 290 over six months [56][58] - There is a possibility that the semiconductor sector may become a source of funds for investors, as profits are taken and capital is rotated into other sectors [57][64] - Intel is positioned to benefit from government support and reshoring of semiconductor manufacturing, but its fundamentals remain weak [65][69] Tesla - Tesla's stock is viewed positively due to potential synergies with xAI, despite challenges in its core automotive business [34][38] - The market perception of Tesla is driven more by Elon Musk's leadership than by traditional automotive fundamentals [41][42] Gold Market - Gold prices are perceived to have risen too quickly, driven by fear rather than fundamental economic indicators [43][45] - The current demand for gold is seen as a reaction to global uncertainties, but there is skepticism about its sustainability at current price levels [48][50] Quantum Computing - The quantum computing sector has experienced significant momentum, but the long-term viability of smaller companies in this space remains uncertain [30][32] - Government investments may provide temporary support, but stock prices are currently viewed as overvalued relative to fundamentals [32][33] 
Cryptocurrency - The cryptocurrency market is characterized by high volatility, with Bitcoin and Ether seen as having potential upside, while lower-order coins are viewed with caution [74][84] - The use of ETFs for cryptocurrency investments is recommended as a safer alternative to direct holdings [86]
AI Chips: A Big Bubble?
半导体行业观察· 2025-10-21 00:51
Core Viewpoint - The article discusses the current state of the AI industry, comparing it to the internet bubble of 1999-2000, highlighting the rapid rise in valuations and the potential risks associated with companies like Coreweave [3][5]. Valuation and Market Trends - As of September, the Nasdaq composite index had a P/E ratio of 33, with major companies like Amazon, Apple, Google, Microsoft, Meta, and TSMC ranging from 27 to 39 [6]. - Nvidia's P/E ratio is notably high at 52, reflecting its leadership in the AI sector, while AMD's P/E has surged to 140 following its partnership with OpenAI [6][7]. - GenAI revenue is experiencing rapid growth, with predictions of AI data center investments reaching $5 trillion by 2030, primarily from large, profitable companies [6][7]. Adoption Rates and Consumer Behavior - GenAI adoption is accelerating, with ChatGPT reaching 100 million users in just two months, significantly faster than other platforms like TikTok and Facebook [6][11]. - A consumer AI market valued at $12 billion has emerged within two and a half years, with 60% of U.S. adults using AI in the past six months [11][12]. Enterprise Use Cases and Productivity - GenAI is expected to be the largest market, with significant applications in enhancing productivity, particularly in programming and financial analysis [13][14]. - Companies like Walmart and Salesforce are leveraging AI to avoid hiring additional staff while still achieving growth [14][15]. Competitive Landscape and Future Outlook - The cost of training advanced models is projected to reach billions, limiting participation to companies with substantial resources [16]. - Major players like Anthropic, AWS, Google, and Microsoft are expected to dominate, while smaller companies may need to specialize in niche markets [30][31]. - The article suggests that multiple winners may emerge in the GenAI space, as differentiation and ecosystem bundling are likely to occur [40]. 
Hardware and Infrastructure Challenges - The demand for data center capacity is surging, with predictions that the scale of data centers will grow significantly by 2026 [32]. - There are concerns about the adequacy of power supply to meet the growing needs of AI data centers, with projections indicating that AI could consume a substantial portion of the U.S. electricity supply by 2024 [38][39].
Will Tool-Integrated RL Be the Key for Agent Applications to Break Through "Base Model Capability Limits"?
机器之心· 2025-09-21 01:30
Core Insights - The article discusses the evolution of AI agents, emphasizing the need for enhanced reasoning capabilities through Tool-Integrated Reasoning (TIR) and Reinforcement Learning (RL) to overcome limitations in current AI models [7][8][10]. Group 1: AI Agent Development - The term "Agent" has evolved, with a consensus that stronger agents must interact with the external world and take actions, moving beyond reliance on pre-trained knowledge [8][9]. - AI systems are categorized into LLM, AI Assistant, and AI Agent, with the latter gaining proactive execution capabilities [9][10]. - The shift from simple tool use to TIR is crucial for agents to handle complex tasks that require multi-step reasoning and real-time interaction [10][12]. Group 2: Tool-Integrated Reasoning (TIR) - TIR is identified as a significant research direction, allowing agents to understand goals, plan autonomously, and utilize tools effectively [10][12]. - The transition from supervised fine-tuning (SFT) to RL in TIR is driven by the need for agents to actively learn when and how to use external APIs [12][14]. - TIR enhances the capabilities of LLMs by integrating external tools, enabling them to perform tasks that were previously impossible, such as complex calculations [12][13]. Group 3: Practical Implications of TIR - TIR allows for empirical support expansion, enabling LLMs to generate previously unattainable problem-solving trajectories [12][14]. - Feasible support expansion through TIR makes complex strategies practically executable within token limits, transforming theoretical solutions into efficient strategies [14][15]. - The integration of tool usage into the reasoning process elevates the agent's ability to optimize multi-step decision-making through feedback from tool outcomes [15].
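The loop that Tool-Integrated Reasoning relies on, where the model decides to call a tool and then continues from the tool's output, can be sketched as follows. Everything here is an assumption for illustration: `stub_policy` is a hard-coded stand-in for an LLM, the `CALL`/`TOOL_RESULT`/`ANSWER` text protocol is invented for this example, and no RL training is shown. In an RL setup, a reward would be computed from the final answer and used to update the policy that chooses when and how to call tools.

```python
import ast
import operator
from typing import Callable

_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}


def calculator(expression: str) -> str:
    """Evaluate a plain arithmetic expression without using eval()."""

    def ev(node: ast.AST):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.left), ev(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -ev(node.operand)
        raise ValueError("unsupported expression")

    return str(ev(ast.parse(expression, mode="eval").body))


TOOLS: dict[str, Callable[[str], str]] = {"calculator": calculator}


def stub_policy(transcript: list[str]) -> str:
    """Hypothetical stand-in for an LLM: call the tool once, then answer."""
    last = transcript[-1]
    if last.startswith("TOOL_RESULT "):
        return "ANSWER " + last.removeprefix("TOOL_RESULT ")
    return "CALL calculator 123456789 * 987654321"


def run_agent(question: str, policy: Callable[[list[str]], str], max_steps: int = 4) -> str:
    transcript = [question]
    for _ in range(max_steps):
        action = policy(transcript)
        transcript.append(action)
        if action.startswith("ANSWER "):
            return action.removeprefix("ANSWER ")
        # Action format assumed here: "CALL <tool_name> <argument>"
        _, tool_name, argument = action.split(" ", 2)
        transcript.append("TOOL_RESULT " + TOOLS[tool_name](argument))
    raise RuntimeError("no answer within the step budget")


if __name__ == "__main__":
    print(run_agent("What is 123456789 * 987654321?", stub_policy))
```

The exact product comes from the tool rather than from the stubbed model, which mirrors the summary's claim that tool use makes some calculations feasible that token-by-token generation handles poorly.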
Urgent Hiring + Fast-Track Interviews | Li Auto Senior Product Manager, AI Applications
理想TOP2· 2025-09-16 15:04
Group 1 - The company is seeking a senior AI product manager with a competitive salary and a collaborative team environment focused on product innovation [2] - The role involves managing AI applications across multiple platforms, including LLM and AIGC, with a focus on user interaction and comprehensive solution planning [3] - The company emphasizes a real and open team atmosphere that encourages valuable ideas and rapid market feedback [2] Group 2 - Candidates should have over 3 years of experience in AI product applications or strategy, particularly in scenarios with over 10,000 daily active users [4] - Strong project management skills are required to lead complex projects to successful completion [5] - The ideal candidate should possess excellent data analysis and logical thinking abilities, with a deep understanding of user needs [6] Group 3 - The company values proactive learning and the ability to adapt to industry changes and emerging applications [7] - Candidates should demonstrate clear logic and the ability to think systematically to solve problems and coordinate complex business operations [8]
Why Can't Large Models Handle Software Development? The Root Cause Lies in…
程序员的那些事· 2025-09-08 00:57
Core Viewpoint - The article discusses the limitations of Large Language Models (LLMs) in software development, emphasizing that while LLMs can generate code and assist with simple tasks, they struggle with maintaining clear cognitive models necessary for complex problem-solving [5][14][15]. Group 1: LLM Capabilities - LLMs can perform routine engineering tasks such as reading code, writing tests, and debugging, but they often fail to maintain a coherent understanding of the code's behavior [8][15]. - They can generate code quickly and are effective in organizing requirement documents for straightforward tasks [15][16]. Group 2: Limitations of LLMs - LLMs cannot maintain two similar cognitive models simultaneously, which leads to confusion in determining whether to modify the code or the requirements [14][20]. - They often assume their generated code is flawless and struggle to adapt when tests fail, lacking the ability to validate their work against a clear mental model [9][14][22]. Group 3: Future Improvements - There is potential for improvement in LLMs, but significant changes to their underlying architecture are necessary to enhance their problem-solving capabilities beyond mere code generation [12][21]. - The article suggests that while LLMs currently have shortcomings, their rapid evolution indicates that they may become more competent in software development tasks in the future [21][22]. Group 4: Human vs. LLM Collaboration - The article advocates for human oversight in software development, asserting that LLMs should be viewed as tools rather than replacements for human engineers [17][19]. - It highlights the importance of human engineers in ensuring clarity in requirements and the actual effectiveness of the code produced [16][17].
X @Bitget
Bitget· 2025-08-25 10:02
Event Overview - Bitget Onchain Trading Competition 40: trade $LLM, $DONKEY, and $ULTI to share 20,000 $BGB in rewards [1] - Event period: August 25, 11:00 AM to August 28, 10:59 AM (UTC) [1] Participation Details - How to participate: register for the event, then trade LLM, DONKEY, and ULTI on BitgetOnchain [1] - Registration link: https://tco/ATPMGAzp79 [1] Reward Structure - Ranks 1-5: 100 BGB [1] - Ranks 7-100: 50 BGB [1] - Ranks 101-500: 20 BGB [1] - Ranks 501-1840: 5 BGB [1]