Workflow
大语言模型
icon
Search documents
英特尔深入零售门店打造“智慧大脑”,重点发力海外
Feng Huang Wang· 2025-05-09 02:45
Core Insights - Intel is leveraging AI and computing power to transform retail experiences, enabling features like facial recognition for personalized recommendations and quick checkout processes [1] - At the 25th China Retail Industry Expo, Intel showcased smart retail solutions in collaboration with partners, emphasizing the role of AI technologies in retail transformation [1] Group 1: Smart Retail Solutions - Intel's smart retail architecture combines edge computing and endpoint devices, utilizing its Core Ultra processors and Xe graphics for various retail functionalities [1] - The endpoint devices powered by Intel's Core Ultra processors support functions such as smart shopping assistance, stock alerts, product recommendations, and advertising, aimed at reducing operational costs [1] - Edge devices, supported by Core Ultra processors and multiple Xe graphics cards, facilitate store management tasks like compliance checks and customer flow analysis [1] Group 2: AI POS Solutions - Intel's AI POS solutions are built on different levels of computing platforms, optimized with Intel's oneAPI and OpenVINO toolkits for flexible algorithm models [2] - The company aims to break the price war cycle with its initiatives and plans to launch another Edge AI project this year to promote retail devices in overseas markets [2]
挑战AI数学推理极限!大规模形式化数学基准FormalMATH发布,最强模型成功率仅16%
量子位· 2025-05-07 09:33
Core Insights - The FormalMATH benchmark test, developed by institutions such as The Chinese University of Hong Kong and Zhejiang University, consists of 5,560 rigorously validated mathematical problems, covering various fields from Olympiad level to undergraduate courses, and is 22.8 times larger than existing benchmarks [1][5][4]. Group 1: Performance of LLMs - The performance of current LLM-driven theorem provers is significantly below expectations, with the best model, Kimina-Prover, achieving a success rate of only 16.46% under resource constraints [3][15]. - Most models perform close to random guessing in calculus and other areas, indicating a substantial capability gap [3][7]. - There is a notable domain bias, with better performance in algebra compared to weaker results in calculus [11][12]. Group 2: Error Analysis - Common error patterns include: - Redundant assumptions (34%): Introducing irrelevant premises [16]. - Incomplete proofs (62%): Missing critical steps in the proof [16]. - Misuse of automation strategies (65%): Incorrectly applying automated tools [16]. - Inability to handle inequalities correctly (13%): Over-reliance on automated inequality calculation strategies [16]. - The analysis shows that LLM provers often resort to shortcut tactics, which leads to significant errors [14]. Group 3: Future Directions - To enhance the formal reasoning capabilities of LLMs, three areas of focus are proposed: - Strengthening multi-step planning to reduce reliance on single-step tactics [19]. - Cross-domain generalization through curriculum learning to balance training data across different mathematical fields [19]. - Development of interactive proof-assistance tools for collaboration between LLMs and human experts [19]. Group 4: Open Source Initiative - The research team has made the FormalMATH benchmark's code, training data, and evaluation models publicly available, encouraging collaboration between academia and industry to advance formal mathematical reasoning technologies [20][21].
搞不懂CUDA的人有救了,Devin开发商开源Kevin,强化学习生成CUDA内核
机器之心· 2025-05-07 04:34
| 机器之心报道 | | --- | 编辑:蛋酱、泽南 本周三,知名 AI 创业公司,曾发布「全球首个 AI 软件工程师」的 Cognition AI 开源了一款使用强化学习,用于编写 CUDA 内核的大模型 Kevin-32B 。 Kevin-32B 基于 QwQ-32B 在 KernelBench 数据集上使用 GRPO 进行了多轮强化学习训练,实现了超越 o3 和 o4-mini 的顶级推理表现。 对此,机器学习社区表现出了极大的兴趣。有人表示期待 DeepSeek R1 风格的训练方法用来提升代码效率已久,这回终于有人站出来了。 在一篇博客中,Cognition AI 详细介绍了新模型强化学习训练的机制。 代码是一个不断迭代的过程 —— 需要我们编写、执行程序,评估结果,并根据反馈优化代码。大语言模型(LLM)在代码生成方面的最新进展尝试将此过程融入 推理阶段,并使用并行采样等方法。虽然这些方法是有效的,但它们依赖于搜索而非实际学习 —— 在这其中模型权重被冻结。 Cognition AI 探索了多轮强化学习,使用来自环境的中间反馈,并屏蔽模型思维以避免在多轮训练中上下文爆炸。 他们提出的模型 Kev ...
AI赋能保险业变革:从经验到数据智能驱动的跨越
Huan Qiu Wang· 2025-05-06 08:17
Core Insights - The insurance industry is undergoing a transformation driven by the integration of artificial intelligence (AI) technologies, as highlighted by Wang Min, the Executive Vice President of ZhongAn Insurance, at the 2025 Insurance Technology Summit [1] - The summit's theme focused on the strategic advancement and application innovation of AI in the insurance sector, marking a shift from the internet era to the AI era [1] Group 1: AI's Impact on Financial Services - The application of large language models is reshaping the operational philosophies, business logic, and value creation models of financial institutions, leading to two significant trends: precision in financial services and cross-industry ecological collaboration [2][4] - Financial services are becoming more precise, with banks optimizing credit assessment systems using real-time business data and social media dynamics, while brokerages leverage knowledge graphs for market predictions [2] - Cross-industry collaborations are emerging, such as insurance companies partnering with healthcare platforms to develop preventive insurance based on real-time health data [4] Group 2: Transformation in the Insurance Sector - The rise of large language models is prompting a fundamental shift in the insurance industry from experience-driven to data intelligence-driven approaches [4] - ZhongAn Technology has developed an intelligent platform tailored to the insurance sector, utilizing over 600 million user data points and creating more than 200 specialized AI agents [4] - The internal AI platform of ZhongAn is being utilized over 50 million times monthly, demonstrating significant engagement and application [4] Group 3: AI Applications and Efficiency Gains - AI is being integrated across the entire value chain of ZhongAn, from product design and marketing to underwriting, claims, quality inspection, and internal IT management [5] - The implementation of AI has drastically reduced product configuration time from several days to hours and decreased costs by 80% [5] - AI-driven customer service has achieved a 95% accuracy rate, with a 90% intervention rate, resulting in significant cost savings [5] Group 4: Future Directions and Collaborative Efforts - The insurance industry is expected to experience fundamental changes in core elements like risk pricing due to advancements in AI, leading to a reshaping of business models and organizational structures [6] - Differentiated technology strategies are recommended for various sizes of insurance companies, with larger firms focusing on AI infrastructure investment and smaller firms emphasizing tool-based applications for efficiency [6] - ZhongAn Technology aims to build an AI ecosystem through collaborations, including the establishment of an "AI + Insurance Joint Laboratory" with partners to enhance model capabilities and integrate technology into insurance operations [6]
当答案变得廉价时,好问题就是新的稀缺品
3 6 Ke· 2025-05-04 00:03
Group 1 - The core argument of the article is that in an era where answers are easily accessible, the value lies in asking the right questions, which can reshape understanding and drive creativity [1][4][19] - The invention of photography in the 1830s challenged traditional artistic standards, leading artists to focus on subjective experiences rather than mere replication of reality [3][10][11] - The emergence of large language models (LLMs) has made obtaining answers cheaper, but this has led to a decline in the quality of inquiry and an increase in the cost of asking good questions [15][17][26] Group 2 - The article emphasizes that the value of information is proportional to the uncertainty it eliminates, as illustrated by Claude Shannon's information theory [21][22][23] - It argues that in a world of information overload, the challenge is not the lack of facts but the misalignment of attention, leading to a focus on quantity over quality in answers [31][32][46] - The piece highlights the importance of redefining problems and frameworks to navigate structural uncertainties effectively, suggesting that good questions can expand the boundaries of understanding [37][38][39]
315 行代码构建编程助手,Go大佬揭开智能体的「神秘面纱」
机器之心· 2025-05-03 04:18
Core Viewpoint - Thorsten Ball has successfully built a programming agent using 315 lines of code, emphasizing that it runs well and lacks a competitive moat, making it easily replicable [1]. Group 1: Programming Agent Development - The programming agent, while not as advanced as Claude or Gemini, serves as a valuable learning example for beginners, reflecting Ball's philosophy of demystifying technology through practical and open-source projects [3]. - The construction of a small agent requires less than 400 lines of code, primarily consisting of boilerplate code, and involves a large language model, a loop, and sufficient tokens [4][10]. - The core functionality of the agent allows for a conversational interface with Claude, where it maintains context across multiple exchanges [13]. Group 2: Tool Integration - A significant aspect of the agent's functionality is its ability to use tools, defined as prompts that instruct the model on how to respond when it wants to utilize a specific tool [15]. - The process of defining tools involves specifying a name, description, input schema, and an execution function, which collectively enable the model to understand and utilize the tools effectively [22][24]. - The agent can autonomously determine when to use a tool based on the context of the conversation, demonstrating a level of independence in problem-solving [40]. Group 3: Practical Implementation - The agent's implementation includes a method to check if Claude requests a tool, executing it if necessary, and returning the results back to Claude [37][38]. - The example provided illustrates how the agent can read a file and respond to queries about its contents, showcasing its practical application in real-world scenarios [39][40]. - Additional tools such as list_files and edit_file can be integrated into the agent, further enhancing its capabilities [41].
ICML 2025放榜!接收率26.9%,高分被拒,低分录用惹争议
机器之心· 2025-05-02 04:39
Core Insights - The 42nd International Conference on Machine Learning (ICML) will be held in Vancouver, Canada, from July 13 to 19, 2025, with a significant increase in submissions this year [1] - A total of 12,107 papers were submitted, marking a 28% increase from the previous year, with an acceptance rate of 26.9% as 3,260 papers were accepted [1] - The article discusses both high-quality accepted papers and controversial rejected papers, providing a platform for discussion among researchers [1] Accepted High-Quality Papers - Spotlight papers are the highest recommended by ICML, including notable titles such as "Neural Discovery in Mathematics" and "Monte Carlo Tree Diffusion for System 2 Planning" [3][5] - The paper "MARS: Unleashing the Power of Variance Reduction for Training Large Models" achieved an average score of 4.25, showcasing a variance reduction adaptive optimizer framework with a convergence rate of (T⁻²/³), outperforming AdamW's (T⁻¹/²) [7][8] - "EmbodiedBench: Comprehensive Benchmarking Multi-modal Large Language Models for Vision-Driven Embodied Agents" received an average score of 4.5 and includes 1,128 test tasks across various environments [10] Controversial Rejected Papers - Some rejected papers, despite high evaluations, have raised concerns among researchers regarding the review process [12][15] - Instances of low-quality reviews and discrepancies in scoring have been reported, with some researchers receiving incomplete and irrelevant feedback [18][20] - The article highlights the contradictions in the review process, where some papers with lower scores were accepted while higher-scoring papers were rejected [12][20]
唐兴资本:睿见果敢,洞察投资项目潜藏的巨大价值
Sou Hu Cai Jing· 2025-05-02 02:58
Group 1 - The emergence of DeepSeek, a large model comparable to ChatGPT, has created significant waves in the global technology and capital markets, igniting enthusiasm for innovation and investment opportunities in the tech sector [3] - Tangxing Capital focuses on discovering and nurturing high-growth potential hard tech companies, aiming to drive industrial upgrades and regional economic development through a comprehensive support system [3][4] - The investment team at Tangxing Capital possesses deep industry backgrounds and professional investment capabilities, allowing them to accurately grasp technology development trends and identify quality projects [3][4] Group 2 - Young entrepreneurs like Liang Wenfeng and Wang Xingxing exemplify the characteristics of contemporary tech leaders, showcasing strong learning abilities and rapid application of new technologies [4][5] - These entrepreneurs break traditional thinking and industry boundaries, integrating resources across sectors to create new application scenarios and business models [5][6] - Key traits admired in successful entrepreneurs include innovation spirit, cross-disciplinary integration ability, strategic vision, and focus on core business areas [6] Group 3 - The investment style of Tangxing Capital is characterized by "insightful decisiveness," emphasizing the ability to quickly identify and act on investment opportunities [7] - A notable investment decision involved a significant investment in Plater, a key player in the 3D printing industry, despite market uncertainties, which later yielded a tenfold return [9] - Plater's technology addresses complex manufacturing needs in aerospace, automotive, and medical sectors, significantly contributing to China's manufacturing transformation [8][9] Group 4 - The current bull market is driven by a combination of macroeconomic stability, loose monetary policy, and positive market sentiment, creating a conducive environment for investment [10][11] - The bull market enhances the financing environment for primary markets, encouraging entrepreneurship and accelerating company growth through increased funding [12][13] - The interaction between primary and secondary markets fosters a cycle of investment and exit opportunities, optimizing resource allocation and enhancing economic vitality [14]
苹果公司CEO库克:仍然对公司的人工智能(AI)和大语言模型(LLM)路线图感到兴奋。
news flash· 2025-05-01 21:53
苹果公司CEO库克:仍然对公司的人工智能(AI)和大语言模型(LLM)路线图感到兴奋。 ...