Large Language Models

Four AI heavyweights leave Apple, three of them Chinese
36Kr · 2025-09-04 02:13
The recent high-profile Meta talent raid makes it easy to forget one thing: AI talent has always moved around a great deal, and "poached with a higher salary" has never been the only reason. Noted Bloomberg reporter Mark Gurman reports that Apple has lost another four AI heavyweights: Jian Zhang, Apple's lead AI researcher for robotics, plus three researchers from Apple's foundation models team, Nan Du, Zhao Meng, and John Peebles. At least two things stand out. First, the departures are concentrated: three of the four were on the foundation models team. Second, the share of Chinese researchers remains high: all of the four except John Peebles are Chinese. That looks like Meta's usual poaching pattern, but this time Meta has little to do with it: of the four, only Jian Zhang went to Meta. Nan Du and John Peebles went to OpenAI, while Zhao Meng joined Anthropic. Meta poached Apple's robotics AI lead. Joining in 2015 and leaving now, Jian Zhang spent a full decade at Apple. His LinkedIn profile shows that when he left he was head of robotics research in Apple's AI and Machine Learning (AIML) division. Unlike Tesla with its humanoid robot project, Apple treats robotics as a key component of its future product line. According to Bloomberg, Apple has a range of devices ...
LionTeng Holdings launches breakthrough multi-model large language model platform Geene M2
Zhi Tong Cai Jing· 2025-09-04 00:11
Group 1
- The core offering of LionTeng Holdings is the launch of its multi-model large language model (LLM) platform, Geene M2, which integrates various leading LLMs and utilizes a proprietary neural intelligence routing engine to dynamically select the best model for each user [1]
- Geene M2 serves as a unified AI ecosystem that enhances efficiency and speed by routing queries to the most suitable LLM based on conversation type and user intent, establishing a new benchmark for enterprise AI [1]
- The platform introduces features such as multi-response comparison and intelligent answer fusion technology, allowing businesses to evaluate outputs from different LLMs and create richer content [1]

Group 2
- Geene M2's AI programming module aids enterprises in accelerating digital transformation by converting natural language prompts into functional code, thereby reducing reliance on technical development and shortening development cycles [2]
- The platform enhances productivity by generating responsive front-end applications and bridging different systems, enabling faster innovation and efficient scaling for businesses in the digital economy [2]

Group 3
- Geene's AI Vault is a cloud storage platform that transforms traditional file storage into a powerful intelligent resource center, automatically organizing and associating content for instant retrieval through natural language queries [3]
- The AI Vault utilizes Retrieval-Augmented Generation (RAG) technology to ensure that stored data is "AI-ready," facilitating quicker decision-making and operational efficiency for enterprises [3]

Group 4
- Geene M2 offers advanced features with a pricing strategy aimed at widespread adoption, including a free plan with 10GB storage, a professional plan at $18 per month with 20GB storage, and an enterprise plan at $36 per month with 60GB storage [4]
- The tiered pricing model allows companies of various sizes to access advanced AI technologies at a lower cost compared to traditional enterprise software, making Geene M2 one of the most cost-effective and scalable AI platforms available [4]
- The global AI solutions market is projected to grow from $150 billion in 2024 to over $500 billion by 2029, with Geene M2 focusing on key application scenarios in finance, business, and digital assets to capitalize on this growth opportunity [4]
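The article does not describe how Geene M2's routing engine actually works. As a rough illustration of intent-based model routing in general, here is a minimal sketch in which every name (the keyword classifier, the registry, the model names) is hypothetical; a production router would classify intent with a model, not keywords.

```python
# Minimal sketch of routing a query to the most suitable LLM.
# All names here are hypothetical; this is not Geene M2's implementation.

def classify_intent(query: str) -> str:
    """Toy intent classifier: keyword-based; real routers use a learned model."""
    q = query.lower()
    if any(k in q for k in ("def ", "function", "bug", "code")):
        return "coding"
    if any(k in q for k in ("prove", "integral", "equation")):
        return "math"
    return "general"

# Hypothetical registry mapping intent to a preferred backend model.
ROUTES = {
    "coding": "code-specialist-llm",
    "math": "reasoning-llm",
    "general": "general-llm",
}

def route(query: str) -> str:
    """Return the backend model chosen for this query."""
    return ROUTES[classify_intent(query)]
```

A multi-response comparison feature like the one described would instead fan the same query out to several backends and rank or fuse the answers.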
New Apple research: how can AI question-asking efficiency surge 6.5x with no fine-tuning and no retraining?
36Kr · 2025-09-02 09:45
Core Insights
- Apple has been relatively low-profile in the AI wave centered around large language models (LLMs), but it has produced notable research outcomes, such as the efficient visual language model FastVLM that can run directly on iPhones [1]
- A recent collaboration between Apple, Oxford University, and City University of Hong Kong introduced a new method called BED-LLM, which enhances AI problem-solving capabilities by 6.5 times, increasing success rates from 14% to 91% without the need for fine-tuning or retraining [1][18]
- The key to this breakthrough lies in teaching AI to ask the right questions [1]

Group 1
- The BED-LLM method addresses a significant limitation of LLMs, which struggle to adaptively gather information from users or external environments, often leading to "multi-turn amnesia" [3][4]
- The method employs a sequential Bayesian experimental design framework to formulate interactive information-gathering tasks as sequential experimental design problems, maximizing expected information gain (EIG) with each question [5][7]
- The approach involves updating beliefs based on user responses and selecting the next question accordingly, akin to a scientific experiment [8][9]

Group 2
- BED-LLM is characterized by three key insights:
  1. It focuses on genuine information gain rather than superficial uncertainty, ensuring that questions yield maximum value [12]
  2. It employs a sample-then-filter strategy to maintain logical consistency, preventing LLMs from forgetting previous constraints [16]
  3. It uses a targeted conditional generation strategy to generate questions that effectively narrow down hypotheses [17]

Group 3
- The effectiveness of BED-LLM was validated against two mainstream benchmarks, showing superior performance in tasks such as a 20-question guessing game and movie preference recommendations [18]
- The method demonstrated a significant increase in success rates, with a notable example being the success rate rising from 14% to 91% when predicting celebrities using Mistral-Large [18]
- In a stress test involving different models for questioning and answering, BED-LLM maintained its performance advantages, showcasing its robustness in real-world scenarios [20][22]

Group 4
- This research illustrates how a rigorous mathematical framework can transform LLMs from passive knowledge repositories into proactive, efficient information gatherers, potentially leading to more intelligent dialogues in future AI interactions [24]
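The EIG-driven question selection described above can be illustrated with a toy 20-questions example. This is not Apple's implementation: in BED-LLM the hypotheses and predicted answers come from the LLM itself, whereas here they are a small hand-written table; the sketch only shows how expected information gain ranks candidate questions.

```python
import math

# Toy 20-questions setting. HYPOTHESES is the current belief over answers;
# ANSWERS stands in for an answer model (both hypothetical, hand-written here).
HYPOTHESES = {"cat": 0.4, "dog": 0.3, "eagle": 0.2, "shark": 0.1}
ANSWERS = {
    "Is it a mammal?": {"cat": "yes", "dog": "yes", "eagle": "no", "shark": "no"},
    "Does it fly?":    {"cat": "no",  "dog": "no",  "eagle": "yes", "shark": "no"},
}

def entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

def expected_information_gain(question):
    """EIG = H(prior over hypotheses) - E_answer[H(posterior)]."""
    prior_h = entropy(HYPOTHESES.values())
    # Probability of each answer under the prior.
    p_ans = {}
    for hyp, p in HYPOTHESES.items():
        a = ANSWERS[question][hyp]
        p_ans[a] = p_ans.get(a, 0.0) + p
    # Expected posterior entropy, averaging over possible answers.
    exp_post = 0.0
    for a, pa in p_ans.items():
        post = [p / pa for h, p in HYPOTHESES.items() if ANSWERS[question][h] == a]
        exp_post += pa * entropy(post)
    return prior_h - exp_post

# Ask the question that is expected to shrink uncertainty the most.
best = max(ANSWERS, key=expected_information_gain)
```

"Is it a mammal?" splits the belief 0.7/0.3 and wins here; a question whose answer is already nearly certain would score close to zero, which is the sense in which EIG rewards genuine information gain rather than superficial uncertainty.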
New Apple research: how can AI question-asking efficiency surge 6.5x with no fine-tuning and no retraining?
机器之心· 2025-09-02 09:33
Core Viewpoint
- The article discusses a new method called BED-LLM developed by Apple in collaboration with Oxford University and City University of Hong Kong, which enhances the problem-solving capabilities of AI by 6.5 times without the need for fine-tuning or retraining [1][20]

Group 1: Introduction to BED-LLM
- Apple has been relatively low-profile in the AI landscape dominated by large language models (LLMs), but it has produced significant research outcomes like FastVLM [1]
- The BED-LLM method allows AI to improve its question-asking capabilities, leading to a success rate increase from 14% to 91% [1][20]

Group 2: Challenges with LLMs
- LLMs struggle to adaptively gather information from users or external environments, often leading to "multi-turn amnesia" in which they forget previous constraints [4][16]
- Enhancing LLMs' ability to ask targeted questions based on real-time feedback is essential for effective interaction [5]

Group 3: Mechanism of BED-LLM
- The BED-LLM framework employs sequential Bayesian experimental design to formulate interactive information-gathering tasks as sequential experimental design problems [7][9]
- The process involves maximizing expected information gain (EIG) with each question asked, updating beliefs based on user responses, and selecting the next question accordingly [10][11]

Group 4: Innovations in BED-LLM
- The method incorporates three key innovations:
  - **Insight One**: Focus on genuine information gain rather than superficial uncertainty, ensuring that questions yield maximum value [14]
  - **Insight Two**: A sample-then-filter strategy to maintain logical consistency and prevent LLMs from contradicting previous answers [16][17]
  - **Insight Three**: A targeted conditional generation strategy that allows LLMs to generate questions that effectively narrow down hypotheses [18]

Group 5: Performance Validation
- The research team compared BED-LLM against two mainstream benchmarks, demonstrating superior performance in tasks like the "20 Questions" game and movie preference recommendations [20]
- In various datasets, BED-LLM significantly improved success rates, with Mistral-Large achieving a success rate of 91% in celebrity prediction tasks [20][21]

Group 6: Real-World Application
- The team conducted a "model cross-server chat" stress test, showing that BED-LLM maintains its performance advantages even when the questioning and answering AIs use different models [23][24]
- This indicates the robustness of BED-LLM in real-world scenarios where user thought processes differ from AI models [24]

Group 7: Conclusion
- The research illustrates how a rigorous mathematical framework can transform LLMs from passive knowledge repositories into proactive, efficient information gatherers, paving the way for more intelligent AI interactions [26]
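The sample-then-filter strategy can be sketched as: sample candidate hypotheses, then discard any that contradict earlier answers in the conversation. The helper names and facts table below are hypothetical; in BED-LLM the candidates are sampled from the LLM and checked against the dialogue history, which is exactly the consistency check that prevents "multi-turn amnesia."

```python
# Illustrative sample-then-filter step (hypothetical names and data; the real
# method samples hypotheses from an LLM, not from a fixed list).

# Conversation history as (question, answer) constraints accumulated so far.
history = [("Is it a mammal?", "yes"), ("Does it live in water?", "no")]

# Stand-in for what an answer model would say about each candidate hypothesis.
FACTS = {
    "dolphin": {"Is it a mammal?": "yes", "Does it live in water?": "yes"},
    "cat":     {"Is it a mammal?": "yes", "Does it live in water?": "no"},
    "shark":   {"Is it a mammal?": "no",  "Does it live in water?": "yes"},
}

def consistent(hypothesis: str) -> bool:
    """Keep a sampled hypothesis only if it agrees with every past answer."""
    return all(FACTS[hypothesis][q] == a for q, a in history)

sampled = ["dolphin", "cat", "shark", "cat"]           # raw samples (may repeat)
filtered = [h for h in set(sampled) if consistent(h)]  # drop contradicting ones
```

Without the filter, a model might happily re-propose "dolphin" even after the user said the animal does not live in water; the filter is what enforces memory of earlier constraints.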
Copilot force-bundles Musk's new Grok model, developers collectively "resist"! GitHub engineer reveals: we were "coerced"
Sou Hu Cai Jing· 2025-08-30 06:49
Core Points
- GitHub is deepening its collaboration with Elon Musk's xAI by integrating the Grok Code Fast 1 large language model into GitHub Copilot, raising concerns about safety testing and working conditions within the engineering team [1][4][5]

Group 1: GitHub Copilot and Grok Code Fast 1
- Grok Code Fast 1 is being rolled out as an optional public preview for users of GitHub Copilot Pro, Pro+, Business, and Enterprise plans, with free access until September 2, 2025 [2][3]
- The model is designed specifically for coding tasks and provides visible reasoning trails in its responses, allowing programmers to iterate faster on complex projects [2][3]
- Users can enable Grok Code Fast 1 through the model selector in Visual Studio Code, with personal users having the option to use their own xAI API keys [3]

Group 2: Internal Concerns and Developer Reactions
- Eric Bailey, a senior engineer at GitHub, publicly criticized the rushed safety review process and claimed the engineering team felt pressured to integrate Grok Code Fast 1 against their values [4][5]
- The integration has sparked significant backlash among developers, with many expressing intentions to migrate to alternative platforms due to dissatisfaction with the collaboration [5][6]
- Some developers argue that the partnership with xAI could bring unique value to GitHub by enhancing tools for understanding model behavior and improving trust in automated workflows [6]
US stock movers | Robotics concept stocks spike intraday, Serve Robotics (SERV.US) surges over 15%
Zhi Tong Cai Jing· 2025-08-27 14:50
Core Viewpoint
- The recent launch of NVIDIA's Jetson Thor is seen as a significant advancement in robotics, enhancing AI computing power and enabling higher capabilities in humanoid robots [1]

Group 1: Stock Performance
- Several US robotics stocks experienced notable gains, with Serve Robotics rising over 15%, Richtech Robotics increasing nearly 14%, and iRobot up more than 3% [1]

Group 2: Technological Advancements
- NVIDIA's Jetson Thor features a Blackwell GPU and 128 GB of memory, achieving 2070 FP4 TFLOPS of AI computing power, 7.5 times that of the previous Jetson Orin model [1]
- This technological leap allows robots to process large amounts of sensory data and large language models (LLMs) in real time, enhancing their ability to see, think, and act [1]
TrendForce: humanoid robot chip market expected to top $48 million by 2028
Zhi Tong Cai Jing· 2025-08-26 07:49
Group 1
- TrendForce's latest research indicates that NVIDIA's newly launched Jetson Thor is expected to enhance computational power for robots, potentially expanding the chip market [1]
- The humanoid robot chip market is projected to exceed $48 million by 2028, driven by the adoption of Jetson Thor by companies like Agility Robotics, Boston Dynamics, and Amazon [1]
- Jetson Thor features a Blackwell GPU and 128 GB memory, achieving 2070 FP4 TFLOPS AI computing power, which is 7.5 times that of the previous Jetson Orin [1]

Group 2
- The International Federation of Robotics (IFR) suggests that humanoid robot development varies by country, with short-term focus on pilot projects, mid-term scaling in manufacturing and services, and long-term integration into daily household scenarios [4]
- TrendForce notes that despite the powerful performance of the NVIDIA Jetson Thor series, its development kit is priced at $3,499, significantly higher than the previous Jetson Orin's $1,499 [4]
- The industry trend aims to lower humanoid robot prices for broader adoption, indicating that companies planning simpler tasks may opt for more affordable chips [4]
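Taking the quoted figures at face value, the price jump still buys more compute per dollar. A back-of-envelope check (note the Orin TFLOPS value below is implied by the "7.5x" claim, not quoted directly in the article):

```python
# Compute-per-dollar comparison from the dev-kit figures above.
thor_tflops, thor_price = 2070, 3499   # Jetson Thor dev kit: FP4 TFLOPS, USD
orin_tflops = thor_tflops / 7.5        # = 276, implied by the "7.5x Orin" claim
orin_price = 1499                      # Jetson Orin dev kit, USD

# Thor delivers roughly 3.2x more TFLOPS per dollar despite costing 2.3x more.
ratio = (thor_tflops / thor_price) / (orin_tflops / orin_price)
```

So the $3,499 price point is steep in absolute terms, which matches the article's point that cost-sensitive robots doing simpler tasks may still prefer cheaper chips.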
A new path to stable reinforcement learning for large language models: Geometric-Mean Policy Optimization (GMPO)
机器之心· 2025-08-13 00:52
Lead authors: Zhao Yuzhong, a PhD candidate at the University of Chinese Academy of Sciences and an intern at Microsoft Research Asia (MSRA), working on multimodal learning and language model post-training; and Liu Yue, also a student at the University of Chinese Academy of Sciences. Advisors: Wan Fang, associate professor and doctoral supervisor at the School of Computer Science, University of Chinese Academy of Sciences; Ye Qixiang, professor and doctoral supervisor at the School of Electronics, University of Chinese Academy of Sciences; Cui Lei, principal research manager in the General AI (GenAI) group at Microsoft Research Asia; and Wei Furu, distinguished scientist in the General AI (GenAI) group at Microsoft Research Asia.

In recent years, reinforcement learning (RL) has delivered notable gains in fine-tuning large language models (LLMs), especially for improving reasoning ability. Traditional RL methods such as Proximal Policy Optimization (PPO) and its variants, including Group Relative Policy Optimization (GRPO), have shown strong potential on complex reasoning tasks. Yet despite performing well in many settings, they still suffer from instability during training, particularly when handling rewards with extreme importance weights. Geometric-Mean Policy Optimization (GMPO), a stabilized variant of GRPO, addresses this problem. This article takes a deep look at GM ...
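The instability GMPO targets can be seen in a toy contrast between an arithmetic mean (GRPO-style averaging) and a geometric mean over token-level importance ratios. The numbers below are made up purely to show the geometric mean's robustness to a single extreme ratio; this is a sketch of the central idea, not the paper's full objective.

```python
import math

def arithmetic_mean(xs):
    return sum(xs) / len(xs)

def geometric_mean(xs):
    # Computed in log space for numerical stability (requires positive values).
    return math.exp(sum(math.log(x) for x in xs) / len(xs))

# Token-level importance ratios pi_theta/pi_old; one token is an extreme outlier.
ratios = [0.9, 1.1, 1.0, 50.0]

am = arithmetic_mean(ratios)  # 13.25: dominated by the single outlier
gm = geometric_mean(ratios)   # ~2.65: far less sensitive to it
```

Because the geometric mean works in log space, one extreme importance weight shifts the aggregate only additively in the log domain, which is the intuition behind GMPO's more stable gradients.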
Musk announces Grok 4 will be free for all users for a limited time
Sou Hu Cai Jing· 2025-08-11 09:15
[Global Times Tech report] August 11: according to foreign media outlet AInvest, Elon Musk's xAI announced that its latest large language model, Grok 4, will be opened to all users free of charge for a limited time. While xAI described the free period as "limited," it did not specify when it will end. (Qingyun)
GPT-5 is here, and Microsoft is first to plug it in: one-click webpage generation, PhD-level intelligence, free for all users; Musk is not convinced
Sou Hu Cai Jing· 2025-08-08 04:45
Core Viewpoint
- OpenAI has launched its latest large language model, GPT-5, which is claimed to be the best model in the world, available for free to users after a wait of two and a half years since GPT-4's release [1][3][6]

Group 1: Model Features and Performance
- GPT-5 utilizes an integrated model architecture that automatically selects reasoning depth based on the task, eliminating the need for users to switch modes [3][5]
- The model shows significant improvements across coding, mathematics, writing, health, and visual perception [5][10]
- In coding tasks, GPT-5 achieved a first-attempt accuracy of 74.9% on the SWE-bench Verified benchmark, outperforming previous models [10][13]
- GPT-5's rate of factual errors is significantly lower than that of its predecessors, at 4.8% compared to 20.6% for GPT-4o [18]

Group 2: User Access and Pricing
- GPT-5 is accessible to all users, including free users with usage limits, while Plus and Pro users have higher usage quotas [5][10]
- API pricing for developers is $1.25 per million input tokens and $10 per million output tokens, cheaper than GPT-4o and other competitors [5][10]

Group 3: Safety and Customization
- OpenAI has introduced a new safety training method called "safe completions," which helps the model provide useful answers while avoiding unnecessary refusals [20][21]
- GPT-5 allows users to choose from four preset personalities for interaction, enhancing customization of the user experience [21]

Group 4: Market Impact and Investment
- The release of GPT-5 is expected to strengthen OpenAI's leading position in large-model technology, boosting investor confidence and aiding the company's valuation growth [31]
- OpenAI recently secured $8.3 billion in new capital, raising its valuation to $300 billion, which may be linked to the timing of GPT-5's launch [30][31]
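At the quoted API prices ($1.25 per million input tokens, $10 per million output tokens), estimating a bill is a one-liner; the workload numbers in the example are hypothetical.

```python
# Cost estimate at the GPT-5 API prices quoted above.
def gpt5_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return estimated USD cost for a given token workload."""
    return input_tokens / 1e6 * 1.25 + output_tokens / 1e6 * 10.0

# Hypothetical daily workload: 2M input tokens and 0.5M output tokens.
daily = gpt5_cost_usd(2_000_000, 500_000)  # $7.50 per day
```

Note the 8x input/output price asymmetry: for generation-heavy workloads, output tokens dominate the bill.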