Large Language Models
Guoxin Technology (688262.SH): CNN300 expected to support INT8/FP8/FP16 and other data types required by common AI applications
Ge Long Hui· 2025-09-05 08:20
Gelonghui, September 5 | Guoxin Technology (688262.SH) stated on the investor interaction platform that the CNN200 uses a GCU+NN network architecture, with single-core compute of up to 10 TOPS@INT8. It is suited to a wide range of edge-computing AI SoC chips and can be applied in many AI scenarios, including robot dogs. The core is implemented with systolic-array compute units; through coordinated optimization of dynamic power and memory area, combined with zero-copy data movement and mixed-precision computation, it effectively lowers energy consumption and latency and achieves an industry-leading energy-efficiency ratio. It integrates on-chip caching and inter-layer on-chip data sharing, markedly reducing DDR accesses. The hardware acceleration units cover more than 90 neural-network operators and feature a convenient extension-interface design, so they can be extended as neural-network models evolve. It supports post-training quantization (PTQ) with symmetric, asymmetric, per-layer, and per-channel quantization strategies; supports mainstream network structures such as CNNs and RNNs; is compatible with INT8 and FP16 data precision; and is compatible with mainstream deep-learning frameworks such as PyTorch, TensorFlow, ONNX, and PaddlePaddle, giving it broad ecosystem adaptability. The accompanying NPU toolchain covers tools for model format conversion, preprocessing, quantization, compilation, simulation, and other functions, providing software-ecosystem support for NPU inference and application deployment. The CNN200 does not support the FP8 data type, while the CNN300 under development is an NPU aimed at AIPC applications ...
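The quantization strategies listed above (symmetric vs. asymmetric, per-layer vs. per-channel) are standard PTQ techniques rather than anything specific to Guoxin's toolchain. As a point of reference only, here is a minimal NumPy sketch of what symmetric per-channel and asymmetric per-tensor INT8 quantization look like; all function and variable names are illustrative assumptions, not part of any vendor SDK.

```python
import numpy as np

def quantize_symmetric_per_channel(w: np.ndarray, axis: int = 0, n_bits: int = 8):
    """Symmetric per-channel PTQ: one scale per channel, zero-point fixed at 0."""
    qmax = 2 ** (n_bits - 1) - 1                                 # 127 for INT8
    reduce_axes = tuple(i for i in range(w.ndim) if i != axis)
    amax = np.max(np.abs(w), axis=reduce_axes, keepdims=True)    # per-channel max |w|
    scale = amax / qmax
    q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def quantize_asymmetric_per_tensor(x: np.ndarray, n_bits: int = 8):
    """Asymmetric per-tensor PTQ: one scale and zero-point for the whole tensor."""
    qmin, qmax = 0, 2 ** n_bits - 1                              # 0..255 for unsigned 8-bit
    x_min, x_max = float(x.min()), float(x.max())
    scale = (x_max - x_min) / (qmax - qmin)
    zero_point = int(round(qmin - x_min / scale))
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.uint8)
    return q, scale, zero_point

# Quick check of the reconstruction error on a random weight matrix
w = np.random.randn(16, 64).astype(np.float32)
q, s = quantize_symmetric_per_channel(w, axis=0)
print("max abs error:", np.max(np.abs(q.astype(np.float32) * s - w)))
```

Per-channel scaling usually tracks the weight distribution more tightly than a single per-tensor scale, which is why quantization toolchains typically expose both options.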
Apple (AAPL.US) plans to launch an AI search tool next year, teaming up with Google to upgrade Siri and challenge OpenAI
Zhi Tong Cai Jing· 2025-09-04 01:16
Core Insights
- Apple plans to launch its own AI-driven search tool next year, intensifying competition with OpenAI and Perplexity AI Inc. [1]
- The new system, internally dubbed "World Knowledge Answers," aims to integrate with Siri and potentially be applied to Safari and Spotlight [1][2]
- The initiative is part of a long-awaited upgrade for Siri, which is expected to be released in spring as an "answer engine" [1][2]

Group 1: Siri Upgrade and AI Integration
- The new search experience will feature an interactive interface that integrates text, photos, videos, and local points of interest, providing AI-driven summaries for faster and more accurate results [2]
- Current Siri struggles with complex queries and often defaults to Google or ChatGPT for answers, highlighting its shortcomings in Apple's AI capabilities [2]
- Apple is working on a comprehensive upgrade for Siri, which will allow it to utilize personal data and screen content for better query handling [4]

Group 2: Collaboration with Google
- Apple has reached an agreement with Google to evaluate AI models developed by Google for supporting Siri, indicating a collaborative approach to enhance Siri's capabilities [1][6]
- The new Siri will potentially use a customized version of Google's Gemini model, which will run on Apple's private cloud servers [6][7]
- Apple is also considering using Anthropic's Claude model but is currently leaning towards Google's technology due to more favorable financial terms [7]

Group 3: Future Developments and Strategic Moves
- Apple plans to visually revamp Siri and develop a health AI agent to support a paid health subscription service by 2026 [8]
- During the development of the new Siri, Apple has considered acquisitions, including discussions with Perplexity and Mistral, although it is no longer actively pursuing these options [9]
- The company is facing a talent exodus, with key members of its AI team leaving for competitors, which may impact its development efforts [9]
A large model with a "slightly worse memory" is actually smarter: Goldfish Loss randomly drops tokens so AI stops learning by rote
36Kr· 2025-09-03 23:54
If left unconstrained, large language models can easily reproduce their training data verbatim. To address this, a research team from the University of Maryland, the University of Tübingen, and the Max Planck Institute has proposed a new method: Goldfish Loss.

When training a large model, sometimes letting it have a "slightly worse memory" actually makes it smarter!

As the name suggests, Goldfish Loss makes the model behave like a goldfish: rather than memorizing every detail, it randomly excludes a small fraction of tokens when the loss function is computed. As a result, the model no longer memorizes the training set word for word, yet it still learns the regularities of language.

Experiments show that after training with Goldfish Loss, LLaMA-2:

One netizen's pithy summary: dropout, but for the loss function!

Randomly masking some tokens in the gradient computation

The core idea of Goldfish Loss is very simple: during training, randomly drop a fraction of the tokens in the training text so that they do not participate in the loss computation. When the model then encounters those positions at inference time, it can only "guess" rather than reproduce the complete sequence from the training data word for word.

$\mathcal{L}_{\text{goldfish}}(\theta)=-\frac{1}{|G|}\sum_{i=1}^{L}G_{i}(x_{i})\log P(x_{i}|x_{<i};\theta)$

In addition, to ensure that the dropped tokens are chosen consistently, the researchers designed a ...
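To make the formula concrete, here is a minimal PyTorch-style sketch of a goldfish-style loss in which a fixed fraction of target positions is excluded from the cross-entropy sum. The drop-every-k-th-token rule below is a simplification chosen for illustration, not the authors' exact masking scheme, and all names are hypothetical.

```python
import torch
import torch.nn.functional as F

def goldfish_loss(logits: torch.Tensor, targets: torch.Tensor, k: int = 4) -> torch.Tensor:
    """Next-token cross-entropy that skips every k-th target token.

    logits:  (batch, seq_len, vocab_size) model outputs
    targets: (batch, seq_len) token ids the model should predict
    """
    batch, seq_len, vocab = logits.shape
    # G_i in the formula: 1 if the token contributes to the loss, 0 if it is "forgotten".
    positions = torch.arange(seq_len, device=targets.device)
    keep = (positions % k != k - 1).float()            # drop positions k-1, 2k-1, ...
    keep = keep.unsqueeze(0).expand(batch, -1)

    # Per-token negative log-likelihood, with the dropped positions zeroed out.
    nll = F.cross_entropy(
        logits.reshape(-1, vocab), targets.reshape(-1), reduction="none"
    ).view(batch, seq_len)
    return (nll * keep).sum() / keep.sum()             # normalize by |G|, the kept tokens

# Toy usage with random tensors standing in for a real model
logits = torch.randn(2, 16, 1000)
targets = torch.randint(0, 1000, (2, 16))
print(goldfish_loss(logits, targets, k=4))
```

Because the skipped positions never receive a gradient, the model is not trained to reproduce them, which is what breaks verbatim recall while leaving the rest of the language-modeling objective untouched.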
Baidu Vision Technology Department hiring for multimodal perception and understanding (experienced hires / campus recruitment / internships)
自动驾驶之心· 2025-09-03 23:33
Core Viewpoint
- The article focuses on recruitment opportunities in the field of video understanding and artificial intelligence, highlighting the responsibilities and requirements for various positions within the company [2][4][5]

Recruitment Responsibilities
- The company is looking for candidates to engage in cutting-edge algorithm research and development for video understanding, specifically targeting tasks such as video question answering, video summarization, temporal action localization, and event detection [2]
- Responsibilities also include building large-scale, high-quality multimodal datasets, distributed training of large models, and collaborating with business teams for practical application and innovation [2]

Job Requirements
- Candidates should possess a master's or doctoral degree in computer science, artificial intelligence, electronic information, automation, or related fields [4]
- Experience in top AI conferences or journals is preferred, particularly in areas like computer vision and multimodal learning [5]

Advantages of Joining
- The company offers a supportive environment with ample hiring capacity for new graduates, interns, and experienced hires, along with competitive salaries and benefits such as mentorship and participation in significant projects [6]

Community and Resources
- The article mentions a community platform for job seekers in autonomous driving and robotics, providing resources like interview questions, industry reports, and salary negotiation tips [7][19]
Apple to release its own AI web search tool, having reached a model agreement with Google
Feng Huang Wang· 2025-09-03 22:52
Core Insights
- Apple plans to launch its own AI-based web search tool next year to enhance competition with OpenAI and Perplexity AI [1]
- The new system, named "World Knowledge Answers," will be integrated into Siri and aims to provide a platform for users to query information across the web [1]
- A significant upgrade to Siri is expected in spring next year, with the initiative referred to as an "answer engine" [1]

Group 1
- Apple is developing a new search experience that will include an interface combining text, photos, videos, and local points of interest [2]
- The new system will feature an AI-driven summarization system to make search results faster, clearer, and more accurate than the current Siri [2]
- Apple has reached a formal agreement with Google to evaluate and test an AI model developed by Google to assist in driving the new Siri experience [2]
Midoo.AI launches: Can an AI Agent solve the education industry's hundred-billion-dollar "unsolvable equation"?
Founder Park· 2025-09-03 08:24
Learning languages with AI is becoming the choice of more and more people. We are used to downloading Duolingo or Babbel on our phones and grinding through levels in spare moments, feeling one step closer to the goal of "mastering a foreign language."

Traditional AI language learning won over large numbers of beginner learners during the past decade with gamified, bite-sized strategies. But when you try to rely on it for a real improvement in ability, you inevitably run into rigid content, mechanical feedback, and missing real-world scenarios. After all that studying your vocabulary has grown, yet faced with real-world conversation you are still stuck with a "mute foreign language."

These stumbles reflect, on one hand, the limits of the previous generation of products, and on the other, the tip of the iceberg of a deeper predicament across the education industry. The root of the education industry's problem lies in guaranteeing and delivering "learning outcomes" for users, and the answer to that problem is far more complex than imagined, pointing almost directly at the deeply personalized, "a thousand learners, a thousand faces" essence of education.

Today, a startup called Midoo.AI is offering its own answer with what it calls the world's first language-learning Agent. Its founder Mark previously launched Talk AI, pioneering the AI spoken-language practice category in China. This time, Mark has joined forces with Leo, former co-founder of Fellou.AI, starting from "Day One Global" with the goal of building the world's number one AI language-learning product. What they want to do may be more than just shaking up the 200-billion-dollar ...
A large model with a "slightly worse memory" is actually smarter! Goldfish Loss randomly drops tokens so AI stops learning by rote
量子位· 2025-09-03 05:49
Core Viewpoint
- The article introduces a new method called "Goldfish Loss" that allows large language models to avoid memorizing training data verbatim, thereby enhancing their ability to learn language patterns while reducing the risk of overfitting [1][4]

Group 1: Goldfish Loss Concept
- Goldfish Loss encourages models to forget specific details by randomly omitting a small portion of tokens during loss calculation [3][6]
- This method prevents the model from reproducing the training data word-for-word, while still enabling it to generate coherent text [4][9]
- The approach utilizes a hashing-based masking strategy to ensure consistency in the tokens that are omitted during training [8][14]

Group 2: Comparison with Traditional Methods
- Unlike traditional regularization methods like Dropout, which introduce noise randomly, Goldfish Loss employs a static masking technique to consistently omit the same tokens across training iterations [11][19]
- This consistency fundamentally prevents the model from memorizing complete training sequences, as it cannot piece together omitted tokens from different training instances [12][14]

Group 3: Experimental Results
- Experiments demonstrated that in extreme scenarios, standard training led to the model memorizing 84 out of 100 articles, while Goldfish Loss resulted in no memorization [22][24]
- In standard training scenarios, Goldfish Loss also significantly reduced the model's tendency to reproduce training data verbatim [24]
- Performance tests indicated no systematic differences in overall capabilities between models trained with Goldfish Loss and those trained with standard loss methods [26]

Group 4: Implications and Considerations
- The core of Goldfish Loss lies in ignoring certain tokens during gradient calculations, which may require the model to process more data to compensate for the omitted information, potentially affecting computational efficiency [28]
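The "hashing-based masking strategy" described above can be illustrated with a short sketch: the decision to drop a token depends only on a hash of the few tokens immediately before it, so a passage that appears multiple times in the training data is always masked in exactly the same places. The context width, hash function, and drop rate below are assumptions for illustration, not the paper's exact recipe.

```python
import hashlib
import torch

def goldfish_mask(targets: torch.Tensor, drop_rate: float = 0.25, context: int = 3) -> torch.Tensor:
    """Return a (batch, seq_len) 0/1 mask; 0 means the token is excluded from the loss.

    The decision is a deterministic function of the preceding `context` token ids,
    so a duplicated passage is masked identically every time it is seen.
    """
    batch, seq_len = targets.shape
    mask = torch.ones(batch, seq_len)
    for b in range(batch):
        for i in range(context, seq_len):
            key = ",".join(str(t) for t in targets[b, i - context:i].tolist())
            digest = hashlib.md5(key.encode()).digest()
            if digest[0] / 256.0 < drop_rate:          # map first hash byte to [0, 1)
                mask[b, i] = 0.0
    return mask

targets = torch.randint(0, 1000, (2, 16))
print(goldfish_mask(targets, drop_rate=0.25))
```

This is what distinguishes the approach from dropout in the article's comparison: dropout draws fresh noise on every pass, whereas a context-hashed mask hides the same tokens on every pass, so the model can never stitch a memorized sequence together across epochs.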
[Livestream on September 9] Complex reasoning technology for large models: how it reshapes AI reasoning logic
机器人大讲堂· 2025-09-03 04:19
Core Viewpoint
- The article discusses the evolution of large language models from "fast thinking" to "slow thinking" paradigms, emphasizing the importance of deep reasoning and logical coherence in AI development [2]

Group 1: Slow Thinking Technology
- The new model DeepSeek-R1 enhances long reasoning chain capabilities through reinforcement learning, demonstrating superior understanding and decision-making in complex tasks [2]
- "Slow thinking" technology is identified as a key pathway for advancing large models towards higher intelligence levels, leading the industry towards greater automation and reliability [2]

Group 2: Seminar Details
- A seminar titled "AI Slow Thinking: Complex Reasoning Technology of Large Models" was organized by Springer Nature, featuring Professor Zhao Xin from Renmin University of China, who shared insights on the latest research in slow thinking technology [2][6]
- Dr. Chang Lanlan, the Director of Computer Science Book Publishing at Springer Nature, discussed the new AI book resources and academic publishing in 2025 [2][6]

Group 3: Speaker Profiles
- Professor Zhao Xin has a research focus on information retrieval and natural language processing, with over 200 published papers and significant contributions to large language models [8]
- Dr. Chang Lanlan has extensive experience in computer science book publishing and has been with Springer Nature for 14 years, overseeing AI-related publications [11]

Group 4: Book Recommendations
- A new book led by Professor Zhao Xin and his team provides a systematic framework for learners in the large model field, aiming to help readers grasp core concepts and cutting-edge algorithms [19]
- The Springer Nature AI electronic book collection offers a comprehensive resource for research and learning, covering a wide range of topics from foundational knowledge to advanced research outcomes [21]
Weifang moves government services from "handled online" to "handled intelligently" as "digital government service officers" take up their posts
Da Zhong Ri Bao· 2025-09-03 02:42
Core Insights
- The article highlights the implementation of intelligent government services in Weifang City, focusing on the use of AI technologies to enhance efficiency and user experience in administrative processes [1][2][3]

Group 1: Intelligent Customer Service
- Weifang has developed an intelligent customer service system that allows users to inquire about administrative matters using natural language, significantly improving the search process for required services [1]
- The knowledge base for this system includes over 21,000 user-friendly entries derived from 1,974 administrative service items, achieving an accuracy rate of over 90% in responding to online inquiries [1]

Group 2: Intelligent Pre-Approval System
- The "Ai Xiaowei" system consolidates 126 approval-related policies and over 8,000 historical cases to create a smart pre-approval rules library, enhancing the efficiency of the approval process by 30% [2]
- This system has already assisted in processing 285 business applications and identified over 400 material errors within a month of its launch [2]

Group 3: Streamlined Approval Process
- Weifang's government has developed a decision tree for 12 initial approval items, breaking them down into over 1,000 decision nodes to create 30 standardized guidance scenarios, which helps users navigate the approval process more effectively [2]
- The introduction of intelligent guidance has led to a 60% increase in the first-time approval rate for submitted materials [2]

Group 4: Overall Innovation in Government Services
- The integration of AI technologies in Weifang's administrative services represents a significant innovation, transitioning from traditional online services to a more intelligent and user-friendly approach [3]
- The "three-in-one" intelligent service model combines intelligent customer service, pre-approval, and guidance, positioning itself as a comprehensive digital government service solution [3]
Research Report Picks | Pacific Securities maintains "Buy" rating on Changying Precision as humanoid-robot progress accelerates
Ge Long Hui APP· 2025-09-02 09:41
Gelonghui, September 2 | A research report from Pacific Securities notes that Changying Precision achieved net profit attributable to shareholders of 306 million yuan in 2025H1, down 29.37% year on year, in line with expectations, with steady growth in consumer electronics and new energy and accelerated capacity construction for humanoid robots. In the humanoid-robot business, thanks to the company's early positioning, revenue from overseas humanoid-robot parts exceeded 35 million yuan in the first half of 2025, versus only 10.11 million yuan for all of 2024. In addition, during the reporting period the company won volume-production orders from several domestic humanoid-robot brands. Against the backdrop of the global energy transition and green development, the new-energy industry will continue its trend of rapid growth and innovation. In embodied-intelligence humanoid robots, catalyzed by generative AI and large language models, the field has entered the commercialization stage and is expected to become a disruptive product following computers, smartphones, and new-energy vehicles, profoundly transforming how people live and work. The "Buy" rating on the company is maintained. ...