Artificial Intelligence
Just In: DeepSeek's New Year's Day Paper, Co-Authored by Liang Wenfeng, Aims to Open a New Chapter in Architecture
机器之心· 2026-01-01 08:22
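On the first day of the new year, DeepSeek released a new paper proposing an architecture called mHC (Manifold-Constrained Hyper-Connections). The work aims to resolve the instability of conventional hyper-connections in large-scale model training while preserving their significant performance gains. In short, mHC widens the single residual stream of the conventional Transformer into a multi-stream parallel architecture and uses the Sinkhorn-Knopp algorithm to constrain the connection matrices to the manifold of doubly stochastic matrices. This resolves the numerical instability and signal explosion that hyper-connections (HC) suffer in large-scale training because they break the identity-mapping property. The paper has three first authors: Zhenda Xie (解振达), Yixuan Wei (韦毅轩), and Huanqi Cao. Notably, DeepSeek founder and CEO Liang Wenfeng (梁文锋) is also on the author list. The conventional residual connection (the x + F(x) structure in the Transformer) relies on "identity mapping" to guarantee lossless signal propagation and training stability. Its bottleneck is that the width of the information channel is limited by the hidden dimension C. Although these methods deliver significant performance gains, they also introduce two serious problems: they fundamentally break the identity-mapping property inherent to residual connections, which leads to severe training instability and limited scalability, and they add significant memory-access overhead. ...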
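As a rough illustration of the constraint described in this summary, the sketch below applies Sinkhorn-Knopp normalization (alternating row and column rescaling) to push a positive matrix toward a doubly stochastic one, then checks that mixing parallel streams with that matrix leaves their total unchanged. It is not the paper's implementation: the function names, the 4-stream size, the exponential used to make entries positive, and the iteration count are all assumptions chosen for illustration.

```python
import math


def sinkhorn_knopp(scores, iterations=100):
    """Rescale exp(scores) toward a doubly stochastic matrix.

    `scores` is a square list of lists of floats (a hypothetical,
    unconstrained mixing matrix). Rows and columns are normalized alternately.
    """
    n = len(scores)
    # Exponentiate so every entry is strictly positive (an illustrative choice).
    m = [[math.exp(v) for v in row] for row in scores]
    for _ in range(iterations):
        # Make every row sum to 1.
        m = [[v / sum(row) for v in row] for row in m]
        # Make every column sum to 1.
        col_sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
        m = [[m[i][j] / col_sums[j] for j in range(n)] for i in range(n)]
    return m


def mix_streams(matrix, streams):
    """Mix scalar stream values: out[i] = sum_j matrix[i][j] * streams[j]."""
    return [sum(w * x for w, x in zip(row, streams)) for row in matrix]


# Hypothetical unconstrained scores for 4 parallel residual streams.
scores = [
    [0.9, -0.3, 0.1, 0.4],
    [-0.7, 0.8, 0.2, -0.1],
    [0.3, 0.0, 1.0, -0.6],
    [-0.2, 0.5, -0.4, 0.7],
]
mixing = sinkhorn_knopp(scores)
streams = [1.0, -2.0, 0.5, 3.0]
mixed = mix_streams(mixing, streams)

row_sums = [sum(row) for row in mixing]
col_sums = [sum(mixing[i][j] for i in range(4)) for j in range(4)]
print("row sums:", [round(s, 6) for s in row_sums])
print("column sums:", [round(s, 6) for s in col_sums])
print("stream total before:", sum(streams), "after:", round(sum(mixed), 6))
```

Because every column sums to 1, the total across streams is conserved by the mix. Because every row sums to 1 and all entries are positive, each output is a weighted average of the inputs, so repeated mixing cannot make the largest value grow. That is one intuition for why such a constraint would limit signal explosion; the paper's own argument may differ.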
Fully Upgraded OpenDataArena Officially Launches: Four Core Modules Reshape Data Value Assessment
机器之心· 2026-01-01 08:22
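To break the long-standing impasse in which academia and industry have struggled to quantify the value of data, the OpenDataLab team at Shanghai Artificial Intelligence Laboratory (Shanghai AI Lab) open-sourced OpenDataArena (ODA), the first comprehensive and fair platform for evaluating the value of post-training data, in August of this year. The project aims to turn data selection from the "blind trial-and-error" of alchemy into a rigorous science that is reproducible, analyzable, and cumulative. In the months after the initial release, the project underwent intensive technical validation and feature refinement through in-depth use by the team and a small group of community users. With the continued expansion of its evaluation scale, toolchain, and analysis capabilities, we have now arrived at a full upgrade of ODA: an official release with more systematic conclusions, more complete functionality, and more diverse perspectives, open to all developers. Project homepage: https://opendataarena.github.io/ Open-source tools: https://github.com/OpenDataArena/OpenDataArena-Tool Datasets: https://huggingface.co/OpenDataArena/datasets Report: https://arxiv.org/pdf/2512.14051 ODA's core philosophy is very clear: the value of data must ...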
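Another Heavyweight Joins Hong Kong's AI Stocks: MiniMax Launches Global Offering, 14 Cornerstone Investors Subscribe for Over HK$2.723 Billion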
Zheng Quan Shi Bao Wang· 2026-01-01 08:11
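Following Zhipu, Hong Kong's AI stock sector is about to gain another heavyweight. MiniMax (00100.HK) opened its share offering on December 31, 2025; the offering closes on January 6, 2026, and the company plans to list on the Hong Kong Stock Exchange on January 9, 2026. According to the prospectus, MiniMax plans to issue 25,389,220 shares in this IPO at a price range of HK$151 to HK$165. Assuming the offer size adjustment option and the over-allotment option are not exercised, the valuation at offering would be between HK$46.123 billion and HK$50.399 billion. The company says about 90% of net proceeds will go to R&D over the next five years, including development of its large models and AI-native products. Specifically, about 70% of net proceeds will be invested in large-model R&D over the next five years, and about 20% will go to developing, improving, and globally scaling AI-native products over the same period. Notably, MiniMax has brought in 14 cornerstone investors, including Aspex, Eastspring, Mirae Asset, Alibaba, and E Fund, with total subscriptions of US$350 million (about HK$2.723 billion). The investors span international long-only funds, leading technology companies, Chinese long-only funds, and industrial strategic investors. Among them, Abu Dhabi's ADIA subscribed for US$65 million, Alibaba's Alisoft China subscribed for US$30 million, and Aspex Master Fund and Boyu Capital each subscribed for 35 ...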
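The figures in this summary can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers quoted in the article; the implied total share count, the offered fraction, and the implied exchange rate are derived here and are not stated in the prospectus excerpt, so treat them as approximations. All variable names are illustrative.

```python
# Figures quoted in the article (HKD unless noted).
shares_offered = 25_389_220
price_low, price_high = 151, 165
valuation_low, valuation_high = 46.123e9, 50.399e9
cornerstone_usd, cornerstone_hkd = 350e6, 2.723e9

# Gross size of the offering at each end of the price range.
raise_low = shares_offered * price_low
raise_high = shares_offered * price_high

# Total share count implied by valuation / price (derived, not quoted).
implied_shares_low = valuation_low / price_low
implied_shares_high = valuation_high / price_high

# Fraction of the company on offer, under that implied share count.
offered_fraction = shares_offered / implied_shares_low

# USD/HKD rate implied by the two cornerstone totals.
implied_usd_hkd = cornerstone_hkd / cornerstone_usd

# Share of the gross offering covered by cornerstone investors.
cornerstone_share_low_price = cornerstone_hkd / raise_low
cornerstone_share_high_price = cornerstone_hkd / raise_high

print(f"offering size: HK${raise_low / 1e9:.2f}bn to HK${raise_high / 1e9:.2f}bn")
print(f"implied total shares: {implied_shares_low / 1e6:.1f}M / {implied_shares_high / 1e6:.1f}M")
print(f"offered fraction: {offered_fraction:.1%}")
print(f"implied USD/HKD: {implied_usd_hkd:.2f}")
print(f"cornerstone coverage: {cornerstone_share_high_price:.0%} to {cornerstone_share_low_price:.0%}")
```

Both ends of the valuation range imply roughly the same total share count (about 305 million), which suggests the quoted valuation and price ranges are internally consistent. On that basis the offering is about 8.3% of the company, and cornerstone investors would cover roughly two thirds or more of the gross proceeds.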
Google's Three-Year Comeback: Subtle Clues Laid Long in Advance
36Kr· 2026-01-01 07:13
Core Insights
- OpenAI has declared a "red alert" status, pausing all non-core projects to focus on improving ChatGPT, signaling a shift in the tech landscape [2]
- Google has successfully regained its competitive edge in AI by launching advanced models and restructuring its internal processes [5][18]
Group 1: Google's Response to Competition
- Google initially faced a significant threat from OpenAI's ChatGPT, which gained over 100 million users within two months of its launch [3]
- The company reacted by prioritizing the development of its own AI products, including Bard, in response to the competitive pressure [11]
- Following a public error during Bard's launch, Google experienced a significant drop in stock price, leading to a reassessment of its product release strategy [14][15]
Group 2: Organizational Changes
- Google has undergone a major organizational restructuring, reducing management layers by approximately 35% to enhance decision-making speed [19]
- The company has adopted a more agile development approach, shifting from a "perfection before release" mindset to "release and iterate" [16]
- Founders Larry Page and Sergey Brin have returned to take a more active role in AI projects, emphasizing the need for rapid progress in the competitive landscape [34][36]
Group 3: Talent Acquisition and Retention
- Google has implemented a "boomerang program" to rehire former employees, with about 20% of new AI engineers being former staff [47]
- The company has made significant investments to attract top talent, including a $2.7 billion licensing fee to bring back Noam Shazeer, a key figure in AI development [52]
- Reforms in compensation structures have aligned rewards with product performance metrics rather than solely academic achievements [55]
Group 4: Ongoing Competition
- Despite Google's resurgence, competition remains fierce, with OpenAI planning to release a new model that could surpass Google's Gemini 3 [58]
- Other competitors like Anthropic and Meta are also making strides in the AI space, indicating that the race for AI dominance is far from over [57]
Explainer | Foreseeing 2026: How Institutions' Chief Analysts See It
Xin Hua Cai Jing· 2026-01-01 06:42
Macroeconomic Insights
- China's economic growth is characterized by a shift from a single focus on growth rate to a more balanced and stable economic structure, with significant contributions from domestic consumption [2]
- The resilience of exports has exceeded expectations despite external tariff pressures, indicating a robust adaptability of China's economic fundamentals [2]
- The "anti-involution" policies are positively impacting the midstream manufacturing sector, leading to a revaluation of value driven by deeper industrial structure changes [2]
Bond Market Analysis
- The 10-year government bond yield is experiencing narrow fluctuations, with a range of approximately 30 basis points, reflecting a complex interplay of market forces [3]
- The narrative in the bond market has shifted from linear macroeconomic projections to a focus on event impacts, policy expectations, and institutional behaviors [3]
- Chinese bonds are becoming increasingly attractive to global investors, offering stable returns and low correlation, positioning them as a potential "stability anchor" in global portfolios [4]
Artificial Intelligence Sector
- The AI sector is transitioning from a "technology race" to a practical "value realization" phase, with rapid industry expansion and a shift in investment focus from infrastructure to application layers [5]
- The structure of the AI industry is expected to resemble an inverted pyramid, with increasing market size driven by applications, models, and chips [5]
- Companies with a growing share of AI revenue are likely to attract more capital market attention, leading to potential valuation increases [5]
Commodity Market Trends
- The commodity market is undergoing a significant value reassessment, with a clear divide in performance between traditional cyclical products and sectors like precious and non-ferrous metals [7]
- Analysts predict that the structural differentiation in the commodity market will continue, driven by supply reshaping, policy adjustments, and enhanced financial attributes [7]
- Non-ferrous metals are widely regarded as having the most significant upside potential in the upcoming market landscape [7]
ARR Tops US$3 Million with Monthly Breakeven: ListenHub Completes Angel+ Round to Accelerate Overseas Expansion
AI前线· 2026-01-01 05:33
Core Insights
- MarsWave, a leading company in generative AI and multimodal interaction technology, has completed a $2 million angel round financing led by Tianji Capital, with participation from Xiaomi co-founder Wang Chuan [2]
- Despite profitability concerns in the AI audio sector, MarsWave has achieved an annual recurring revenue (ARR) exceeding $3 million and reached monthly breakeven, establishing itself as one of the few AI-native companies with a validated profit model [2]
- The funding will primarily be used to expand into the North American market and develop the next generation of multimodal agents [2]
Product and Market Strategy
- MarsWave's core product, ListenHub, transforms complex professional knowledge, industry reports, and internal documents into easily understandable "knowledge explanation videos, podcasts, and slides" [2]
- The platform has a 5% paid user rate and a monthly churn rate below 3%, indicating strong demand for its services [4]
- ListenHub has undergone a significant product and positioning upgrade, rebranding from an "AI voice and podcast tool" to "the narrator of all things," with a new slogan emphasizing one-click generation of videos, podcasts, and PPTs [6]
Global Expansion Plans
- The recent financing will focus on global strategic layout, with an initial emphasis on the North American market [8]
- ListenHub plans to launch a "Global Creator Program" to replicate its validated organic growth model, which has achieved $3 million ARR without advertising spend [8]
- The new COO, with extensive experience in AI and internet operations, will lead the global strategy, leveraging the high demand for efficient knowledge digestion tools in North America [6][8]
Shixiang 2026 AI Best Ideas: 20 Key Predictions
海外独角兽· 2026-01-01 05:25
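Produced by: Shixiang Research Team. Predicting each year's key AI trends is a tradition of the Shixiang research team. We offer our 20 key AI predictions for 2026 as a start to the new year and a New Year's gift to friends of Shixiang and 海外独角兽. 2025 was a turbulent year for AI: it opened with DeepSeek and closed perfectly with the Manus moment, and along the way we witnessed a leap in models' agentic capabilities, the ARR miracle in AI coding, Google's narrative reversal, and more. In 2026, areas such as new AI paradigms, world models, and multimodality hold surprises of their own. Happy New Year once again! Together with you, we look forward to more exciting moments and signals of the future in AI in 2026.
| 5 | xAI is merged into Tesla, connecting digital-world and physical-world AGI |
| --- | --- |
| 6 | 2026 is a big year for Enterprise AI; Anthropic's ARR at least doubles |
| 7 | Multimodality has its "AI Coding moment," giving rise to an AI version of Pokémon GO |
| 8 | Surging demand for long-horizon tasks and multimodality brings a new wave of data companies with US$1 billion in ARR |
| ...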
Another US$500 Million Raised: New Models Drive Roughly 4x Growth in Kimi's Overseas API Revenue
投资实习所· 2026-01-01 04:34
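In the last two days of 2025, two Chinese AI teams unexpectedly brought the industry very good news. After Manus was acquired at a high price, Kimi (月之暗面, Moonshot AI) announced yesterday that it had completed a US$500 million Series C round at a post-money valuation of US$4.3 billion. Since May, Kimi has shipped new Agent features at a rapid pace, releasing new products including Researcher, OK Computer, PPT, and Kimi Code, and the product has grown steadily more capable. On the strength of the K2 model's SOTA performance, consumer-side monetization has grown exponentially. K2 and K2 Thinking, a large-scale base model and an enhanced thinking model respectively, mark a substantive breakthrough for Kimi in "complex reasoning and long-chain thinking." Kimi not only released China's first large model scaled to the trillion-parameter level, but also built the first open-source agentic thinking model, matching or even surpassing comparable OpenAI models on several core benchmarks. K2 Thinking can fairly be called a true "thinking model that supports hundreds of steps of tool calls." The core of its technical breakthrough is no longer a single large model but a thinking agent that can reason and call tools continuously. It lets the model keep thinking, verify information, and explore answers laterally while executing complex tasks, much as a person would. For example, it can continuously execute ...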
Google's Three-Year Comeback: Subtle Clues Laid Long in Advance
机器之心· 2026-01-01 04:33
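On December 1, 2025, Silicon Valley sounded a "red alert" once again. This time, however, the alarm came not from Google but from OpenAI. When OpenAI CEO Sam Altman announced in an internal memo that the company was entering its highest-level "red alert" state, pausing all non-core projects such as advertising and healthcare AI agents and concentrating all resources on improving ChatGPT, the entire tech world realized the wind had shifted. The same scene from three years ago is still vivid. Back then Google, in the AI field where it was strongest, was caught off guard by a startup founded only seven years earlier. During one low period, Google employees gathered in hallways and openly voiced worries that Google could become the next Yahoo. Now the plot has reversed. Google has launched the Gemini 3 large language model, the Nano Banana image generation model, the Veo3 video generation model, and TPU chips, advancing on every front and retaking the technological high ground. In just three years, Google went from taking blows to going on the offensive, and its comeback was no accident. With attacker and defender having traded places, what exactly did Google do right? Internal reflection: from a slow company to a fast company. In December 2022, after ChatGPT passed one million users within five days, Google held an unusual all-hands meeting. On November 30, 2022, ChatGPT burst onto the scene, passing one million users in just five days and 100 million in two months. Inside Google ...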
Editor-in-Chief's Pick | Smaller Memory, Stronger AI: Compressing Memory Can Improve Large Models' Task Accuracy
Huan Qiu Wang Zi Xun· 2026-01-01 04:29
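Source: 科技日报 (Science and Technology Daily). A joint team from the University of Edinburgh in the UK and NVIDIA has developed a new method that compresses the memory artificial intelligence (AI) models rely on at run time. With response speed held constant, this either improves a model's accuracy on complex tasks or significantly reduces its energy consumption. It also means smaller memory can deliver "stronger AI" and may break through the performance bottleneck of large language models (LLMs). The team found that after compressing the memory used by an LLM to 1/8 of its original size, the model actually performed better on specialized tests in mathematics, science, and programming, with no increase in inference time. The method also helps a model respond to more user requests at once, lowering the average power consumed per task. Beyond energy savings, the improvement could make AI better suited to systems that handle complex problems, or to edge devices with slow storage and limited memory, such as smart home products and wearables. AI models typically look for answers by "thinking" through more complex hypotheses or by exploring more possibilities at once. In the process, the model has to hold the reasoning-thread content it has generated in a kind of memory called the "KV cache." As the number of threads or their length grows, the KV cache expands quickly and becomes a performance bottleneck, slowing the model's output. To overcome this limit, the team proposed a memory compression technique called "Dynamic Memory Sparsification" (DMS). Rather than keeping every generated token (the basic unit of data an AI model processes), the method dynamically decides which tokens ...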
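To make the memory argument concrete, the sketch below estimates KV cache size for a standard Transformer decoder and shows what a 1/8 compression ratio would mean for the number of reasoning threads that fit in a fixed budget. It does not model DMS itself (the article's description of how tokens are selected is cut off). The model shape, the 2-byte value width, and the 16 GiB budget are hypothetical numbers chosen to keep the arithmetic round.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, num_tokens, bytes_per_value=2):
    """Bytes needed to cache keys and values for one sequence.

    The leading 2 counts one key tensor and one value tensor per layer.
    """
    return 2 * num_layers * num_kv_heads * head_dim * num_tokens * bytes_per_value


def threads_that_fit(memory_budget_bytes, bytes_per_thread):
    """How many reasoning threads fit in a fixed memory budget."""
    return memory_budget_bytes // bytes_per_thread


# Hypothetical model shape and budget, chosen only to make the numbers round.
full = kv_cache_bytes(num_layers=32, num_kv_heads=8, head_dim=128, num_tokens=8192)
compressed = full // 8   # the 1/8 ratio quoted in the article
budget = 16 * 1024**3    # 16 GiB reserved for the cache

threads_full = threads_that_fit(budget, full)
threads_compressed = threads_that_fit(budget, compressed)

print(f"per-thread cache: {full / 1024**2:.0f} MiB -> {compressed / 1024**2:.0f} MiB")
print("threads in budget:", threads_full, "->", threads_compressed)
```

Under these assumptions the cache grows linearly with both thread length and thread count, so an 8x reduction per thread translates directly into 8x more threads (or 8x longer threads) in the same memory. That is consistent with the article's claim that compression lets a model explore more possibilities without extra inference time.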