Hunyuan Deep-Thinking Model T1 (混元深度思考模型T1)

Li Yanhong says DeepSeek has a high hallucination rate. Is that true?
36Kr · 2025-05-02 04:29
Core Insights
- The article discusses the hallucination problem in large language models (LLMs), focusing on DeepSeek-R1, whose hallucination rate is high compared with its predecessor and other models [2][6][13]
- Li Yanhong criticizes DeepSeek-R1 for its limitations, including a high hallucination rate, slow speed, and high cost, sparking discussion of the broader hallucination problem in AI models [2][6][19]
- Hallucination is not unique to DeepSeek: other models, such as OpenAI's o3 and o4-mini and Alibaba's Qwen3, also exhibit significant hallucination issues [3][8][13]

Summary by Sections

Hallucination Rates
- DeepSeek-R1 has a hallucination rate of 14.3%, compared with 3.9% for DeepSeek-V3, a nearly fourfold increase (about 3.7 times) [6][7]
- Other models, such as Qwen-QwQ-32B-Preview, show even higher hallucination rates, at 16.1% [6][7]
- OpenAI's o3 model has a hallucination rate of 33%, nearly double that of its predecessor o1, while the lightweight o4-mini model reaches 48% [8][10]

Industry Response
- The AI industry is grappling with the persistent problem of hallucinations, which complicates the development of more advanced models [13][19]
- Companies are exploring various methods to mitigate hallucinations, including retrieval-augmented generation (RAG) and strict data quality control [20][22][23]
- Despite advances in areas such as multimodal output, hallucinations remain a significant challenge when generating long texts or complex visual scenes [18][19]

Implications of Hallucinations
- Hallucinations are increasingly seen as a common trait of advanced models, raising questions about reliability and user trust, especially in professional or high-stakes contexts [17][27]
- Hallucinations may also contribute to creativity in AI, since they can lead to unexpected and imaginative outputs [24][26]
- Accepting hallucinations as an inherent characteristic of AI models suggests a need for a paradigm shift in how AI is perceived and used [27]
Zhipu releases a "get-things-done Agent", no invite code required
36Kr · 2025-04-01 13:52
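The following article is from 智能涌现 (AI Emergence), an account under 36Kr that covers the industrial revolution emerging in the new AI era. Author: Zhou Xinyu.

CEO Zhang Peng: "We don't belong to the To B track, and we refuse to be labeled."

By Zhou Xinyu | Editor: Su Jianxun | Source: 智能涌现 (ID: AIEmergence) | Cover image: Visual China (视觉中国)

For today's "Six Little Tigers" of AI, delivering an answer for the post-DeepSeek R1 era has become especially important.

DeepSeek R1 and Manus have already made a splash in reasoning models and AI Agents, respectively. For latecomers, following is the most conservative path. For example, Baidu released its reasoning model Wenxin X1, and Tencent launched its Hunyuan deep-thinking model T1.

At its OpenDay on March 31, Zhipu, which has raised money hand over fist in China's capital markets, delivered its first answer of the year: a "plus version" of R1 and Manus. The product, an Agent with deep-thinking capability called "AutoGLM 沉思" (hereafter "沉思"), is already available for free.

[Image: AutoGLM website header with links for AutoGLM Android, AutoGLM Web, joining the community, and trying the product; the page text begins "AutoGLM 沉思 is ..."]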