Large Language Model
X @The Economist
The Economist· 2026-03-22 08:20
Even when an English-language version of a large language model passes a safety test, it can still hallucinate dangerous misinformation in other languages https://t.co/7ldkjGyvmU ...
X @The Economist
The Economist· 2026-03-19 18:20
To get the most accurate answer from a large language model, make sure to prompt it in the right language. We explain the widespread problem https://t.co/QbqNU70HNR ...
From AlphaGo to DeepSeek R1: Where Is the Future of Reasoning Headed?
机器之心· 2026-02-19 23:43
Core Insights
- The article discusses the transformative impact of AI, particularly reasoning models that have evolved from basic language models into systems capable of systematic thinking and causal reasoning [1][4].

Group 1: Evolution of AI Models
- Since the introduction of ChatGPT in 2022, AI has shifted from mere statistical language imitation to understanding and manipulating logic [1].
- Eric Jang emphasizes that the real change lies in models beginning to think systematically, which could lead to a restructuring of productivity, organizational forms, and power structures in society [1][4].

Group 2: Capabilities of Modern AI
- Modern programming agents, such as Claude Code, have become proficient in coding and reasoning, allowing users to automate coding tasks and generate hypotheses and conclusions [5][8].
- AI can now run experiments and optimize parameters, modifying its own code and reflecting on experimental results [8][9].

Group 3: Reasoning in AI
- Reasoning can be categorized as deductive or inductive: the former relies on strict logical rules, the latter on probabilistic judgments [19][20].
- The limitations of traditional reasoning systems highlight the need for AI to handle the complexities and uncertainties of the real world, which neural networks can approximate through end-to-end probabilistic modeling [20][21].

Group 4: Future of AI Reasoning
- The article suggests that the future of reasoning in AI will involve powerful base models that use reinforcement learning and rule-based rewards to enhance reasoning capabilities [38][39].
- There is potential for further simplification and optimization of reasoning processes, which could lead to significant advances in AI's ability to handle complex tasks [39][40].

Group 5: Implications for Research and Development
- The automation of research processes is expected to become standard, significantly increasing productivity in various fields, including non-AI domains [43].
- Demand for reasoning compute is anticipated to grow astronomically, similar to how air conditioning transformed productivity in warmer regions [44].
Alibaba Releases Qwen3.5 on Lunar New Year's Eve: Performance Rivals Gemini 3 at a Lower Price
Nan Fang Du Shi Bao· 2026-02-16 10:16
Core Insights
- Alibaba has launched the Qwen3.5-Plus model, which is claimed to rival Gemini 3 Pro and to be the strongest open-source model globally [1][3].
- Qwen3.5-Plus has 397 billion total parameters, of which only 17 billion are activated, achieving superior performance with significantly reduced memory usage and enhanced inference efficiency [1][4].
- The model has transitioned from a pure text model to a native multimodal model trained on mixed visual and text tokens, which has improved its reasoning capabilities and knowledge acquisition [1][3].

Performance and Efficiency
- Qwen3.5-Plus has demonstrated exceptional performance in various multimodal reasoning tasks, achieving top scores in assessments such as MathVision, VQA, and video understanding [3][4].
- Inference throughput can be increased by up to 19 times in long-context scenarios, a substantial improvement in efficiency [4].
- Innovations in the underlying architecture, including a self-developed gating technology and a combination of linear attention mechanisms, have contributed to the model's efficiency and performance [3][4].

Market Context
- The launch of Qwen3.5-Plus coincides with a wave of new releases from domestic AI models, including ByteDance's Doubao 2.0 and MiniMax M2.5, indicating a competitive landscape in the AI model sector [5].
- The advancements in Qwen3.5-Plus are expected to enhance its application in various domains, including mobile and PC environments, improving operational efficiency for users [4].
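As a quick arithmetic check on the sparse-activation figures above, the following sketch computes the share of parameters that is active per token. The 397 billion and 17 billion figures come from the summary; the variable names are illustrative, and the ratio says nothing by itself about actual memory usage or throughput.

```python
# Figures quoted in the summary above; variable names are illustrative.
TOTAL_PARAMS_B = 397   # total parameters, in billions
ACTIVE_PARAMS_B = 17   # parameters activated per token, in billions

# Share of the model that participates in each forward pass.
active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B
print(f"Active share of parameters: {active_fraction:.1%}")
```

On these numbers, roughly 4.3% of the parameters are active for any given token.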
Large Model Industry Commentary: Models Are Flourishing and Iterating at a Rapid Pace
ZHESHANG SECURITIES· 2026-02-12 04:16
Investment Rating
- The industry investment rating is "Positive" (maintained) [6].

Core Insights
- Domestic large models have been released intensively around the Spring Festival, initiating an AI arms race. Notable releases include DeepSeek's new model with a context processing capability of 1 million tokens, GLM-5, which ranks first globally in programming and agent testing, and ByteDance's Seedance 2.0, aimed at revolutionizing video creation [1][2].
- The usability of agents is increasing, with large models transitioning from chat to collaboration. Claude Opus 4.5 can autonomously program for 5 hours, and AI coding agents are expected to double their task handling time every 4 months starting from 2024-2025, compared to a 7-month doubling period from 2019-2024 [2].
- Demand for inference is expected to rise with large-scale applications, as agent execution consumes significantly more tokens than dialogue scenarios. The cost of generating a 5-second 720P video is approximately 4 RMB, with Seedance costing about 2.3 RMB, indicating a substantial cost advantage over manual production [3].

Summary by Sections

Model Updates
- MiniMax's M2.5 model is set to launch soon and is currently in internal testing for the MiniMax Agent product. Other updates include GLM-5 from Zhipu, which has achieved state-of-the-art capabilities in coding and agent functions, and DeepSeek's new model with a context window increased to 1 million tokens [7].

Related Companies
- Key companies mentioned include MiniMax, Zhipu, Yunsai Zhilian, UCloud, Capital Online, Qingyun Technology, Wangsu Technology, and Nanxing Co. [4].
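To make the doubling-period comparison above concrete, here is a minimal sketch that assumes task handling time grows as a clean exponential, T(t) = T0 * 2^(t / d), where d is the doubling period in months. The 4-month and 7-month periods are from the summary; the exponential form, the 12-month window, and the function name `growth_factor` are assumptions made for illustration.

```python
def growth_factor(months: float, doubling_period_months: float) -> float:
    """Multiplier on task handling time after `months`, assuming steady doubling."""
    return 2 ** (months / doubling_period_months)

# Compare one year of growth under the two doubling periods quoted in the summary.
fast = growth_factor(12, 4)   # 4-month doubling (2024-2025 trend)
slow = growth_factor(12, 7)   # 7-month doubling (2019-2024 trend)
print(f"12 months at 4-month doubling: {fast:.2f}x")
print(f"12 months at 7-month doubling: {slow:.2f}x")
```

Under this assumption, a year of 4-month doubling yields an 8x increase, versus roughly 3.3x for 7-month doubling.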
Zhipu GLM-5: From "Can Write" to "Can Complete", Empowering Real Productivity Scenarios
Zhi Tong Cai Jing· 2026-02-12 00:43
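On Vending-Bench 2, which measures a model's ability to run a business, GLM-5 ranks first among open-source models. Vending-Bench 2 requires GLM-5 to operate a simulated vending machine business over a one-year period and to accumulate as large a bank account balance as possible by year end. GLM-5's account balance currently stands at USD 4,432, close to the performance of Claude Opus 4.5, demonstrating strong long-term planning and resource management capabilities. On compute support, GLM-5 has completed deep inference adaptation with domestic compute platforms including Huawei Ascend, Moore Threads, Cambricon, Kunlunxin, T-Head, and MetaX; through operator optimization and hardware acceleration, it runs stably with high throughput and low latency, providing solid support for online services. On the ecosystem side, developers have already used its capabilities to build deployable applications end to end. Combined with platforms such as OpenClaw, GLM-5 can act as a 24/7 intelligent assistant, handling search, organization, programming, and other tasks. GLM-5, the new-generation flagship large model from Zhipu (02513), not only leads in performance but can also complete large engineering tasks end to end; it can be described as an open-source SOTA-level "system architect", and it has achieved full-stack adaptation to domestic chips. GLM-5 is designed for production-grade deployment and can, with minimal human intervention, autonomously complete systems engineering tasks such as agentic long-horizon planning and execution, backend refactoring, and deep debugging. Internal evaluations show that on frontend, backend, and long-horizon programming tasks it ...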
Zhipu (02513) GLM-5: From "Can Write" to "Can Complete", Empowering Real Productivity Scenarios
智通财经网· 2026-02-12 00:32
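According to the Zhitong Finance app, GLM-5, the new-generation flagship large model from Zhipu (02513), not only leads in performance but can also complete large engineering tasks end to end; it can be described as an open-source SOTA-level "system architect", and it has achieved full-stack adaptation to domestic chips. GLM-5 is designed for production-grade deployment and can, with minimal human intervention, autonomously complete systems engineering tasks such as agentic long-horizon planning and execution, backend refactoring, and deep debugging. Internal evaluations show that it substantially outperforms GLM-4.7 on frontend, backend, and long-horizon programming tasks, with a real-world programming experience approaching that of Claude Opus 4.5. On Vending-Bench 2, which measures a model's ability to run a business, GLM-5 ranks first among open-source models. Vending-Bench 2 requires GLM-5 to operate a simulated vending machine business over a one-year period and to accumulate as large a bank account balance as possible by year end. GLM-5's account balance currently stands at USD 4,432, close to the performance of Claude Opus 4.5, demonstrating strong long-term planning and resource management capabilities. On compute support, GLM-5 has completed deep inference adaptation with domestic compute platforms including Huawei Ascend, Moore Threads, Cambricon, Kunlunxin, T-Head, and MetaX; through operator optimization and hardware acceleration, it runs stably with high throughput and low latency, providing solid support for online services. On the ecosystem side, developers have already used its capabilities to develop, end to end, deployable ...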
Youdao(DAO) - 2025 Q4 - Earnings Call Transcript
2026-02-11 11:02
Financial Data and Key Metrics Changes
- In Q4 2025, net revenues reached RMB 1.6 billion, a 16.8% year-over-year increase, driven by growth in learning services and online marketing services [4][15].
- Operating profit for Q4 was RMB 60.2 million, a 113% quarter-over-quarter increase but a 28.5% decrease year-over-year [4].
- For the full year 2025, total net revenues were RMB 5.9 billion, a 5% increase year-over-year, with operating profit growing to RMB 221.3 million, up 48.7% year-over-year [5][20].
- Net cash flow from operating activities for 2025 was RMB 55.2 million, a significant improvement from a net outflow of RMB 67.9 million in 2024 [6][20].

Business Line Performance Changes
- Learning services generated Q4 net revenues of RMB 727.2 million, a 17.7% year-over-year increase, with digital content services contributing RMB 436.1 million, up 12.2% year-over-year [6][15].
- Online marketing services saw Q4 net revenues of RMB 660.9 million, a 37.2% increase year-over-year, driven by demand from the NetEase Group and overseas markets [9][16].
- The smart devices segment reported Q4 net revenues of RMB 176.5 million, down 26.6% year-over-year, primarily due to decreased demand for smart learning devices [11][16].

Market Data and Key Metrics Changes
- The online marketing segment's gross margin was 27.8% in Q4, a 2 percentage point sequential improvement despite a year-over-year decline [11].
- International KOL revenues increased by over 50% year-over-year in Q4, with successful campaigns executed in over 50 countries [10].

Company Strategy and Development Direction
- The company aims to continue its AI-native strategy, focusing on innovative product development and competitive positioning in learning services and advertising [13][24].
- Plans for 2026 include leveraging AI capabilities to enhance online marketing services and learning services, with expectations of double-digit growth in learning services [25][26].
- The company is committed to improving the operational health of its smart devices segment while focusing on key products like the dictionary pen and tutoring pen [29].

Management's Comments on Operating Environment and Future Outlook
- Management expressed confidence in achieving sustainable growth through AI-driven innovations and strategic investments [48][49].
- The company anticipates strong momentum in its core business units, particularly in AI-driven subscription services and advertising [48].

Other Important Information
- Youdao Lingshi's revenue surged over 40% year-over-year, with retention rates exceeding 75% [7].
- The company plans to introduce new AI applications and agents to enhance user engagement and expand service offerings [28][56].

Q&A Session Summary

Question: 2026 outlook across different business lines
- Management aims to grow sustainably while enhancing innovative product offerings, supported by the AI-native strategy [24].

Question: Plans for Youdao Lingshi in 2026
- Management is confident in Youdao Lingshi's growth, focusing on product refinement and efficient customer acquisition [33][34].

Question: Outlook for overseas advertisement growth in 2026
- The company plans to deepen its international KOL business and explore overseas programmatic advertising to drive growth [41][42].

Question: Goal for total cash flow in 2026
- The focus will be on improving operating cash flow while maintaining a balanced approach to strategic investments [49][50].

Question: Deployment of AI agents in Youdao's business
- AI agents will be applied in advertising and learning applications, enhancing user experience and operational efficiency [55][56].
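The Q4 segment figures above can be cross-checked against the reported RMB 1.6 billion total with a few lines of arithmetic. This sketch assumes the three segments account for all Q4 net revenues, which the summary does not state explicitly; the dictionary keys and variable names are illustrative.

```python
# Q4 2025 segment net revenues in RMB millions, as quoted in the summary above.
# Assumption: these three segments make up all of Q4 net revenues.
segments = {
    "learning services": 727.2,
    "online marketing services": 660.9,
    "smart devices": 176.5,
}

total = sum(segments.values())
print(f"Sum of segments: RMB {total:.1f} million")

# Each segment's share of the summed total.
shares = {name: revenue / total for name, revenue in segments.items()}
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

The segments sum to about RMB 1,564.6 million, which rounds to the RMB 1.6 billion headline figure.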
Youdao(DAO) - 2025 Q4 - Earnings Call Transcript
2026-02-11 11:02
Financial Data and Key Metrics Changes
- In Q4 2025, net revenues reached RMB 1.6 billion, a 16.8% year-over-year increase, driven by growth in learning services and online marketing services [4][15].
- Full year 2025 total net revenues were RMB 5.9 billion, a 5% increase year-over-year, with operating profit growing to RMB 221.3 million, up 48.7% year-over-year [5][20].
- Net cash flow from operating activities for Q4 was RMB 184.2 million, up 16.4% year-over-year; 2025 was the first full year of net cash inflow from operating activities, at RMB 55.2 million [4][5][20].

Business Line Performance Changes
- Learning services generated Q4 net revenues of RMB 727.2 million, a 17.7% year-over-year increase, with digital content services contributing RMB 436.1 million, up 12.2% year-over-year [6][15].
- Online marketing services saw Q4 net revenues of RMB 660.9 million, a 37.2% increase year-over-year, driven by demand from the NetEase Group and overseas markets [9][16].
- The smart devices segment reported Q4 net revenues of RMB 176.5 million, down 26.6% year-over-year, primarily due to decreased demand for smart learning devices [11][16].

Market Data and Key Metrics Changes
- The online marketing segment's gross margin was 27.8% in Q4, a 2 percentage point sequential improvement despite a year-over-year decline [11][17].
- International KOL revenues increased by over 50% year-over-year in Q4, with successful campaigns executed in over 50 countries [10].

Company Strategy and Development Direction
- The company aims to continue its AI-native strategy, focusing on innovative product development and competitive positioning in learning services and advertising [13][24].
- Plans for 2026 include leveraging AI capabilities to enhance online marketing services and learning services, with expectations of double-digit growth in learning services [25][27].
- The company is committed to improving operational health in the smart devices segment while enhancing user engagement through AI-driven features [29].

Management's Comments on Operating Environment and Future Outlook
- Management expressed confidence in achieving sustainable growth through innovation and competitive products, with a focus on AI-driven solutions [24][29].
- The company anticipates strong momentum in marketing demand across sectors such as AI applications and gaming, and aims to improve targeting precision and conversion efficiency [25][26].

Other Important Information
- The company launched new AI applications, including ScholarAI and Youdao AnyDub, which contributed to record-high revenue in AI-driven subscription services [8][28].
- The Youdao Lingshi business is expected to continue its growth trajectory, with plans for product refinement and efficient customer acquisition strategies [33][34].

Q&A Session Summary

Question: What are management's thoughts on the 2026 outlook across different business lines?
- Management aims to grow sustainably while enhancing innovative product offerings, supported by its AI-native strategy [24].

Question: What is the plan for the Youdao Lingshi business in 2026?
- The strategy includes product refinement and efficient customer acquisition, with confidence in future growth due to improved AI features [33][34].

Question: How does management view the outlook for overseas advertisement growth in 2026?
- The company plans to deepen its international KOL business and explore programmatic advertising to drive long-term growth [41][42].

Question: Is the goal for 2026 to achieve a net inflow for total cash flow?
- The focus will remain on enhancing operating cash flow while balancing strategic investments and cost discipline [49][50].

Question: In what areas will AI agents be deployed, and what is the potential impact?
- AI agents will be applied in advertising and learning applications, with significant potential to enhance user experience and operational efficiency [55][56].
The Literature-Review Tool Recognized by Nature Has Arrived
量子位· 2026-02-07 04:22
Core Viewpoint
- The article discusses the launch of OpenScholar, an AI system developed by the Allen Institute for AI and the University of Washington, which aims to eliminate false citations in academic writing by drawing on a database of 45 million scientific papers [2][5].

Group 1: OpenScholar's Features
- OpenScholar connects to a large database called ScholarStore, which contains full texts and abstracts of 45 million papers, significantly reducing the false citation rate relative to traditional large language models (LLMs) [9][11].
- The system employs Retrieval-Augmented Generation (RAG) to ensure that each knowledge point is backed by a real paper, enhancing the accuracy of citations [12][13].
- OpenScholar's feedback loop refines its outputs by searching, generating, self-reviewing, and revising, which helps confirm that supporting literature exists [12][13].

Group 2: Performance Comparison
- On the ScholarQABench benchmark, OpenScholar-8B outperformed GPT-4o by 5% in correctness and matched human expert citation accuracy [16].
- In a double-blind experiment, 51% of OpenScholar's answers were rated better than those written by human researchers, and an upgraded version was preferred 70% of the time [18].
- Experts noted that OpenScholar's strengths lie in its comprehensive information coverage, clearer structure, and stronger logical coherence compared to traditional models [19].
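The search, generate, self-review, and revise loop described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not OpenScholar's implementation: the three-entry corpus is invented, retrieval is plain word overlap, and the "generator" is a stub that deliberately emits one fabricated citation so the review step has something to catch. All function and variable names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Paper:
    paper_id: str
    text: str


# Toy stand-in for a paper datastore; the entries are invented for illustration.
CORPUS = [
    Paper("p1", "retrieval augmented generation grounds answers in retrieved passages"),
    Paper("p2", "mixture of experts models activate a subset of parameters"),
    Paper("p3", "reinforcement learning with rule based rewards improves reasoning"),
]


def retrieve(query: str, k: int = 2) -> list[Paper]:
    """Rank papers by word overlap with the query (a stand-in for a real retriever)."""
    words = set(query.lower().split())
    scored = [(len(words & set(p.text.split())), p) for p in CORPUS]
    scored = [(score, p) for score, p in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in scored[:k]]


def draft(papers: list[Paper]) -> list[tuple[str, str]]:
    """Stub generator: one (claim, cited_id) per paper, plus one fabricated citation."""
    claims = [(p.text, p.paper_id) for p in papers]
    claims.append(("an unsupported statement", "p999"))  # simulates a false citation
    return claims


def review(claims: list[tuple[str, str]], papers: list[Paper]) -> list[tuple[str, str]]:
    """Self-review: flag claims whose cited id is not among the retrieved papers."""
    known = {p.paper_id for p in papers}
    return [claim for claim in claims if claim[1] not in known]


def answer(query: str, max_rounds: int = 3) -> list[tuple[str, str]]:
    """Search, generate, then review and revise until no claim is flagged."""
    papers = retrieve(query)
    claims = draft(papers)
    for _ in range(max_rounds):
        flagged = review(claims, papers)
        if not flagged:
            break
        claims = [claim for claim in claims if claim not in flagged]  # revise
    return claims


result = answer("retrieval augmented generation")
print([cited_id for _, cited_id in result])
```

The design point is that citations are validated against what was actually retrieved rather than trusted from the generator, so a claim with no supporting paper is dropped instead of being emitted with a made-up reference.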