量子位
A Half-Century-Old Problem Cracked in 48 Hours! Terence Tao's Team Turns AI Math into a Quest Game
量子位· 2025-12-13 04:34
Core Viewpoint
- The collaboration between mathematicians and AI led to the resolution of the long-standing Erdős 1026 problem, which had remained unsolved for 50 years, in just 48 hours [1][2][3].

Group 1: Problem Overview
- The Erdős 1026 problem, posed in 1975, asks for the minimum possible value of a function arising from a game between two players, Alice and Bob [8][10][12].
- The problem's difficulty is captured by a constant c(n): the best proportion of the coins that Bob can guarantee to take, no matter how Alice distributes them [10][13].

Group 2: AI's Role in the Solution
- AI tools were crucial to the speed of the solution; traditional methods might have taken weeks or months to reach a conclusion [3][5].
- AI models such as Harmonic and AlphaEvolve allowed the mathematicians to automate the construction and proof of key inequalities, transforming the original problem into a computational geometry challenge [16][18][22].

Group 3: Collaborative Efforts
- The solution involved multiple mathematicians working together, with contributions from Boris Alexeev, Koishi Chan, and Lawrence Wu, showcasing the effectiveness of human-AI collaboration [17][28][32].
- Combining human insight with AI capabilities is emerging as a new trend in mathematical problem-solving [46].

Group 4: Historical Context and Future Implications
- The Erdős problems, posed by the renowned mathematician Paul Erdős, have long been a significant part of mathematical research, and many remain unsolved [39][41].
- AI's increasing success on these problems suggests a shift in how mathematical research may be conducted, with AI becoming a standard tool for researchers [41][42].
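Going only by the summary's description of the game (the precise rules are not given here, so this is a hypothetical formalization, not the problem's official statement), the constant c(n) can be sketched as a min-max between the two players, with Alice choosing the distribution and Bob responding optimally:

```latex
c(n) \;=\; \min_{\substack{\text{Alice's distribution}\\ \text{of the } n \text{ coins}}}\;
\max_{\text{Bob's strategy}}\;
\frac{\text{coins Bob secures}}{n}
```

On this reading, "Bob can guarantee proportion c(n)" means Bob's optimal play secures at least a c(n) fraction even against Alice's worst-case distribution.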
Zhu Ning of SJTU's SAIF: A Paradigm Shift in Thinking for the AI Era, from an Economist's Perspective | MEET2026
量子位· 2025-12-13 02:00
Core Viewpoints
- The concept of scarcity has changed with the emergence of AI, prompting deeper consideration of how to make better choices in the face of this new reality [6][11].
- As AI begins to replace human decision-making, competition may arise between humans and algorithms, as well as among algorithms themselves [6][22].

Economic Implications
- Economics has historically focused on technological progress and its impact on economic principles and human welfare, with fundamental concepts like "what is human?" and "what is production?" undergoing significant changes in the AI era [8][11].
- The traditional view of scarcity, which included time, computational power, and creativity, is being challenged as AI can now perform tasks that previously required significant human effort [11][12].
- AI is expected to contribute 0.5% to 0.7% to global economic growth annually over the next decade, although this may not be sufficient to support high valuations in tech markets [14][24][25].

Industry Impact
- The nature of work is changing, with both white-collar and blue-collar jobs facing potential replacement by AI, blurring the lines between these categories [31].
- Knowledge-intensive industries, previously thought to be safe from AI disruption, are also at risk as AI capabilities evolve [33].
- Companies are encouraged to focus on leveraging AI technology to enhance productivity and efficiency rather than seeking industries that are immune to AI [33].

Global Considerations
- There is a significant disparity in access to AI capabilities between high-income and low-income countries, which may exacerbate global wealth distribution issues [28][29].
- The shift toward AI-driven trade will bring new regulatory and governance challenges, particularly regarding accountability in cross-border transactions [30].
Chinese Robots Compete in Emergency Rescue; American Netizens on Reddit Are Shaken: We're Still Putting Makeup on Robot Dogs for Skits
量子位· 2025-12-12 06:41
Core Viewpoint
- The article highlights the growing attention from foreign audiences, particularly Americans, toward China's advancements in embodied intelligence and robotics, especially in practical applications like emergency rescue, contrasting them with the more entertainment-focused developments in the U.S. [1][10][40]

Group 1: Event and Competition
- The GDPS 2025 (Global Developer Pioneer Summit and International Embodied Intelligence Skills Competition) showcased China's capabilities in robotics, featuring competitions that emphasize real-world applications rather than theoretical concepts [7][46].
- The event attracted significant interest from foreign observers, with comments noting the impressive scale and organization of the competition, likening it to a military formation [12][13].
- The competition included participation from top universities and leading robotics companies, with multiple categories covering fields such as industrial and medical applications [46][50].

Group 2: Technological Advancements
- China's advancements in embodied intelligence are attributed to a robust industrial ecosystem, with Shanghai's robotics industry accounting for one-third of the national market [49][51].
- The article emphasizes the importance of mass production in robotics, which allows for the identification of potential hardware issues and the expansion of applications beyond limited scenarios [37][40].
- Companies like UBTECH and ZhiYuan have achieved large-scale production, in contrast with U.S. firms still at the prototype stage, giving China a competitive edge in the global market [41][42].

Group 3: Global Perception and Response
- The article notes a shift in perception among foreign audiences, who increasingly recognize the practical achievements of Chinese robotics, now seen as commonplace rather than mere demonstrations [54][55].
- The competitive landscape is evolving, with U.S. companies beginning to adopt strategies and learn from Chinese advancements in robotics, indicating a potential shift in the global tech hierarchy [30][36].
- The urgency expressed by foreign observers reflects growing concern about the technological gap and the need for the U.S. to accelerate its own development of embodied intelligence [43][56].
Just Three Steps to Claim an AI Phone!
量子位· 2025-12-12 06:41
Core Viewpoint
- The article discusses the launch and capabilities of AutoGLM, an AI framework that allows smartphones to perform tasks autonomously based on natural language commands, marking a significant advancement in mobile AI technology [2][12].

Group 1: AutoGLM Overview
- AutoGLM is a visual language model-based intelligent assistant framework for smartphones, enabling a paradigm shift from chat-based interactions to actionable tasks [12][13].
- Users describe tasks in natural language, which the AI interprets to understand intent and execute operations on the smartphone [13].

Group 2: Installation Process
- The article outlines a simplified three-step process for installing AutoGLM on Android devices, using tools like Claude Code and GLM-4.6 [8][11].
- The steps are installing ADB Keyboard, connecting the phone to a computer, and using Claude Code to execute the installation command [9][11].

Group 3: Development Timeline
- AutoGLM's development has spanned 32 months and three significant milestones, culminating in its open-source release, which allows both local deployment and cloud-based experiences [14].
- Key milestones include the first AI agent capable of automatically operating a phone in October 2024, the first fully automated AI-issued red envelope in November 2024, and the release of AutoGLM 2.0, which operates in a cloud environment, in August 2025 [14].
量子位 (QbitAI) Is Hiring Editors and Writers
量子位· 2025-12-12 06:41
Core Viewpoint
- The article emphasizes the ongoing AI boom and invites individuals to join the company "Quantum Bit," which focuses on tracking AI advancements and has established itself as a leading content platform in the industry [1].

Group 1: Job Opportunities
- The company is hiring in three main directions: AI Industry, AI Finance, and AI Product, with positions available for both experienced professionals and fresh graduates [2][4].
- Openings span multiple levels, including editors, lead writers, and chief editors, with a focus on matching roles to individual capabilities [6].

Group 2: Job Responsibilities
- **AI Industry Direction**: tracking innovations in infrastructure, such as chips, AI infrastructure, and cloud computing, as well as interpreting technical reports from conferences [6][7].
- **AI Finance Direction**: venture capital, financial reports, and capital movements within the AI industry, requiring strong analytical skills and a passion for interviews [11].
- **AI Product Direction**: monitoring AI applications and hardware developments, requiring a keen understanding of product experiences and market trends [11].

Group 3: Benefits and Growth
- Employees can expect to engage with cutting-edge AI technologies, enhance their work efficiency through new tools, and build personal influence in the AI field [6].
- The company offers competitive salaries, comprehensive benefits, and a supportive environment for professional growth, including mentorship from senior editors [6][12].

Group 4: Company Impact
- By 2025, Quantum Bit aims to have over 2.4 million subscribers on WeChat and more than 7 million users across platforms, with a daily reading volume exceeding 2 million [12].
- The company is recognized as the top new media outlet in the AI and frontier technology sectors according to third-party data platforms [12].
$1 Billion in OpenAI Equity for Disney IP Rights! Mickey Mouse Comes to Sora's Rescue
量子位· 2025-12-12 06:41
Yishui, from Aofeisi. QbitAI | WeChat official account QbitAI

There really is no free lunch! To get Mickey Mouse into Sora, OpenAI has just officially announced a partnership with Disney. One of the agreement's terms: OpenAI must sell $1 billion worth of company equity to Disney, and Disney also gains the right to increase its stake in the future.

As soon as the news broke, Bloomberg led the commentary with a rather blunt headline: Altman just realized there is no free lunch at Disney. Given how much blood was let, the celebratory posts from OpenAI CEO Sam Altman and President Greg Brockman now read a bit like forced smiles.

In exchange, OpenAI's video generation tool Sora can now openly generate more than 200 popular IP characters, including Mickey Mouse, Snow White, Buzz Lightyear, and Iron Man. ChatGPT Images will gain the same capability. Say what you will, Disney's reputation for having "the strongest legal team on Earth" is no exaggeration; now even OpenAI has had to give up a pound of flesh to buy peace.

OpenAI and Disney strike a three-year deal. Joking aside, here are the formal terms. According to OpenAI's announcement, Disney will become Sora's first major content licensing partner, for a three-year term, with the first year's license being exclusive. Under the agreement, Sora will receive Disney's ...
Google Pushes on Agents: Enhanced Gemini Deep Research and a Dedicated API Arrive
量子位· 2025-12-12 06:41
Core Insights
- OpenAI and Google are both making significant updates in the AI space, with Google launching an enhanced version of Gemini Deep Research aimed at reducing hallucinations and excelling at complex information retrieval and analysis tasks [1][3][10].

Group 1: Gemini Deep Research Enhancements
- The enhanced Gemini Deep Research is built on Gemini 3 Pro and will soon be integrated into various Google services such as Google Search, NotebookLM, Google Finance, and the upgraded Gemini App [3][8].
- This version can perform iterative reasoning: generating queries, reading and integrating search results, and identifying knowledge gaps, significantly improving its web search capabilities [10][12].
- On benchmarks such as HLE, BrowseComp, and DeepSearchQA, the enhanced model achieved state-of-the-art (SOTA) results, showcasing its performance on complex research tasks [10][12].

Group 2: DeepSearchQA Benchmark
- Google released the DeepSearchQA benchmark dataset to provide a more comprehensive evaluation standard for deep search and research tasks, addressing the limitations of existing benchmarks [5][12].
- The dataset includes 900 manually designed causal-chain tasks from 17 domains, requiring detailed answer sets, which better measure a model's multi-step reasoning and information fusion capabilities [12].

Group 3: Interactions API
- Google introduced the Interactions API, designed to provide a unified interface for developers to interact with Gemini 3 Pro and Deep Research agents [6][16].
- The API is particularly suited to scenarios requiring multi-step reasoning, tool invocation, and long-running task execution, enhancing the capabilities of existing models [17][18].
- It simplifies workflows and adapts better to developer environments by expanding the core content generation capabilities with server-side state, interpretable data models, and remote tool support [18].
Consumer Agents Caught Fire Fast, but the Bigger Value Lies in Enterprises | Zhongguancun Kejin @ MEET2026
量子位· 2025-12-12 05:30
Core Viewpoints
- The transition from the mobile internet's "human-machine connection" to the AI era's "intelligent connection" signifies a profound restructuring within enterprises, where the essence lies in stronger connections rather than merely enhanced tools [1][9].
- Intelligent agents are emerging as super connectors, weaving people, data, knowledge, and intelligence into the entire operational framework of enterprises, forming a new digital workforce [2][12].

Group 1: Intelligent Agent Implementation
- The implementation of intelligent agents is not a one-time project but a long-term endeavor driven by continuous iteration across three elements: scenario selection, data and knowledge governance, and model construction [3][14][17].
- Enterprises are advised to focus on three key platforms for effective deployment: a large model platform for cognitive capabilities, an AI capability platform for perception, and an AI data platform for organizational memory [19][20][25].

Group 2: Market Opportunities and Applications
- Intelligent agents are creating significant value in both internal and external enterprise operations, enhancing collaboration among employees and improving customer engagement through marketing, customer service, and sales empowerment [12][36].
- Marketing services are highlighted as the most typical and effective application area for intelligent agents, enabling efficient interaction with millions of users through a unified management system [35][36].

Group 3: Industry-Specific Applications
- In the financial sector, the company has served over 200 banks and 500 financial institutions, developing numerous intelligent agent solutions for risk control, consumer protection, and credit scenarios [41].
- The industrial sector is also seeing extensive applications of intelligent agents, with a focus on leveraging large language models and other advanced technologies to enhance operational efficiency and optimize processes [45][46].

Group 4: Global Expansion and Future Outlook
- The company positions itself as a leading provider of enterprise-level large model technology and application services, actively expanding into international markets such as Hong Kong, Singapore, Malaysia, Thailand, and Indonesia [47].
- The future of intelligent agents in enterprises hinges on creating substantial value, with higher demands for industry know-how, accuracy, and compliance than consumer-facing applications [49][50].
Skip "Token-by-Token Generation"! Ant Group's Zhao Junbo: Diffusion Models Let Us Edit Tokens Directly | MEET2026
量子位· 2025-12-12 03:00
Core Viewpoint
- The article discusses the shift from autoregressive models to diffusion architectures for language models, highlighting the potential for faster generation speeds and lower computational costs [2][8].

Group 1: Diffusion Architecture Insights
- Diffusion architectures allow tokens to be directly modified and controlled during inference, unlike autoregressive models, which must regenerate entire segments [2][15].
- The recent release of LLaDA 2.0 marks a significant milestone, scaling diffusion language models to 100 billion parameters [4][44].
- Development of diffusion language models is still in its early stages, but it has attracted attention from major companies like Google and ByteDance, as well as several startups [5][41].

Group 2: Technical Aspects and Comparisons
- Diffusion models operate on a "fill-in-the-blank" mechanism rather than sequential token generation, which can lead to more efficient data utilization [12][21].
- In terms of parameter efficiency, diffusion models can achieve similar performance with fewer parameters than autoregressive models under the same computational constraints [15][23].
- Diffusion models can keep improving with continued training, unlike autoregressive models, which plateau after several epochs [24][26].

Group 3: Future Directions and Community Engagement
- The article emphasizes the need to explore scaling laws specific to diffusion language models, which differ from those of autoregressive models [56].
- The community is encouraged to participate in developing and optimizing diffusion models, as the ecosystem is still in its infancy [56].
- Upcoming collaborations and API releases are planned to make diffusion models easier to access and integrate into various applications [51].
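The "fill-in-the-blank" decoding contrast with left-to-right generation can be illustrated with a toy sketch. Everything here is invented for the demo (the target string, the confidence rule); it is not LLaDA's actual algorithm, only the general shape of masked-diffusion decoding: all positions start masked, and each step commits the most confident slots in parallel instead of strictly left to right.

```python
MASK = "_"

def toy_model(seq):
    """Stand-in for a diffusion LM: propose (token, confidence) per masked slot.
    Confidence is higher when a neighbor is already filled, mimicking how
    committed context sharpens the remaining predictions."""
    target = "diffusion"  # pretend the model "knows" this 9-char continuation
    proposals = {}
    for i, tok in enumerate(seq):
        if tok == MASK:
            has_ctx = (i > 0 and seq[i - 1] != MASK) or \
                      (i + 1 < len(seq) and seq[i + 1] != MASK)
            proposals[i] = (target[i], 0.9 if has_ctx else 0.5)
    return proposals

def decode(length, per_step=3):
    """Iteratively unmask: commit the per_step most confident slots each round."""
    seq = [MASK] * length
    steps = 0
    while MASK in seq:
        props = toy_model(seq)
        # highest confidence first; ties broken by position
        best = sorted(props, key=lambda i: (-props[i][1], i))[:per_step]
        for i in best:
            seq[i] = props[i][0]
        steps += 1
    return "".join(seq), steps

out, steps = decode(9)
print(out, steps)  # all 9 slots filled in 3 parallel rounds, not 9 sequential ones
```

An autoregressive decoder would need 9 sequential passes here; the parallel commits are the source of the speedup the article describes, and because any masked slot can be targeted, a user (or controller) can re-mask and regenerate a span in place rather than regenerating everything after it.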
CUHK and Meituan Open-Source a "Visual Reasoning Generalist" Covering 10 Task Types Across Images and Video
量子位· 2025-12-12 01:00
Core Insights
- OneThinker is a unified multimodal visual reasoning model developed by the MMLab of The Chinese University of Hong Kong and Meituan, capable of handling ten core visual tasks across both image and video modalities [1][2][8].

Group 1: Model Capabilities
- OneThinker has demonstrated impressive performance across 31 mainstream visual tasks, showcasing its ability to generalize and make reasonable inferences on previously unseen tasks [2][28].
- The model addresses the limitations of traditional RL-trained models, which typically handle a single modality or task, by enabling unified reasoning across different tasks and modalities [4][6][8].

Group 2: Data and Training Methodology
- To build general visual reasoning capabilities, the research team constructed a comprehensive dataset, OneThinker-600k, which includes image and video modalities and covers ten core visual tasks [14][15].
- Training incorporates a new algorithm, EMA-GRPO, which improves stability and convergence speed by addressing reward-structure imbalances across different tasks [19][20].

Group 3: Experimental Results
- OneThinker scored 70.6% on the MMMU image question-answering task and 66.2% on video understanding [22][25].
- The model also excelled at tracking, achieving an AO score of 73.0 on the GOT-10k benchmark, indicating robust performance on perception-related tasks [25][27].

Group 4: Knowledge Transfer and Generalization
- OneThinker transfers and shares knowledge effectively between tasks, allowing mutual enhancement across different tasks [27][28].
- The model demonstrates zero-shot capability, adapting to new tasks like point tracking and image quality assessment, highlighting its strong generalization ability [28].
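A minimal sketch of the idea behind EMA-GRPO as described above; this is one plausible reading of "addressing reward-structure imbalances," not the paper's exact algorithm. Rewards from different tasks (e.g. 0/1 QA accuracy vs. continuous tracking scores) live on different scales, so a per-task exponential moving average of reward statistics rescales them before the usual group-relative advantage is computed:

```python
class EmaRewardNormalizer:
    """Per-task EMA of reward mean/variance for GRPO-style advantages."""

    def __init__(self, decay=0.99, eps=1e-8):
        self.decay, self.eps = decay, eps
        self.stats = {}  # task name -> (ema_mean, ema_var)

    def update(self, task, rewards):
        # batch statistics for this group of rollouts
        m = sum(rewards) / len(rewards)
        v = sum((r - m) ** 2 for r in rewards) / len(rewards)
        if task not in self.stats:
            self.stats[task] = (m, v)  # initialize EMA at first batch
        else:
            om, ov = self.stats[task]
            d = self.decay
            self.stats[task] = (d * om + (1 - d) * m, d * ov + (1 - d) * v)
        return self.stats[task]

    def advantages(self, task, rewards):
        """Group-relative advantages, scaled by the task's EMA std so that
        tasks with larger reward spread do not dominate the gradient."""
        _, var = self.update(task, rewards)
        scale = (var + self.eps) ** 0.5
        group_mean = sum(rewards) / len(rewards)
        return [(r - group_mean) / scale for r in rewards]

norm = EmaRewardNormalizer()
# hypothetical rollout groups: a QA task scored 0/1, a tracking task scored in [0, 1]
adv_qa = norm.advantages("qa", [1.0, 0.0, 1.0, 1.0])
adv_track = norm.advantages("tracking", [0.72, 0.65, 0.80, 0.70])
```

By construction each group's advantages sum to zero (standard GRPO behavior); the EMA scaling is the added ingredient that keeps differently-scored tasks comparable within one multi-task training run.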