量子位
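From "Usable" to "Easy to Use": Three Dimensions of Data Visualization. Are You Still on the First Level? Renmin University of China Proposes a New Way to Create Charts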
量子位· 2026-01-20 04:17
Core Insights
- The article discusses the evolution of data visualization from merely creating charts to addressing deeper challenges such as enhancing visual appeal and storytelling through dynamic data representation [2][9]
- It highlights the need for tools that streamline the creation of visually engaging and interactive data presentations, moving beyond traditional methods that are often labor-intensive and not easily reusable [10][12]
Group 1: Challenges in Data Visualization
- The first challenge is creating visually appealing data representations without excessive manual effort, which often leads to time-consuming work in design software [2][3][4]
- The second challenge involves animating data visualizations, where the complexity of coding and the limited flexibility of templates can deter users from implementing dynamic features [5][6]
- The third challenge is the repetitive nature of implementing interactive features across different visualization types, which often requires starting from scratch with each new project [7][8]
Group 2: Proposed Solutions
- The IDEAS Lab team has developed three key projects: PiCCL for enhancing static chart creation, CAST for simplifying animation, and Libra for improving interactive capabilities [11][12][13]
- PiCCL redefines the creation of static charts by focusing on graphic operations and constraints, allowing for more efficient and reusable designs [20][21][23]
- CAST introduces a declarative model for animation that emphasizes data-driven timing structures, making it easier to create complex animations without extensive coding [28][35][36]
Group 3: Enhancements in Interactivity
- Libra aims to treat interactivity as a first-class citizen by breaking it down into reusable components, so that complex interactions can be built without starting from scratch [39][45]
- The system supports features like undo/redo and provides a structured approach to managing interactions, making them easier to implement and maintain [42][43]
- By leveraging the capabilities of PiCCL, CAST, and Libra, the future of data visualization is expected to incorporate more efficient and user-friendly tools, potentially using large models for visualization generation [44]
The First Truly "Usable" LLM Game Agent Is Born! Real-Time, High-Frequency Decisions with a Fully Visible Chain of Thought
量子位· 2026-01-20 04:17
Core Viewpoint
- The article discusses the emergence of AI in the gaming industry, highlighting a new AI agent called COTA, developed by Chao Can Shu Technology, which demonstrates advanced decision-making and operational skills in gaming environments [1][6][55].
Group 1: AI in Gaming
- A mysterious gaming account named "快递员" ("Courier") has gained significant attention for its impressive performance in League of Legends, raising questions about the role of AI in gaming [2][4].
- The gaming industry is increasingly focusing on AI, with various companies exploring the technology to enhance gaming experiences [6][7].
- Chao Can Shu Technology has commercialized AI agents across multiple game types, showcasing its expertise in this field [8][9].
Group 2: COTA's Features and Performance
- COTA is described as a versatile gaming agent capable of cognitive reasoning, operational execution, tactical planning, and assistance, all powered by a large model [9][10].
- The agent has demonstrated professional-level performance in a first-person shooter (FPS) game demo, where it must make rapid decisions in high-stakes environments [12][13].
- COTA's design allows it to perform complex actions fluidly, simulating human-like gameplay while maintaining a high level of strategy and decision-making [28][34].
Group 3: Technical Innovations
- COTA employs a dual-system architecture that separates fast action execution from deep analysis, mimicking human cognitive processes [40][41].
- The agent uses Qwen3-VL-8B-Thinking as its base model, balancing performance and efficiency to meet the demands of real-time gaming [39].
- COTA's training pipeline includes stages for supervised fine-tuning, self-play for strategy optimization, and alignment with human preferences, which makes its gameplay more realistic [50][51][52].
Group 4: Industry Implications
- COTA represents a significant advancement in AI gaming technology, indicating a shift from experimental models to practical applications in the gaming industry [55][56].
- The success of COTA suggests a broader trend in which AI agents become integral to player experience and game design [57][59].
- The potential applications of COTA extend beyond gaming, offering insights into solving complex real-world problems through its architecture [72][76].
New Google Finding: DeepSeek's Reasoning Splits into Multiple Personalities, Getting Smarter as Its Left and Right Brains Spar
量子位· 2026-01-20 04:17
Core Insights
- The article discusses how advanced AI models like DeepSeek-R1 internally "split" into different virtual personas during problem-solving, resembling a debate or discussion among various character types [1][7][13]
- This internal dialogue enhances the model's ability to tackle complex tasks, as the conflict of perspectives leads to a more comprehensive examination of solutions [4][11]
Group 1: AI Internal Dynamics
- AI models develop distinct virtual roles, such as creative, critical, and execution-oriented personas, which contribute to diverse problem-solving approaches [8][9]
- The intensity of internal discussion increases significantly on challenging tasks and decreases on simpler ones [4][5]
Group 2: Research Methodology
- Researchers used sparse autoencoders (SAEs) to decode the AI's reasoning process, identifying the internal dialogues by analyzing the activation patterns of hidden-layer neurons [14][17]
- The study extracted and categorized the AI's thought processes during complex reasoning tasks, leading to the identification of various logical entities within the model [15][18]
Group 3: Performance Insights
- Dialogue-driven behavior occurs more frequently in reasoning models like DeepSeek-R1 than in standard instruction models, indicating a correlation between conversational dynamics and reasoning accuracy [19]
- Amplifying dialogue features, such as expressions of surprise, significantly improved the model's accuracy on arithmetic reasoning tasks, roughly doubling it from 27.1% to 54.8% [21]
Group 4: Training Implications
- The research shows that models can learn to adopt dialogue-based thinking without explicit training signals, and that reinforcement learning improves faster when multi-agent dialogue data is used [24]
- In early training stages, models fine-tuned with dialogue data outperformed those trained with monologue data by over 10%, with the gap widening to 22% in later stages [24]
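To make the sparse-autoencoder step mentioned above more concrete, here is a minimal sketch of how an SAE maps one hidden-state vector to a sparse set of feature activations and back. It illustrates the general technique only, not the researchers' code: the weights, dimensions, and function names are hypothetical, and a real SAE is trained on many activations with a sparsity penalty.

```python
def matvec(matrix, vector):
    """Multiply a row-major matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def sae_encode(hidden, w_enc, b_enc):
    """Feature activations: ReLU(W_enc @ hidden + b_enc)."""
    pre = [a + b for a, b in zip(matvec(w_enc, hidden), b_enc)]
    return [max(0.0, v) for v in pre]

def sae_decode(features, w_dec, b_dec):
    """Reconstruct the hidden state: W_dec @ features + b_dec."""
    return [a + b for a, b in zip(matvec(w_dec, features), b_dec)]

def active_features(features, threshold=0.0):
    """Indices of features that fired; these are what an analyst inspects."""
    return [i for i, v in enumerate(features) if v > threshold]

if __name__ == "__main__":
    # Hypothetical toy weights: 2-dimensional hidden state, 4 features.
    w_enc = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
    b_enc = [0.0, 0.0, 0.0, 0.0]
    w_dec = [[1.0, 0.0, -1.0, 0.0], [0.0, 1.0, 0.0, -1.0]]
    b_dec = [0.0, 0.0]

    hidden = [1.0, -2.0]
    feats = sae_encode(hidden, w_enc, b_enc)
    print("active features:", active_features(feats))
    print("reconstruction:", sae_decode(feats, w_dec, b_dec))
```

In this toy setup only two of the four features fire for the given input, which is the property that makes individual features easier to label than raw neurons.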
Zhipu's New Model Also Uses DeepSeek's MLA and Runs on an Apple M5
量子位· 2026-01-20 04:17
Core Viewpoint
- The article discusses the launch of GLM-4.7-Flash, a new lightweight language model from Zhipu AI that replaces its predecessor GLM-4.5-Flash and is available through a free API.
Group 1: Model Specifications
- GLM-4.7-Flash has 30 billion parameters in total, with only 3 billion activated during inference, significantly reducing computational cost while maintaining performance [4][10].
- The model uses a mixture-of-experts (MoE) architecture and is positioned for local programming and intelligent assistant tasks [4][9].
- It scored 59.2 on the SWE-bench Verified code repair test, outperforming similar models such as Qwen3-30B and GPT-OSS-20B [4].
Group 2: Performance and Applications
- The model is optimized for efficiency and retains the core coding and reasoning capabilities of the GLM-4 series [7].
- Besides programming, GLM-4.7-Flash is recommended for creative writing, translation, long-context tasks, and role-playing scenarios [8].
- Initial tests on an Apple laptop with 32GB of unified memory showed a speed of 43 tokens per second [17].
Group 3: Technical Innovations
- The model adopts the MLA (Multi-head Latent Attention) architecture, which was previously validated by DeepSeek-V2 [12].
- The model's depth is similar to that of GLM-4.5 Air and Qwen3-30B-A3B, but it uses 64 experts and activates only 5 during inference [13].
Group 4: Market Position and Pricing
- GLM-4.7-Flash is offered for free on the official API platform, with a high-speed version available at low cost [19].
- Compared to similar models, GLM-4.7-Flash has advantages in supported context length and output token pricing, although latency and throughput require further optimization [19].
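The "64 experts, 5 active" configuration described above can be illustrated with a small top-k routing sketch. This is a generic mixture-of-experts illustration under stated assumptions, not GLM-4.7-Flash's actual router: the toy experts, the softmax-over-selected-experts weighting, and all function names are hypothetical.

```python
import math
import random

NUM_EXPERTS = 64  # expert count quoted in the summary
TOP_K = 5         # experts activated per token, as quoted in the summary

def softmax(values):
    """Numerically stable softmax."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def route(router_logits, k=TOP_K):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    order = sorted(range(len(router_logits)),
                   key=lambda i: router_logits[i], reverse=True)
    top = order[:k]
    weights = softmax([router_logits[i] for i in top])
    return list(zip(top, weights))

def moe_forward(x, experts, router_logits, k=TOP_K):
    """Weighted sum of the selected experts' outputs; the rest are never called."""
    return sum(w * experts[i](x) for i, w in route(router_logits, k))

if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical toy experts: expert i multiplies its scalar input by (i + 1).
    experts = [lambda x, scale=i + 1: x * scale for i in range(NUM_EXPERTS)]
    logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
    print("active experts:", [i for i, _ in route(logits)])
    print("output:", moe_forward(1.0, experts, logits))
```

Because only the selected experts run, compute per token scales with the active parameters rather than the total, which is the cost saving the summary attributes to the 30-billion-total, 3-billion-active design.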
More Compute, More Revenue! OpenAI Is First to Validate a Commercial Scaling Law for AI: Latest Revenue Hits $20 Billion
量子位· 2026-01-20 01:34
Core Viewpoint
- OpenAI's revenue has increased significantly, with annual recurring revenue (ARR) rising from $2 billion to $20 billion over two years, indicating a strong growth trajectory despite high operational costs [2][12].
Revenue and Growth
- OpenAI's ARR has surged to $20 billion, a tenfold increase from 2023 to 2025, alongside a 9.5-fold increase in computing power [2][13].
- The article emphasizes the relationship between computing power and revenue: investment in computing drives research and model capabilities, which leads to higher revenue, which in turn funds further investment [9][12].
Comparison with Competitors
- OpenAI's computing power and ARR are significantly larger than those of a competitor (Claude's parent company): OpenAI grew from 0.2 GW and $2 billion in 2023 to 1.9 GW and over $20 billion in 2025 [14][17].
Operational Costs
- OpenAI's operational costs are substantial, with an estimated $7 billion spent on computing resources in 2024, primarily through cloud services from Microsoft [21][22].
- The company is also investing heavily in building its own AI data centers, indicating a long-term strategy to manage costs and expand capabilities [18][19].
Business Model and Future Plans
- OpenAI's business model is evolving: alongside subscription services and API usage, it is introducing advertising aimed at providing decision support in commercial scenarios [27][30].
- The company plans to launch its first hardware product in the second half of 2026, which is expected to feed into its revenue-computing cycle [33][34].
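The growth multiples quoted above can be checked with a few lines of arithmetic. The sketch below uses only the figures in the summary (0.2 GW and $2 billion for 2023, 1.9 GW and $20 billion for 2025); treating "over $20 billion" as exactly 20.0 is an assumption, and the function names are illustrative.

```python
# Figures quoted in the summary (compute in GW, ARR in billions of USD).
COMPUTE_GW = {2023: 0.2, 2025: 1.9}
# The 2025 ARR is given as "over $20 billion"; 20.0 is used as a floor (assumption).
ARR_BILLION_USD = {2023: 2.0, 2025: 20.0}

def growth_multiple(series, start_year, end_year):
    """How many times larger the end value is than the start value."""
    return series[end_year] / series[start_year]

def arr_per_gw(year):
    """ARR in billions of USD per gigawatt of compute for a given year."""
    return ARR_BILLION_USD[year] / COMPUTE_GW[year]

if __name__ == "__main__":
    print(f"compute growth: {growth_multiple(COMPUTE_GW, 2023, 2025):.1f}x")
    print(f"ARR growth:     {growth_multiple(ARR_BILLION_USD, 2023, 2025):.1f}x")
    for year in (2023, 2025):
        print(f"{year}: ${arr_per_gw(year):.1f}B ARR per GW")
```

Under these figures, ARR per gigawatt stays near $10 billion in both years, which is the proportionality between compute and revenue that the headline refers to.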
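Locating the "Cheating" Neural Circuit in Large Models! New Research Reveals for the First Time How Spurious Rewards Precisely Activate Memory in Layers 18-20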
量子位· 2026-01-20 01:34
Core Insights
- The article discusses "spurious rewards" in large language models (LLMs): training with false reward signals can still improve accuracy [1][2]
- It highlights the "Perplexity Paradox," in which models show decreased perplexity on answers but increased perplexity on questions, indicating a trade-off between general understanding and specific memorization [3][6]
Group 1: Key Findings
- The research team found that spurious RLVR (reinforcement learning with verifiable rewards) activates the model's internal memory shortcuts, leading to more efficient retrieval of contaminated knowledge rather than genuine learning [1][6]
- The critical memory nodes are located in layers 18-20, which serve as functional anchors for retrieving memorized answers [10][20]
- The study used several analytical methods, including Path Patching and Jensen-Shannon Divergence (JSD), to pinpoint the layers responsible for memory retrieval and structural adaptation [9][15]
Group 2: Mechanisms and Dynamics
- The research showed that the model's decision point lies at layers 18-20, where it chooses between reasoning paths and memory shortcuts [23]
- Neural ODEs allowed the team to model the continuous evolution of hidden states, confirming that separation forces peak at the critical layers [21]
- The team manipulated memory retrieval by scaling the activation of specific neurons, demonstrating a dose-dependent relationship with retrieval accuracy [25][30]
Group 3: Implications and Future Directions
- The findings provide new tools for evaluating RLVR effectiveness, suggesting that improvements may be illusory if they stem from memory activation circuits [36]
- The research opens new avenues for detecting data contamination through internal neural activation patterns, moving beyond traditional statistical methods [38]
- It proposes controllable methods for reducing reliance on contaminated knowledge without retraining the model, paving the way for new techniques in reasoning and decontamination [39]
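For readers unfamiliar with the Jensen-Shannon Divergence mentioned above, the sketch below computes it for two discrete probability distributions. It shows the metric itself only; how the researchers applied it per layer is not specified in this summary, and the per-layer distributions in the example are hypothetical.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in bits; terms where p_i is zero contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: symmetric and bounded between 0 and 1."""
    mixture = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, mixture) + 0.5 * kl_divergence(q, mixture)

if __name__ == "__main__":
    # Hypothetical next-token distributions from a base and a tuned model,
    # read out at three layers; larger values mean the two models diverge more.
    base = {17: [0.5, 0.3, 0.2], 18: [0.5, 0.3, 0.2], 19: [0.5, 0.3, 0.2]}
    tuned = {17: [0.5, 0.3, 0.2], 18: [0.2, 0.3, 0.5], 19: [0.05, 0.05, 0.9]}
    for layer in sorted(base):
        print(layer, round(js_divergence(base[layer], tuned[layer]), 4))
```

Unlike plain KL divergence, JSD is symmetric and stays finite when one distribution assigns zero probability to a token, which makes it convenient for layer-by-layer comparisons.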
ChatGPT Pushes Ahead with Ads Because OpenAI Really Burns Too Much Cash
量子位· 2026-01-19 09:30
Core Viewpoint
- OpenAI is facing a financial crisis, prompting the introduction of advertising in ChatGPT as a way to generate revenue and avoid bankruptcy [7][15][51].
Financial Situation
- OpenAI is projected to run out of funds within 18 months, with reports indicating a possible acquisition by a larger company such as Microsoft or Amazon [7][15].
- The company raised $40 billion in funding last year, but its expenses are significantly higher, with the projected annual burn rate exceeding $8 billion in 2025 and reaching $40 billion by 2028 [10][13].
- OpenAI's revenue for the previous year was only $20 billion, highlighting a substantial gap between income and expenditure [15].
- The AI industry is estimated to have a funding shortfall of at least $800 billion, exacerbating OpenAI's financial challenges [15][16].
Advertising Strategy
- OpenAI plans to test advertising in the free version of ChatGPT, expanding its subscription-based revenue model to include advertising income [26][28].
- The ads will be labeled as "sponsored content" and will not affect the objectivity of ChatGPT's responses [27][29].
- OpenAI anticipates generating "low billions" in advertising revenue by 2026, with plans to scale this income source over time [22][28].
Business Model Expansion
- The introduction of advertising is part of OpenAI's broader commercial strategy, which includes subscription services and API usage-based billing [25][41].
- OpenAI's CFO emphasized that the business model should expand in line with the value created by its intelligence [36].
- Future revenue growth is expected to come from various sources, including subscriptions, API usage, and potential new pricing models as AI technology evolves [41].
User Engagement and Growth
- OpenAI's weekly and daily active user metrics are at all-time highs, driven by a cycle of investment in computing power, research, and product development [43][44].
- The company's computing power is increasing 9.5-fold from 2023 to 2025, with revenue growth projected to match this increase [46][55].
The End Game of AI Is Electricians (doge)
量子位· 2026-01-19 09:30
Group 1
- The article highlights the increasing demand for electricians in the AI era. The U.S. faces an estimated annual shortage of about 81,000 electricians from 2024 to 2034, and employment of electricians is projected to grow 9% over the next decade, significantly faster than the average for all occupations [2][3][4]
- The surge in job openings is driven primarily by data centers, which are creating substantial demand for electricians and other blue-collar workers, including plumbers and HVAC technicians [6][8]
- Major tech companies are significantly increasing their hiring in the energy sector: recruitment rose 34% year-on-year in 2024 and remained high into 2025, and overall hiring in this sector is approximately 30% higher than before the release of ChatGPT in 2022 [9][10][11][12]
Group 2
- The shortage of electricians is exacerbated by a long-standing lack of skilled labor in the construction industry. Many young people have been encouraged to pursue white-collar jobs instead of trades, leaving a gap as experienced workers retire [24][28][30]
- Training for electricians is becoming more rigorous, and companies prefer to hire fully trained workers rather than apprentices, which further widens the supply-demand imbalance [34][36]
- Tech companies like Google are addressing the shortage by funding training programs that upskill existing electricians and train new apprentices, aiming to increase the overall workforce by about 70% by 2030 [36][37]
Group 3
- The article discusses the critical energy demands of AI and data centers, emphasizing that electricity supply is becoming a more pressing constraint than chip supply, with predictions that China's electricity output will reach three times that of the U.S. by 2026 [40][50]
- The future of AI development is increasingly tied to energy availability, including infrastructure such as transformers and cooling systems, which requires a collective effort across the industry [48][49]
ChatGPT Pushes Ahead with Ads Because OpenAI Really Burns a Lot of Cash
量子位· 2026-01-19 07:00
Core Viewpoint
- OpenAI is facing a financial crisis, prompting the introduction of advertising in ChatGPT as a potential solution to generate revenue and avoid bankruptcy [7][15][51].
Financial Situation
- OpenAI is projected to run out of funds within 18 months, with reports indicating a possible acquisition by larger companies like Microsoft or Amazon [7][15].
- The company raised a record $40 billion in funding last year, but its expenses are significantly high, with projected annual burn rates exceeding $8 billion in 2025 and reaching $40 billion by 2028 [10][13].
- OpenAI's revenue for the previous year was only $20 billion, highlighting a substantial financial gap compared to its expenditures [15].
- The AI industry is estimated to have an $800 billion funding shortfall, exacerbating OpenAI's financial challenges [15][16].
Advertising Strategy
- OpenAI plans to test advertising in the free version of ChatGPT, marking a shift from a subscription-based revenue model to include advertising income [26][28].
- The ads will be labeled as "sponsored content" and will not affect the objectivity of ChatGPT's responses [27][29].
- OpenAI anticipates generating "low billions" in revenue from advertising by 2026, with plans to scale this income source over time [22][28].
Business Model Expansion
- The introduction of advertising is part of OpenAI's broader commercial strategy, which includes subscription services and API usage-based billing [25][41].
- OpenAI's CFO emphasized that the business model should expand in line with the value created by its intelligence [36].
- Future revenue growth is expected to come from various sources, including subscriptions, API usage, and potential new pricing models as AI technology advances [41][42].
User Engagement and Growth
- OpenAI's weekly and daily active user metrics are at all-time highs, driven by a cycle of investment in computing power, research, and product development [43][44].
- The company expects a 9.5-fold increase in computing power from 2023 to 2025, with revenue growth projected to match this increase [46][55].
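A Humanoid Robot Dark Horse from the Harbin Institute of Technology Circle: Less Than a Year Old, Full-Stack Open-Source 3 m/s Prototype, Backed by Xiaomi and SenseTime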
量子位· 2026-01-19 07:00
Core Viewpoint
- Roboparty has launched a fully open-source bipedal humanoid robot prototype, "Roboto_Original," aiming to revolutionize the humanoid robot development industry through collaborative innovation and shared resources [2][10].
Group 1: Open Source Initiative
- The open-source release includes not only software code but also hardware schematics, EBOM material lists, supplier information, and a comprehensive knowledge base to facilitate development [5][10].
- The goal is to create a reproducible, verifiable, and modifiable open-source framework, addressing the industry's long-standing pain points of high development barriers and lack of standardization [6][9][10].
Group 2: Technical Specifications
- The "Roboto_Original" prototype has a running speed of up to 3 m/s, positioning it among the leading open-source humanoid robots globally [4][24].
- The robot's hardware features a height of 1.2m and a weight of 30kg, with detailed design documents available to lower the barriers for hardware development and replication [12][14].
Group 3: Software and Control
- The project has released full control code covering core modules for imitation, perception, and navigation, allowing developers to leverage extensive motion capture data [16].
- The AMP control algorithm enhances the robot's walking and running capabilities, ensuring natural movement and stability, which is crucial for real-world applications [26][27].
Group 4: Engineering and Collaboration
- Roboparty has established a knowledge base for hands-on learning in humanoid robotics, focusing on practical issues like walking stability and production costs [21][36].
- The initiative aims to shift the industry from isolated trial-and-error approaches to collaborative breakthroughs, fostering a community-driven development environment [22][30].
Group 5: Industry Impact and Funding
- The project has secured millions in seed funding from notable investors, indicating strong market interest and validation of its technological approach [29].
- Roboparty aims to reduce development costs by 80%, making humanoid robotics more accessible and scalable across various industries [32][31].