Zuckerberg's sky-high offers hit a new record: $1 billion! But not one member of this ex-OpenAI team is tempted
量子位· 2025-07-30 00:24
Core Viewpoint
- Mark Zuckerberg is attempting to recruit members of Thinking Machines, a company staffed largely by former OpenAI employees, with substantial compensation packages, but every targeted individual has turned him down [1][3][4].

Recruitment Efforts
- Zuckerberg has offered packages ranging from $200 million to $500 million, with some exceeding $1 billion over multiple years, aiming to recruit about 25% of Thinking Machines' roughly 50 employees [2][4].
- Despite the lucrative offers, no Thinking Machines employee has agreed to join Meta [3][4].

Company Valuation and Funding
- Thinking Machines recently closed a $2 billion seed round, the largest seed round in history, at a valuation of $10 billion [9].
- The company had initially targeted $1 billion in funding, a figure that doubled within a few months [9].

Employee Movement
- While Thinking Machines employees have declined Meta's offers, Meta has successfully recruited key personnel from Apple, including Bowen Zhang, a prominent multimodal AI researcher [13][16].
- Zhang is the fourth Apple employee to join Meta within a month, indicating a notable migration of talent from Apple to Meta [16].

Strategic Adjustments
- Meta is reportedly considering a shift in its AI strategy, potentially moving away from open-source models and restructuring its AI department with significant financial investment [19][20].
- The company is exploring AI agents capable of executing step-by-step tasks, similar to OpenAI's models [21].

Financial Performance
- Meta's second-quarter earnings report showed 11.5% profit growth, the slowest in two years, with operating costs up 9% due to AI investment [19].
- Despite these pressures, Meta's stock is up more than 20% this year, reflecting investor support for Zuckerberg's strategic changes [22].
Major ChatGPT update launches Study Mode! "Overnight, another 1,000 wrapper apps died"
量子位· 2025-07-30 00:24
Core Viewpoint
- OpenAI has launched a new "Study Mode" for ChatGPT, designed to enhance learning by guiding users through problem-solving rather than simply providing answers [1][2].

Summary by Sections

Introduction of Study Mode
- Study Mode is now available to Free, Plus, Pro, and Team users, with ChatGPT Edu users gaining access in the coming weeks [2].

Educational Impact
- Leah Belsky, OpenAI's VP of Education, emphasizes that using ChatGPT as a teaching aid can significantly improve student learning outcomes, while using it merely as an "answer machine" may hinder critical thinking [4].
- Roughly one-third of college students already use ChatGPT to assist with their studies, raising concerns among educators and parents about academic dishonesty [4].

Learning Mode Features
- Study Mode does not give direct answers; instead, it poses guiding questions that push users to reason through problems and summarize concepts in their own words (a prompt-level approximation is sketched after this summary) [12][15].
- The design was developed in collaboration with educators and teaching-methodology experts, drawing on long-term research in learning science [15].

Interactive Learning
- Interactive questioning promotes active learning through Socratic questioning and self-reflection prompts [16].
- Scaffolded responses organize information into digestible parts and highlight key connections between topics [16].
- Knowledge checks via quizzes and open-ended questions provide personalized feedback to support retention [17].

Customization and Flexibility
- Study Mode adapts to the user's skill level and past interactions, breaking complex information into manageable modules while preserving context [18].
- Users can toggle Study Mode on or off depending on their learning objectives [19].

Future Developments
- OpenAI describes the current Study Mode as a first step, with plans to refine it based on real student feedback and to add clearer visual representations of complex concepts [23][24].
- Future improvements may include cross-conversation goal setting and deeper personalization for individual students [24].

Strategic Intent
- OpenAI CEO Sam Altman has expressed skepticism about traditional education, suggesting a potential shift in educational paradigms over the next 18 years [26][28].
- This signals a strategic intent to fundamentally reshape future educational models through AI [28].
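Study Mode is a feature of the ChatGPT product, not a documented API parameter, but the guided behavior described above can be approximated over the standard Chat Completions API with a tutoring-style system prompt. The sketch below is a hypothetical approximation, not OpenAI's implementation; the prompt wording and the `study_turn` helper are assumptions.

```python
# Hypothetical sketch: approximating Study-Mode-like tutoring with a system
# prompt over the standard Chat Completions API. Not OpenAI's implementation.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

STUDY_MODE_PROMPT = (
    "You are a tutor. Never give the final answer directly. "
    "Ask one guiding question at a time, break the problem into steps, "
    "and ask the student to restate each concept in their own words "
    "before moving on. End with a short knowledge-check question."
)

def study_turn(student_message: str, history: list[dict]) -> str:
    """One tutoring turn: prepend the tutoring prompt and return the model's reply."""
    messages = [{"role": "system", "content": STUDY_MODE_PROMPT}] + history
    messages.append({"role": "user", "content": student_message})
    resp = client.chat.completions.create(model="gpt-4o", messages=messages)
    return resp.choices[0].message.content

# Example: the reply should be a guiding question, not the answer itself.
# print(study_turn("Why does the integral of 1/x equal ln|x|?", []))
```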
The first fully open-source enterprise-grade intelligent agent! JD Cloud flattens the barrier to building Agents
量子位· 2025-07-29 07:07
Core Viewpoint
- JoyAgent is billed as the first 100% open-source enterprise-level intelligent agent, giving developers complete capabilities that can be deployed locally without additional development [2][10][46].

Group 1: Product Features
- JoyAgent ships with front end, back end, framework, engine, and core sub-agents, allowing out-of-the-box deployment [2][10].
- It achieved 75.15% validation accuracy on the GAIA leaderboard, demonstrating competitive performance [4].
- The agent supports multiple sub-agents and tools, improving extensibility and customization [16][28].

Group 2: Technical Innovations
- JoyAgent uses a dual-level planning architecture that separates task planning from execution to handle complex problems effectively (a minimal sketch of this pattern follows below) [34][35].
- It features a mixed context management system that preserves information integrity and supports communication between tasks [36].
- The agent can dynamically generate specialized "digital employee" roles based on task requirements, improving adaptability [38].

Group 3: Market Position and Impact
- JoyAgent addresses the difficulties of deploying AI in enterprise environments, such as the need to understand industry-specific terminology precisely and to integrate with legacy systems [44][45].
- Its open-source nature lets companies replicate its capabilities without high costs, encouraging developer innovation [47].
- The agent has been tested in real-world scenarios, demonstrating reliability and producing actionable insights for business processes [42][46].
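The dual-level planning architecture can be illustrated with a minimal planner/executor loop. The sketch below is not JoyAgent's actual code; the class names, the three-step plan, and the shared-context dictionary are hypothetical placeholders for the pattern of separating task planning from execution while sharing context between sub-tasks.

```python
# Illustrative sketch only: a planner decomposes the goal, a separate executor
# runs each sub-task, and results accumulate in a shared ("mixed") context.
from dataclasses import dataclass

@dataclass
class SubTask:
    description: str
    result: str | None = None

class Planner:
    def plan(self, goal: str) -> list[SubTask]:
        # In a real agent this would call an LLM to decompose the goal.
        return [SubTask(f"step {i} of: {goal}") for i in range(1, 4)]

class Executor:
    def run(self, task: SubTask, shared_context: dict) -> str:
        # A dedicated sub-agent or tool call would go here.
        return f"executed: {task.description}"

def solve(goal: str) -> dict:
    planner, executor = Planner(), Executor()
    shared_context: dict = {}  # shared across sub-tasks so later steps see earlier results
    for task in planner.plan(goal):
        task.result = executor.run(task, shared_context)
        shared_context[task.description] = task.result
    return shared_context

print(solve("summarize last quarter's sales and draft a report"))
```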
Autoregressive models storm back into image generation! Pixel-level precise control, more efficient and controllable than Diffusion
量子位· 2025-07-29 05:05
Core Viewpoint
- The article discusses the limitations of Diffusion models in AI image generation, particularly around precise control, and introduces MENTOR, a new framework that uses autoregressive (AR) models for more efficient and controllable multimodal image generation [1][2][3].

Group 1: Challenges in Current Models
- Diffusion models struggle with precise visual control, balancing multimodal inputs, and high training costs [2][6].
- Their inherent randomness makes precise control difficult in high-fidelity tasks such as image reconstruction [6].
- Existing methods often show modality imbalance, over-relying on either the reference image or the text instruction [6].

Group 2: Introduction of MENTOR
- MENTOR is a novel AR framework that, with only one-tenth of the training data and suboptimal model components, outperforms Diffusion-based methods such as Emu2 and DreamEngine [2][3].
- It uses a two-stage training method to achieve efficient multimodal image generation with pixel-level precision (a conceptual sketch follows this summary) [3][8].

Group 3: MENTOR's Design and Training
- MENTOR has a unified AR architecture consisting of a multimodal encoder and an autoregressive generator, allowing token-level alignment between inputs and outputs [9].
- The two-stage strategy comprises:
  1. Multimodal alignment pretraining, which focuses on understanding different input types and establishing pixel-level and semantic alignment [10].
  2. Multimodal instruction tuning, which strengthens instruction following and cross-modal reasoning [12].

Group 4: Performance and Efficiency
- MENTOR achieved competitive results on DreamBench++, surpassing larger models such as Emu2 (37B parameters) and DreamEngine (10.5B parameters) while maintaining a lower CP/PF ratio, indicating a better balance between preserving visual features and following prompts [15][17].
- Training used roughly 3 million image-text pairs over 1.5 days, markedly more efficient than other baselines [18].

Group 5: Applications and Future Potential
- The framework is versatile and can handle various complex multimodal generation tasks with minimal adjustment [24].
- The article concludes that MENTOR opens a new path for controllable image generation and showcases the potential of AR models for visual generation, while acknowledging that it still lags top-tier Diffusion models in some areas [26].
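To make the unified AR design more concrete, here is a hedged PyTorch sketch of an encoder-conditioned autoregressive generator trained with a shared next-token objective across both stages. All module names, dimensions, and the `MentorLikeModel`/`train_step` helpers are assumptions for illustration, not the released MENTOR code; the two stages are assumed to differ mainly in their data mix, as the summary describes.

```python
# Conceptual sketch: a multimodal condition sequence feeds an autoregressive
# decoder that predicts discrete image tokens with a next-token loss.
import torch
import torch.nn as nn

class MentorLikeModel(nn.Module):
    def __init__(self, vocab_size=16384, dim=512, n_layers=8, n_heads=8):
        super().__init__()
        self.token_embed = nn.Embedding(vocab_size, dim)  # text + image tokens share one space
        layer = nn.TransformerDecoderLayer(dim, n_heads, batch_first=True)
        self.generator = nn.TransformerDecoder(layer, n_layers)
        self.head = nn.Linear(dim, vocab_size)

    def forward(self, cond_tokens, image_tokens):
        # cond_tokens: encoded multimodal condition (text + reference-image tokens)
        # image_tokens: target image tokens, predicted autoregressively
        tgt = self.token_embed(image_tokens)
        memory = self.token_embed(cond_tokens)
        mask = nn.Transformer.generate_square_subsequent_mask(tgt.size(1))
        hidden = self.generator(tgt, memory, tgt_mask=mask)
        return self.head(hidden)

def train_step(model, batch, stage: str, optimizer):
    """stage == 'align' (alignment pretraining) or 'tune' (instruction tuning);
    both share the same next-token objective and differ in the data they see."""
    logits = model(batch["cond"], batch["image"][:, :-1])
    loss = nn.functional.cross_entropy(
        logits.reshape(-1, logits.size(-1)), batch["image"][:, 1:].reshape(-1)
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Smoke test with random token ids.
model = MentorLikeModel()
cond = torch.randint(0, 16384, (1, 20))
img = torch.randint(0, 16384, (1, 32))
print(model(cond, img).shape)  # (1, 32, vocab_size)
```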
NVIDIA's new open-source model: triple the throughput, runs on a single GPU, and takes the reasoning SOTA
量子位· 2025-07-29 05:05
Core Viewpoint
- NVIDIA has launched Llama Nemotron Super v1.5, an open-source model designed for complex reasoning and agent tasks. It achieves state-of-the-art reasoning performance, triples throughput compared with its predecessor, and runs efficiently on a single GPU [2][11].

Model Introduction
- Llama Nemotron Super v1.5 is an upgrade of Llama-3.3-Nemotron-Super-49B-V1, tailored for complex reasoning and intelligent-agent tasks [3].

Model Architecture
- The model uses Neural Architecture Search (NAS) to balance accuracy and efficiency, converting throughput gains into lower operating costs [4].
- NAS produces non-standard, non-repeating network modules that introduce two key changes relative to a standard Transformer (a hedged illustration follows this summary):
  - A skip-attention mechanism that bypasses the attention layer in certain blocks [6].
  - Variable feed-forward networks (FFN), where different blocks use different expansion/compression ratios [7].

Efficiency Improvements
- Skipping attention or changing FFN widths reduces FLOPs, allowing more efficient operation under resource constraints [8].
- A block-wise distillation process was applied to the original Llama model, constructing multiple variants of each block and searching for the optimal combination [9].

Training and Dataset
- The model was trained on 40 billion tokens drawn from three datasets (FineWeb, Buzz-V1.2, and Dolma), focusing on English single-turn and multi-turn conversation [10].
- Post-training combined supervised fine-tuning and reinforcement learning to strengthen key capabilities such as coding, mathematics, reasoning, and instruction following [10].

Deployment and Ecosystem
- NVIDIA's AI models are optimized for NVIDIA GPU-accelerated systems, delivering significant speedups over CPU-only solutions [12].
- Llama Nemotron Super v1.5 is open-source and available to developers on build.nvidia.com or via Hugging Face [13].

Ecosystem and Model Series
- The Llama Nemotron ecosystem integrates large language models, training and inference frameworks, optimization tools, and enterprise deployment solutions for high-performance AI application development [14].
- NVIDIA offers three model series (Nano, Super, and Ultra) for different deployment needs and user profiles [16].
- The Super series, including Llama Nemotron Super v1.5, balances accuracy and computational efficiency for single-GPU use [17].

Enterprise Support
- The Nemotron models have gained support from major enterprises such as SAP, Microsoft, and Deloitte for building AI agent platforms aimed at enterprise-level process automation and complex problem solving [17].
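The two NAS-derived changes (skip attention and variable FFN ratios) can be sketched as a configurable Transformer block. This is a hedged illustration, not NVIDIA's implementation; the `NASBlock` class, the per-block ratios, and the toy three-block stack below are made up for clarity.

```python
# Hedged illustration of the skip-attention and variable-FFN ideas described above.
import torch
import torch.nn as nn

class NASBlock(nn.Module):
    def __init__(self, dim: int, n_heads: int, ffn_ratio: float, skip_attention: bool):
        super().__init__()
        self.skip_attention = skip_attention
        if not skip_attention:
            self.norm1 = nn.LayerNorm(dim)
            self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        hidden = int(dim * ffn_ratio)  # ratio varies per block instead of a fixed 4x
        self.norm2 = nn.LayerNorm(dim)
        self.ffn = nn.Sequential(nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim))

    def forward(self, x):
        if not self.skip_attention:  # some searched blocks drop attention entirely
            h = self.norm1(x)
            x = x + self.attn(h, h, h, need_weights=False)[0]
        return x + self.ffn(self.norm2(x))

# A toy "searched" stack: one block skips attention, FFN widths differ per block.
blocks = nn.Sequential(
    NASBlock(256, 8, ffn_ratio=4.0, skip_attention=False),
    NASBlock(256, 8, ffn_ratio=1.5, skip_attention=True),
    NASBlock(256, 8, ffn_ratio=2.5, skip_attention=False),
)
out = blocks(torch.randn(1, 16, 256))  # (batch, seq, dim)
print(out.shape)
```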
Even a single consumer GPU can join large-model training! Infinigence AI (无问芯穹) uses "three boxes" to open a path for AI efficiency gains from 100,000 GPUs down to one card
量子位· 2025-07-29 05:05
Hengyu, reporting from WAIC. 量子位 | 公众号 QbitAI

"The scale of the intelligent era, pulled by both computing resources and intelligence efficiency, is compressing and spreading at breakneck speed. Two years ago we marveled at GPT-3.5, trained on clusters of several thousand GPUs; today, a phone can hold a small AI of equivalent capability."

So said Xia Lixue, co-founder and CEO of Infinigence AI (无问芯穹), at WAIC 2025. On behalf of Infinigence AI, he also presented the company's latest answer to the hard problem of deploying AI: three boxes that open a path for AI efficiency gains from 100,000 GPUs down to a single card.

Yes, just three boxes. In Infinigence AI's view, behind these three boxes sits a complete intelligent-infrastructure design for the future.

What are the three boxes? The "three boxes" are the three core products of Infinigence AI's full-scale AI efficiency solution:

Big box: 无穹AI云
Middle box: 无界智算平台
Small box: 无垠终端智能

Together they form a hardware-software co-designed system built for future intelligent infrastructure, covering scenarios of every scale from cloud to edge, supporting multiple kinds of heterogeneous compute, and linking the full pipeline from model scheduling and performance optimization to application deployment.

Taking them one by one:

Big box: 无穹AI云. The big box, Infinigence AI's 无穹AI云, is an intelligent-computing network at the 10,000-to-100,000 GPU scale, offering a systematic solution for utilizing ultra-large compute clusters. Xia Lixue revealed on site that the 无界智算平台 has already been deployed in more than 100 ...
The world's strongest at predicting solar magnetic storms! The first full-chain space-weather AI forecasting model debuts at WAIC
量子位· 2025-07-29 05:05
Yunzhong, reporting from Aofeisi. 量子位 | 公众号 QbitAI

In the few minutes it takes a communications satellite to pass overhead at the first cosmic velocity, millions of people are connecting to the world through the network it helps build, and in fact there are tens of thousands of such satellites.

While we enjoy convenient satellite network services, on the other side of the network a system called "风云太空" (Fengyun Space) quietly sends a warning to the satellites serving us: a shock from a solar eruption will arrive in roughly 24 hours ......

After receiving the warning, ground operations teams activate contingency plans, respond calmly when the solar storm arrives, and defuse the space-weather crisis.

This scene is a microcosm of China's space-weather forecasting capability moving toward intelligence, and the core technology behind it is the protagonist of this article: the "风宇" (Fengyu) model.

Wang Jinsong, director of the National Satellite Meteorological Center (National Center for Space Weather Monitoring and Early Warning), believes that the successful development of the Fengyu model gives space-weather forecasting a three-pillar structure of physical models, numerical forecasting, and artificial intelligence, significantly raising China's space-weather forecasting capability. He describes it as the world's first full-chain space-weather AI forecasting model.

The world's first full-chain space-weather AI forecasting model

The Sun is currently in a period of high activity, and random events such as prominence eruptions act like invisible "cosmic tsunamis," constantly threatening satellites in orbit and ...
Racking up orders from star large-model companies, a Tsinghua-affiliated HPC-AI Infra company comes into view
量子位· 2025-07-29 05:05
Mingmin, reporting from Aofeisi. 量子位 | 公众号 QbitAI

Without hoarding compute, it has won orders from several star large-model companies. A Tsinghua-affiliated computing startup helmed by a founder born in 1993 is a bit of a surprise.

In the first half of 2023, the "battle of a hundred models" kicked off and demand for model pre-training exploded. Amid compute anxiety, stockpiling compute became the herd move, and more abundant compute almost directly translated into larger orders. 1 billion, even 5 billion yuan: the temptation was enormous.

Standing at the very center of the storm, founder Yan Bowen did not do this. From a technical standpoint, he knew that compute capacity would inevitably sit idle in the future, and frantically hoarding compute did not seem to make sense for a technology company.

And judging by the results, it has not stopped him from winning big orders. Baidu, Kimi, and Shengshu Technology (生数科技, a top player in video generation), among others, have all chosen to work with his company.

So, why?

Three-time winner of the Gordon Bell Prize

是石科技 was founded in 2021. The team was incubated from the National Supercomputing Center in Wuxi and was among the earliest in China to commercialize parallel-optimization technology spanning supercomputing and AI computing.

Founder and chairman Yan Bowen, born in 1993, graduated from Tsinghua University and completed a postdoc in Tsinghua's Department of Computer Science; his main research areas include computer application technology, high-performance computing, and parallel optimization.

During his doctoral studies, Yan Bowen took part in projects at the National Supercomputing Center in Wuxi, chiefly porting a complete CFD algorithm to the domestically developed supercomputer "神威·太湖之光" (Sunway TaihuLight).

"神威·太湖之光" ...
AI overhauls laser-welding inspection! "Overkill" rate plunges 50%, already running on leading international clients' production lines
量子位· 2025-07-29 05:05
Core Viewpoint
- The precision manufacturing industry is well suited to AI transformation, particularly in the context of advanced manufacturing and quality control [1].

Group 1: AI Integration in Precision Manufacturing
- A laser-welding online inspection system showcased at the International Supply Chain Expo cut the "overkill" rate (good parts misjudged as defective) by 50% using AI-based deep-learning detection models, significantly improving detection accuracy and production efficiency [2].
- The system, developed by Guangzhou Deqing Optical Technology, integrates proprietary AI and optoelectronic detection technologies, including AI parameter tuning, AI overkill reduction, and AI fault diagnosis, and is already in use by leading international consumer-electronics clients [3].

Group 2: Challenges in Traditional Detection Methods
- Traditional optical inspection compares the current welding signal against established benchmarks, and its reliance on human experience can cause acceptable products to be misclassified as defective [7].
- These limitations motivate the adoption of AI techniques to improve detection accuracy and efficiency [8].

Group 3: Advancements in AI Detection Algorithms
- Deqing Optical developed an end-to-end deep-learning model to improve defect-detection accuracy, built on a high-quality dataset generated from real production-line data (a hedged sketch of such a classifier follows this summary) [8][9].
- The AI model draws on several algorithm families, including CNN, RNN, random forests (RF), and SVM, and has been validated over multiple training rounds to confirm its effectiveness in real-world deployments [9].

Group 4: Future Developments and Applications
- The AI integration in laser-welding inspection will expand to additional functions such as AI regression of physical parameters and defect classification, enabling comprehensive monitoring across all stages of the welding process [11].
- Deqing Optical aims to further improve its AI inspection systems and apply digital-twin technology for real-time optimization of process parameters, with more than 5,000 devices currently in operation worldwide [12].
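The summary does not disclose Deqing Optical's architecture, so the following is only a hedged sketch of what an end-to-end weld-signal classifier of the kind described might look like: a small 1D CNN over fixed-length signal windows. The input channels, window length, class set, and the `WeldSignalClassifier` name are assumptions.

```python
# Illustrative sketch under stated assumptions; not the vendor's model.
import torch
import torch.nn as nn

class WeldSignalClassifier(nn.Module):
    """1D CNN over a fixed-length window of optical/acoustic weld-monitoring signals."""
    def __init__(self, in_channels=3, n_classes=2, window=1024):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(in_channels, 32, kernel_size=7, padding=3), nn.ReLU(), nn.MaxPool1d(4),
            nn.Conv1d(32, 64, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool1d(4),
            nn.AdaptiveAvgPool1d(1),
        )
        self.classifier = nn.Linear(64, n_classes)  # e.g. {OK, defect}

    def forward(self, x):  # x: (batch, channels, window)
        return self.classifier(self.features(x).squeeze(-1))

model = WeldSignalClassifier()
logits = model(torch.randn(8, 3, 1024))
# Reducing "overkill" amounts to cutting false positives on the OK class,
# e.g. by tuning the decision threshold on the defect probability.
probs = torch.softmax(logits, dim=-1)
print(probs.shape)  # (8, 2)
```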
An AI science assistant for everyone! The world's first general scientific agent arrives, with web-wide resources plus 170 million academic papers sending research efficiency soaring
量子位· 2025-07-29 03:43
Core Viewpoint
- The article introduces SciMaster, billed as the world's first general scientific intelligence agent, developed by Shanghai Jiao Tong University and DeepMind Technology; it serves as an expert-level research assistant for scientific inquiries and everyday problems [1][42].

Group 1: Features and Capabilities
- SciMaster integrates internet-wide resources and 170 million scientific documents to help users overcome research challenges [2].
- It offers two modes: a "general assistant" mode for quick insights and a "deep research" mode that produces comprehensive reports with references and links [22][25].
- The tool automatically matches and invokes appropriate scientific tools based on the user's query, extending its functionality [28].

Group 2: Research and Application
- SciMaster's core capability is expert-level deep research, built on the Innovator model with multimodal capabilities [5].
- It conducts extensive searches across the internet and the scientific literature, using methods such as WebSearch, WebParse, and PaperSearch to gather relevant data (a minimal dispatch sketch follows this summary) [7][14].
- It has demonstrated the ability to refine its search strategy based on initial results, leading to more relevant findings [10][15].

Group 3: Industry Impact and Future Prospects
- SciMaster aims to reshape the research paradigm in universities, moving beyond traditional teaching and research methods [45].
- The collaboration between DeepMind Technology and various universities is expected to foster innovation and broaden the application of AI in scientific research [44][46].
- The ultimate goal is for SciMaster to become a leading platform in the AI for Science (AI4S) field, analogous to Hugging Face in its domain [47][48].
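The tool names WebSearch, WebParse, and PaperSearch come from the summary; how SciMaster actually routes between them is not public. The sketch below is a minimal, hypothetical dispatch loop in which the stub functions and the `deep_research` flag stand in for the real retrieval back ends.

```python
# Minimal, hypothetical tool-dispatch sketch; not SciMaster's implementation.
from typing import Callable

def web_search(query: str) -> list[str]:
    return [f"web result for: {query}"]    # stand-in for a real search API

def web_parse(url: str) -> str:
    return f"parsed content of {url}"      # stand-in for a page parser

def paper_search(query: str) -> list[str]:
    return [f"paper matching: {query}"]    # stand-in for literature retrieval

TOOLS: dict[str, Callable] = {
    "WebSearch": web_search,
    "WebParse": web_parse,
    "PaperSearch": paper_search,
}

def answer(question: str, deep_research: bool = False) -> str:
    """Route a question through the tools; deep_research adds literature retrieval."""
    evidence = TOOLS["WebSearch"](question)
    if deep_research:
        evidence += TOOLS["PaperSearch"](question)
    # A real agent would let the model refine queries based on earlier results.
    return " | ".join(evidence)

print(answer("mechanisms of perovskite solar cell degradation", deep_research=True))
```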