Fei-Fei Li Releases New World Model Results: One Prompt Generates Unlimited 3D Worlds
量子位· 2025-09-17 01:42
Core Viewpoint - The article covers the latest advance in 3D world generation from World Labs, Fei-Fei Li's startup: a model that creates expansive, customizable, and consistent 3D environments that can be navigated and explored indefinitely [1][3][27].
Group 1: Model Capabilities
- The new model generates persistent, navigable, and customizable 3D worlds, and multiple independently generated scenes can be stitched into larger virtual environments [3][25].
- Users can export generated worlds as Gaussian point clouds for downstream projects, using the open-source Spark rendering library, which integrates with Three.js for web-based 3D experiences [8][12].
- The model supports free-viewpoint roaming in a coherent 3D world, so users can explore hidden spaces beyond their initial perspective [13][14].
Group 2: Visual Style and Diversity
- The model generates a wide range of visual styles, from cartoonish to realistic, letting creators iterate freely on the look of their 3D environments [15][17].
- Users can explore and adjust styles until they find the virtual world that best suits their needs [16][18].
Group 3: Scale and Exploration
- Users can build larger virtual worlds by combining generated scenes, much like assembling a puzzle, which broadens the potential applications of these environments [19][24].
- Generated worlds are designed to be permanently accessible: users can create links and save their work without time limits, a significant advantage over competitors such as Google's Genie [28][29].
Even Beginners Can Master AI Video! Jimeng Agent Mode Hands-On: One Sentence for Illustrations, Posters, and Vlogs
量子位· 2025-09-16 09:04
Core Viewpoint - The article covers the launch of Jimeng AI's new Agent mode, which simplifies generating images and videos from text prompts and makes the process accessible to users with no prior experience of AI tools [3][53].
Group 1: Features of Agent Mode
- Agent mode accepts complex instructions in a single line, streamlining the creation of images and videos [3][53].
- The mode includes a smart multi-frame feature that generates multiple continuous images and automatically connects them into a complete video [9][48].
- Users can create a series of images that tell a complete story, widening the creative possibilities [6][48].
Group 2: User Experience and Efficiency
- In one test, a prompt asking for illustrations of iconic Chinese landmarks produced a finished video in under three minutes [12].
- The system adapts to the prompt, automatically adjusting format and style, for example generating a vertical layout for mobile display [13].
- A single command can generate up to 40 images or 8 videos at once, significantly increasing productivity [39].
Group 3: Technical Advancements
- Agent mode is powered by the Seedream 4.0 model, which has surpassed Google's Nano Banana in both text-to-image and image-editing capabilities [49][51].
- The new model supports 4K resolution, which previous versions did not, improving the quality of generated content [52].
- Integrating functions such as image editing and sequence generation makes for a more cohesive creative process [51].
Google DeepMind: An Economic Layer Where AI Creates Value Independently Is Taking Shape
量子位· 2025-09-16 05:58
Core Viewpoint - The emergence of AI Agents is giving rise to a new economic layer, referred to as the "Sandbox Economy," in which these agents can operate at scales and speeds beyond human oversight [1][2].
Group 1: Characteristics of the New Economy
- The new economic system can be characterized along two dimensions: its origin (spontaneous emergence vs. human design) and its degree of separation from the existing human economy (permeable vs. impermeable) [3].
- The rapid proliferation of AI Agents points to the formation of an independent economic layer that creates value autonomously [2][4].
Group 2: Applications of AI Agents
- AI Agents are being applied in several fields:
  - **Scientific Research**: An AI named Gauss completed a mathematical challenge in three weeks under the guidance of a professor [9].
  - **Robotics**: Robots assist with household chores and can also work in factories on tasks such as sorting packages [10].
  - **Personal Assistants**: New AI agents such as Meituan's Agent Xiaomei let users place food orders by voice [12]. Office assistants help organize materials and generate reports, improving work efficiency [13].
Group 3: Resource Management and Fairness
- Conflicts may arise when multiple AI Agents represent different users and compete for the same resources. Researchers suggest using market mechanisms and fair distribution rules to manage resources [14].
- A virtual market is proposed in which each AI Agent receives an equal amount of "virtual currency" to bid for shared resources, ensuring fairness and preventing disparities in resource allocation [16][18].
Group 4: Regulatory Considerations
- A practical and safe virtual agent economy requires targeted measures in the legal, technical, and policy domains [19]:
  - **Legal Responsibility**: A shift from traditional single-entity accountability to a collective responsibility model for AI collaborations is recommended [21].
  - **Technical Standards**: Interoperability standards such as the A2A and MCP protocols should be promoted to prevent fragmentation of the AI ecosystem [23].
  - **Supervisory Framework**: A three-tiered supervision system is proposed in which AI monitors AI, with human oversight as a safety net [24].
  - **Regulatory Sandbox Trials**: Small-scale tests in specific scenarios are suggested to observe AI interactions and assess fairness mechanisms [26].
Group 5: Societal Impact and Human Collaboration
- Education should be reformed to strengthen human abilities in critical thinking and problem-solving, rather than allowing AI to replace human decision-making entirely [27].
- Social safety nets should be strengthened to address potential job displacement caused by AI, so that the wealth AI generates benefits a broad population rather than a select few [28].
Group 6: Market Development
- The launch of MuleRun, described as the world's first AI Agent marketplace, marks a significant step for the AI economy, providing a unified platform where users can access various agents and services [29][30].
- MuleRun also introduces a support program for AI Agent creators, offering financial, marketing, and technical assistance to foster sustainable income growth for creators [32].
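The equal-budget virtual market described under resource management can be illustrated with a minimal sketch. The summary only states that every agent receives the same amount of virtual currency to bid with; the sealed-bid, highest-bid-wins rule, the alphabetical tie-break, and all names below (`Agent`, `run_auctions`, `gpu_slot`, `api_quota`) are assumptions made for illustration, not the mechanism the researchers specify.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """A hypothetical agent holding virtual currency and the resources it has won."""
    name: str
    budget: float
    won: list = field(default_factory=list)


def run_auctions(agent_names, resources, bids, initial_budget=100.0):
    """Allocate resources one at a time to the highest bidder.

    bids maps resource -> {agent_name: amount}. Every agent starts with the
    same budget (the fairness rule from the text); a bid is capped at the
    bidder's remaining budget, and the winner pays its (capped) bid.
    """
    agents = {name: Agent(name, initial_budget) for name in agent_names}
    for resource in resources:
        offers = bids.get(resource, {})
        # Cap each offer at what the agent can still afford.
        valid = {
            name: min(amount, agents[name].budget)
            for name, amount in offers.items()
            if name in agents
        }
        valid = {name: amount for name, amount in valid.items() if amount > 0}
        if not valid:
            continue  # nobody can pay for this resource
        # Highest capped bid wins; ties go to the alphabetically first name.
        winner = max(sorted(valid), key=lambda name: valid[name])
        agents[winner].budget -= valid[winner]
        agents[winner].won.append(resource)
    return agents


if __name__ == "__main__":
    result = run_auctions(
        ["a", "b"],
        ["gpu_slot", "api_quota"],
        {"gpu_slot": {"a": 80, "b": 60}, "api_quota": {"a": 50, "b": 40}},
    )
    for agent in result.values():
        print(agent.name, agent.won, agent.budget)
```

In this example agent `a` outbids `b` on the first resource but then has only 20 units left, so `b` wins the second one even though `a` nominally bid more. That is the intended effect of equal budgets: no single agent can capture every shared resource.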
First Successful Human Trial! Gene-Edited Islet Cells Implanted in "Stealth" Mode Secrete Insulin Normally
量子位· 2025-09-16 05:58
Core Viewpoint - The article reports a significant advance in diabetes treatment: CRISPR-edited pancreatic cells were transplanted into a patient with type 1 diabetes and showed promising results in both insulin secretion and immune evasion [1][2][3].
Group 1: Research Background
- Type 1 diabetes is an autoimmune disease in which the immune system attacks insulin-secreting pancreatic cells, leading to uncontrolled blood sugar levels [4][5].
- The research by Sana Biotechnology aims to provide a potential cure for approximately 9.5 million people with type 1 diabetes worldwide [8].
Group 2: Methodology
- Researchers extracted pancreatic cells from a 60-year-old deceased donor and used CRISPR-Cas12b to knock out two key genes, B2M and CIITA, which normally mark cells as foreign invaders for the immune system [9][10].
- To further shield the cells from immune surveillance, a gene encoding the CD47 protein was introduced; CD47 sends a "don't eat me" signal to the immune system [12].
Group 3: Clinical Application
- A total of 79.6 million edited pancreatic cells were implanted, through 17 injections into muscle tissue, in a 42-year-old patient who has lived with type 1 diabetes for 37 years [20][24].
- Notably, the entire procedure involved no glucocorticoids, anti-inflammatory drugs, or immunosuppressants [25].
Group 4: Results and Future Plans
- Twelve weeks after transplantation, the edited cells showed no signs of rejection and continued to secrete insulin, effectively regulating the patient's blood sugar levels [26].
- C-peptide levels, a direct marker of endogenous insulin secretion, were significantly elevated at 4, 8, and 12 weeks after the intervention [28].
- Sana Biotechnology plans to begin more comprehensive clinical trials next year to investigate the treatment's efficacy further [30].
Musk Axes 500 xAI Staff Over a Weekend
量子位· 2025-09-16 05:58
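By Jay, from Aofeisi
量子位 | Official account QbitAI
What is going on? The people who help Musk train his large models have lost their jobs just like that?
Musk's layoff-style assessment
The data annotation team was once xAI's largest team and played a key role in the development of Grok. Its job is to label, categorize, and contextualize raw data, teaching the AI to understand the world better.
Since xAI was founded, the data annotation team has grown steadily.
Unlike most AI companies, xAI hires many of its data annotators directly rather than outsourcing the work. This gives the company more control over model training and better privacy.
It also costs more.
In February this year, xAI disclosed plans to hire thousands of people to help train Grok, and it added about 700 data annotators within six months.
Last Thursday evening, xAI sprang a surprise test on its staff and required them to complete and submit it before the next morning.
This was no simple pop quiz: so far, the elimination rate of this internal test has reached 33%, and more than 500 employees have been told to pack up and leave.
In early September, however, LinkedIn pages showed that at least 9 of the dozen or so managers responsible for the data annotation team had been dismissed.
This unusual personnel change sowed the seeds of the upheaval to come.
In the period that followed, xAI began holding, with some members of the data annotation team, one ...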
Meizu AI Glasses Go on Sale at 1,999 Yuan: Photos, Translation, and Payments, All in 39g
量子位· 2025-09-16 05:58
Core Viewpoint - The article covers the launch of StarV Snap, Meizu's new AI-powered smart glasses, which combine a range of AI functions with practical everyday features and target a younger audience [2][5][24].
Group 1: Product Features
- StarV Snap weighs only 39g, making it light and comfortable to wear [3][22].
- The glasses support simultaneous translation in 12 languages, AI object recognition, and voice transcription [5][17].
- A dedicated AI button triggers commands quickly, with no wake word required [11].
- The 12MP camera offers a 109° ultra-wide field of view, 720P recording, and 1080P photos, reflecting the product's focus on capturing moments [28][30].
- A Type-C port lets users charge the glasses while wearing them, easing battery anxiety [19][22].
Group 2: User Experience and Interaction
- StarV Snap works with Meizu's Ring2 for quick photo and video capture [16].
- A live photo feature records the moments just before and after the shutter is pressed [33].
- A film filter mode gives photos a nostalgic look [35].
- A continuous shooting mode at 720P allows extended recording sessions [37].
- Privacy features include a visible recording indicator and a "block detection" mechanism to prevent unauthorized recording [39].
Group 3: Market Positioning and Strategy
- Meizu takes a practical approach, focusing on usability and integration into daily life rather than gimmicks [24].
- The glasses are among the few on the market that support payments, through partnerships with Alipay and Ant International [22][23].
- The launch fits Meizu's broader strategy of building an ecosystem that spans smartphones and automotive technology [41][43].
Unitree: An Open-Source World Model for Robots!
量子位· 2025-09-16 04:05
Core Viewpoint - The article covers the release of UnifoLM-WMA-0, a new open-source model that uses a world model with an understanding of physical laws to improve how robots interact with their environments [1][9].
Group 1: Model Performance
- The model performs well on tasks such as stacking blocks, with predictions that closely match the actual operations [3].
- It also handles more intricate tasks, such as organizing stationery [7].
Group 2: Model Features
- UnifoLM-WMA-0 belongs to the UnifoLM series, which is designed for general robot learning and adapts to various robotic platforms [9].
- The training code, inference code, and checkpoints have all been open-sourced, and the project quickly passed 100 stars on GitHub [11].
Group 3: Training Strategy
- A video generation model was fine-tuned on the Open-X dataset to adapt its capabilities to real-world robotic tasks [15].
- The model has a dual-function architecture: a decision mode that predicts key information during physical interactions, and a simulation mode that generates realistic environmental feedback based on robot actions [20].
Group 4: Dataset Utilization
- Training also used five open-source datasets provided by Unitree Technology [22].
- As a simulation engine, the model generates controlled interactions from the current scene image and future action commands [23].
Altman's Life-Extension Plan: Betting on a Drug That Rejuvenates the Brain, Clinical Trial Expected by Year-End
量子位· 2025-09-16 04:05
Core Viewpoint - The article covers Sam Altman's increased investment in the biotech startup Retro Biosciences, which aims to extend human lifespan by 10 years through therapies that target aging and related diseases [3][4][33].
Investment and Company Overview
- Sam Altman has invested a total of $180 million (approximately 1.3 billion RMB) in Retro Biosciences, a sign of strong support for the company's mission [4].
- Retro Biosciences plans to start its first human clinical trial, of the experimental drug RTR242, by the end of 2025 [8][12].
- The company previously collaborated with OpenAI on GPT-4b-micro, a model designed for protein engineering [5][28].
Scientific Approach and Mechanism
- RTR242 aims to boost the cell's "garbage disposal and recycling system" so that damaged cellular components are cleared, potentially reversing effects of aging [16].
- The drug targets the "cellular garbage" associated with Alzheimer's and Parkinson's diseases by restarting the autophagy process in aging individuals [17].
- Retro is also developing therapies for leukemia and central nervous system diseases, indicating a broad approach to longevity [34].
Future Goals and Funding
- Retro Biosciences has set the goal of adding 10 years to healthy human lifespan, with a focus on maintaining health and vitality until the end of life [33].
- The company aims to raise $1 billion in its Series A round to fund clinical trials and research [35].
- By comparison, other longevity companies such as Altos Labs have raised over $3 billion, which shows how competitive the longevity biotech sector is [37].
Leadership and Expertise
- Retro Biosciences' CEO, Joe Betts-LaCroix, has a strong background in protein research and has previously led successful ventures in the tech industry [44].
- Co-founder Ding Sheng is known for his expertise in stem cell research, which adds to the company's scientific credibility [41].
How Do 700 Million People Use ChatGPT Every Week? OpenAI's Most Comprehensive Report Is Here
量子位· 2025-09-16 00:52
Core Insights
- The report "How People Use ChatGPT," published by OpenAI in collaboration with Harvard economist David Deming, details ChatGPT usage from November 2022 to July 2025 [1][4].
- As of July 2025, ChatGPT had over 700 million weekly active users, who sent a total of 18 billion messages per week [5].
User Engagement
- The study analyzed a large-scale sample of 1.5 million conversations to understand how people use ChatGPT, relying on automated classifiers to protect privacy [6].
- ChatGPT is used mainly for everyday tasks: about three-quarters of conversations concern practical guidance, information search, or writing [11][12].
- The main usage types are [18]:
  - Practical guidance (28.8%): personalized fitness plans, creative brainstorming, skill teaching
  - Information search (24.4%): information on people and events, recipes, product inquiries
  - Writing (23.9%): email and document generation, text editing, translation, summarization
User Demographics
- The gender gap among ChatGPT users has narrowed significantly, with users who have typically female names overtaking those with typically male names this year [23][24].
- Users aged 18-25 account for 46% of total message volume, while older users send a higher proportion of work-related messages [26].
- ChatGPT is growing rapidly in low- and middle-income countries: as of May, usage growth in the lowest-income countries was more than four times that in the highest-income countries [27][28].
Conclusion
- Overall, ChatGPT is being adopted across increasingly diverse demographics and serves mainly as an advisor for practical guidance, information retrieval, and writing tasks [20][29].
GPT-5 Coding Edition Released! 7 Hours of Independent Continuous Coding, 10x Faster on Simple Tasks, Works in VS Code
量子位· 2025-09-16 00:52
Core Viewpoint - OpenAI has launched GPT-5-Codex, a model with significantly stronger programming capabilities: it can code independently and continuously for up to 7 hours, and a new "dynamic thinking" ability adjusts its computational resources in real time during task execution [1][4][5].
Group 1: Model Enhancements
- GPT-5-Codex is trained specifically for complex engineering work, including building complete projects from scratch, adding features, testing, debugging, and carrying out large-scale refactoring [8].
- In testing, GPT-5-Codex improved success rates on code refactoring tasks by nearly 20% compared with the original GPT-5 [9].
- On simple tasks, GPT-5-Codex cut the number of output tokens by 93.7%, resulting in a 10-fold speedup in response time [11].
Group 2: Dynamic Thinking Capability
- On complex tasks, GPT-5-Codex can spend twice as long reasoning, editing, and testing code, with output token volume rising by 102.2% [12].
- Dynamic thinking lets the model adjust its computational approach while a task is running, improving its problem-solving efficiency [4].
Group 3: Code Review and Quality Improvement
- GPT-5-Codex received specialized training for code review, which cut the rate of incorrect review comments from 13.7% to 4.4% and raised the share of high-impact comments from 39.4% to 52.4% [15].
- The model can understand the true intent of a pull request (PR) and traverse the entire codebase, validating behavior through tests [15][17].
Group 4: Ecosystem and Tool Integration
- OpenAI has restructured the entire Codex product ecosystem and added features such as image input, so users can supply screenshots and design drafts for implementation [18].
- The updated Codex CLI tracks progress with to-do lists and integrates tools such as web search and MCP for task management [19].
- New IDE extensions bring Codex directly into editors such as VS Code and Cursor, allowing tasks to move between the cloud and local environments [23].
Group 5: Market Positioning
- The upgrade arrives as Claude Code is losing subscribers over quality issues, which positions OpenAI to gain market share in AI programming [25][26].
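The percentages quoted for GPT-5-Codex are easier to compare once they are converted into multipliers and relative changes. The following is a small arithmetic sketch; the helper names are hypothetical, and the figures are taken directly from the summary above. Note that a reduction in output tokens and a speedup in response time are different quantities, so the token factor computed here need not equal the reported 10-fold speedup.

```python
def reduction_factor(percent_reduction: float) -> float:
    """How many times smaller a quantity becomes after a percentage reduction."""
    return 100.0 / (100.0 - percent_reduction)


def growth_factor(percent_increase: float) -> float:
    """Multiplier corresponding to a percentage increase."""
    return 1.0 + percent_increase / 100.0


def relative_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


if __name__ == "__main__":
    # 93.7% fewer output tokens on simple tasks
    print(round(reduction_factor(93.7), 2))       # 15.87
    # 102.2% more output tokens on complex tasks
    print(round(growth_factor(102.2), 3))         # 2.022
    # Incorrect review comments: 13.7% -> 4.4%
    print(round(relative_change(13.7, 4.4), 1))   # -67.9
    # High-impact review comments: 39.4% -> 52.4%
    print(round(relative_change(39.4, 52.4), 1))  # 33.0
```

Read this way, simple tasks produce roughly one-sixteenth of the tokens, complex tasks about twice as many, the rate of incorrect review comments falls by about two-thirds (9.3 percentage points), and the share of high-impact comments rises by about one-third (13 percentage points).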