Zhang Chaoyang in Dialogue with Theoretical Physicist David Tong: We Come from Quantum Fluctuations, and We Are All Stardust
量子位· 2025-07-31 06:51
Group 1
- The dialogue between Zhang Chaoyang and David Tong covers the evolution of physics from classical mechanics to quantum mechanics and field theory, emphasizing the importance of mathematical rigor in understanding physical laws [1][2][3]
- The discussion highlights significant milestones in physics, including Newton's laws, Einstein's theories, and the development of quantum mechanics, showcasing how these theories have transformed our understanding of the universe [2][16][19]
- The conversation also touches on the role of fluid dynamics in physics, particularly in understanding complex phenomena such as the behavior of quark-gluon plasma and its implications for the universe [8][12][13]

Group 2
- The importance of scientific communication and public education is emphasized, with a belief that rigorous mathematics should not be avoided in popular science [35][41]
- The potential of AI in assisting physicists is discussed, highlighting its role in solving complex equations and aiding research, while also acknowledging the irreplaceable value of human interaction in education [10][11][38]
- The dialogue concludes with reflections on the future of scientific dissemination, suggesting that the next generation of scientists should embrace the challenge of making complex theories accessible without oversimplifying the underlying mathematics [36][40][41]
Robots Can Finally Help People Do the Laundry
量子位· 2025-07-31 06:51
Henry from Aofeisi
QbitAI | Official account QbitAI

Boxing, dancing, running marathons... robots known for "moonlighting" can finally help people do the laundry!

Just now, Figure founder Brett Adcock released a demo video of the Figure.02 robot doing laundry at home.

In the video, the robot half-squats and works with both hands: its left hand holds the laundry basket while its right hand places clothes into the washing machine one at a time, pausing now and then to adjust their position.

According to Adcock, this routine had been tested continuously in the office for a month, and this is the first time the robot has completed the task in a real home environment!

From BMW workshops to sorting lines, and now to the laundry room at home, industry benchmark Figure has finally "made it indoors", achieving genuine operation in both industrial and household scenarios.

So what exactly did Figure do?

The "small matter" of doing laundry

For a robot, the laundry task can be broken down into two levels: as Adcock mentioned at the start, the Helix-equipped Figure.02 ran in an office environment for a month before landing in a home environment, which shows how important the task environment is for intelligent agents like robots.

In industrial settings, robots face structured, repetitive, deterministic tasks. Jobs such as assembling parts or moving objects all come with clear, controllable procedures.

Meanwhile, in industrial settings, the positions of objects and equipment are relatively ...
Casual Snapshots Become VR Tours: Stable 3D Reconstruction and Novel View Synthesis from Pose-Free, Sparse Images | HKUST (Guangzhou)
量子位· 2025-07-31 04:23
Core Viewpoint
- A new algorithm, RegGS, developed by the Hong Kong University of Science and Technology (Guangzhou), can reconstruct 3D models from sparse 2D images without precise camera positioning, achieving centimeter-level accuracy suitable for VR applications [2][4]

Group 1: Methodology
- RegGS combines feed-forward Gaussian representation with structural registration to address the challenges of sparse and pose-less images, providing a new pathway for practical 3D reconstruction [6][8]
- The core mechanism involves registering local 3D Gaussian mixture models to gradually build a global 3D scene, avoiding reliance on traditional Structure from Motion (SfM) initialization and requiring fewer input images [8][12]

Group 2: Experimental Results
- In experiments on the RE10K and ACID datasets, RegGS outperformed existing mainstream methods across various input frame counts (2×/8×/16×/32×) in metrics such as PSNR, SSIM, and LPIPS [9][12]

Group 3: Applications
- RegGS addresses the "sparse + pose-less" problem with significant real-world applications, including:
  - 3D reconstruction from user-generated content (UGC) videos, which often lack camera parameters [13]
  - Drone aerial mapping, demonstrating robustness to large viewpoint variations and low frame rates [13]
  - Restoration of historical images/documents, enabling 3D reconstruction from a few photos taken from different angles [13]
- Compared to traditional SfM or Bundle Adjustment methods, RegGS requires less structural input and is more feasible for unstructured data applications [13]

Group 4: Limitations and Future Directions
- The performance and efficiency of RegGS are currently limited by the quality of the upstream feed-forward model and the computational cost of the MW2 distance calculation, indicating areas for future optimization [13]
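The MW2 (mixture 2-Wasserstein) distance mentioned above is built from the closed-form 2-Wasserstein distance between individual Gaussian components. A minimal NumPy sketch of that building block, as our own illustration of the general technique rather than the paper's code:

```python
import numpy as np

def psd_sqrt(a):
    """Matrix square root of a symmetric positive semi-definite matrix."""
    w, v = np.linalg.eigh(a)
    return v @ np.diag(np.sqrt(np.clip(w, 0.0, None))) @ v.T

def gaussian_w2_sq(mu1, cov1, mu2, cov2):
    """Closed-form squared 2-Wasserstein distance between two Gaussians:
    ||mu1 - mu2||^2 + Tr(cov1 + cov2 - 2 (cov1^{1/2} cov2 cov1^{1/2})^{1/2})."""
    mean_term = np.sum((mu1 - mu2) ** 2)
    s1 = psd_sqrt(cov1)
    cross = psd_sqrt(s1 @ cov2 @ s1)
    cov_term = np.trace(cov1) + np.trace(cov2) - 2.0 * np.trace(cross)
    return mean_term + cov_term
```

A mixture-level distance can then be posed as a discrete optimal-transport problem over the matrix of such pairwise component costs; registering two local Gaussian fragments amounts to finding the transform that minimizes it.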
Zuckerberg Backpedals on Open Source as Meta Stock Surges 12%
量子位· 2025-07-31 04:23
Core Viewpoint
- Meta's recent financial results exceeded expectations, with revenue of $47.52 billion and net income of $18.3 billion, leading to a significant stock price increase of 12% [2][10][16]

Financial Performance
- Meta's Q2 revenue grew by 22% year-over-year, reaching $47.52 billion, surpassing the expected $44.8 billion [10]
- Net income increased by 36% year-over-year to $18.3 billion [10]
- Advertising revenue remains the primary source of income, with ad impressions through applications increasing by 11% [11]
- Operating income rose by 38% to $20.44 billion, with an operating margin of 43% [12]
- Reality Labs continues to incur losses, with a Q2 operating loss of $4.53 billion, totaling nearly $70 billion in losses since 2020 [12]

Strategic Focus
- Meta is shifting its strategy towards AI, emphasizing the development of "personal superintelligence" and a cautious approach to open-source initiatives [3][21]
- The company plans to raise the lower end of its capital-expenditure guidance from $64 billion to $66 billion, with total expenses for the year projected between $114 billion and $118 billion [17][18]
- Meta's CEO, Mark Zuckerberg, highlighted the importance of AI in recent business decisions, including high-profile recruitment efforts [18]

Vision for AI
- Zuckerberg's vision emphasizes personal empowerment through superintelligence, aiming to make this technology accessible to everyone rather than focusing solely on automating valuable work [22][44]
- The integration of technology into daily life is a key focus, with devices like smart glasses envisioned as primary computing tools [24][45]
- The company acknowledges potential security risks associated with superintelligence and plans to manage these risks carefully while being selective about open-source content [26][46]

Market Reception
- The market has shown confidence in Meta's AI investments, as evidenced by the stock price surge following the financial report [9][16]
- Despite the ambitious vision, there are concerns regarding the clarity and feasibility of Zuckerberg's plans for superintelligence, with critics questioning the specifics of how these goals will be achieved [37][39]
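The reported operating margin follows directly from the stated figures; a quick arithmetic check using only the numbers above:

```python
# Figures in billions of USD, as reported in the summary above.
revenue = 47.52
operating_income = 20.44

operating_margin = operating_income / revenue
print(f"operating margin: {operating_margin:.0%}")  # prints "operating margin: 43%"
```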
GPT-5 Leaks Go Viral: First Unification of the GPT and o Series, Hands-On Coding Demo Surfaces Early. Launching Next Week?
量子位· 2025-07-31 04:23
Core Viewpoint
- GPT-5 is expected to be released soon, with significant enhancements in capabilities, including multi-modal interactions and advanced programming skills [10][12][31]

Group 1: Release and Features
- GPT-5 has been spotted across various platforms, including ChatGPT, macOS applications, Cursor, and Microsoft Copilot, indicating a broad rollout [2][5][12]
- The model will integrate the capabilities of the GPT series and the o series, allowing seamless switching between different functionalities without manual intervention [11][14]
- The main model, GPT-5, is reported to have a context window of up to 1 million tokens and to output up to 100,000 tokens, enhancing its performance in long-term dialogues and logical processing [19]

Group 2: Model Variants
- GPT-5 will reportedly ship in multiple versions: the main model (codename "nectarine" or "o3-alpha"), GPT-5 mini (codename "lobster"), and GPT-5 nano (codename "starfish") [15][25]
- The mini version, Lobster, is designed specifically for programming tasks, reportedly outperforming models like Claude 4 in complex coding scenarios [22]
- Lobster can quickly generate complete and accurate code from minimal input, making it suitable for managing legacy code and optimizing code structure [22]

Group 3: Performance and Capabilities
- GPT-5 is expected to demonstrate superior programming abilities, approaching human-programmer level and enabling faster, more precise software development [16]
- The model will support multi-modal capabilities, handling text, images, and tool calls simultaneously, enhancing its utility as a versatile assistant [24]
- The nano version, starfish, has been observed in testing but is currently limited to static game interfaces [25][27]

Group 4: Community Reactions and Skepticism
- Despite the excitement surrounding GPT-5, there are voices of skepticism regarding its long-term performance and potential limitations, echoing past experiences with model releases [33][35]
- Concerns have been raised about the model's ability to handle complex reasoning tasks and its tendency to produce misleading outputs [35][37]
- Some community members speculate that the GPT-5 leaks may be part of an OpenAI marketing strategy to generate hype [39]
Take a Full-Size Humanoid Home for 158,000 RMB: LimX Dynamics Brings a Budget Option to Embodied Robotics, with 31 Degrees of Freedom and Maximum Friendliness for Secondary Development
量子位· 2025-07-31 02:29
Hengyu from Aofeisi
QbitAI | Official account QbitAI

Is there anyone who hasn't been earwormed by "Da Zhan Hong Tu" (《大展鸿图》) yet?

At times it even carries a hint of elegance.

Anyway, the moment I opened my eyes today, my feed was flooded with a humanoid robot "singing karaoke in a villa", and it left me dumbfounded. Clever me: with one glance I suspected it was AI-generated effects, but it wasn't; then I suspected that motion so smooth had to be CG, but it wasn't either.

This is a life-size humanoid robot, dancing live, shot on site with no cuts and no special effects whatsoever.

How can we be sure? Because "Da Zhan Hong Tu" was just this humanoid showing off; its real talents lie well beyond the stage.

Ask it to walk like a human, and it walks like a human, with a maximum walking speed of 5 km/h. Need strength? It has strength, with a maximum single-arm payload of 3 kg.

The answer is out: the robot flooding everyone's feeds comes from a decidedly down-to-earth embodied-intelligence company, LimX Dynamics (逐际动力).

Its name is LimX Oli, a full-size, full-degree-of-freedom humanoid robot whose public presale officially opened today. It has a general-purpose humanoid form, stands 165 cm tall, and carries 31 degrees of freedom, maximizing its fit for tasks in real-life and production scenarios.

One look at its price and I gasped: it sits at the 100,000-RMB level! Honestly, once you see what LimX Oli can actually do, you will marvel at that price along with me.

LimX Dynamics' new humanoid has substance. In another week, 2 ...
Alibaba Security Reveals: A Malicious Email Can Instantly Paralyze macOS/iOS; Malformed Certificates Expose New Vulnerabilities in Cryptographic Libraries
量子位· 2025-07-30 23:56
Core Viewpoint
- The article discusses a significant security vulnerability in macOS/iOS systems that can be exploited through malformed X.509 certificates, leading to denial-of-service (DoS) attacks, as revealed by Alibaba Security's research [1][2][3]

Group 1: Research Findings
- Alibaba Security, in collaboration with Indiana University, identified a new attack vector using malformed X.509 certificates to detect potential DoS vulnerabilities in cryptographic libraries [2]
- The research led to the discovery of 18 new CVE vulnerabilities and the identification of 12 known CVE vulnerabilities across six major open-source cryptographic libraries and one Apple-specific library [4]
- The findings were presented at the USENIX Security '25 conference and received a nomination for the Pwnie Awards [3]

Group 2: Attack Mechanism
- The research highlights that malformed X.509 certificates can trigger DoS attacks by exhausting system resources during certificate parsing and validation processes [7][8]
- Attackers can exploit these vulnerabilities by sending malformed certificates via email or during TLS handshake processes, causing systems to become unresponsive [9][10]
- The study emphasizes that existing cryptographic APIs are often complex and can be misused, leading to security risks even when developers follow guidelines [10][11]

Group 3: Contributions and Tools
- The research team conducted a systematic analysis of cryptographic libraries, identifying three new types of DoS risks and proposing malformed X.509 certificates as a universal attack vector [13]
- They developed an automated tool named X.509DoSTool to generate specific malformed certificates and detect corresponding DoS vulnerabilities in cryptographic libraries [28]
- The tool successfully identified new vulnerabilities and demonstrated the feasibility of using malformed certificates to exploit DoS vulnerabilities in real-world scenarios [30]

Group 4: Mitigation Strategies
- The article suggests that developers should adopt secure programming practices and be aware of potential security risks when implementing cryptographic libraries [32]
- Recommendations include implementing checks for user inputs, optimizing code for efficiency, and limiting the size of certificates to mitigate potential DoS attacks [33]
- The research advocates for the gradual removal of redundant features in cryptographic libraries to enhance overall security [34]

Group 5: Conclusion
- The study underscores the importance of recognizing X.509DoS as a widespread but under-researched security threat, calling for increased attention from the security community [34]
- The research aims to enhance awareness of cryptographic vulnerabilities and inspire further exploration of effective detection and defense mechanisms [34]
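The mitigations listed above (input checks, certificate size limits) can be sketched as a defensive wrapper around whatever parser a system uses. A hedged Python illustration; `parse_certificate` is a caller-supplied stand-in and the limits are illustrative, not values from the research:

```python
import time

MAX_CERT_BYTES = 16 * 1024  # illustrative cap; real limits depend on deployment

class CertificateRejected(Exception):
    """Raised when certificate input is refused before or after parsing."""

def guarded_parse(der_bytes, parse_certificate,
                  max_bytes=MAX_CERT_BYTES, budget_s=0.5):
    # Reject oversized input before any expensive parsing work begins.
    if len(der_bytes) > max_bytes:
        raise CertificateRejected("certificate exceeds size cap")
    start = time.monotonic()
    cert = parse_certificate(der_bytes)
    # Flag parses that blew the time budget, the symptom of the
    # resource-exhaustion pattern the research describes.
    if time.monotonic() - start > budget_s:
        raise CertificateRejected("parsing exceeded time budget")
    return cert
```

In production such guards belong in front of every untrusted certificate path (mail attachments, TLS handshakes), since the attack vector is precisely input that is cheap to send but expensive to parse.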
DeepSeek's Next-Generation Technology Revealed Early: Paper Co-Signed by Liang Wenfeng Wins ACL 2025 Best Paper Award
量子位· 2025-07-30 23:56
Core Insights
- The article highlights the groundbreaking achievement of a paper co-authored by DeepSeek's Liang Wenfeng and Peking University, which won the Best Paper Award at ACL 2025 [1]
- The conference saw an unprecedented scale with a total submission of 8,360 papers, nearly doubling from last year's 4,407, indicating fierce competition [2]

Technical Innovations
- The proposed Native Sparse Attention (NSA) mechanism significantly enhances long-text processing speed, achieving up to an 11× speedup through algorithm and hardware co-optimization and outperforming traditional full-attention models [3][8]
- The technology allows context length to be extended up to 1 million tokens and is set to be applied in next-generation models [4]
- NSA employs a dynamic hierarchical sparse strategy with three parallel attention branches: coarse-grained global information capture, selective attention for key segments, and sliding attention for local context [10][17]

Performance Metrics
- In practical tests, NSA demonstrated remarkable speed advantages across the entire lifecycle of processing 64k-length sequences, with decoding speed improved by 11.6×, forward propagation by 9×, and backward propagation by 6× [15][16]
- The NSA-pretrained 27B-parameter model surpassed the full-attention baseline in 7 out of 9 evaluation metrics, particularly excelling in inference-related benchmarks [19][20]
- In long-text processing tests, NSA achieved perfect retrieval accuracy and outperformed the full-attention baseline by 0.032 on the LongBench benchmark [21]

Comparative Analysis
- An experiment using DeepSeek-R1's mathematical reasoning data showed that NSA-R achieved an accuracy of 0.121 in an 8k context setting, significantly higher than the full-attention model's 0.046 [22][23]
- NSA also outperformed full attention in complex reasoning tasks, with improvements of 0.087 on HPQ and 0.069 on code-understanding tasks [25]

Additional Research Highlights
- The article mentions three other best paper winners, including a study on the resilience of large language models post-alignment training, emphasizing the need for more effective alignment techniques [26]
- Another paper explored fairness in large models through the new perspective of "difference awareness," revealing that traditional fairness tests may not adequately address the nuances of model behavior [28]
- A third paper discussed sampling mechanisms in large models, highlighting potential biases in decision-making processes that could lead to ethical concerns [29]
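The three NSA branches described above (coarse block summaries, selective attention on key blocks, and a sliding window) can be illustrated with a toy single-query sketch. The mean-pooling, top-k block selection, and fixed gate weights here are our simplifications for exposition, not DeepSeek's implementation:

```python
import numpy as np

def softmax(x):
    x = x - x.max()
    e = np.exp(x)
    return e / e.sum()

def attend(q, k, v):
    """Scaled dot-product attention for a single query q over keys/values."""
    scores = k @ q / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

def nsa_toy(q, K, V, block=4, topk=2, window=8, gates=(1/3, 1/3, 1/3)):
    """Toy single-query combination of the three NSA-style branches."""
    n, d = K.shape
    nb = n // block
    Kb = K[: nb * block].reshape(nb, block, d).mean(axis=1)  # pooled block keys
    Vb = V[: nb * block].reshape(nb, block, d).mean(axis=1)  # pooled block values
    # 1) Coarse global branch: attend over block summaries.
    out_cmp = attend(q, Kb, Vb)
    # 2) Selection branch: full attention inside the top-k scoring blocks.
    sel = np.sort(np.argsort(Kb @ q)[-topk:])
    idx = np.concatenate([np.arange(b * block, (b + 1) * block) for b in sel])
    out_sel = attend(q, K[idx], V[idx])
    # 3) Sliding-window branch: attend over the most recent tokens only.
    out_win = attend(q, K[-window:], V[-window:])
    g1, g2, g3 = gates
    return g1 * out_cmp + g2 * out_sel + g3 * out_win
```

In the real mechanism the gates are learned per token and the whole computation is batched and hardware-aligned; the sketch only shows why the per-query cost scales with the number of blocks plus the window, not with sequence length.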
The Latest Top 100 AI Products | QbitAI Think Tank AI 100
量子位· 2025-07-30 23:56
The following article is from QbitAI Think Tank (量子位智库); author: the AI 100 Organizing Committee. QbitAI Think Tank: connecting AI innovation, providing industry research.

Ermiao from Aofeisi
QbitAI | Official account QbitAI

2025 is already more than half over. Domestic AI products have moved from a phase of explosive growth into the deep waters of fine-grained polishing.

With user dividends nearing their ceiling and product experiences growing increasingly homogeneous, competition among AI products has shifted from "does it exist" to "is it good", "how long is it used", and "will it be used again". In this contest over retention, experience, and sustained value, which products have established a clear lead? And which new faces are breaking through quickly to become the next breakout candidates?

The newly released "AI 100" dual rankings from QbitAI Think Tank offer a stage-by-stage judgment. This AI product observation system consists of two lists, the "Flagship 100" and the "Innovation 100": one represents the leading camp in the current landscape, the other the seed players of the future market. To ensure objectivity and accuracy, the "AI 100" dual rankings reportedly adopt a dual evaluation system combining quantitative and qualitative methods. From overall strength to breakout potential, the two lists together sketch a more three-dimensional, truthful map of AI products, and give industry participants and observers a set of reference coordinates.

New windows of opportunity are opening at the same time. Vibe Coding is rising as a new blue ocean, driving a surge in the number of AI low-code platforms and AI programming tools on the lists. Compared with ...
Tencent Releases Hunyuan 3D World Model 1.0: The First Open-Source World Generation System to Support Physical Simulation
量子位· 2025-07-30 09:44
Core Viewpoint
- Tencent has launched Hunyuan 3D World Model 1.0, the first open-source 3D world generation model that supports physical simulation and is compatible with traditional CG pipelines, allowing for immersive and interactive 3D world generation from text or images [1][3]

Group 1: Model Features
- Hunyuan 3D World Model 1.0 integrates the advantages of video-driven and 3D-driven methods, enabling the generation of immersive, explorable, and interactive 3D scenes [5][6]
- The model offers three core advantages: a 360° immersive experience, industrial-grade compatibility with standard 3D mesh formats, and atomic-level interaction through decoupled 3D modeling [5][6][7]

Group 2: Technical Framework
- The model employs a generative architecture that combines panoramic image synthesis and layered 3D reconstruction techniques, supporting both "text-to-world" and "image-to-world" generation methods [7][11]
- It utilizes a three-part technical framework: panoramic world proxy generation, semantic world layering, and layered world reconstruction [7][21]

Group 3: Semantic Layering and Reconstruction
- Hunyuan 3D World Model 1.0 introduces a semantic hierarchical representation and generation algorithm, allowing intelligent separation of foreground from background and ground from sky [21][22]
- The model predicts depth for each layer and aligns the layers with one another to maintain geometric coherence in the reconstructed 3D scene [23][27]

Group 4: Applications
- The generated 3D mesh worlds can efficiently support various professional applications, including virtual reality (VR), game development, object editing, and physical simulation [35][36]
- In VR applications, the model generates seamless 360° environments that can be deployed on mainstream VR platforms, enhancing user experience in virtual tourism and training [36]
- For game development, the 3D mesh worlds can be exported in standard formats for integration with industry engines like Unity and Unreal Engine, facilitating rapid scene construction and content iteration [37]
- The object-editing application allows precise 3D manipulation of individual elements within a scene, enhancing flexibility for interactive design [38]
- In physical simulation, the layered meshes are compatible with mainstream physics engines, ensuring accurate representation of physical properties for applications like autonomous-driving testing and engineering simulations [39]
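Aligning per-layer depth predictions so that reconstructed layers stay geometrically coherent is commonly done with a per-layer scale-and-shift fit over regions where layers overlap. A minimal least-squares sketch of that general technique (our illustration, not Tencent's released code):

```python
import numpy as np

def align_depth(src, ref, mask):
    """Fit scale s and shift t minimizing ||s*src + t - ref||^2 over the
    masked overlap region, then apply them to the whole source depth map."""
    x = src[mask].ravel()
    y = ref[mask].ravel()
    A = np.stack([x, np.ones_like(x)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, y, rcond=None)
    return s * src + t
```

A robust variant would iterate this fit while discarding outlier pixels, since monocular depth predictions for different layers can disagree sharply near occlusion boundaries.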