Coconut
This Is Crazy! Meta's Layoffs Reach Tian Yuandong, and His Whole Team Goes with Him
QbitAI (量子位) · 2025-10-23 03:52
Core Viewpoint
- The recent layoffs at Meta AI, led by the new Chief AI Officer Alexandr Wang, are more than organizational streamlining: they signal a significant shift in the company's AI strategy and affect prominent figures such as Tian Yuandong, who has been with Meta for over a decade [1][6].
Group 1: Tian Yuandong's Background and Contributions
- Tian Yuandong has a strong academic background, with degrees from Shanghai Jiao Tong University and a PhD from Carnegie Mellon University, specializing in robotics [7][8].
- He joined Facebook (now Meta) in 2014 and has made significant contributions to AI, including the development of the Go AI "Dark Forest," which reached a level comparable to top amateur players before AlphaGo [9][12].
- His research focus later shifted toward AI interpretability and foundational principles. He declined an invitation from OpenAI to work on language models in order to concentrate on understanding how neural networks operate [13].
Group 2: Recent Developments and Innovations
- Recently, Tian Yuandong led a team focused on planning and reasoning in AI, publishing a paper on the role of key hyperparameters in "Grokking" and the effectiveness of optimizers such as Muon [14][15].
- His work includes memory-efficient training methods such as GaLore, which reduces the memory required to pre-train a 7B model to under 24GB, enabling training on consumer-grade GPUs [16].
- The Dualformer model integrates "fast thinking" and "slow thinking" processes, allowing it to respond dynamically to both simple and complex problems, while the Coconut paradigm compresses reasoning trajectories into a continuous latent space [16].
Group 3: Industry Reactions and Future Prospects
- Following the layoffs, OpenAI and various startups quickly expressed interest in recruiting Tian Yuandong and his team members, a sign of how competitive the AI job market is [4][6].
- Tian Yuandong's workplace experiences may inspire his creative endeavors, as he is also a science fiction author, with his first novel set to be published in 2024 [17][20].
The First Survey of Latent Space Reasoning! Models Need Not Think in Tokens, and Bandwidth Surges 2,700+ Times
QbitAI (量子位) · 2025-07-16 01:49
Core Insights
- The article presents a comprehensive overview of latent space reasoning, highlighting its potential to achieve over 2,700 times the bandwidth of explicit chain-of-thought (CoT) reasoning [1][15].
Group 1: Overview of Latent Space Reasoning
- Latent space reasoning is an emerging field that traces its origins to the 2019 ICLR paper "Universal Transformers" by researchers from the University of Amsterdam and Google Brain [7].
- The article introduces a unified framework for latent space reasoning, which is grounded in mechanistic interpretability and connects to the internal operations of models [3][4].
- The framework aims to facilitate future explorations, such as investigating infinite-depth reasoning through diffusion models [4].
Group 2: Mechanisms of Latent Space Reasoning
- Latent space reasoning employs latent chains of thought, which represent reasoning in a continuous internal form rather than as discrete natural language tokens [13][14].
- This significantly increases bandwidth: each token in explicit CoT carries approximately 15 bits, while a latent CoT step in a 2560-dimensional FP16 space carries around 40,960 bits [15].
- The reasoning process is not constrained by a limited vocabulary, allowing for richer expressive capabilities [16].
Group 3: Modes of Latent Space Reasoning
- There are two primary modes of latent space reasoning: vertical cycles and horizontal cycles [19].
- Vertical cycles use activation-based methods to extend computational depth, letting a model process the same set of layers repeatedly to enhance reasoning [20][21].
- Horizontal cycles expand the model's memory and reasoning capabilities over time by maintaining a compressed hidden state that aggregates information from multiple time steps [28][29].
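The bandwidth comparison in Group 2 can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted in the summary (15 bits per token, 2560 dimensions, FP16). The variable names are illustrative, and reading "15 bits" as the log2 of a vocabulary of about 32K entries is an assumption, not something the summary states.

```python
import math

# Figures quoted in the summary.
BITS_PER_EXPLICIT_TOKEN = 15   # approximate information per CoT token
HIDDEN_DIM = 2560              # latent vector dimensionality
BITS_PER_FP16_VALUE = 16       # one half-precision float

# Raw capacity of one latent reasoning step.
latent_bits_per_step = HIDDEN_DIM * BITS_PER_FP16_VALUE

# Ratio of latent-step capacity to explicit-token capacity.
ratio = latent_bits_per_step / BITS_PER_EXPLICIT_TOKEN

# Assumption: 15 bits corresponds to choosing one entry from 2**15 tokens.
implied_vocab_size = 2 ** BITS_PER_EXPLICIT_TOKEN

print(f"latent bits per step: {latent_bits_per_step}")
print(f"bandwidth ratio:      {ratio:.1f}x")
print(f"implied vocab size:   {implied_vocab_size} (log2 = {math.log2(implied_vocab_size):.0f})")
```

The ratio comes out at roughly 2,731, which matches the "2700+" in the headline. This is an upper bound on raw storage, not a measurement of how much information a model actually uses per step.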
Group 4: Depth and Reasoning Capacity
- The relationship between layer depth and reasoning capability is critical, with studies indicating that a model's implicit reasoning-chain ability is strictly limited by its number of layers [34][40].
- Sufficient layer depth is necessary to execute multi-hop reasoning tasks effectively, as too few layers can prevent the final reasoning result from emerging [36][41].
- Research has established that the achievable length of reasoning chains is linearly related to the number of layers, positioning layer depth as a primary bottleneck for latent reasoning capacity [45].
Group 5: Advanced Reasoning Paradigms
- The concept of "infinite depth reasoning" is proposed, allowing AI to allocate unlimited "thinking time" to refine solutions without output length constraints [53].
- This can be achieved through spatial infinite reasoning, which uses text diffusion models, and temporal infinite reasoning, which equates longer sequences with more optimization iterations [54][57].
- The article discusses specific methods for implementing these advanced paradigms, emphasizing their potential to enhance latent space reasoning [58].
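The depth argument in Group 4 can be illustrated with a toy, under a strong simplifying assumption: each "layer" resolves exactly one hop of a fact chain. Nothing here comes from the survey itself; `one_hop_layer`, `run_depth`, and the `facts` table are hypothetical names chosen for illustration. Reapplying the same layer in a loop also mirrors the vertical-cycle idea of reusing layers to gain effective depth.

```python
from typing import Dict, Optional

def one_hop_layer(state: Optional[str], facts: Dict[str, str]) -> Optional[str]:
    """Toy 'layer': follow one relation edge, or return None if no edge exists."""
    if state is None:
        return None
    return facts.get(state)

def run_depth(start: str, facts: Dict[str, str], num_layers: int) -> Optional[str]:
    """Apply the same toy layer num_layers times (a weight-tied stack)."""
    state: Optional[str] = start
    for _ in range(num_layers):
        state = one_hop_layer(state, facts)
    return state

# A three-hop chain: A -> B -> C -> D.
facts = {"A": "B", "B": "C", "C": "D"}

two_layer_answer = run_depth("A", facts, num_layers=2)    # stops one hop short
three_layer_answer = run_depth("A", facts, num_layers=3)  # resolves the full chain

print(two_layer_answer, three_layer_answer)
```

In this toy, the number of hops that can be resolved equals the number of layer applications, which is the linear relationship the summary describes. Real models do not map one layer to one hop so cleanly.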
The Interwoven Histories of Malay Cuisine | Khir Johari | TEDxSingapore
TEDx Talks· 2025-07-10 16:12
What did you have for lunch today? Food is more than what we eat. It is who we are. Have you ever thought what your food says about you? My own journey into the rich, complex world of Malay gastronomy is like reading a love story. A love story that's lost and I'm rediscovering it. This endeavor triggers interviews with custodians of food knowledge, food wisdom, travels to distant libraries across the globe and reconstructing food scenarios that sometimes require the whole village. Exactly 200 years ago in 1 ...
"The Coconuts Are Sweet, as Sweet as Sanya!"
Hainan Daily (Hai Nan Ri Bao) · 2025-05-02 00:30
Group 1
- Sanya is distributing 51,000 free coconuts to tourists from May 1 to May 5, enhancing the holiday experience and showcasing local hospitality [1][4].
- Volunteers are distributing fresh coconuts at key locations such as the airport, train stations, and popular tourist spots [1][3].
- The initiative has attracted participation from over 100 tourism-related businesses, aiming to create a warm and inviting atmosphere for visitors [4].
Group 2
- Various fun activities, including coconut bowling and coconut racing, are being organized at the Dadonghai scenic area to entertain both domestic and international tourists [2].
- The event is designed to promote interaction among tourists while providing them with free coconuts as rewards for participation [2].
Group 3
- A special "Sweet Coconut" taxi fleet has been established, offering free transportation services to tourists, with vehicles marked by different colors to indicate availability [4].
- This initiative aims to enhance the overall tourist experience while maintaining the primary business operations of the taxi service [4].