AGI
X @Sam Altman
Sam Altman· 2025-10-21 23:21
RT Ben Goodger (@bengoodger): When I joined @OpenAI last year all I had was an idea: that putting the world’s best AI assistant at the heart of your browsing experience would transform the way we get stuff done online. Since then we have been able to build an incredible team, and today we’re thrilled to release an all-new product: ChatGPT Atlas - a new web browser designed for the AI era - an era that will be shaped by more human natural language interaction, agents and ultimately AGI. Atlas has ChatGPT as its ...
Bill Ackman: I wish we had a better relationship with China
CNBC Television· 2025-10-21 13:52
>> Where are you on the relationship with China? >> I wish we had a better relationship with China, absolutely. I think it's really unfortunate that two of the most important powers in the world are at loggerheads, and I just think we should make peace with China. I think that would be very, very good. That would be an incredible white swan. >> Let me ask you a slightly different question about maybe capitalism on a ...
OpenAI veteran Karpathy pours cold water: AI agents are still ten years away from "doing real work"
36Kr· 2025-10-21 12:42
Group 1
- Andrej Karpathy emphasizes that AI agents will need another ten years to mature, stating that current agents like Claude and Codex are not yet capable of being employed for real work [2][4][5]
- He critiques the current state of AI learning, arguing that reinforcement learning is inadequate and that true learning should resemble human cognitive processes, which involve reflection and growth rather than mere trial and error [11][12][22]
- Karpathy suggests that future breakthroughs in AI will require a shift from knowledge accumulation to self-growth capabilities and a reconstruction of cognitive structures [4][5][22]

Group 2
- The current limitations of large language models (LLMs) in coding tasks are highlighted, with Karpathy noting that they struggle with structured and nuanced engineering design [6][7][9]
- He categorizes human interaction with code into three types, emphasizing that LLMs are not yet capable of functioning as true collaborators in software development [7][9][10]
- Karpathy believes that while LLMs can assist in certain coding tasks, they are not yet capable of writing or improving their own code effectively [9][10][11]

Group 3
- Karpathy discusses the importance of a reflective mechanism in AI learning, suggesting that models should learn to review and reflect on their processes rather than focusing solely on outcomes [18][19][20]
- He introduces the concept of a "cognitive core," advocating for models to retain essential thinking and planning abilities while discarding unnecessary knowledge [32][36]
- He proposes that a smaller, more efficient model with only a billion parameters is feasible, arguing that high-quality data can yield effective cognitive capabilities without massive scale [34][36]

Group 4
- Karpathy asserts that AGI (Artificial General Intelligence) will gradually integrate into the economy rather than causing a sudden disruption, with digital knowledge work as its initial application area [38][39][40]
- He predicts that the future of work will involve a collaborative structure in which agents perform 80% of tasks under human supervision for the remaining 20% [40][41]
- The deployment of AGI will be a gradual process, starting with structured tasks like programming and customer service before expanding to more complex roles [48][49][50]

Group 5
- On fully autonomous driving, Karpathy states that it is a high-stakes task that cannot afford errors, unlike many other AI applications [59][60]
- He emphasizes that successful deployment of autonomous driving requires not just technological advances but also a supportive societal framework [61][62]
- The transition to widespread autonomous driving will be slow and incremental, beginning with specific use cases and gradually expanding [63]
DeepSeek's new model has Silicon Valley raving!
Wallstreetcn· 2025-10-21 10:13
Core Viewpoint
- DeepSeek has introduced a groundbreaking model, DeepSeek-OCR, which uses a novel approach called "contextual optical compression" to process long texts efficiently by compressing textual information into visual tokens, significantly reducing computational cost while maintaining high accuracy in document parsing [5][13][14]

Model Overview
- DeepSeek-OCR is designed to tackle the computational challenges of processing long texts, achieving 97% accuracy when the compression ratio is below 10x and maintaining around 60% accuracy even at 20x compression [6][15]
- The model has drawn significant attention, quickly accumulating 3.3K stars on GitHub and ranking second on HuggingFace's trending list [7]

Technical Innovations
- The model comprises two core components: the DeepEncoder, which converts images into highly compressed visual tokens, and the DeepSeek3B-MoE-A570M decoder, which reconstructs text from those tokens [19][20]
- The DeepEncoder employs a serial design that processes high-resolution images in three stages: local feature extraction, token compression, and global understanding, allowing it to produce a minimal number of high-density visual tokens [21][22]

Performance Metrics
- DeepSeek-OCR outperforms existing models, using only 100 visual tokens to exceed the performance of GOT-OCR2.0, which uses 256 tokens per page [18][19]
- The model supports multiple input modes, adapting its compression strength to the task, from "Tiny" (64 tokens) up to "Gundam" (up to 800 tokens) [23][25]

Future Implications
- The research suggests that a unified approach to visual and textual processing may be a pathway toward Artificial General Intelligence (AGI) [11]
- The team also proposes simulating the forgetting mechanism of human memory through optical compression, potentially enabling models to allocate computational resources dynamically based on a context's temporal relevance [34][37][38]
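The token accounting behind the compression figures above can be sketched in a few lines of Python. This is purely illustrative: the `compression_ratio` helper, the per-page text-token count, and the "Benchmark" label are assumptions; only the "Tiny" (64) and "Gundam" (800) budgets and the 100-token benchmark setting come from the summary.

```python
# Hypothetical sketch of the visual-token accounting described above:
# compression ratio = original text tokens / visual tokens after encoding.
def compression_ratio(text_tokens: int, vision_tokens: int) -> float:
    return text_tokens / vision_tokens

# Assumption: a dense page holds roughly 1000 text tokens.
page_text_tokens = 1000
for mode, vision_tokens in [("Tiny", 64), ("Benchmark", 100), ("Gundam", 800)]:
    ratio = compression_ratio(page_text_tokens, vision_tokens)
    # Per the summary: accuracy stays ~97% below 10x and falls to ~60% near 20x.
    print(f"{mode}: {vision_tokens} visual tokens -> {ratio:.1f}x compression")
```

Under these assumptions the 100-token benchmark setting sits right at the 10x ratio where the reported 97% accuracy still holds, which is why it can undercut GOT-OCR2.0's 256 tokens per page.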
Karpathy pours cold water: AGI is 10 years away, and there is no "year of the agent"
36Kr· 2025-10-21 02:15
Core Insights
- Andrej Karpathy discusses the future of AGI and AI over the next decade, emphasizing that current "agents" are still in their early stages and require significant development [1][3][4]
- He predicts that the core architecture of AI will likely remain similar to Transformer models, albeit with some evolution [8][10]

Group 1: Current State of AI
- Karpathy is skeptical of the notion of an "agent era," suggesting it should instead be called "the decade of agents," since agents still need about ten years of research to become truly functional [4][5]
- He identifies key issues with current agents, including lack of intelligence, weak multimodal capabilities, and inability to operate computers autonomously [4][5]
- The cognitive limitations of these agents stem from their inability to learn continuously, which Karpathy believes will take approximately ten years to address [5][6]

Group 2: AI Architecture and Learning
- Karpathy predicts that the fundamental architecture of AI will still be based on Transformer models a decade from now, although it may evolve [8][10]
- He stresses that advances in algorithms, data, hardware, and software systems are all equally crucial for progress [12]
- The best way to learn about AI, according to Karpathy, is hands-on experience building systems rather than purely theoretical study [12]

Group 3: Limitations of Current Models
- Karpathy critiques current large models for fundamental cognitive limitations, noting that complex work often still requires manual coding rather than relying solely on AI assistance [13][18]
- He categorizes coding approaches into three types: fully manual, manual with auto-completion, and fully AI-driven, with the latter being less effective for complex tasks [15][18]
- The industry is moving too quickly, sometimes producing subpar results while pretending to achieve significant advances [19]

Group 4: Reinforcement Learning Challenges
- Karpathy acknowledges that while reinforcement learning is not perfect, it remains the best available approach compared to previous methods [22]
- He highlights the challenges of reinforcement learning, including the complexity of problem-solving and the unreliability of evaluation models [23][24]
- Future improvements may require higher-level "meta-learning" or synthetic data mechanisms, but no successful large-scale implementations exist yet [26]

Group 5: Human vs. Machine Learning
- Karpathy contrasts human learning, which involves reflection and integration of knowledge, with current models that lack such processes [28][30]
- He argues that true intelligence lies in understanding and generalization rather than mere memory retention [30]
- The future of AI should focus on reducing mechanical memorization and enhancing cognitive processes similar to human learning [30]

Group 6: AI's Role in Society
- Karpathy views AI as an extension of computation and believes AGI will be capable of performing any economically valuable task [31]
- He emphasizes that AI should complement human work rather than replace it, suggesting a collaborative approach [34][36]
- He sees the emergence of superintelligence as a natural extension of societal automation, leading to a world where human understanding and control may diminish [37][38]
Huawei recruits top global AI talent; Yu Chengdong speaks out
Guancha.cn· 2025-10-21 01:41
Group 1
- Huawei has launched a recruitment campaign for top global AI talent, emphasizing its ambition to build a world-class AI team and achieve breakthroughs in AGI technology [1][3]
- The company seeks candidates who are academic pioneers, passionate about technology, and possess innovative thinking, requiring strong mathematical foundations and a commitment to AI [3][5]
- Competitive compensation packages will be offered, along with opportunities to tackle cutting-edge AI challenges and collaborate with award-winning researchers [5][7]

Group 2
- The recruitment targets undergraduate and master's graduates from domestic universities graduating between January 1, 2026, and December 31, 2026, as well as doctoral students from domestic and overseas institutions [7]
- Positions for undergraduates and master's graduates include AI software engineers, AI algorithm engineers, and AI data engineers, while doctoral positions focus on advanced AI algorithms and applications [7]
- Huawei has appointed Yu Chengdong as head of the Product Investment Review Board (IRB), tasked with ensuring efficient resource allocation toward strategic goals in AI [8]
A giant's largest-ever layoffs, with severance packages topping 4 million; Alibaba's Quark reportedly running a secret "C Plan" AI business, possibly targeting ByteDance's Doubao; Gree's Zhu Lei exposes rivals buying fake reviewers | Leiphone Morning Report
Leiphone· 2025-10-21 00:41
Key Points
- Gree's marketing director Zhu Lei exposed competitors buying fake reviews, suggesting that these actions were aimed at promoting Gree while disparaging Xiaomi [4][5]
- Alibaba's Quark is secretly developing an AI project called "C Plan," focusing on conversational AI applications and potentially competing with ByteDance's Doubao [8][9]
- Micron plans to exit the Chinese data center market due to a ban, which has caused a significant revenue loss, with its China sales share dropping from 14.03% in 2023 to an expected 7.1% by 2025 [10]
- Hongguo is reportedly testing short-drama e-commerce, leveraging its 200 million monthly active users to connect with Douyin's e-commerce platform [11]
- Meituan's S-team has added two new members, continuing its trend of promoting young talent within the organization [12]
- DJI is urgently recruiting a new head for its overseas stores following the departure of the previous leader [13]
- Mercedes-Benz is undergoing significant layoffs, with around 4,000 employees opting for a generous severance package, highlighting the company's restructuring efforts [38]
- Apple's AI department is reportedly in disarray, with key personnel leaving, raising concerns about the future of its AI initiatives [40]
Andrej Karpathy devastates AI optimists...
Matthew Berman· 2025-10-20 21:22
AGI Timelines and Agent Development
- Andrej Karpathy believes AGI (Artificial General Intelligence) is still more than 10 years away [1]
- The industry broadly expects 2025-2035 to be the decade of agents, but making agents truly usable and pervasive across the economy will require substantial development work [1]
- LLMs (Large Language Models) have made enormous progress in recent years, but a great deal of foundational work, integration work, physical-world sensors and actuators, societal work, safety work, and research remains to be done [1]

Learning Approaches and Model Capabilities
- Karpathy argues that LLMs learn more like "ghosts" than like animals; animals are born with a large amount of intelligence pre-wired by evolution [1][2]
- He is skeptical of the effectiveness of reinforcement learning (RL), viewing its learning signal per unit of compute as poor, and favors agentic interaction: building a "playground" where agents can experiment and learn [2]
- He is exploring System Prompt Learning, a new learning paradigm that shapes model behavior by modifying the system prompt, analogous to a human taking notes [2][3]

Model Size and Memorization
- The industry trend is for model sizes to grow first and then shrink; the "Cognitive Core" concept strips away an LLM's encyclopedic knowledge to make it better at generalization [3]
- He criticizes the current agent industry for over-investing in tooling while ignoring present capability levels, and stresses collaborating with LLMs by combining human strengths with LLM strengths [3]
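A minimal sketch of what a system-prompt-learning loop of the kind described above might look like. This is entirely illustrative: `call_llm`, `make_agent`, and the prompts are hypothetical stand-ins, not any real API; the only idea taken from the talk is that lessons accumulate in the prompt rather than in the weights.

```python
# Sketch of "system prompt learning": instead of updating weights, the agent
# appends distilled lessons to its own system prompt, like a human taking notes.
from typing import Callable, List

def make_agent(call_llm: Callable[[str, str], str]) -> Callable[[str], str]:
    notes: List[str] = []  # persistent "lessons" accumulated across tasks

    def run(task: str) -> str:
        # Every lesson learned so far is injected into the system prompt.
        system = "You are a careful assistant.\n" + "\n".join(
            f"Lesson: {n}" for n in notes
        )
        answer = call_llm(system, task)
        # After acting, ask the model to reflect and distill a reusable lesson.
        lesson = call_llm(
            system,
            f"Task: {task}\nAnswer: {answer}\nState one short, reusable lesson.",
        )
        notes.append(lesson)
        return answer

    return run
```

Plugging in any chat-completion backend for `call_llm` gives an agent whose behavior on task N is shaped by notes from tasks 1..N-1, without any gradient update.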
Erotic ChatGPT, Zuck’s Apple Assault, AI’s Sameness Problem
Alex Kantrowitz· 2025-10-20 19:07
Ranjan Roy from Margins is back for our weekly discussion of the latest tech news. We cover: 1) Sam Altman says ChatGPT will start to have erotic chats with interested adults 2) Also, more sycophancy? 3) Is sycophancy the lost love language? 4) Is erotic ChatGPT good for OpenAI’s business? 5) Is erotic ChatGPT a sign that AGI is actually far away? 6) OpenAI’s latest business metrics revealed 7) Google’s AI contributes to cancer discovery 8) Anthropic’s Jack Clark on AI becoming self-aware 9) Is Zuck poaching ...
Tencent Research Institute AI Express 20251021
Tencent Research Institute· 2025-10-20 16:01
Group 1: Oracle's AI Supercomputer
- Oracle launched the world's largest cloud AI supercomputer, OCI Zettascale10, consisting of 800,000 NVIDIA GPUs and achieving a peak performance of 16 ZettaFLOPS, serving as the core computing power for OpenAI's "Stargate" cluster [1]
- The supercomputer uses a custom Acceleron RoCE network architecture, significantly reducing communication latency between GPUs and ensuring automatic path switching during failures [1]
- Services are expected to be available to customers in the second half of 2026; the peak figure is likely based on low-precision computing metrics and requires further validation in practical applications [1]

Group 2: Google's Gemini 3.0
- Google's Gemini 3.0 appears to have launched under the aliases lithiumflow (Pro version) and orionmist (Flash version) in the LMArena, with Gemini 3 Pro being the first AI model capable of accurately recognizing clock times [2]
- Testing shows that Gemini 3 Pro excels at SVG drawing and music composition, effectively mimicking musical styles while maintaining rhythm, with significantly improved visual performance compared to previous versions [2]
- Despite the notable gains in model capability, evaluation methods in the AI community remain traditional, lacking innovative assessment techniques [2]

Group 3: DeepSeek's OCR Model
- DeepSeek has open-sourced a 3-billion-parameter OCR model, DeepSeek-OCR, which maintains 97% accuracy at compression ratios below 10x and around 60% accuracy at 20x compression [3]
- The model consists of the DeepEncoder (380M parameters) and a DeepSeek 3B-MoE decoder (570M activated parameters), outperforming GOT-OCR2.0 on OmniDocBench using only 100 visual tokens [3]
- A single A100-40G GPU can generate over 200,000 pages of LLM/VLM training data daily, supporting recognition in nearly 100 languages and showcasing efficient visual-text compression [3]

Group 4: Yuanbao AI Recording Pen
- Yuanbao has introduced a new feature for its AI recording pen, using Tencent's Tianlai noise-reduction technology to enable clear, accurate recording and transcription without additional hardware [4]
- The "Inner OS" feature interprets the speaker's underlying thoughts and nuances, helping users stay focused on the core content of meetings or conversations [4]
- Recordings can intelligently separate multiple speakers within a single audio segment, improving the clarity of meeting notes without repeated listening [4]

Group 5: Vidu's Q2 Features
- Vidu's Q2 reference-generation feature officially launched globally on October 21, with inference three times faster than the Q1 version, supporting multi-subject consistency generation and precise semantic understanding while maintaining 1080p HD video quality [5][6]
- The video-extension feature lets free users generate videos up to 30 seconds long, while paid users can extend videos up to 5 minutes, supporting text-to-video, image-to-video, and reference video generation [6]
- The Vidu app has been comprehensively redesigned, transitioning from an AI creation platform to a one-stop AI content social platform, featuring a large subject library for easy collaborative video generation [6]

Group 6: Gemini's Geolocation Intelligence
- Google has opened the Gemini API's Google Maps integration to all developers, providing location awareness for 250 million places and charging $25 per 1,000 grounded prompts [7]
- The feature supports the Gemini 2.5 Flash-Lite, 2.5 Pro, 2.5 Flash, and 2.0 Flash models and applies to scenarios such as restaurant recommendations, route planning, and travel itineraries, with real-time traffic and business-hours queries [7]
- This development signals a shift of AI from static tools toward dynamic "intelligent spaces"; domestic competitor Amap has previously launched similar smart applications [7]

Group 7: AI Trading Experiment
- The Alpha Arena experiment run by nof1.ai allocated $10,000 each to GPT-5, Gemini 2.5 Pro, Claude 4.5 Sonnet, Grok 4, Qwen3 Max, and DeepSeek V3.1 for real market trading, with DeepSeek V3.1 ranking first at over $3,500 in profit [8]
- DeepSeek secured the highest returns with only five trades, Grok-4 followed closely with one trade, and Gemini 2.5 Pro incurred the largest losses across 45 trades [8]
- The experiment treats the financial market as an ultimate test of intelligence, focusing on survival under uncertainty rather than mere cognitive capability [8]

Group 8: Robotics Development
- Yushu has released its fourth humanoid robot, H2, standing 180 cm tall and weighing 70 kg (BMI 21.6), with 31 joints, an increase of about 19% over the R1 model [9]
- H2 significantly upgrades movement fluidity and bionic features, is capable of ballet dancing and martial arts, and has a "face," earning it the title of "the most human-like bionic robot" [9]
- Compared with its predecessor H1, H2's joint control and balance algorithms have been greatly optimized, expanding its application prospects from industrial automation to entertainment and companionship services [9]

Group 9: Karpathy's Insights on AGI
- In a podcast, Karpathy said achieving AGI may still take a decade, a view 5-10x more cautious than the general optimism in Silicon Valley [10]
- He criticized the inefficiency of reinforcement learning, likening it to "sucking supervision signals through a straw," and highlighted its susceptibility to noise and interference [10]
- He introduced the concept of a "cognitive core," suggesting that future models will first grow larger before becoming smaller and more focused on a specialized cognitive nucleus [11]
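The Oracle figures above (Group 1) can be sanity-checked with simple arithmetic, sketched below in Python. The assumption, flagged in the summary itself, is that the 16 ZettaFLOPS peak is an aggregate low-precision number.

```python
# Back-of-the-envelope: implied per-GPU throughput of OCI Zettascale10.
total_peak_flops = 16e21   # 16 ZettaFLOPS peak, per the announcement
gpu_count = 800_000        # NVIDIA GPUs in the cluster

per_gpu_flops = total_peak_flops / gpu_count
print(f"Implied per-GPU peak: {per_gpu_flops / 1e15:.0f} PFLOPS")
# ~20 PFLOPS per GPU is only plausible for low-precision (e.g. FP4/FP8)
# tensor-core throughput, which supports the caveat in the summary above.
```

A per-GPU figure this high would not hold for FP16 or FP32, so the division itself shows why the "low-precision metrics" caveat matters when comparing headline ZettaFLOPS numbers.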