BERT
Is LeCun's Prediction Coming True? A Hardcore Roadmap to AGI: From BERT to Genie, Building a True World Model Step by Step Through the Masking Paradigm
量子位· 2026-01-01 02:13
非羊, compiled from 凹非寺. 量子位 | WeChat official account QbitAI. Yet behind the boom lies a war of concepts: what exactly is a world model? Is it the environment simulator used to train agents in reinforcement learning? A prediction model that has watched every YouTube video? Or a graphics engine that can generate unlimited 3D assets? Recently, a paper titled "From Masks to Worlds: A Hitchhiker's Guide to World Models" has been drawing attention on arXiv. A joint research team from MeissonFlow Research, Georgia Tech, UCLA, and UC Merced lays out a "construction guide" on the road to AGI. Unlike conventional surveys that catalogue hundreds of papers, the authors focus on how to actually build a world model. Their claim echoes LeCun: the road to a true World Model may not be autoregressive, but rather a narrow path paved with Masking. From BERT to MAE/MaskGIT, and on to today's Genie-3 and Discrete Diffusion models, masking is unifying representations across modalities. The paper argues that, starting from early masked pre-training (Masked Modeling), ...
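For readers unsure what "Masked Modeling" means concretely, the BERT-style objective at the root of this lineage fits in a few lines. The sketch below is a generic, minimal illustration (the `encoder` backbone, mask ratio, and the choice to mask discrete tokens are placeholder assumptions; MAE, MaskGIT, and Genie-style models vary the ratio and the modality, not the core idea):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def masked_modeling_loss(encoder: nn.Module, tokens: torch.Tensor,
                         mask_token_id: int, mask_ratio: float = 0.15) -> torch.Tensor:
    """BERT-style masked prediction: hide a fraction of tokens, predict them back."""
    # Randomly choose positions to corrupt.
    mask = torch.rand(tokens.shape) < mask_ratio
    corrupted = tokens.clone()
    corrupted[mask] = mask_token_id

    logits = encoder(corrupted)                 # (batch, seq_len, vocab_size)
    # The loss is computed only on the masked positions.
    return F.cross_entropy(logits[mask], tokens[mask])
```

Pushing `mask_ratio` toward 1.0 and iterating the prediction is the bridge from BERT-style pre-training to the diffusion-style generation discussed further down this feed.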
NUS Professor You Yang on the Bottlenecks of Intelligence Growth: Perhaps This Is How We Will Reach AGI?
机器之心· 2025-12-31 04:09
Published by 机器之心. With 2026 almost here, AI has entered a new stage: we have achieved astonishing results, yet we face bottlenecks to further growth. Professor You Yang of the National University of Singapore (NUS) recently published an in-depth analysis, "The Bottlenecks of Intelligence Growth." Original link: https://zhuanlan.zhihu.com/p/1989100535295538013 Starting from the technical fundamentals, the article pinpoints the core contradiction of intelligence growth and sketches a possible path toward AGI (Artificial General Intelligence). Key takeaways:
✅ The essence of intelligence growth is not architectural change but how compute is converted into intelligence: AI's core intelligence comes from pre-training and its loss structure (e.g., GPT's Next-Token Prediction). These mechanisms are better understood as ways of converting compute into intelligence than as intelligence itself.
✅ The root cause of the current bottleneck: the present paradigm (Transformer + massive compute) struggles to fully absorb ever-growing compute resources, producing the so-called "diminishing returns of pre-training."
✅ Compute cannot be scaled out of the problem: even if compute grows exponentially, intelligence gains will remain limited if existing algorithms cannot use those resources effectively.
✅ The way forward lies not in engineering optimization but in a breakthrough of the underlying paradigm: the article ...
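For contrast with the masked objective sketched earlier, the "loss structure" the essay points to for GPT is plain next-token prediction. A minimal sketch (here `model` is assumed to be any decoder that maps token ids to per-position logits):

```python
import torch
import torch.nn.functional as F

def next_token_loss(model, tokens: torch.Tensor) -> torch.Tensor:
    """GPT-style objective: every position predicts the token that comes next."""
    logits = model(tokens[:, :-1])                        # (batch, seq_len - 1, vocab)
    targets = tokens[:, 1:]                               # the same sequence shifted by one
    return F.cross_entropy(logits.reshape(-1, logits.size(-1)), targets.reshape(-1))
```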
Can the Transformer Support the Next Generation of Agents?
Tai Mei Ti APP· 2025-12-22 07:39
By 划重点KeyPoints, author Li Yue. On December 18, the 2025 Tencent ConTech Conference and Tencent Technology Hi Tech Day aired, bringing together academicians of the Chinese Academy of Engineering, leading scholars, founders of top technology companies, and prominent investors to discuss the opportunities and challenges of the intelligence era. Has the Transformer, the architecture once expected to carry us to AGI, already hit its ceiling? A straight-A student who can only do exam problems: before 2017, the mainstream approach to natural language processing (NLP) was RNNs (recurrent neural networks) and LSTMs (long short-term memory networks). They processed information like a diligent reader forced to go word by word in order, which was inefficient and poor at capturing long-range semantic dependencies. In 2017, Google's paper "Attention Is All You Need" changed everything. The Transformer architecture dropped recurrence and introduced the self-attention mechanism: instead of reading sequentially, it attends to all words in a sentence at once and computes the relevance weights between them. In the roundtable session, when the host handed the microphone to Zhang Xiangyu, chief scientist of 阶跃星辰 (StepFun), and asked about the future of model architectures, the academic heavyweight dropped a bombshell: the existing Transformer architecture cannot support the next generation of agents. And just before that, Stanford professor and "A ...
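The "attend to all words at once" step described above reduces to a short computation. A single-head sketch follows (real Transformers stack many heads and layers, add positional information, and learn the projection matrices inside each layer; the shapes here are illustrative):

```python
import torch
import torch.nn.functional as F

def self_attention(x: torch.Tensor, w_q: torch.Tensor,
                   w_k: torch.Tensor, w_v: torch.Tensor) -> torch.Tensor:
    """x: (batch, seq_len, d_model); w_q/w_k/w_v: (d_model, d_k) projection matrices."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.transpose(-2, -1) / (k.size(-1) ** 0.5)  # relevance of every token to every other
    weights = F.softmax(scores, dim=-1)                      # each row sums to 1
    return weights @ v                                       # each output is a weighted mix of all positions
```

Because every token looks at every other token in one matrix multiply, the computation parallelizes over the sequence, which RNNs and LSTMs could not do; the trade-off is that the cost grows quadratically with sequence length.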
Google's AI Backstory: Twenty Hidden Years and 365 Days at Full Sprint
36Kr· 2025-11-27 12:13
Core Insights
- Google has undergone a significant transformation in the past year, moving from a state of perceived stagnation to a strong resurgence in AI capabilities, highlighted by the success of its Gemini applications and models [2][3][44]
- The company's long-term investment in AI technology, dating back over two decades, has laid a robust foundation for its current advancements, showcasing a strategic evolution rather than a sudden breakthrough [3][6][45]

Group 1: Historical Context and Development
- Google's AI journey began with Larry Page's vision of creating an ultimate search engine capable of understanding the internet and user intent [9][47]
- The establishment of Google Brain in 2011 marked a pivotal moment, focusing on unsupervised learning methods that would later prove essential for AI advancements [12][18]
- The "cat paper" published in 2012 demonstrated the feasibility of unsupervised learning and led to the development of recommendation systems that transformed platforms like YouTube [15][16]

Group 2: Key Acquisitions and Innovations
- The acquisition of DeepMind in 2014 for $500 million solidified Google's dominance in AI, providing access to top-tier talent and innovative research [22][24]
- Google's development of Tensor Processing Units (TPUs) was a strategic response to the limitations of existing hardware, enabling more efficient processing of AI workloads [25][30]

Group 3: Challenges and Strategic Shifts
- The emergence of OpenAI and the success of ChatGPT in late 2022 prompted Google to reassess its AI strategy, leading to a restructuring of its AI teams and a renewed focus on a unified model, Gemini [41][42]
- The rapid development and deployment of Gemini and its variants, such as Gemini 3 and Nano Banana Pro, have positioned Google back at the forefront of the AI landscape [43][44]

Group 4: Future Outlook
- Google's recent advancements in AI reflect a culmination of years of strategic investment and innovation, reaffirming its identity as a company fundamentally rooted in AI rather than merely a search engine [47][48]
Diffusion Won't Die, BERT Lives Forever. Karpathy's Late-Night Reflection: Should the Autoregressive Era End?
36Kr· 2025-11-05 04:44
Core Insights
- The article discusses Nathan Barry's innovative approach to transforming BERT into a generative model using a diffusion process, suggesting that BERT's masked language modeling can be viewed as a specific case of text diffusion [1][5][26]

Group 1: Model Transformation
- Nathan Barry's research indicates that BERT can be adapted for text generation by modifying its training objectives, specifically through a dynamic masking rate that evolves from 0% to 100% [13][27]
- The concept of using diffusion models, initially successful in image generation, is applied to text by introducing noise and then iteratively denoising it, which aligns with the principles of masked language modeling [8][11]

Group 2: Experimental Validation
- Barry conducted a validation experiment using RoBERTa, a refined version of BERT, to demonstrate that it can generate coherent text after being fine-tuned with a diffusion approach (a minimal decoding sketch appears after this entry) [17][21]
- The results showed that even without optimization, the RoBERTa Diffusion model produced surprisingly coherent outputs, indicating the potential for further enhancements [24][25]

Group 3: Industry Implications
- The article highlights the potential for diffusion models to challenge existing generative models like GPT, suggesting a shift in the landscape of language modeling and AI [30][32]
- The discussion emphasizes that the generative capabilities of language models can be significantly improved through innovative training techniques, opening avenues for future research and development in the field [28][30]
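To make "iterative denoising as repeated unmasking" concrete, here is a rough decoding loop in the spirit of the article, built on Hugging Face's off-the-shelf `RobertaForMaskedLM`. It illustrates the mechanism only; it is not Barry's fine-tuning recipe or sampling schedule, and an un-fine-tuned checkpoint will produce much weaker text:

```python
import torch
from transformers import RobertaForMaskedLM, RobertaTokenizer

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
model = RobertaForMaskedLM.from_pretrained("roberta-base").eval()

def generate_by_unmasking(prompt: str, gen_len: int = 16, steps: int = 8) -> str:
    """Start from an all-masked suffix, then commit the most confident tokens round by round."""
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids[0, :-1]   # drop trailing </s>
    suffix = torch.full((gen_len,), tokenizer.mask_token_id, dtype=torch.long)
    ids = torch.cat([prompt_ids, suffix, torch.tensor([tokenizer.sep_token_id])])

    masked_pos = torch.arange(len(prompt_ids), len(prompt_ids) + gen_len)
    per_round = max(1, gen_len // steps)
    while len(masked_pos) > 0:
        with torch.no_grad():
            logits = model(ids.unsqueeze(0)).logits[0]
        conf, pred = logits[masked_pos].softmax(-1).max(-1)
        # MaskGIT-style schedule: reveal only the highest-confidence predictions this round.
        keep = conf.topk(min(per_round, len(masked_pos))).indices
        ids[masked_pos[keep]] = pred[keep]
        masked_pos = masked_pos[~torch.isin(torch.arange(len(masked_pos)), keep)]
    return tokenizer.decode(ids, skip_special_tokens=True)

# print(generate_by_unmasking("The Transformer architecture"))
```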
Yang Hongxia, Former Head of Large Models at Alibaba and ByteDance, Starts a Company: Large-Model Pre-Training Is Not a Compute Race Reserved for a Few Top Players | 智能涌现 Exclusive
Sohu Caijing· 2025-10-30 08:35
Core Insights
- Yang Hongxia, a key figure in large model research from Alibaba and ByteDance, has launched a new AI company, InfiX.ai, focusing on decentralized model training and innovation in the AI space [1][15][36]
- InfiX.ai aims to democratize access to large model training, allowing small and medium enterprises, research institutions, and individuals to participate in the process [4][16][19]

Company Overview
- InfiX.ai was founded by Yang Hongxia after her departure from ByteDance, with a focus on model-related technologies [1][15]
- The company has quickly assembled a team of 40 people in Hong Kong, leveraging the region's strong talent pool and funding opportunities [3][15]

Technological Innovations
- InfiX.ai is developing a decentralized approach to large model training, contrasting with the centralized models dominated by major institutions [4][16]
- The company has released the world's first FP8 training framework, which enhances training speed and reduces memory consumption compared to the commonly used FP16/BF16 [7][10]
- InfiX.ai's model fusion technology allows for the integration of different domain-specific models, reducing resource waste and enhancing knowledge sharing (a naive fusion sketch appears after this entry) [10][16]

Market Positioning
- The company is targeting challenging fields, particularly in healthcare, with a focus on cancer detection, to demonstrate the capabilities of its models [15][41]
- InfiX.ai's approach is gaining traction, with increasing interest from investors and a shift in perception towards decentralized model training in the industry [15][36]

Future Vision
- Yang Hongxia envisions a future where every organization has its own expert model, facilitated by model fusion across different domains and geographical boundaries [16][19]
- The company aims to make model training accessible and affordable, fostering a collaborative environment for AI development [16][19]
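The article gives no technical detail on InfiX.ai's fusion algorithm. The simplest baseline that the phrase "model fusion" evokes is parameter averaging of same-architecture checkpoints, sketched below purely as an illustration of the idea (not the company's method, which is presumably more involved):

```python
import torch

def average_checkpoints(state_dicts, weights=None):
    """Fuse several checkpoints of the same architecture by (weighted) parameter averaging."""
    if weights is None:
        weights = [1.0 / len(state_dicts)] * len(state_dicts)
    fused = {}
    for name, first in state_dicts[0].items():
        fused[name] = sum(w * sd[name].to(torch.float32) for w, sd in zip(weights, state_dicts))
        fused[name] = fused[name].to(first.dtype)            # cast back to the original dtype
    return fused

# Hypothetical usage: merge a medical fine-tune and a finance fine-tune of one base model.
# fused = average_checkpoints([medical.state_dict(), finance.state_dict()], weights=[0.5, 0.5])
# base_model.load_state_dict(fused)
```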
The Embedding Black Box Is History! A New Framework Has Models "Explain First, Then Learn the Embedding"
量子位· 2025-10-21 09:05
Core Insights
- The article introduces GRACE, a new explainable generative embedding framework developed by researchers from multiple universities, aimed at addressing the limitations of traditional text embedding models [1][6]

Group 1: Background and Limitations
- Text embedding models have evolved from BERT to various newer models, mapping text into vector spaces for tasks like semantic retrieval and clustering [3]
- A common flaw in these models is treating large language models as "mute encoders," which output vectors without explaining the similarity between texts [4]
- This black-box representation becomes a bottleneck in tasks requiring high interpretability and robustness, such as question-answer matching and cross-domain retrieval [5]

Group 2: GRACE Framework Overview
- GRACE transforms "contrastive learning" into "reinforcement learning," redefining the meaning of contrastive learning signals [6]
- The framework emphasizes generating explanations (rationales) for text before learning embeddings, allowing the model to produce logical and semantically consistent reasoning [7][25]
- GRACE consists of three key modules (a condensed pipeline sketch appears after this entry):
  1. Rationale-Generating Policy, which generates explanatory reasoning chains for input texts [8]
  2. Representation Extraction, which combines input and rationale to compute final embeddings [9]
  3. Contrastive Rewards, which redefines contrastive learning objectives as a reward function for reinforcement learning updates [11]

Group 3: Training Process
- GRACE can be trained in both supervised and unsupervised manners, utilizing labeled query-document pairs and self-alignment techniques [12][18]
- In the supervised phase, the model learns semantic relationships from a dataset of 1.5 million samples [13]
- The unsupervised phase generates multiple rationales for each text, encouraging consistent representations across different explanations [17]

Group 4: Experimental Results
- GRACE was evaluated across 56 datasets in various tasks, showing significant performance improvements over baseline models in retrieval, pair classification, and clustering [19][20]
- The results indicate that GRACE not only enhances embedding capabilities without sacrificing generative abilities but also provides transparent representations that can be understood by users [25][27]

Group 5: Conclusion
- Overall, GRACE represents a paradigm shift in embedding models, moving towards a framework that can explain its understanding process, thus enhancing both performance and interpretability [28]
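To ground the three-module description above, here is a condensed sketch of the "explain first, then embed, then reward" pipeline. The model names, the prompt, mean pooling, and the InfoNCE-style reward are all illustrative assumptions standing in for GRACE's actual policy model, pooling, and reward definition, which are specified in the paper rather than here:

```python
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoModelForCausalLM, AutoTokenizer

# Placeholder models: any small instruct LM and any sentence encoder would do for the sketch.
gen_tok = AutoTokenizer.from_pretrained("Qwen/Qwen2-0.5B-Instruct")
gen_lm  = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-0.5B-Instruct")
enc_tok = AutoTokenizer.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")
encoder = AutoModel.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")

def embed_with_rationale(text: str) -> torch.Tensor:
    """Step 1: generate a rationale; step 2: embed text plus rationale together."""
    prompt = f"Explain in one sentence what the following text is about: {text}\nExplanation:"
    ids = gen_tok(prompt, return_tensors="pt").input_ids
    out = gen_lm.generate(ids, max_new_tokens=40, do_sample=False)
    rationale = gen_tok.decode(out[0, ids.size(1):], skip_special_tokens=True)

    enc_in = enc_tok(text + " " + rationale, return_tensors="pt", truncation=True)
    hidden = encoder(**enc_in).last_hidden_state          # (1, seq_len, dim)
    return F.normalize(hidden.mean(dim=1), dim=-1)        # mean-pooled, unit-norm embedding

def contrastive_reward(q_emb, pos_emb, neg_embs, tau: float = 0.05) -> torch.Tensor:
    """InfoNCE-style reward: high when the query sits close to its positive document."""
    sims = torch.cat([q_emb @ pos_emb.T, q_emb @ torch.cat(neg_embs).T], dim=-1) / tau
    return F.log_softmax(sims, dim=-1)[:, 0]              # log-probability of the positive
```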
X @THE HUNTER ✴️
GEM HUNTER 💎· 2025-09-23 16:57
Cryptocurrency Trends - The document identifies a list of trending cryptocurrencies, including DOG, TOSHI, ASTER, APEX, MOMO, TRUMP, WLFI, PUMP, SUN, UFD, TROLL, BERT, NMR, BITCOIN, and BLESS [1] - The document acknowledges that the list of trending cryptocurrencies is incomplete and seeks community input to identify missing cryptocurrencies [1]
Zhang Xiaojun in Conversation with OpenAI's Yao Shunyu: A System That Generates New Worlds
Founder Park· 2025-09-15 05:59
Core Insights
- The article discusses the evolution of AI, particularly focusing on the transition to the "second half" of AI development, emphasizing the importance of language and reasoning in creating more generalizable AI systems [4][62]

Group 1: AI Evolution and Language
- The concept of AI has evolved from rule-based systems to deep reinforcement learning, and now to language models that can reason and generalize across tasks [41][43]
- Language is highlighted as a fundamental tool for generalization, allowing AI to tackle a variety of tasks by leveraging reasoning capabilities [77][79]

Group 2: Agent Systems
- The definition of an "Agent" has expanded to include systems that can interact with their environment and make decisions based on reasoning, rather than just following predefined rules (a minimal loop is sketched after this entry) [33][36]
- The development of language agents represents a significant shift, as they can perform tasks in more complex environments, such as coding and internet navigation, which were previously challenging for AI [43][54]

Group 3: Task Design and Reward Mechanisms
- The article emphasizes the importance of defining effective tasks and environments for AI training, suggesting that the current bottleneck lies in task design rather than model training [62][64]
- A focus on intrinsic rewards, which are based on outcomes rather than processes, is proposed as a key factor for successful reinforcement learning applications [88][66]

Group 4: Future Directions
- The future of AI development is seen as a combination of enhancing agent capabilities through better memory systems and intrinsic rewards, as well as exploring multi-agent systems [88][89]
- The potential for AI to generalize across various tasks is highlighted, with coding and mathematical tasks serving as prime examples of areas where AI can excel [80][82]
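The expanded notion of an agent described above, a system that interleaves reasoning with actions on an environment, can be sketched as a simple loop. Everything below (the prompt format, tool dictionary, and stop condition) is an assumed illustration, not OpenAI's or Yao Shunyu's implementation:

```python
from typing import Callable, Dict

def run_agent(llm: Callable[[str], str], tools: Dict[str, Callable[[str], str]],
              task: str, max_steps: int = 10) -> str:
    """Minimal reason-then-act loop: think, optionally call a tool, observe, repeat."""
    transcript = f"Task: {task}\n"
    for _ in range(max_steps):
        reply = llm(transcript + "Give a thought, then 'Action: tool[input]' or 'Final: answer'.\n")
        transcript += reply + "\n"
        if "Final:" in reply:
            # Outcome-based evaluation would score only this answer, not the steps above.
            return reply.split("Final:", 1)[1].strip()
        if "Action:" in reply and "[" in reply:
            call = reply.split("Action:", 1)[1].strip()
            name, arg = call.split("[", 1)
            observation = tools.get(name.strip(), lambda _: "unknown tool")(arg.rstrip("]"))
            transcript += f"Observation: {observation}\n"
    return "no answer within the step budget"
```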