Unsupervised Learning
LeCun's Rant Before His Departure Is Brutal
量子位· 2025-12-21 05:45
Core Viewpoint
- LeCun expresses skepticism about the potential of large language models (LLMs) to achieve artificial general intelligence (AGI), arguing that the path to superintelligence through LLMs is fundamentally flawed [2][78]

Group 1: Departure from Meta
- LeCun is leaving Meta after nearly 12 years, criticizing the company's increasingly closed approach to research and its focus on short-term projects [3][11][26]
- He plans to establish a new company named Advanced Machine Intelligence (AMI), which will prioritize open research and focus on world models [10][19]

Group 2: World Models vs. LLMs
- LeCun believes that world models, which handle high-dimensional and continuous data, are fundamentally different from LLMs, which excel at discrete text data [28][29]
- He argues that relying solely on text data will never allow AI to reach human-level intelligence, because the complexity of real-world data far exceeds that of text [31][32]

Group 3: Research Philosophy
- LeCun emphasizes the importance of open research and publication, stating that research which is not shared and scrutinized lacks validity [15][17]
- He critiques Meta's shift toward short-term projects, arguing that true breakthroughs require long-term, open-ended research [18][26]

Group 4: Future of AI
- LeCun envisions that the development of world models and planning capabilities could lead to significant advances in AI, but achieving human-level intelligence will require substantial foundational work and theoretical innovation [84][85]
- He asserts that the hardest part of AI development is not the final step to human-level intelligence but reaching dog-level intelligence in the first place, because that step demands a deep understanding of foundational theory [88][89]

Group 5: Personal Mission
- At 65, LeCun remains committed to amplifying human intelligence, which he views as the scarcest resource and a key driver of societal progress [92][94]
- Reflecting on his career, he expresses a desire to keep contributing to the field and stresses the importance of open collaboration in scientific advancement [103]
Three Weeks from Departure, LeCun's Final Warning: Silicon Valley Has Fallen into a Collective Delusion
36Ke · 2025-12-16 07:11
Core Viewpoint
- LeCun criticizes Silicon Valley's obsession with large language models (LLMs), asserting that this approach is a dead end and will not lead to artificial general intelligence (AGI) [1][3][26]

Group 1: Critique of Current AI Approaches
- LeCun argues that the current trend of stacking LLMs and relying on massive amounts of synthetic data is misguided and ineffective for achieving true intelligence [1][3][26]
- He emphasizes that the real challenge in AI is not reaching human-like intelligence but understanding basic intelligence of the kind demonstrated by cats and young children [3][12]
- The focus on LLMs is described as a dangerous "herd mentality" in the industry, with major companies like OpenAI, Google, and Meta all pursuing similar strategies [26][30]

Group 2: Introduction of World Models
- LeCun advocates a different approach called "world models," which makes predictions in an abstract representation space rather than relying on pixel-level outputs [3][14]
- He believes that world models can effectively handle high-dimensional, continuous, and noisy data, which LLMs struggle with [14][12]
- World models are tied to the idea of planning: the system predicts the outcomes of candidate actions in order to optimize task completion (a minimal planning-loop sketch follows this summary) [14][12]

Group 3: Future Directions and Company Formation
- LeCun plans to establish a new company, Advanced Machine Intelligence (AMI), focused on world models and maintaining an open research tradition [4][5][30]
- AMI aims not only to conduct research but also to develop practical products built around world models and planning [9][30]
- The company will be global, headquartered in Paris with offices in other locations, including New York [30]

Group 4: Perspectives on AGI and AI Development Timeline
- LeCun dismisses the concept of AGI as meaningless, arguing that human intelligence is itself highly specialized and cannot be replicated in a single model [31][36]
- He predicts that significant advances in AI could occur within 5-10 years, potentially reaching intelligence levels comparable to dogs, while acknowledging that unforeseen obstacles may extend this timeline [31][33]

Group 5: Advice for Future AI Professionals
- LeCun advises against pursuing computer science as a primary focus, suggesting instead subjects with long-lasting relevance such as mathematics, engineering, and physics [45][46]
- He emphasizes the importance of learning how to learn and adapting to rapid technological change in the AI field [45][46]
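As a concrete illustration of the planning idea in Group 2, here is a minimal sketch of a world-model planning loop. It assumes a toy one-dimensional dynamics function and a random-shooting search; it is not LeCun's JEPA design or any Meta/AMI code, just the generic "predict the outcomes of candidate actions, pick the best" pattern.

```python
import random

# Toy world-model planning loop: a learned model predicts the consequence of an
# action in an abstract state space, and the agent selects the action sequence
# whose predicted outcome is closest to a goal. The 1-D dynamics and the
# random-shooting planner are illustrative assumptions only.

def world_model(state: float, action: float) -> float:
    """Toy learned dynamics: predict the next abstract state."""
    return state + 0.9 * action  # stand-in for a trained predictor

def cost(state: float, goal: float) -> float:
    """Task cost measured in the abstract representation space."""
    return (state - goal) ** 2

def plan(state: float, goal: float, horizon: int = 5, n_candidates: int = 256) -> float:
    """Simulate candidate action sequences with the world model (instead of
    acting in the real world) and return the first action of the best one."""
    best_seq, best_cost = None, float("inf")
    for _ in range(n_candidates):
        seq = [random.uniform(-1.0, 1.0) for _ in range(horizon)]
        s = state
        for a in seq:
            s = world_model(s, a)  # predict the outcome rather than act
        c = cost(s, goal)
        if c < best_cost:
            best_seq, best_cost = seq, c
    return best_seq[0]

if __name__ == "__main__":
    state, goal = 0.0, 3.0
    for _ in range(10):
        action = plan(state, goal)
        state = world_model(state, action)  # in reality: act, then observe
    print(f"final abstract state: {state:.2f} (goal {goal})")
```

The design point of this pattern is that the cost is evaluated on predicted states, so the system can search over actions without executing them; any stronger planner (gradient-based, tree search) slots into the same loop.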
Jensen Huang's Latest Interview: Still Afraid of Going Under, Still Very Anxious
半导体芯闻· 2025-12-08 10:44
Core Insights
- The discussion highlights the transformative impact of artificial intelligence (AI) and the role of NVIDIA in driving this technological revolution, emphasizing the importance of GPUs in applications ranging from gaming to modern data centers [2][10]

Group 1: AI and Technological Competition
- The conversation underscores that the world is in a significant technological race, particularly in AI, where the first to reach advanced capabilities will gain substantial advantages [11][12]
- Historical context is provided, indicating that the U.S. has been in a technological competition since the Industrial Revolution, with AI being the latest frontier [12][13]

Group 2: Energy and Manufacturing
- The importance of energy growth and domestic manufacturing is emphasized as critical for national security and economic prosperity, with a call for revitalizing U.S. manufacturing capabilities [8][9]
- Without energy growth, industrial growth and job creation would be severely hindered, linking energy policy directly to advances in AI and technology [9][10]

Group 3: AI Development and Safety
- Concerns about the risks associated with AI are acknowledged, particularly regarding its potential military applications and ethical implications [19][20]
- AI's development is expected to be gradual rather than sudden, with a focus on enhancing the safety and reliability of AI systems [14][15]

Group 4: Future of AI and Knowledge Generation
- The potential for AI to generate a significant portion of knowledge in the future is discussed, with predictions that AI could produce up to 90% of knowledge within a few years [41][42]
- The necessity of continuous verification of AI-generated information is highlighted, stressing the importance of ensuring accuracy and reliability in AI outputs [41][42]

Group 5: Cybersecurity and Collaboration
- Cybersecurity is described as collaborative, with companies sharing information and best practices to combat threats collectively [23][24]
- A unified approach to cybersecurity is needed in the face of evolving threats, making cooperation essential for effective defense [23][24]
Jensen Huang's Latest Interview: Still Afraid of Going Under, Still Very Anxious
半导体行业观察· 2025-12-06 03:06
Core Insights
- The discussion highlights the transformative impact of artificial intelligence (AI) and the role of NVIDIA in driving this technological revolution, emphasizing the importance of GPUs in applications ranging from gaming to modern data centers [1]
- Jensen Huang discusses the risks and rewards associated with AI, the global AI race, and the significance of energy and manufacturing for future innovation [1]

Group 1: AI and Technological Competition
- Technological competition has been a constant since the Industrial Revolution, and the current AI race is among the most critical [10][11]
- Jensen Huang emphasizes that technological leadership is essential for national security and economic prosperity, linking energy growth to industrial growth and job creation [7][8]
- The conversation touches on the historical context of technological races, including the Manhattan Project and the Cold War, underscoring the continuous nature of these competitions [11]

Group 2: AI Development and Safety
- Huang expresses optimism about the gradual development of AI, suggesting that advances will be incremental rather than sudden [13]
- The discussion addresses concerns about AI's potential risks, including the ethical implications of military applications and the need for robust cybersecurity measures [16][20]
- Huang believes AI's capabilities will increasingly focus on safety and reliability, reducing errors or "hallucinations" in AI outputs [14]

Group 3: Future of Work and AI's Impact
- The conversation explores a future in which traditional jobs may become obsolete, potentially leading to a society where individuals receive universal basic income [37]
- Huang acknowledges the challenges of identity and purpose as AI takes over tasks traditionally performed by humans, emphasizing the need for society to adapt to these changes [38]
- The discussion highlights the importance of maintaining human engagement and problem-solving in a future dominated by AI technologies [38]

Group 4: Quantum Computing and Security
- Huang discusses the implications of quantum computing for encryption and cybersecurity, suggesting that while current encryption methods may become outdated, the industry is actively developing post-quantum encryption technologies [22][23]
- Cybersecurity efforts are described as collaborative, with companies sharing information to enhance collective defenses against threats [20][21]
- Huang asserts that AI will play a crucial role in future cybersecurity measures, leveraging its capabilities to protect against evolving threats [21]
Jensen Huang's 10,000-Word Interview: Feeling Every Day for 33 Years That the Company Could Fail; the AI Race Has No "Finish Line" and Iteration Is What Matters
华尔街见闻· 2025-12-05 09:39
NVIDIA founder and CEO Jensen Huang recently sat for a two-hour in-depth podcast interview in which he laid out his views on the AI race, running the company, and his own personal growth. The head of one of the world's most valuable technology companies revealed, with rare candor, a surprising fact: even though NVIDIA has become the core company of the AI era, he still wakes up every day feeling the company is "30 days away from going out of business."

On the AI race that currently commands global attention, Huang offered a view sharply at odds with the mainstream. He argued that the race has no clear "finish line," as outsiders imagine, nor will any single party suddenly gain an overwhelming advantage. Instead, technological progress will be incremental, and all participants will evolve together, standing on the shoulders of AI.

In his view, real competitiveness lies in the ability to iterate continuously, not in one-off breakthroughs. Over the past 10 years AI compute has increased 100,000-fold, but that compute has gone into making AI think more carefully and check its answers, not into doing dangerous things. When NVIDIA launched CUDA in 2005 its share price plunged 80%, yet the sustained investment ultimately became the infrastructure of today's AI revolution. Iteration is not repetition; it is continuous correction grounded in first principles.

Huang also recounted in detail the several times NVIDIA came close to bankruptcy, including the perilous moment in 1995 when it chose the wrong technical path and survived only on a $5 million investment from Sega and the trust of TSMC's Morris Chang. These experiences shaped his distinctive ... of risk, strategy, and leadership ...
Jensen Huang's 10,000-Word In-Depth Interview: The AI Race Has No "Finish Line," Iteration Is What Matters, and for 33 Years He Has Felt Every Day That the Company Could Fail
美股IPO· 2025-12-04 23:43
In the interview, Huang argued that the AI race has no clear finish line, that the ability to iterate continuously matters more than any one-off breakthrough, that technological progress is incremental, and that all participants will evolve together. Over the past 10 years AI compute has increased 100,000-fold, but that compute has gone into making AI think more carefully and check its answers, not into doing dangerous things. Huang also recounted in detail the several times NVIDIA came close to bankruptcy, including the perilous moment in 1995 when it chose the wrong technical path and survived only on a $5 million investment from Sega and the trust of TSMC's Morris Chang.

NVIDIA founder and CEO Jensen Huang recently sat for a two-hour in-depth podcast interview in which he laid out his views on the AI race, running the company, and his own personal growth. The head of one of the world's most valuable technology companies revealed, with rare candor, a surprising fact: even though NVIDIA has become the core company of the AI era, he still wakes up every day feeling the company is "30 days away from going out of business."

On the AI race that currently commands global attention, Huang offered a view sharply at odds with the mainstream. He argued that the race has no clear "finish line," as outsiders imagine, nor will any single party suddenly gain an overwhelming advantage. Instead, technological progress will be incremental, and all participants will evolve together, standing on the shoulders of AI.

In his view, real competitiveness lies in the ability to iterate continuously, not in one-off breakthroughs. Over the past 10 years AI compute has increased 100,000-fold, but that compute has gone into making AI more careful ...
Google's AI Past: Twenty Hidden Years, and 365 Days at a Sprint
36Ke · 2025-11-27 12:13
Core Insights
- Google has undergone a significant transformation in the past year, moving from a state of perceived stagnation to a strong resurgence in AI capabilities, highlighted by the success of its Gemini applications and models [2][3][44]
- The company's long-term investment in AI technology, dating back over two decades, has laid a robust foundation for its current advances, showcasing a strategic evolution rather than a sudden breakthrough [3][6][45]

Group 1: Historical Context and Development
- Google's AI journey began with Larry Page's vision of creating an ultimate search engine capable of understanding the internet and user intent [9][47]
- The establishment of Google Brain in 2011 marked a pivotal moment, focusing on unsupervised learning methods that would later prove essential for AI advancements [12][18]
- The "cat paper" published in 2012 demonstrated the feasibility of unsupervised learning and led to the development of recommendation systems that transformed platforms like YouTube [15][16]

Group 2: Key Acquisitions and Innovations
- The acquisition of DeepMind in 2014 for $500 million solidified Google's dominance in AI, providing access to top-tier talent and innovative research [22][24]
- Google's development of Tensor Processing Units (TPUs) was a strategic response to the limitations of existing hardware, enabling more efficient processing of AI workloads [25][30]

Group 3: Challenges and Strategic Shifts
- The emergence of OpenAI and the success of ChatGPT in late 2022 prompted Google to reassess its AI strategy, leading to a restructuring of its AI teams and a renewed focus on a unified model, Gemini [41][42]
- The rapid development and deployment of Gemini and its variants, such as Gemini 3 and Nano Banana Pro, have positioned Google back at the forefront of the AI landscape [43][44]

Group 4: Future Outlook
- Google's recent advances in AI reflect a culmination of years of strategic investment and innovation, reaffirming its identity as a company fundamentally rooted in AI rather than merely a search engine [47][48]
How Many More Years Until Predicting the Next Pixel? Google: Five Will Do
机器之心· 2025-11-26 07:07
Core Insights
- The article discusses the potential of next-pixel prediction for image recognition and generation, highlighting its scalability challenges compared to natural language tasks [6][21]
- While next-pixel prediction is a promising approach, it requires significantly more computational resources than language modeling, with a token-per-parameter ratio 10-20 times higher [6][15][26]

Group 1: Next-Pixel Prediction
- Next-pixel prediction can be learned end to end without labeled data, making it a form of unsupervised learning [3][4]
- Achieving compute-optimal performance in next-pixel prediction requires a much higher token-to-parameter ratio than text: at least around 400 for pixel models versus about 20 for language models [6][15]
- The research identifies three core questions: how to evaluate model performance, whether scaling laws are consistent with downstream tasks, and how scaling trends vary across image resolutions [7][8]

Group 2: Experimental Findings
- Experiments at a fixed resolution of 32×32 pixels show that the optimal scaling strategy is highly task-dependent, with image generation requiring a larger token-to-parameter ratio than classification [18][22]
- As image resolution increases, model size must grow faster than data size to remain compute-optimal, indicating that compute capacity, not data availability, is the primary bottleneck [18][26]
- Scaling trends for next-pixel prediction can be predicted using frameworks established for language models, but the optimal scaling strategies differ significantly between tasks [21][22]

Group 3: Future Outlook
- The article predicts that next-pixel modeling will become feasible within the next five years, given that training compute is expected to grow four to five times annually [8][26] (a back-of-the-envelope sketch of this scaling argument follows this summary)
- Despite current challenges, the path toward pixel-level modeling remains viable and could reach competitive performance in the future [26]
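As referenced in Group 3, here is a back-of-the-envelope sketch of the scaling argument, not the paper's own calculation: it assumes the common C ≈ 6·N·D training-cost heuristic, takes the article's token-per-parameter ratios (about 20 for language, 400 for pixels) and its 4-5x annual compute growth figure, and uses a purely hypothetical 10-billion-parameter model and "affordable today" budget.

```python
import math

# Assumptions (hypothetical except where noted):
#   - compute-optimal training cost ~ 6 * N * D (N = parameters, D = tokens);
#     a standard heuristic, not a figure from the article
#   - token-per-parameter ratios reported in the article: ~20 (language), ~400 (pixels)
#   - training compute grows 4-5x per year (growth rate cited in the article)

TOKENS_PER_PARAM = {"language": 20, "pixel": 400}

def training_flops(n_params: float, task: str) -> float:
    """Approximate compute-optimal training FLOPs for a model with n_params."""
    d_tokens = TOKENS_PER_PARAM[task] * n_params
    return 6.0 * n_params * d_tokens

def years_until_affordable(target_flops: float, budget_today: float,
                           annual_growth: float = 4.5) -> float:
    """Years until an annually growing compute budget covers target_flops."""
    if target_flops <= budget_today:
        return 0.0
    return math.log(target_flops / budget_today) / math.log(annual_growth)

if __name__ == "__main__":
    n = 10e9  # hypothetical 10B-parameter model
    for task in ("language", "pixel"):
        print(f"{task:8s}: ~{training_flops(n, task):.2e} FLOPs "
              f"({TOKENS_PER_PARAM[task]} tokens per parameter)")
    # A 20x compute gap closes in roughly two years at 4.5x annual growth; the
    # article's five-year horizon presumably also covers larger models and
    # higher resolutions than this toy setting.
    budget = training_flops(n, "language")  # hypothetical: the LM run is affordable today
    gap_years = years_until_affordable(training_flops(n, "pixel"), budget)
    print(f"years until the pixel run fits the budget: {gap_years:.1f}")
```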
When the Brain Is Alone, What Is It Thinking About?
Hu Xiu· 2025-10-08 01:33
Core Insights
- The article discusses unsupervised learning in the brain, highlighting its significance for understanding how organisms, including humans, learn without external rewards or guidance [1][2][4]

Summary by Sections

Unsupervised Learning as Brain Preparation
- Unsupervised learning is not unique to humans; mice, for instance, can form spatial memories without rewards when exploring new environments [2][3]
- A study by scientists at the Howard Hughes Medical Institute used a controlled virtual-reality environment to observe neural changes in mice during unsupervised and supervised learning [3][4]

Neural Plasticity and Learning Pathways
- Both the task-trained and unsupervised groups showed similar plasticity changes in the visual cortex, suggesting that neural plasticity may not rely solely on task feedback [4][5]
- Unsupervised learning allows the brain to categorize and encode visual information efficiently, akin to pre-studying before a formal task [5][6]

Visual and Spatial Plasticity
- The research examined whether neural responses were sensitive to spatial positions or to visual features, concluding that visual features dominated unsupervised learning behavior [7][8]
- Mice tended to ignore the spatial configuration of textures, indicating a preference for visual-feature similarity over spatial positioning [8]

Collaboration of Learning Types
- The study suggests a division of labor in the brain: unsupervised learning extracts features while supervised learning assigns meaning to those features (a loose machine-learning analogy follows this summary) [9][12]
- This dual learning approach may be crucial for rapid adaptation in complex environments [12]

Implications for Neuroscience and AI
- The findings bridge neuroscience and artificial intelligence, challenging the traditional view that learning requires reinforcement signals [14]
- Insights into the brain's feature-extraction capabilities could inform the design of more efficient AI models that rely less on labeled data [14][15]

Future Research Directions
- Open questions remain about the molecular basis of this neural plasticity and whether the findings generalize across species and cognitive levels [16][17]
- Possible age-related limits on unsupervised learning and their implications for cognitive development warrant further investigation [18]

Broader Insights on Learning
- The article emphasizes the evolutionary significance of unsupervised learning as a survival mechanism, suggesting that over-reliance on reward-driven learning may dull natural exploratory abilities [19][20]
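As a loose machine-learning analogy to the division of labor described under "Collaboration of Learning Types" (not a model of the mouse experiment), the sketch below extracts features with an unsupervised stage and then attaches labels with a small supervised readout; the dataset, component count, and classifier are arbitrary choices made for illustration.

```python
# Unsupervised stage extracts structure from raw inputs without labels;
# a supervised readout then assigns meaning (labels) to those features.
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# "Unsupervised learning": fit the feature extractor without ever seeing labels.
extractor = PCA(n_components=16).fit(X_train)
Z_train, Z_test = extractor.transform(X_train), extractor.transform(X_test)

# "Supervised learning": a lightweight readout maps features to labels.
readout = LogisticRegression(max_iter=1000).fit(Z_train, y_train)
print(f"accuracy with unsupervised features: {readout.score(Z_test, y_test):.3f}")
```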
The Most Comprehensive Survey of Speech Separation Is Here: Tsinghua and Other Teams Analyze 200+ Papers and Systematically Review Research on the "Cocktail Party Problem"
机器之心· 2025-09-03 04:33
Core Viewpoint
- The article discusses the revolutionary advances in speech separation, particularly in addressing the "cocktail party problem" through deep neural networks (DNNs) [2]

Group 1: Overview of Speech Separation
- Speech separation has become crucial for enhancing speech clarity in complex acoustic environments and serves as a preprocessing step for other speech processing tasks [2]
- Researchers from several institutions surveyed more than 200 representative papers, analyzing the latest methods across deep learning paradigms, model architectures, evaluation metrics, datasets, and open challenges [2]

Group 2: Problem Definition
- The authors divide speech separation into known-speaker and unknown-speaker settings, depending on whether the number of speakers is fixed or variable, and highlight the challenges of each scenario [6]
- In the unknown-speaker setting, dynamically determining the number of output channels and balancing separation quality against termination timing are identified as significant challenges [6]

Group 3: Learning Paradigms
- The article compares supervised and unsupervised methods, detailing the advantages and limitations of each approach for speech separation [10]
- Supervised learning is currently the most mature paradigm, training on paired mixed audio and clean source audio, while unsupervised methods explore training directly on unlabeled mixtures [12]

Group 4: Model Architectures
- The core components of separation models are summarized as an encoder, a separation network, and a decoder, along with their evolution over time [14]
- Architectures based on RNNs, CNNs, and Transformers are discussed, showcasing their strengths in capturing long-term dependencies and extracting local features [17][18]

Group 5: Evaluation Metrics
- A comprehensive evaluation system is needed to assess model performance, combining subjective and objective metrics [19]
- The article compares metrics and their trade-offs: subjective evaluations reflect human listening experience, while objective metrics are efficient but may emphasize different aspects (a reference SI-SDR implementation follows this summary) [20]

Group 6: Datasets
- Publicly available datasets for speech separation research are summarized and categorized into single-channel and multi-channel formats [22]
- Understanding the coverage and difficulty of these datasets helps researchers choose appropriate benchmarks and identify gaps in current research [22]

Group 7: Performance Comparison
- The authors compare model performance on standard datasets, illustrating the progress in speech separation technology over recent years [24]
- Notable gains in metrics such as SDR are highlighted, with advanced architectures reaching SDR levels of around 20 dB [24][25]

Group 8: Tools and Platforms
- Open-source tools and platforms that support the development and application of speech separation are introduced, with comparisons of their functionality and limitations [28]
- These tools give researchers convenient interfaces for reproducing results and building prototype systems, accelerating the transition from research to application [28]
Group 9: Challenges and Future Directions
- Current challenges include long-duration audio processing, mobile and embedded applications, real-time speech separation, and the rise of generative methods [32][33]
- The integration of pre-training techniques and the focus on target speaker extraction are identified as key areas for future exploration [33]
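To make the metrics discussion in Group 5 concrete, here is a generic implementation of the scale-invariant SDR (SI-SDR), a standard objective metric in the speech-separation literature; it is not code from the survey or from any particular toolkit, and the 16 kHz toy signals are placeholders.

```python
import numpy as np

def si_sdr(estimate: np.ndarray, reference: np.ndarray, eps: float = 1e-8) -> float:
    """Scale-invariant signal-to-distortion ratio in dB (higher is better)."""
    estimate = estimate - estimate.mean()
    reference = reference - reference.mean()
    # Project the estimate onto the reference to isolate the "target" component.
    scale = np.dot(estimate, reference) / (np.dot(reference, reference) + eps)
    target = scale * reference
    noise = estimate - target
    return float(10 * np.log10((np.sum(target**2) + eps) / (np.sum(noise**2) + eps)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = rng.standard_normal(16000)            # 1 s of toy "speech" at 16 kHz
    noisy = clean + 0.1 * rng.standard_normal(16000)
    print(f"SI-SDR of the noisy mixture: {si_sdr(noisy, clean):.1f} dB")
```

In multi-speaker evaluation the same formula is typically applied per separated source, with permutation-invariant matching between estimates and references.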