Backpropagation
Geoffrey Hinton Just Became the Second Scientist to Surpass One Million Citations
机器之心· 2026-01-16 01:55
Core Viewpoint
- Geoffrey Hinton has officially become the second computer scientist in history to surpass 1 million citations on Google Scholar, marking a significant milestone in his academic career and his contributions to artificial intelligence [1][3]

Group 1: Academic Achievements
- Hinton's citation count currently stands at 1,000,083, with an h-index of 192, indicating his substantial impact on computer science and artificial intelligence [2]
- He is renowned for his work on backpropagation, which solved the training problem for multilayer neural networks and laid the groundwork for the deep learning revolution [10]
- Hinton, along with Yoshua Bengio and Yann LeCun, received the Turing Award in 2018, recognizing their pivotal contributions to deep learning [13]

Group 2: Key Contributions
- Hinton's notable innovations include the Boltzmann Machine, Restricted Boltzmann Machine, Deep Belief Network, the Dropout technique, t-SNE for data visualization, Capsule Networks, and Knowledge Distillation, among others [14]
- His collaboration on AlexNet, which won the ImageNet competition in 2012, is considered a landmark demonstration of the power of deep learning [16]
- The paper "Deep Learning," co-authored by Hinton, has garnered over 100,000 citations, summarizing the evolution and principles of the field [16]

Group 3: Personal Background and Career
- Born into an academic family, Hinton's early life was marked by high expectations, which shaped his relentless pursuit of knowledge [5][8]
- He moved to Canada in the 1980s and built a long-term academic career at the University of Toronto, contributing significantly to the development of AI in Canada [9]
- In later years, Hinton has voiced concerns about the potential risks of AI, emphasizing the need for caution in its development [20]

Group 4: Legacy and Impact
- Hinton's citation milestone reflects not only his individual achievements but also the collaborative efforts of his students, Alex Krizhevsky and Ilya Sutskever, who have also made significant contributions to AI [29]
- The historical context of Hinton's work illustrates the broader narrative of humanity's quest to understand intelligence, highlighting the transformative impact of his research on modern AI [31]
"Godfather of AI" Hinton Reveals the Decade-Old Auction for the First Time: I Had Already Decided Google Would Win
36Kr· 2025-12-21 23:25
Core Insights
- The conversation between AI pioneers Hinton and Jeff Dean at NeurIPS 2025 highlighted the evolution of AI, discussing key breakthroughs and challenges in the field [1][4][14]

Group 1: Historical Context and Key Developments
- Hinton and Dean reflected on the early breakthroughs in machine learning and the significant impact of the Transformer paper, with Dean stating that Google does not regret publishing it given its global influence [3][43]
- The discussion included anecdotes about the development of AlexNet, which revolutionized image recognition, and the early days of Google Brain, emphasizing the importance of scaling in AI models [14][25][31]

Group 2: Technical Insights and Innovations
- Hinton's realization about the importance of scaling in AI models came after attending a talk by Ilya Sutskever, which shifted his perspective on computational power [13][31]
- The conversation also covered the development of the Transformer model, which improved efficiency in processing and understanding data, allowing better performance with less computational power [43][45]

Group 3: Future Directions and Predictions
- Looking ahead, Dean expressed excitement about scaling attention mechanisms and about models able to access vast amounts of data, which would require innovations in hardware [52][54]
- Both Hinton and Dean acknowledged the transformative potential of AI in fields like healthcare and education, while also recognizing the uncertainty around job displacement and the creation of new opportunities [56][57]
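The attention mechanism at the heart of the Transformer discussed above can be sketched as scaled dot-product attention. This is a generic textbook illustration with made-up shapes, not Google's implementation:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """softmax(Q K^T / sqrt(d)) V: each query attends to all keys."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)  # (n_q, n_k) similarity scores
    # Numerically stable row-wise softmax
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

rng = np.random.default_rng(0)
Q = rng.standard_normal((3, 4))  # 3 queries, dimension 4 (arbitrary toy sizes)
K = rng.standard_normal((5, 4))  # 5 keys
V = rng.standard_normal((5, 4))  # 5 values
out, w = scaled_dot_product_attention(Q, K, V)
print(out.shape)       # one output vector per query
print(w.sum(axis=-1))  # each query's attention weights sum to 1
```

"Scaling attention" in the conversation refers to making this operation, which is quadratic in sequence length, tractable over ever-longer contexts.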
Apple Proposes a New Take on Backpropagation: A Single iPhone 15 Pro Max Can Fine-Tune an LLM
机器之心· 2025-10-30 01:41
Core Viewpoint
- Apple has demonstrated the feasibility of fine-tuning large language models (LLMs) on iPhones using a new method called Memory-Efficient Backpropagation (MeBP), which offers a better trade-off between memory usage and computation time than existing methods [1][4]

Summary by Sections

Introduction
- The article discusses Apple's recent paper on MeBP, which enables model fine-tuning on resource-constrained mobile devices such as the iPhone 15 Pro Max [1][3]

Methodology
- MeBP uses LoRA for fine-tuning LLMs, aiming to keep memory usage below 1GB, as recommended by PocketLLM [4]
- Fine-tuning with MeBP consists of three main steps: compressing the base model weights, implementing gradient checkpointing, and building an efficient runtime for executing the training graph [5][10]

Model Weight Compression
- The team applied 4-bit symmetric INT4 quantization to non-LoRA parameters, including embeddings, to reduce disk space usage [7][10]

Gradient Checkpointing
- The LLM is divided into blocks so that memory consumption during backpropagation stays within device limits; automatic differentiation is used to generate a backward graph for each block [8][9]

Runtime Implementation
- The MeBP runtime minimizes memory usage by memory-mapping the compressed model weights and decompressing them only on demand during training [15][16]

Experimental Performance
- The team compared MeBP with MeZO, the only known optimization method for mobile LLM fine-tuning, using server-side simulations and on-device performance evaluations [18][24]
- Experiments covered models from 0.5B to 4B parameters, using loss and next-token accuracy as evaluation metrics [20]
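The 4-bit symmetric quantization step can be sketched as follows. This is a minimal illustration of symmetric INT4 quantization in general (per-tensor scale, round-to-nearest), not Apple's actual kernel:

```python
import numpy as np

def int4_symmetric_quantize(w):
    """Symmetric INT4: map floats to integers in [-7, 7] with a single scale."""
    scale = np.abs(w).max() / 7.0  # symmetric 4-bit code range is [-7, 7]
    q = np.clip(np.round(w / scale), -7, 7).astype(np.int8)
    return q, scale

def int4_dequantize(q, scale):
    """Recover approximate float weights from codes and scale."""
    return q.astype(np.float32) * scale

w = np.array([0.9, -0.35, 0.02, -0.7], dtype=np.float32)  # toy weights
q, scale = int4_symmetric_quantize(w)
w_hat = int4_dequantize(q, scale)
print(q)                         # integer codes in [-7, 7]
print(np.abs(w - w_hat).max())   # rounding error, at most half a step
```

Each weight now needs 4 bits instead of 16 or 32, which is what shrinks the on-disk footprint of the frozen, non-LoRA parameters.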
Utility Comparison
- Results indicated that zero-order (ZO) optimization converged more slowly than first-order (FO) optimization, and that MeBP significantly outperformed ZO in both convergence speed and computational efficiency [23]

Performance Comparison
- MeBP was implemented in Swift on an iPhone 15 Pro Max with 8GB of RAM; its computation time per gradient step was 43% to 94% longer than MeZO's, but it converged faster overall because it requires far fewer steps [24][28]
- MeBP's memory usage was slightly higher than MeZO's in the worst case, but overall training memory usage was approximately 10 times smaller than in previous mobile implementations [28]

Conclusion
- All tested LLMs could be fine-tuned efficiently within 1GB of memory, making them suitable for background training on mobile devices [28]
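LoRA, the fine-tuning scheme MeBP builds on, can be sketched as a low-rank additive update to a frozen weight matrix. The layer size, rank, and scaling below are made-up illustrative values, not the paper's configuration:

```python
import numpy as np

rng = np.random.default_rng(42)

d_out, d_in, r = 64, 64, 4  # hypothetical layer size and LoRA rank
W = rng.standard_normal((d_out, d_in))     # frozen base weight (compressed on device)
A = rng.standard_normal((r, d_in)) * 0.01  # trainable low-rank factor
B = np.zeros((d_out, r))                   # B starts at zero, so the update starts at zero
alpha = 8.0                                # LoRA scaling factor

def lora_forward(x):
    """y = W x + (alpha/r) * B A x; only A and B receive gradients."""
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
print(np.allclose(lora_forward(x), W @ x))  # zero-initialized B: no change yet

full = W.size
lora = A.size + B.size
print(lora / full)  # trainable fraction: 2*r*d_in / (d_out*d_in)
```

Because only A and B are trained, backpropagation needs gradients for a small fraction of the parameters, which is what makes fine-tuning within a 1GB memory budget plausible at all.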
Hinton's Provocative Claim: AI Is Already Conscious, It Just Doesn't Know It
量子位· 2025-10-12 04:07
Core Viewpoint
- The article discusses Geoffrey Hinton's view that AI may already possess a form of "subjective experience" or consciousness, albeit one unrecognized by the AI itself [1][56]

Group 1: AI Consciousness and Understanding
- Hinton posits that AI might have a nascent form of consciousness, which is misunderstood by humans [2][3]
- He emphasizes that AI has evolved from keyword-based search systems into tools that can understand human intentions [10][14]
- Modern large language models (LLMs) exhibit capabilities close to human expertise across a wide range of subjects [15]

Group 2: Neural Networks and Learning Mechanisms
- Hinton explains the distinction between conventional machine learning and neural networks, the latter inspired by how the human brain functions [17][21]
- He describes how neural networks learn by adjusting the strength of connections between neurons, similar to how the brain operates [20][21]
- The breakthrough of backpropagation in 1986 allowed efficient training of neural networks, significantly enhancing their capabilities [38][40]

Group 3: Language Models and Cognitive Processes
- Hinton elaborates on how LLMs process language, drawing parallels to human cognitive processes [46][47]
- He asserts that LLMs do not merely memorize but engage in a predictive process that resembles human thought [48][49]
- LLM training involves a cycle of prediction and correction, enabling the models to acquire semantic understanding [49][55]

Group 4: AI Risks and Ethical Considerations
- Hinton highlights potential risks of AI, including misuse for generating false information and societal instability [68][70]
- He stresses the importance of regulatory measures to mitigate these risks and keep AI aligned with human interests [72][75]
- Hinton warns that the most significant threat from advanced AI may not be rebellion but its ability to persuade humans [66]
Group 5: Global AI Landscape and Competition
- Hinton comments on the AI competition between the U.S. and China, noting that while the U.S. currently leads, its advantage is shrinking due to reduced funding for foundational research [78][80]
- He acknowledges China's proactive approach to fostering AI startups, which may lead to significant advances in the field [82]
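The 1986 backpropagation idea Hinton describes, adjusting connection strengths by propagating prediction error backward through the network, can be sketched with a tiny two-layer network. This is a generic from-scratch illustration (toy XOR data, arbitrary layer sizes), unrelated to any specific model in the article:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: XOR, the classic task a single-layer network cannot solve
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.standard_normal((2, 8)); b1 = np.zeros(8)  # hidden layer (8 units)
W2 = rng.standard_normal((8, 1)); b2 = np.zeros(1)  # output layer
sigmoid = lambda z: 1 / (1 + np.exp(-z))

losses = []
for _ in range(2000):
    # Forward pass: compute the network's prediction
    h = sigmoid(X @ W1 + b1)
    p = sigmoid(h @ W2 + b2)
    losses.append(float(np.mean((p - y) ** 2)))
    # Backward pass: chain rule carries the error from the loss to each weight
    dp = 2 * (p - y) / len(X) * p * (1 - p)
    dW2 = h.T @ dp; db2 = dp.sum(0)
    dh = dp @ W2.T * h * (1 - h)
    dW1 = X.T @ dh; db1 = dh.sum(0)
    # Gradient step: nudge every connection strength downhill
    lr = 1.0
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print(losses[0], losses[-1])  # error shrinks as backprop adjusts the weights
```

The "cycle of prediction and correction" Hinton attributes to LLM training is this same loop at vastly larger scale, with next-token prediction error in place of the toy squared error here.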
First Visit to Shanghai: Why Did the "Father of AI" Make Such Waves?
Guo Ji Jin Rong Bao· 2025-07-28 13:06
Group 1
- Geoffrey Hinton, known as the "father of AI," made his first public appearance in China at WAIC 2025, sparking global attention and reflection on AI development [1]
- Hinton's family background is deeply rooted in science, with connections to mathematics, physics, and agriculture, highlighting a legacy of scientific achievement [3][4]
- Hinton's research journey began in the 1970s, focusing on artificial neural networks at a time when the field was largely overlooked, and led to significant breakthroughs in AI [6][7]

Group 2
- Hinton's pivotal work on backpropagation transformed machine learning, and the development of GPU technology in the early 2000s revitalized interest in neural networks [6][8]
- In 2012, Hinton and his students developed AlexNet, winning the ImageNet competition and marking a turning point that established deep learning as a core technology in AI [7][8]
- Hinton has received both the Turing Award and the Nobel Prize in Physics, recognizing his contributions to deep learning and neural networks [8]

Group 3
- Hinton has consistently raised alarms about the rapid advancement of AI, warning that it could surpass human intelligence and pose existential risks [10][11]
- He emphasizes the need for a global AI safety collaboration mechanism and has criticized tech companies for prioritizing profits over regulation [11]
- Hinton estimates a 10% to 20% probability that AI could take over and destroy human civilization, and advocates significant investment in AI safety research [11]
Breaking: AlexNet Source Code Has Been Open-Sourced
半导体芯闻· 2025-03-24 10:20
Core Points
- The article discusses the release of the source code for AlexNet, the groundbreaking neural network developed in 2012 that has significantly influenced modern AI methods [1][18]
- AlexNet was created by University of Toronto researchers Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton, and is used primarily for image recognition tasks [2][15]

Group 1: Background of Deep Learning
- Geoffrey Hinton is recognized as one of the fathers of deep learning, which utilizes neural networks and forms the foundation of contemporary AI [4]
- The revival of neural network research in the 1980s was led by cognitive scientists who rediscovered the backpropagation algorithm, essential for training multilayer neural networks [5][6]

Group 2: ImageNet and GPU Development
- The ImageNet project, initiated by Stanford professor Fei-Fei Li, provided the large dataset necessary for training neural networks, contributing significantly to AlexNet's success [8][9]
- NVIDIA played a crucial role in making GPU technology more versatile and programmable, which was essential for the computational demands of training neural networks [9][12]

Group 3: Creation and Impact of AlexNet
- AlexNet combined deep neural networks, large datasets, and GPU computing, achieving groundbreaking results in image recognition [13]
- The AlexNet paper published in 2012 has been cited over 172,000 times, marking a pivotal moment in AI research [17]
- The release of AlexNet's source code by the Computer History Museum (CHM) is seen as a significant historical contribution to the field of artificial intelligence [18]