Transformer
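The Alarm Has Sounded! Hinton's Latest 10,000-Character Lecture: Taking Aim at Chomsky, Defining "Immortal Computation," and Revealing Humanity's Only Way Out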
AI科技大本营· 2026-02-09 04:03
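Compiled by | Wang Qilong   Produced by | AI 科技大本营 (ID: rgznai100)

In the winter of 2026, the cold wind in Kingston, Ontario seemed a little more biting than in previous years. Inside the auditorium at Queen's University, however, the atmosphere was a curious mix of solemnity and restlessness. This is usually a place for discussing neutrinos, dark matter, or the origin of the universe: the physicists of the McDonald Institute are used to observing the universe's tiniest particles here in an attempt to solve its grandest puzzles. That night, though, the podium belonged to a computer scientist.

When the 78-year-old Geoffrey Hinton walked up to the podium, his back was slightly stooped, but his eyes were as sharp as ever. For the tech world, Hinton's name is both a monument and a fault line. He is a founder of the backpropagation algorithm, an evangelist of deep learning, and the "Godfather of AI" enshrined on the altar. It was also he who, in 2012, used AlexNet to break open the door to neural networks and personally ignited the AI revolution now sweeping the globe. Yet after leaving Google in 2023, he turned around and became the revolution's calmest and most pessimistic "whistleblower."

The lecture was not a routine technical sermon. A deeply ironic episode took place before it began: the organizers did not write Hinton's introduction themselves, but handed the task to an AI. That AI took only a few seconds to generate ...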
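The Painful Truth! LLM Algorithm Engineers at 200K vs. 500K vs. 1M: More Than Salary Sets Them Apart… Confirmed by an Interviewer with 6 Years at a Big Tech Company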
Sou Hu Cai Jing· 2026-02-02 15:48
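Who else gets this, folks! The AI algorithm scene today is long past the era when "being able to run BERT and write a Transformer" was enough for a high salary. Three or four years ago, writing "familiar with Transformer" or "participated in NLP projects" on a résumé could bluff your way into a fat offer. By 2025 and 2026, large models have crushed the industry's entry barrier while pushing its ceiling into the stratosphere. All are AI algorithm engineers, yet some earn 200,000 a year and can only write glue code, some earn 500,000 and become the backbone of their team, and others earn 1,000,000+ and become industry heavyweights. Where exactly does the gap lie?

Today, no beating around the bush and no ads, just hands-on personal experience! I have worked as an algorithm engineer at a big tech company for 6 years: CV and NLP before 2022, a full switch to large models in 2023, and responsibility for an average of 3 large-model projects per year. I have been both interviewer and candidate, and I have seen plenty of the gaps between 200,000, 500,000, and 1,000,000 engineers; today I will lay it all out at once!

Whether you are a beginner just getting started, a newcomer looking to switch careers, or a practitioner stuck at 200,000 and aiming for 500,000, after reading this you will know where to put your effort, with no more blind anxiety or wasted energy!

Online courses and articles on large models are now so abundant that they often list knowledge checklists dozens of items long, scaring beginners into paralysis. In fact, the core of learning large models is not how complete your learning is but how precise it is: mastering the minimum necessary knowledge that is unavoidable, always asked in interviews, and required for real work is 10 times more useful than blindly grinding through papers and memorizing formulas! ...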
The Annoying Memory Wall
半导体行业观察· 2026-02-02 01:33
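The availability of unprecedented amounts of unsupervised training data, together with neural scaling laws, has led to an unprecedented surge in the model size and compute requirements for serving and training large language models (LLMs). However, the main performance bottleneck is increasingly shifting to memory bandwidth.

Over the past 20 years, the peak floating-point throughput (FLOPS) of server hardware has grown at 3x every two years, outpacing DRAM and interconnect bandwidth, which have grown at only 1.6x and 1.4x every two years, respectively. This disparity has made memory, rather than compute, the primary bottleneck in AI applications, particularly in serving.

This article analyzes encoder and decoder Transformer models and shows how memory bandwidth can become the dominant bottleneck for decoder models. We argue for redesigning model architecture, training, and deployment strategies to overcome this memory limitation.

Introduction

In recent years, the amount of compute needed to train large language models (LLMs) has grown at a rate of 750x every two years. This exponential trend has been the main driver behind AI accelerators, which focus on raising the hardware's peak compute, often at the expense of simplifying other parts such as the memory hierarchy.

However, these trends overlook an emerging challenge in training and serving AI models: memory and communication bottlenecks. In fact, many AI applications are bottlenecked not by compute but by intra-chip/chip ...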
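To make the size of that gap concrete, the quoted growth rates can be compounded over the 20-year window. The following is a minimal sketch: the function and variable names are illustrative, and it assumes each quoted rate held constant over the whole period, which is a simplification of the article's trend figures.

```python
def cumulative_growth(factor_per_two_years: float, years: float) -> float:
    """Total growth after `years`, given a constant growth factor per two years."""
    return factor_per_two_years ** (years / 2.0)


YEARS = 20  # window quoted in the article

# Growth factors per two years, as quoted in the article.
flops = cumulative_growth(3.0, YEARS)         # peak compute
dram = cumulative_growth(1.6, YEARS)          # DRAM bandwidth
interconnect = cumulative_growth(1.4, YEARS)  # interconnect bandwidth

# How far compute has pulled ahead of each kind of bandwidth.
gap_dram = flops / dram
gap_interconnect = flops / interconnect

print(f"peak FLOPS growth:     {flops:,.0f}x")
print(f"DRAM bandwidth growth: {dram:,.1f}x")
print(f"interconnect growth:   {interconnect:,.1f}x")
print(f"compute vs DRAM gap:          {gap_dram:,.0f}x")
print(f"compute vs interconnect gap:  {gap_interconnect:,.0f}x")
```

Under these assumptions, compute grows by tens of thousands of times while DRAM bandwidth grows by roughly a hundred, so each byte fetched from memory has to feed several hundred times more arithmetic than it did two decades earlier.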
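SpaceX Files to Deploy One Million Satellites, Aiming to Build an Orbital AI Data Center Network; Speeds Above 100G! China's Operational Satellite-to-Ground Laser Communication Capability Reaches a New Level ("Investment Morning Reference")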
Mei Ri Jing Ji Xin Wen· 2026-02-02 01:13
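Mei Jing reporter | Yang Jian   Mei Jing editor | Peng Shuiping

(1) Important Market News

3. According to CCTV News, global AI computing power construction has entered an explosive phase. High-power, highly stable power supply has become the "lifeline" of computing clusters, and power transformers are being upgraded into core computing infrastructure. Field research found that many transformer factories in Guangdong and elsewhere are already running at full capacity, with some data-center-oriented orders booked out to 2027. The explosive growth of AI computing centers worldwide has made transformers a scarce resource; delivery lead times in the US market have lengthened from 50 weeks to 127 weeks. At another electrical equipment company in Foshan, the factory's core product, dry-type transformers, is mainly used in data centers and similar settings, and export orders are also growing rapidly.

Commentary: Transformer factories are running at full capacity in the Yangtze River Delta as well. At one transformer factory in Jiangsu, orders are booked through the end of 2027, and China's first fully insulated, ultra-high-voltage, large-capacity transformer was recently shipped to the North American market. Data show that China has about 3,000 companies in the transformer industry. In 2025, China's transformer exports totaled 64.6 billion yuan, up nearly 36% from 2024. China has become the world's largest transformer producer, has built the world's most complete transformer production system with an independently controllable full industrial chain, and accounts for about 60% of global capacity. Concept stocks include 金华利电, 金冠电气, 永福股份, and others.

(3) Risk Alerts

(2) Industry Opportunities

1. On January 31, SpaceX has already, with the US Federal Communications Commission (FCC) ...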
Challenging the Transformer: Former OpenAI VP of Research Announces Startup, Seeks to Raise $1 Billion
机器之心· 2026-01-31 04:10
Core Insights
- The article discusses the shift in focus from Transformer models to alternative approaches in AI research, as highlighted by Llion Jones, co-founder and CTO of Sakana AI, who is reducing his research time on Transformers and seeking new goals [1][3]
- Jerry Tworek, former VP of research at OpenAI, has founded Core Automation, which aims to explore a different path in AI model development, specifically focusing on "Continual Learning" capabilities [6][10]
- Core Automation is seeking $500 million to $1 billion in funding and plans to develop models that require significantly less data and computational resources compared to current leading models [11][16]

Company Developments
- Core Automation is in its early stages, with its funding and product direction still subject to change, but it represents a growing group of researchers advocating for a fundamental transformation in AI [8][9]
- Tworek's vision includes a single algorithm named "Ceres," which contrasts with the typical multi-stage training process used by major companies [16]
- The company aims to automate the production of its own products, with initial goals in industrial automation and long-term ambitions that include creating self-replicating factories and bio-machines [16]

Industry Trends
- The article notes a trend among researchers who believe that current mainstream model development techniques are inadequate for achieving significant breakthroughs in fields like biology and medicine [9]
- There is a growing enthusiasm in the capital markets for new experimental labs, as evidenced by recent funding rounds for startups like Humans& and Thinking Machines Lab, despite many lacking revenue or products [15]
- The exploration of "Continual Learning" is not exclusive to Core Automation, as other labs like Safe Superintelligence are pursuing similar goals [13][14]
First Principles of Large Models: (Part 2) Signal Processing
机器之心· 2026-01-30 08:49
Core Viewpoint
- The article discusses the transformation of natural language processing problems into signal processing problems through semantic vectorization, emphasizing the importance of token embedding in large models and its connection to signal processing and information theory [2][32].

Semantic Embedding / Vectorization
- The concept of using vectors to model semantics dates back to Luhn's 1953 paper, but significant breakthroughs were achieved in 2013 by Mikolov and others, who successfully trained neural network models to convert tokens into semantic vectors [6][9].
- The ideal semantic vectorization has not been fully realized, but the inner product of semantic vectors can represent semantic relevance at the token level [7][11].
- The semantic vector space can be modeled as a probability-inner product space, balancing complexity and effectiveness by using a unit sphere to define the space [8][10].

Optimal Semantic Vectorization
- The optimal semantic encoding is closely related to downstream tasks, with the goal of predicting the next token. The semantic encoder should maximize the conditional mutual information between the next token and the current sequence [13][14].
- The article highlights that existing methods like Contrastive Predictive Coding (CPC) optimize the upper bound of the semantic encoder but may not achieve the optimal solution [15][19].

Transformer as a Nonlinear Time-Varying Vector Autoregressive Time Series
- The Transformer model is identified as an autoregressive large language model that predicts the next token based on the input token sequence and previously generated tokens [21][30].
- The attention mechanism in Transformers can be mathematically expressed as a nonlinear time-varying vector autoregressive time series, which is crucial for predicting the next token [22][24].

Signal Processing and Information Theory
- The article establishes a relationship between signal processing and information theory, noting that signal processing implements information theory principles in specific computational architectures [32][33].
- The transition from BIT in the information age to TOKEN in the AI era is proposed as a way to apply Shannon's information theory to the mathematical principles behind large models [36].
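The autoregressive reading of attention summarized above can be illustrated with a small sketch. It is not the article's own formulation: it assumes a single head with identity query, key, and value projections (a deliberate simplification), and all function names are hypothetical. The point it shows is that each output vector is a weighted sum of the current and earlier input vectors, with weights that depend on the inputs themselves (nonlinear) and change with position (time-varying).

```python
import math
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def causal_attention_weights(xs):
    """Row t holds weights a[t][s] for s <= t.

    Assumption: query/key/value projections are the identity, so the
    scores are scaled inner products of the raw token vectors.
    """
    dim = len(xs[0])
    rows = []
    for t, x_t in enumerate(xs):
        scores = [dot(x_t, xs[s]) / math.sqrt(dim) for s in range(t + 1)]
        rows.append(softmax(scores))
    return rows


def autoregressive_output(xs, weights):
    """y_t = sum over s <= t of a[t][s] * x_s: a vector autoregression
    whose coefficients a[t][s] are recomputed from the data at every step."""
    dim = len(xs[0])
    return [
        [sum(row[s] * xs[s][k] for s in range(t + 1)) for k in range(dim)]
        for t, row in enumerate(weights)
    ]


random.seed(0)
xs = [[random.gauss(0.0, 1.0) for _ in range(4)] for _ in range(5)]  # 5 tokens, dim 4
weights = causal_attention_weights(xs)
ys = autoregressive_output(xs, weights)

for t, row in enumerate(weights):
    print(t, [round(w, 3) for w in row])
```

Each printed row is longer than the previous one and sums to 1, which is the causal, position-dependent coefficient pattern that distinguishes this from a classical autoregressive model with fixed coefficients.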
Tsinghua Yao Class Alumnus Liu Zhuang's Team Strikes Again: Normalization-Free Transformers Evolve in Performance
机器之心· 2026-01-22 11:00
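Editors | Chen Chen, Leng Mao

The normalization-free Transformer from the team led by Liu Zhuang has a new version.

LayerNorm has long been close to standard equipment in the Transformer architecture, but it has obvious problems, such as high compute and memory-access costs, especially during large-model inference.

"Normalization-free" Transformers have therefore been a long-standing goal for researchers, but progress has been stuck on two difficulties: unstable training, and performance clearly below that of models with normalization.

This new paper proposes a very simple new activation layer, Derf (Dynamic erf), that lets normalization-free Transformers not only train stably but also outperform standard Transformers with LayerNorm in multiple settings.

Liu Zhuang also shared the result on his X account. He said this is a new paper on stronger normalization-free Transformers: the research team proposes Derf (Dynamic erf), a structurally very simple point-wise layer. With Derf, Transformers that rely on no normalization layers at all can not only train stably, but in actual performance can already surpass traditional ones that rely on LayerNorm and other ...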
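What "point-wise" means here can be sketched in a few lines. The excerpt above only says that Derf is a simple point-wise layer built on erf, so the parameterization below, `gamma * erf(alpha * x + shift) + beta`, is an assumption made for illustration, as are all function and parameter names and the default values. The contrast with LayerNorm is the part that matters: LayerNorm needs the mean and variance of the whole vector, while a point-wise layer transforms each element independently.

```python
import math


def derf(x, alpha=0.5, shift=0.0, gamma=None, beta=None):
    """Point-wise erf layer (assumed form): gamma * erf(alpha * x + shift) + beta.

    alpha and shift are scalars; gamma and beta are per-channel and default
    to 1 and 0. In a real model all four would be learned parameters.
    """
    gamma = gamma if gamma is not None else [1.0] * len(x)
    beta = beta if beta is not None else [0.0] * len(x)
    return [g * math.erf(alpha * v + shift) + b for v, g, b in zip(x, gamma, beta)]


def layer_norm(x, eps=1e-5):
    """Plain LayerNorm without affine terms: needs statistics of the whole vector."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


activations = [-3.0, -0.5, 0.0, 2.0, 10.0]

derf_out = derf(activations)
norm_out = layer_norm(activations)

print("derf:      ", [round(v, 3) for v in derf_out])
print("layer_norm:", [round(v, 3) for v in norm_out])

# Changing one element leaves the other Derf outputs untouched,
# but shifts every LayerNorm output through the mean and variance.
perturbed = activations[:-1] + [100.0]
print("derf unchanged elsewhere:", derf(perturbed)[:-1] == derf_out[:-1])
print("layer_norm unchanged elsewhere:", layer_norm(perturbed)[:-1] == norm_out[:-1])
```

Because no reduction over the vector is required, a layer of this shape avoids the extra pass over activations that the article cites as a cost of LayerNorm at inference time.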
Beyond the "Fourth Industrial Revolution": Rethinking Artificial Intelligence and Human Subjectivity
3 6 Ke· 2026-01-20 12:11
Core Insights
- The current discourse around artificial intelligence (AI) is often framed as the "Fourth Industrial Revolution," likening it to previous industrial transformations, but this perspective is limited in understanding the deeper cognitive and existential implications of AI [1]
- The emergence of generative AI signifies not just an upgrade in tools but a profound crisis and reconstruction of subjectivity, akin to a digital renaissance [2]

Historical Context
- To comprehend the mixed emotions of excitement and fear regarding AI, it is essential to revisit the Middle Ages, where human reason was seen as auxiliary to divine order, limiting human agency [3]
- The Renaissance marked a significant shift in value systems, emphasizing human dignity and the freedom of self-definition, as articulated by thinkers like Pico della Mirandola [4][5]

Technological Parallels
- The Renaissance was not solely a philosophical movement but was also driven by technological advancements, such as linear perspective in art, which parallels today's AI technologies [8]
- The introduction of linear perspective transformed visual representation, allowing for a measurable and calculable understanding of the world, similar to how Transformer models process language in high-dimensional semantic spaces [10][12]

Knowledge Distribution
- The invention of the printing press by Gutenberg drastically reduced the marginal cost of information distribution, leading to a democratization of knowledge, which generative AI is now extending by lowering the barriers to creative skills [15][17]
- Generative AI is enabling a form of "skill democratization," allowing individuals without formal training to access advanced capabilities, thereby disrupting existing social structures more profoundly than the Industrial Revolution [17]

Ethical Considerations
- There is a risk of a resurgence of "digital theocracy," where algorithmic systems increasingly dictate human choices, leading to a potential loss of agency [18][19]
- The commodification of individuals as mere data sources in AI systems threatens the ethical principle of viewing humans as ends in themselves, raising concerns about the erosion of human dignity [21][22]

Future Outlook
- The path forward requires a redefinition of human irreplaceability in the face of advancing AI, emphasizing the importance of human values and ethical considerations in technology [22][25]
- The future will likely favor individuals who possess deep humanistic knowledge and the ability to define problems and assign meaning, rather than merely those who can accumulate knowledge [24][25]
Sieyuan Electric: Preannounces 54% YoY Net Profit Growth for FY2025; "Buy" Rating Maintained
2026-01-19 02:32
Sieyuan Electric (002028.SZ) Conference Call Summary

Company Overview
- **Company**: Sieyuan Electric
- **Ticker**: 002028.SZ
- **Industry**: Grid Equipment

Key Financial Highlights
- **FY25 Revenue**: Rmb 21,205 million, representing a **37% year-over-year increase** and a **2% increase** from previous estimates [4]
- **FY25 Net Income**: Rmb 3,163 million, reflecting a **54% year-over-year increase** and a **1% increase** from previous estimates [4]
- **4Q25 Implied Revenue**: Rmb 7,378 million, up **46% year-over-year** [4]
- **4Q25 Implied Net Income**: Rmb 971 million, up **74% year-over-year** [4]
- **Net Profit Margin (NPM)** for 4Q25: 13.2%, which is **2.7 percentage points lower** than the first three quarters of FY25 [4]

Growth Projections
- **Revenue CAGR (2025-2030)**: Expected to be **23%** [5]
- **Net Profit CAGR (2025-2030)**: Expected to be **28%** [5]
- **Overseas Revenue CAGR (2025-2030)**: Expected to be **36%**, increasing its contribution from **33% to 56%** of total revenue [5][6]

Market Position and Strategy
- Sieyuan is positioned among the **top 1-3** in various product categories within the Chinese grid equipment market [6]
- The company is expected to benefit from a **global grid upgrade cycle** driven by aging infrastructure, economic development, and renewable energy [6]
- Market share in switchgear is projected to grow from **6% in 2025** to **8% in 2030**, and in power transformers from **1% to 6%** [6]

Valuation and Price Target
- **12-month Price Target**: Rmb 195.6, based on a **2028E P/E of 25x**, discounted to 2026E at a **cost of equity (CoE) of 9.5%** [6][7]
- Current Price: Rmb 185.9, indicating an **upside potential of 5.2%** [9]

Risks
- Key risks include:
  1. **Overseas execution risk** [8]
  2. Potential for margins to fall below expectations [8]
  3. A slowdown in data center construction pace [8]

Additional Insights
- The company has a **multi-product portfolio** that enhances its competitive advantages and execution capabilities overseas [6]
- Sieyuan's unique positioning is attributed to its ability to combine high quality with a long-term commitment to rigorous certification processes and sustained investments [6]

Conclusion
- Sieyuan Electric is well-positioned for growth in the grid equipment sector, with strong financial projections and a clear strategy to enhance its market share both domestically and internationally. The investment thesis remains positive, supported by robust growth forecasts and a solid valuation framework.
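The margin and valuation figures in this summary can be cross-checked with simple arithmetic. The sketch below uses only numbers quoted in the summary; the first-nine-months figures are derived here as full-year minus 4Q25, and the implied 2028E EPS assumes exactly two years of discounting from 2028E back to 2026E, which the summary does not state explicitly. Variable names are illustrative.

```python
# Figures quoted in the summary (Rmb million unless noted).
FY25_REVENUE = 21_205
FY25_NET_INCOME = 3_163
Q4_REVENUE = 7_378
Q4_NET_INCOME = 971

PRICE_TARGET = 195.6    # Rmb per share
CURRENT_PRICE = 185.9   # Rmb per share
TARGET_PE_2028 = 25.0   # 2028E P/E multiple
COST_OF_EQUITY = 0.095  # discount rate

# 4Q25 net profit margin.
q4_margin = Q4_NET_INCOME / Q4_REVENUE

# First three quarters, derived as full year minus 4Q25.
nine_month_revenue = FY25_REVENUE - Q4_REVENUE
nine_month_net_income = FY25_NET_INCOME - Q4_NET_INCOME
nine_month_margin = nine_month_net_income / nine_month_revenue

# Gap in percentage points between the two margins.
margin_gap_pp = (nine_month_margin - q4_margin) * 100

# Upside implied by the price target.
upside = PRICE_TARGET / CURRENT_PRICE - 1

# Assumption: target = P/E * EPS_2028 / (1 + CoE)^2, i.e. two years of discounting.
DISCOUNT_YEARS = 2
implied_eps_2028 = PRICE_TARGET * (1 + COST_OF_EQUITY) ** DISCOUNT_YEARS / TARGET_PE_2028

print(f"4Q25 net profit margin: {q4_margin:.1%}")
print(f"9M25 net profit margin: {nine_month_margin:.1%}")
print(f"margin gap:             {margin_gap_pp:.1f} pp")
print(f"upside to target:       {upside:.1%}")
print(f"implied 2028E EPS:      Rmb {implied_eps_2028:.2f}")
```

The derived margin gap and upside agree with the 2.7 percentage points and 5.2% quoted above; the implied EPS is only as reliable as the two-year discounting assumption.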
Global Power Grid Equipment: Global Tap Changer and Transformer Demand Remains Strong
2026-01-14 05:05
Summary of Global Power Grid Equipment Conference Call

Industry Overview
- **Industry**: Power Grid Equipment
- **Key Company**: Maschinenfabrik Reinhausen (MR), a leading manufacturer of high voltage tap changers

Transformer Market Outlook

United States
- **Demand Growth**: Expected to grow at an 8-10% CAGR from 2026-2030, driven by data centers, renewable energy projects, nuclear power plants, and public grid replacements [3][4]
- **Public Grid Replacement**: Two-thirds of demand is attributed to public grid replacement, while one-third comes from new projects like renewables and data centers [3]
- **Aging Infrastructure**: The US has one of the oldest grid infrastructures, with transformer service life reaching 30 to 60 years, leading to strong replacement demand [4]

Europe
- **Demand Growth**: Anticipated 4-6% CAGR from 2026-2030, influenced by electrification and decarbonization [6]
- **Regional Variation**: Demand varies by country; France will see less demand due to reliance on nuclear power, while Germany, Poland, Italy, and Spain will experience higher demand due to transitions from fossil fuels [6]

Middle East
- **Demand Growth**: Expected 4-5% CAGR from 2026-2030, with a recent sharp increase driven by Saudi Arabia [7]
- **Solar Projects**: Some projects, like NEOM, have been shelved due to financial reasons, indicating a potential slowdown [7]

South Korea
- **Demand**: Over 65% of tap changer demand is export-related, with less than 30% for domestic use [11]

Supply & Pricing
- **US Supply**: Two-thirds of transformers are imported, with tariffs and high domestic costs keeping prices high [5]
- **Pricing Trends**: MR has increased prices annually for the past 3-4 years, but not as dramatically as power transformer prices, which have nearly doubled in some areas [18]

Tap Changer Capacity Expansion
- **Global Capacity**: Significant production increases planned, particularly in Europe, where capacity is expected to rise from 15,000 units in 2024 to 25,000 units by 2028 [13]
- **US Capacity**: Current capacity remains at 2,500 units, with potential expansion postponed due to flattening demand [14]
- **China Capacity**: Existing capacity is 4,000 units, with potential to increase to 8,000 units within 18-24 months if needed [15]

Delivery Times
- **US**: 15-20 weeks for delivery of tap changers [16]
- **China**: Less than 10 weeks, preferred for logistical benefits [16]
- **Europe**: 15-20 weeks for smaller tap changers, with higher-end models potentially taking up to six months [17]

Market Share Strategy
- **Market Share Defense**: MR aims to defend its market share rather than aggressively pursue growth, focusing on output growth in line with main markets [19]

Valuation Comparison
- **Global Companies**: Valuation metrics for various companies in the power grid equipment sector are provided, indicating a range of price targets and potential upside [20]

Conclusion
The power grid equipment industry is poised for growth, particularly in the US and Europe, driven by infrastructure needs and renewable energy projects. However, challenges such as high import costs and regional demand variations must be navigated. Companies like MR are focusing on capacity expansion and market share defense strategies to capitalize on these trends.