Overfitting
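ByteDance's Zhang Yiming Makes His First Appearance After Four Years Out of the Spotlight: No Talk of Douyin or Doubao, So What Did He Say?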
Sou Hu Cai Jing· 2025-10-12 03:40
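Text: Wang Miao | Art editor: Gu Qingqing | Produced by: 网界
This public appearance after four years of silence looks low-key, yet it quietly signals that the internet industry is shifting from "grabbing traffic" to "cultivating talent."
01 After four low-profile years, why did Zhang Yiming choose to back this institution?
Many people wonder why Zhang Yiming, who has rarely been seen over the past four years, would personally show up for the opening of the Shanghai Xuhui Zhichun Innovation Center. The answer lies in the institution's origins and in Zhang's long-standing interests.
The Zhichun Innovation Center did not appear out of nowhere. As early as 2016, Zhang noticed that many of ByteDance's standout algorithm engineers had graduated from the ACM Class at Shanghai Jiao Tong University. At the time he made a special trip to the university to visit Professor Yu Yong, founder of the ACM Class, and John Hopcroft, the class's advisor and a Turing Award laureate.
On October 9, at the opening ceremony of the Shanghai Xuhui Zhichun Innovation Center, ByteDance founder Zhang Yiming made his first public appearance in recent years. It was the first time in more than four years, since he stepped down as ByteDance CEO in May 2021, that he had appeared at a public event in China.
Over the past four years Zhang had almost vanished from public view, rarely appearing in public and never commenting publicly on ByteDance's business or development. This time he did not take the stage as a corporate manager. Instead, as an initiator of the innovation center, he joined Professor Yu Yong, founder of the ACM Class at Shanghai Jiao Tong University, to unveil the plaque of a privately run non-profit institution.
There was no high-profile ceremony on site; instead, the venue displayed student-built projects such as a Dunhuang-style AI game, rockets ...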
Zhang Yiming Appears in Public for the First Time in Years, Backing and Speaking at a Shanghai Innovation Center
Sou Hu Cai Jing· 2025-10-11 17:19
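He stressed that the innovation center is not looking for rote learners but for people with lively minds who dare to put ideas into practice and are willing to think independently. He added that on the road to innovation one must dare to experiment, keep a level head, and not fear failure.
Notably, according to the latest Forbes rich-list data, as of March 2025 Zhang's wealth was estimated at US$65.5 billion (about 290.2 billion ringgit), surpassing Tencent founder Ma Huateng and Nongfu Spring founder Zhong Shanshan to make him China's richest person, ranked 23rd on the global list.
His co-founder, Professor Yu Yong, is a Distinguished Professor and doctoral supervisor at Shanghai Jiao Tong University, was among the first cohort selected as a leading talent (Distinguished Teacher) under the National High-Level Talent Special Support Program, and founded the ACM Class.
ByteDance founder Zhang Yiming recently made a surprise appearance at the opening of the Shanghai Xuhui Zhichun Innovation Center, a rare public appearance for him in recent years.
The occasion for his return was the official opening of the center, which was jointly initiated by Zhang and Professor Yu Yong, founder of the ACM Class at Shanghai Jiao Tong University. As a privately run non-profit institution, the center plans to recruit young people interested in computing broadly and in artificial intelligence.
Zhang was blunt in his remarks. He said many young people are capable, yet their potential goes unrealized. He borrowed the AI term "overfitting" as an analogy: some people have solid knowledge and strong skills but falter as soon as they face an innovative task. He noted that he has long followed talent recruitment and development and has seen that much talent potential is not fully tapped; drawing on machine learning models ...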
Zhang Yiming's First Public Appearance in Recent Years: What It Means for ByteDance
Sou Hu Cai Jing· 2025-10-10 13:39
Core Insights
- Zhang Yiming's recent public appearance in China after four years has garnered significant attention, though it may not surpass the interest generated by Jack Ma's return [1][15]
- The focus of Zhang's appearance was on talent development, emphasizing the importance of nurturing innovative and resilient individuals [3][4]
Group 1: Talent Development and Innovation
- Zhang Yiming highlighted the need for talent recruitment and development, noting that many individuals' potential remains untapped [3]
- He compared the phenomenon of overfitting in machine learning to the challenges faced by talented individuals in innovation tasks, advocating for independent thinking and practical experience [3]
- Zhang's talent philosophy evolved during the growth of ByteDance, where he prioritized curiosity and optimism over traditional experience [4]
Group 2: Leadership Transition
- In May 2021, Zhang Yiming stepped down as CEO of ByteDance, with co-founder Liang Rubo taking over, as Zhang aimed for greater innovation and creativity within the company [5][6]
- Zhang expressed a desire to focus on long-term strategic matters, corporate culture, and social responsibility rather than daily management [6][8]
Group 3: Market and Regulatory Environment
- The external environment for tech companies is changing, with emerging fields like virtual reality and life sciences beginning to impact daily life [9]
- ByteDance's potential IPO has faced delays due to regulatory uncertainties and the need for greater business transparency [12]
- Zhang Yiming's movements are seen as a significant indicator of ByteDance's future direction, especially in light of the ongoing TikTok controversies [10][15]
Group 4: Public Perception and Media Attention
- Zhang Yiming's return to the public eye has sparked discussions about his citizenship status and ByteDance's IPO plans, with rumors circulating frequently [11][12]
- The media narrative surrounding Zhang's public appearances often reflects broader themes of corporate leadership and innovation within the tech industry [15]
Zhang Yiming Makes a Rare Public Appearance
Core Insights
- Zhang Yiming, the founder of ByteDance, re-emerged in public as a talent developer, sharing his thoughts on innovation and education at the opening of the Xuhui Zhichun Innovation Center in Shanghai [1]
- The center, co-founded with Professor Yu Yong from Shanghai Jiao Tong University, aims to recruit young talents interested in computer science and artificial intelligence [1]
Group 1: Talent Development Philosophy
- Zhang emphasized the importance of nurturing talent, noting that many individuals' potential remains untapped [1]
- He used the concept of "overfitting" from machine learning to illustrate the pitfalls in current talent development, where individuals may excel in specialized knowledge but struggle with innovation tasks [1]
- The innovation center aims to cultivate young talents who are active thinkers, passionate, resilient, and capable of embracing uncertainty while maintaining a long-term perspective [1]
Group 2: Entrepreneurial Insights
- Zhang's views on talent stem from his entrepreneurial experience, highlighting the challenges faced when ByteDance was founded in 2012, particularly in developing recommendation engines [2]
- He believes that merely making incremental innovations without addressing fundamental issues would not lead to significant breakthroughs in the mobile internet space [2]
Group 3: Leadership Transition
- In May 2021, Zhang announced his resignation as CEO of ByteDance, with co-founder Liang Rubo taking over the role, citing a desire for the company to achieve greater innovation and creativity [3]
- Zhang expressed that he had been relying on past successes and had not kept up with advancements in machine learning over the last few years [3]
- Post-resignation, he plans to focus on long-term strategic matters, including corporate culture and social responsibility, while dedicating time to learning and exploring new ideas [3]
Zhang Yiming Makes a Rare Public Appearance
21世纪经济报道· 2025-10-10 10:27
Core Viewpoint
- Zhang Yiming, the founder of ByteDance, emphasizes the importance of talent cultivation and innovation, highlighting the need for a shift in educational approaches to better prepare young talents for real-world challenges [1][2].
Group 1: Talent Cultivation
- The newly established Xuhui Zhichun Innovation Center aims to recruit young individuals interested in computer science and artificial intelligence, reflecting Zhang's commitment to nurturing talent [1].
- Zhang draws a parallel between the concept of "overfitting" in machine learning and current talent training pitfalls, where individuals may excel in specific skills but struggle with innovation tasks [2].
- The center seeks to foster active thinking, passion, resilience, and a long-term perspective among youth, encouraging independent thought and practical experience [2].
Group 2: Zhang Yiming's Philosophy
- Zhang's views on talent development are rooted in his entrepreneurial experience, particularly the founding of ByteDance in 2012, where he focused on solving fundamental problems rather than merely making incremental improvements [2].
- After stepping down as CEO in May 2021, Zhang expressed a desire for the company to continue innovating and becoming more meaningful, indicating a shift towards strategic thinking and long-term vision [4].
- He aims to dedicate time to learning and exploring new ideas, focusing on areas like virtual reality and life sciences, which he believes will significantly impact human life [4].
Zhang Yiming Makes a Rare Appearance, Teaming Up with Shanghai Jiao Tong University to Cultivate New AI Talent
Core Insights
- Zhang Yiming, the founder of ByteDance, emphasizes the importance of cultivating innovative talent who are resilient, passionate, and capable of independent thinking, rather than merely focusing on technical skills [1][3][4]
Group 1: Talent Development
- The newly established Xuhui Zhichun Innovation Center aims to recruit young individuals interested in computer science and artificial intelligence [1]
- Zhang Yiming draws a parallel between talent development and the concept of "overfitting" in machine learning, highlighting the pitfalls of training individuals to excel in specific areas without fostering adaptability [2]
- The center seeks to nurture youth who value practical experience and maintain a long-term perspective, encouraging them to embrace uncertainty [3]
Group 2: Company Philosophy
- Zhang Yiming's approach to talent development is influenced by his entrepreneurial experiences, particularly the early days of ByteDance when few companies focused on recommendation engines [4]
- The philosophy of addressing fundamental problems has been central to ByteDance's product development strategy [5]
- After stepping down as CEO, Zhang Yiming expressed a desire for the company to continue achieving significant innovations and to focus on long-term strategic matters [5][6]
Group 3: Future Focus
- Zhang Yiming plans to dedicate time to learning and exploring new concepts, aiming to create more possibilities for the company over the next decade [6]
- He acknowledges the changing external environment for tech companies, with emerging fields like virtual reality and life sciences beginning to impact human life [6]
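
For readers unfamiliar with the machine learning term behind the "overfitting" analogy in the summaries above, a minimal sketch may help. It is an illustration only, not anything shown at the event: the data, the alternating noise pattern, and the helper names `fit_line`, `memorize`, and `mse` are all assumptions made up for this example. A model that memorizes its training points scores perfectly on them yet does worse on unseen inputs than a simpler model that captures the underlying trend.

```python
# Toy illustration of overfitting (hypothetical data, not from the articles).
# True relationship: y = 2x. Training labels carry alternating +1 / -1 noise.

def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

def memorize(points):
    """1-nearest-neighbour lookup: return the label of the closest training x."""
    return lambda x: min(points, key=lambda p: abs(p[0] - x))[1]

def mse(model, points):
    """Mean squared error of a model over (x, y) pairs."""
    return sum((model(x) - y) ** 2 for x, y in points) / len(points)

# Ten noisy training points and nine unseen, noise-free test points.
train = [(float(i), 2.0 * i + (1.0 if i % 2 == 0 else -1.0)) for i in range(10)]
test = [(i + 0.25, 2.0 * (i + 0.25)) for i in range(9)]

line = fit_line(train)
lookup = memorize(train)

print("memorizer train/test MSE:", mse(lookup, train), mse(lookup, test))
print("line fit  train/test MSE:", mse(line, train), mse(line, test))
```

The memorizer reproduces every training label exactly, noise included, so its training error is zero while its error on new inputs is larger than that of the straight-line fit. That gap between performance on familiar and unfamiliar cases is what the analogy borrows.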
Don't Let the Inertia of Success "Lock In" the Future
3 6 Ke· 2025-09-25 00:51
Core Insights
- The article discusses the concept of "path dependence" and how reliance on past experiences can hinder innovation and adaptation in business environments [1][3][5]
- It highlights the dangers of "success dependence," where companies fail to evolve due to over-reliance on previous successful strategies [3][11]
- The need for entrepreneurs to break free from these mental constraints to unlock new growth opportunities is emphasized [11][12]
Group 1: Path Dependence
- Path dependence can lead to a rigid adherence to familiar strategies, which may become a liability in changing environments [2][5]
- Examples of companies like Nokia and Kodak illustrate how over-reliance on past successes can result in missed opportunities and decline [3][10]
- The concept of "local optimum" is introduced, where businesses may settle for satisfactory solutions without exploring potentially better alternatives [7][8]
Group 2: Cognitive Biases
- Cognitive biases, such as the tendency to stick with familiar methods, can limit the ability to adapt to new challenges [6][9]
- The article explains how the brain's predictive coding can reinforce existing beliefs and hinder the acceptance of new information [6][9]
- Entrepreneurs often attribute success to specific methods without recognizing the importance of context and adaptability [6][9]
Group 3: Overfitting in Business
- The analogy of "overfitting" from machine learning is used to describe how businesses can become too specialized in their past methods, failing to generalize to new situations [4][11]
- This overfitting can lead to a lack of responsiveness when faced with new data or market changes [4][11]
Group 4: Strategies for Overcoming Constraints
- To break free from path dependence, companies should actively seek new experiences and challenges [12][14]
- Developing transferable skills is crucial for adapting to changing environments and avoiding the pitfalls of being locked into a single path [14][15]
- Regularly reassessing goals and strategies can help identify when a company is stuck in a local optimum and needs to pivot [13][15]
Don't Let the Inertia of Success "Lock In" the Future | Startup Lifestyle
红杉汇· 2025-09-25 00:04
Core Viewpoint
- The article discusses the dangers of "path dependence" and "success dependence" in entrepreneurship, emphasizing that reliance on past experiences can hinder innovation and adaptation to new market conditions [4][6][15].
Group 1: Path Dependence
- Path dependence can lead to a reliance on outdated strategies, making it difficult for companies to adapt to new technologies and market demands [4][6].
- Examples include Nokia and Kodak, which failed to transition to smartphones and digital photography due to their reliance on past successes [4][6].
- The concept of path dependence is rooted in increasing returns and transfer costs, which discourage companies from changing established practices [6][7].
Group 2: Success Dependence
- Success dependence refers to the tendency to attribute past successes solely to specific methods, ignoring the context that made those methods effective [7][8].
- This cognitive bias can lead to a failure to question the relevance of established practices when market conditions change [7][8].
Group 3: Local Optima
- The article highlights the issue of "local optima," where individuals or companies settle for satisfactory solutions without exploring potentially better options [10][11].
- This phenomenon can hinder personal growth and innovation, as sticking to familiar paths may prevent the discovery of superior alternatives [11][12].
Group 4: Breaking Free from Constraints
- To overcome these limitations, companies should actively seek new experiences and challenge existing habits [16][18].
- Developing transferable skills can help entrepreneurs adapt to changing environments and avoid being trapped by outdated practices [18][19].
- The article advocates for a mindset shift from relying on past experiences to actively shaping future paths through continuous learning and adaptation [19].
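
The "local optimum" idea in the two summaries above comes from optimization, and a small sketch can make it concrete. Everything here is hypothetical: the payoff values in `LANDSCAPE` and the helper names `hill_climb` and `multi_start` are invented for illustration and do not come from the articles. A search that only ever accepts the next small improvement stops at the nearest peak, while the same search started from several different points can find a higher one.

```python
# Toy one-dimensional "landscape" (hypothetical values):
# index = a strategy, value = its payoff. There is a low peak and a high peak.
LANDSCAPE = [1, 2, 3, 2, 1, 2, 4, 6, 8, 6, 3]

def hill_climb(landscape, start):
    """Greedy local search: step to the better neighbour until none is better."""
    pos = start
    while True:
        neighbours = [i for i in (pos - 1, pos + 1) if 0 <= i < len(landscape)]
        best = max(neighbours, key=lambda i: landscape[i])
        if landscape[best] <= landscape[pos]:
            return pos  # no neighbour improves on the current position
        pos = best

def multi_start(landscape, starts):
    """Run the same greedy search from several starting points, keep the best."""
    peaks = [hill_climb(landscape, s) for s in starts]
    return max(peaks, key=lambda i: landscape[i])

stuck = hill_climb(LANDSCAPE, 0)
explored = multi_start(LANDSCAPE, [0, 5, 10])

print("single start stops at index", stuck, "with payoff", LANDSCAPE[stuck])
print("multiple starts reach index", explored, "with payoff", LANDSCAPE[explored])
```

Starting from index 0, the greedy search settles on the lower peak because every step away from it looks worse in the short term. Trying other starting points is a crude stand-in for the articles' advice to seek new experiences rather than refine a single familiar path.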
Chinese Team Ends the Token Crisis: Diffusion Models' Data Potential Is More Than Three Times That of Autoregressive Models
量子位· 2025-08-13 09:13
Core Viewpoint
- The article discusses the potential of diffusion language models (DLMs) as data learners, highlighting their ability to outperform autoregressive models in terms of data utilization and learning efficiency [1][4].
Group 1: Diffusion Language Models
- Diffusion language models can achieve over three times the data potential of autoregressive models when the number of tokens is limited [1].
- A diffusion model with 1 billion parameters trained on 1 billion tokens for 480 epochs achieved 56% and 33% accuracy on the HellaSwag and MMLU benchmarks, respectively, without any data filtering or tricks [5].
- The model's performance did not show saturation even under extreme repetition, indicating that it can extract more useful information from the data [4].
Group 2: Learning Mechanisms
- The strong data learning capability of diffusion language models is attributed to two main factors: the diffusion objective and bidirectional attention mechanisms, allowing for comprehensive data utilization beyond causal relationships [8][9].
- Diffusion models invest more computational resources (FLOPs) during training and inference, enhancing model performance through iterative optimization [11].
- Unlike autoregressive models that prioritize computational efficiency, diffusion models focus on maximizing data potential, which leads to improved learning outcomes [14].
Group 3: Overfitting and Data Utilization
- The research team observed that the number of training epochs before overfitting occurs is positively correlated with the amount of unique data and negatively correlated with model size [18].
- Even when overfitting occurs, the model's performance on downstream tasks may continue to improve, suggesting that absolute loss values do not necessarily translate to relative performance changes [19][21].
- The phenomenon of overconfidence in certain text segments after repeated exposure to limited training data may explain the observed performance trends [26][27].
Group 4: Future Research Directions
- The research team plans to use larger models and more unique data in future studies to further validate their findings and hypotheses regarding diffusion language models [28].
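
The contrast between causal and bidirectional attention mentioned in the summary above can be sketched with plain attention masks. This is only an illustration of the two visibility patterns, not the research team's architecture or training code, and the helper names (`causal_mask`, `bidirectional_mask`, `visible_pairs`, `show`) are assumptions introduced here. In a causal mask each position sees only itself and earlier positions; in a bidirectional mask each position sees the whole sequence.

```python
def causal_mask(n):
    """mask[i][j] is True when position i may attend to position j (j <= i)."""
    return [[j <= i for j in range(n)] for i in range(n)]

def bidirectional_mask(n):
    """Every position may attend to every other position."""
    return [[True for _ in range(n)] for _ in range(n)]

def visible_pairs(mask):
    """Count the (query, key) pairs the mask allows."""
    return sum(sum(row) for row in mask)

def show(mask):
    """Render a mask as text: 'x' = visible, '.' = hidden."""
    return "\n".join(" ".join("x" if v else "." for v in row) for row in mask)

n = 5
print("causal (autoregressive-style):")
print(show(causal_mask(n)))
print("bidirectional:")
print(show(bidirectional_mask(n)))
print("visible pairs:", visible_pairs(causal_mask(n)), "vs", visible_pairs(bidirectional_mask(n)))
```

For a sequence of length n, the causal pattern allows n(n+1)/2 query-key pairs and the bidirectional pattern allows n². The count is only a rough picture of how much more context each position can draw on; it is not a measure of the "data potential" reported in the article, which also depends on the diffusion training objective.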
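Token Crisis Solved? Diffusion Models Have Three Times the Data Potential of Autoregressive Models, and Performance Keeps Climbing After 480 Rounds of Repeated Training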
机器之心· 2025-08-10 04:31
Core Viewpoint
- The article discusses the advancements in diffusion language models (DLMs) as superior data learners compared to autoregressive (AR) models, particularly in data-constrained environments [1][8].
Group 1: Token Crisis and Research Findings
- The research addresses the impending token crisis in large language models (LLMs), where the availability of high-quality training text data is diminishing, limiting model performance [2][3].
- The team pre-trained DLMs and AR models from scratch, achieving a maximum scale of 8 billion parameters and 480 billion tokens [3][4].
Group 2: Performance Comparison
- In scenarios with limited tokens, DLMs outperform AR models, demonstrating over three times the data potential [5][8].
- A DLM trained on 1 billion tokens achieved 56% accuracy on the HellaSwag benchmark and 33% on the MMLU benchmark, significantly surpassing AR models [14].
Group 3: Repeated Training Benefits
- Repeated training on the same dataset enhances performance, with DLMs showing no signs of performance saturation even after extensive training [14][19].
- The study indicates that DLMs can extract more effective information from a fixed dataset, leading to improved performance metrics [14][19].
Group 4: Mechanisms Behind DLMs' Superiority
- DLMs utilize a bidirectional modeling approach, allowing them to extract more information from web data compared to purely causal modeling used by AR models [19][22].
- DLMs are described as "super dense models," translating their computational density into enhanced intelligence [22][24].
Group 5: Methodological Critique of Related Research
- The article critiques a concurrent study, highlighting methodological flaws that may skew its conclusions regarding DLMs and AR models [25][30].
- It emphasizes that the loss function used in the other study does not accurately represent model likelihood, potentially leading to misleading results [26][32].