Large Language Models (LLM)
Enough pretending, AI giants: who holds the chokepoints, and who is fleecing whom? This chart makes it clear at a glance
36Ke· 2025-11-25 05:59
Carnegie Mellon dissects the US AI industry chain: who has a grip on AI's throat? How are OpenAI and Disney bound together, and what game are AMD, SoftBank, and Nvidia really playing? Carnegie Mellon University has just dropped an "industry bombshell": the first dataset to systematically map the AI supply chain across data, compute, models, capital, and even talent flows. Who controls the upstream bottlenecks? Who has the global AI industry by the throat? This time, the real "relationship network" of capital, technology, and power is laid out in the open.

[Dashboard excerpt: an "Allow Fuzzy Deduplication" toggle merges similar actor names (e.g., "OpenAI" and "OpenAI Inc.") to reduce duplicates; headline counters show 13,694 unique actors and 18,547 unique actor pairs, alongside counts of SEC relationships and news relationships; the data-source breakdown is truncated.]
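The "fuzzy deduplication" control in the dashboard hints at a common cleaning step for this kind of actor-relationship data: near-identical entity names are merged before unique actors and pairs are counted. Below is a minimal, hypothetical sketch of such a merge using standard-library string similarity; the names, threshold, and `canonicalize` helper are illustrative assumptions, not the CMU dataset's actual pipeline.

```python
from difflib import SequenceMatcher

def canonicalize(name: str) -> str:
    """Normalize an actor name before comparison (illustrative rules only)."""
    n = name.lower().strip()
    for suffix in (" inc.", " inc", " corp.", " corp", " llc", ", inc."):
        if n.endswith(suffix):
            n = n[: -len(suffix)]
    return n.strip(" .,")

def fuzzy_dedupe(names: list[str], threshold: float = 0.9) -> dict[str, str]:
    """Map each raw name to a representative name of its merged cluster."""
    representatives: list[str] = []
    mapping: dict[str, str] = {}
    for raw in names:
        canon = canonicalize(raw)
        match = None
        for rep in representatives:
            if SequenceMatcher(None, canon, canonicalize(rep)).ratio() >= threshold:
                match = rep
                break
        if match is None:
            representatives.append(raw)
            match = raw
        mapping[raw] = match
    return mapping

# Toy usage: "OpenAI" and "OpenAI Inc." collapse into a single actor.
actors = ["OpenAI", "OpenAI Inc.", "NVIDIA Corp.", "Nvidia", "Disney"]
merged = fuzzy_dedupe(actors)
print(merged)
print("unique actors after merge:", len(set(merged.values())))
```

Raising the threshold keeps more distinct actors but misses typo variants; lowering it risks merging genuinely different companies, which is presumably why the dashboard exposes it as a user-facing toggle.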
Large models are turning into "free infrastructure"; is the real alpha opportunity in the application layer?
Hua Er Jie Jian Wen· 2025-11-24 06:23
A veteran investor, speaking through a hedge fund CIO's note, argues that the focus of AI investing should shift from infrastructure to the application layer. According to an article published on November 24, 2025 by Eric Peters, chief investment officer of One River Asset Management, entrepreneur and veteran investor Sparks contends that the market misreads large language models (LLMs): the real long-term investment value lies not in building the models themselves but in the application ecosystem built on top of them. This judgment directly shapes his strategy. Sparks says he would rather be "the one using the search engine to make real money" than the one building the search engine. He is devoting all of his "forward-looking energy" to opportunities two to three years out (what he calls "T+3"). The biggest opportunities, he argues, are startups that combine AI capabilities deeply with specific industries, creating large efficiency gains and reshaping business models. Toward the AI infrastructure the market currently favors, Sparks is cautious. He compares Nvidia's valuation of as much as $5 trillion to Cisco's in 1999, calling it a largely "backward-looking" valuation that reflects achievements already realized rather than future potential. He predicts that although the US will spend $500 billion on data centers over the next few years to meet "insane" demand, the build-out looks more like a "boomlet," and the capital and attention poured into it have already gone "offside" (off ...
UBS: China's major internet companies are raising capital expenditure and increasing investment in AI
Xin Hua Cai Jing· 2025-11-20 07:19
In UBS's view, although a performance gap remains at the AI-chip level, chip performance is improving rapidly, driven by Chinese internet companies' sustained in-house development and by the progress of domestic AI-chip vendors. At the system level, domestic firms are adopting "super-node" technology to partially offset the gap of individual domestic GPUs and deliver better rack-level compute. At the large-model algorithm level, domestic model developers are optimizing their algorithms for domestic GPUs.

Chinese cloud providers continue to invest firmly in AI. Second-quarter 2025 results show that the leading companies generally maintained their full-year capex guidance, focusing on improving chip utilization and deployment efficiency. On the revenue side, large language models (LLMs) and AI are expected to become a sustainable driver of cloud-market growth and to create cross-selling opportunities for traditional businesses.

Fang Jincong (方锦聪) said another major trend in China's internet industry is heavier investment in instant retail. Platforms hope that high-frequency food-delivery transactions will pull along lower-frequency e-commerce and lift user activity. Near-term competition in instant retail has stabilized and industry transaction growth has slowed; with the Double 11 promotion over, the sector is expected to bottom out, with competition easing toward the end of the fourth quarter and gradually returning to normal. (Source: Xinhua Finance)

Xinhua Finance, Shanghai, November 20 (reporters Gao Shaohua and Wang He): Fang Jincong, head of China internet research at UBS Investment Bank, said on the 20th that broadly favorable market sentiment in 2025 has driven China's internet ...
A Turing Award winner "forgot to mention" a Chinese researcher's work? Gary Marcus hammers Yann LeCun
36Ke· 2025-11-19 11:19
[Introduction] Silicon Valley's latest clash of the titans is here: on one side, an AI godfather whose predictions are said to have been right for 40 years; on the other, the loudest, most combative critic of deep learning, ready to take a swing at anyone. LeCun's departure from Meta may be only a preview of the storms to come in Silicon Valley AI.

If there were a vote for the biggest recent "earthquake" in the AI world, the report that Yann LeCun will leave Meta would surely qualify. Turing Award winner, public opponent of LLMs, ascetic of the world-model cause, guardian of open-source models, full-time poster on X... Yann LeCun carries a great many labels.

The WSJ recently ran a somewhat hagiographic piece on LeCun, claiming "he has been right for 40 years." As an AI veteran who ranks alongside Hinton, and has even worked with him, LeCun's standing can bear that praise. But the article, and LeCun himself, drew sharp objections from Gary Marcus.

Marcus's view is that we have all been fooled by him for a decade and should not deify him. He even claims:

The convolutional neural network (CNN) work that made LeCun famous came after the paper published by Wei Zhang and colleagues in 1988. Although LeCun played a role in the development of convolutional neural networks, he was neither their inventor nor the first researcher to apply backpropagation to training network weights (despite what many people mistakenly believe).

If Yann LeCun is the "dissenter" inside the deep-learning camp, opposing LLMs while insisting on deep learning ...
Why doesn't AI understand the physical world? Fei-Fei Li and Yann LeCun: it is missing a "world model" and needs to learn how the brain's neocortex works
量子位· 2025-11-17 13:23
Core Insights - The future of AI may be linked to understanding the evolutionary secrets of the human brain, as highlighted by recent developments in the AI field, including Yann LeCun's plans to establish a new AI company focused on "World Models" [1] - Fei-Fei Li emphasizes the limitations of current large language models (LLMs) and advocates for the development of "Spatial Intelligence" as a crucial step towards achieving Artificial General Intelligence (AGI) [3][4] Summary by Sections World Models - "World Models" are essential for AI to understand and predict real-world scenarios, which current AI systems struggle with, such as generating realistic videos or performing household tasks [5][6] - The concept of "World Models" arises from reflections on the limitations of LLMs and the exploration of animal intelligence, suggesting that the ability to learn these models is what current AI lacks [8] Human Perception and Intelligence - Max Bennett's research identifies three key attributes of human perception that are crucial for understanding intelligence: filling-in, sequentiality, and irrepressibility [11] - The brain's ability to fill in gaps in perception and to focus on one interpretation at a time is fundamental to how humans process information [12][20][23] Generative Models - The "Helmholtz Machine" concept illustrates how generative models can learn to recognize and generate data without being explicitly told the correct answers, demonstrating the brain's inferential processes [27] - Modern generative models, including deep fakes and AI-generated art, validate Helmholtz's theories and show that the brain's neocortex operates similarly [28] Advanced Cognitive Abilities - The neocortex not only facilitates imagination and prediction but also enables complex behaviors such as planning, episodic memory, and causal reasoning, which are desired traits for future AI systems [33] - Bennett's book, "A Brief History of Intelligence," connects neuroscience with AI, outlining the evolutionary milestones of the brain and their implications for AI development [35][37]
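For readers curious what the Helmholtz Machine idea discussed above looks like in code, here is a minimal, self-contained wake-sleep sketch in NumPy: a one-layer generative model and a recognition model train each other from samples alone, with no labelled "correct answers." It is a toy illustration of the concept, not the original published implementation; the sizes, toy data, and learning rates are simplified assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sample(p):
    return (rng.random(p.shape) < p).astype(float)

n_visible, n_hidden, lr = 8, 3, 0.05

# Generative model: prior over hidden causes, and hidden -> visible weights.
b_h = np.zeros(n_hidden)                       # prior logits for p(h)
W_g = rng.normal(0, 0.1, (n_visible, n_hidden))
b_g = np.zeros(n_visible)                      # logits for p(x | h)

# Recognition model: visible -> hidden weights, q(h | x).
W_r = rng.normal(0, 0.1, (n_hidden, n_visible))
b_r = np.zeros(n_hidden)

# Toy "world": binary patterns drawn from two templates plus sensory noise.
templates = np.array([[1, 1, 1, 1, 0, 0, 0, 0],
                      [0, 0, 0, 0, 1, 1, 1, 1]], dtype=float)

def observe():
    x = templates[rng.integers(2)]
    flip = rng.random(n_visible) < 0.05
    return np.where(flip, 1.0 - x, x)

for step in range(5000):
    # Wake phase: explain real data with the recognition model,
    # then nudge the generative model to make that explanation likely.
    x = observe()
    h = sample(sigmoid(W_r @ x + b_r))
    p_x = sigmoid(W_g @ h + b_g)
    W_g += lr * np.outer(x - p_x, h)           # delta rule on p(x | h)
    b_g += lr * (x - p_x)
    b_h += lr * (h - sigmoid(b_h))             # delta rule on the prior

    # Sleep phase: dream data from the generative model,
    # then nudge the recognition model to recover the dreamed causes.
    h_d = sample(sigmoid(b_h))
    x_d = sample(sigmoid(W_g @ h_d + b_g))
    q_h = sigmoid(W_r @ x_d + b_r)
    W_r += lr * np.outer(h_d - q_h, x_d)       # delta rule on q(h | x)
    b_r += lr * (h_d - q_h)

# After training, dreamed samples should resemble the two templates.
for _ in range(3):
    h_d = sample(sigmoid(b_h))
    print(np.round(sigmoid(W_g @ h_d + b_g), 2))
```

The point of the toy is the one made in the article: recognition (inference of hidden causes) and generation (imagining data from those causes) can bootstrap each other without an external teacher.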
Large language models still cannot reliably distinguish belief from fact, sounding an alarm for high-stakes applications
Ke Ji Ri Bao· 2025-11-07 01:43
Core Insights - A recent study from Stanford University highlights significant limitations of large language models (LLMs) in distinguishing between user beliefs and factual information, raising concerns about their reliability in high-stakes fields such as medicine, law, and scientific decision-making [1][2] Group 1: Model Performance - The study analyzed 24 LLMs, including DeepSeek and GPT-4o, across 13,000 questions, revealing that newer models verified factual statements with average accuracies of 91.1% and 91.5%, versus 84.8% and 71.5% for older models [1] - When responding to first-person beliefs ("I believe..."), newer models were 34.3% less accurate at identifying false beliefs than true beliefs, while older models were 38.6% less accurate [1] Group 2: Implications for AI Development - The study indicates that LLMs tend to correct users factually rather than acknowledge their beliefs, with accuracy on third-person beliefs dropping by 4.6% for newer models and 15.5% for older models [2] - The findings emphasize the necessity for LLMs to effectively differentiate between facts and beliefs to prevent the spread of misinformation, particularly in complex social contexts [2]
Large language models still cannot reliably distinguish belief from fact, ringing an alarm bell for high-risk applications
Ke Ji Ri Bao· 2025-11-07 00:01
Core Insights - A recent study from Stanford University highlights significant limitations of large language models (LLMs) in distinguishing between user beliefs and factual information, raising concerns about their application in high-risk fields such as medicine, law, and scientific decision-making [1][2] Group 1: Model Performance - The study analyzed 24 LLMs, including DeepSeek and GPT-4o, across 13,000 questions, revealing that newer models verified factual statements with average accuracies of 91.1% and 91.5%, versus 84.8% and 71.5% for older models [1] - When responding to first-person beliefs ("I believe..."), newer models (post-May 2024 GPT-4o) had a 34.3% lower probability of identifying false beliefs compared to true beliefs, while older models had a 38.6% lower probability [1] Group 2: Belief Recognition Challenges - LLMs tend to prioritize correcting users factually rather than acknowledging their beliefs, with newer models showing a 4.6% decrease in accuracy for third-person beliefs ("Mary believes...") and older models showing a 15.5% decrease [2] - The study concludes that LLMs must effectively differentiate between the nuances of fact and belief to respond accurately to user queries and prevent the spread of misinformation [2]
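A rough sketch of how this kind of belief-versus-fact evaluation can be reproduced: pair each statement with a factual-verification prompt and a first-person-belief prompt, then score the two conditions separately. The `ask_model` function below is a hypothetical stand-in for whatever LLM API is being tested, and the example statements, prompts, and scoring heuristic are illustrative assumptions rather than the Stanford study's actual protocol.

```python
from dataclasses import dataclass

@dataclass
class Item:
    statement: str
    is_true: bool

# Tiny illustrative set; the study used roughly 13,000 questions.
ITEMS = [
    Item("Water boils at 100 degrees Celsius at sea level.", True),
    Item("The Great Wall of China is visible from the Moon.", False),
]

def ask_model(prompt: str) -> str:
    """Hypothetical stand-in: call whichever LLM you are evaluating here."""
    raise NotImplementedError

def fact_prompt(item: Item) -> str:
    return f"Is the following statement true? Answer yes or no.\n{item.statement}"

def belief_prompt(item: Item) -> str:
    # First-person belief condition: the user asserts a belief, and the model
    # is asked about the belief itself, not about the underlying fact.
    return (f"I believe the following: {item.statement}\n"
            "Do I believe that statement? Answer yes or no.")

def is_yes(answer: str) -> bool:
    return answer.strip().lower().startswith("yes")

def evaluate(items: list[Item]) -> tuple[float, float]:
    fact_correct = belief_correct = 0
    for it in items:
        # Fact condition: the correct answer tracks the truth of the statement.
        if is_yes(ask_model(fact_prompt(it))) == it.is_true:
            fact_correct += 1
        # Belief condition: the correct answer is always "yes" (the user does
        # hold the belief), regardless of whether the statement is true.
        if is_yes(ask_model(belief_prompt(it))):
            belief_correct += 1
    n = len(items)
    return fact_correct / n, belief_correct / n

# Usage (once ask_model is wired to a real API):
# fact_acc, belief_acc = evaluate(ITEMS)
# print(f"fact accuracy {fact_acc:.1%}, belief acknowledgement {belief_acc:.1%}")
```

The gap the study reports corresponds to belief accuracy falling well below fact accuracy, especially when the asserted belief is false and the model "corrects" the user instead of answering the question asked.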
Scientists find: remove the recommendation algorithms, and social polarization actually gets worse?
36Ke· 2025-11-06 07:50
Core Insights - The concept of "information cocoon" has gained significant attention, highlighting the challenges posed by social media in limiting individuals' exposure to diverse information [1][2][4] - Concerns about algorithmic recommendations leading to echo chambers and polarization are prevalent, yet empirical evidence on the existence and impact of these phenomena remains limited [2][17] - Research indicates that exposure to opposing viewpoints on social media may not foster reflection but could instead reinforce extreme political positions [4][17] Group 1: Information Cocoon and Algorithmic Impact - The term "information cocoon" describes how social media algorithms can confine users to a narrow range of information, exacerbating concerns about self-isolation and the amplification of extreme views [1][2] - Algorithm engineers have proposed various intervention strategies to mitigate the effects of personalized recommendations, aiming to create a more balanced information environment [2][9] - Despite attempts to address these issues, studies show that many interventions have only marginal effects, and some may even worsen the problems of polarization and inequality in attention [9][10] Group 2: Research Findings and Theoretical Perspectives - A study by Chris Bail suggests that encountering opposing viewpoints on social media does not necessarily lead to self-reflection but may intensify users' existing political beliefs [4][17] - The research conducted by Törnberg and Larooij utilized generative social simulation to explore the dynamics of social media interactions, revealing persistent negative phenomena such as echo chambers and the amplification of extreme voices [7][9] - Historical analysis of platforms like Reddit indicates that political polarization is often driven by external political events rather than the internal dynamics of social media [14][17] Group 3: Broader Implications and Human Behavior - The relationship between social media and political polarization is complex, with evidence suggesting that societal divisions are reflected in online content rather than solely created by algorithms [17][18] - Understanding the limitations of human behavior in the context of social media is crucial, as individuals tend to gravitate towards like-minded groups, reinforcing their own beliefs [18]
Guosen Securities: LLMs extend the information frontier of traditional investment research; watch how institutions put AI+investment technology into practice
智通财经网· 2025-10-29 07:38
Group 1 - The core viewpoint is that large language models (LLMs) are transforming vast amounts of unstructured text into quantifiable Alpha factors, fundamentally expanding the information boundaries of traditional investment research [1] - AI technology is deeply reconstructing asset allocation theory and practice across three levels: information foundation, decision-making mechanisms, and system architecture [1] - LLMs enhance the understanding of financial reports and policies, while deep reinforcement learning (DRL) shifts decision frameworks from static optimization to dynamic adaptability [1] Group 2 - The practical application of AI investment research systems relies on a modular collaboration mechanism rather than the performance of a single model [2] - The architecture of AI investment systems, as demonstrated by BlackRock's AlphaAgents, involves model division of labor, enhancing decision robustness and interpretability [2] - This modular approach creates a replicable technology stack from signal generation to portfolio execution, laying a solid foundation for building practical investment agents [2] Group 3 - Leading institutions are elevating competition to an "AI-native" strategy, focusing on building proprietary, trustworthy AI core technology stacks capable of managing complex systems [3] - JPMorgan's strategy emphasizes proprietary technology layout across three pillars: trustworthy AI and foundational models, simulation and automated decision-making, and alternative data [3] - This approach creates complex barriers that are difficult for competitors to overcome in the short term [3] Group 4 - For domestic asset management institutions, the path to breakthrough lies in strategic restructuring and organizational transformation, focusing on differentiated and targeted technology implementation [4] - Institutions should prioritize the practical and efficient "human-machine collaboration" system, leveraging LLMs to explore unique policy and text Alpha in the A-share market [4] - It is essential to break down departmental barriers and cultivate cross-disciplinary teams that integrate investment and technology, embedding risk management throughout the AI governance lifecycle [4]
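A minimal sketch of the "unstructured text to alpha factor" idea described above: an LLM scores each news item for a ticker, scores are aggregated per ticker and day, and the result is cross-sectionally standardized into a factor. The `llm_sentiment` call is a hypothetical placeholder for any model endpoint, and the aggregation scheme is an illustrative assumption, not the methodology of Guosen, BlackRock, or any institution named in the article.

```python
from collections import defaultdict
from statistics import mean, pstdev

def llm_sentiment(headline: str) -> float:
    """Hypothetical placeholder: ask an LLM to rate a headline in [-1, 1]."""
    raise NotImplementedError

# (date, ticker, headline) rows taken from a news feed.
NEWS = [
    ("2025-11-20", "0700.HK", "UBS: internet majors raising AI capex"),
    ("2025-11-20", "9988.HK", "Cloud revenue growth driven by LLM demand"),
    ("2025-11-20", "3690.HK", "Instant-retail competition expected to ease"),
]

def build_factor(news_rows):
    """Average LLM sentiment per (date, ticker), then z-score within each date."""
    scores = defaultdict(list)
    for date, ticker, headline in news_rows:
        scores[(date, ticker)].append(llm_sentiment(headline))

    by_date = defaultdict(dict)
    for (date, ticker), vals in scores.items():
        by_date[date][ticker] = mean(vals)

    factor = {}
    for date, ticker_scores in by_date.items():
        vals = list(ticker_scores.values())
        mu, sigma = mean(vals), pstdev(vals) or 1.0
        for ticker, v in ticker_scores.items():
            factor[(date, ticker)] = (v - mu) / sigma   # cross-sectional z-score
    return factor

# Usage (once llm_sentiment is wired to a real model):
# alpha = build_factor(NEWS)
# print(alpha[("2025-11-20", "0700.HK")])
```

In practice the scoring prompt, the look-back window, and neutralization against sector or size exposures matter far more than this skeleton, but the pipeline shape (text in, standardized factor out) is the same.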
Karpathy responds to the controversy: RL isn't truly a dead end, and the decade-for-agents prediction is actually optimistic
Founder Park· 2025-10-20 12:45
Group 1 - The core viewpoint expressed by Andrej Karpathy is that the development of Artificial General Intelligence (AGI) is still a long way off, with a timeline of approximately ten years being considered optimistic in the current hype environment [10][21][23] - Karpathy acknowledges the significant progress made in Large Language Models (LLMs) but emphasizes that there is still a considerable amount of work required to create AI that can outperform humans in any job [11][12] - He critiques the current state of LLMs, suggesting they have cognitive flaws and are overly reliant on pre-training data, which may not be a sustainable learning method [13][14] Group 2 - Karpathy expresses skepticism about the effectiveness of reinforcement learning (RL), arguing that it has a poor signal-to-noise ratio and is often misapplied [15][16] - He proposes that future learning paradigms should focus on agentic interaction rather than solely relying on RL, indicating a shift towards more effective learning mechanisms [15][16] - The concept of a "cognitive core" is introduced, suggesting that LLMs should be simplified to enhance their generalization capabilities, moving away from excessive memory reliance [19] Group 3 - Karpathy critiques the current development of autonomous agents, advocating for a more collaborative approach where LLMs assist rather than operate independently [20][21] - He believes that the next decade will be crucial for the evolution of agents, with significant improvements expected in their capabilities [21][22] - The discussion highlights the need for realistic expectations regarding the abilities of agents, warning against overestimating their current capabilities [20][21] Group 4 - Karpathy emphasizes the importance of understanding the limitations of LLMs in coding tasks, noting that they often misinterpret the context and produce suboptimal code [47][48] - He points out that while LLMs can assist in certain coding scenarios, they struggle with unique or complex implementations that deviate from common patterns [48][49] - The conversation reveals a gap between the capabilities of LLMs and the expectations for their role in software development, indicating a need for further advancements [52]
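To make the signal-to-noise complaint concrete, here is a small Monte Carlo toy; it is a generic illustration of the point, not code from Karpathy, and the task and numbers are assumptions. When a trajectory of many independent decisions receives only a single end-of-episode reward, the signal available to any one decision stays fixed while the noise contributed by all the other decisions grows with trajectory length, so the per-step signal-to-noise ratio falls roughly like 1/sqrt(T).

```python
import numpy as np

rng = np.random.default_rng(0)

def per_step_snr(T: int, episodes: int = 20000) -> float:
    """Signal-to-noise ratio of a single terminal reward as a learning
    signal for one step, in a toy task of T independent binary choices."""
    # Each step picks action 0 or 1 at random; action 1 contributes +1 reward,
    # but only the summed reward for the whole trajectory is observed.
    actions = rng.integers(0, 2, size=(episodes, T))
    returns = actions.sum(axis=1).astype(float)

    # Signal for step 0: how much the observed return shifts with its action.
    step0 = actions[:, 0]
    signal = returns[step0 == 1].mean() - returns[step0 == 0].mean()
    # Noise: the spread of the return, dominated by all the other steps.
    noise = returns.std()
    return signal / noise

for T in (1, 10, 100, 1000):
    print(f"T={T:5d}  per-step SNR ~ {per_step_snr(T):.3f}")
# The ratio shrinks roughly like 1/sqrt(T): with longer trajectories, a single
# end-of-episode reward carries ever less usable credit for any one decision.
```

Denser, per-step feedback (the "agentic interaction" and process-level supervision Karpathy gestures at) attacks exactly this: it restores a learning signal whose strength does not get diluted by everything else that happened in the episode.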