Large Language Models (LLM)

LatePost Exclusive: Li Auto's In-House Assisted-Driving Chip Enters Road Testing, Beating NVIDIA's Thor-U on Some Compute Tasks
晚点LatePost· 2025-08-28 06:09
The following article comes from 晚点Auto; author: the 晚点 team. Li Auto has said that once its assisted-driving algorithms are locked down, an in-house chip would deliver better efficiency and lower cost. By Zhao Yu; edited by Gong Fangyi. We have exclusively learned that Li Auto's in-house assisted-driving chip, the M100, received its first samples back from the fab in the first quarter of this year, clearing a key milestone on the way to mass production. The M100 then completed functional and performance testing within two weeks and subsequently passed stress testing by Li Auto's R&D staff. It is now installed in a small batch of prototype vehicles for road testing. According to our information, the M100 shows distinct strengths across workload types: on large language model (LLM, Large Language Model) workloads, one M100 delivers effective compute roughly equal to two NVIDIA Thor-U chips, while on convolutional neural network (CNN, Convolutional Neural Network) workloads for traditional vision tasks such as image recognition, one M100 is comparable to three NVIDIA Thor-U chips. The M100 is expected to go into mass production in vehicles next year. Until then, Li Auto will continue to rely on its two existing partners, NVIDIA and Horizon Robotics. Meanwhile, to safeguard the in-house assisted-driving chip ...
Down Over 3% After Hours! NVIDIA's Blackwell Accelerated in Q2 and the Data Center Remained the Core Business, So Why Did the Stock Dive? (Q2 Earnings Details Inside)
美股IPO· 2025-08-27 23:46
NVIDIA's Q2 revenue grew at its slowest year-on-year pace in over two years yet still beat analyst expectations, and the company released $180 million of H20 inventory that had been reserved for China. Data center revenue disappointed for a second straight quarter: Blackwell product revenue rose 17% quarter on quarter, while data center compute revenue fell 1% quarter on quarter on a $4 billion drop in H20 sales. Gaming revenue grew 49% to a new record. The Q3 revenue guidance assumes no H20 exports to China. CEO Jensen Huang said China could represent a $50 billion opportunity this year, and the board authorized an additional $60 billion in buybacks. NVIDIA fell more than 3% after hours. The earnings report shows that in the fiscal quarter ended in late July, NVIDIA maintained double-digit total revenue growth, and revenue from the new Blackwell-architecture chips grew 17% quarter on quarter, which Huang took as a sign of "very strong demand." But revenue in the core data center business continued to disappoint, partly because of lower H20 chip revenue: the company sold no H20 in China during the quarter, though it did release $180 million of H20 inventory reserved for the Chinese market. Compared with last quarter's results, this quarter's guidance looks more worrying: commentators called the revenue guidance lukewarm, stoking investor concern that the growth momentum of AI spending is slowing. According to CCTV News, Huang said during his mid-July visit to China that the US government had approved the company's export licenses and that it would begin selling H20 chips to the Chinese market. Judging from the results, however, the Trump administration's relaxation of export restrictions has yet to translate into a substantive revenue rebound, and NVIDIA's difficulties in China cast a shadow over its outlook. ...
Desk-Rejection Warning: ICLR's Strictest Rules Yet Shut Down Covertly LLM-Written Papers
机器之心· 2025-08-27 08:36
机器之心 report; editors: Du Wei, +0. Just now, another top international AI conference has put shackles on large models. ICLR 2025 wrapped up this April, ultimately accepting 11,565 submissions at an acceptance rate of 32.08%. Today, ICLR 2026 released a large language model (LLM) usage policy to explicitly regulate how paper authors and reviewers may use LLMs in the research and review process. The conference will be held April 23 to 27 next year in Rio de Janeiro, Brazil. Policy 1: any use of an LLM must be truthfully disclosed, following the Code of Ethics policies that "all contributions to research must be acknowledged" and that "contributors should expect... to receive credit for their work." Policy 2: ICLR authors and reviewers are ultimately responsible for their own contributions, following the Code of Ethics policy that "researchers must not deliberately make false or misleading claims, fabricate or falsify data, or misrepresent results." Submissions that violate these policies face concrete penalties, the most severe of which is desk rejection. To clarify how the policies apply in practice, ICLR officials listed several key scenarios. All the policies released this time are grounded in the ICLR Code of Ethics and aim to safeguard academic integrity while averting risks LLMs may introduce, such as ...
Squeezing Every Drop of GPU Performance: ZTE's Mariana Breaks Through the GPU-Memory Wall
量子位· 2025-08-26 05:46
NVIDIA's open-source Dynamo project implements a multi-tier caching algorithm for the storage system: hot data lives in GPU memory, warm data in host memory, and cold data on SSD or remote object storage, with a unified index plus an asynchronous pipeline providing automatic migration and transparent access; but migrating data across tiers is complex, and the latency overhead is hard to compress. Microsoft's LMCache storage system is highly compatible with inference frameworks such as vLLM, but its support for distributed storage is limited and its capacity ceiling is low. Alibaba has proposed a remote-storage scheme that extends the KV Cache into the Tair database; capacity scales easily, but read/write performance struggles to meet the low-latency demands of LLM inference. CXL (Compute Express Link), an emerging high-speed interconnect with high bandwidth, low latency, and hardware-level cache coherence, offers new hope for breaking through the memory bottleneck in AI and high-performance computing. Research on using CXL storage to accelerate LLM inference is still scarce, so exploring how to use new media such as CXL to expand the KV Cache space, and to migrate mature software stacks onto CXL hardware, is highly worthwhile work. As large language models (LLM) spread into every industry, the tension between inference efficiency and GPU-memory cost grows ever sharper. The KV Cache (Key-Value Cache), the core technique for accelerating generation, is like a ...
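The hot/warm/cold tiering idea described above can be sketched in a few lines. This is a toy illustration under assumed names and capacities (the `TieredKVCache` class, the tier sizes, and the promote-on-access policy are all hypothetical), not Dynamo's actual unified-index design:

```python
from collections import OrderedDict

class TieredKVCache:
    """Three LRU tiers standing in for GPU memory (hot), host memory (warm),
    and SSD/remote storage (cold). Evicted entries demote one tier down;
    accessed entries are transparently promoted back to the hot tier."""

    def __init__(self, hot_cap=2, warm_cap=4, cold_cap=8):
        self.caps = {"hot": hot_cap, "warm": warm_cap, "cold": cold_cap}
        self.tiers = {name: OrderedDict() for name in self.caps}

    def put(self, key, value):
        self.tiers["hot"][key] = value
        self.tiers["hot"].move_to_end(key)  # mark as most recently used
        self._evict("hot")

    def get(self, key):
        for name in ("hot", "warm", "cold"):
            if key in self.tiers[name]:
                value = self.tiers[name].pop(key)
                self.put(key, value)  # promote on access
                return value
        return None  # cache miss: caller must recompute the KV entries

    def _evict(self, name):
        order = ["hot", "warm", "cold"]
        while len(self.tiers[name]) > self.caps[name]:
            key, value = self.tiers[name].popitem(last=False)  # LRU entry
            idx = order.index(name)
            if idx + 1 < len(order):  # demote one tier down
                self.tiers[order[idx + 1]][key] = value
                self._evict(order[idx + 1])
            # else: dropped entirely (cold tier full)

cache = TieredKVCache()
for i in range(5):
    cache.put(f"seq{i}", f"kv{i}")
assert "seq4" in cache.tiers["hot"] and "seq0" in cache.tiers["warm"]
assert cache.get("seq0") == "kv0"    # fetched from warm...
assert "seq0" in cache.tiers["hot"]  # ...and promoted back to hot
```

A real system migrates entries asynchronously under a unified index; here demotion happens synchronously on eviction, which is exactly the cross-tier migration cost whose latency the systems above try to hide.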
Six Months into Power-Reform "Document No. 136," the Matthew Effect Intensifies in New-Energy Asset After-Services
21世纪经济报道· 2025-08-25 06:13
Report by 21世纪经济报道 reporter Fei Xinyi, with intern Yu Mingwei. In January this year, the National Development and Reform Commission and the National Energy Administration jointly issued the "Notice on Deepening Market-Oriented Reform of On-Grid Electricity Pricing for New Energy to Promote High-Quality Development of New Energy" ("Document No. 136"). The policy not only ends the traditional fixed feed-in-tariff profit model for new-energy power stations but also pushes the electricity market from a policy-driven to a market-driven stage. Six months after Document No. 136 took effect, new-energy installed capacity keeps growing rapidly: in the first half of the year, China added 268 GW of renewable capacity, up 99.3% year on year and accounting for about 91.5% of all newly added capacity. Meanwhile, as market-oriented power reform deepens, the new-energy asset after-services industry is undergoing a profound metamorphosis: from an internal production function attached to generation groups, to O&M services becoming an independent track, to today's higher bar of full-chain asset-operation capability covering "O&M + trading + digitalization." The Matthew effect in the after-services market is amplifying. Beijing Xiehe O&M Wind Power Technology Co., Ltd. ("协合运维"), founded in 2007, has 18 years of new-energy asset-management experience; its chairman Lu Yichuan recently spoke with 21世纪经济报道. 协合运维 has gradually transformed from an internal "service department" of its group into a company providing professional operation services externally. "After 2020, our business positioning shifted to serving market-based demand," Lu recalled ...
Is Li Auto's VLA Really a VLA?
自动驾驶之心· 2025-08-21 23:34
Core Viewpoint - The article discusses the capabilities of the MindVLA model in autonomous driving, emphasizing its advanced scene understanding and decision-making abilities compared to traditional E2E models. Group 1: VLA Capabilities - The VLA model demonstrates effective defensive driving, particularly in scenarios with obstructed views, by smoothly adjusting speed based on remaining distance [4][5]. - In congested traffic situations, VLA shows improved decision-making by choosing to change lanes rather than following the typical detour logic of E2E models [7]. - The VLA model exhibits enhanced lane centering abilities in non-standard lane widths, significantly reducing the occurrence of erratic driving patterns [9][10]. Group 2: Scene Understanding - VLA's decision-making process reflects a deeper understanding of traffic scenarios, allowing it to make more efficient lane changes and route selections [11]. - The model's ability to maintain stability in trajectory generation is attributed to its use of diffusion models, which enhances its performance in various driving conditions [10]. Group 3: Comparison with E2E Models - The article highlights that E2E models struggle with nuanced driving behaviors, often resulting in abrupt maneuvers, while VLA provides smoother and more context-aware driving responses [3][4]. - VLA's architecture allows for parallel optimization across different scenarios, leading to faster iterations and improvements compared to E2E models [12]. Group 4: Limitations and Future Considerations - Despite its advancements, VLA is still classified as an assistive driving technology rather than fully autonomous driving, requiring human intervention in certain situations [12]. - The article raises questions about the model's performance in specific scenarios, indicating areas for further development and refinement [12].
$30 Million Raised, 20% Paid Conversion: How Voice-Input Tool Wispr Flow Precisely Found Its PMF
Founder Park· 2025-08-21 07:30
Core Insights - Wispr Flow successfully pivoted from hardware to software, focusing on a voice input tool that meets user needs, resulting in $30 million in funding and a 20% conversion rate to paid users [2][11] - The company experienced high user engagement, with active users averaging 100 dictations per day and keyboard input dropping to 25-30% of total input [2][13] Group 1: Company Transformation - Initially, the company developed a hardware device that lacked a clear consumer market, leading to its eventual failure [7][10] - The decision to pivot was driven by the realization that the software ecosystem was not ready for their hardware product, prompting a shift to focus solely on the Wispr Flow software [9][10] - The transition involved significant layoffs, reducing the team from 40 to 5 employees to ensure a focused and stable environment for the remaining staff [12][19] Group 2: Product Market Fit (PMF) - The company accelerated its product launch timeline, achieving a successful release of Wispr Flow within six weeks, which garnered millions of views and topped charts on Product Hunt [13][14] - The product resonated strongly with users, leading to a conversion rate of nearly 20% to paid subscriptions, significantly higher than the industry average of 3-4% [13][14] Group 3: Key Lessons Learned - Rapid decision-making and execution are crucial to avoid stagnation and ensure effective leadership during transitions [17] - It is essential to make decisive cuts in staffing to provide clarity and stability for the remaining team members [18] - Gathering genuine feedback from customers is vital, as assumptions about product desirability can lead to misguided efforts [20]
A 10,000-Word Guide to Building a Personal AI Assistant: From 0 to 1, Turning AI into a Top-Tier Thinking Partner
36氪· 2025-08-20 07:10
神译局 is 36氪's in-house translation team, covering tech, business, careers, and lifestyle, with a focus on new technologies, new ideas, and new trends from abroad. Editor's note: Stop believing in prompt-word magic. AI is more like a new colleague who needs onboarding: give it enough context and it becomes your dedicated thinking partner. If you're after the productivity gains everyone talks about and AI promises, read this guide. The article is a translation. First, an embarrassing confession: eleven months ago, I didn't use AI at work at all. My team and I were building AI products used by tens of thousands of people, yet when it came to my own work I resisted AI, and was even proud of it. I didn't want to sound like the online herd, and I worried that letting AI tools do things for me would erode my edge. When I tried ChatGPT, I found it disappointing for strategic and creative work, like consulting a chatty version of Wikipedia. Deep down I was frustrated, because I seemed not to know the magic-incantation prompt tricks that only influencers knew. Then my engineering team adopted a paperwork-heavy new flavor of Scrum for a big project, requiring me to write dozens of detailed user stories. I couldn't keep up and became the team's bottleneck. I did what any seasoned product manager would do: complain and grumble. Eventually my engineering manager, Oleksii, decided to take pity on me, ...
LLMs Are Unreliable Judges of Themselves! New Shanghai Jiao Tong University Research Reveals Flaws in the LLM-as-a-Judge Mechanism
量子位· 2025-08-17 03:43
Core Viewpoint - The article discusses the evolution of large language models (LLMs) from tools to evaluators, specifically focusing on their ability to judge AI-generated content, which has not been thoroughly validated for reliability and consistency with human judgment [1][6]. Group 1: Research Background - A fundamental question arises regarding whether AI evaluators can accurately identify who is speaking in a dialogue before assessing the model's performance [2]. - The research paper titled "PersonaEval: Are LLM Evaluators Human Enough to Judge Role-Play?" by a team from Shanghai Jiao Tong University introduces a new benchmark test called PersonaEval, aimed at evaluating LLMs' ability to identify speakers in dialogues [2][11]. Group 2: Testing Results - The results indicate that even the best-performing model, Gemini-2.5-pro, achieved an accuracy of only 68.8%, while the average accuracy of human participants was 90.8% [4][15]. - This significant gap highlights the current limitations of LLMs in accurately judging role-play scenarios [17]. Group 3: Model Evaluation and Challenges - The paper emphasizes that LLMs tend to focus on superficial language style rather than the underlying intent and context of the dialogue, leading to misjudgments [9][10]. - The PersonaEval benchmark is designed to align evaluations with human judgment and includes carefully selected distractors to challenge the models [13][12]. Group 4: Improvement Strategies - The authors explored two common strategies for improving model performance: training-time adaptation and test-time compute [18][20]. - Interestingly, fine-tuning models on role-related data did not enhance their identification capabilities and could even degrade performance, suggesting that rote memorization of character knowledge interferes with general reasoning abilities [20][22]. 
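The speaker-identification protocol described above, multiple-choice items scored by accuracy, can be sketched as follows. The two items and the keyword-matching "judge" are toy stand-ins (not the paper's data or models), chosen to show how a judge that latches onto surface mentions of names, rather than intent and context, misidentifies the speaker:

```python
def toy_judge(utterance: str, candidates: list[str]) -> str:
    """Picks the candidate whose name appears in the utterance, else the first.
    A deliberately shallow heuristic, standing in for a style-biased LLM judge."""
    for name in candidates:
        if name.lower() in utterance.lower():
            return name
    return candidates[0]

def evaluate(items, judge) -> float:
    """Accuracy of a judge on speaker-identification items."""
    correct = sum(judge(it["utterance"], it["candidates"]) == it["speaker"]
                  for it in items)
    return correct / len(items)

items = [
    # The speaker mentions someone else's name: surface matching misfires.
    {"utterance": "Elementary, my dear Watson.",
     "candidates": ["Sherlock Holmes", "Watson", "Moriarty"],
     "speaker": "Sherlock Holmes"},
    # No name appears at all: the judge falls back to a guess.
    {"utterance": "I fear nothing when I am in the right.",
     "candidates": ["Sherlock Holmes", "Watson", "Moriarty"],
     "speaker": "Watson"},
]
acc = evaluate(items, toy_judge)  # the shallow judge gets both items wrong
```

The hard distractors in the real benchmark play the same role as the first item here: they make surface language cues actively misleading, so only context-level reasoning scores well.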
Group 5: Future Directions - The research calls for a reevaluation of how to construct AI systems that align with human values and judgment, emphasizing the need for reasoning-oriented enhancement methods rather than merely increasing character knowledge [24][25].
Guotai Haitong | Industry: The Technical Evolution and Industry Insights of AI Agents
国泰海通证券研究· 2025-08-08 09:24
Core Insights - The evolution of AI Agents is fundamentally driven by the paradigm shift towards large language models (LLMs) as the "brain," showcasing commercial value through vertical applications that address specific industry pain points and high precision [1][2] - AI Agents are reshaping software development and human-computer interaction, transitioning from traditional architectures to modern LLM-based frameworks that enable autonomous planning, environmental perception, and tool invocation [1][2] Technical Evolution - The core of AI Agent's technological advancement lies in the significant changes introduced by modern LLM architectures, moving away from traditional architectures that were limited by hardware and pre-programmed rules [2] - The modern LLM-based Agent architecture consists of three main modules: brain, perception, and action, allowing multiple specialized agents to collaborate or compete to overcome the limitations of single agents in handling complex tasks [2] Industry Chain Formation - A complete industry chain is emerging with upstream dominated by a few tech giants providing foundational models and computing power, while the midstream sees the rise of open-source frameworks and platforms that lower development barriers [3] - Downstream applications are categorized into general-purpose agents for complex multi-step tasks and vertical agents deeply integrated with industry knowledge, showing significant commercial value in sectors like software development, law, finance, and healthcare [3] Challenges and Future Trajectory - Despite rapid advancements, AI Agents face challenges such as limitations in LLM's planning and reasoning capabilities, context window constraints, memory bottlenecks, multi-agent collaboration issues, and evaluation dilemmas [3] - The future development of AI Agents will depend on the continuous evolution of foundational LLMs, the proliferation of multimodal perception capabilities, and the restructuring of the software and hardware ecosystem, moving closer to AGI [3]
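The brain / perception / action split described above can be sketched as a minimal loop. The rule-based `brain` stands in for an LLM call and the single `search` tool is hypothetical; this illustrates only the module boundaries, not any production agent framework:

```python
def perceive(environment: dict) -> str:
    """Perception: turn raw environment state into a textual observation."""
    return f"pending task: {environment.get('task', 'none')}"

def brain(observation: str, tools: dict) -> str:
    """Brain: choose a tool for the observation (an LLM call in a real agent).
    Here a trivial rule: pick any tool whose name appears in the observation."""
    for name in tools:
        if name in observation:
            return name
    return "noop"

def run_agent(environment: dict, tools: dict, max_steps: int = 3) -> list[str]:
    """Loop: perceive -> decide -> act, until no tool applies or steps run out."""
    trace = []
    for _ in range(max_steps):
        observation = perceive(environment)
        action = brain(observation, tools)
        if action == "noop":
            break
        trace.append(tools[action](environment))  # Action: invoke the tool
    return trace

# One hypothetical tool that consumes the pending task and returns a result.
tools = {"search": lambda env: env.pop("task") and "searched: LLM agents"}
result = run_agent({"task": "search"}, tools)  # one search step, then idle
```

Multi-agent designs mentioned above extend this by letting several such loops, each with its own brain and tool set, exchange observations instead of sharing one environment.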