AI科技大本营
Over RMB 7.6 Million in Prize Money: Who Can Rebuild the Performance Foundations of DeepSeek and Kimi by Hand?
AI科技大本营· 2026-03-27 04:12
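Editor | 梦依丹 Produced by | AI 科技大本营 (ID: rgznai100) Extremely low inference latency, extremely high throughput, extremely large model scale: on the battlefield of large-model engineering, this was once widely regarded as an "impossible" triangle. Looking back at 2025, the DeepSeek-V3 technical report revealed a new-generation paradigm for inference on ultra-large-scale models. By compressing the KV cache by 93% with the MLA architecture and using MTP (Multi-Token Prediction) to substantially improve memory-access efficiency, it showed developers worldwide an engineering breakthrough: a trillion-parameter model delivering "high throughput and low latency" under large-scale concurrency. Yet here in 2026, FP8 precision and baseline architectures alone can no longer keep up with the explosive demand for instant responses. Under the flood of large-scale real-world concurrency, every millisecond of latency shaved off is tied directly to hundreds of millions' worth of compute cost and cluster efficiency. Against this "performance is life" backdrop, the 2026 online hackathon, the AMD E2E Model Speedrun global challenge, officially kicks off! AMD has teamed up with GPU MODE and put up USD 1.1 million for this global race, looking for top developers who can dismantle the low-level logic by hand and squeeze every last drop of potential out of AMD's flagship compute. Qualifiers: make the cut and take home USD 10,000. The competition uses a "qualifiers + end-to-end final exam ...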
Packed with Insight! Jensen Huang's Latest Claims: AGI Has Been Achieved, OpenClaw Is the iPhone of AI, and There Will Be 1 Billion Programmers
AI科技大本营· 2026-03-26 11:18
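Produced by | CSDN (ID: CSDNnews) Source | Lex Fridman Compiled by | 屠敏 Over the past thirty-plus years, NVIDIA has gone from a graphics accelerator card maker that came close to bankruptcy several times, to launching CUDA and opening the door to general-purpose computing, to proposing today's "AI factory" concept. Along the way it grew into the world's first technology company to surpass a USD 4 trillion market capitalization, reshaping almost the entire underlying form of the computing industry. At the helm is Jensen Huang (黄仁勋), a tech leader who washed dishes and cleaned toilets in a restaurant in his early years and later nearly became CEO of TSMC. He has admitted many times that if he had known at the outset how hard it would be, he "probably wouldn't have done it at all." So here comes the question: Recently, Lex Fridman, a well-known American computer scientist, AI researcher, and host of a popular podcast, sat down with Huang for an in-depth conversation lasting more than two hours. Starting from NVIDIA's history, the two ranged across China, Elon Musk, the programmer ecosystem, "moats" in the AI era, and data centers in space, and on to NVIDIA's internal management style of "almost no one-on-one meetings, with every issue discussed in the open." At this company, as many as 60 people report directly to Huang. Along the way, Huang also offered a series of sharply stated views and trend predictions: As for himself, his attitude is just as direct. He does not believe in succession planning, and ...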
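Agents Reshape the Software and Internet Industries: Preliminary Agenda Released for the 2026 Singularity Intelligent Technology Conference!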
AI科技大本营· 2026-03-25 01:35
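On April 17-18, the 2026 Singularity Intelligent Technology Conference (2026 奇点智能技术大会), jointly organized by CSDN and 奇点智能研究院, will be held at the Hyatt Regency Shanghai Global Harbor. The conference is a full upgrade of the "Global Machine Learning Technology Conference" (全球机器学习技术大会). Looking through the speakers' slides from the past two years, a technology trajectory running through the industry stands out clearly: from 2024, when the industry was preoccupied with polishing foundation models, deep pre-training, and the compute arms race; to 2025, when it settled into RAG deployment, lightweight fine-tuning, and first steps in engineering AI-assisted programming; to 2026, when the whole industry is focused on tackling agent engineering and building closed commercial loops. The conference has witnessed and led AI's evolution from a "high-IQ game" in the lab into a core productive force that drives quality and efficiency gains in enterprises. This year's conference brings together frontline AI practitioners from leading institutions and companies at home and abroad, including BAT, NVIDIA, AWS, Microsoft, Xiaohongshu, vLLM, JD.com, Kunlun Wanwei, and NetEase, with talks on frontier topics such as agent systems and engineering, AI-native application innovation and development practice, and AI Infra infrastructure and operations.
| April 17-18 · Hyatt Regency Shanghai Global Harbor | |
| --- | --- |
| 50+ 12 major | 1000+ |
| frontline technical, tracks covered, focused on AI-native | industry elites ...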
Will K8s Still Exist in 100 Years? Co-founder Brendan Burns: Like Linux, It Will Disappear Beneath AI
AI科技大本营· 2026-03-24 10:13
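Compiled by | 王启隆 Produced by | AI 科技大本营 (ID: rgznai100) "The inevitable trajectory of software is death." The earliest MVP of K8s took less than a week or so to write. A few people, a few machines, and a very rough demo: it could distribute containers, do the most basic load balancing, automatically restart a process that died, and switch from v1 to v2 during an upgrade. Seen from today, that opening looks almost shabby. It is hard to connect it with the Kubernetes that later rewrote the cloud-native landscape and reshaped almost the entire vocabulary of cloud computing. But what is most worth revisiting in this history is not how Kubernetes later became the de facto standard, but why it had to be built in the first place, and why it had to be open source. In the latest interview with Brendan Burns (Kubernetes co-founder, now Technical Fellow / CVP at Microsoft Azure), the most interesting part is not another retelling of the Kubernetes success story, but how he pulls many things people assume are already settled history back to the moment when the outcome was still unknown: Below is the full text of the conversation ...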
USD 1.1 Million Bounty! AMD Issues a Global Challenge: Who Can Break the Inference Speed Limits of DeepSeek and Kimi?
AI科技大本营· 2026-03-23 03:43
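Editor | 梦依丹 Produced by | AI 科技大本营 (ID: rgznai100) The 2026 online hackathon, AMD E2E Model Speedrun, has officially sounded the call to assemble! Do you dare take up the challenge? To everyone who loves hardcore technology: this time, let your code do the talking and let performance make your name. Now that top open-source models such as DeepSeek-R1 and Kimi K2.5 have established an industrial benchmark at the trillion-parameter scale, the real work of squeezing out peak performance has only just begun. Who can break through the memory wall under extreme concurrency? Who can deliver the most efficient operator rewrites with the most elegant code? On today's AI battlefield, speed is justice and throughput decides who survives. This top-tier geek showdown, jointly launched by AMD and GPU MODE, is now open to the world. Here, you will directly operate AMD's top-spec cloud GPU arrays built for large models. Here, there are no armchair-strategy slide decks, only hardcore raw speed and throughput. Here, your extreme optimizations can not only win substantial prize money in US dollars but will also be merged directly into mainstream open-source frameworks, defining the next-generation industrial standard for AI inference! A million in prizes, fame and fortune alike. The Top 10 teams that reach the finals are guaranteed a prize of at least USD 10,000! [Track 1] DeepSeek-R1-0528 (FP4 + MTP) | Grand prize: USD 350,000 [Track 2] Kimi K2.5 1T (FP4) ...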
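Andrej Karpathy's Latest Conversation: The First Customer of Future Software Is the Agent. How Much "Room for Humans" Is Left in the Software Industry?
AI科技大本营· 2026-03-22 09:23
"A skill, as I see it, is essentially a script for 'how to teach an agent to teach people.'" Compiled by | 王启隆 Produced by | AI 科技大本营 (ID: rgznai100) If anyone today can still represent deep learning research, real-world autonomous driving, LLM engineering intuition, and AI education all at once, Andrej Karpathy remains one of the few names. He was an early founding member of OpenAI, worked on Tesla AI and Autopilot, and at Stanford turned CS231n into the course that served as a generation's introduction to the field. He later became a top influencer in the AI circle on Twitter and coined "vibe coding" (氛围编程), a buzzword of 2025. Early this year, he turned his attention to a more cutting-edge and more unsettling question: as coding agents, continuously running "lobsters," and automated closed-loop systems like AutoResearch begin to take shape, where in the loop should humans still sit? Your bottleneck used to be typing speed, the speed at which you wrote code line by line yourself. But with these agents, the situation has changed completely. I would say the real change happened around last December. It was as if a switch had been flipped: I used to write roughly 80% of the code myself and delegate 20% ...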
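Jensen Huang in Conversation with 10 Open-Source AI Leaders: Future Compute Will Tilt Toward Post-Training, and OpenClaw Opens a New Vision for Modern Computing | GTC 2026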
AI科技大本营· 2026-03-20 00:56
Core Insights
- The future of AI is characterized by "harness engineering," which emphasizes the integration of models, tools, and systems rather than focusing solely on individual models [16][19][21]
- Open models are becoming increasingly significant, potentially forming the largest model group across various industries and applications [5][6]
- The discussion highlights a shift from viewing AI as merely models to understanding it as a complex system that includes agents, orchestration, and governance [10][12][26]

Group 1: Open Models and System Integration
- Jensen Huang emphasizes that open models collectively represent the second-largest model group globally, with the potential to become the largest in various applications [5][6]
- The conversation shifts from a binary view of open vs. closed models to a more nuanced understanding of how models, tools, and governance create a new system [6][10]
- The emergence of a third category of companies that utilize the best model APIs while developing their own tools and agents indicates a more complex software stack [11][12]

Group 2: The Role of Agents
- Agents are evolving from simple models to complex systems capable of handling multi-step tasks and integrating various resources [36][40]
- The concept of "agentic systems" is introduced, where agents can continuously process tasks and maintain state over time, moving beyond traditional models [36][40]
- OpenClaw is highlighted as a significant project that exemplifies the capabilities of agentic systems, showcasing a new paradigm in computing [38][39]

Group 3: Governance and Trust in AI
- The discussion emphasizes that the real challenge for enterprises is not whether agents can perform tasks, but how to govern and manage them effectively [52][56]
- Trust in AI systems is crucial, with open models being preferred for their transparency and verifiability, which helps build confidence in their deployment [56][67]
- The need for a governance framework that addresses data access, action capabilities, and accountability is underscored as organizations begin to integrate AI into their operations [52][56]

Group 4: The Importance of Open Models
- Open models are seen as essential for customization, control, and cost-sharing in AI development, allowing organizations to tailor solutions to their specific needs [66][68]
- The potential for open models to facilitate the creation of specialized digital twins in various fields, such as healthcare, is discussed [68][70]
- The conversation highlights the need for open infrastructure to support the ongoing development and deployment of open models, ensuring they remain viable in the long term [72][73]

Group 5: Future Directions and Industry Impact
- The integration of AI into various sectors, including coding, healthcare, and robotics, is expected to accelerate as agents become more capable and reliable [62][64]
- The discussion points to a broader trend where AI is not just about creating powerful models but about developing systems that can operate effectively in real-world environments [88][89]
- The emergence of AI factories or foundries is anticipated, enabling companies to access necessary computational resources without needing to own them outright [83][84]
GTC Summit Dialogue, Jeff Dean x Bill Dally: The Pre-Training Paradigm Is Dead, the Latency Bottleneck Is Not Compute, and a Deep Look at AI's Next Five Years | GTC 2026
AI科技大本营· 2026-03-19 02:08
Core Insights
- The dialogue between NVIDIA's Bill Dally and Google's Jeff Dean at GTC 2026 highlighted significant advancements in AI and machine learning, particularly in model capabilities and agent-based workflows [2][4][5].

Group 1: Model Advancements
- The past year has seen rapid improvements in model capabilities, particularly in areas requiring verifiable rewards, such as mathematics and programming [7][8].
- Models like Gemini have achieved remarkable success in complex tasks, winning gold medals in competitions like IMO and ICPC, showcasing their enhanced abilities [8][9].
- There is a notable shift towards agent-based workflows that can autonomously handle longer tasks without constant human supervision, indicating a significant evolution in AI capabilities [9][11].

Group 2: Inference and Latency
- A critical focus is on achieving ultra-low-latency inference to enhance the efficiency of autonomous systems, as inference latency directly impacts problem-solving efficiency [12][14].
- Dally emphasized the need to redesign architectures to minimize communication delays, which are a major source of latency in large language models (LLMs) [18][19].
- Innovations in on-chip communication and physical interfaces are being pursued to reduce latency from hundreds of nanoseconds to approximately 30 nanoseconds [20][21].

Group 3: Future of AI and Hardware
- The discussion touched on the potential for AI to autonomously design future models, with Dean noting that while the complete closed-loop system is not yet realized, early forms are emerging [27][29].
- The hardware landscape is expected to evolve, with a clear distinction between training and inference hardware, as inference becomes increasingly critical in data centers [78][80].
- Dally highlighted the importance of future-proofing hardware to adapt to rapidly changing model requirements, emphasizing the need for efficient resource allocation [43][46].

Group 4: Data Utilization and Scaling
- There is a belief that there is still a vast amount of untapped data available for training models, particularly in video and real-world scenarios [57][58].
- The conversation also explored the challenges of scaling models when data availability becomes constrained, with Dean suggesting that synthetic data generation could fill this gap [60][61].
- Techniques like data augmentation and regularization are seen as valuable methods to enhance model training without overfitting [67].

Group 5: AI in Chip Design
- AI is increasingly being integrated into the chip design process, with systems like NVCell significantly reducing the time and effort required for tasks that previously took months [104][106].
- The use of AI in design verification and bug reporting is also improving productivity, allowing junior designers to access information without constantly consulting senior staff [112][116].
- The potential for AI to automate various stages of chip design is recognized, with aspirations for a future where design can be initiated with simple commands [122].

Group 6: Societal Impact of AI
- The dialogue concluded with reflections on the positive societal impacts of AI, particularly in education and healthcare, where personalized learning and health coaching could revolutionize these fields [160][161].
- Both Dally and Dean expressed excitement about the potential for AI to provide personalized tutoring and health advice, enhancing individual learning and health outcomes [162][178].
Can One Brain Really Control Every Robot? Tesla, Skild AI, and Agility Debate the Path to Mass-Producing Humanoid Robots | GTC 2026
AI科技大本营· 2026-03-18 07:52
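Humanoid robots are finally about to leave the lab, but the real war has only just begun. Editor | 王启隆 Produced by | AI 科技大本营 (ID: rgznai100) In his GTC keynotes over the past two years, Jensen Huang has brought up robots and Physical AI almost every time. Listening to this in the past, people still felt a certain distance, a sense that "the future is here but has not landed yet": the models were strong, the simulations were lively, and the videos were stunning, but when robots would truly leave the lab and enter factories, warehouses, homes, and all kinds of complex real-world sites remained an open question. At GTC 2026, the mood around this clearly changed. The title of the roundtable NVIDIA arranged this year is blunt: "From Concept to Production: Humanoid Robotics at Scale." In plain terms, the focus of discussion on humanoid robots is no longer "can it be built" but "how to turn it into a real product and roll it out into the real world." The people invited to this conversation are not there to talk concepts, either. The moderator is Amit Goel, NVIDIA's head of robotics and edge computing ecosystem; the guests include Ashok Elluswamy, Tesla's VP of AI Software, Arnaud Robert, CEO of Hexagon Robotics, Agility CTO Pras Vel ...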
OpenClaw, Enterprise Agent Deployment, and More: Hardcore Topics Announced for the 2026 Singularity Intelligent Technology Conference
AI科技大本营· 2026-03-17 08:27
Core Insights
- The article highlights the ongoing transition from "technical experimentation" to "engineering paradigm shift" as large models and AI agents become deeply integrated into production environments [2][3]
- It emphasizes the need for a comprehensive understanding of this transformation among developers, industry experts, and business leaders, as well as the importance of establishing engineering standards and safety systems to match the rapid advancements in AI technology [2][3]

Group 1: Conference Overview
- The "2026 Singularity Intelligent Technology Conference" aims to address how to systematically understand the ongoing transformation and find pathways for adaptation [3]
- The conference will explore 12 cutting-edge topics, including multimodal models, AI-native development, and agent systems, to create a forward-looking and practical cognitive map for navigating this "tenfold speed transformation" [5]

Group 2: Key Topics and Speakers
- The "Evolution of Large Language Model Technology" session will feature top scholars and experts who will construct a new coordinate system for the evolution of large model technology [7]
- The "Agent Design Patterns and Deep Water Landing" session will focus on building reliable agents, moving away from "blind box" development [13]
- The "OpenClaw Industry Practice" session will provide a complete guide for IT leaders and tech enthusiasts on introducing digital employees and adapting to the OpenClaw framework [17]

Group 3: AI Infrastructure and Operations
- The "AI Infra Infrastructure and Operations" session will present practical guides for transforming operational systems using agent-based approaches, aimed at infrastructure engineers and system architects [21][24]
- The session will include insights on automating operations for multi-GPU clusters and enhancing infrastructure with self-awareness and repair capabilities [24]

Group 4: AI Application and Industry Practices
- The "AI Native Application Innovation and Development Practice" session will showcase successful AI applications that have achieved significant user engagement and valuation, focusing on engineering practices that led to their success [25]
- The "AI + Industry Landing Practices" session will provide methodologies for converting large models into tangible business ROI across various sectors, including e-commerce and finance [29]

Group 5: Multimodal and Embodied Intelligence
- The "Multimodal and World Models" session will cover the underlying technical secrets from video generation to multimodal document understanding, providing a comprehensive engineering path for deployment [39][41]
- The "Embodied Intelligence and Intelligent Hardware" session will offer methodologies for achieving large-scale practical applications in high-risk environments, focusing on visual perception and control [47][51]

Group 6: Future of AI
- The conference serves as a platform for deep communication in the tech field and aims to promote AI ecosystem integration and industry collaborative innovation [53]
- It invites global AI industry participants to capture cutting-edge trends and explore paths for industrial upgrades, contributing to the broader application of AI [53]