SEEK Limited (SKLTY)
DeepSeek Quietly Open-Sources LPLB: Using Linear Programming to Fix MoE Load Imbalance
36Kr· 2025-11-20 23:53
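Yesterday, DeepSeek published a new repository on GitHub: LPLB. Project URL: https://github.com/deepseek-ai/LPLB

There was no tweet and no WeChat official-account update, and the few posts about it from tech bloggers drew little attention. So far the project has not passed 200 stars.

On closer inspection, though, the project looks far from trivial and deserves more attention. X user gm8xx8 commented that it shows DeepSeek is working through correctness and throughput bottlenecks in preparation for its next model release.

Project overview

As the name suggests, LPLB is a parallel load balancer that uses linear programming to optimize how expert-parallel workloads are distributed in MoE (Mixture-of-Experts) models.

Specifically, LPLB achieves dynamic load balancing in three steps:
- Dynamic reordering: reorder experts based on workload statistics.
- Replica construction: build expert replicas in combination with the static topology.
- Optimal assignment: for each batch of data, solve for the optimal token assignment.

More specifically, LPLB's expert reordering is assisted by EPLB. Real-time workload statistics can be supplied by the user, through torch.d ...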
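The third step, solving for a per-batch token assignment, can be illustrated with a toy linear program. The sketch below is not LPLB's code or its actual formulation, which the snippet above does not describe. It assumes each expert has a home GPU and optional replicas on other GPUs, and it minimizes the load of the busiest GPU. The names `simplex_max` and `balance`, the data layout, and the small pure-Python simplex solver are all hypothetical and chosen only to keep the example self-contained.

```python
from fractions import Fraction


def simplex_max(c, A, b):
    """Maximize c.z subject to A z <= b, z >= 0, assuming every b[i] >= 0.

    Dense tableau simplex with Bland's rule; exact arithmetic via Fraction.
    """
    m, n = len(A), len(c)
    # Rows are [A | I | b]; the last row holds the negated objective.
    T = [[Fraction(v) for v in A[i]]
         + [Fraction(int(i == j)) for j in range(m)]
         + [Fraction(b[i])] for i in range(m)]
    T.append([Fraction(-v) for v in c] + [Fraction(0)] * (m + 1))
    basis = [n + i for i in range(m)]  # start from the all-slack basis
    while True:
        # Entering variable: lowest index with a negative reduced cost.
        col = next((j for j in range(n + m) if T[-1][j] < 0), None)
        if col is None:
            break  # no improving direction left: optimal
        # Ratio test; ties go to the lowest basis index (Bland's rule).
        ratios = [(T[i][-1] / T[i][col], basis[i], i)
                  for i in range(m) if T[i][col] > 0]
        if not ratios:
            raise ValueError("LP is unbounded")
        r = min(ratios)[2]
        pivot = T[r][col]
        T[r] = [v / pivot for v in T[r]]
        for i in range(m + 1):
            if i != r and T[i][col] != 0:
                f = T[i][col]
                T[i] = [a - f * p for a, p in zip(T[i], T[r])]
        basis[r] = col
    z = [Fraction(0)] * n
    for i, var in enumerate(basis):
        if var < n:
            z[var] = T[i][-1]
    return z, T[-1][-1]


def balance(loads, home, replicas, num_gpus):
    """Split each expert's tokens between its home GPU and its replicas
    so that the busiest GPU carries as few tokens as possible.

    loads[e]    tokens routed to expert e in this batch
    home[e]     GPU that holds expert e
    replicas[e] other GPUs that hold a copy of expert e
    """
    base = [0] * num_gpus  # per-GPU load before any redirection
    for e, w in enumerate(loads):
        base[home[e]] += w
    worst = max(base)
    # One variable per (expert, replica GPU): tokens redirected to that copy.
    moves = [(e, g) for e in range(len(loads)) for g in replicas[e]]
    n = len(moves) + 1  # final variable s = reduction of the peak load
    A, b = [], []
    for g in range(num_gpus):  # constraint: load(g) <= worst - s
        row = [0] * n
        for k, (e, dst) in enumerate(moves):
            row[k] += (dst == g) - (home[e] == g)
        row[-1] = 1
        A.append(row)
        b.append(worst - base[g])
    for e, w in enumerate(loads):  # cannot redirect more than the expert has
        if replicas[e]:
            A.append([int(src == e) for src, _ in moves] + [0])
            b.append(w)
    z, s = simplex_max([0] * len(moves) + [1], A, b)
    return worst - s, {moves[k]: z[k] for k in range(len(moves))}


# Toy batch: expert 0 is hot; experts 0 and 1 each have one replica.
peak, shifted = balance(loads=[90, 30, 30], home=[0, 1, 2],
                        replicas=[[1], [2], []], num_gpus=3)
print(peak, shifted)
```

In this toy batch the three GPUs start at 90, 30, and 30 tokens. The solution redirects 40 tokens of expert 0 to GPU 1 and 20 tokens of expert 1 to GPU 2, so every GPU ends at 50. Substituting the peak load as `worst - s` keeps every right-hand side non-negative, which lets the solver start from the slack basis without a separate feasibility phase.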
SEEK Limited (SKLTY) Shareholder/Analyst Call Transcript
Seeking Alpha· 2025-11-19 07:38
Presentation
Graham Goldsmith
Well, good afternoon, shareholders, visitors and SEEK team members. Welcome to SEEK's 2025 Annual General Meeting. I'm Graham Goldsmith, the Chairman of SEEK Limited, and thank you for your attendance today. Today, we are hosting this Annual General Meeting on the traditional lands of the Wurundjeri Woi-wurrung peoples of the Kulin Nation. On behalf of the Board of SEEK, I would like to pay my respects to the traditional custodians, elders past and present and extend that respec ...
DPP Authorities Ask Public to Avoid Downloading DeepSeek; State Council Taiwan Affairs Office Responds
Ren Min Ri Bao· 2025-11-19 05:09
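Zhu Fenglian, spokesperson for the State Council Taiwan Affairs Office, said that the mainland's artificial intelligence technology is innovating at an accelerating pace and benefiting the world, that many large language models are widely used across industries, and that they also provide the public with personalized learning and convenient everyday services. The DPP authorities, she said, out of political self-interest in seeking "independence," both fear and resent the mainland's high-tech products and habitually restrict them on security grounds, which will only harm the interests of Taiwan's businesses and people and is resented and opposed across the island.

On November 19, the State Council Taiwan Affairs Office held a regular press conference. A reporter asked: The DPP authorities claim that DeepSeek and other mainland "generative AI language models" produce "seriously biased and false information" and are asking the public to avoid downloading them. What is your comment? ...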
US Releases Large-Model Evaluation Report: DeepSeek Performs Poorly and Is Unsafe
Tai Mei Ti APP· 2025-11-19 00:07
Core Insights
- A report by NIST's CAISI evaluates the performance, cost, and security of China's DeepSeek AI models against leading U.S. AI models and finds that the U.S. models outperform DeepSeek overall [1]

Performance Comparison
- The evaluation ran 19 benchmarks across seven key areas. U.S. models, GPT-5 in particular, performed better on software engineering and cybersecurity tasks. For instance, GPT-5 reached 68.9% accuracy in cybersecurity while DeepSeek-V3.1 reached only 36.7%, a difference of 32.2 percentage points [2]
- In software engineering, GPT-5 scored 75.8% against DeepSeek-V3.1's 54.8%, a gap of 21 percentage points, which points to the technical advantage of U.S. models in critical tasks such as code analysis and vulnerability detection [2]

Cost Efficiency
- The report found that GPT-5-mini not only outperformed DeepSeek-V3.1 but also had a token cost 35% lower, challenging the perception that U.S. models are more expensive [3]
- CAISI's director emphasized the importance of weighing both performance and cost efficiency when selecting AI models, suggesting that U.S. models offer better value [3]

Security Assessment
- DeepSeek models showed significant security vulnerabilities. The DeepSeek-R1-0528 model had a hijacking probability of 37%-49%, 12 times higher than that of U.S. models. In jailbreak attack tests, DeepSeek complied with 94% of malicious requests, compared with 8% for U.S. models [3]
- Compromised DeepSeek agents were able to perform high-risk operations, including sending phishing emails and downloading malware [3]

Ideological Alignment
- The evaluation indicated that DeepSeek models are more likely to propagate specific ideological content consistent with their training data, repeating certain narratives 2 to 4 times more often than U.S. models, with variations depending on language and topic [4]

Usage Trends
- Despite the identified deficiencies, usage of DeepSeek is rising: downloads have increased nearly 1000% since January 2025, and API requests have surged by 5900% on certain platforms [5]
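The benchmark gaps in the summary are stated in percentage points, which is not the same as a relative shortfall. As a rough check, the sketch below recomputes both from the four scores quoted above. The `scores` layout and the `gap` helper are illustrative names only and are not part of the CAISI report.

```python
from decimal import Decimal

# Accuracy scores quoted in the summary, in percent: (GPT-5, DeepSeek-V3.1).
scores = {
    "cybersecurity": (Decimal("68.9"), Decimal("36.7")),
    "software engineering": (Decimal("75.8"), Decimal("54.8")),
}


def gap(us, ds):
    """Return (gap in percentage points, shortfall relative to the U.S. score in %)."""
    points = us - ds
    relative = (points / us * 100).quantize(Decimal("0.1"))
    return points, relative


gaps = {area: gap(us, ds) for area, (us, ds) in scores.items()}
for area, (points, relative) in gaps.items():
    print(f"{area}: {points} points, {relative}% below the U.S. score")
```

Decimal arithmetic is used so that the subtraction reproduces the quoted 32.2 and 21 exactly, without binary floating-point rounding noise.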
Alibaba's Qianwen App Reaches Top Four of Apple's App Store Overall Chart the Day After Launch, Overtaking DeepSeek
Zheng Quan Ri Bao Wang· 2025-11-18 07:13
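(Reporter Liang Aonan) On November 18, Qianwen, Alibaba's newly launched AI app, surged to fourth place on Apple's App Store overall free-app chart the day after its public beta went live, overtaking DeepSeek. Its popularity at one point congested the servers. The topic "Alibaba Qianwen is down" hit the trending list, and the official account replied with a humorous "I'm doing fine," which indirectly confirmed the heavy traffic on the first day of the public beta.

Alibaba said the Qianwen app's strategic goal is to become the future "gateway to AI-powered life," a personal AI assistant that "can chat and get things done." Beyond intelligent conversation, "getting things done" will be its main focus. Qianwen can already handle complex tasks such as generating a PPT from a single instruction, and it has beaten top global models in a live-trading investment competition. Alibaba reportedly plans to connect maps, food delivery, ticket booking, office tools, and other everyday scenarios to Qianwen to build stronger task-handling capabilities.

The Qianwen app's confidence comes from the strong performance and broad influence of Alibaba's open-source Qwen model family. Since going fully open source in 2023, Qwen models have been downloaded more than 600 million times worldwide. The recently released flagship model Qwen3-Max has surpassed leading international models such as GPT-4 and Claude 3 Opus in performance.

The launch signals that Alibaba is going all in on the AI to C market. On November 17, Alibaba officially announced the "Qianwen" project, calling it "the battle for the future in the AI era." The Qianwen app is free to use and plans to ...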
From DeepSeek to Qianwen and Lingguang: Hangzhou's AI Dream Team Leads the 2025 AI Boom
Di Yi Cai Jing Zi Xun· 2025-11-18 06:40
Core Insights
- Alibaba and Ant Group are stepping up their AI application ambitions, launching new products to compete directly with established players like ChatGPT in the overseas market [1][4]
- The AI application landscape is evolving rapidly, with a focus on user engagement and on versatile AI tools that serve a range of user needs [3][5]

Group 1: Product Launches and Features
- Alibaba's Qianwen app and Ant Group's Lingguang AI assistant are positioned to challenge existing AI applications, with Lingguang supporting multi-modal outputs and rapid application generation [1][3]
- Lingguang is described as a comprehensive AI assistant that can generate structured, visualized responses, including 3D models and interactive maps, within 30 seconds [3][5]
- Alibaba's Quark has also integrated an AI conversational assistant, extending its functionality across multiple everyday scenarios [3][4]

Group 2: Market Dynamics and Competition
- Competition among major players such as Alibaba, Ant Group, and ByteDance is intensifying, and a clear division is emerging in the AI landscape, summed up as "South Alibaba, North Byte" [4][6]
- 2025 is expected to be a pivotal year for AI applications, with strong user engagement and technological advances driving the market [4][5]
- Addressing user pain points through consumer-facing (C-end) applications is seen as crucial for the commercialization of AI [4][5]

Group 3: Industry Trends and Future Outlook
- User adoption in the AI application sector is surging: by the end of 2024, the user base for generative AI products in China had reached 249 million, or 17.7% of the population [5][6]
- The emergence of the "Hangzhou AI Dream Team" highlights the importance of industry clustering in fostering innovation and competition in AI applications [6][7]
- The AI landscape is becoming a strategic battleground for user attention, with major companies vying for dominance in the AI ecosystem [10][11]
Heaviest Selling Pressure Since the "DeepSeek Shock"! US AI Giants Borrow to Bet Big on Compute. Is Wall Street Buying It?
Di Yi Cai Jing· 2025-11-17 09:21
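Over the past week, popular artificial intelligence (AI) stocks went through a wave of selling that Goldman Sachs called the largest momentum drawdown since the "DeepSeek shock."

According to information from Goldman Sachs' trading desk obtained by a Di Yi Cai Jing reporter, the AI sell-off was driven by several factors: power bottlenecks that could slow the US in the AI race, growing suspicion that AI involves "too much spending and too little return," SoftBank's sale of Nvidia shares, and falling odds of a Federal Reserve rate cut in December.

Last Thursday, any stock seen as having a flawed business model or an overstretched valuation came under heavy selling pressure: Oracle fell 4%, CoreWeave 16%, Nebius 6%, and Palantir 6.5%. These companies have all been sought-after "dark horses" over the past year, and many of them had gained more than 100%.

Earlier, large bond-issuance plans from tech giants such as Meta, Alphabet, and Oracle had jolted the market, with some maturities as long as 40 years. This also marks year one of the AI bond market: these "cash cows" have started borrowing to fund the "AI gamble" and the "compute arms race." With the borrowing coinciding with rising talk of an "AI bubble," doubts are growing about whether the AI giants' enormous capital expenditure can earn a medium- to long-term return.

For Wall Street investors, is this an inventive use of financial leverage or the beginning of debt risk? And how will it affect tech giants' share prices? Di Yi Cai Jing reporters interviewed several senior bankers and bond strategists at Wall Street investment banks.

"Pace is critical"

At present, AI companies are undoubtedly in the midst of a " ...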
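Speculative Themes Are All Being Dumped! Goldman Sachs Trading Desk: Thursday's US Momentum Trade Posts Its Biggest Drop Since the DeepSeek Shock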
Hua Er Jie Jian Wen· 2025-11-14 13:25
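Market concern over the enormous financing needs of the AI cycle is building. US tech stocks sold off sharply, momentum strategies were hit hard, and speculative segments such as AI-themed baskets and bitcoin-sensitive stocks were also dumped.

The Nasdaq 100 fell more than 2% on Thursday, closing at the day's low, and has fallen on five of the past six trading days. The selling hit momentum strategies and AI-related names hardest. Although the Nasdaq 100 is only about 5% below its all-time high and is trying to hold the key support of its 50-day moving average, sentiment has clearly turned defensive.

With only 30 full trading days left in the year, investors are waiting for the market narrative and price action to stabilize. Goldman Sachs noted that momentum exposure in its prime brokerage book remains elevated (76th percentile over one year, 88th percentile over five years), while the seasonal pressure of year-end de-risking and tax-loss harvesting is starting to show. For now, investors are waiting for clearer signals, whether from Nvidia's earnings beating expectations or from a clearer Federal Reserve policy path, to judge when the sell-off will stabilize.

Goldman Sachs data show that its high-beta momentum pair trade (GSPRHIMO) plunged 7% on Thursday, its second-worst performance this year and its largest one-day drop since the DeepSeek episode. Speculative segments including AI-themed baskets, bitcoin-sensitive stocks, and quantum computing were all sold heavily.

Five pressures, including the AI bubble, trigger the sell-off

Goldman Sachs' trading team listed five triggers for the decline. First, ahead of Nvidia's earnings report next week, investors chose ...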
Lei Jun Poaching Talent with an Eight-Figure Salary? Key DeepSeek Developer Joins Xiaomi
Sou Hu Cai Jing· 2025-11-12 10:13
Core Insights
- The article discusses Xiaomi's recent hiring of Luo Fuli, whose work focuses on advancing artificial general intelligence (AGI) through innovative research [1][7]
- Luo Fuli has a strong background in AI, having previously worked at Alibaba and DeepSeek, and now leads Xiaomi's large AI model team [6][7]

Group 1: Company Developments
- Luo Fuli announced her new role at Xiaomi, emphasizing the company's commitment to building a future in which AI moves from language into the physical world [1]
- Xiaomi's AI team collaborated with Peking University on a paper about MoE and reinforcement learning, with Luo Fuli among the contributors [4]
- Xiaomi has been building up its GPU resources, starting with 6,500 GPUs and planning a larger GPU cluster to strengthen its AI model development [7]

Group 2: Research and Innovations
- Xiaomi has made significant progress in AI model development and has open-sourced several models this year, including Xiaomi MiMo for reasoning and Xiaomi MiMo-VL for multimodal applications [9]
- The company has kept improving its models, with updates that enhance reasoning as well as document, GUI, and video understanding [9]