Seek (SKLTY)

Upset: in the first large-model tournament, did Grok 4 play a "divine move"? DeepSeek and Kimi knocked out
36Kr· 2025-08-07 01:16
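An AI chess showdown? This time it is for real. Google's Kaggle has launched the first global AI chess tournament, pitting eight top language models against each other, where a single move can decide the outcome. The tournament opens with a bold setup: eight of the world's strongest language models face off at chess. Closed-source models: Gemini 2.5 Pro, OpenAI o4-mini, Grok 4, OpenAI o3, Claude Opus 4, and Gemini 2.5 Flash. Open-source models: DeepSeek R1 and Kimi K2 Instruct. Day one is over. At 1 a.m. today the tournament began with the quarterfinal knockout round: Gemini 2.5 Pro, o4-mini, Grok 4, and o3 swept their opponents 4-0 and advanced to the semifinals, while Claude Opus 4, DeepSeek R1, Gemini 2.5 Flash, and Kimi K2 collapsed before getting through the middlegame and were eliminated. In the second day's semifinals, OpenAI's o4-mini and o3 will face each other, while Gemini 2.5 Pro meets Grok 4. The event is run by Google's Kaggle, which built "Game Arena", a competitive platform for general-purpose large models, for the purpose. Google says that, for evaluating models and agents, games are ...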
Match report: Musk's Grok 4 dominates the AI chess tournament, DeepSeek loses to o4-mini, and fans cry foul for Kimi K2
36Kr· 2025-08-06 08:41
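Latest report: in the first AI chess tournament, Musk's Grok 4 is "far ahead". Yes, Google has set up a chess competition for large models: the Kaggle AI chess tournament. After the first day, the contestants (OpenAI's o3 and o4-mini, DeepSeek R1, Kimi K2 Instruct, Gemini 2.5 Pro and 2.5 Flash, Claude Opus 4, and Grok 4) had all played their first-round matches. The result: Grok 4 performed best; DeepSeek R1 played strongly but lost to o4-mini; and Kimi K2 fared worst, to the point that netizens protested on its behalf. With his own Grok 4 doing well, Musk naturally did not pass up the PR opportunity, though his response came across as a humblebrag: "We didn't train for this deliberately; it's just a side effect." The tournament was announced by Google as part of promoting the Kaggle Game Arena, and the first competition starts with chess. Matches are livestreamed daily at 10:30 a.m. Pacific Time from August 5 to August 7. To be fair, who would deliberately, specifically for such a "silly" competition, ...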
DeepSeek has finally forced OpenAI's hand
Feng Huang Wang· 2025-08-06 08:21
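Summary: The explosive growth of China's open-source models could hardly fail to touch a nerve at OpenAI, and in Silicon Valley. In the early hours of August 6, Beijing time, OpenAI unexpectedly released GPT-OSS, its first open-source language model since GPT-2, dropping a bombshell on the global tech community. Specifically, gpt-oss-120b uses a mixture-of-experts (MoE) architecture with 117 billion parameters, of which about 5.1 billion are active; it runs on a single 80GB GPU and performs close to the closed-source o4-mini. gpt-oss-20b is also MoE-based, with 21 billion parameters and about 3.6 billion active; it runs smoothly on devices with 16GB of memory and performs close to o3-mini. Over the past few years, OpenAI has consistently followed a "closed-source + paid" path. Neither GPT-4 nor GPT-4o had its core model opened, and the industry came to believe that "the strongest models will never be open-sourced". GPT-OSS breaks that consensus. According to OpenAI, GPT-OSS is a "small but efficient" language model trained on multilingual, multi-domain data. More importantly, OpenAI states that the model "can be used commercially for free", which for AI startups in China and worldwide is practically a godsend. Preparing to declare war on Chinese models? As the pioneer of the ChatGPT era, OpenAI's move signals a major shift: ...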
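As a rough illustration of what "active parameters" means for the two MoE models, the sketch below computes the share of parameters used per token from the counts quoted in the summary above. The dictionary layout and the function name `active_fraction` are illustrative assumptions, not anything published by OpenAI.

```python
# Parameter counts in billions, taken from the summary above.
# The structure and names here are hypothetical, for illustration only.
MODELS = {
    "gpt-oss-120b": {"total_b": 117.0, "active_b": 5.1},
    "gpt-oss-20b": {"total_b": 21.0, "active_b": 3.6},
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Return the share of a mixture-of-experts model's parameters used per token."""
    if total_b <= 0:
        raise ValueError("total parameter count must be positive")
    return active_b / total_b


if __name__ == "__main__":
    for name, params in MODELS.items():
        share = active_fraction(params["total_b"], params["active_b"])
        print(f"{name}: {share:.1%} of parameters active per token")
```

On these figures, only a small share of each model's weights participates in any single token, which is consistent with the summary's point that the models run on modest hardware relative to their total size.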
Is this a joke? In the first large-model tournament, DeepSeek and Kimi are eliminated in the first round
36Kr· 2025-08-06 08:01
Group 1
- The article covers the first chess tournament for large language models, in which Grok 4 is highlighted as a leading contender for the title [1][24].
- Gemini 2.5 Pro, o4-mini, Grok 4, and o3 each won their opening match 4-0 and advanced to the semifinals [1][9].
- The event is hosted on the Kaggle Game Arena platform, which aims to evaluate large language models (LLMs) in dynamic, competitive environments [1].

Group 2
- Kimi K2 lost 0-4 to o3, struggling to find legal moves after the opening phase, which points to possible technical issues [3][6].
- DeepSeek R1 lost 0-4 to o4-mini, showing a pattern of strong initial moves followed by significant errors [10][13].
- Gemini 2.5 Pro won 4-0 against Claude Opus 4, but its true strength remains uncertain because of the opponent's mistakes [14][18].
- Grok 4 was particularly impressive, winning 4-0 against Gemini 2.5 Flash and showing a strong ability to capture unprotected pieces [21][27].

Group 3
- The article notes three main weaknesses of current AI models at chess: insufficient visualization of the whole board, limited understanding of piece interactions, and trouble executing legal moves [27].
- Grok 4's success suggests it may have overcome these limitations, which raises the question of whether the models' strengths and weaknesses will hold in later matches [27].
- In a poll taken before the competition began, 37% of participants picked Gemini 2.5 Pro as the likely winner [27].
OpenAI releases open-source models in a "return of the king": will the DeepSeek story be reversed?
Hu Xiu· 2025-08-06 03:47
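Clement Delangue, founder and CEO of Hugging Face, the largest open-source community, called it a "return of the king": "This is like a plot twist, like a return of the king." OpenAI has finally returned to releasing open-source models, gpt-oss-120b and gpt-oss-20b, its first open-source language models since GPT-2. They are also the strongest open-source models from a US AI lab since the release of DeepSeek-R1 in the first half of the year set off a wave of open-source releases in China, followed in July by a dense run of Chinese launches including K2, GLM-4.5, Step-3, and updated versions of Qwen3. With the Llama 4 launch having flopped in the first half of the year, and with both government and industry in the US anxious about falling behind China in open-source AI, OpenAI looks set to win one back. "It feels like the start of something big. Let's push open-source AI forward together." gpt-oss vs. DeepSeek: Stability AI founder Emad Mostaque and others compared gpt-oss with DeepSeek. Training efficiency: gpt-oss-120b activates about 5.1B parameters per token versus 37B for DeepSeek, more than 7 times fewer, so it could process more than 5 times as many tokens, roughly 80 trillion (for reference, Qwen3 used 30 trillion). Compute consumption: gpt-oss, compared with DeepSeek, ...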
OpenAI releases low-cost models, competing head-on with Meta (META.US) and DeepSeek
Zhitong Finance· 2025-08-06 01:53
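Zhitong Finance APP has learned that on Tuesday OpenAI released its first open-weight language models since launching GPT-2 in 2019. The two text-only models, named gpt-oss-120b and gpt-oss-20b, are intended to give developers, researchers, and enterprises lower-cost options that are easier to run and customize. A model is considered open-weight when its parameters (the elements optimized during training that shape its outputs and predictions) are publicly available. Such models offer transparency and control, but they differ from open-source models, which give users full access to the source code and let them modify it. In recent years, companies including Meta, Microsoft-backed Mistral AI, and Chinese startup DeepSeek have also released open-weight models. The release was highly anticipated, partly because the company had delayed it several times. OpenAI CEO Sam Altman wrote on X in July that more time was needed "to run additional safety tests and review high-risk areas"; in June he had also made clear that the model would not be released that month. OpenAI said on Tuesday that it had carried out comprehensive safety training and testing on the open-weight models. During pre-training it filtered out harmful data related to chemical, biological, radiological, and nuclear weapons, and it simulated the fine-tuning that malicious actors might attempt. The tests showed that the maliciously fine-tuned models could not reach the high-capability threshold set by its "Preparedness Framework", the system the company uses to measure and guard against harms. OpenAI also disclosed that it had invited three ...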
Who is slipping ads into the answers of DeepSeek and its peers?
36Kr· 2025-08-04 09:37
Core Viewpoint
- AI is transforming workplaces and daily life, shifting user behavior from "searching" to "asking AI" for solutions. AI search users grew from 310 million in January 2024 to 1.98 billion by February 2025, an increase of 538.7% [1].

Group 1: User Experience and Concerns
- Users increasingly question whether AI-generated answers contain advertisements. One user, Zhao Xinting, noticed brand mentions in AI responses and doubted their authenticity [1][4].
- Social media platforms are full of users complaining that AI responses are turning into "advertising space", citing examples of tools such as DeepSeek and Doubao including promotional content in their answers [5][9].

Group 2: Marketing Opportunities
- The rise of AI has created new marketing opportunities, particularly Generative Engine Optimization (GEO), which tries to influence AI outputs by producing content that matches what AI models prefer, much as traditional Search Engine Optimization (SEO) targets search engines [10].
- The GEO market is projected to grow quickly, from an estimated 2.1 billion yuan in 2023 to 24.2 billion yuan by 2027, and the article projects that the market value affected could exceed 300 billion yuan over the next five years [14].

Group 3: Service Providers and Pricing
- GEO service companies are emerging that optimize brand visibility in AI responses. Pricing is based on the number of keywords and entries, ranging from 6,000 yuan for 50 entries to 20,000 yuan for 500 entries per month [12][13].
- Effectiveness is measured by how often a brand is mentioned in AI responses, and some companies guarantee results or offer refunds if the results are unsatisfactory [14].
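The 538.7% figure follows from the two user counts quoted in the summary. The sketch below reproduces the arithmetic; the function name `growth_rate_pct` is a hypothetical helper introduced for illustration.

```python
def growth_rate_pct(start: float, end: float) -> float:
    """Return the percentage change from start to end."""
    if start == 0:
        raise ValueError("start value must be non-zero")
    return (end - start) / start * 100.0


# User counts as quoted in the summary above.
users_jan_2024 = 310e6   # 310 million
users_feb_2025 = 1.98e9  # 1.98 billion

if __name__ == "__main__":
    rate = growth_rate_pct(users_jan_2024, users_feb_2025)
    print(f"AI search user growth: {rate:.1f}%")
```

Note that this is a growth rate, not a multiple: the user base ended up roughly 6.4 times its starting size, which corresponds to an increase of about 539%.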
AI Weekly | DeepSeek wins ACL 2025 Best Paper; Cook says Apple plans to "significantly" increase AI investment
Di Yi Cai Jing· 2025-08-03 01:16
Group 1: DeepSeek and ACL Conference
- DeepSeek, in collaboration with Peking University, won a Best Paper Award at the 63rd ACL conference for work introducing the Native Sparse Attention (NSA) mechanism, a notable result in natural language processing [1][2].
- The conference received a record of more than 8,000 submissions, with a main-conference acceptance rate of 20.3% and a Findings acceptance rate of 16.7% [1].

Group 2: Anthropic's Market Position
- Anthropic has overtaken OpenAI among enterprise users, holding 32% of the large language model market, while OpenAI's share has fallen to 25% [3][4].
- Two years ago OpenAI held a dominant 50% share and Anthropic only 12%, which shows how far the competitive landscape has shifted [3].

Group 3: AI Model Developments
- AI startup StepFun has released Step 3, an open-source foundation model with 321 billion parameters and strong capabilities in visual perception and complex reasoning [5].
- Several other companies, including Tencent and Moonshot AI, have also released open-source models, pointing to an industry trend toward open source [5].

Group 4: Baidu's AI Integration
- Baidu is testing an AI application entry point on its search homepage that lets users open various AI applications directly [6][7].
- The move follows a major redesign of Baidu's search platform and reflects the company's push to integrate AI into its services [6].

Group 5: Robotics Industry Insights
- Tencent chief scientist Zhang Zhengyou said the embodied intelligence industry is still in its early stages, comparing it to the evolution of the mobile phone industry [8].
- He noted that current humanoid robots are used mainly for data collection and research, and that widespread adoption will require a significant breakthrough [8].

Group 6: Supernode Solutions
- Several companies showcased supernode solutions at WAIC that address the challenges of large-scale computing clusters [9].
- Supernodes aim to raise performance by pooling computing chip resources, which becomes more necessary as model parameter counts grow [9].

Group 7: Financial Performance of Major Tech Companies
- Meta reported Q2 revenue of $47.5 billion, up 22% year over year, and net profit of $18.3 billion, up 36% [10].
- Microsoft reported revenue of $76.4 billion for its fiscal Q4, up 18%, and its market capitalization reached $4 trillion, driven by demand for AI services [11].
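Industry-academia-research collaboration! On the eve of its listing, DeepSeek and the Chinese Academy of Sciences jointly establish a "Next-Generation Computing Power Laboratory"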
Jiang Nan Shi Bao· 2025-08-01 03:09
Core Insights
- DeepSeek is strengthening its technological moat by partnering with the Institute of Computing Technology, Chinese Academy of Sciences, to establish a joint laboratory focused on frontier technologies such as "storage-compute integration" [1].
- The dual-driver model of "listing + R&D" is expected to speed up the conversion of research results into practical applications [1].
- Three patent applications from the laboratory have already entered the PCT international application stage, which may open new sources of profit growth [1].
DeepSeek's traffic plunges: is it finished? Are its hallucinations too severe, or is it quietly making a fortune?
36Kr· 2025-07-28 23:45
Core Insights
- DeepSeek, once hailed as a "national-level" project, has seen its monthly downloads fall sharply, from 81.13 million in Q1 to 22.59 million, a decrease of 72.2% [1].
- Users are increasingly frustrated with DeepSeek's tendency to generate "hallucinated" content, and social media discussions have turned to how to remove the "AI flavor" from its outputs [1][2].
- "AI flavor" refers to overly mechanical, formulaic responses, which users have learned to recognize and criticize [15].

User Experiences
- Users report cases where DeepSeek gave nonsensical or fabricated advice, such as suggesting irrelevant actions for personal problems or citing references that do not exist [2][8][9].
- Its responses often include fabricated data and sources, which undermines trust in its outputs [9][12].

Underlying Issues
- The drop in response quality is attributed to the model's reliance on rigid logical structures and formulaic language [16].
- The training data is heavily skewed toward English, with high-quality Chinese content making up less than 5% of the corpus, which limits the model's ability to produce diverse and nuanced output [22].
- Content moderation and expanding sensitive-word lists further constrain its ability to produce creative and varied language [22].

Recommendations for Improvement
- Users are encouraged to build the skills to assess AI-generated content critically, including cross-checking data and testing the model's logic [23].
- The article stresses human oversight in AI applications: the industry should use AI as a tool for enhancing human creativity rather than as a replacement for it [24][25].