Multimodal

DeepSeek-R1, Kimi 1.5, and Similar Strong-Reasoning Models: A Development Walkthrough
Peking University · 2025-03-05 10:54
Investment Rating
- The report does not explicitly state an investment rating for the industry or company.

Core Insights
- DeepSeek-R1 introduces a new paradigm of strong reasoning under reinforcement learning (RL), with significant advances in reasoning and long-text processing [4][7]
- The model performs exceptionally well on complex tasks, marking a milestone in the open-source community's competition with closed-source models such as OpenAI's o1 series [7]
- The report highlights the potential of RL-driven models to improve reasoning without relying on human-annotated supervised fine-tuning [21][56]

Summary by Sections

Technical Comparison
- The report compares STaR-based methods with RL-based methods, emphasizing the advantages of RL in reasoning tasks [3]
- It details the RL algorithms used, such as GRPO, which improve training efficiency and reduce computational cost [49][50]

DeepSeek-R1 Analysis
- DeepSeek-R1-Zero is built entirely on RL without supervised fine-tuning, showing that reasoning capabilities can develop autonomously [13][21]
- The model posts strong results on several benchmarks: it scored 79.8% on AIME 2024 and 97.3% on MATH-500, comparable to OpenAI's models [7][15]

Insights and Takeaways
- The report emphasizes the importance of a robust base model, DeepSeek-V3, which has 671 billion parameters and was trained on 14.8 trillion high-quality tokens, enabling significant reasoning capabilities [45][56]
- Rule-based rewards in training help avoid reward hacking and allow reasoning tasks to be verified and annotated automatically [17][22]

Future Directions
- The report discusses further advances in RL-driven models, suggesting that future training will increasingly focus on RL while still incorporating some supervised fine-tuning [56]
- It highlights the need for models to maintain high reasoning performance while ensuring safety and usability in diverse applications [59]

Economic and Social Benefits
- The exploration of low-cost, high-quality language models is expected to reshape industry dynamics, leading to increased competition and innovation [59]
- The report notes that capital-market volatility is a short-term phenomenon driven by rapid advances in AI technology, which will lead to a long-term arms race in computational resources [59]
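The summary above names GRPO and rule-based rewards without showing how they fit together. The following is a minimal sketch, not the report's or DeepSeek's actual implementation: it assumes a binary answer-matching reward and shows only the group-relative advantage step (each sampled completion's reward normalized by the mean and standard deviation of its group), which is what lets GRPO do without a separate value model. The function names and the `\boxed{...}` answer convention are illustrative assumptions.

```python
import re
import statistics


def rule_based_reward(completion: str, reference_answer: str) -> float:
    """Hypothetical rule-based reward: 1.0 if the \\boxed{...} answer matches, else 0.0.

    Assumption: the final answer is written as \\boxed{...}; real reward rules
    may also check output format and normalize answers more carefully.
    """
    match = re.search(r"\\boxed\{([^}]*)\}", completion)
    if match is None:
        return 0.0
    return 1.0 if match.group(1).strip() == reference_answer.strip() else 0.0


def group_relative_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """Normalize rewards within one group of samples drawn for the same prompt."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # population std over the group
    return [(r - mean) / (std + eps) for r in rewards]


# Four sampled completions for one prompt whose reference answer is "42".
completions = [
    "Reasoning ... so the answer is \\boxed{42}",
    "Reasoning ... so the answer is \\boxed{41}",
    "No final answer given",
    "Another route, also \\boxed{42}",
]
rewards = [rule_based_reward(c, "42") for c in completions]
advantages = group_relative_advantages(rewards)
```

Correct completions receive a positive advantage and incorrect ones a negative advantage; when every completion in a group gets the same reward, all advantages are zero and that group contributes no learning signal. The policy update itself (clipped ratio and KL penalty) is omitted here.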
Hands-On with Tencent Yuanbao for Desktop: Full-Strength DeepSeek on Board, Install It and You Have an AI PC
量子位 (QbitAI) · 2025-03-02 05:18
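Jin Lei, reporting from Aofeisi
量子位 | WeChat official account QbitAI

Have you heard, friends? The desktop version of Yuanbao is fresh out of the oven!

Judging by the logo, its main selling point is being the "full-strength version".

Sure enough, looking at the built-in models, the desktop version of Yuanbao offers the wildly popular DeepSeek alongside Tencent's own Hunyuan large model.

Its features also follow the DeepSeek pattern closely, centering on Deep Thinking and Web Search.

According to the official feature description, what sets it apart from other large-model products is access to WeChat official accounts as a content source.

Of course, some users were puzzled about how it differs from ima, another product whose desktop version launched much earlier.

Tencent responded officially, saying the two products have different focuses.

So what is the desktop version of Yuanbao actually like to use? Without further ado, let's put it to the test.

Testing the desktop version of Yuanbao

We start with a time-sensitive question: What technical breakthroughs are there in SpaceX's latest Starship launch plans as of March 2025?

[Screenshot of Yuanbao's response: 7 sources cited as references; deep thinking completed in 12 seconds]

Okay, I now need to answer the user's question about the technical breakthroughs in SpaceX's Starship launch plans as of March 2025. First, I need to read the search results the user provided carefully and find the ...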
Why Are We Extremely Optimistic About AI in 2025? | 42章经
42章经 · 2025-01-05 21:54
Core Viewpoint
- The AI market is undergoing a significant transformation, with optimism for 2025 and beyond driven by advances in technology and a shift in investment focus toward practical applications and embodied intelligence [1][3][4].

Summary by Sections

AI Market Overview
- In 2023 the AI market saw a surge of interest reminiscent of the internet boom, as many internet professionals sought new opportunities in AI after years without major developments in the internet sector since 2015 [1].
- The influx of capital went primarily to large-model companies, with far less going to intermediate layers and applications [1].

2024 Market Dynamics
- The first half of 2024 was the lowest point for the primary market in the past decade, with minimal funding for new startups [2].
- A notable shift occurred after September 2024, with a recovery in the financing market and the emergence of high-valuation startups [3].
- The focus in 2024 shifted from large models to embodied intelligence, with some institutions also exploring AI hardware and consumer electronics [2].

Investment Trends
- Valuations of application companies remain low, and most have zero annual recurring revenue (ARR) while they refine their products [2].
- The market is bifurcating, with investors preferring founders with strong backgrounds at higher valuations [5].

Future Directions for AI
- The anticipated trend for 2025 is getting applications into real use, particularly productivity tools, which are expected to dominate the market [5][6].
- The concept of "Prosumer" or "Pro C" users is emerging: a demographic that combines consumer traits with business capabilities [6].

Key Areas of Growth
- Two promising areas for 2025 are Agents and multimodal applications, with Agents seen as a potential evolution of traditional SaaS models [7][9].
- Agents could significantly disrupt existing SaaS companies, as they may offer results-based pricing rather than a subscription model [9][10].

Multimodal Applications
- The future of AI products may lie in multimodal capabilities, which could completely transform user interaction and content consumption [13][14].
- Products that integrate multiple forms of content and user interaction are expected to redefine how information is managed and used [14][15].

Conclusion
- The AI market is poised for growth, with a strong emphasis on practical applications and innovative business models, suggesting a bright future for companies that adapt to these changes [16].
Why Are We Extremely Optimistic About AI in 2025? | 42章经
42章经 · 2025-01-05 13:54
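I am extremely optimistic about both the current AI market and how it will develop next year. Next year will certainly be a big year for AI, and I have found the market far too pessimistic. That is why, after putting it off for two weeks, I finally decided I had to make this episode.

Let's get straight to the point and first look at what has happened in AI over the past two years.

In 2023 AI arrived, and many internet people and USD funds rushed straight in, because from every angle this AI wave looked so much like the internet wave everyone knew. Internet people had also been starved of opportunities for a long time: big opportunities were scarce after 2015 and almost nonexistent after 2018. As I recall, only a handful of products, such as 番茄小说 (Fanqie Novel), grew to ten million daily active users in the past two years.

Internet people found they had mastered a full set of martial arts only for the arena to disappear. How could they put up with that?

Then the market made a call: is AI a big opportunity? Certainly. Set aside how big, and whether it compares to electricity, the internet, or the cloud. A second call followed: AI is definitely still early. So many people concluded that in AI you should first back people with technical backgrounds, and investors worked through the Tsinghua professors one by one. Most of the money in 2023 went to large-model companies, and very little went to companies building middle layers and applications. In 2023, anyone who had come out of OpenAI was treated like a figure on a pedestal, and people visiting the US to learn competed by every means to land a conversation with someone from OpenAI.

So how did that rush turn out ...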
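Large Models in May: A Bustling 30 Days and the Edge of the Chasm
晚点LatePost · 2024-05-29 14:00
"Mayday" can be read literally as a day in May; it is also the international radio distress signal. When a plane is in danger of going down, the pilot shouts "Mayday!" into the radio.

This May may have been the busiest month for the large-model industry since ChatGPT launched: US and Chinese companies including OpenAI, Google, Microsoft, ByteDance, and Alibaba held at least 13 large-model launch events, introduced more than 10 new models, and rolled out a pile of new products.

The risk and disappointment amid the bustle: many practitioners believe the technology has made no major progress.

GPT-4o, which OpenAI released this month, remains at GPT-4 level in language ability, and the long-awaited GPT-5 has still not appeared.

Multimodality has become the technical focus of the top AI companies: OpenAI, Google, and Microsoft all released models that can process speech and images together and even understand the real world. But the products and applications built on these capabilities are still at the demo stage, and before any official release they have already stirred up trouble over infringement, privacy risks, and more.

Zhu Xiaohu, managing partner at GSR Ventures (金沙江创投) and a skeptic of large-model startup opportunities, holds that if progress in language ability slows, "this wave of hype is over."

"Nothing exciting," said someone leading large-model development at a major Chinese company. The string of launch events made him more convinced that building more capable small models is the future.

One AI founder said GPT-4 ...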