Multimodal

Robin Li (李彦宏) Calls DeepSeek Expensive and Slow; Netizens: That's a Bit Like "Wanting It Both Ways"
程序员的那些事· 2025-04-26 15:13
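The following article is from MaxAIBox, by Max.

On February 14, Baidu announced that its ERNIE (文心) large model would be not only free but also open source. On the evening of February 16, Baidu Search and the ERNIE agent platform each announced full integration of DeepSeek and the latest deep search feature of the ERNIE large model. On February 18, the full-strength version of DeepSeek-R1 went live in search in the Baidu app.

In addition, on the evening of February 18, Robin Li said in connection with the Q4 and full-year 2024 earnings: "One thing we learned from DeepSeek is that open-sourcing the very best models for everyone to use can greatly drive their adoption, because people will naturally want to try an open-source model out of curiosity, which in turn drives broader adoption."

1

As is well known, Baidu once insisted on a closed-source path. But after DeepSeek's breakout success, as many companies across industries integrated the full-strength DeepSeek-R1, Baidu followed suit.

2

On April 25, Baidu held an AI developer conference in Wuhan, where Robin Li gave a speech titled "A World of Models, a Realm of Applications" (《模型的世界,应用的天下》).

He noted, "As long as you find the right scenario, pick the right foundation model, and learn a bit about tuning models, the applications you build will not become obsolete," and "Without applications, neither chips nor models have ...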
Coocaa (酷开) Rolls Out 6 Super Agents at Once! CEO: We Must Go AI-Native, and Cost-Effectiveness Is Our Main Pursuit
AI前线· 2025-04-25 13:48
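Agents of every kind are springing up on the market, but because they lack breadth and depth of application, and because device interaction cannot support what real scenarios require, their value has not been fully realized. The market is not short of agents; it is short of agents that deliver satisfying service.

According to Wang Zhiguo (王志国), Coocaa's plan after launching the super agent proceeds in steps. First, close the loop on user data, observing for about three months with particular attention to user retention, activity data, and feature fulfillment rate. Second, make proactive service the next focus: the company plans to switch the super agent's intent recognition model from a 7B model to a 32B model and turn it into a tool for emotional conversation with users. Third, always keep pace with the most advanced large models in the industry and insist on being AI-native, because whenever a human sits in the middle, the large model's capability is substantially attenuated.

Meanwhile, the Coocaa super agent and the six specialized agents support cooperation models including software sales, device licensing, PaaS services, and ecosystem partnerships, with the aim of building an open intelligent ecosystem. According to Wang Zhiguo, in Q1 of this year Coocaa's signed agent sales (software sales) reached the point where software and hardware each account for half.

By Hua Wei (华卫)

On April 22, at its 2025 spring launch event themed "大爱 AI", Coocaa released its super agent, comprising six agents for video and audio, health, daily life, devices, creation, and education, along with agent hardware products such as the Coocaa learning machine Y41 Air and the Coocaa 闺蜜机 C20 series ...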
"DeepSeek Is Not a Cure-All": Robin Li Bets on AI Applications This Year, Slashing Model Prices Again and Focusing on Multi-Agent and Multimodal
AI前线· 2025-04-25 08:25
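By Chu Xingjuan (褚杏娟) and Hua Wei (华卫)

At the Baidu Create developer conference on April 25, Baidu founder Robin Li released two major models and several popular AI applications, and announced that Baidu would help developers fully embrace MCP. Baidu also officially lit up China's first fully self-developed 30,000-card cluster, which can carry the full training of several hundred-billion-parameter large models at once and supports 1,000 users fine-tuning ten-billion-parameter models simultaneously.

"All of these releases are meant to let developers stop worrying about model capability, stop worrying about model cost, and stop worrying about development tools and platforms, so they can focus on building applications, and build the best applications!" Li said.

Li said that large-model vendors are locked in cutthroat competition and release new models almost every week, yet developers dare not use them boldly for fear that their applications will quickly be overtaken by model iteration. He sees this as a double-edged sword: on one hand, developers do need to understand where the technology is heading; on the other, so many increasingly powerful models offer more choices and open up more possibilities.

"As long as you find the right scenario, choose the right foundation model, and sometimes learn a bit about tuning models, the applications built on that basis will not become obsolete." He emphasized, "Without applications, chips and models have no value. There will be many models, but what will truly rule this world in the future is applications. Applications are king."

Two new models released, prices cut by up to 80%

ERNIE large model 4.5 Turbo and ERNIE large model X1 Tur ...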
ByteDance and Kuaishou Head Into a Key Showdown
Hua Er Jie Jian Wen· 2025-04-22 12:39
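By Liu Baodan (刘宝丹) | Editor: Zhou Zhiyu (周智宇)

Kuaishou recently released the Kling 2.0 (可灵2.0) video generation model and the Kolors 2.0 (可图2.0) image generation model, taking the precision of video and image creation to a new level. Around the same time, ByteDance's Seed team released the Seedream 3.0 technical report. According to the third-party leaderboard Artificial Analysis, Seedream 3.0's overall performance has drawn level with GPT-4o, the SOTA text-to-image model, placing it in the global top tier.

As short-video platforms, ByteDance and Kuaishou are regarded as strong contenders in multimodal AI. After more than a year of catching up technically, both have made solid progress in AI video generation.

According to March data from AI产品榜, on the global AI product growth ranking (apps only), Jimeng AI (即梦AI) ranked 5th with monthly active user growth of 173.57%, making it the fastest-growing AI video app, with about 20.37 million monthly active users. Kling AI's growth rate was only 36.44%, ranking 14th. According to data published by Kuaishou, Kling AI's global user base has so far surpassed 22 million.

The focus of the AI race has begun to shift to multimodal, and competition between ByteDance and Kuaishou in AI video is intensifying.

However, AI video generation has not yet produced a benchmark product comparable to DeepSeek in large language models (LLMs). According to Gartner's 2024 Hype Cycle for Emerging Technologies, the technology is still in the innovation trigger phase, which also means that 字 ...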
Zheng Hongda (郑宏达) Explains Llama in Detail
2025-04-15 14:30
Summary of Conference Call on the Llama 4 Model

Company and Industry
- The discussion revolves around the Llama 4 model, a significant development in the artificial intelligence (AI) industry, particularly in the context of multi-modal capabilities and its implications for technology companies like Meta and others in the AI sector [1][20].

Core Points and Arguments
1. **Importance of the Llama 4 Model**: The Llama 4 model is highlighted as a crucial development in the AI industry, particularly for its multi-modal capabilities, which integrate text, images, and videos during training [1][20].
2. **Model Versions**: Three versions of the Llama 4 model were introduced:
   - **Scout**: A smaller model with 109 billion parameters, designed for low-cost inference and capable of running on a single H100 card [6][10].
   - **Maverick**: A larger model with several hundred billion parameters, requiring a DGX server for operation [10].
   - **Two Trillion Parameter Model**: A yet-to-be-released model that serves as the foundation for the other two versions [11][20].
3. **Dynamic Routing Mechanism**: The model employs a dynamic routing mechanism that activates only a portion of its parameters during inference, significantly reducing operational costs [5][6].
4. **Multi-modal Training**: Llama 4 uses a novel "native multi-modal" training approach, allowing it to learn cross-modal associations effectively [14][20].
5. **Limitations**: The model currently lacks deep reasoning capabilities and has relatively poor programming skills compared to competitors like OpenAI's models [12][21].
6. **Market Response**: Following the release of Llama 4, several U.S. computing companies, including Microsoft, have announced support for its deployment [12][20].
7. **Future Developments**: There is anticipation for the release of a deep reasoning model from Meta, which could enhance the capabilities of Llama 4 significantly [16][21].

Other Important but Overlooked Content
1. **Impact of Trade Wars**: The discussion briefly touches on the implications of trade wars and tariffs on the technology sector, although this was not the main focus of the call [1].
2. **AI Market Trends**: The call suggests that AI will be a driving force in the next wave of technological advancements, with various AI applications expected to emerge in the near future [19].
3. **Chinese Tech Industry**: The ongoing geopolitical issues are seen as beneficial for the Chinese tech industry, potentially accelerating domestic advancements in high-tech products [19].

This summary covers the key points of the conference call on the Llama 4 model and its implications for the AI industry, including both its strengths and its limitations.
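As a rough sanity check on the claim that the 109-billion-parameter Scout model can run on a single H100 card, the weight-only memory footprint can be estimated as parameters times bytes per parameter. The sketch below is illustrative only: the 80 GB card capacity and the 16-, 8-, and 4-bit precisions are assumptions not stated in the summary, and the estimate ignores activations and the KV cache.

```python
# Back-of-the-envelope estimate of weight-only memory for a model.
# Assumptions (not from the summary): an 80 GB H100 variant, and that only
# the weights need to fit (activations and KV cache are ignored).

H100_MEMORY_GB = 80  # assumed card capacity


def weight_memory_gb(params_billions: float, bits_per_param: int) -> float:
    """Approximate memory needed for the weights alone, in GB (1 GB = 1e9 bytes)."""
    total_bytes = params_billions * 1e9 * bits_per_param / 8
    return total_bytes / 1e9


def fits_on_one_card(params_billions: float, bits_per_param: int,
                     card_gb: float = H100_MEMORY_GB) -> bool:
    """True if the weights alone fit within the assumed card capacity."""
    return weight_memory_gb(params_billions, bits_per_param) <= card_gb


if __name__ == "__main__":
    for bits in (16, 8, 4):
        gb = weight_memory_gb(109, bits)
        print(f"109B params at {bits}-bit: {gb:.1f} GB, "
              f"fits on one card: {fits_on_one_card(109, bits)}")
```

Under these assumptions the weights fit on one card only at roughly 4-bit precision, so the single-card claim most plausibly refers to a quantized deployment.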
Tech "Dragon Ball" Radar Series, Shanghai Edition: A Systematic Overview of China's Tech "Dragon Balls" (科技龙珠)
2025-04-15 14:30
Summary of Conference Call Notes

Industry or Company Involved
- The conference call discusses advancements in the AI and robotics sectors, focusing on various companies and their innovations in AI models, robotics, and GPU technology.

Core Points and Arguments
1. **AI Model Development**: A company has developed a trillion-parameter MoE language model, aiming to establish AGI capabilities. The step1 model, with 100 billion parameters, excels in image processing, mathematical abilities, logical reasoning, and text creation, ranking highly in industry evaluations [1].
2. **AI Tone Services**: A newly established company in Shanghai focuses on providing AI tone services for large models, backed by state-owned enterprises and local government support. This service involves creating high-quality training data for AI models [2][3].
3. **Robotics Innovations**: A company has launched three series of robots, including the "Yuanqi" series, with over 1,000 units produced. These robots are designed for various applications, showcasing advanced capabilities [4][5].
4. **Intelligent Robotics**: The "Lingxi XR" humanoid robot, which features 28 degrees of freedom, allows for complex movements and interactions, enhancing its adaptability in various environments [6][7].
5. **Cloud Robotics**: A company has proposed a cloud robotics architecture that integrates cloud computing with robotics, enabling self-learning and continuous evolution of robots [8].
6. **Industrial Robotics**: Feixi Technology focuses on industrial robotic solutions, leveraging advanced sensors for applications in manufacturing, healthcare, and agriculture [9][10].
7. **GPU Technology**: A company named Muxi has developed high-performance GPUs, achieving significant breakthroughs in computing power, including the BR100 GPU, which set a global record for computing capabilities [11][12].
8. **AI Model Deployment**: Companies are rapidly deploying AI models, such as the TM106 series, which supports advanced inference capabilities, competing with leading models in the market [13].
9. **Computing Solutions**: A company offers standardized AI computing solutions, enabling quick deployment for clients and reducing operational costs [14].
10. **Market Positioning**: The conference highlights the competitive landscape of nine leading companies in the AI, robotics, and GPU sectors, emphasizing their potential to challenge international giants and drive technological advancements in China [15][16].

Other Important but Possibly Overlooked Content
- The establishment of a "super factory" for high-quality training data is underway, aiming to significantly increase the capacity of the tone library by 2025 [3].
- The conference underscores the importance of investing in technology assets to support emerging companies that are breaking international monopolies in their respective fields [16].
Meta Makes a Major Release!
证券时报· 2025-04-06 04:58
Core Viewpoint
- Meta has launched the Llama 4 series, which includes its most advanced models to date, Llama 4 Scout and Llama 4 Maverick, marking a significant advancement in open-source AI models and a response to emerging competitors like DeepSeek [1][3][10].

Group 1: Model Features
- The Llama 4 series includes two efficient models, Llama 4 Scout and Llama 4 Maverick, with a preview of the powerful Llama 4 Behemoth [5][8].
- The Llama 4 models use a mixture of experts (MoE) architecture, enhancing computational efficiency by activating only a small portion of parameters for each token [7][8].
- Llama 4 Behemoth has a total parameter count of 2 trillion, while Llama 4 Scout has 109 billion parameters and Llama 4 Maverick has 400 billion parameters [8].

Group 2: Multi-Modal Capabilities
- Llama 4 is designed as a native multi-modal model, employing early fusion technology to integrate text, image, and video data seamlessly [8][9].
- The model supports extensive visual understanding, capable of processing up to 48 images during pre-training and 8 images during post-training, achieving strong results [9].

Group 3: Contextual Understanding
- Llama 4 Scout supports a context window of up to 10 million tokens, setting a new record for open-source models and outperforming competitors like GPT-4o [9].

Group 4: Competitive Landscape
- The release of Llama 4 comes amid increasing competition in the open-source model space, particularly from DeepSeek and Alibaba's Tongyi Qianwen series [11][12].
- Meta's previous open-source initiatives, such as Llama 2, have spurred innovation within the developer community, leading to a vibrant ecosystem [11].
- The competitive environment is intensifying, with ongoing advancements in model capabilities and frequent releases from various companies [13].
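The idea of "activating only a small portion of parameters for each token" can be illustrated with a toy top-k router. This is a minimal, generic mixture-of-experts sketch in pure Python, not Meta's implementation: the function names, the four toy experts, and the router scores are all hypothetical, and in a real model the router scores would come from a learned projection of the token.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_token(router_logits, top_k):
    """Pick the top_k experts for one token.

    Returns (expert_index, weight) pairs, with weights renormalized to sum to 1.
    """
    probs = softmax(router_logits)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]


def moe_layer(token, experts, router_logits, top_k=1):
    """Weighted sum of the selected experts' outputs; unselected experts never run."""
    output = [0.0] * len(token)
    for index, weight in route_token(router_logits, top_k):
        expert_out = experts[index](token)
        output = [o + weight * e for o, e in zip(output, expert_out)]
    return output


# Hypothetical toy experts: each simply scales the token by a different factor.
experts = [lambda t, s=s: [s * x for x in t] for s in (1.0, 2.0, 3.0, 4.0)]
token = [1.0, -1.0]
logits = [0.1, 2.0, -1.0, 0.5]  # hypothetical router scores for this token

if __name__ == "__main__":
    print(route_token(logits, top_k=2))
    print(moe_layer(token, experts, logits, top_k=1))
```

With `top_k=1`, only one of the four toy experts is evaluated for the token, which is the source of the compute savings: total parameters grow with the number of experts, while per-token work grows only with `top_k`.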
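A 7B Model Handles AI Video Calls: Alibaba's Latest Open-Source Release Makes a Splash, Unifying Seeing, Hearing, Speaking, and Writing Across All Modalities, Free for Commercial Use by Developers and Enterprises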
量子位· 2025-03-27 04:16
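By Xifeng (西风) and Mingmin (明敏), from Aofeisi (凹非寺)
量子位 | WeChat official account QbitAI

Late-night bombshell! Alibaba has released and open-sourced its first end-to-end omni-modal large model, Tongyi Qianwen Qwen2.5-Omni-7B.

With a single all-in-one model, it handles text, audio, image, and video across all modalities, and generates text and natural speech in real time.

It is arguably the all-around champion among 7B models. Your iPhone is very likely to be running it!

Open Qwen Chat now and you can interact with it in real time by video or voice:

Without further ado, here is a look at what it can do.

On a video call with it out on the street, it correctly identifies the surroundings and recommends restaurants according to your needs:

Step into the kitchen and it turns into a "smart recipe book", guiding you step by step toward becoming a master chef:

On the multimodal benchmark OmniBench, Qwen2.5-Omni set a new SOTA, far surpassing comparable models such as Google's Gemini-1.5-Pro.

On single-modality tasks, including speech recognition, translation, audio understanding, image reasoning, video understanding, and speech generation, Qwen2.5-Omni also outperforms similarly sized single-modality models as well as closed-source models across the board.

On the seed-tts-eval speech generation benchmark, Qwen2.5-Omni shows speech synthesis ability on par with human level.

This means Qwen2.5-Omni-7B can interact well with the world in real time, and can even easily ...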
Event Recap | DeepSeek: Large AI Models Usher In an Intelligent Transformation of Financial Data
Refinitiv路孚特· 2025-03-24 05:44
Core Viewpoint
- The article emphasizes the transformative impact of DeepSeek, an open-source large language model, on the financial industry, highlighting its cost-effectiveness, efficiency, and innovative technology that supports the intelligent transformation of financial data [1][3][21].

Group 1: Core Technical Advantages of DeepSeek
- DeepSeek employs a permissive open-source strategy (MIT License), enabling rapid global dissemination and application of its technology and fostering a developer ecosystem that allows small and medium enterprises to adopt AI capabilities at low cost [3][4].
- The model enhances traditional large models by introducing "active learning" capabilities, allowing it to adapt and optimize its performance based on market changes, thus improving decision-making in financial data analysis [5][6].
- DeepSeek optimizes the entire training process, significantly improving efficiency and reducing costs through techniques like mixed expert models and data compression, making top-tier AI technology accessible to smaller enterprises [7].

Group 2: AI Applications in the Financial Industry
- AI, including DeepSeek, enhances operational efficiency in financial institutions by automating customer service and programming tasks, leading to a 50% increase in customer service efficiency at LSEG [9].
- In risk management, AI optimizes risk control models by analyzing large datasets and generating timely risk assessments, enabling financial institutions to mitigate potential losses [10].
- AI improves investment strategies by providing personalized investment advice based on market dynamics, as demonstrated by TwoSigma's use of large models to analyze financial reports and news [11].
- AI enhances customer experience through personalized recommendations and intelligent interactions, increasing customer satisfaction and engagement, exemplified by Standard Chartered's collaboration with LSEG [12].

Group 3: Compliance Challenges and Strategies
- The financial sector faces data privacy and security risks due to its reliance on sensitive data, with potential threats from misuse of biometric information and phishing attacks [13].
- Financial institutions using DeepSeek can ensure data security and compliance through localized deployment and encryption technologies, mitigating legal risks [14].
- User education is crucial in the AI era, with financial institutions employing AI to monitor and alert users about potential risks, creating a dual defense of technology and education [15].

Group 4: Future Trends and Innovations
- AI Agents are expected to automate business processes, significantly improving efficiency and reducing human error in tasks like fundamental and technical analysis [16].
- The development of multimodal capabilities in DeepSeek will allow for better integration of visual and auditory data, enhancing investment decision-making [17].
- Future language model developers may use natural language to "code," lowering the technical barriers for AI development and fostering rapid business innovation [18].
- DeepSeek's low-cost AI approach may democratize access to advanced analytical capabilities, reshaping the competitive landscape in the financial sector [19].
Quantitative Investment and Product Strategy in the AI Era: Shenwan Hongyuan (申万宏源) 2025 Capital Markets Spring Strategy Conference
2025-03-12 07:52
Summary of Key Points from the Conference Call

Industry or Company Involved
- The conference call focuses on **AI investment strategies** and the **ETF market** in the context of the **capital market**, as discussed by **Huatai Securities** during their **2025 Spring Strategy Meeting**.

Core Points and Arguments
- **AI Strategies in Investment**: AI strategies significantly enhance traditional multi-factor models by processing vast amounts of data and complex factors, particularly in volume and price data analysis, optimizing investment decisions [1][4][9].
- **Acceptance of AI in Asset Management**: The asset management industry is increasingly accepting AI strategies, particularly those based on statistical models, due to their strong performance. However, the ability of reasoning-based large language models to reach expert-level performance remains to be validated [1][13][14].
- **ETF Market Growth**: The ETF market has surpassed **3.8 trillion yuan**, with a focus on smart beta strategies to achieve stable returns through industry rotation and asset allocation models [1][22].
- **Investment Strategy Focus**: Huatai Securities emphasizes a robust return strategy, primarily focusing on bond investments, and uses global asset allocation models and qualitative analysis for market judgment [1][27].
- **Industry Rotation Strategy**: The industry rotation strategy combines macro, meso, and micro factors with AI identification and qualitative analysis, favoring technology, consumer, and pharmaceutical sectors while adjusting investment targets based on significant events like the Two Sessions [3][31].
- **AI's Role in Financial Engineering**: AI enhances traditional multi-factor frameworks by integrating diverse data types, leading to more precise and efficient data analysis, thus optimizing portfolio design and improving returns while reducing risks [7][18].
- **Performance of AI in Quantitative Investment**: AI strategies outperform traditional multi-factor methods by effectively aggregating information and conducting global analyses, leading to superior excess returns [9][12].
- **Future of Large Models in Finance**: Large models like DeepSeek and ChatGPT show potential in subjective analysis, suggesting a new paradigm of combining subjective and quantitative investment approaches, although their expert-level capabilities need further validation [11][15].
- **ETF Product Development**: Huatai Securities is committed to providing ETF products and solutions, focusing on smart beta strategies and offering professional services, including market reports and strategy analyses [1][23].

Other Important but Possibly Overlooked Content
- **Historical Context of AI in Quantitative Investment**: The application of AI in quantitative investment began around 2003 and evolved through various phases, with significant adoption starting in 2017, leading to substantial investment returns [2][13].
- **Impact of Two Sessions on Market**: The analysis of the Two Sessions' impact on the market involves reviewing historical key topics and market performance, indicating that different time periods around the event affect market dynamics [32].
- **Investment Heat and Valuation Levels**: The current investment heat in AI-related sectors is at historical highs, with significant trading activity and valuation levels, necessitating cautious investment strategies [62][64].
- **Differentiation of Index Products**: Index products vary significantly in valuation levels and stock resonance, suggesting that investors should choose based on their risk appetite and investment strategy [68][70].
- **Performance of Active Equity Fund Managers**: Different fund managers exhibit varying performance in the AI sector, categorized into stable allocation, focused sector, and flexible adjustment types, highlighting the importance of selecting managers based on their stability and risk-return profile [73][74].

This summary covers the main insights from the conference call on AI investment strategies and the ETF market.