Open-Source Large Models
(Economic Watch) China's Large Models Are Being Open-Sourced in Rapid Succession: What Is the Impact?
China News Network (Zhong Guo Xin Wen Wang)· 2025-03-25 16:39
Core Insights
- The trend of open-sourcing large models in China is rapidly gaining momentum, with major companies like Alibaba Cloud and DeepSeek leading the charge by releasing multiple models in a short time frame [1][2][3]

Group 1: Market Dynamics
- Demand for edge intelligence is rising, driven by the need for personal AI deployment, and this is accelerating the development of edge intelligence [2]
- AI deployment needs are increasing significantly across industries, with open-source models providing the flexibility and customization required for differentiated business scenarios and data privacy [2][3]
- As of March 25, the global download count for Alibaba's Qwen series of open-source models has exceeded 200 million, indicating widespread adoption across sectors such as healthcare, education, finance, and transportation [2]

Group 2: Industry Ecosystem
- The AI industry is entering a phase of accelerated ecosystem development, characterized by clearer upstream and downstream collaboration, with leading companies focusing on model capabilities while smaller firms develop niche applications based on open-source models [2][3]
- The number of derivative models from Alibaba's open-source models has surpassed 100,000, making it the largest open-source model family globally [3]

Group 3: Future Outlook
- Experts suggest that open-source models will become a powerful engine driving the development of AI in China, and recommend that all levels, including government and enterprises, embrace open-source initiatives more proactively [4][5]
- The open-source approach not only fosters competition among tech companies but also accelerates AI adoption and innovation by reducing costs and opening doors for product innovation [4][5]
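Talking DeepSeek Open-Source Week with a Post-2000s Open-Source Developer: Always Open-Sourcing Its Strongest Model May Mean It Doesn't Want to Make Money, or That It Wants to Drive Bigger Change | Open Source Conversations #2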
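LatePost (晚点)· 2025-02-27 14:03
"Once AI is powerful enough, is open source still a good choice?" Compiled by Liu Qian and Cheng Manqi. Guest: Wang Zihan, PhD student at MLL Lab, Northwestern University (US). This conversation is episode #102 of the LateTalk (晚点聊) podcast, available on Xiaoyuzhou, Ximalaya, Apple Podcasts, and other platforms. LateTalk is a podcast produced by LatePost: "The most first-hand business and technology interviews, the most candid thinking from practitioners." This is the second installment of LatePost's "Open Source Conversations" series, which collects interviews and discussions related to open source; the rest of the series is gathered in the #开源对话 collection at the end of the article. Last Friday, DeepSeek announced on its official Twitter account that it would open-source five code repositories over five consecutive days the following week, in what it called "open-source week." The four repositories DeepSeek has released so far mainly cover training and inference code related to DeepSeek-V3/R1. This goes deeper than publishing a technical report and releasing model weights. Only with the training and inference tooling can developers better reproduce the efficient performance of the DeepSeek models in their own systems. (Note: all four repositories and subsequent releases can be found in DeepSeek's GitHub under Open-Inf ...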
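A Conversation with Zhang Wensong, Who Has Been Doing Open Source Since 1998: Build Large-Model Datasets Openly and Collaboratively, Like Wikipedia | Open Source Conversations #1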
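LatePost (晚点)· 2025-02-27 14:03
"Truly open-sourcing a large model should mean open-sourcing the dataset too." By He Qianming. Edited by Song Wei. Over the past two months, DeepSeek has reshaped the global large-model landscape and overturned the industry's understanding of open source. OpenAI has reflected that going closed-source put it "on the wrong side of history," and previously closed-source companies such as Baidu, MiniMax, and StepFun (阶跃星辰) have turned to open source. "In the past, if a company that had raised several hundred million dollars said it was going open source, its investors would probably have coughed up blood," one technology investor said. DeepSeek is still stepping up its open-source efforts. This week it plans to open-source five code repositories related to training and inference for large models, while most companies that open-source models still go no further than releasing model weights. How should we understand DeepSeek's open-sourcing? What does it mean for the large-model open-source community? Why do different companies choose different open-source strategies? What does choosing open source actually mean for a commercial company? Recently we interviewed Zhang Wensong, a pioneer of open source in China. He first encountered open source in 1995 during his master's studies, not long after China had connected to the internet and before many of DeepSeek's researchers were born. In 1998, while pursuing his PhD at the National University of Defense Technology, Zhang open-sourced LVS (Linux Virtual Server), a system that balances traffic across servers and prevents outages. It was the earliest Chinese open-source project to spread through the global technology industry and is now a component of internet infrastructure. "Almost all internet ...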