Open-Source Science
A Dialogue on AI's Past, Present, and Future
Core Insights
- The third China International Supply Chain Promotion Expo featured a significant dialogue on AI, highlighting the importance of AI in modern technology and its rapid evolution [4][5][9]

Group 1: AI Development and Trends
- Huang Renxun emphasized that AI has transitioned from relying on manual programming to utilizing machine learning on vast datasets, marking a significant technological breakthrough since 2012 [4][5]
- The focus of AI technology is shifting towards reasoning intelligence, enabling AI to understand, decompose, and solve problems similarly to humans [4][5]
- Huang introduced the concept of Physical AI, which integrates AI capabilities into the physical world, particularly in robotics and autonomous vehicles [5]

Group 2: The Role of Computing Power
- Wang Jian highlighted that computing power is the foundation of AI, asserting that advancements in computing capabilities have transformed the landscape of AI technology [7]
- Huang revealed that NVIDIA's computing power has increased by 100,000 times over the past decade, allowing for more effective machine learning [7]

Group 3: Open Source and Collaboration
- Huang noted that China leads in the number of AI research papers published globally, with researchers collaborating on open-source projects to advance AI technology [8]
- He stressed the importance of open-source engineering, which allows contributions from individuals and organizations, thereby accelerating innovation in the AI ecosystem [8]

Group 4: AI's Impact on Science and Society
- AI is poised to reshape scientific paradigms, with applications in drug design and climate modeling, showcasing its potential to revolutionize various fields [9]
- Huang provided advice to young people, encouraging them to embrace AI and understand its foundational principles, as it presents significant opportunities for future generations [9]
A Fully Open-Source 7B Model That Rivals Mainstream LLMs, Costs Only $160,000 to Train, and Reproduces DeepSeek's Reinforcement Learning!
AI科技大本营· 2025-05-14 09:31
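Editor: 梦依丹 | Produced by: AI 科技大本营 (ID: rgznai100)

Ever since GPT-3 burst onto the scene, generative AI has set the global tech world alight.

Although LLMs such as GPT-4 and Claude have shown astonishing capabilities, the closed nature of proprietary models makes it hard for researchers to understand how they work, while open-source models are open only to a limited degree.

Moxin-7B: Fully Transparent AI Innovation, from Pre-training to Reinforcement Learning

Moxin-7B was created to solve exactly this problem. Developed jointly by research teams from Northeastern University, Harvard, Cornell, and other institutions, it fully follows the "open-source science" principle and discloses details of the entire pipeline, from data cleaning and pre-training through to the same kind of reinforcement learning used by DeepSeek, making it one of the most transparent open-source LLMs to date.

2. High Performance at Low Cost: Big Power from a Small Model

- Zero-shot tasks: 58.64% on ARC-C (AI2 Reasoning Challenge), surpassing LLaMA 3.1-8B (53.67%) and Qwen2-7B (50.09%).
- Math reasoning: after RL fine-tuning, 68% accuracy on MATH-500, surpassing the 70B-parameter Llama-3-Instruct model (64.6%).
- Long-context support: through sliding window attention (SWA) and grouped-query attention (GQA), efficiently handles 32K ...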