Open-source model

X @The Economist
The Economist· 2025-08-08 15:25
It is six months since a barely known AI startup, DeepSeek, caused a huge stir by releasing an open-source model trained for a sliver of the cost of fancier Western ones. Its breakthrough has helped shift China’s approach to AI in profound ways https://t.co/0EVvCxXDJw https://t.co/1mRVAwJmzU ...
OpenAI Dropped a FRONTIER Open-Weights Model
Matthew Berman· 2025-08-05 17:17
Model Release & Capabilities
- OpenAI released GPT-OSS, state-of-the-art open-weight language models in 120 billion and 20 billion parameter versions [1]
- The models outperform similarly sized open-source models on reasoning tasks and demonstrate strong tool-use capabilities [3]
- The models are optimized for efficient deployment on consumer hardware: the 120 billion parameter version runs efficiently on a single 80 GB GPU, and the 20 billion parameter version on edge devices with 16 GB of memory [4][5]
- The models excel at tool use, few-shot learning, function calling, chain-of-thought reasoning, and health-issue diagnosis [8]
- The models support context lengths of up to 128,000 tokens [12]
Training & Architecture
- The models were trained using a mix of reinforcement learning and techniques informed by OpenAI's most advanced internal models [3]
- The models use a transformer architecture with a mixture of experts, reducing the number of parameters active when processing input [10][11]
- The 120 billion parameter version activates only 5 billion parameters per token, while the 20 billion parameter version activates 3.6 billion [11][12]
- The models employ alternating dense and locally banded sparse attention patterns, grouped multi-query attention, and RoPE for positional encoding [12]
Safety & Security
- OpenAI did not apply direct supervision to the chain of thought of either OSS model [21]
- The pre-training data was filtered to remove harmful content related to chemical, biological, radiological, and nuclear (CBRN) topics [22]
- Even with robust fine-tuning, maliciously fine-tuned models were unable to reach high capability levels under OpenAI's preparedness framework [23]
- OpenAI is hosting a red-teaming challenge with $500,000 in awards to identify safety issues in the models [24]
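The mixture-of-experts design mentioned above is what lets a 120-billion-parameter model activate only a few billion parameters per token: a router scores the experts for each token and forwards the token to just the top-k of them. The sketch below is a toy illustration of that routing idea, not OpenAI's implementation; the expert count, top-k value, and parameter sizes are made-up numbers.

```python
# Toy sketch of top-k mixture-of-experts routing (illustrative only; the
# constants below are hypothetical, not GPT-OSS's real configuration).
import random

NUM_EXPERTS = 8                 # hypothetical number of experts in the layer
TOP_K = 2                       # experts activated per token
PARAMS_PER_EXPERT = 1_000_000   # illustrative parameter count per expert

def route(token_scores, k=TOP_K):
    """Return the indices of the k highest-scoring experts for one token."""
    return sorted(range(len(token_scores)), key=lambda i: -token_scores[i])[:k]

def active_params(k=TOP_K, n=NUM_EXPERTS, per_expert=PARAMS_PER_EXPERT):
    """Parameters actually used per token vs. parameters stored in the layer."""
    return k * per_expert, n * per_expert

random.seed(0)
scores = [random.random() for _ in range(NUM_EXPERTS)]  # stand-in router logits
chosen = route(scores)
used, total = active_params()
print(f"experts chosen: {chosen}, active fraction: {used / total:.0%}")
```

With these toy numbers, only 2 of 8 experts fire per token, so 25% of the layer's parameters are active — the same principle by which the 120B model processes each token with only a few billion active parameters.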
This might be OpenAI's New Open-Source Model...
Matthew Berman· 2025-08-01 00:00
Model Capabilities & Performance
- Horizon Alpha demonstrates impressive spatial awareness and problem-solving skills, accurately visualizing complex rotations [1]
- The model exhibits multimodal capabilities, understanding and interpreting images quickly [2]
- Horizon Alpha solves the Tower of Hanoi puzzle despite lacking chain-of-thought reasoning [6]
- The model can recognize its own limitations, indicating when it lacks knowledge [20][21]
- Horizon Alpha achieves top rankings on creative writing and emotional intelligence benchmarks [23][11]
Model Characteristics & Limitations
- Horizon Alpha is fast, outputting roughly 150 tokens per second [2]
- The model lacks a "thinking mode," instead outputting the first response that comes to mind [2]
- Horizon Alpha gives incorrect answers to simple logic and percentage-based questions [7][8]
- The model refuses to provide instructions for illegal activities, such as hotwiring a car [8][9]
- The model incorrectly identifies itself as a GPT-4-class model from OpenAI, despite likely being an open-source model [9]
OpenRouter & Box AI
- Horizon Alpha is available on OpenRouter and free to use [1]
- Box AI lets users leverage the latest AI models, including open-source options, for document workflows with enterprise-level security [3][4]
Meta Adds ‘Multimodal’ Models to Its Llama AI Stable
PYMNTS.com· 2025-04-06 19:50
Group 1
- Meta has launched its latest Llama AI models, Llama 4 Scout and Llama 4 Maverick, described as its first open-weight natively multimodal models, capable of processing media types beyond text [1]
- The company plans to invest up to $65 billion in AI in 2025 to enhance its artificial intelligence capabilities [2]
- Meta is exploring premium subscription trials for its AI assistant, Meta AI, covering features such as booking reservations and video creation [3]
Group 2
- OpenAI is planning to release an open-source version of its LLM, the first since GPT-2 in 2019, and is seeking feedback from developers and the public [4]
- OpenAI's shift to proprietary models followed a $1 billion investment from Microsoft, which has invested over $13 billion in total [5]
- Open-source models, including Meta's Llama, are gaining traction; Llama has reportedly been downloaded 1 billion times since its 2023 launch [6]
Baidu, once China's generative AI leader, is battling to regain its position
CNBC· 2025-03-18 06:12
Core Viewpoint
- Baidu is launching new AI models to regain its competitive edge in the AI market, focusing on reasoning capabilities and an open-source strategy [1][2][9]
Group 1: New AI Models
- Baidu has introduced its first reasoning-focused AI model, ERNIE X1, which it claims matches the performance of DeepSeek's R1 model at half the cost [4]
- The new models are part of Baidu's strategy to catch up with competitors that have already released advanced AI models [5][6]
Group 2: Competitive Landscape
- Baidu's Ernie chatbot has struggled to gain widespread adoption, falling behind competitors such as Alibaba and ByteDance [6][7]
- Experts indicate that Baidu's slow innovation pace and reliance on proprietary models have hindered its competitiveness [5][7][8]
Group 3: Shift in Strategy
- Baidu is shifting toward an open-source model strategy, a reversal of its previous proprietary approach [9]
- This shift is seen as a response to the success of open-source models from competitors such as DeepSeek, Alibaba, and Tencent [9]
Group 4: Advantages and Future Outlook
- Baidu maintains advantages in its extensive user base and popular applications, which can support its AI initiatives [11]
- The company possesses significant data resources, which its CEO highlights as crucial for AI development [12]