Core Insights
- ByteDance's Seed team has released Seed-OSS, an open-source model series. The move mirrors OpenAI's GPT-OSS strategy: it gives the open-source community a dedicated version rather than releasing the core commercial model, Doubao [2]
- Seed-OSS has a native 512K context window, four times the 128K window of mainstream open-source models, which lets it handle tasks that require processing large amounts of information [3][5]
- The model has 36 billion parameters and uses established techniques such as RoPE position encoding and GQA (grouped-query attention) [5][6]

Model Features
- Users can set a "Thinking Budget" to control how deeply the model reasons, so the reasoning effort can be matched to the complexity of the task [3]
- The thinking budget is best set to an integer multiple of 512 tokens, since the model was trained on these intervals [5]
- Two versions of the base model are provided: one trained with synthetic instruction data, for stronger performance, and one without it, for a purer base model [6]

Performance Metrics
- Seed-OSS-36B-Base scored 65.1 on the MMLU-Pro benchmark, ahead of comparable models such as Qwen2.5-32B-Base, which scored 58.5 [7]
- On reasoning, it scored 87.7 on the BBH benchmark, a new record for open-source models [7]
- On coding tasks, it scored 76.8 on HumanEval and 80.6 on MBPP [7]

Benchmark Comparisons
- Seed-OSS-36B-Base outperformed several competitors on benchmarks including MMLU-Pro, TriviaQA, and GSM8K, which cover knowledge and reasoning [8][9]
- The instruction-tuned version, Seed-OSS-36B-Instruct, scored 91.7 on the AIME24 math competition benchmark, second only to OpenAI's GPT-OSS-20B [9]

Development Background
- The ByteDance Seed team, established in 2023, aims to build advanced AI foundation models and has previously released several influential projects in specialized areas [10]
- Recent projects include Seed-Coder, a code generation model, and BAGEL, a multimodal model that can process text, images, and video [12]
- Seed-OSS adds a significant new entrant to China's domestic landscape of open-source base models [12]
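
As a rough illustration of the Thinking Budget and context-window figures in the summary, the sketch below rounds a requested budget up to a multiple of 512 tokens and checks whether a prompt plus that budget fits in the window. It assumes that the 512-token multiples apply to the budget, that "512K" and "128K" mean 512 × 1024 and 128 × 1024 tokens, and that budget and prompt share one window. All names are hypothetical and are not part of any Seed-OSS API.

```python
# Hypothetical helpers for illustration only; not taken from the Seed-OSS codebase.
BUDGET_STEP = 512                # budget granularity reported in the summary
CONTEXT_WINDOW = 512 * 1024      # "512K" read as 524,288 tokens (assumption)
MAINSTREAM_WINDOW = 128 * 1024   # "128K" read as 131,072 tokens (assumption)


def round_thinking_budget(requested: int, step: int = BUDGET_STEP) -> int:
    """Round a requested token budget up to the nearest multiple of `step`."""
    if requested <= 0:
        return 0  # treat non-positive requests as "no thinking" (assumption)
    return -(-requested // step) * step  # ceiling division, then scale back up


def fits_in_context(prompt_tokens: int, budget: int,
                    window: int = CONTEXT_WINDOW) -> bool:
    """Check that the prompt plus the reasoning budget stays within the window."""
    return prompt_tokens + budget <= window


if __name__ == "__main__":
    budget = round_thinking_budget(1000)
    print("rounded budget:", budget)
    print("window ratio:", CONTEXT_WINDOW // MAINSTREAM_WINDOW)
    print("fits:", fits_in_context(200_000, budget))
```

Rounding up rather than down is a design choice made here so the model never gets less reasoning room than requested; a caller who prefers a hard cap could round down instead.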
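ByteDance abruptly open-sources Seed-OSS: 512K context beats the mainstream with four times the length, and reasoning ability sets a new record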
36Kr · 2025-08-21 03:55