Model hallucination - filings, earnings calls, financial reports, news - Reportify

Model hallucination

DeepSeek-R1 major update: hallucinations cut by nearly 50%, with deeper thinking and stronger reasoning

Founder Park· 2025-05-29 14:53

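"Whenever DeepSeek ships an update, we know another holiday is coming." Yesterday, DeepSeek announced a minor version upgrade to its R1 series of reasoning models. The latest version, DeepSeek-R1-0528, has 685 billion parameters, and its depth of thinking and reasoning ability have improved significantly. DeepSeek has just published R1-0528's scores on a range of benchmarks. R1-0528 performs strongly on benchmarks covering mathematics, programming, and general logic, with overall performance close to o3 and Gemini-2.5-Pro.

| Benchmarks | DeepSeek-R1-0528 | OpenAI-o3 | Gemini-2.5-Pro-0506 | Qwen3-235B | DeepSeek-R1 |
| --- | --- | --- | --- | --- | --- |
| AIME 2024 (math competition), pass@1 | 91.4 | 91.6 | 90.8 | 85.7 | 79.8 |
| AIME 2025 (math competition), pass@1 | 87.5 | 88.9 | 83.0 | 81.5 | 70.0 |
| GPQA Diamond (science test), pass@ ...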

Artificial Intelligence

Model hallucination

DeepSeek-R1-0528

DeepSeek-R1-0528-Qwen3-8B