Core Insights
- Within roughly 128 days of its launch, DeepSeek has significantly disrupted the large-model market, driving down the prices of reasoning models; OpenAI's June update cut prices to 20% of the previous level [1][2]
- Usage of DeepSeek models on third-party platforms has grown nearly 20-fold since release, benefiting numerous cloud computing providers [3]
- However, the market share of DeepSeek's own website and API has been declining, failing to keep pace with the rapid growth of AI products in the first half of the year [4][6]
Group 1
- Usage of DeepSeek models on DeepSeek's own platform has decreased, with its share of generated tokens dropping to only 16% by May [10][11]
- Traffic to DeepSeek's web-based chatbot has also declined significantly, while other major models have seen traffic increase [13]
- DeepSeek's monthly active users fell from 614.7 million in February to 436.2 million in May, a 29% decrease [14]
Group 2
- DeepSeek's cost-cutting strategies have compromised service quality, resulting in longer wait times for users on its official platform [15][26]
- Other platforms, although more expensive, deliver much faster responses, with some achieving near-zero latency [16][18]
- DeepSeek's context window is limited to 64K tokens, among the smallest in the industry, which is inadequate for certain applications [22][23]
Group 3
- DeepSeek bundles user requests into large batches, which lowers the cost per token but increases each user's wait time (a toy model of this trade-off is sketched after this summary) [26]
- The company appears to prioritize internal research and development over user experience, focusing on achieving AGI rather than monetizing its services [27][28]
- Competition in AI depends heavily on computational resources, and DeepSeek's strategies reflect a focus on squeezing the most out of those resources [30]
Group 4
- Other model providers, such as Anthropic with Claude, are also adjusting output speeds to manage computational strain while trying to preserve user experience [31]
- Claude's output speed has dropped by 40% since the release of its latest version, yet it remains faster than DeepSeek [32]
- The industry is shifting toward increasing the intelligence delivered per token rather than simply raising overall output [36]
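The request-bundling point in Group 3 describes a standard inference-batching trade-off. The sketch below is a hypothetical toy model, not DeepSeek's serving code: the batch sizes, `fixed_cost_per_step`, and `arrival_rate_rps` values are illustrative assumptions used only to show how amortizing a roughly fixed per-step GPU cost over a larger batch lowers cost per token while per-request queueing delay grows.

```python
# Toy model (assumed parameters, not DeepSeek figures) of the batching trade-off:
# bigger batches amortize a fixed decoding-step cost over more requests, cutting
# cost per token, but each request waits longer for its batch to fill.

from dataclasses import dataclass


@dataclass
class BatchEstimate:
    batch_size: int
    cost_per_token: float  # relative cost units
    wait_time_s: float     # extra queueing delay per request, seconds


def estimate(batch_size: int,
             fixed_cost_per_step: float = 1.0,
             arrival_rate_rps: float = 4.0) -> BatchEstimate:
    """Assume each decoding step has a roughly fixed GPU cost shared by the
    whole batch, and a request waits on average for half a batch of later
    arrivals before the step can run."""
    cost_per_token = fixed_cost_per_step / batch_size
    wait_time_s = (batch_size / 2) / arrival_rate_rps
    return BatchEstimate(batch_size, cost_per_token, wait_time_s)


if __name__ == "__main__":
    for size in (1, 8, 32, 128):
        e = estimate(size)
        print(f"batch={e.batch_size:>4}  "
              f"cost/token={e.cost_per_token:.3f}  "
              f"wait~{e.wait_time_s:.1f}s")
```

Running the script shows cost per token falling roughly in proportion to batch size while the estimated wait rises linearly, which is the trade-off the summary attributes to DeepSeek's deployment choices.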
DeepSeek's cost-cutting secrets revealed: two tricks that squeeze inference deployment to the limit, keeping compute for internal AGI research
量子位 (QbitAI) · 2025-07-04 07:02