Deepseek V3/R1
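Mystery capital keeps adding to AI applications! Leading Software ETF (159899) posts intraday net inflows for the 7th consecutive day! Shiji Information (石基信息), Sifang Jingchuang (四方精创), and Laisi Information (莱斯信息) lead the decliners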
Sou Hu Cai Jing· 2026-01-20 07:00
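On the news front, policy is stepping up support for integrating the industrial internet with AI, pushing platforms toward "high quality and high efficiency." The Ministry of Industry and Information Technology issued the "Action Plan for Promoting the High-Quality Development of Industrial Internet Platforms (2026-2028)," which sets targets for 2028 of more than 450 influential platforms, more than 120 million connected industrial devices (sets), and a platform penetration rate above 55%. The plan works together with the "Action Plan for Empowering the Integration of the Industrial Internet and Artificial Intelligence" and the "Implementation Opinions on the 'AI + Manufacturing' Special Action" to form a combined policy push.

The market has recently come to expect that some major companies will release flagship large models before the Spring Festival. ByteDance, Deepseek, and others may release new versions of their models, further narrowing the performance gap with overseas competitors. The rally triggered by last year's release of Deepseek V3/R1 is still fresh in memory, and some investors hold similarly high expectations ahead of this year's Spring Festival.

The GEO concept remains hot, but it is only the tip of the iceberg for AI applications. GEO (Generative Engine Optimization) is an optimization strategy aimed at generative AI platforms (such as DeepSeek and Doubao). Its core goal is for a company's brand, products, or services to be mentioned first and recommended accurately in AI-generated answers. The emergence of GEO is merely an upgrade of AI search by large AI models. In fact, large AI models are a transformation of underlying productivity: their applications span media, gaming, office work, smart manufacturing, intelligent driving, and many other areas, and they may reshape the productivity core of still more scenarios.

Industry catalysts have been frequent recently. AI software commercial ...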
AI software commercialization accelerates; watch the Software ETF (515230) and the Computer ETF (512720)
Mei Ri Jing Ji Xin Wen· 2026-01-15 01:49
Group 1
- The software and computer sectors have shown significant growth, with the Software ETF (515230) rising by 4.38% and the Computer ETF (512720) increasing by 4.36% [1]
- The Ministry of Industry and Information Technology has issued an action plan for the high-quality development of industrial internet platforms, aiming for over 450 influential platforms and more than 120 million industrial equipment connections by 2028 [2]
- There is a strong market expectation for major companies to release flagship large models before the Spring Festival, with notable companies like ByteDance and Deepseek anticipated to narrow the performance gap with overseas counterparts [2]

Group 2
- The GEO (Generative Engine Optimization) concept is gaining traction, representing just the tip of the iceberg in AI applications, focusing on optimizing generative AI platforms for better brand and product visibility [2]
- The commercialization of AI software is accelerating, with AI Agents expected to be rapidly deployed by 2026, presenting investment opportunities in software ETFs and computer ETFs [3]
Technical Highlights and Deployment Practice of the SGLang Inference Engine | AICon Beijing Preview
AI前线· 2025-06-13 06:42
Core Insights
- SGLang has gained significant traction in the open-source community, reaching nearly 15K stars on GitHub and over 100,000 monthly downloads by June 2025 [1]
- Major industry players such as xAI, Microsoft Azure, NVIDIA, and AMD have adopted SGLang in their production environments [1]
- In May 2025, SGLang introduced a fully open-source large-scale expert parallel deployment solution, noted as the only one capable of replicating the performance and cost outlined in the official blog [1]

Technical Advantages
- SGLang's core advantages are a high-performance implementation and code that is easy to modify, which differentiates it from other open-source solutions [3]
- Key technologies such as PD (prefill-decode) separation, speculative decoding, and KV cache offloading have been developed to improve performance and resource utilization while reducing costs [4][6]

Community and Development
- The SGLang community plays a crucial role in driving technological evolution and application deployment, with experience from industrial deployments at a scale of over 100,000 GPUs guiding technical advancements [5]
- The open-source nature of SGLang encourages widespread participation and contribution, fostering a sense of community and accelerating application implementation [5]

Performance Optimization Techniques
- PD separation addresses the latency fluctuations caused when prefill work interrupts decoding, leading to more stable and uniform decoding delays [6]
- Speculative decoding reduces decoding latency by predicting multiple tokens at once, significantly increasing decoding speed [6]
- KV cache offloading stores previously computed KV caches on larger storage devices, reducing computation time and response delays in multi-turn dialogues [6]

Deployment Challenges
- Developers often overlook the importance of tuning the many configuration parameters, which can significantly affect deployment efficiency even when substantial computational resources are available [7]
- The complexity of parallel deployment technologies presents compatibility challenges, requiring careful management of resources and load balancing [4][7]

Future Directions
- As models grow in scale, high-performance, low-cost deployment requires more GPUs and efficient parallel strategies [7]
- The upcoming AICon event in Beijing will focus on AI technology advancements and industry applications, providing a platform for further exploration of these topics [8]
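
The speculative decoding idea mentioned under Performance Optimization Techniques can be illustrated with a small toy sketch. This is not SGLang's implementation or API: the names `speculative_decode`, `greedy_decode`, `toy_target`, and `toy_draft` are hypothetical, the "models" are plain Python functions over integer tokens, and only the greedy (exact-match) variant is shown. A cheap draft model proposes several tokens, and the target model keeps the longest prefix it agrees with, so the output matches what the target alone would have produced.

```python
from typing import Callable, List

# Assumption: a "model" is any function mapping a token prefix to the next token.
NextToken = Callable[[List[int]], int]


def greedy_decode(target: NextToken, prompt: List[int], max_new: int) -> List[int]:
    """Baseline: one target call per generated token."""
    out = list(prompt)
    for _ in range(max_new):
        out.append(target(out))
    return out


def speculative_decode(target: NextToken, draft: NextToken,
                       prompt: List[int], max_new: int, k: int = 4) -> List[int]:
    """Greedy speculative decoding: draft proposes up to k tokens, target verifies."""
    out = list(prompt)
    goal = len(prompt) + max_new
    while len(out) < goal:
        # 1) The draft model cheaply proposes a short run of tokens.
        n = min(k, goal - len(out))
        proposal: List[int] = []
        for _ in range(n):
            proposal.append(draft(out + proposal))

        # 2) The target checks each proposed position. A real engine would
        #    score all positions in one batched forward pass; this loop only
        #    mimics the accept/reject logic.
        accepted: List[int] = []
        all_ok = True
        for tok in proposal:
            expected = target(out + accepted)
            if expected == tok:
                accepted.append(tok)
            else:
                accepted.append(expected)  # replace the first mismatch
                all_ok = False
                break

        # 3) If every proposal was accepted, the verification pass also
        #    yields one extra target token when there is room for it.
        if all_ok and len(out) + len(accepted) < goal:
            accepted.append(target(out + accepted))
        out.extend(accepted)
    return out


# Toy models (hypothetical): the draft agrees with the target most of the time.
def toy_target(prefix: List[int]) -> int:
    return (sum(prefix) + 1) % 7


def toy_draft(prefix: List[int]) -> int:
    guess = toy_target(prefix)
    return guess if len(prefix) % 3 else (guess + 1) % 7


if __name__ == "__main__":
    spec = speculative_decode(toy_target, toy_draft, [1, 2], max_new=10, k=4)
    base = greedy_decode(toy_target, [1, 2], max_new=10)
    print(spec == base)
```

In this sketch the speedup would come from step 2 being a single batched target pass in a real system, so each accepted run of draft tokens costs roughly one target step instead of several.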