"死了么" App announces name change; the three major U.S. stock indexes close lower across the board | 21 Morning News
Macro Economy
- The Ministry of Industry and Information Technology (MIIT) issued an action plan for the high-quality development of industrial internet platforms from 2026 to 2028, aiming for over 450 influential platforms and more than 120 million connected industrial devices by 2028, with a platform penetration rate exceeding 55% [2]
- MIIT held the 18th manufacturing enterprise symposium, emphasizing active participation in industry rule-making and self-regulation to promote a win-win environment and protect industry development [2]

Investment News
- On January 13, A-shares pulled back across the board, with the Shanghai Composite Index ending a 17-day winning streak to close down 0.64%. Combined turnover in Shanghai, Shenzhen, and Beijing reached 3.7 trillion yuan, a new record high [5]
- The Hang Seng Index rose 0.9% and the Hang Seng Technology Index 0.11%. Pharmaceutical stocks performed well, with WuXi AppTec up over 8% and WuXi Biologics up nearly 6%. Zhaoyi Innovation's H-shares gained as much as 53% on their first trading day, closing up 37.53% and valuing the company at 155.2 billion HKD [5]
- As of January 12, the ETF assets managed by Huaxia Fund officially surpassed 1 trillion RMB, reaching 1,016.424 billion RMB, making it the first domestic ETF manager to enter the "trillion club" [5]
- Approximately 130 A-share listed companies have disclosed 2025 performance forecasts, with around 70 expecting positive results, including profit increases and recoveries [5]
- A total of 186 listed companies have drawn concentrated research from public funds and brokerages, with 220 research instances, indicating strong interest in popular stocks such as Aipeng Medical and Entropy Technology [5]

Company Movements
- Guizhou Moutai is establishing a market-oriented dynamic adjustment mechanism for retail prices, aiming for a "market-following, relatively stable" self-operated system [6]
- The "死了么" app will officially adopt the global brand name Demumu in its upcoming new version [7]
- Alibaba Cloud has completed a further strategic investment in ZStack, taking a controlling stake, and plans to build a standardized, inclusive cloud-edge integrated solution [7]
- DeepSeek published a new paper on conditional memory for large language models, co-authored with Peking University [7]
- Huaxia Happiness expects a net loss of 16 billion to 24 billion RMB in 2025, with potential delisting risk warnings for its stock [8]
- Zhongwen Online anticipates a net loss of 580 million to 700 million RMB in 2025, as its overseas short-drama business remains in the investment phase [8]
- Luxshare Precision has terminated its acquisition of Wingtech's business asset package in India [8]
Microsoft AI Diffusion Report: How AI is being adopted worldwide
Microsoft· 2026-01-13 16:59
There are problems that you cannot solve without using AI, and that's where we see so much opportunity for the world. It goes back to the history of every general-purpose technology: the countries that grow the most over the course of years and decades are not necessarily the ones that produce it. It's the ones that figure out how to use it across the economy. That's what this report's about. What this dataset provides is a broad view of what is happening across platforms. And we have a full ...
Tencent Research Institute AI Express 20260114
Tencent Research Institute· 2026-01-13 16:29
Group 1
- Anthropic has launched an AI office tool called Cowork, designed to automate daily tasks such as document creation, planning, data analysis, and file organization [1]
- Cowork features proactive and autonomous capabilities, allowing it to create plans and sync progress in real time, and integrates with external information sources and Chrome [1]
- Cowork was developed in only a week and a half, with 100% of the code written by Claude Code, while preserving user control and the ability to halt operations at any time [1]

Group 2
- Apple has announced a partnership with Google to build the next generation of its foundation model on Gemini, which will also overhaul Siri [2]
- Apple's AI team has suffered significant talent loss, with dozens of core members leaving; collaboration with Google became a necessary choice given Gemini's 1.2 trillion parameters versus Apple's 150 billion [2]
- Google processes 13 trillion tokens monthly and Gemini has captured over 20% of the global market share, while Elon Musk criticized the concentration of power in the partnership [2]

Group 3
- DeepSeek has published a new paper proposing a conditional memory module called Engram, which complements MoE conditional computation and addresses Transformers' lack of native knowledge retrieval [3]
- Engram significantly outperforms pure MoE baselines, improving MMLU by 3.4, BBH by 5.0, and HumanEval by 3.0 points, while raising long-context retrieval accuracy from 84.2% to 97.0% [3]
- The outline of the upcoming DeepSeek V4 is becoming clearer, with conditional memory expected to be a core modeling primitive for the next generation of sparse large models [3]

Group 4
- OpenAI has acquired AI healthcare startup Torch for approximately $100 million, with $60 million paid upfront and the remainder as employee retention incentives [4]
- Torch integrates with healthcare systems such as Kaiser Permanente and Apple Health, providing unified access to lab results, prescriptions, and medical records, and uses AI for classification and health insights [4]
- Torch's founding team has joined OpenAI to develop the ChatGPT Health module, building on their previous experience with an online clinic platform [4]

Group 5
- Anthropic has launched HIPAA-compliant AI services for healthcare, enabling institutions and individuals to process protected health data while referencing authoritative databases [6]
- Claude can export personal health data from applications like Apple Health for aggregation and understanding, with a commitment not to use any medical user data for model training [6]
- Over 22,000 clinical service providers from Banner Health are using Claude, with 85% reporting increased work efficiency, and collaborations with major healthcare institutions are underway [6]

Group 6
- Baichuan has released the open-source medical model M3, achieving a top score of 65.1 on HealthBench and winning the Hard category with 44.4, surpassing GPT-5.2 [7]
- M3 introduces native end-to-end serious-inquiry capabilities following the SCAN principles, demonstrating inquiry abilities superior to the average human doctor [7]
- M3 employs a dynamic Verifier System and a new SPAR algorithm to address long-dialogue training issues, with applications already deployed for doctors and patients [7]

Group 7
- OpenAI is preparing a special audio product called "Sweetpea," designed to replace AirPods, with Foxconn mass production planned by Q4 2028 [8]
- The device, designed by Jony Ive's team, features a pebble-like metallic design with two capsule-shaped units worn behind the ear, with a focus on local AI processing [8]
- The product is expected to launch in September 2026, with estimated first-year shipments of 40-50 million units, letting users control functions via commands instead of an iPhone [8]

Group 8
- Meituan has introduced a new sparse attention mechanism called LoZA, replacing 50% of low-performance MLA modules with a streaming sparse attention structure [9]
- The new mechanism speeds up 128K-context decoding by 10x and 256K-context prefill by 50%, while reducing computational complexity to linear O(L·S) [9]
- LoZA can be adopted without retraining from scratch, with a design that balances local detail and overall logic within sparse windows [9]

Group 9
- MIT Technology Review has released its list of the top ten breakthrough technologies for 2026, including large-scale AI data centers, sodium-ion batteries, base editing, and advanced nuclear reactors [10][11]
- The report highlights the significant energy consumption of large-scale data centers and the successful application of sodium-ion batteries in specific vehicle models [11]
- It emphasizes a shift in AI development focus from "what can be done" to "what should be done," with ethical considerations becoming a central theme in the life sciences [11]

Group 10
- The CEO of the Fal platform revealed that generating a 5-second, 24-frame video consumes 12,000 times the compute of generating 200 tokens of text, with 4K resolution requiring ten times more [12]
- The platform supports over 600 generative media models, with top clients using an average of 14 different models simultaneously, indicating a trend toward scaled AI-generated content [12]
- The discussion suggests that as content generation becomes limitless, finite intellectual property will gain value, with education and personalized advertising identified as promising application areas [12]
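The digest does not spell out LoZA's internals, but the linear O(L·S) claim in Group 8 is easy to see with a generic sliding-window causal attention sketch: each of the L query positions attends to at most S recent keys, so cost grows as L·S rather than L². The function below is an illustrative NumPy sketch of that generic idea, not Meituan's implementation; `windowed_attention` and its `window` parameter are hypothetical names.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def windowed_attention(q, k, v, window=4):
    """Causal attention where each query attends only to the last
    `window` keys, so total cost is O(L * window) instead of O(L^2)."""
    L, d = q.shape
    out = np.zeros_like(v)
    for t in range(L):
        lo = max(0, t - window + 1)              # start of the sparse window
        scores = q[t] @ k[lo:t + 1].T / np.sqrt(d)
        out[t] = softmax(scores) @ v[lo:t + 1]   # weighted sum over <= window values
    return out

rng = np.random.default_rng(0)
L, d = 16, 8
q, k, v = rng.normal(size=(3, L, d))
y = windowed_attention(q, k, v, window=4)
print(y.shape)  # (16, 8)
```

Dense causal attention would materialize an L×L score matrix; here each decoding step touches at most `window` keys, which is what makes streaming decoding over 128K-256K contexts tractable regardless of total context length.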
The three major U.S. stock indexes close mixed; precious metals and semiconductor sectors strengthen
Sou Hu Cai Jing· 2026-01-13 14:52
[Intel (INTC) intraday chart, 2026-01-13 22:50: price with moving averages, roughly 41.34-46.78 (about ±6.17%), plus volume bars up to about 6.94 million]
The Philadelphia Semiconductor Index rose 0.9% in U.S. trading; A ...
DeepSeek's parent company brought in 5 billion last year, enough to burn through 2,380 R1s
36 Ke· 2026-01-13 13:02
Core Insights
- DeepSeek has not engaged in new financing or significant commercialization despite the buzz surrounding large-model players in the market [1]
- DeepSeek continues to produce high-quality research papers, indicating a stable output of academic contributions [2]
- The financial success of its parent company, Huanfang Quantitative, which earned approximately $7 billion last year, provides substantial funding for DeepSeek's research [6][8]

Group 1: Financial Performance
- Huanfang Quantitative's funds are posting impressive returns, with nearly all projected to yield over 55% in 2025 [3]
- The average return for quantitative funds in China last year was 30.5%, significantly outperforming global competitors [4]
- Huanfang Quantitative's assets under management exceed $70 billion, contributing to its substantial earnings [7]

Group 2: Research and Development
- DeepSeek's research expenditures are relatively low, with the latest V3 training costing $5.576 million and R1 costing $294,000, so the available funds could pay for a great many such models [6]
- DeepSeek has maintained a focus on AGI research without pressure for immediate financial returns, as it has accepted no external funding and is not tied to any major tech company [11][15]
- The company has consistently released significant research, including recent advances in OCR and V3.2, while open-sourcing components such as the memory module [9][10]

Group 3: Market Position and Strategy
- DeepSeek operates with a unique business model that lets it focus solely on AGI without the distraction of monetization pressure [10][12]
- The company benefits from a stable, committed research team, with minimal turnover and even some returning members, indicating a strong internal culture [28][30]
- DeepSeek's research outputs have become valuable to investors, as its technical papers provide insights that move the stocks of related hardware companies [34][39]

Group 4: Competitive Landscape
- Compared with major players like OpenAI, DeepSeek is characterized by the absence of aggressive monetization strategies, focusing instead on pure research [26][9]
- The market often underestimates the ability to cross-subsidize AI research with a mature existing business [19][20]
- DeepSeek's model combines the strengths of established companies and pure AI startups, positioning it uniquely in the competitive landscape [26]
Liang Wenfeng signs DeepSeek's latest paper, proposing a new method to break through GPU memory limits
Xin Lang Cai Jing· 2026-01-13 12:33
Core Viewpoint
- DeepSeek, a Chinese AI startup, has developed a new model training technique that bypasses GPU memory limitations, enhancing cost efficiency and performance in AI model training [1][3]

Group 1: Technology and Innovation
- DeepSeek and researchers from Peking University introduced a "conditional memory" technique called "Engram" to address the limits that high-bandwidth memory (HBM) places on scaling AI models [3][4]
- The Engram technology enables more efficient retrieval of foundational information by decoupling computation from storage, improving the model's performance on long contexts [4][6]
- In a model with 27 billion parameters, the new technique improved performance on key industry benchmarks by several percentage points while preserving capacity for complex reasoning tasks [4][6]

Group 2: Competitive Landscape
- The HBM gap between China and the US is significant, with Chinese storage-chip manufacturers lagging behind their US and South Korean counterparts [4]
- DeepSeek's previous model, DeepSeek-R1, was trained in two months at a cost of $5.5 million, far below the spending of US companies like OpenAI, while achieving comparable performance [6][7]
- Microsoft President Brad Smith highlighted that Chinese companies like DeepSeek are rapidly gaining ground in the global AI market, particularly in emerging markets, thanks to their low-cost open-source models [7]

Group 3: Future Developments
- Anticipation is building for DeepSeek's upcoming V4 model, expected to launch in mid-February and said to possess strong programming capabilities [7]
Liang Wenfeng signs new DeepSeek paper "breaking through GPU memory limits"
Guan Cha Zhe Wang· 2026-01-13 12:28
Core Insights
- DeepSeek, a Chinese AI startup, has published a technical paper introducing a new model training technique that bypasses GPU memory limitations, underscoring its focus on cost efficiency despite existing gaps with leading US firms [1][2]
- The new technique, termed "Engram," addresses the bottleneck of limited high-bandwidth memory (HBM) in scaling AI models, a significant gap between China and the US in AI hardware [3][4]
- The paper has drawn attention from industry professionals in both China and the US, reflecting DeepSeek's role as an AI innovation leader over the past year [1][2]

Technical Developments
- The paper, titled "Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models," presents the "conditional memory" technology, aimed at improving the efficiency of AI models when processing long contexts, a major challenge for AI chatbots [2][3]
- The Engram technique decouples computation from storage, enhancing the model's ability to retrieve foundational information efficiently [3][4]
- The technology was validated on a model with 27 billion parameters, showing performance improvements on key industry benchmarks [3]

Market Position and Competition
- DeepSeek's previous model, DeepSeek-R1, was trained in two months at a cost of $5.5 million, significantly lower than competitors like OpenAI, while achieving comparable performance [6][7]
- Microsoft President Brad Smith has noted that US AI companies are being surpassed by Chinese competitors like DeepSeek, particularly in emerging markets, due to the low cost and ease of use of Chinese open-source models [7]
- Anticipation is building for DeepSeek's upcoming V4 model, expected to launch in mid-February and said to possess strong programming capabilities [8]
The eve of DeepSeek V4? New paper signed by Liang Wenfeng released
Hua Er Jie Jian Wen· 2026-01-13 11:01
Core Viewpoint
- The article discusses a groundbreaking paper by DeepSeek and Peking University that introduces a new module called Engram, which separates memory from computation in AI models, yielding a significant increase in reasoning capability [3][12]

Group 1: Introduction of the Engram Module
- DeepSeek's Engram module represents a supply-side reform of AI model architecture: static knowledge is stored separately from computational tasks, enhancing the AI's reasoning ability [3][14]
- Engram is inspired by the classic N-gram concept from natural language processing, modernized to allow efficient retrieval of static knowledge with O(1) time complexity [15][16]

Group 2: Technical Innovations
- Engram uses a large, scalable embedding table to store static knowledge for direct retrieval without complex computation, in contrast to traditional Transformer models, where knowledge is embedded in the weights [18]
- Three technical barriers were addressed:
  - Vocabulary compression reduced the effective vocabulary size by 23% through normalization of semantically similar terms [19]
  - Multi-head hashing resolves hash collisions when many N-grams map to limited memory slots, improving robustness [20]
  - Context-aware gating acts as a referee, filtering out static knowledge irrelevant to the current context [21][22]

Group 3: Resource Allocation and Model Performance
- A large-scale ablation study revealed a U-shaped scaling law for resource allocation: loss is minimized when roughly 75%-80% of parameters go to Engram and 20%-25% to MoE [30][31]
- Engram not only improved knowledge tasks but also unexpectedly enhanced performance in logic, coding, and mathematics, with significant score increases across various benchmarks [39][40]

Group 4: Engineering Breakthroughs
- Engram's separation of memory and computation lets large models offload memory to cheaper, scalable CPU resources, reducing reliance on expensive GPU memory [46][49]
- The separation also allows memory data to be prefetched, maintaining high throughput even at large parameter counts, a significant advantage for future AI model development [51][52]

Group 5: Future Implications
- The upcoming DeepSeek V4 model is expected to integrate Engram technology, balancing computation and memory to enhance both knowledge capacity and reasoning while reducing inference costs [61][64]
- The paper signals an industry shift toward architectural innovation, away from merely increasing computational power and parameters, redefining competitive standards in AI development [65]
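The mechanisms the article describes (O(1) N-gram lookup into an embedding table, multi-head hashing to soften collisions, and a context-aware gate) can be illustrated with a toy sketch. This is a minimal illustration under stated assumptions, not the paper's implementation: the class name `ToyEngram`, the table sizes, the sigmoid gate, and the additive blend into the hidden state are all hypothetical choices made for exposition.

```python
import numpy as np

class ToyEngram:
    """Toy conditional-memory lookup: recent token n-grams are hashed
    (with several independent hash heads to soften collisions) into a
    large embedding table, retrieved in O(1), and blended into the
    hidden state through a context-dependent sigmoid gate."""

    def __init__(self, slots=1024, dim=16, heads=3, n=2, seed=0):
        rng = np.random.default_rng(seed)
        self.table = rng.normal(0, 0.02, size=(heads, slots, dim))
        self.gate_w = rng.normal(0, 0.1, size=dim)
        self.slots, self.heads, self.n = slots, heads, n

    def _slot(self, ngram, head):
        # Each head uses a different hash of the same n-gram,
        # so a collision in one head rarely repeats in the others.
        return hash((head,) + tuple(ngram)) % self.slots

    def __call__(self, token_ids, hidden):
        # hidden: (L, dim) activations from the backbone model
        out = hidden.copy()
        for t in range(self.n - 1, len(token_ids)):
            ngram = token_ids[t - self.n + 1 : t + 1]
            mem = sum(self.table[h, self._slot(ngram, h)]
                      for h in range(self.heads))
            gate = 1 / (1 + np.exp(-hidden[t] @ self.gate_w))  # in (0, 1)
            out[t] = hidden[t] + gate * mem   # gated additive blend
        return out

engram = ToyEngram()
tokens = [5, 9, 5, 9, 7]            # the bigram (5, 9) occurs twice
hidden = np.zeros((5, 16))
y = engram(tokens, hidden)
```

Identical n-grams always hit the same slots, so retrieval cost is constant per token regardless of table size, and the table itself could live in cheaper CPU memory and be prefetched ahead of the compute stream, which is the memory/computation separation the article attributes to Engram.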
Microsoft president's hype: U.S. companies are being overtaken by Chinese competitors in the contest for AI users outside the West
Guan Cha Zhe Wang· 2026-01-13 10:51
Core Insights
- Microsoft warns that U.S. AI companies are being surpassed by Chinese competitors, particularly in emerging markets, due to the advantages of low-cost open-source models [1][4]
- The research indicates that DeepSeek's R1 model has significantly accelerated AI adoption in Global South countries, shifting market share toward China [1][5]

Group 1: Competitive Landscape
- Microsoft President Brad Smith highlights that China now has multiple competitive open-source models, in contrast to U.S. companies that maintain strict control over their advanced technologies [1][4]
- DeepSeek has captured significant market share in Africa, with 18% in Ethiopia and 17% in Zimbabwe, showing rapid growth in these regions [3][4]
- In countries where U.S. tech products are restricted, DeepSeek holds even larger shares, such as 56% in Belarus and 49% in Cuba [4]

Group 2: Market Dynamics
- AI use remains concentrated in developed countries: 25% of the population in Global North countries uses AI, versus 14% in the Global South [4]
- Smith warns that the widening AI gap could exacerbate economic disparities between the Global North and South [4]
- He calls for increased investment from international development banks and financial institutions to build data centers and subsidize electricity costs in Africa [3][4]

Group 3: Industry Reactions
- OpenAI CEO Sam Altman acknowledges the competitive threat posed by DeepSeek and admits that OpenAI's closed strategy may have flaws [6]
- Altman calls DeepSeek's latest model "very good," signaling a potential shift in OpenAI's approach to open-source AI software [6]
Microsoft grows anxious: China leads in markets outside the West
Guan Cha Zhe Wang· 2026-01-13 10:30
Core Insights
- Microsoft warns that U.S. AI companies are being surpassed by Chinese competitors in the race for users outside the West, with China's low-cost open-source models a significant advantage [1][2]
- Microsoft's research indicates that DeepSeek's R1 model has accelerated AI adoption in emerging markets, particularly the Global South, allowing China to surpass the U.S. in global market share for open-source AI models [1][2]
- Competition is intensifying, with DeepSeek achieving significant market shares in countries like Ethiopia (18%) and Zimbabwe (17%) [3]

Group 1
- Microsoft President Brad Smith emphasizes the need for international investment in African data centers to compete with heavily subsidized Chinese firms [3][4]
- DeepSeek has gained substantial market share in countries under U.S. sanctions, such as Belarus (56%), Cuba (49%), and Russia (43%) [4]
- AI use is currently concentrated in developed countries, with only 14% of the population in Global South countries using AI, compared with nearly a quarter in the Global North [4][5]

Group 2
- Smith warns that neglecting regions like Africa could lead to the emergence of AI systems that do not align with democratic values [5]
- DeepSeek's R1 model was trained at a cost of $5.5 million, significantly lower than the expenses incurred by U.S. companies like OpenAI [5][6]
- OpenAI CEO Sam Altman acknowledges potential flaws in the company's closed strategy and hints at a possible shift toward more open models in response to competition from DeepSeek [6]