DeepSeek
'DeepSeek is only the beginning' for #China says professor #tech
Bloomberg Television· 2025-09-15 21:00
How should we look at the Chinese economy right now? Um, still tested. I'd say resilient in some ways if you look at the macro numbers, but still tested by deflationary pressures, real estate way down. But you know, what I found out this summer was that there's a real dichotomy between how strong high-tech is, how really strong high-tech is going forward — DeepSeek is really only the beginning — and how weak the micro-level economy is on consumption, all that. Like, you know, they're leaning off of policy bu ...
X @Bloomberg
Bloomberg· 2025-09-15 12:47
"DeepSeek is just the beginning." Economics professor and author Keyu Jin tells @flacqua that China is "going after a drastic cost-cutting innovation" for its economic path forward https://t.co/3e1Reko1kT https://t.co/WWhtv2aGnU ...
Luo Yonghao proposes a public livestreamed debate with Jia Guolong; Unitree named to MIT's "Smartest Companies" list
21st Century Business Herald, New Quality Productivity Research Institute, combined report. Good morning — a new day begins. What interesting things happened in the tech industry over the past 24 hours? Take a look with 21tech. [Giants' Weathervane] Luo Yonghao proposes a public livestreamed debate with Jia Guolong. On September 14, screenshots of remarks made by Xibei founder Jia Guolong in an industry group chat circulated. Jia said: "My way of responding was wrong; I'll change. Those who cook should revolve around those who eat — whatever you say goes." He also said: "Luo Yonghao is an online smear merchant, an online mafia, truly bad. But he woke me up, so in a roundabout way he helped Xibei improve." In the early hours of September 15, Luo Yonghao responded to Jia's group-chat remarks about him: "Mr. Jia, you call me an online mafia; I consider that slander and defamation. In this affair, with me saying a few words and you saying a few words, it's easy for us to talk past each other, and as media pass things back and forth, information easily gets distorted. Let's find a major online platform and have a livestreamed, face-to-face, fair, calm, and rational conversation. I believe this would also clarify the truth about Xibei and contribute something to the healthy development of China's pre-made food and catering industries. @Xibei Jia Guolong." DeepSeek, Unitree Robotics, and others named to MIT Technology Review's "Smartest Companies" list. On September 12, the latest results of MIT Technology Review's "50 Smartest Companies" selection were announced; star startups including DeepSeek and Unitree Robotics ...
OpenAI to cut Microsoft's revenue share to 8%, expecting $50 billion in gains; DeepSeek, Unitree Robotics, and others named "Smartest Companies" by MIT Technology Review | AIGC Daily
创业邦· 2025-09-15 00:08
Group 1
- Penske Media Corporation (PMC) has filed a lawsuit against Google, accusing the tech giant of illegally using its news content to generate AI summaries, resulting in decreased website traffic [2]
- DeepSeek and Unitree Robotics have been recognized as "smart companies" by MIT Technology Review, highlighting their innovative use of technology and understanding of market opportunities [2]
- A study from Turku University in Finland reveals that GPT-4V can assess social situations similarly to humans, which could enhance efficiency in brain science experiments and have applications in medical, security, and market analysis fields [2]

Group 2
- OpenAI plans to reduce its revenue-sharing ratio with Microsoft to 8%, which is expected to generate an additional $50 billion in revenue [2]
Large models hit genuinely hard problems: of 500 questions tested, o3 Pro passes only 15%
机器之心· 2025-09-14 03:07
Core Insights
- The article discusses the development of a new benchmark called UQ (Unsolved Questions) to evaluate the capabilities of large language models, focusing on unsolved problems that reflect real-world challenges [2][3][5]
- UQ consists of 500 challenging questions sourced from the Stack Exchange community, designed to assess reasoning, factual accuracy, and browsing capabilities of models [3][8]
- The study highlights the limitations of existing benchmarks, which often prioritize difficulty over real-world applicability, and proposes a continuous evaluation method through community validation [1][5]

Group 1
- UQ is a test set of 500 unsolved questions covering various topics, including computer science, mathematics, and history, aimed at evaluating model performance in a realistic context [3][8]
- The selection process for UQ involved multiple filtering stages, reducing an initial pool of approximately 3 million questions to 500 through rule-based, model-based, and manual reviews [10][11]
- The best-performing model in the UQ validation only succeeded in answering 15% of the questions, indicating the high difficulty level of the benchmark [5][7]

Group 2
- The UQ validation process employs a composite verification strategy that leverages the strengths of different models to assess candidate answers without requiring standard answers [14][26]
- The study found that using a composite validator significantly reduces self-bias and over-optimism in model evaluations, which is a common issue when models assess their own performance [24][25][26]
- Results showed that a stronger answer generation model does not necessarily correlate with better answer validation performance, highlighting the complexity of model capabilities [27][28]
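The composite verification strategy summarized above can be sketched in a few lines. This is a hypothetical illustration — the function names, toy validators, and acceptance rule are assumptions, not the UQ paper's implementation: several independent judges each vote on a candidate answer, and acceptance requires agreement, which damps any single model's self-bias.

```python
# Hypothetical sketch of composite answer validation (assumed names and
# logic, not the UQ paper's code): independent validators each judge a
# candidate answer; accepting only on agreement reduces any single
# judge's self-bias and over-optimism.

def composite_validate(question, answer, validators, require_all=True):
    """Each validator is a callable (question, answer) -> bool."""
    votes = [judge(question, answer) for judge in validators]
    return all(votes) if require_all else sum(votes) > len(votes) / 2

# Toy stand-ins for distinct judge models.
def demands_justification(q, a):
    return "because" in a.lower()

def demands_substance(q, a):
    return len(a.split()) >= 5

verdict = composite_validate(
    "Why is the sky blue?",
    "Because shorter wavelengths scatter far more in the atmosphere.",
    [demands_justification, demands_substance],
)
```

In practice each validator would be a separate strong model prompted as a judge; the point of the composition is that an answer generator cannot grade itself into a pass.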
Top teams from Tsinghua, Shanghai AI Lab, and others release a comprehensive survey of RL for reasoning models, exploring the road to superintelligence
机器之心· 2025-09-13 08:54
Core Insights
- The article emphasizes the significant role of Reinforcement Learning (RL) in enhancing the reasoning capabilities of large language models (LLMs), marking a pivotal shift in artificial intelligence development [2][5][16]
- It highlights the emergence of Large Reasoning Models (LRMs) that utilize RL to improve reasoning through verifiable rewards, showcasing advancements in complex tasks such as mathematics and programming [3][5][10]

Summary by Sections

Introduction
- The introduction outlines the historical context of RL since its inception in 1998 and its evolution into a crucial method for training intelligent agents to surpass human performance in complex environments [2]

Recent Trends
- A new trend is emerging where researchers aim to enhance models' reasoning abilities through RL, moving beyond mere compliance to actual reasoning skills [3][5]

Overview of RL in LRM
- The article reviews recent advancements in RL applied to LLMs, noting significant achievements in complex logical tasks, and identifies RL as a core method for evolving LLMs into LRMs [5][12]

Foundational Components
- The foundational components of RL for LRMs include reward design, policy optimization, and sampling strategies, which are essential for effective model training [13][14]

Foundational Problems
- Key challenges in RL for LRMs include the design of appropriate reward signals, efficient scaling under computational and data constraints, and ensuring reliability in practical applications [12][16]

Training Resources
- The article discusses the necessary training resources, including static corpora, dynamic environments, and RL infrastructure, emphasizing the need for standardization and development [13][15]

Applications
- RL has been applied across various tasks, including coding, agentic tasks, multimodal tasks, and robotics, showcasing its versatility and potential for broader applications [13][15]

Future Directions
- Future research directions for RL in LLMs include the development of new algorithms, mechanisms, and functionalities to further enhance reasoning capabilities and address existing challenges [15][16]
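The "verifiable rewards" idea at the heart of this line of work can be illustrated with a minimal sketch. The function name and answer format here are assumptions for illustration, not any specific paper's implementation: the reward is granted only when the model's final answer can be checked programmatically against ground truth, which is what makes tasks like math and programming natural fits for RL.

```python
import re

# Minimal sketch of a verifiable reward signal for RL on reasoning tasks
# (assumed answer format; illustrative only): grant reward 1.0 only when
# the last number in the model's completion matches the ground truth.

def verifiable_math_reward(completion: str, ground_truth: str) -> float:
    numbers = re.findall(r"-?\d+(?:\.\d+)?", completion)
    if not numbers:
        return 0.0  # no parsable final answer -> no reward
    return 1.0 if numbers[-1] == ground_truth else 0.0
```

Because the signal comes from a checker rather than a learned reward model, it cannot be flattered or gamed by plausible-sounding text — which is exactly the property that lets RL push models toward actual reasoning rather than compliance.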
How Baidu (BIDU) Is Positioning Its AI Against OpenAI, Google, and DeepSeek
Yahoo Finance· 2025-09-12 21:33
Core Insights
- Baidu, Inc. has released an updated version of its proprietary reasoning model, X1.1, which showcases capabilities comparable to advanced AI systems from competitors like DeepSeek, OpenAI, and Google [1][4]
- The X1.1 model has demonstrated a 34.8% improvement in knowledge accuracy and enhanced capabilities through a mixed reinforcement learning process [2]
- The closed-source X1.1 model is now accessible to corporate clients via Baidu's cloud platform, while individual users can access it through the Ernie Bot website and app [3]

Company Overview
- Baidu, Inc. is recognized as a leading Chinese internet giant and AI pioneer, with significant investments in artificial intelligence technology and a dominant position in the country's search engine market [4]
Wu Shichun: In 2025, AI reshapes everything
FOFWEEKLY· 2025-09-12 10:01
Core Viewpoint
- The year 2025 is seen as a watershed moment for the AI era, with a strong emphasis on the necessity of believing in trends to capitalize on opportunities, particularly in AI [3][11]

Investment Landscape
- Early-stage investment is crucial in the equity investment market, as it initiates entrepreneurial ventures [7]
- In the robotics sector, funding has surged, with the financing amount in the first eight months of this year exceeding the total for the previous year by 80% [4]
- The focus of capital has shifted from "technology stories" to "mass production capabilities," indicating a preference for commercial viability [4]

AI Trends and Opportunities
- The rise of DeepSeek is prompting a global reassessment of Chinese tech assets, marking 2025 as the true beginning of the AI era [6][10]
- AI is driving a transformation in the physical world, necessitating a redesign of all hardware, including toys, intelligent robots, drones, and autonomous vehicles [9][10]
- The "Artificial Intelligence +" strategy has been elevated to a national strategy, pushing for industrial upgrades [11]

Competitive Landscape
- To avoid the pitfalls of homogenized competition, companies must engage in differentiated competition, focusing on personalized demand-side strategies [15][16]
- The essence of "involution" is profit shrinkage due to homogeneous competition, necessitating a shift towards unique value propositions [15]

Entrepreneurial Strategies
- Entrepreneurs are encouraged to focus on niche markets and create unique value propositions rather than relying on low-cost competition [16]
- The importance of organizational capability is emphasized, with a need for companies to leverage AI to streamline processes and enhance collaboration [17]

Investment Directions
- The investment focus is on two main areas: AI agents' application fields and verticalized AI infrastructure [20]
- In the robotics sector, several innovative companies are being supported, including those specializing in humanoid robots and industrial automation [21]

Conclusion
- The entrepreneurial journey is challenging, and the goal is to assist aspiring entrepreneurs in becoming impactful leaders in the AI era [23]
Why has GPT-5 stopped "making things up"? OpenAI's new paper explains it thoroughly
腾讯研究院· 2025-09-12 08:58
Core Viewpoint
- The article discusses the advancements and challenges of OpenAI's GPT-5, particularly focusing on the significant reduction in hallucination rates compared to previous models, while also highlighting the underlying mechanisms and implications of these changes [5][6][25]

Group 1: Hallucination Rates and Mechanisms
- GPT-5 has a hallucination rate that is approximately 45% lower than GPT-4 and about 80% lower than OpenAI's earlier models [6]
- The reduction in hallucination rates is attributed to enhanced reinforcement learning techniques that allow models to refine their reasoning processes and recognize their errors [8][9]
- The paper published by OpenAI indicates that hallucinations are an inevitable byproduct of the statistical learning nature of language models, making it more challenging to generate reliable information than to assess its reliability [12][16]

Group 2: Theoretical Framework
- OpenAI introduces a theoretical "Is-It-Valid" (IIV) judgment mechanism that determines the validity of generated sentences based on their internal probabilities [13]
- The model's tendency to generate plausible-sounding but incorrect information is exacerbated by data sparsity, complexity, and noise in training data [14][16]
- The mathematical conclusion presented in the paper suggests that the error rate of generative models is at least double that of the IIV judgment errors, indicating a compounding effect of judgment mistakes on hallucinations [15][16]

Group 3: Post-Training Challenges
- Post-training processes have not effectively mitigated hallucinations, as current evaluation metrics tend to reward models for providing confident but potentially incorrect answers [18][24]
- The article critiques the binary scoring systems used in mainstream AI evaluations, which penalize uncertainty and discourage models from expressing "I don't know" [21][24]
- The reinforcement learning processes that utilize binary reward paths may inadvertently promote overconfidence in models, leading to increased hallucination rates [27][29]

Group 4: Future Directions and Solutions
- The article suggests that introducing a penalty-based scoring mechanism during post-training could help models better calibrate their confidence levels and reduce hallucinations [33]
- A shift from a score-optimization focus to a truth-oriented approach is proposed as a potential solution to the hallucination problem [34]
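The penalty-based scoring argument can be made concrete with a small expected-value calculation. This is an illustrative sketch under assumed scoring values, not OpenAI's actual metric: under binary scoring a wrong guess costs nothing, so guessing always dominates abstaining; once wrong answers carry a penalty, guessing only pays above a confidence threshold, which nudges models toward calibrated "I don't know" responses.

```python
# Illustrative sketch (assumed scoring values, not OpenAI's metric):
# compare the expected score of guessing vs. abstaining. With no penalty
# for wrong answers, even low-confidence guesses beat abstaining; with a
# penalty, guessing pays only when confidence clears a threshold.

def best_action(p_correct, wrong_penalty=0.0, abstain_score=0.0):
    """Return ('guess' or 'abstain', expected score of that action)."""
    guess_value = p_correct * 1.0 + (1 - p_correct) * (-wrong_penalty)
    if guess_value > abstain_score:
        return "guess", guess_value
    return "abstain", abstain_score
```

With `wrong_penalty=1.0`, the break-even point is 50% confidence; raising the penalty raises the bar, which is the calibration lever the article describes.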
Is your AI getting dumber? It's because it has learned to size up each request before deciding how much effort to serve
创业邦· 2025-09-12 03:14
Core Viewpoint
- The article discusses the perceived decline in the performance of AI models, particularly OpenAI's ChatGPT, highlighting a trend where AI models are designed to conserve resources by reducing their computational effort when possible [6][13][18]

Group 1: AI Model Performance
- OpenAI's ChatGPT was found to struggle with basic arithmetic, raising concerns about its current capabilities compared to earlier versions [6][7]
- The introduction of models like LongCat and DeepSeek indicates a shift in the industry towards efficiency, with these models employing mechanisms to optimize token usage and processing [10][15][24]

Group 2: Cost Efficiency and Token Management
- AI companies are implementing strategies to reduce token consumption, with OpenAI's GPT-5 reportedly saving 50%-80% in output tokens, which translates to significant cost savings for large organizations [13][18]
- The concept of a "perceptual router" has been introduced, allowing models to determine when to engage in complex processing versus simpler tasks, thereby enhancing efficiency [22][24]

Group 3: User Experience and Model Limitations
- The new routing mechanisms have led to instances where models fail to engage deeply with user prompts, resulting in a lack of nuanced responses [30][34]
- Users have expressed frustration over the perceived loss of control and depth in interactions with AI models, particularly with the introduction of a one-size-fits-all approach [29][30]
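The "perceptual router" idea can be sketched with simple heuristics. This is a hypothetical illustration — the marker list, thresholds, and tier names are all assumptions, not any vendor's routing logic: a cheap classifier decides whether a prompt needs the expensive reasoning model or can be answered by a fast, token-frugal one, which is also where the "one-size-fits-all" frustrations arise when the router guesses wrong.

```python
# Hypothetical sketch of a "perceptual router" (assumed markers,
# thresholds, and tier names; not any vendor's implementation): send a
# prompt to the expensive reasoning tier only when crude signals suggest
# it needs deep processing, otherwise use the cheap fast tier.

HARD_MARKERS = ("prove", "step by step", "derive", "debug", "why")

def route(prompt: str) -> str:
    """Return which model tier should handle the prompt."""
    text = prompt.lower()
    if len(text.split()) > 40 or any(m in text for m in HARD_MARKERS):
        return "reasoning-model"  # slow, token-hungry, deep
    return "fast-model"           # cheap, shallow, saves tokens
```

A real router would be a learned classifier rather than keyword matching, but the trade-off is the same: every misrouted hard prompt becomes a shallow answer that the user experiences as the model "getting dumber."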