DeepSeek's Chen Deli: In this round of the AI revolution, we are still in the first half | Live from Wuzhen
Xin Lang Ke Ji· 2025-11-07 09:36
Core Insights
- The discussion at the 2025 World Internet Conference highlighted the limitations of current AI technologies, particularly in generalizing across domains and performing simple tasks effectively [1]
- The concept of "jagged intelligence" was introduced: AI can excel in complex areas yet still struggle with tasks that humans find straightforward [1]
- Speakers emphasized the need for AI to develop stable generalization learning algorithms and to establish more connections with the real world, suggesting a path toward more human-like learning [1]

Group 1
- The leaders of the "Hangzhou Six Little Dragons" gathered for a dialogue at the conference, indicating a collaborative effort among key players in the tech industry [1]
- Chen Deli, a senior researcher at DeepSeek, pointed out that AI currently falls short of human learning in self-iteration and evolution [1]
- The discussion covered the importance of multimodal and embodied intelligence for improving AI's learning in real-world environments [1]

Group 2
- Looking ahead 10 to 20 years, there is optimism about achieving Artificial General Intelligence (AGI), as technological advances often accelerate over time [2]
- ChatGPT's rapid improvement at solving mathematical problems was cited as an example of how quickly AI capabilities can break through [2]
- The overall sentiment is that the AI revolution is still in its early stages, with further advances expected [2]
DeepSeek's Chen Deli: AI transformation brings many short-term opportunities, but the long-term risks may be greater
Xin Lang Ke Ji· 2025-11-07 08:51
Core Viewpoint
- The discussion at the 2025 World Internet Conference highlighted the dual nature of AI development, which presents both short-term opportunities and long-term risks [1][2].

Group 1: Short-term Perspective (3-5 years)
- In the short term, AI is expected to enhance human capabilities, allowing people to solve more complex problems and create greater value [1].
- This period is characterized as a "honeymoon phase" in which AI cannot yet realize its value independently [1].

Group 2: Mid-term Perspective (5-10 years)
- In the mid-term, AI is expected to start replacing certain human jobs, increasing the risk of unemployment [1].
- Technology companies are urged to act as whistleblowers, informing the public about which jobs are irreplaceable and which skills will be needed in the future [1].

Group 3: Long-term Perspective (10-20 years)
- Over the long term, AI may pose significant dangers by replacing a large number of jobs and challenging the existing social order [2].
- Technology companies are expected to help safeguard human safety and take part in restructuring the social order [2].
- The current AI revolution is said to differ significantly from past industrial revolutions, as it may have a more profound impact on human institutions and roles [2].
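DeepSeek-OCR replicated in two weeks: a two-person team reproduces the low-token, high-compression core, and a swapped-in decoder makes it more practical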
3 6 Ke· 2025-11-07 07:11
Core Insights
- A two-person team replicated the acclaimed DeepSeek-OCR in just two weeks. Their version, DeepOCR, retains the original's low token count and high compression while matching its performance on key tasks [1][3].

Technology and Design
- DeepSeek-OCR's design philosophy centers on "visual compression": a small number of visual tokens represents content that would otherwise require a large number of text tokens, which reduces computational cost and eases the difficulty large models have with long texts [3][4].
- The team's core strategy was to replicate the original's logical architecture accurately, particularly the DeepEncoder, which follows a three-stage structure: local processing, compression, and global understanding [6][9].
- The first stage splits high-resolution images into patches and processes them locally, keeping activation memory in check. The compression stage then reduces the number of tokens while increasing feature dimensions. Finally, a global attention stage captures document semantics without causing memory issues [6][7].

Performance Metrics
- DeepOCR uses approximately 250 visual tokens. This is slightly less efficient than the DeepSeek-OCR Base version but significantly more efficient than baseline models such as Qwen2.5-VL-7B, which require 3,949 tokens for similar performance [15].
- In foundational tasks, DeepOCR excels at English text recognition and table parsing. Its table parsing even outperforms the original version, which is attributed to precise replication of the original's 2D spatial encoding [15][17].

Training Methodology
- The team used a two-stage training process and kept the DeepEncoder frozen throughout, which significantly reduced memory requirements. The first stage trained a multi-modal projector, while the second stage involved pre-training the entire model [13][18].
- The training setup was designed to fit the resources of small to medium teams, using two H200 GPUs [13].

Future Developments
- The team plans to add further data types such as formulas, multilingual text, and old scans, and to experiment with techniques like dynamic temperature scaling and RLVR to narrow the remaining performance gaps [18].

Team Background
- The team consists of Ming Liu, who has a background in applied physics and is pursuing a PhD in computer science, and Liu Shilong, who holds degrees in engineering and computer science and is a postdoctoral researcher at Princeton University [19][20].
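
The token figures in the summary above can be put side by side with a little arithmetic. The sketch below is illustrative only: the 250 and 3,949 token counts come from the summary, while the `visual_tokens` helper and its settings (1024-pixel input, 16-pixel patches, 16x compression) are assumptions chosen to show how a patch-then-compress encoder could land near 250 tokens. They are not confirmed by the article.

```python
# Token counts quoted in the summary
DEEPOCR_TOKENS = 250
QWEN25_VL_7B_TOKENS = 3949


def visual_tokens(image_side: int, patch_side: int, compression: int) -> int:
    """Tokens left after patching a square image and compressing the patch sequence.

    Hypothetical helper for illustration; not taken from DeepOCR's code.
    """
    patches = (image_side // patch_side) ** 2  # local stage: one token per patch
    return patches // compression              # compression stage: fewer tokens


# How many times more visual tokens the baseline needs
ratio = QWEN25_VL_7B_TOKENS / DEEPOCR_TOKENS
print(f"baseline uses about {ratio:.1f}x more visual tokens")

# Assumed settings (not from the article): 1024 px input, 16 px patches, 16x compression
print(visual_tokens(1024, 16, 16))
```

Under these assumed settings the helper yields a token count in the same range as the roughly 250 tokens the summary reports.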
Five AI-native apps surpass 10 million monthly active users as ByteDance, Tencent, DeepSeek, and Ant Group all make their moves
Jing Ji Guan Cha Wang· 2025-11-05 06:38
Core Insights
- Ant Group's AI health application AQ surpassed 10 million monthly active users (MAU) within four months of launch, making it the fifth AI-native app in China to reach this milestone [1]
- The five AI applications with over 10 million MAU come from major companies including ByteDance, Tencent, DeepSeek, and Ant Group, indicating a competitive AI application market [1]
- AQ is the only professional-grade AI application among the top five and shows significant growth potential, with a compound annual growth rate (CAGR) of 83.4%, far above the industry average of 13.5% [1]

Company Performance
- AQ's rapid user growth positions it as a leading player in the "AI + healthcare" sector and as a strong contender in the AI application market in 2025 [1]
- Among the top five applications, Doubao, DeepSeek, and Yuanbao form a stable "tripod", although Doubao has recently overtaken DeepSeek in both downloads and MAU [1]
- ByteDance's launch of Jimeng AI, an all-in-one AI creation platform, further enriches its content creation ecosystem and indicates ongoing innovation in the sector [1]
First live AI investment competition: Alibaba's Qwen wins with a 20% return, DeepSeek places second, and all four major US models lose money
Guan Cha Zhe Wang· 2025-11-04 14:57
Core Insights
- The AI investment competition "Alpha Arena" concluded with Alibaba's Qwen3-Max winning with a return of over 20% [1][4]
- DeepSeek v3.1 took second place, a strong showing for Chinese models, while all four leading American models reported losses, with GPT-5 losing more than 60% [1][6]

Competition Overview
- The competition lasted 17 days and involved six top AI models: Qwen3-Max, DeepSeek v3.1, GPT-5, Gemini 2.5 Pro, Claude Sonnet 4.5, and Grok 4. Each model started with $10,000 [1][2]
- All models received the same market data and prompts to ensure fairness and transparency, and real-time trading records and account values were publicly available [2]

Performance Analysis
- DeepSeek v3.1 led the competition initially, but a significant downturn on October 21-22 pushed Grok 4 and Claude Sonnet 4.5 from profit to loss [2][4]
- Qwen3-Max and DeepSeek v3.1 adapted their investment strategies during this downturn, which allowed them to recover and outperform the other models [4]

Final Rankings
- Qwen3-Max finished with an account value of $12,232 and a return of +22.32%, while DeepSeek v3.1 finished with an account value of $10,489 and a return of +4.89% [8]
- The American models (Claude Sonnet 4.5, Grok 4, Gemini 2.5 Pro, and GPT-5) reported significant losses, with GPT-5 at -62.66% [8]

Industry Implications
- The success of Chinese models such as Qwen and DeepSeek highlights their potential in real-world applications and the importance of evaluating AI in practical scenarios [14]
- The competition reflects a growing trend in the AI industry toward open-source models, which are seen as crucial for fostering innovation and competitiveness in the global AI landscape [14]
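
The reported returns follow directly from the starting capital and the final account values. A minimal sketch of that arithmetic is below; the function names are illustrative, and the ending value computed for GPT-5 is derived here from its reported return rather than quoted from the article.

```python
def return_pct(start_value: float, end_value: float) -> float:
    """Simple return over the whole period, in percent."""
    return (end_value - start_value) / start_value * 100


def end_value(start_value: float, pct: float) -> float:
    """Account value implied by a percentage return."""
    return start_value * (1 + pct / 100)


START = 10_000  # starting capital per model, as reported

# Final account values reported for the two profitable models
print(round(return_pct(START, 12_232), 2))  # Qwen3-Max
print(round(return_pct(START, 10_489), 2))  # DeepSeek v3.1

# Derived, not quoted: account value implied by GPT-5's reported -62.66% return
print(round(end_value(START, -62.66), 2))
```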
Investment competition: Alibaba's Qwen and DeepSeek made money, GPT-5 lost heavily
Nan Fang Du Shi Bao· 2025-11-04 13:41
Core Insights
- The first AI large-model trading competition, initiated by the American AI research lab nof1, has concluded. Six leading models traded autonomously on market data without human intervention [1][5][7]
- Two Chinese models, Alibaba's Qwen3 Max and DeepSeek Chat V3.1, achieved positive returns, with Qwen3 Max leading at a return of 22.3% and a profit of $2,232 [1][2][3]

Performance Summary
- Qwen3 Max achieved a return of 22.3%, with an account value of $12,232 and a win rate of 30.2% [3]
- DeepSeek Chat V3.1 achieved a return of 4.89%, with an account value of $10,489 and a win rate of 24.4% [3]
- The other models (Claude Sonnet 4.5, Grok 4, Gemini 2.5 Pro, and GPT 5) suffered significant losses, with GPT 5 losing 62.66% [2][3]

Trading Dynamics
- The competition involved trading cryptocurrency derivatives, including Bitcoin, Ethereum, and Dogecoin, with each model starting with $10,000 [5]
- Models were required to process quantitative data and execute trades without access to news or market information [5]
- Qwen3 Max maintained the largest position size throughout the competition, while Grok 4 had the longest holding period [6]

Model Behavior
- Grok 4, GPT-5, and Gemini 2.5 Pro sold short more frequently than the others, while Claude Sonnet 4.5 rarely did [6]
- Qwen3 Max had the narrowest stop-loss and take-profit distances, indicating a more conservative exit strategy [6]
- The competition highlighted the need to test models dynamically in real market conditions rather than rely on static benchmark tests [7]
First AI trading competition ends after six AIs traded crypto for two weeks: Qwen and DeepSeek profit, GPT-5 loses about $6,000
3 6 Ke· 2025-11-04 11:13
Core Insights
- The inaugural Nof1 AI Model Trading Competition has concluded. It was designed to measure AI investment capabilities and has been likened to a "Turing test" for the crypto space [1]
- Six AI models participated, representing the latest technology from both Chinese and American developers, with Qwen3 Max emerging as the top performer [1][12]

Competition Overview
- The competition ran from October 17 to November 3, 2025, with each model starting with $10,000 in initial capital [1]
- Trading was conducted on Hyperliquid and focused on six popular cryptocurrencies: BTC, ETH, SOL, BNB, DOGE, and XRP [3]
- Trading actions were limited to buying, selling, holding, or closing positions, with a focus on mid-frequency trading [3]

Performance Results
- Qwen3 Max ranked first with a return of 22.3%, a total profit of $2,232, and a win rate of 30.2% over 43 trades [2][5]
- DeepSeek Chat V3.1 placed second with a return of 4.89%, a total profit of $489.08, and a win rate of 24.4% over 41 trades [2][5]
- The other models (Claude Sonnet 4.5, Grok 4, Gemini 2.5 Pro, and GPT-5) suffered significant losses, with GPT-5 performing worst at -62.66% [4][11]

Model Characteristics
- Qwen3 Max showed an aggressive trading style with a high return and significant trading frequency, reflected in its Sharpe ratio of 0.273 [9]
- DeepSeek Chat V3.1 took a more conservative approach and posted a higher Sharpe ratio of 0.359, indicating better risk management [9]
- Claude Sonnet 4.5 and Grok 4 followed cautious strategies but suffered from low win rates and high losses [10]
- Gemini 2.5 Pro and GPT-5 traded very actively but performed poorly, indicating ineffective strategies [11]

Industry Implications
- The competition has drawn significant attention, with industry figures such as Binance's founder commenting on the potential impact of AI trading strategies on market dynamics [7]
- The results suggest that AI models from China, particularly Qwen3 Max and DeepSeek, are currently outperforming their American counterparts in risk control and trend identification [12]
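
The win rates and Sharpe ratios quoted above can be sanity-checked with a short sketch. The win counts below are inferred from the reported win rates and trade counts, not quoted from the article. The Sharpe helper assumes per-period returns and a zero risk-free rate; the summary does not state how the organizers computed their figures, so the sample returns are purely hypothetical.

```python
from statistics import mean, stdev


def win_rate_pct(wins: int, trades: int) -> float:
    """Share of winning trades, in percent."""
    return wins / trades * 100


def sharpe_ratio(returns: list[float], risk_free: float = 0.0) -> float:
    """Mean excess return divided by the sample standard deviation of returns."""
    excess = [r - risk_free for r in returns]
    return mean(excess) / stdev(excess)


# Inferred win counts: 30.2% of 43 trades and 24.4% of 41 trades
qwen_wins = round(0.302 * 43)
deepseek_wins = round(0.244 * 41)
print(qwen_wins, round(win_rate_pct(qwen_wins, 43), 1))
print(deepseek_wins, round(win_rate_pct(deepseek_wins, 41), 1))

# Hypothetical per-period returns, only to show the calculation
sample_returns = [0.01, 0.03, 0.02]
print(sharpe_ratio(sample_returns))
```

A higher Sharpe ratio with a lower raw return, as reported for DeepSeek Chat V3.1 relative to Qwen3 Max, is what this formula produces when returns are smaller but steadier.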
AI traders in a volatile stock market: is DeepSeek calm and fully at ease? HKU open-source project goes viral with 8k stars in one week
Xin Lang Cai Jing· 2025-11-04 09:15
Core Insights
- The article discusses the launch of the AI-Trader project by a team led by Professor Huang Chao of the University of Hong Kong, which tests AI trading capabilities in a volatile market environment [3][4][19]
- Six AI models trade Nasdaq 100 stocks, each starting with $10,000, and the project reports their performance over a month of real trading [4][5]

Performance Summary
- Performance varied across models, with DeepSeek-Chat-V3.1 leading at +13.89%, followed by MiniMax-M2 at +10.72% and Claude-3.7-Sonnet at +7.12% [5][6]
- By comparison, the Nasdaq 100 ETF (QQQ) rose only 2.30% over the same period, which underlines how well the leading models performed [5]

Behavioral Finance Experiment
- The experiment doubles as a behavioral finance study that tests three key capabilities of AI systems: trading discipline, market patience, and information filtering [6][19]
- The results illustrate differences in algorithmic architecture and decision-making frameworks among the models, which mirror typical human investor behaviors [7][18]

Individual AI Strategies
- **DeepSeek-Chat-V3.1**: Used a contrarian strategy, adding to positions in NVDA and MSFT during market downturns, and achieved a +13.89% return [8]
- **MiniMax-M2**: Maintained a balanced portfolio with low turnover and returned +10.72%, demonstrating the importance of consistency in high-volatility environments [9]
- **Claude-3.7-Sonnet**: Focused on long-term value investing, holding positions in major tech stocks despite market fluctuations, and returned +7.12% [10]
- **GPT-5**: Attempted dynamic rebalancing but faced timing issues, returning +7.11% [11]
- **Qwen3-Max**: Adopted a wait-and-see approach, which led to a lower return of +3.44% due to missed opportunities [12]
- **Gemini-2.5-Flash**: Engaged in high-frequency trading but returned -0.54% due to overtrading and emotional decision-making [13]

Insights on AI Trading
- The experiment showed that effective trading is not only about acting but also about knowing when to refrain from trading, as the success of DeepSeek and MiniMax demonstrates [14][19]
- The findings suggest that AI can provide valuable insight into investment decision-making, with an emphasis on managing uncertainty rather than predicting the market perfectly [19]

Future Implications
- The AI-Trader project indicates a shift in Chinese AI technology from conversational capabilities to practical task execution, showing potential in complex financial decision-making [19]
- Financial trading serves as an ideal testing ground for AI decision-making, with future applications anticipated in sectors such as supply chain optimization and urban management [19]
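
The comparison against the QQQ benchmark can be made explicit by computing each model's excess return. The sketch below uses only the return figures quoted in the summary; the variable names are illustrative, and the excess returns are simple differences in percentage points rather than risk-adjusted measures.

```python
# Reported one-month returns in percent
model_returns = {
    "DeepSeek-Chat-V3.1": 13.89,
    "MiniMax-M2": 10.72,
    "Claude-3.7-Sonnet": 7.12,
    "GPT-5": 7.11,
    "Qwen3-Max": 3.44,
    "Gemini-2.5-Flash": -0.54,
}
QQQ_RETURN = 2.30  # Nasdaq 100 ETF over the same period

# Excess return over the benchmark, in percentage points
excess = {name: round(ret - QQQ_RETURN, 2) for name, ret in model_returns.items()}

# Rank models from best to worst excess return
for name, value in sorted(excess.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<20} {value:+.2f} pp")

beat_benchmark = [name for name, value in excess.items() if value > 0]
print(len(beat_benchmark), "of", len(model_returns), "models beat QQQ")
```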
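Live AI large-model investment competition concludes: Alibaba's Qwen wins with a 22.32% return! Qwen and DeepSeek, both Chinese models, are the only two to turn a profit, while all four top US models lose money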
Sou Hu Cai Jing· 2025-11-04 03:41
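Source: Sina

Sina Tech, morning of November 4: Alibaba's Qwen took the final championship in the live AI large-model investment competition "Alpha Arena". The competition was launched on October 18 by the third-party organization Nof1 and brought together six leading global models: Qwen3-Max, DeepSeek v3.1, GPT-5, Gemini 2.5 Pro, Claude Sonnet 4.5, and Grok 4. Each model received $10,000 in starting capital and made its own decisions and trades in real markets without human intervention, with the winner determined by profit and loss.

After 17 days, Alibaba's Qwen reportedly won with a return of 22.32%. Qwen and DeepSeek, both Chinese models, were the only two to turn a profit, while all four top US models lost money, with GPT-5 finishing last after losing more than 62%. (Wen Meng) ...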
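Former TSMC vice president warns: by bypassing the existing architecture, the mainland might take a new path and overtake us, just as DeepSeek scared everyone! Netizens: not "might", but "will"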
Xin Lang Cai Jing· 2025-11-03 10:24
Core Viewpoint
- The discussion centers on whether mainland China's semiconductor industry can find alternative paths to technological advancement and potentially surpass Taiwan's TSMC, especially given the difficulties of advanced process nodes such as 3nm and 2nm [1][5].

Group 1: Industry Dynamics
- TSMC's former vice president suggested that mainland China might develop new technologies that bypass the challenges of advanced nodes, such as using 7nm technology to achieve functionality similar to 5nm [5].
- The cost of advanced process development keeps escalating, with 3nm research costs reaching billions of dollars, while physical limitations such as leakage and heat generation become more pressing [5].
- Despite lagging in traditional processes, mainland China's large market and lower costs provide fertile ground for innovation and experimentation [5].

Group 2: Recent Developments
- A domestic research institute recently announced breakthroughs in gallium oxide semiconductor materials, which could significantly improve performance in high-frequency and high-voltage applications and potentially circumvent the limits of existing silicon-based processes [5].
- A leading chip design company is reportedly testing a "compute-storage integration" architecture chip that achieves near-5nm AI computing power on 7nm technology, an example of using older nodes for tasks typically associated with newer ones [5].
- Demand for cost-effective chips in the booming domestic electric vehicle and IoT markets is driving the commercialization of these new paths, with local wafer fabs running their 14nm lines at full capacity [5].

Group 3: Community Reactions
- Industry professionals express optimism about "new path" advances, citing examples of overcoming Western sanctions and of new packaging technologies that match the performance of newer processes while reducing costs [6].
- Historical parallels are drawn, suggesting that the semiconductor industry could repeat successes from other tech sectors in which countries "leapfrogged" traditional methods to achieve leadership [7].

Group 4: Challenges Ahead
- While new paths offer opportunities, challenges remain in standardizing technologies such as Chiplet, which require collaboration across the supply chain and are therefore more complex than simply advancing process nodes [7].