Gemini 2.0
Andrew Ng: The Turing Test Is No Longer Enough, So I Will Design a Version for AGI
QbitAI · 2026-01-10 03:07
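Lu Yu, reporting from Aofeisi. QbitAI | WeChat official account QbitAI. New year, new look! AI luminary Andrew Ng has made his 2026 goal public: he wants to build a new Turing test, which he calls the Turing-AGI Test. The name alone makes it clear that this test is built for AGI. Last year, expectations for AGI kept rising, and Ng said as much in his year-end review: 2025 may be remembered as the dawn of the industrial age of AI. Innovation pushed model performance to new heights, AI-driven applications became indispensable, top companies fought fiercely over talent, and infrastructure build-out lifted economic output. Academia and industry invoke the concept of AGI constantly, and Silicon Valley companies even set quarterly targets for getting to AGI first. Yet there is still no agreed definition of AGI, and existing benchmarks often mislead the public into overestimating the current level of AI. Ng noticed this trend, and the new Turing test is his attempt to fill the gap. As one commenter put it: to measure intelligence, you first have to define intelligence. The Turing-AGI Test proposal: the traditional Turing test clearly falls short in the AGI era. Proposed by Alan Turing in the 1950s, it uses human-machine conversation to gauge a machine's intelligence. During the test, a human evaluator must determine whether they are talking to a person or a machine. If the machine manages to fool the evaluator, it passes. But today's AI is clearly no longer content with simple conversational interaction; the aim is to build economically useful systems, so there is an urgent ...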
On the Gala Stage and in the Keynote: AI Competition Has Entered "Big Tech Time"
Cyzone · 2026-01-05 03:10
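Source: Zhaibo (窄播, ID: exact-interaction) | Author: Li Wei. In 2026, big tech companies will keep pouring resources into key strategic battlegrounds such as AI assistants, AI hardware, and AI coding. At the same time, 2026 is widely expected to be the year AI applications take off and AI innovation warms up again. Startups already under way may need to recalibrate their direction, and those preparing to enter need to find what an a16z partner calls "greenfield" territory. Image credit: Midjourney. Big tech AI apps are showing up at New Year's Eve galas and in New Year's Eve keynote speeches, star AI startups have made a new round of capital moves, and competition across the AI industry has entered a period dominated by big tech. Ever since ChatGPT 3.5, which kicked off the current AI wave, was released in November 2022, the end of each year has become an important vantage point for reading trends in the AI industry. At the end of 2024, OpenAI was riding high: its 12 consecutive days of livestreamed launches produced a string of new products and overshadowed Google, which released Gemini 2.0 in the same window. In China, the "six little dragons" of large models were the undisputed stars among AI startups, and the hope was that the next era's giant would emerge from among them. But by the end of 2025, what we see is big tech companies in China and abroad pushing on infrastructure investment, model R&D, AI application promotion, and more ...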
On the Gala Stage and in the Keynote: AI Competition Has Entered "Big Tech Time"
TMTPost APP · 2026-01-05 00:57
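By Zhaibo (窄播) | Author: Li Wei. Big tech AI apps are showing up at New Year's Eve galas and in New Year's Eve keynote speeches, star AI startups have made a new round of capital moves, and competition across the AI industry has entered a period dominated by big tech. Ever since ChatGPT 3.5, which kicked off the current AI wave, was released in November 2022, the end of each year has become an important vantage point for reading trends in the AI industry. At the end of 2024, OpenAI was riding high: its 12 consecutive days of livestreamed launches produced a string of new products and overshadowed Google, which released Gemini 2.0 in the same window. In China, the "six little dragons" of large models were the undisputed stars among AI startups, and the hope was that the next era's giant would emerge from among them. But by the end of 2025, what we see is big tech companies in China and abroad pushing on multiple levels, from infrastructure investment and model R&D to AI application promotion. They already steer the AI narrative in key areas such as AI entry points and AI compute, and the competitive balance between big tech and startups has quietly shifted. One telling sign is that big tech AI products such as Qwen, Doubao, and Quark glasses no longer rely only on paid user acquisition but have moved to more aggressive mass-market promotion. New Year's Eve keynotes, New Year's Eve galas, the Spring Festival Gala, and other occasions that concentrate public attention have all become important options for promoting big tech AI products to a mass audience. For big tech companies that have survived multiple rounds of life-and-death battles over internet products, this kind of operation is a return to their comfort zone ...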
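Year-End Review, Top Ten Global Financial Stories of 2025: The Year the Capital Order Collapsed, as Faith in US Assets Wavers and AI Valuations Move From "Dreams" to a "Debt" Test
Zhitong Finance · 2025-12-29 09:11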
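Looking back, 2025 is destined to be a historic watershed in which global capital markets reshaped their order amid violent turbulence. During the year, "American exceptionalism" faced a crisis of disillusionment amid the boomerang effect of the "reciprocal tariffs" and the political contest in which the White House pressured the Federal Reserve, while a 43-day federal government shutdown plunged markets into an unprecedented "data fog". Meanwhile, the AI revolution entered deep water: from the "algorithmic equality" set off by DeepSeek, to the "AI debt wave" shouldered by tech giants, to supply-chain anxiety over a global scramble for memory chips, the contest between technological progress and return on capital grew ever more dramatic. As macro narratives collided with micro-level change, the logic of asset pricing reversed at its root: gold and silver staged an epic surge as confidence in credit wavered, Bitcoin was held back by macro headwinds despite the absence of scandals, and Tesla and the streaming giants sought to rebuild their valuations in fights for survival. Standing at the intersection of a collapsing old order and an emerging new one, the Zhitong Finance APP has compiled the ten key events that shook global capital markets in 2025, reviewing a year-long drama of wealth redistribution and shifting power. 1. "Reciprocal tariffs" jolt "American exceptionalism" as global assets enter a year of "bipolar resilience". On April 2, 2025, the US government launched its sweeping "reciprocal tariff" policy, intending to reshape global supply chains, but it unexpectedly triggered severe market turmoil and raised deep doubts about the long-term dominance of "American exceptionalism". The policy boomeranged on the US economy, with the S&P 500 on April 3 ...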
Meta Bets 600 Billion on AI: Can a 28-Year-Old Prodigy Rewrite the Tech Giant ...
Sohu Finance · 2025-12-12 23:09
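With the Microsoft-OpenAI alliance holding ChatGPT and Google DeepMind having just launched Gemini 2.0, where is Meta's opening? Alexandr Wang revealed a key detail in a Stanford talk: "Unlike our competitors' general-purpose models, we focus on vertical breakthroughs in social scenarios." "SocialGPT", developed by his team, has already shown formidable potential: it can automatically generate emotionally resonant comments based on a user's past posts, and it lifted Instagram engagement by 47% during testing. But the rise of this "AI Eastern Depot" set off a backlash from the old guard. Chief Product Officer Chris Cox openly asked: "Are we repeating the mistakes of Google's 'Lightning Lab'?" Internal emails show that 17 executives from the core metaverse team have left one after another. Most dramatically, John Carmack, the "godfather of VR" who led the Oculus acquisition, wrote bluntly in his resignation letter: "This place is turning into an AI dictatorship." 2. The sharp strategic turn behind the $600 billion. Meta's financial reports reveal a striking reversal: in Q1 2024, revenue at the metaverse unit Reality Labs fell 39% year over year, while the AI advertising system delivered 28% revenue growth. On an analyst call, Zuckerberg abruptly announced: "Over the next three years, 75% of infrastructure spending will shift to AI." That amounts to taking the $450 billion in ammunition originally earmarked for the metaverse and betting all of it on Alexandr Wan ...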
Google Releases Gemini 3; Experts Say the AI Industry Cannot Escape Investment "Overheating"
Beijing Business Today · 2025-11-20 01:42
Core Insights
- Google has officially launched its most powerful AI model, Gemini 3, which is expected to redefine the competitive landscape in AI after achieving top scores on major benchmarks [1][3][4]
- The capital market's focus has shifted from model upgrades alone to whether these models can strengthen platform lock-in and generate substantial returns for core businesses [1][5]
Product Launch and Performance
- Gemini 3 was released on November 18 and immediately integrated into various Google products, including Google Search and the Gemini app, with a broader rollout planned for the coming weeks [3][4]
- The model scored 1501 points on the LMArena global leaderboard, becoming the first to surpass 1500 points, and showed significant improvements on doctoral-level reasoning benchmarks [3][4]
- The launch marks a shift in AI programming from an "assistive" tool to a "self-sufficient" capability, as demonstrated by the creation of a complete flight tracking application from a simple natural language command [3]
Competitive Landscape
- The release of Gemini 3 comes just eight months after Gemini 2.5 and eleven months after Gemini 2.0, indicating a rapid development cycle [4]
- The AI industry's focus has shifted from technical breakthroughs to monetization, with companies like Meta and OpenAI facing challenges in commercializing their models [5]
- Gemini 3's performance has overshadowed recent releases from competitors, including OpenAI's GPT-5.1 and xAI's Grok 4.1, prompting congratulatory messages from industry leaders [5]
Financial Performance and Market Position
- Google's AI-related revenue has become a significant growth driver, with Google Cloud's Q3 revenue reaching $15.2 billion, a 33.5% year-over-year increase, and quarterly AI-related income exceeding "tens of billions" [6]
- The company has raised its capital expenditure forecast for 2025 to between $91 billion and $93 billion, indicating strong investment in AI and related technologies [6]
Industry Challenges and Concerns
- There is ongoing debate on Wall Street about a potential AI bubble, with concerns about over-investment and the sustainability of AI business models [7]
- Google CEO Sundar Pichai acknowledged the risks of the current investment climate, comparing it to the early days of the internet, while emphasizing the company's comprehensive technology strategy to mitigate potential market disruptions [7][8]
- The energy consumption of AI, which accounts for 1.5% of global electricity usage, poses challenges for energy supply and climate goals, highlighting the need for advancements in energy infrastructure [8]
Layoff Warnings Sound! The Puzzle of the US Job Market: How Can Ordinary People Ride Out the Cycle?
Sohu Finance · 2025-11-18 10:07
Core Insights
- The article discusses the paradox of rising layoff notifications in the U.S. job market while unemployment claims remain historically low, indicating a potential economic downturn ahead [2][7].
Group 1: Layoff Notifications
- In October 2025, the number of WARN layoff notifications reached 39,006, signaling a potential wave of job losses in the upcoming months [4].
- This figure is comparable to historical peaks during major crises, such as the 2008 financial crisis and the early COVID-19 pandemic, despite the absence of large corporate bankruptcies or global lockdowns [4][6].
Group 2: Economic Indicators
- Challenger Gray & Christmas reported that October 2025 saw the highest number of announced layoffs for that month in over 20 years, indicating a worsening trend in the labor market [6].
- The article highlights a fundamental shift in the labor market, moving from a labor shortage phase (2021-2023) to a phase of layoffs driven by factors such as rising interest rates and AI-induced job displacement [10].
Group 3: Future Projections
- The unemployment rate is projected to exceed 5% by the end of Q1 2026, marking the onset of a mild recession, with the Federal Reserve likely to initiate interest rate cuts between March and May [11].
- The anticipated "white-collar recession" is expected to spread from the tech and finance sectors to broader service industries, with real estate prices potentially declining by 10%-15% [13].
[Micro Science Explainer] Seeing the New AI Wave Through AI Tools: How Will Large Models and Agents Reshape the Future?
Sohu Finance · 2025-11-07 13:36
Core Insights
- The rise of AI tools, such as ChatGPT and DeepSeek, has significantly increased interest in artificial intelligence, with applications in data analysis and business opportunity identification [1][10]
- Large models and intelligent agents are the two key technologies driving this AI revolution, fundamentally changing work and daily life [1][10]
Group 1: Large Models
- Large models are deep learning models trained on vast amounts of data, characterized by a large number of parameters, extensive training data, and significant computational resources [1][4]
- These models provide powerful data processing and generation capabilities, serving as the foundational technology for various AI applications [3][4]
- Major global large models include OpenAI's GPT-5, Google's Gemini 2.0, and domestic models like Baidu's Wenxin Yiyan 5.0 and Alibaba's Tongyi Qianwen 3.0, which continue to make breakthroughs in multimodal and industry-specific applications [3][4]
Group 2: Intelligent Agents
- Intelligent agents, powered by large language models, are capable of proactively understanding goals, breaking down tasks, and coordinating resources to fulfill complex requirements [5][7]
- Examples of intelligent agents include OpenAI's AutoGPT and Baidu's Wenxin Agent, which can handle various tasks across different scenarios [7][9]
- The micro-financial AI assistant, Weifengqi, utilizes a self-developed financial model to address challenges in the financial sector, transitioning services from labor-intensive to AI-assisted [9]
Group 3: Synergy Between Large Models and Intelligent Agents
- The relationship between large models and intelligent agents is analogous to the brain and body, where large models provide cognitive capabilities and intelligent agents enable actionable outcomes [10]
- The integration of intelligent agent functionalities into AI products is becoming more prevalent, indicating a shift from novelty to practical assistance in daily life [10]
- The ongoing development of AI technologies raises considerations such as data security, but the wave of innovation led by large models and intelligent agents presents new opportunities for individuals and businesses [10]
Better Than NanoBanana at Chinese and Fine-Grained Control! Rabbitpre & Peking University's UniWorld V2 Sets a New SOTA
QbitAI · 2025-11-05 05:39
Core Viewpoint
- The article introduces UniWorld-V2, a new image editing model that excels in detail and understanding of Chinese language instructions, outperforming previous models like Nano Banana [1][4][6].
Group 1: Model Features
- UniWorld-V2 demonstrates superior fine control in image editing, achieving results that surpass those of SFT models [11].
- The model can accurately interpret complex Chinese characters and phrases, showcasing its proficiency in rendering artistic fonts [11].
- Users can specify editing areas through bounding boxes, allowing for precise operations like moving objects out of designated areas [14].
- The model effectively understands commands such as "re-light the scene," integrating objects naturally into the environment with high light and shadow coherence [15].
Group 2: Technical Innovations
- The core innovation behind UniWorld-V2 is the UniWorld-R1 framework, which applies reinforcement learning (RL) strategies to image editing [18].
- UniWorld-R1 is the first unified architecture based on RL, utilizing Diffusion Negative-aware Finetuning (DiffusionNFT) for efficient training without likelihood estimation [19].
- The framework employs a multi-modal large language model (MLLM) as a reward model, enhancing the model's alignment with human intentions through implicit feedback [19].
Group 3: Performance Metrics
- In benchmark tests, UniWorld-V2 achieved a score of 7.83 in GEdit-Bench, surpassing GPT-Image-1 (7.53) and Gemini 2.0 (6.32) [24].
- The model also led in ImgEdit with a score of 4.49, outperforming all known models [24].
- The method significantly improved the performance of foundational models, with FLUX.1-Kontext's score rising from 3.71 to 4.02, and Qwen-Image-Edit's score increasing from 4.35 to 4.48 [25].
Group 4: Generalization and User Preference
- UniWorld-R1 demonstrated strong generalization capabilities, improving FLUX.1-Kontext's score from 6.00 to 6.74 in GEdit-Bench [26].
- User preference studies indicated that participants favored UniWorld-FLUX.1-Kontext for its superior instruction alignment and editing capabilities, despite a slight edge in image quality for the official model [27].
Group 5: Historical Context
- UniWorld-V2 builds upon the earlier UniWorld-V1, which was the first unified understanding and generation model, released three months ahead of notable models like Google's Nano Banana [29].
New Stanford Finding: A Single "really" Trips Up Every Large AI Model
36Kr · 2025-11-04 09:53
Core Insights
- A study reveals that over 1 million users of ChatGPT exhibited suicidal tendencies during conversations, highlighting the importance of AI's ability to accurately interpret human emotions and thoughts [1]
- The research emphasizes the critical need for large language models (LLMs) to distinguish between "belief" and "fact," especially in high-stakes fields like healthcare, law, and journalism [1][2]
Group 1: Research Findings
- The research paper titled "Language models cannot reliably distinguish belief from knowledge and fact" was published in the journal Nature Machine Intelligence [2]
- The study utilized a dataset called "Knowledge and Belief Language Evaluation" (KaBLE), which includes 13 tasks with 13,000 questions across various fields to assess LLMs' cognitive understanding and reasoning capabilities [3]
- The KaBLE dataset combines factual and false statements to rigorously test LLMs' ability to differentiate between personal beliefs and objective facts [3]
Group 2: Model Performance
- The evaluation revealed five limitations of LLMs, particularly in their ability to discern right from wrong [5]
- Older generation LLMs, such as GPT-3.5, had an accuracy of only 49.4% in identifying false information, while their accuracy for true information was 89.8%, indicating unstable decision boundaries [7]
- Newer generation LLMs, like o1 and DeepSeek R1, demonstrated improved sensitivity in identifying false information, suggesting more robust judgment logic [8]
Group 3: Cognitive Limitations
- LLMs struggle to recognize erroneous beliefs expressed in the first person, with significant drops in accuracy when processing statements like "I believe p" that are factually incorrect [10]
- The study found that LLMs perform better when confirming third-person erroneous beliefs compared to first-person beliefs, indicating a lack of training data on personal belief versus fact conflicts [13]
- Some models exhibit a tendency to engage in superficial pattern matching rather than understanding the logical essence of epistemic language, which can undermine their performance in critical fields [14]
Group 4: Implications for AI Development
- The findings underscore the urgent need for improvements in AI systems' capabilities to represent and reason about beliefs, knowledge, and facts [15]
- As AI technologies become increasingly integrated into critical decision-making scenarios, addressing these cognitive blind spots is essential for responsible AI development [15][16]