AI spends 17 hours writing a 30-page academic paper! It chose its own topic, ran experiments, and followed APA formatting
量子位· 2025-10-04 04:13
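Wen Le, reporting from Aofeisi
量子位 | WeChat official account QbitAI
This time the AI is not stitching facts together; it is doing actual research. An AI system called Virtuous Machines spent 17 hours and 114 US dollars, recruited 288 real people for an experiment, and wrote a 30-page academic paper. The whole run, from choosing the topic to the finished manuscript, was fully automated. Let's look at what the AI actually wrote.
AI-automated research: from a flash of inspiration to a publishable paper
The paper the AI completed on its own belongs to cognitive psychology and focuses specifically on human visual cognition. It did not write at random; it followed the standard human research workflow. It first posed research questions grounded in cognitive psychology theory, such as "Is visual working memory related to mental rotation ability?" and "How does the vividness of mental imagery affect performance on visual cognition tasks?" (Visual working memory is the human ability to maintain and process visual information, covering storage, manipulation, and retrieval; mental rotation is the cognitive process of mentally rotating a spatial object to achieve a perceptual match.)
Doing research like a human
It then designed the experiment, accounting for sample-size calculation and control variables, and used the VVIQ2 scale to measure participants' mental imagery vividness. Once the design was fixed, it recruited 288 participants through the online platform Prolific, and after 277 valid data sets came in (some participants did not finish the experiment and were filtered out by the AI), it went on to ...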
Terence Tao uses GPT-5 to solve a hard math problem: just 29 lines of Python
量子位· 2025-10-04 04:13
Core Insights
- The article highlights how AI, specifically GPT-5, has significantly aided mathematician Terence Tao in solving complex mathematical problems, reducing the time and effort required for manual calculations and coding [1][2][3].
Group 1: AI's Role in Mathematics
- Terence Tao expressed that without AI assistance, completing similar tasks would take several hours, primarily due to manual coding and debugging [1].
- Tao utilized GPT-5 to tackle a problem on MathOverflow regarding the relationship between the least common multiple sequence and highly abundant numbers, which required extensive numerical searches [7][10].
- The AI's ability to assist in this mathematical inquiry marks a new era of collaboration between humans and machines in exploring complex problems [5][29].
Group 2: Problem-Solving Process
- Initially, Tao attempted to have GPT-5 generate a Python program to search for counterexample parameters but faced issues with long execution times and improper initial parameters [19][20].
- He then shifted to a step-by-step dialogue with GPT-5, breaking down the larger problem into smaller, manageable parts, which ultimately led to the successful generation of the required parameters [21][22].
- The final solution involved a concise 29-line Python script generated by GPT-5, which Tao used for independent verification, confirming the results aligned with his heuristic predictions [23][24].
Group 3: Broader Implications of AI in Research
- This instance is not the first time Tao has employed AI for mathematical problem-solving; he has previously used AI for various projects, demonstrating its potential as a mediator in mathematical proofs [27][28].
- The article suggests that while AI may not achieve accolades like the Fields Medal in the short term, it can significantly enhance the efficiency and effectiveness of mathematical research [28][29].
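To make the kind of numerical search mentioned above more concrete, here is a minimal brute-force sketch. It is not the 29-line script from the article, which the summary does not reproduce. It assumes the standard definition that n is highly abundant when its divisor sum exceeds that of every smaller positive integer, and all function names are illustrative. It requires Python 3.9 or later for `math.lcm`.

```python
from math import lcm

def sigma_table(limit):
    """Return s where s[n] is the sum of the divisors of n, for 0 <= n <= limit."""
    s = [0] * (limit + 1)
    for d in range(1, limit + 1):
        for multiple in range(d, limit + 1, d):
            s[multiple] += d
    return s

def highly_abundant_up_to(limit):
    """List every n <= limit whose divisor sum beats that of all smaller m."""
    s = sigma_table(limit)
    best, result = 0, []
    for n in range(1, limit + 1):
        if s[n] > best:
            result.append(n)
            best = s[n]
    return result

def lcm_prefixes(n_max):
    """Return [lcm(1), lcm(1, 2), ..., lcm(1, ..., n_max)]."""
    values, current = [], 1
    for n in range(1, n_max + 1):
        current = lcm(current, n)
        values.append(current)
    return values

if __name__ == "__main__":
    prefixes = lcm_prefixes(6)
    ha = set(highly_abundant_up_to(max(prefixes)))
    for n, value in enumerate(prefixes, start=1):
        # Print n, lcm(1..n), and whether that value is highly abundant.
        print(n, value, value in ha)
```

A sieve like this only reaches small n, because lcm(1, ..., n) grows very quickly; that growth is presumably one reason a naive search ran too long and the problem had to be broken into smaller parts.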
OpenAI hits back hard at Musk's trade-secret lawsuit! xAI accused of maliciously doxxing departed employees
量子位· 2025-10-04 04:13
Core Viewpoint
- OpenAI has responded strongly to the lawsuit filed by xAI, denying all allegations of corporate espionage and asserting that the lawsuit is an attempt to intimidate its employees [2][3][10].
Group 1: Allegations by xAI
- xAI has made three main allegations against OpenAI: violation of federal trade secret laws, intentional interference with xAI's economic relationships with its employees, and violation of California's unfair competition laws [11].
- Specific incidents cited include the alleged theft of proprietary information by former xAI engineers Xuechen Li and Jimmy Fraiture, who are accused of transferring sensitive data to OpenAI [12][14][15].
- xAI also claims that a former senior finance executive left without signing a confidentiality agreement and took critical strategic information to OpenAI [19][20].
Group 2: OpenAI's Defense
- OpenAI has categorically denied the allegations, stating that Xuechen Li never officially joined the company and did not transfer any proprietary information [27][29].
- Regarding Jimmy Fraiture, OpenAI asserts that any actions taken during his "garden leave" were personal and not directed by OpenAI, and that no confidential information was received [31][32].
- OpenAI emphasizes that the unnamed finance executive's departure was unrelated to any alleged poaching and was due to refusing to engage in improper financial practices at xAI [33][34].
Group 3: Legal Proceedings
- OpenAI has filed a motion to dismiss xAI's lawsuit, arguing that the claims lack merit and that the inclusion of names of former employees not accused of wrongdoing is an act of intimidation [37].
- A hearing for this motion is scheduled for November 18, 2025, which will address procedural matters rather than the substantive issues of the case [38].
Nano Banana adds 2 major features and opens its API: one image costs under 0.3 yuan
量子位· 2025-10-03 04:19
Core Insights
- Nano Banana has officially opened its API, allowing developers to integrate it into their products and enabling large-scale content production for enterprises [9][10].
- The API pricing is set at approximately $0.039 per image output, translating to about 0.28 yuan, with a cost of $30 for every 1 million image output tokens [2][15][16].
- Google has introduced two new features: customizable aspect ratios and a pure image generation mode, enhancing its utility for content creators [3][8].
Pricing and Cost Structure
- Each image generated costs about $0.039 (approximately 0.28 yuan), with the maximum image size being 1024x1024 pixels, consuming around 1290 tokens [16].
- The pricing for image generation is 12 times higher than the Gemini 2.5 Flash text mode [17].
New Features
- The first new feature allows users to customize aspect ratios, offering over ten options including 16:9, 9:16, 4:3, and 3:2, catering to various visual content needs [4][18].
- The second feature supports pure image output mode, which returns only images without additional text, saving tokens and reducing contextual interference, ideal for real-time previews and e-commerce displays [7][8].
Application and Usability
- Users can create their own applications directly in Google AI Studio by inputting prompts, making it accessible for non-developers [13][14].
- The new features are designed to meet the practical needs of content creators, positioning Nano Banana as a more practical tool [8].
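The per-image price follows directly from the token figures quoted above. The short sketch below reproduces that arithmetic. The token count and the price per million tokens come from the summary; the exchange rate of 7.2 yuan per dollar is an assumption chosen because it is consistent with the quoted 0.28 yuan, and the implied text-mode price simply reads "12 times" as a plain ratio.

```python
PRICE_PER_MILLION_OUTPUT_TOKENS_USD = 30.0  # from the summary
TOKENS_PER_IMAGE = 1290                     # from the summary (1024x1024 image)
USD_TO_CNY = 7.2                            # assumed exchange rate, not from the source

def image_cost_usd(tokens=TOKENS_PER_IMAGE,
                   price_per_million=PRICE_PER_MILLION_OUTPUT_TOKENS_USD):
    """Cost in US dollars of one image that consumes `tokens` output tokens."""
    return tokens * price_per_million / 1_000_000

cost_usd = image_cost_usd()
cost_cny = cost_usd * USD_TO_CNY

# If image output costs 12 times the text mode, the implied text-mode price is:
implied_text_price_per_million = PRICE_PER_MILLION_OUTPUT_TOKENS_USD / 12

print(f"per image: ${cost_usd:.4f} (about {cost_cny:.2f} yuan)")
print(f"implied text-mode price: ${implied_text_price_per_million:.2f} per 1M tokens")
```

Under these assumptions one image comes to $0.0387, which rounds to the quoted $0.039.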
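Two simple modules deliver SOTA on both segmentation and understanding! Xiang Bai's team at Huazhong University of Science and Technology and collaborators release a new multimodal framework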
量子位· 2025-10-03 04:19
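Contributed by the LIRA team
量子位 | WeChat official account QbitAI
The work expected of multimodal large models has expanded from the original text-to-image generation to pixel-level tasks (image segmentation). However, both OMG-LLaVA and LISA (CVPR 2024), which introduced the embedding-as-mask paradigm, still suffer from two pain points: segmentation results that are not precise enough, and hallucinations during understanding. This stems mainly from existing models' weak understanding of object attributes and their limited fine-grained perception.
To mitigate these problems, teams from Huazhong University of Science and Technology and Kingsoft Office jointly proposed two core modules: the Semantic-Enhanced Feature Extractor (SEFE) and Interleaved Local Visual Coupling (ILVC). The former fuses semantic features with pixel-level features to improve reasoning about object attributes, which yields more precise segmentation. The latter extracts local features based on the segmentation mask and then autoregressively generates local descriptions, giving the model fine-grained supervision and effectively reducing hallucinations in understanding. With these two modules, the researchers built LIRA, a multimodal large model that achieves SOTA on both segmentation and understanding.
Compared with InternVL2, LIRA maintains understanding performance while additionally supporting image segmentation; compared with OMG-LLaVA, LIRA improves by an average of 8.5% on image segmentation tasks and by 33.2% on MMBench. The LIRA project has now been accepted to ICCV 2025 ...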
2025 AI Annual Awards now open! 3 dimensions, 5 award categories, in search of the leaders of the AI+ era
量子位· 2025-10-03 04:19
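The organizing committee, reporting from Aofeisi
量子位 | WeChat official account QbitAI
To let more practitioners feel the leap of the intelligence wave, and to offer applause and encouragement to more peers on the same road, we are officially opening registration for the "2025 AI Annual Rankings".
This is the 8th year of the 量子位 AI annual rankings. Over these eight years we have witnessed technical breakthroughs and deployments, the convergence and reshaping of industries, and wave after wave of companies, people, and products driving the era forward.
In an era when AI is redefining everything, intelligent technology is no longer a single tool but a driving force for the co-evolution of industry and society. Through this annual selection we hope to discover and honor the explorers and practitioners who truly lead change and push boundaries.
The selection covers three dimensions (companies, products, and people) with five award categories. Companies are welcome to apply! Let us witness the stars of the year together and light the way forward.
Company rankings: 2025 AI Leading Company of the Year; 2025 AI Promising Startup of the Year
Product rankings: 2025 AI Outstanding Product of the Year; 2025 AI Outstanding Solution of the Year
People rankings: 2025 AI Person in Focus of the Year
Detailed selection criteria and application instructions follow.
2025 AI Leading Company of the Year
Open to China's AI sector, this award selects the companies with the strongest overall capabilities. Eligibility: Selection criteria: Focused on China's ...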
LeCun has had enough! He admits in his own words that he wants to resign
量子位· 2025-10-03 04:19
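Heng Yu, reporting from Maihaosi
量子位 | WeChat official account QbitAI
I'm shocked! LeCun, Turing Award winner and one of the three giants of AI, is finding his seat at Meta increasingly uncomfortable.
Yann LeCun has told colleagues directly that he may resign as FAIR's chief scientist.
LeCun is one of FAIR's co-founders and has been based there for years, leading its academic research and forward-looking insight.
People familiar with the matter say this is not a sudden impulse: LeCun's dissatisfaction with Meta's questionable moves in recent months, such as reorganizing its AI division, had been building for a long time, and it has finally boiled over.
△ Image generated by ChatGPT
A policy Meta began enforcing internally a month ago was probably the last straw: effective immediately, any paper FAIR wants to publish externally must first pass an additional review by TBD Lab. If the review finds the paper highly valuable, it is held back from publication, and its authors must help land the results in Meta products before they can return to their regular research.
For Meta, this is an institutional tweak to "align with company strategy"; for LeCun and the FAIR team, it is close to a direct provocation against academic freedom and research autonomy.
Nor is this an isolated move. More precisely, a series of chaotic and opaque AI strategy reorganizations inside Meta over recent months has left LeCun deeply dissatisfied.
Over the past few months, FAIR ...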
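New work from the Stanford dishwashing-robot team! A dexterous hand learns tea picking and breakfast making from humans, nominated for Best Paper at CoRL 2025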
量子位· 2025-10-02 05:30
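CoRL 2025 submission
量子位 | WeChat official account QbitAI
Teach a robot hands-on, and it picks up real skills directly! Whether picking tea, brewing tea, or making breakfast, it handles these delicate jobs with ease. And it does so with a dexterous hand.
△ The 星动纪元 dexterous hand, 星动XHAND 1
Researchers from Stanford University, Columbia University, JPMorgan AI Research, Carnegie Mellon University, and NVIDIA have proposed DexUMI, a data collection and policy learning framework that uses the human hand as a natural interface for transferring dexterous manipulation skills to a variety of dexterous hands. Through both hardware and software adaptation, the framework minimizes the embodiment gap between the human hand and different dexterous hands.
Many readers will have seen UMI, which the team released last year: by recording and learning from human demonstrations, it taught gripper-based robots to wash dishes and drew considerable attention across the industry.
Beyond the impressive results, the deeper reason was that it made collecting manipulation data for grippers fast and convenient; several vendors quickly followed with industrial-grade data collection products.
This year the team has moved from grippers to dexterous hands, which are more complex and have more degrees of freedom, so that robots can learn richer and finer manipulation tasks. That is bound to set off a new round of change in dexterous-hand data collection.
At CoRL 2025, now under way in Seoul, the work has been nominated for Best Paper.
Notably, the fully actuated dexterous hand used in the work is the 星动XHAND 1 from 星动纪元, a prominent Chinese embodied-AI company.
Hands-on teaching ...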
Sora2 can even predict ChatGPT's output
量子位· 2025-10-02 05:30
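Wen Le, reporting from Aofeisi
量子位 | WeChat official account QbitAI
Sora2 is going all out. It can predict ChatGPT's output and render HTML?!
Asked to simulate "sending a message to ChatGPT", it not only generated the footage but also produced a question-and-answer "interaction".
It first made up a question: Write a playful haiku about a cat staring out the window.
It then gave an audio reply in the style of a ChatGPT answer: Whiskers pressed to glass. Birds gossip beyond the pain. Tail flicks. Daydreams fly.
The whole reply was delivered in ChatGPT's mechanical female voice, and the haiku's syllable counts fit perfectly.
This hands-on result combining a video scene with LLM reasoning amazed many users, and some even said that "Sora2 blurs the boundary between video generation and interactive AI".
In fact, Sora2 can do more than predict ChatGPT's reasoning in this way; it can also render HTML.
And this is what the same code looks like when rendered in a real browser:
Passing the glass refraction test
Someone also had Sora2 render ...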
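The company of Murati, Lilian Weng, and Danqi Chen releases its first product, slashing the barrier to fine-tuning large models and aiming to reinvent an OpenAI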
量子位· 2025-10-02 03:26
Core Insights
- Thinking Machines Lab has launched its first product, Tinker, which simplifies model fine-tuning to the level of modifying Python code [1][12].
- With this launch, the company moves past its "zero product, zero revenue" stage, during which it was valued at 84 billion yuan (roughly $12 billion) [2].
Product Overview
- Tinker is a flexible API designed for fine-tuning language models, allowing researchers to control algorithms and data without managing infrastructure [12][13].
- The initial support for Tinker includes Qwen3 and Llama3 series models, enabling easy switching between small and large models with a simple string modification in Python code [15].
- Tinker's API automates low-level training steps while handling scheduling, scaling, and error recovery [17].
Technical Features
- Tinker utilizes LoRA to allow multiple training tasks to share the same GPU, reducing costs and enabling more parallel experiments [22].
- The gradient update strategy for Tinker is defined as: New parameters = Original parameters + Learning rate × Advantage value × Gradient of log probability [28].
Industry Reception
- Tinker has garnered significant attention in the industry, with beta testers noting its excellent balance between abstraction and tunability compared to other fine-tuning tools [30].
- Research teams from prestigious institutions have already achieved notable results using Tinker [30].
Strategic Vision
- Thinking Machines Lab aims to reinvent a version of OpenAI that emphasizes open research sharing and greater freedom for researchers [10][11].
- The company's mission aligns with making cutting-edge models more accessible for customization based on individual needs [14].
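The update rule quoted above (new parameters = original parameters + learning rate × advantage × gradient of log probability) has the form of a standard policy-gradient step. The sketch below illustrates that rule on a toy softmax policy over three actions using only the Python standard library. It does not use or imitate Tinker's API, which the summary does not show; the function names and the toy setup are assumptions made for illustration.

```python
import math

def softmax(logits):
    """Convert logits to probabilities, shifting by the max for numerical stability."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def grad_log_prob(logits, action):
    """Gradient of log softmax(logits)[action] w.r.t. the logits: one_hot(action) - probs."""
    probs = softmax(logits)
    return [(1.0 if i == action else 0.0) - p for i, p in enumerate(probs)]

def policy_gradient_step(logits, action, advantage, learning_rate):
    """Apply: new = original + learning_rate * advantage * grad(log prob of action)."""
    grad = grad_log_prob(logits, action)
    return [theta + learning_rate * advantage * g for theta, g in zip(logits, grad)]

if __name__ == "__main__":
    logits = [0.0, 0.0, 0.0]
    updated = policy_gradient_step(logits, action=1, advantage=2.0, learning_rate=0.5)
    print("before:", softmax(logits))
    print("after: ", softmax(updated))
```

A positive advantage raises the probability of the sampled action and a negative one lowers it, which is the behavior the quoted formula encodes. In an actual LoRA fine-tuning run the parameters would be adapter weights rather than three logits.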