锦秋集
Thinking Machines Lab, funded with $2 billion, publishes its first public post: cracking LLM randomness to achieve reproducible, "deterministic" inference
锦秋集· 2025-09-11 09:19
Core Insights
- The article addresses the fundamental issue of reproducibility in large language models (LLMs), attributing the nondeterminism of inference results not to "concurrent computation and floating-point error" but to the lack of "batch invariance" in the core computational kernels [1][7][11].

Group 1: Problem Identification
- Inference servers dynamically batch user requests, so a given request's result depends on the size and composition of the batch it lands in, introducing inherent nondeterminism [1][29].
- The article challenges the common belief that floating-point non-associativity alone is the root cause, arguing that the real issue lies in how the GPU kernels are implemented [20][21].

Group 2: Proposed Solutions
- The authors rewrite the Transformer's key computational modules (RMSNorm, matrix multiplication, and the attention mechanism) so that each is batch invariant: its output for a given input does not depend on batch size [2][34].
- Experiments show that with the new kernels, repeated requests yield identical results, whereas previously 1,000 identical requests produced 80 distinct outputs [2][75].

Group 3: Technical Implementation
- The article details batch-invariant implementations of RMSNorm, matrix multiplication, and attention, emphasizing reduction strategies that stay fixed regardless of batch size (a minimal sketch follows below) [34][47][62].
- It highlights the difficulty of maintaining batch invariance in attention, where the reduction order must remain consistent no matter how many tokens are being processed [66][72].

Group 4: Performance Analysis
- The batch-invariant kernels show roughly a 20% performance loss compared to cuBLAS, but remain efficient enough for LLM inference [59][78].
- The article notes that while the batch-invariant implementation is not yet heavily optimized, it is already viable for practical applications [78].

Group 5: Implications for Reinforcement Learning
- Deterministic inference enables true on-policy reinforcement learning (RL) by guaranteeing consistent results between training and inference [79][83].
- Achieving bitwise-identical results between the sampler and the trainer is crucial for effective RL training [80].
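To make the idea of batch invariance concrete, below is a minimal PyTorch sketch (not the Thinking Machines implementation; the function name, chunk size, and shapes are illustrative assumptions). It fixes the reduction order over the hidden dimension so a row's RMSNorm result cannot depend on how many other rows share the batch. In eager PyTorch this property is easy to obtain; the hard part described in the article is making fused GPU kernels behave the same way.

```python
import torch

def rmsnorm_batch_invariant(x: torch.Tensor, weight: torch.Tensor,
                            eps: float = 1e-6, chunk: int = 256) -> torch.Tensor:
    """RMSNorm with a fixed reduction order over the hidden dimension.

    Every row is reduced chunk by chunk in the same order, no matter how many
    rows (i.e. how large a batch) are processed together, so a given row sees
    the identical sequence of floating-point additions every time.
    """
    hidden = x.shape[-1]
    assert hidden % chunk == 0, "hidden size assumed to be a multiple of chunk"
    sq = x.float().pow(2).reshape(*x.shape[:-1], hidden // chunk, chunk)
    partial = sq.sum(dim=-1)                # per-chunk partial sums, fixed order
    mean_sq = partial.sum(dim=-1) / hidden  # combine chunks, fixed order
    inv_rms = torch.rsqrt(mean_sq + eps)
    return (x.float() * inv_rms.unsqueeze(-1)).to(x.dtype) * weight

# Determinism check: the same row should come out bitwise identical whether it
# is normalized alone or inside a larger batch.
torch.manual_seed(0)
w = torch.ones(1024)
row = torch.randn(1, 1024)
batch = torch.cat([row, torch.randn(31, 1024)], dim=0)
assert torch.equal(rmsnorm_batch_invariant(row, w)[0],
                   rmsnorm_batch_invariant(batch, w)[0])
```

The same principle carries over to matrix multiplication and attention: pick one tiling and split strategy and keep it fixed, rather than letting the kernel choose a different parallel reduction depending on batch size or token count.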
Jinqiu Capital portfolio company 数美万物: solving the challenge of turning Nano Banana outputs into physical objects and democratizing 3D design for everyone | Jinqiu Spotlight
锦秋集· 2025-09-11 04:00
Core Viewpoint
- The article discusses the emergence of "数美万物" (Math Magic) as a transformative player in the AI-driven creative economy, focusing on its platform "Hitems", which lets users turn creative ideas into physical products through advanced AI technologies [3][6][52].

Investment and Company Background
- 锦秋基金 (Jinqiu Capital) invested in 数美万物 in its 2024 angel round and its 2025 Pre-A round; the latter raised several million dollars and valued the company at approximately $150 million [3][4].
- The founding team of 数美万物 includes key members of the original Douyin (TikTok) team, strengthening its credibility and potential for innovation [3].

Technology and Platform Features
- The core platform Hitems uses generative AI to cover the full chain from creation to production and consumption, generating high-fidelity product images from keywords, pictures, or sketches [6].
- Hitems combines proprietary AI 3D-modeling technology with a mature supply-chain system to turn creators' ideas into tangible products, significantly lowering the barriers to 3D design and production [6][50].

User Engagement and Community Initiatives
- A recent campaign, "手办免费造" (Free Figurine Creation), engages users in turning AI-generated designs into physical products, with participants able to win free figurines [7][8].
- The platform encourages community interaction: users share their creations and join contests, fostering a collaborative environment [6][8].

Market Potential and Economic Impact
- The article argues that if ordinary individuals can design and create 3D models, a new wave of creativity and economic growth could follow [52].
- Democratizing access to 3D design tools is compared to the historical impact of Ford's assembly line on the automotive industry, suggesting a similar transformation of the creative sector [51][52].

Production and Commercialization
- "造好物" (Zaohaowu) offers a flexible supply-chain network that lets creators produce runs as small as a single item, drastically reducing the cost and complexity of bringing creative ideas to market [50].
- Commercialization also relies on data analytics to identify market trends and connect creators with potential consumers, improving the viability of new products [44][45].

Conclusion
- The integration of AI into the creative process is positioned as a game-changer, enabling far more people to participate in design and production and leading to a more vibrant, diverse creative economy [56].
Jinqiu Capital portfolio company Shengshu Technology launches a reference-based image generation feature: a homegrown Nano Banana arrives | Jinqiu Spotlight
锦秋集· 2025-09-11 02:29
Core Viewpoint
- Jinqiu Capital completed an investment in Shengshu Technology in 2023, in line with its focus on innovative AI startups with breakthrough technologies and business models [1][2].

Group 1: Investment and Company Overview
- Jinqiu Capital, a 12-year-term AI fund, emphasizes a long-term investment philosophy and seeks out AI startups with disruptive technologies [2].
- In September 2025, Shengshu Technology added a reference-image feature to its Vidu Q1 model that accepts up to 7 reference images, exceeding the limits of other domestic offerings [2][11].

Group 2: Product Features and Comparisons
- Vidu Q1's reference-image feature combines multiple images seamlessly with high consistency and realism, outperforming competitors such as Flux Kontext and Nano Banana [13][36].
- Support for up to 7 reference images is an industry-leading capability; most competing AI tools accept only 1-3 images [22][23].
- Vidu Q1 shows superior subject consistency, addressing common failure modes such as character distortion and loss of detail that are prevalent in other models [38][39].

Group 3: Creative Applications and Use Cases
- Vidu Q1 enables a wide range of creative applications, letting users change outfits, backgrounds, and props with a single image and a prompt [68][107].
- The tool can generate high-quality promotional materials for e-commerce, significantly reducing production time and cost compared with traditional methods [169][176].
- Its capabilities extend to complex scenes and character interactions, making it suitable for industries such as advertising and media [169][182].
Ten playful tests netizens have gone wild over: who can actually go toe-to-toe with Nano-Banana?
锦秋集· 2025-09-10 04:01
Core Viewpoint
- Nano-Banana has become a significant reference point in the AI model landscape, gaining popularity across user demographics ranging from tech enthusiasts and investors to casual users [2][3].

Group 1: Model Comparison
- A comparative evaluation was conducted between Nano-Banana and nine other popular models across a set of playful, user-devised tasks [3][6].
- The models evaluated include Google's Nano-Banana, OpenAI's GPT-Image-1, ByteDance's Seedream, Alibaba's Qwen-Image-Edit, Kuaishou's KeLing, MiniMax's Hailuo image-01, Tencent's Yuanbao, Baidu's Wenxin Yiyan, Black Forest Labs' Flux.1 Kontext, and SenseTime's SenseMirage Artist v2.1 [6][7].

Group 2: Task Performance
- Nano-Banana showed superior performance in local modification, style transfer, identity retention, narrative expression, and three-dimensional generation, consistently outperforming the other models [99][100].
- It excelled in specific tasks such as generating Funko Pop figures, maintaining character consistency during clothing changes, and producing coherent four-panel comics [19][30][64].
- Its results stood out for detail presentation, structural consistency, and natural appearance, making it a reliable choice for users [99][100].

Group 3: Market Insights
- The article highlights a gap in the Chinese market around generating text, particularly Chinese characters, within images, suggesting an opportunity for models that handle this more smoothly [106][107].
- Models that can turn entertaining interactions into business opportunities are expected to have a competitive edge in monetization [104][109].

Group 4: Future Directions
- Future AI models are expected to focus on higher-level capabilities, such as precise control over modifications and telling coherent stories across multiple images [108].
- Stability across everyday scenarios matters: users are likely to favor models that consistently deliver reliable results [103].
Why can 2025 seed-round teams do more with half the headcount? | Jinqiu Select
锦秋集· 2025-09-09 15:26
Carta's latest compensation and team report offers a striking answer to the question of the new cycle: how can startups use limited people and money to deliver efficiency and results and win investors' recognition?

In 2025, the average seed-stage startup team is 44% smaller than in 2021, down from 11 people to 6. Yet these companies have not stalled; by relying on AI tools and leaner organizational models, they have achieved higher output efficiency.

Behind this is a complete shift in startup logic. In the past golden era, founders told their story in terms of "scale" and "growth": how much was raised, how big the team was, how fast the company expanded. In the new, more cautious capital cycle, investors are no longer willing to pay for "growth piled up with headcount." What they want to see is whether a team can deliver the most solid results with the fewest resources.

This is also why AI founders will find several especially important signals in this report:

First, AI talent still holds pricing power. Salaries for AI/ML engineers have kept rising over the past 18 months, with an even higher premium for top talent. For founders, the key is not hiring more people but attracting the one or two core engineers who can carry the product's differentiation.

Second, small teams plus AI tools are becoming the new paradigm. Lean teams paired with AI toolchains are replacing the old "big team + assembly line" model. A six-person team can plausibly match the output that once took twenty people, which for ...
A technical panorama and future outlook for Agentic RL, based on 500 papers | Jinqiu Select
锦秋集· 2025-09-09 05:51
Core Viewpoint
- The development of Large Language Models (LLMs) is increasingly focused on enhancing their agentic capabilities through Reinforcement Learning (RL), a significant strategic direction for leading AI companies worldwide [1][2].

Group 1: Transition from Static Generators to Agentic Entities
- The rapid integration of LLMs with RL is fundamentally changing how language models are conceived, trained, and deployed, shifting from viewing LLMs as static generators to treating them as agentic entities capable of autonomous decision-making [4][5].
- This paradigm, termed Agentic Reinforcement Learning (Agentic RL), places LLMs inside sequential decision-making loops (a minimal sketch of such a loop appears after this summary), enhancing their capabilities in planning, reasoning, tool use, memory maintenance, and self-reflection [5][6].

Group 2: Need for a Unified Framework
- Despite a proliferation of research on LLM agents and on RL for LLMs, the field lacks a unified, systematic framework for Agentic RL that integrates theoretical foundations, algorithmic methods, and practical systems [7][8].
- Standardized tasks, environments, and benchmarks are essential for exploring scalable, adaptable, and reliable agentic intelligence [9].

Group 3: Evolution from Preference Tuning to Agentic Learning
- Early LLM training relied on behavior cloning and maximum-likelihood estimation; later methods aligned model outputs with human preferences, which in turn led to agentic reinforcement learning [10][12][14].
- The focus has shifted from optimizing over fixed preference datasets to developing agentic RL for specific tasks and dynamic environments, with fundamentally different assumptions, task structures, and decision granularity [14][19].

Group 4: Key Components of Agentic RL
- Agentic RL encompasses several key capabilities, including planning, tool use, memory, self-improvement, reasoning, and perception, which are interdependent and can be jointly optimized [51].
- Integrating RL into memory management allows agents to decide dynamically what to store, when to retrieve, and what to forget, improving adaptability and self-improvement [68][75].

Group 5: Tool Usage and Integration
- RL has become a central methodology for evolving tool-using language agents, moving from static imitation to dynamic optimization of tool use in context [61][65].
- Recent tool-integrated reasoning systems let agents decide autonomously when and how to use tools, adapting to new contexts and unexpected failures [66].

Group 6: Future Directions
- The future of agentic planning lies in combining external search with internal strategy optimization, blending intuitive fast planning with careful slow reasoning [58].
- Structured memory representations whose construction, optimization, and evolution can be dynamically controlled are an open and promising direction for enhancing agent capabilities [76].
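As a concrete illustration of the sequential decision loop that Agentic RL formalizes, here is a minimal, hypothetical Python sketch (names such as `run_episode`, the `TOOL:` convention, and the reward handling are assumptions, not taken from the survey). The agent alternates between reading memory, letting the policy LLM pick an action, optionally calling a tool, and writing the outcome back to memory; an RL algorithm such as PPO or GRPO would then optimize the policy over the collected trajectories.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

@dataclass
class Step:
    observation: str
    action: str
    reward: float

@dataclass
class AgentMemory:
    notes: List[str] = field(default_factory=list)

    def write(self, note: str) -> None:
        self.notes.append(note)

    def read(self, last_k: int = 3) -> str:
        return "\n".join(self.notes[-last_k:])

def run_episode(policy: Callable[[str], str],
                tools: Dict[str, Callable[[str], str]],
                env_step: Callable[[str], Tuple[str, float, bool]],
                memory: AgentMemory,
                max_turns: int = 8) -> List[Step]:
    """One trajectory of the agentic loop: observe, decide, optionally call a
    tool, act in the environment, and store the outcome in memory. The list of
    (observation, action, reward) steps is what an RL algorithm optimizes over."""
    trajectory: List[Step] = []
    observation = "task: start"
    for _ in range(max_turns):
        prompt = f"memory:\n{memory.read()}\n\nobservation:\n{observation}"
        action = policy(prompt)                    # the LLM decides what to do next
        if action.startswith("TOOL:"):             # e.g. "TOOL:search quantum error correction"
            name, _, arg = action[len("TOOL:"):].partition(" ")
            observation = tools.get(name, lambda a: "unknown tool")(arg)
            reward, done = 0.0, False
        else:
            observation, reward, done = env_step(action)
        memory.write(f"{action} -> {observation[:80]}")
        trajectory.append(Step(observation, action, reward))
        if done:
            break
    return trajectory
```

The key contrast with preference tuning is visible in the return value: the unit of optimization is a multi-step trajectory with intermediate observations and rewards, not a single prompt-response pair.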
Cutting off China, betting big to survive: the compute crisis behind Claude's service suspension | Jinqiu Select
锦秋集· 2025-09-05 15:17
Core Viewpoint
- Anthropic's decision to suspend Claude services for Chinese users reflects not only geopolitical pressure but also its ongoing struggle with computing capacity and strategic choices [2][3].

Group 1: Suspension of Services
- The suspension has significant implications for developers and companies in China, effectively excluding them from access to a leading AI model [1].
- The move is read as a response to a compute shortage: limiting market access lets Anthropic allocate resources to core clients in Europe and the U.S. [2].

Group 2: Strategic Partnerships and Technology Choices
- Anthropic is making a bold bet on Amazon's Trainium chips, opting to bypass Nvidia GPUs, which raises questions about the long-term viability of this strategy [3].
- The AWS partnership involves a substantial investment in data-center capacity, with plans for nearly one million Trainium chips to support future growth [3][18].
- Competition in generative AI is shifting from algorithmic capability to a broader contest over compute, chip technology, and capital investment [3].

Group 3: Implications for Domestic Entrepreneurs
- The suspension serves as a cautionary tale for domestic entrepreneurs, underscoring the need to find sustainable solutions amid uncertainty [4].
- Compute constraints are likely to remain a major bottleneck for AI startups, affecting both large-model companies and application-layer entrepreneurs [4].

Group 4: AWS's Position in the Cloud Market
- AWS remains the cloud-market leader but faces growing competition from Microsoft Azure and Google Cloud, both of which have made significant strides in AI capabilities [12].
- Despite talk of a "cloud crisis," AWS's AI business is predicted to revive, with expected annual growth rates exceeding 20% by the end of 2025 [14].
- Anthropic's revenue is projected to grow rapidly, from $1 billion to $5 billion by 2025, underscoring the potential benefits of its partnership with AWS [18][31].

Group 5: Cost of Ownership Analysis
- Trainium chips, while currently less powerful than Nvidia's offerings, present a total cost of ownership (TCO) advantage in specific scenarios, particularly in memory bandwidth (a schematic cost comparison follows below) [50][54].
- The TCO analysis suggests that Trainium's cost efficiency aligns well with Anthropic's aggressive scaling of reinforcement learning [54].

Group 6: Future Outlook
- Anthropic's deep involvement in Trainium chip design gives it an unusual position among AI labs, potentially letting it leverage custom hardware for better performance [54].
- AWS data centers built specifically for Anthropic's needs are expected to contribute significantly to AWS revenue growth by 2025 [38][40].
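The TCO argument in Group 5 is essentially a ratio of amortized cost to delivered capability, such as memory bandwidth. The sketch below shows only the structure of such a comparison; every figure is a placeholder assumption rather than an actual chip price, lifetime, or bandwidth number, and the article does not publish its underlying inputs.

```python
def cost_per_bandwidth(capex_usd: float, lifetime_years: float,
                       opex_usd_per_year: float, mem_bandwidth_gbps: float) -> float:
    """Amortized annual cost per GB/s of memory bandwidth.
    All inputs are placeholders for illustration, not real chip figures."""
    annual_cost = capex_usd / lifetime_years + opex_usd_per_year
    return annual_cost / mem_bandwidth_gbps

# Hypothetical accelerators A and B (placeholder numbers only):
chip_a = cost_per_bandwidth(capex_usd=30_000, lifetime_years=4,
                            opex_usd_per_year=4_000, mem_bandwidth_gbps=3_000)
chip_b = cost_per_bandwidth(capex_usd=12_000, lifetime_years=4,
                            opex_usd_per_year=3_000, mem_bandwidth_gbps=1_500)
print(f"A: ${chip_a:.2f} per GB/s-year, B: ${chip_b:.2f} per GB/s-year")
# A lower ratio means cheaper delivered bandwidth; this is the kind of scenario
# in which a nominally weaker chip can still win on TCO.
```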
No-code or no use? A head-to-head review of 11 AI coding products: who crosses the "usable" threshold first?
锦秋集· 2025-09-04 14:03
Core Viewpoint
- The article evaluates a range of AI coding tools to determine how effectively they turn quick drafts into deliverable products, focusing on their performance on real business tasks [3][12].

Group 1: AI Coding Tools Overview
- The evaluation covers a set of representative AI coding products and platforms, including Manus, Minimax, Genspark, Kimi, Z.AI, Lovable, Youware, Metagpt, Bolt.new, Macaron, and Heyboss, spanning both general-purpose tools and low-code solutions [6].
- The assessment is based on six real-world tasks designed to measure efficiency, quality, controllability, and sustainability [14].

Group 2: Performance Metrics
- Each product was scored on four dimensions: efficiency (speed and cost), quality (logic and expressiveness), controllability (flexibility in meeting requirements), and sustainability (post-editing and practical applicability); a hypothetical scoring sketch follows at the end of this summary [14].
- The tools varied considerably in content accuracy, information density, and logical coherence [40][54].

Group 3: Specific Tool Highlights
- Manus: capable of autonomous task execution with multi-modal processing and adaptive learning [8].
- Minimax: supports advanced programming and multi-modal generation across text, image, voice, and video [8].
- Genspark: can automate business processes by orchestrating various external tools [8].
- Z.AI: acts as an intelligent coding agent for full-stack website construction through multi-turn dialogue [10].
- Lovable: quickly generates user interfaces and backend logic from prompts [10].

Group 4: Evaluation Results
- Minimax and Manus performed best on content completeness and logical clarity, with Minimax providing a detailed framework backed by real information [31][54].
- Genspark and Z.AI followed closely, offering clear logic and concise presentation, though with less analytical depth [39][55].
- Kimi, Lovable, and MetaGPT struggled with accuracy and depth, often producing vague or fabricated information [32][54].

Group 5: Usability and Aesthetics
- Most products achieved a clean, clear presentation, but some, such as Kimi and Macaron, were overly simplistic and lacked necessary detail [26][44].
- Minimax and Genspark were noted for balanced structure and interactive design, making them suitable for direct use in educational contexts [49].
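To illustrate how the four evaluation dimensions could be combined into a ranking, here is a hypothetical scoring sketch; the weights, scores, and the tool names `tool_a` and `tool_b` are placeholders, and the article itself does not disclose a numeric scoring formula.

```python
from dataclasses import dataclass

@dataclass
class ToolScore:
    name: str
    efficiency: float       # speed and cost, 0-10
    quality: float          # logic and expressiveness, 0-10
    controllability: float  # flexibility in meeting requirements, 0-10
    sustainability: float   # post-editing and practical applicability, 0-10

    def weighted(self, weights=(0.25, 0.25, 0.25, 0.25)) -> float:
        dims = (self.efficiency, self.quality, self.controllability, self.sustainability)
        return sum(d * w for d, w in zip(dims, weights))

# Placeholder scores for two tools, for illustration only:
ranking = sorted(
    [ToolScore("tool_a", 8, 9, 7, 8), ToolScore("tool_b", 7, 6, 8, 5)],
    key=lambda t: t.weighted(), reverse=True,
)
print([(t.name, round(t.weighted(), 2)) for t in ranking])
```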
Jinqiu Capital portfolio company 地瓜机器人: from VGGT to the data loop, breakthroughs and exploration in embodied intelligence
锦秋集· 2025-09-03 04:30
The following article is reprinted from 42号电波 (author: 大吉), a tech media outlet under 42HOW focused on AI smart hardware and embodied intelligence, dedicated to in-depth reporting on frontier technology and to exploring how intelligent technology reshapes future life.

In 2025, Jinqiu Capital completed its investment in 地瓜机器人. Jinqiu Capital, a 12-year-term AI fund, holds long-termism as its core investment philosophy and actively seeks general-AI startups with breakthrough technologies and innovative business models. On August 29, 2025, 42号电波 published a report on 地瓜机器人; the reprint follows.

A conversation with 地瓜机器人's 隋伟: VGGT, the data loop, and embodied intelligence, a new chapter for the robotics industry

Artificial intelligence is undergoing another paradigm shift. A decade ago, autonomous driving was the frontier stage: from early modular perception and rule-based systems, to the fusion of BEV and Transformer after 2019, to today's end-to-end large-model approach, the industry went through several iterations and has now entered a period of engineering convergence. Meanwhile, a parallel track, robotics, has been quietly rising.

In 2023, 隋伟 chose to move from Horizon Robotics (地平线) to 地瓜机器人 precisely because he saw this inflection point. He said frankly in the interview that the autonomous-driving technology stack has largely converged, with mostly engineering optimization left, whereas robotics is still early, uncharted territory: hardware form factors are not yet unified, and algorithm frameworks ...
Lessons from 28 Jinqiu Dinner Table gatherings: product, users, and technology, the threefold challenge for AI entrepreneurs
锦秋集· 2025-09-03 01:32
Core Insights
- The article reviews the ongoing series of closed-door gatherings called the "Jinqiu Dinner Table" for AI entrepreneurs, where participants share genuine experiences and insights without the usual corporate formalities [1][3].

Group 1: Event Overview
- Since its start in late February, the "Jinqiu Dinner Table" has hosted 28 events, bringing top entrepreneurs and tech innovators together to discuss real challenges and decision-making in a relaxed setting [1].
- Events are held weekly in major cities including Beijing, Shenzhen, Shanghai, and Hangzhou, emphasizing authentic exchange over formal presentations [1].

Group 2: AI Entrepreneur Insights
- Recent discussions have surfaced both the anxieties and the breakthroughs of AI entrepreneurs, emphasizing the need for collaboration and shared learning [1].
- Notable participants include leaders from various AI sectors, contributing diverse perspectives on the industry's challenges and opportunities [1].

Group 3: Technological Developments
- The article outlines advances in multi-modal AI applications, discussing how hardware and software can be integrated to improve user experience and data collection [18][20].
- A key topic is first-person data capture through wearable devices, which can significantly improve an AI's understanding of user interactions [20][21].

Group 4: Memory and Data Management
- Multi-modal memory systems are being developed to weave disparate data types into cohesive narratives, improving the efficiency of information retrieval and user interaction (a toy retrieval sketch follows below) [22][24].
- Techniques for data compression and retrieval are being refined to make multi-modal data more usable, which is crucial for AI applications [24][25].

Group 5: Future Directions
- The future of AI is expected to involve more integrated, user-friendly systems, with a focus on emotional engagement and social interaction [33].
- New platforms may emerge from novel content-consumption patterns, with proof of concept needed before scaling [34][36].
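As one way to picture the multi-modal memory idea in Group 4, here is a toy, hypothetical sketch of a memory store in which items from different modalities share one embedding space, so a single query can retrieve mixed text, image, and audio memories. The class and field names are assumptions and do not describe any specific system discussed at the events.

```python
from dataclasses import dataclass
from typing import List
import numpy as np

@dataclass
class MemoryItem:
    modality: str          # "text", "image", "audio", ...
    summary: str           # compressed description used when composing a narrative
    embedding: np.ndarray  # vector in a shared embedding space across modalities

class MultimodalMemory:
    """Toy store: items from any modality live in one embedding space, so
    retrieval can mix text, image, and audio memories in a single query."""

    def __init__(self) -> None:
        self.items: List[MemoryItem] = []

    def add(self, item: MemoryItem) -> None:
        self.items.append(item)

    def retrieve(self, query_embedding: np.ndarray, k: int = 3) -> List[MemoryItem]:
        def cos(a: np.ndarray, b: np.ndarray) -> float:
            return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))
        return sorted(self.items,
                      key=lambda it: cos(query_embedding, it.embedding),
                      reverse=True)[:k]
```

Compression in this picture amounts to keeping only the summary and the embedding rather than the raw media, which is what keeps later retrieval cheap.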