Founder Park
Embodied Intelligence Still Needs "Five Years of Patience"
Founder Park· 2025-09-18 03:04
The following article is sourced from 张鹏科技商业观察 (Zhang Peng's Tech & Business Observations), a column on tech and business, written by Zhang Peng.

Last month I flew to Silicon Valley again and spoke with scientists and founders working on embodied intelligence. The core takeaway: the grand story of embodied intelligence still requires "five years of patience" from us. That judgment comes from breaking down where the field stands today, its core bottlenecks, and its likely evolution path. Some core observations:

Forcing a still-immature general-purpose robot into an industrial production line built around precision and efficiency is, today, extremely challenging. Current general-purpose robots essentially trade "generality" for "precision" and "efficiency"; pushing far-from-mature humanoid robots onto production lines, applying "generality" to exactly the scenarios that demand the most precision and efficiency, is something of a mismatch.

A reasonable expectation is that within one to two years, embodied intelligence could reach its "GPT-3.0 moment": in laboratory settings, insiders will see clear technical breakthroughs in general-purpose robot models (the "brain" plus the "cerebellum") and reach consensus on the mainstream technical route.

Producing the "real-world data" used for robot training faces three limitations: first, scale cannot go up; second, cost cannot come ...
Shopify's Lessons: How to Build a Production-Grade AI Agent System
Founder Park· 2025-09-17 12:50
Core Insights
- Shopify's experience developing the AI assistant Sidekick traces the evolution from a simple tool to a complex AI agent platform, emphasizing the importance of architecture, evaluation methods, and training techniques [2][4].

Group 1: Evolution of Sidekick Architecture
- The core of Sidekick is the "agentic loop": human input is processed by a large language model (LLM), actions are executed, feedback is collected, and the cycle continues until the task is completed (a minimal sketch follows after this summary) [5].
- Keeping the architecture simple and giving tools clear boundaries are crucial to effective design [6].
- As functionality expanded, tool complexity led to the "Death by a Thousand Instructions" problem, which hurt system speed and maintainability [10][12].

Group 2: Evaluation System for LLMs
- A robust evaluation system is essential for deploying agent systems, since traditional software testing methods are inadequate for the probabilistic outputs of LLMs [17].
- The shift from "golden datasets" to "Ground Truth Sets" reflects a focus on real-world data distribution, making evaluation standards more relevant [20].
- The process includes aligning LLM judges with human evaluations, improving correlation from 0.02 to 0.61, close to human benchmarks [21].

Group 3: Training and Reward Mechanisms
- The Group Relative Policy Optimization (GRPO) method was adopted for model fine-tuning, using LLM judges as reward signals [31].
- "Reward hacking," where models exploited the reward system, was identified and required updates to both syntax validators and LLM judges [32][34].
- Iterative improvements addressed these challenges and made the training process more reliable [34].

Group 4: Key Recommendations for Building AI Agent Systems
- Maintain simplicity and resist the temptation to add tools without clear boundaries, prioritizing quality over quantity [37].
- Start with modular designs such as "Just-in-Time Instructions" so the system stays understandable as it scales [37].
- Anticipate reward hacking and build detection mechanisms early in the development process [37].
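To make the agentic loop concrete, here is a minimal sketch in Python. It assumes a hypothetical `call_llm` function, a tiny invented tool registry, and a made-up message format; it illustrates the loop pattern described above and is not Shopify's Sidekick implementation.

```python
# Minimal agentic-loop sketch (illustrative only, not Shopify's Sidekick code).
# `call_llm`, the tool registry, and the message format are assumptions for this example.

import json

def call_llm(messages):
    """Stubbed 'model': asks for a tool once, then answers. A real system calls an LLM API."""
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "search_products", "args": {"query": "blue mugs"}, "content": None}
    return {"tool": None, "content": "Found 3 blue mugs; the cheapest is $12."}

TOOLS = {
    # Each tool has a narrow, clearly bounded job, echoing the "quality over quantity" advice above.
    "search_products": lambda query: json.dumps({"query": query, "results": ["mug-1", "mug-2", "mug-3"]}),
}

def agentic_loop(user_input, max_steps=10):
    messages = [{"role": "user", "content": user_input}]
    for _ in range(max_steps):
        reply = call_llm(messages)                            # model decides the next action
        if reply["tool"] is None:                             # no tool requested: task complete
            return reply["content"]
        observation = TOOLS[reply["tool"]](**reply["args"])   # execute the requested tool
        messages.append({"role": "assistant", "content": json.dumps(reply)})
        messages.append({"role": "tool", "content": observation})  # feed the result back to the model
    return "Stopped after max_steps without finishing."

print(agentic_loop("Find me a cheap blue mug"))
```

In a production system, `call_llm` would call a model provider's API, and the registry would hold the small set of clearly bounded tools the article recommends.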
Two Reports, Two Kinds of PMF: ChatGPT Proved Out the Copilot, Claude Validated the Agent
Founder Park· 2025-09-17 12:50
Core Insights
- The article highlights the distinct user mindsets and usage patterns of OpenAI's ChatGPT and Anthropic's Claude, indicating that ChatGPT is more suited to conversational tasks while Claude is geared toward executing tasks [4][5][6].

User Demographics and Engagement
- ChatGPT has reached 700 million weekly active users, roughly 10% of the global adult population, while Anthropic has provided insight into both consumer and enterprise usage for the first time [4][22].
- ChatGPT's user base has grown rapidly, surpassing 1 million users within five days of launch and reaching 350 million within two years [22].
- The proportion of non-work-related messages sent by ChatGPT users increased from 53% in June 2024 to 73% in June 2025, indicating a shift toward more casual usage [25].

Usage Patterns
- ChatGPT is primarily used for practical guidance, information seeking, and writing, with these categories accounting for approximately 77% of use cases [30].
- Claude's usage is shifting toward automation, with 39% of interactions being directive automation, surpassing collaborative-augmentation interactions [42].
- By task type, writing accounts for about 40% of work-related ChatGPT messages, while coding tasks have become more prevalent in Claude's usage [28][20].

Task Execution and Collaboration
- ChatGPT's interaction model is conversational, letting users refine results through dialogue, while Claude's model is more directive and focused on task completion [18][9].
- The report indicates that 77% of enterprise tasks using Claude are automated, highlighting a preference for systematized task execution over collaborative efforts [54][55].
- The analysis shows that higher-income countries use AI for diverse knowledge work, while lower-income countries focus on single programming tasks [46].

User Characteristics
- ChatGPT's user base is becoming more gender-balanced, with a notable shift toward female users by mid-2025 [34].
- Younger users (18-25) send a significant portion of messages, but older users have a higher proportion of work-related messages [40].

Economic Implications
- The report suggests that automating tasks with AI could drive significant economic transformation and productivity gains [20].
- Companies are increasingly willing to take on high-cost tasks, indicating a focus on capability and value rather than cost [60][61].
Forbes Report: With $250 Million in Annualized Revenue and Over a Million Devices Sold, How Does Plaud Make Money?
Founder Park· 2025-09-17 05:40
Core Insights
- Plaud is one of the few profitable AI hardware startups; it recently launched the upgraded Note Pro with a larger battery and a 0.95-inch micro-screen and is on track for $250 million in annual revenue [2][4][6].
- The company has adopted a "hardware + subscription" business model, with roughly half of its revenue coming from annual AI subscription services [6][11][13].
- Plaud's founder, Xu Gao, emphasizes the importance of user consent for recording and positions the product as a professional tool rather than a device for covert recording [6][11].

Group 1: Company Overview
- Plaud's NotePin device has sold over 1 million units since its 2023 launch, targeting busy professionals such as doctors and lawyers [4][10].
- The company has not relied on venture capital for growth, instead funding itself through personal savings and a crowdfunding campaign [6][10].
- Xu Gao believes wearable AI devices will become more prevalent than smartphones within the next decade [7].

Group 2: Market Position and Competition
- Plaud is positioned as a leader in the wearable AI device market, ahead of competitors such as Rabbit and Humane, which have struggled [5][14].
- The company focuses exclusively on overseas markets to avoid intense competition in China [10].
- Major tech companies such as Apple and Microsoft are expected to enter the market, but Xu Gao believes it will take them years to develop truly disruptive products [14][15].

Group 3: Financial Performance
- Plaud's annual revenue is projected to reach $250 million, with a profit margin comparable to Apple's 25% on iPhones [4][6].
- The company is actively working to strengthen its financial position to attract more U.S. investors and raise $500 million for future growth [12][13].

Group 4: Future Outlook
- Xu Gao plans to expand Plaud's product line to new form factors such as rings and headphones, aiming to augment human intelligence [15].
- The company is also developing specialized templates for various professional scenarios to better serve its core user base [10].
RTE Developer Community Demo Day, S Innovation Shanghai Tech Conference: Recent High-Quality AI Events, All in One Place
Founder Park· 2025-09-16 13:22
Group 1
- The article highlights several upcoming AI events, including the "AI Creator Carnival" hosted by Silicon Star, which runs September 17-21, 2025 and features technology exchanges, product demos, and workshops [2][4].
- The Voice Agent Camp organized by the RTE Developer Community will showcase 17 demo projects related to AI voice services, including AI customer service and AI companionship, on September 22, 2025 [5][6].
- S Innovation Shanghai 2025, organized by Slush China, will take place on September 23-24, 2025, featuring six stages with discussions spanning fields such as green technology and healthcare [6].

Group 2
- The Cloud Summit will feature a dedicated exhibition area for Generation Z innovators, showcasing 50 outstanding AI works and aiming to engage 60,000 attendees from around the world [7][8].
- The article provides registration links and further event details, encouraging participation from AI builders, investors, and industry researchers [5][6][9].
$200 Million ARR: How ElevenLabs, the Best-Monetizing Company in AI Voice, Achieved Rapid Growth
Founder Park· 2025-09-16 13:22
Core Insights
- ElevenLabs has reached a valuation of $6.6 billion; its first $100 million in ARR took 20 months and the second $100 million only 10 months [2].
- The company is recognized as the fastest-growing AI startup in Europe, operating in a highly competitive AI voice sector [3].
- The CEO emphasizes combining research and product development to ensure market relevance and user engagement [3][4].

Company Growth and Strategy
- The initial idea for ElevenLabs stemmed from poor movie dubbing experiences in Poland, which revealed the potential of audio technology [4][5].
- The company took a dual approach of technical development and market validation, initially reaching out to YouTubers to gauge interest in the product [7][8].
- A significant pivot came when the focus shifted from dubbing to a more emotional and natural text-to-speech model, driven by user feedback [9][10].

Product Development and Market Fit
- The company did not find product-market fit (PMF) until it shifted focus to simpler voice-generation needs, which resonated more with users [10].
- Key milestones on the way to PMF included a viral blog post and successful early user testing, which significantly increased user interest [10].
- The company continues to explore how to ensure long-term value creation for users, indicating it has not fully settled on PMF yet [10].

Competitive Advantages
- ElevenLabs keeps its team small to maximize execution speed and adaptability, seen as a core advantage over larger competitors [3][19].
- The company has a top-tier research team and a focused approach to voice AI applications, which differentiates it from larger players such as OpenAI [16][18].
- The CEO believes the company's product development and execution capabilities provide a competitive edge, especially in creative voice applications [17][18].

Financial Performance
- ElevenLabs recently surpassed $200 million in revenue, reaching the milestone in a remarkably short timeframe [33].
- The company aims to continue this trajectory, aspiring to reach $300 million in revenue within a short period [39][40].
- The CEO highlights the importance of maintaining a healthy revenue structure while delivering real value to customers [44].

Investment and Funding Strategy
- The company faced significant challenges securing initial funding, with more than 30 investors rejecting its seed round [64][66].
- Each funding round is strategically tied to product developments or user milestones rather than announced for publicity [70].
- The CEO stresses not staying in a perpetual fundraising state, advocating clear objectives behind each funding announcement [70].
OpenAI Releases GPT-5-Codex: 7 Hours of Independent Coding, Dynamic Resource Adjustment, and Lower Token Consumption
Founder Park· 2025-09-16 03:24
Core Insights
- OpenAI has released a new model designed specifically for programming tasks, GPT-5-Codex, a specialized version of GPT-5 [3][4].
- GPT-5-Codex features a "dual-mode" capability, being both fast and reliable, with improved responsiveness for small and large tasks alike [5][6].
- The model can run large-scale refactoring tasks for up to 7 hours continuously, showcasing its efficiency [7].

Performance and Features
- On SWE-bench verification and code-refactoring tasks, GPT-5-Codex outperformed the previous GPT-5-high model, reaching 51.3% accuracy versus 33.9% [9][10].
- The model dynamically adjusts resource allocation based on task complexity, cutting token consumption by 93.7% for simpler tasks while doubling processing time for more complex requests [12][13].
- Code review capability improved significantly, with incorrect comments dropping from 13.7% to 4.4% and high-impact comments rising from 39.4% to 52.4% [16][18].

Integration and User Experience
- The model supports multiple modes of interaction, including terminal vibe coding, IDE editing, and GitHub integration, catering to different developer preferences [32].
- OpenAI emphasizes "harnessing" the model, integrating it with infrastructure so it can execute real-world tasks [29][34].
- User experience is enhanced by a code-completion response time under 1.5 seconds, crucial for maintaining developer productivity [30].

Competitive Landscape
- The release of GPT-5-Codex intensifies competition in programming AI, with domestic and international players building similar programming agents [45][46].
- Notable competitors include Cursor, Gemini CLI, and Claude Code, which focus on execution capability and seamless integration with development environments [51][52].
- The market is evolving rapidly, with many companies racing to establish their programming AI solutions, signaling a significant shift in software development practices by 2030 [43][54].
Zhang Xiaojun in Conversation with OpenAI's Shunyu Yao: Systems That Generate New Worlds
Founder Park· 2025-09-15 05:59
Core Insights
- The article discusses the evolution of AI, focusing on the transition to the "second half" of AI development and emphasizing the importance of language and reasoning in building more generalizable AI systems [4][62].

Group 1: AI Evolution and Language
- The concept of AI has evolved from rule-based systems to deep reinforcement learning, and now to language models that can reason and generalize across tasks [41][43].
- Language is highlighted as a fundamental tool for generalization, allowing AI to tackle a wide variety of tasks by leveraging reasoning capabilities [77][79].

Group 2: Agent Systems
- The definition of an "Agent" has expanded to include systems that can interact with their environment and make decisions based on reasoning, rather than just following predefined rules [33][36].
- The development of language agents marks a significant shift, as they can perform tasks in more complex environments, such as coding and internet navigation, that were previously challenging for AI [43][54].

Group 3: Task Design and Reward Mechanisms
- The article emphasizes the importance of defining effective tasks and environments for AI training, suggesting that the current bottleneck lies in task design rather than model training [62][64].
- A focus on intrinsic rewards, based on outcomes rather than processes, is proposed as a key factor for successful reinforcement learning applications [88][66].

Group 4: Future Directions
- The future of AI development is seen as combining better memory systems and intrinsic rewards to enhance agent capabilities, along with exploring multi-agent systems [88][89].
- AI's potential to generalize across tasks is highlighted, with coding and mathematical tasks serving as prime examples of areas where AI can excel [80][82].
RAG Is a Terrible Concept That Makes Everyone Overlook the Most Critical Problem in Application Building
Founder Park· 2025-09-14 04:43
Core Viewpoint
- The article emphasizes the importance of Context Engineering in AI development, criticizing the current RAG (Retrieval-Augmented Generation) trend as a misleading concept that oversimplifies complex processes [5][6][7].

Group 1: Context Engineering
- Context Engineering is considered crucial for AI startups, as it focuses on effectively managing the information in the context window during model generation [4][9].
- Context Rot, where model performance deteriorates as the number of tokens grows, highlights the need for better context management [8][12].
- Effective Context Engineering involves two loops: an internal loop that selects relevant content for the current context, and an external loop that learns to improve information selection over time (a minimal sketch of the internal loop follows after this summary) [7][9].

Group 2: Critique of RAG
- RAG is described as a confusing amalgamation of retrieval, generation, and combination, which leads to misunderstandings in the AI community [5][6].
- The article argues that RAG has been marketed as merely using embeddings for vector search, which is a shallow interpretation [5][7].
- The author expresses a strong aversion to the term RAG, suggesting it crowds out more meaningful discussion of AI development [6][7].

Group 3: Future Directions in AI
- Two promising directions for future AI systems are continuous retrieval and staying within the embedding space, which could improve performance and efficiency [47][48].
- The potential for models to learn to retrieve information dynamically during generation is highlighted as an exciting area of research [41][42].
- The article suggests that retrieval systems may evolve toward a more integrated approach in which models generate and retrieve information simultaneously [41][48].

Group 4: Chroma's Role
- Chroma is positioned as a leading open-source vector database aimed at making it easier to build AI applications by providing robust search infrastructure [70][72].
- The company emphasizes developer experience, aiming for a seamless integration process that lets users deploy and use the database quickly [78][82].
- Chroma's architecture is designed to be modern and efficient, using distributed systems and a serverless model to optimize performance and cost [75][86].
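As a rough illustration of the "internal loop" of context engineering described above, the sketch below retrieves a small number of relevant documents with Chroma's Python client and assembles only those into the prompt. The collection name, documents, and prompt format are invented for the example; this is not code from the article.

```python
# Minimal context-engineering sketch with Chroma (illustrative; documents and prompt are made up).
import chromadb

client = chromadb.Client()  # in-memory client with the default embedding function; use a persistent/server client in production
docs = client.create_collection(name="support_notes")

docs.add(
    ids=["n1", "n2", "n3"],
    documents=[
        "Refunds are processed within 5 business days.",
        "Password resets require a verified email address.",
        "Enterprise plans include SSO and audit logs.",
    ],
)

def build_context(question: str, k: int = 2) -> str:
    """Internal loop: select only the k most relevant notes for this generation step."""
    hits = docs.query(query_texts=[question], n_results=k)
    selected = hits["documents"][0]
    # Assemble a compact context window instead of stuffing everything in, to limit context rot.
    return "\n".join(f"- {d}" for d in selected) + f"\n\nQuestion: {question}"

print(build_context("How long do refunds take?"))
```

The external loop the article describes would sit around this: logging which retrieved items actually helped and adjusting the selection strategy over time.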
Next Tuesday: Your Agent Is Built, Now Learn How to Push Cost Control to the Extreme
Founder Park· 2025-09-14 04:43
Core Insights
- AI Agents have become a standard feature of AI products, but the hidden costs of operating them, such as multi-turn tool calls and extensive context memory, can drive significant token consumption [2].

Cost Control Strategies
- Using a fully managed serverless platform such as Cloud Run is an effective way to control costs for AI Agent applications: it scales automatically with request volume and costs nothing while idle [3][7].
- Cloud Run can expand from zero to hundreds or thousands of instances within seconds based on real-time request volume, allowing dynamic scaling that balances stability and cost control (a minimal service sketch follows after this summary) [7][9].

Upcoming Event
- An event featuring Liu Fan, a Google Cloud application modernization expert, will cover developing with Cloud Run and techniques for pushing cost control to the limit [4][9].
- The session will use real-world examples and monitoring charts of request volume, instance count, and response latency to demonstrate Cloud Run's scaling capabilities [9].
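As a hedged sketch of the deployment model described above, the snippet below is a minimal HTTP service that reads the PORT environment variable, which is how Cloud Run tells a container where to listen; the handler logic is a placeholder and not from the article. Packaged into a container image and deployed with a minimum instance count of zero, a service like this incurs no instance cost while idle and scales out as requests arrive.

```python
# Minimal container-friendly service sketch for a scale-to-zero platform such as Cloud Run.
# Cloud Run injects the PORT environment variable; the request handling here is illustrative only.
import json
import os
from http.server import BaseHTTPRequestHandler, HTTPServer

class AgentHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        # Placeholder for the agent call; keep per-request work bounded so instances scale cleanly.
        reply = {"echo": body.decode("utf-8", errors="replace")}
        payload = json.dumps(reply).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(payload)

if __name__ == "__main__":
    port = int(os.environ.get("PORT", "8080"))   # Cloud Run sets PORT; default to 8080 for local runs
    HTTPServer(("0.0.0.0", port), AgentHandler).serve_forever()
```

When deploying, setting the service's minimum instances to zero and an appropriate maximum is what yields the idle-cost-zero, scale-on-demand behavior mentioned above.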