Founder Park
ElevenLabs at $200 Million ARR: How Did the Best-Monetizing Company in AI Voice Grow So Fast?
Founder Park· 2025-09-16 13:22
Valued at $6.6 billion, ElevenLabs took 20 months to reach its first $100 million in ARR and only 10 months to add the second. The AI audio unicorn is arguably the fastest-growing AI startup in Europe. As voice becomes a key interface between people and technology, competition in the AI voice space is fierce: Murf.ai, Play.ht, WellSaid Labs... and with tech giants like OpenAI, Google, and Microsoft closing in, it was anything but easy for ElevenLabs to break out. In its early fundraising, nearly every investor the company approached said no; while validating market demand, the team sent thousands of outreach emails to YouTubers, one by one, and got only a handful of positive replies. How did ElevenLabs grow from a small company into an AI voice unicorn? CEO Mati Staniszewski looked back on the journey and shared lessons learned in a podcast conversation: once technology reaches a certain stage of R&D, it ultimately commoditizes; a research edge alone is not enough, you have to win on product. 11 ...
OpenAI Releases GPT-5-Codex: 7 Hours of Independent Coding, Dynamic Resource Adjustment, and Lower Token Consumption
Founder Park· 2025-09-16 03:24
Reposted from 「新智元」 (Xinzhiyuan), with edits. Today, OpenAI released GPT-5-Codex, a new model dedicated to programming tasks. GPT-5-Codex is a specialized version of GPT-5, redesigned for agentic coding, with comprehensive "dual-mode" strengths:
- Real-time collaboration: works alongside developers in real time, quickly answering questions and fixing small bugs.
- Independent execution: autonomously drives complex tasks over long stretches (e.g., large-scale refactoring, cross-file debugging).
In short, GPT-5-Codex is not only fast but also more reliable. Its interactive responses are snappier: small tasks finish almost instantly, while large tasks can keep running for hours. In OpenAI's internal testing it completed a large-scale refactor over 7 continuous hours. Blog post: https://openai.com/index/introducing-upgrades-to-codex/
01 Dynamically adjusting resources to the task, able to independently complete long, complex jobs. First, on SWE-bench Verified and code-refactoring tasks, GPT-5-Codex ...
Zhang Xiaojun in Conversation with OpenAI's Shunyu Yao: Systems That Generate New Worlds
Founder Park· 2025-09-15 05:59
Core Insights
- The article discusses the evolution of AI, particularly the transition to the "second half" of AI development, emphasizing the importance of language and reasoning in creating more generalizable AI systems [4][62].

Group 1: AI Evolution and Language
- The concept of AI has evolved from rule-based systems to deep reinforcement learning, and now to language models that can reason and generalize across tasks [41][43].
- Language is highlighted as a fundamental tool for generalization, allowing AI to tackle a variety of tasks by leveraging reasoning capabilities [77][79].

Group 2: Agent Systems
- The definition of an "Agent" has expanded to include systems that interact with their environment and make decisions based on reasoning, rather than just following predefined rules [33][36].
- Language agents represent a significant shift, as they can perform tasks in more complex environments, such as coding and internet navigation, which were previously challenging for AI [43][54].

Group 3: Task Design and Reward Mechanisms
- Defining effective tasks and environments for AI training matters; the current bottleneck lies in task design rather than model training [62][64].
- A focus on intrinsic rewards, based on outcomes rather than processes, is proposed as a key factor for successful reinforcement learning applications [88][66].

Group 4: Future Directions
- The future of AI development is seen as a combination of enhancing agent capabilities through better memory systems and intrinsic rewards, and exploring multi-agent systems [88][89].
- The potential for AI to generalize across varied tasks is highlighted, with coding and mathematical tasks serving as prime examples of areas where AI can excel [80][82].
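To make the expanded notion of an agent concrete, here is a minimal sketch of a language-agent loop in which reasoning over the observation history, not a table of predefined rules, selects each action. Every name in it (llm_reason, ToyEnv, the action strings) is a hypothetical placeholder, not something from the interview.

```python
# Minimal language-agent loop: the model reasons over accumulated
# observations and picks the next action; the environment responds.
# All names and actions are illustrative placeholders.

def llm_reason(history: list[str]) -> str:
    """Stand-in for a language-model call mapping context to an action."""
    return "search('topic')" if len(history) < 3 else "finish"

def run_agent(env, max_steps: int = 10) -> list[str]:
    history = [env.reset()]               # initial observation
    for _ in range(max_steps):
        action = llm_reason(history)      # reasoning replaces fixed rules
        if action == "finish":
            break
        history.append(env.step(action))  # act, observe, and remember
    return history

class ToyEnv:
    """Hypothetical environment with the usual reset/step interface."""
    def reset(self) -> str:
        return "question: summarize this repo"
    def step(self, action: str) -> str:
        return f"result of {action}"

print(run_agent(ToyEnv()))
```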
RAG Is a Terrible Concept That Makes Everyone Overlook the Most Critical Problem in Application Building
Founder Park· 2025-09-14 04:43
Core Viewpoint
- The article emphasizes the importance of Context Engineering in AI development, criticizing the current trend of RAG (Retrieval-Augmented Generation) as a misleading concept that oversimplifies complex processes [5][6][7].

Group 1: Context Engineering
- Context Engineering is considered crucial for AI startups, as it focuses on effectively managing the information within the context window during model generation [4][9].
- The concept of Context Rot, where model performance deteriorates as the token count grows, highlights the need for better context management [8][12].
- Effective Context Engineering involves two loops: an internal loop that selects relevant content for the current context, and an external loop that learns to improve information selection over time [7][9].

Group 2: Critique of RAG
- RAG is described as a confusing amalgamation of retrieval, generation, and combination, which leads to misunderstandings in the AI community [5][6].
- The article argues that RAG has been misrepresented in the market as merely using embeddings for vector search, a shallow interpretation [5][7].
- The author expresses a strong aversion to the term RAG, suggesting that it detracts from more meaningful discussions about AI development [6][7].

Group 3: Future Directions in AI
- Two promising directions for future AI systems are continuous retrieval and remaining within the embedding space, both of which could enhance performance and efficiency [47][48].
- The potential for models to learn to retrieve information dynamically during generation is highlighted as an exciting area of research [41][42].
- Retrieval systems may evolve toward a more integrated approach in which models generate and retrieve information simultaneously [41][48].

Group 4: Chroma's Role
- Chroma is positioned as a leading open-source vector database aimed at facilitating AI application development by providing robust search infrastructure [70][72].
- The company emphasizes developer experience, aiming for a seamless integration process that lets users quickly deploy and use the database [78][82].
- Chroma's architecture is designed to be modern and efficient, using distributed systems and a serverless model to optimize performance and cost [75][86].
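Since Chroma is the concrete system named above, here is a minimal retrieval sketch using its open-source Python client; the documents and collection name are invented for illustration, and the exact client API may vary between versions.

```python
# Minimal retrieval sketch with the open-source chromadb client.
# Collection name and documents are illustrative only.
import chromadb

client = chromadb.Client()                    # in-process, ephemeral instance
docs = client.create_collection(name="docs")  # embedding-backed collection
docs.add(
    ids=["a", "b"],
    documents=[
        "Context rot: model quality degrades as the context window fills.",
        "Context engineering selects what enters the context window.",
    ],
)
# Retrieve the most relevant snippet for the current generation step,
# i.e., the "internal loop" of context engineering described above.
results = docs.query(query_texts=["why limit context size?"], n_results=1)
print(results["documents"][0][0])
```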
Next Tuesday: Your Agent Is Built; Now Learn How to Push Cost Control to the Limit
Founder Park· 2025-09-14 04:43
Core Insights
- The integration of AI Agents has become a standard feature in AI products, but the hidden costs of operating them, such as multi-turn tool calls and extensive context memory, can lead to significant token consumption [2].

Cost Control Strategies
- Fully managed serverless platforms like Cloud Run are an effective way to control costs for AI Agent applications: they scale automatically with request volume and reach zero cost during idle periods [3][7].
- Cloud Run can expand from zero to hundreds or thousands of instances within seconds based on real-time request volume, a dynamic scaling that balances stability and cost control [7][9].

Upcoming Event
- An event featuring Liu Fan, a Google Cloud application modernization expert, will cover development techniques with Cloud Run and how to achieve extreme cost control [4][9].
- The session will include real-world examples demonstrating Cloud Run's scaling capabilities through monitoring charts of request volume, instance count, and response latency [9].
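For context, a Cloud Run service is essentially a container that listens on the port given in the PORT environment variable; the platform adds or removes instances with traffic and bills nothing while there are none. A minimal sketch of such an entrypoint in Python (the handler body is invented for illustration):

```python
# Minimal container entrypoint of the kind Cloud Run expects: listen on
# $PORT and serve requests; the platform scales instances (including to
# zero) with traffic. Handler logic is illustrative only.
import os
from http.server import BaseHTTPRequestHandler, HTTPServer

class AgentHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(b"agent ready\n")

if __name__ == "__main__":
    port = int(os.environ.get("PORT", "8080"))  # injected by the platform
    HTTPServer(("0.0.0.0", port), AgentHandler).serve_forever()
```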
Data, IP, or Overseas Entities: What Should You Tackle First? A Complete Guide to AI Going-Global Compliance
Founder Park· 2025-09-12 10:06
Once a product has gone global and found PMF, the next step is solving compliance and legal issues. Compliance sounds complicated, and doing it is complicated too: data, intellectual property, corporate entities, hiring, tax, deal structures, geopolitics... the list alone is daunting. We invited two senior lawyers specializing in corporate globalization, along with the founder of an AI legal product, to discuss the compliance risks, typical cases, and countermeasures that tech companies and AI startups face when going overseas today. After some desensitization, Founder Park compiled the session's takeaways; it is very practical material, worth bookmarking.
Guests:
- Li Huijun, Senior Partner, Beijing Jiarun Law Firm
- Li Ran, Lawyer, Beijing Jiarun Law Firm
- Yang Fan, Chief Growth Officer, WiseLaw (智法数科)
01 For example, if you want to hire local staff, do you need a local entity? If you post Chinese employees abroad, is there a requirement that for every Chinese employee you must hire local employees at a one-to-one ratio? The underlying philosophy is similar in every country: they do not just want you to invest and do business in name only; they want your investment to genuinely benefit their job market or consumer base and bring new jobs. What a product must consider before going overseas: the "four ...
An Official Post from the Claude Team: How to Build Good Tools for Your Agent?
Founder Park· 2025-09-12 10:06
Core Insights
- Anthropic has introduced new features in Claude that allow direct creation and editing of mainstream office documents, expanding AI's application scenarios in practical tasks [2].
- The company emphasizes designing intuitive tools for uncertain, reasoning AI rather than relying on traditional programming methods [4].
- Systematically evaluating tools on real, complex tasks is essential to validate their effectiveness [5].

Group 1
- The focus is on creating integrated workflow tools rather than isolated functionalities, which significantly reduces the reasoning burden on the AI [6].
- Clear and precise tool descriptions are crucial for the AI to understand a tool's purpose, raising the success rate of tool use [7].
- The article outlines key principles for writing high-quality tools, emphasizing systematic evaluation and collaborating with the AI itself to improve tool performance [13][36].

Group 2
- Tools should be designed to reflect the unique affordances of AI agents, which perceive potential actions differently than traditional software [15][37].
- Build a limited number of well-designed tools targeting high-impact workflows, rather than numerous overlapping functionalities [38].
- Naming conventions and namespaces help agents choose the correct tool among many options [40].

Group 3
- Tools should return meaningful context to the AI, prioritizing high-information signals over technical identifiers to improve task performance [43].
- Optimizing tool responses for token efficiency is crucial; pagination and filtering are recommended for managing context effectively [48].
- Prompt engineering in tool descriptions can guide AI behavior and improve performance [52].

Group 4
- The future of tool development for AI agents involves shifting from predictable, deterministic patterns to non-deterministic approaches [54].
- A systematic, evaluation-driven method is essential so tools can evolve alongside increasingly powerful agents [54].
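As a rough illustration of several of these principles together (one workflow-level tool, a precise description, high-signal response fields, pagination for token efficiency), here is a hedged sketch; the schema shape, names, and data are invented for illustration and are not Anthropic's actual tool format.

```python
# Illustrative agent tool: namespaced name, precise description, and a
# paginated, high-signal response. Schema and data are invented examples.
ORDERS = [
    {"id": "9f2c", "customer": "Acme Co", "status": "shipped"},
    {"id": "1b7d", "customer": "Beta LLC", "status": "pending"},
    {"id": "44aa", "customer": "Acme Co", "status": "refunded"},
]

ORDERS_SEARCH_TOOL = {
    "name": "orders_search",  # namespaced: resource first, verb second
    "description": (
        "Search customer orders by free-text query. Returns up to page_size "
        "matches plus a next_offset cursor; pass next_offset back to continue. "
        "Use this instead of listing all orders."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "query": {"type": "string"},
            "page_size": {"type": "integer", "default": 2},
            "offset": {"type": "integer", "default": 0},
        },
        "required": ["query"],
    },
}

def orders_search(query: str, page_size: int = 2, offset: int = 0) -> dict:
    """Return high-signal fields (customer, status), not bare record IDs."""
    matches = [o for o in ORDERS if query.lower() in o["customer"].lower()]
    page = matches[offset : offset + page_size]
    more = offset + page_size < len(matches)
    return {
        "results": [{"customer": o["customer"], "status": o["status"]} for o in page],
        "next_offset": offset + page_size if more else None,  # token-efficient paging
    }

print(orders_search("acme"))
```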
Breaking Down Nvidia's Rubin CPX: What Makes the First Dedicated AI Inference Chip So Strong?
Founder Park· 2025-09-12 05:07
Core Viewpoint
- Nvidia has launched the Rubin CPX, a CUDA GPU designed for large-context AI processing, capable of handling millions of tokens efficiently and quickly [5][4].

Group 1: Product Overview
- Rubin CPX is the first CUDA GPU built specifically for processing million-token contexts, featuring 30 petaflops (NVFP4) of compute and 128 GB of GDDR7 memory [5][6].
- The GPU can complete million-token-level inference in just 1 second, significantly enhancing performance for AI applications [5][4].
- The architecture allows a division of labor between GPUs, optimizing cost and performance by using GDDR7 instead of HBM [9][12].

Group 2: Performance and Cost Efficiency
- The Rubin CPX is cost-effective: a single chip costs only 1/4 as much as the R200 while delivering 80% of its compute [12][13].
- In long-prompt, large-batch scenarios, total cost of ownership (TCO) can drop from $0.6 to $0.06 per hour, a tenfold reduction [13].
- Companies investing in Rubin CPX can expect a 50x return on investment, significantly higher than the 10x return of previous models [14].

Group 3: Competitive Landscape
- Nvidia's strategy of splitting a general-purpose chip into specialized chips positions it favorably against competitors like AMD, Google, and AWS [15][20].
- The Rubin CPX architecture allows a significant performance increase, with the potential to outperform existing flagship systems by up to 6.5 times [14][20].

Group 4: Industry Implications
- The introduction of Rubin CPX is expected to benefit the PCB industry, as new designs and materials will be required to support the GPU's architecture [24][29].
- Demand for optical modules is anticipated to rise significantly due to the new architecture's increased bandwidth requirements [30][38].
- Overall power consumption of systems using Rubin CPX is projected to increase, driving advances in power supply and cooling solutions [39][40].
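A quick back-of-the-envelope check of the cost and performance figures quoted above, taking the article's numbers as given (relative units, not real prices):

```python
# Sanity-check the quoted cost/performance claims using relative units.
r200_cost, r200_compute = 1.0, 1.0   # normalize the R200 to 1.0
cpx_cost, cpx_compute = 0.25, 0.80   # "1/4 the cost, 80% of the compute"

gain = (cpx_compute / cpx_cost) / (r200_compute / r200_cost)
print(f"compute per dollar vs R200: {gain:.1f}x")  # 3.2x

# TCO claim: $0.60 -> $0.06 per hour in long-prompt, large-batch serving.
print(f"TCO reduction: {0.60 / 0.06:.0f}x")        # 10x
```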
Tallying Up the "Invisible" Costs of Agents
Founder Park· 2025-09-11 08:25
Core Insights
- The integration of AI Agents has become a standard feature in AI products, but the hidden costs of operating them pose significant challenges [2].
- Controlling costs is crucial; fully managed serverless platforms like Cloud Run offer a viable solution by scaling automatically with request volume and reaching zero cost during idle times [3][7].

Summary by Sections
- **AI Agent Development and Costs**: Deploying an AI Agent is only the first step; subsequent operation can consume thousands to tens of thousands of tokens per interaction due to multi-turn tool calls and complex logic [2].
- **Cost Control Solutions**: Cloud Run is highlighted as an effective platform for managing AI Agent costs, scaling automatically with real-time request volume and costing nothing when there are no requests [3][7].
- **Upcoming Event**: An event featuring Liu Fan, a Google Cloud application modernization expert, will cover development techniques with Cloud Run and strategies for extreme cost control [4][9].
- **Key Discussion Points**: How Cloud Run scales from zero to hundreds or thousands of instances within seconds based on real-time requests [9]; the "zero cost with no requests" model that can cut an idle Agent's operating cost to zero [9]; and real-world monitoring charts illustrating changes in request volume, instance count, and response latency [9].
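To see why per-interaction costs balloon, here is a toy model of an agent's token spend: each tool-call turn re-sends the ever-growing context, so billed tokens grow roughly quadratically with the number of turns. All prices and counts are invented placeholders, not real rates.

```python
# Toy estimate of an agent's hidden token spend. Multi-turn tool calls
# re-send the growing context on every model call, so cost compounds.
# All numbers are invented placeholders.
PRICE_PER_1K_TOKENS = 0.003   # hypothetical blended price, USD
system_and_tools = 1_500      # tokens re-sent on every model call
turns = 8                     # tool-call round-trips per user interaction
tokens_per_turn = 600         # new observation + response each turn

total = 0
context = system_and_tools
for _ in range(turns):
    context += tokens_per_turn   # context memory keeps growing...
    total += context             # ...and is re-billed on every call

print(f"tokens per interaction: {total:,}")  # 33,600 with these numbers
print(f"cost per interaction: ${total / 1000 * PRICE_PER_1K_TOKENS:.3f}")
```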
Mira Murati's Startup Publishes Its First Long-Form Post, Tackling the Nondeterminism Problem in LLM Inference
Founder Park· 2025-09-11 07:17
Core Insights
- The article discusses the challenge of reproducibility in large language model (LLM) inference: even with identical input, outputs can differ because of the probabilistic nature of sampling [10][11].
- It introduces the concept of "batch invariance" in LLM inference, emphasizing the need for consistent results regardless of batch size or concurrent requests [35][40].

Group 1
- Thinking Machines Lab, founded by former OpenAI CTO Mira Murati, has launched a blog series called "Connectionism" to share insights from its AI research [3][8].
- The blog's first article addresses nondeterminism in LLM inference, explaining that results can still vary even with the temperature set to 0 [10][12].
- Floating-point non-associativity and concurrency are identified as key factors contributing to the uncertainty in LLM outputs [13][24].

Group 2
- The assumption that "concurrency + floating point" is the sole cause of nondeterminism is incomplete, since many LLM operations can be made deterministic [14][16].
- Understanding how GPU kernel functions are implemented matters: the lack of synchronization among processing cores can produce unpredictable results [25][29].
- Most LLM operations do not require atomic addition, a common source of nondeterminism, so forward passes can produce consistent outputs [32][33].

Group 3
- Batch invariance is explored: LLM inference results can be affected by batch size and operation order, leading to inconsistencies [36][40].
- The article outlines strategies for achieving batch invariance in key operations such as RMSNorm, matrix multiplication, and attention, ensuring outputs remain consistent regardless of batch size [42][60][64].
- It concludes with a demonstration of deterministic inference using batch-invariant kernel functions, showing that consistent outputs can be achieved with the right implementation [74][78].
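The floating-point root cause is easy to demonstrate in a few lines, along with the style of fix the post argues for: pinning the reduction order so results do not depend on batching or scheduling. This is a generic illustration, not Thinking Machines' actual kernel code.

```python
# Floating-point addition is not associative, so any kernel whose
# reduction order depends on batch size or thread timing can change the
# result bit-for-bit. Fixing the order restores determinism.
import random

a, b, c = 0.1, 1e20, -1e20
print((a + b) + c)   # 0.0: the 0.1 is absorbed by 1e20 first
print(a + (b + c))   # 0.1: the huge terms cancel first, 0.1 survives

values = [random.uniform(-1, 1) for _ in range(10_000)]
shuffled = random.sample(values, len(values))        # same numbers, new order
print(sum(values) == sum(shuffled))                  # often False: order matters

def fixed_order_sum(xs: list[float]) -> float:
    """Order-invariant reduction: always left-to-right over sorted inputs."""
    total = 0.0
    for x in sorted(xs):   # identical order regardless of input permutation
        total += x
    return total

print(fixed_order_sum(values) == fixed_order_sum(shuffled))  # True
```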