Founder Park
Tallying the "Invisible" Cost of Agents
Founder Park · 2025-09-11 08:25
**Core Insights**
- The integration of AI Agents has become a standard feature in AI products, but the hidden costs of operating them pose significant challenges [2]
- Controlling costs is crucial, and fully managed serverless platforms like Cloud Run offer a viable solution by scaling automatically with request volume and incurring zero cost during idle periods [3][7]

**Summary by Sections**
- **AI Agent Development and Costs**: Deploying an AI Agent is only the first step; each subsequent interaction can consume thousands to tens of thousands of tokens due to multi-turn tool calls and complex logic [2]
- **Cost Control Solutions**: Cloud Run is highlighted as an effective platform for managing AI Agent costs, scaling automatically with real-time request volume and costing nothing when there are no requests [3][7]
- **Upcoming Event**: An event featuring Liu Fan, a Google Cloud application modernization expert, will cover Cloud Run development techniques and strategies for extreme cost control [4][9]
- **Key Discussion Points of the Event**:
  - How Cloud Run can scale from zero to hundreds or thousands of instances within seconds based on real-time requests [9]
  - The "zero cost with no requests" model, which can reduce idle AI Agent operating costs to zero [9]
  - Real-world examples demonstrating Cloud Run's scalability through monitoring charts of request volume, instance count, and response latency [9]
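The per-interaction token figures above translate directly into spend. A back-of-the-envelope sketch in Python, assuming hypothetical per-token prices and call counts (none of the numbers below come from the article):

```python
# Back-of-the-envelope cost model for a multi-turn agent.
# All prices and counts here are illustrative assumptions.

PRICE_PER_1K_INPUT = 0.003   # USD per 1K input tokens (hypothetical)
PRICE_PER_1K_OUTPUT = 0.015  # USD per 1K output tokens (hypothetical)

def interaction_cost(tool_calls: int,
                     input_tokens_per_call: int,
                     output_tokens_per_call: int) -> float:
    """Cost of one user interaction that fans out into multiple tool calls."""
    total_in = tool_calls * input_tokens_per_call
    total_out = tool_calls * output_tokens_per_call
    return (total_in / 1000) * PRICE_PER_1K_INPUT + \
           (total_out / 1000) * PRICE_PER_1K_OUTPUT

# An agent making 8 tool calls, each re-sending ~2K tokens of context
# and producing ~500 tokens, consumes ~20K tokens per interaction.
cost = interaction_cost(tool_calls=8,
                        input_tokens_per_call=2000,
                        output_tokens_per_call=500)
print(f"per interaction: ${cost:.3f}")
print(f"per 10K daily interactions: ${cost * 10_000:,.0f}")
```

Even at modest per-token prices, multiplying by tool calls and daily volume is what turns a "cheap" agent into a real line item, which is the cost problem scale-to-zero platforms address on the serving side.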
Mira Murati's Startup Publishes Its First Long-Form Post, Tackling Non-Determinism in LLM Inference
Founder Park · 2025-09-11 07:17
**Core Insights**
- The article discusses the challenge of achieving reproducibility in large language model (LLM) inference: even with identical input, outputs can differ because of the probabilistic nature of the sampling process [10][11]
- It introduces the concept of "batch invariance" in LLM inference, emphasizing the need for consistent results regardless of batch size or concurrent requests [35][40]

**Group 1**
- Thinking Machines Lab, founded by former OpenAI CTO Mira Murati, has launched a blog series called "Connectionism" to share insights on AI research [3][8]
- The blog's first article addresses non-determinism in LLM inference, explaining that results can still vary even with a temperature setting of 0 [10][12]
- It identifies floating-point non-associativity and concurrency as key factors behind the uncertainty in LLM outputs [13][24]

**Group 2**
- The article argues that the common assumption of "concurrency + floating point" as the sole cause of non-determinism is incomplete, since many LLM operations can be made deterministic [14][16]
- It stresses the importance of understanding how GPU kernel functions are implemented, where the lack of synchronization among processing cores can produce unpredictable results [25][29]
- Most LLM operations do not require atomic addition, a frequent source of non-determinism, which allows consistent outputs during forward propagation [32][33]

**Group 3**
- The concept of batch invariance is explored: LLM inference results can be affected by batch size and operation order, leading to inconsistencies [36][40]
- The article outlines strategies for achieving batch invariance in key operations such as RMSNorm, matrix multiplication, and attention, ensuring outputs remain consistent regardless of batch size [42][60][64]
- It concludes with a demonstration of deterministic inference using batch-invariant kernel functions, showing that consistent outputs are achievable with the right implementation [74][78]
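The floating-point non-associativity the article identifies is easy to see in plain Python; the same rounding effect, applied across GPU reduction orders that shift with batch size, is what makes inference non-deterministic:

```python
import random

# Floating-point addition is not associative: the order in which a
# reduction is performed changes the rounded result.
a, b, c = 0.1, 1e20, -1e20

left = (a + b) + c    # 0.1 is absorbed into 1e20, then cancelled away
right = a + (b + c)   # 1e20 cancels first, so 0.1 survives

print(left, right)    # 0.0 0.1

# The same values summed in two different orders, as a GPU kernel might
# do when the batch size (and hence the reduction tree) changes:
values = [random.uniform(-1, 1) for _ in range(10_000)]
order_a = sum(values)
order_b = sum(sorted(values))
print(order_a == order_b)  # usually False
```

Batch-invariant kernels fix this not by making floating point associative, which is impossible, but by pinning the reduction order so it no longer depends on batch size or co-scheduled requests.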
Seedream 4.0 Is Here, and So Are New Opportunities for AI Image Startups
Founder Park · 2025-09-11 04:08
**Core Viewpoint**
- The article discusses the emergence of AI image generation models, focusing on the capabilities and advancements of the Seedream 4.0 model developed by Huoshan Engine, positioned as a competitive alternative to existing models like Nano Banana and GPT-4o Image [2][4][69]

**Group 1: AI Image Generation Models**
- The AI image generation field has seen significant breakthroughs this year, with models like GPT-4o generating popular images in the Ghibli style [3]
- The Nano Banana model gained attention for generating high-fidelity images and solving subject-consistency issues, earning comparisons to ChatGPT's role in the image generation space [4]
- Huoshan Engine's Seedream 4.0 offers enhanced capabilities, including multi-image fusion, reference-image generation, and image editing, with a focus on improved subject consistency [5][6]

**Group 2: Features of Seedream 4.0**
- Seedream 4.0 is the first model to support 4K multi-modal image generation, significantly broadening its usability [6]
- The model lets users input multiple images and generate many outputs simultaneously, showcasing advanced multi-image fusion [10][14]
- It supports both single- and multi-image inputs, enabling complex creative tasks while maintaining consistency across generated images [50][62]

**Group 3: Editing and Customization Capabilities**
- Seedream 4.0 features strong editing capabilities, letting users make precise modifications by describing the desired changes in natural language [23][24]
- The model can understand and execute detailed instructions, such as replacing elements in an image or adjusting specifics like clothing folds and lighting [26][34]
- It maintains high subject consistency across different creative forms, avoiding common issues like appearance distortion and semantic misalignment during multi-round edits [28][50]

**Group 4: Performance and Speed**
- The model generates images in seconds, making the creative workflow more responsive [36]
- With 4K output resolution, Seedream 4.0 delivers images suitable for commercial publishing, with improved detail, color depth, and semantic consistency [39][41]

**Group 5: Implications for AI Entrepreneurship**
- Context-aware dialogue capabilities in Seedream 4.0 allow iterative image editing, making it easier for developers to build complex image products without extensive workflow management [69][76]
- This shift in API design enables more fluid interaction with image generation tools, potentially transforming AI image product development [69][70]
- The model's capabilities suggest new entrepreneurial opportunities in AI image generation, particularly for products requiring iterative design and modification [67][72]
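The "context-aware dialogue" API design described above can be illustrated with a toy session object that carries edit history across rounds. This is a hypothetical stand-in to show the design shift, not Seedream's actual API; all names here are invented:

```python
# Toy illustration of a conversational image-editing API: the session,
# not the caller, carries state across edit rounds, so each instruction
# can refer back to earlier generations. Hypothetical stand-in only.

class ImageEditSession:
    def __init__(self, prompt: str):
        self.history = [("generate", prompt)]

    def edit(self, instruction: str) -> "ImageEditSession":
        # Each edit sees the full history, so an instruction like
        # "change the jacket to red" resolves against prior rounds.
        self.history.append(("edit", instruction))
        return self

    def context(self) -> str:
        return " -> ".join(text for _, text in self.history)

session = ImageEditSession("a product shot of a leather jacket, 4K")
session.edit("change the jacket to red").edit("soften the lighting")
print(session.context())
```

The contrast with workflow-style APIs is that the developer no longer re-uploads images and re-describes accumulated state on every call; the model-side context does that bookkeeping.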
Why Granola Wins: In Meeting Notes, Keeping the Product Simple Matters
Founder Park · 2025-09-10 12:16
**Core Insights**
- Granola differentiates itself in the crowded meeting-note-tool market by focusing on minimalist design and effective use of user context [2][3][4]
- Granola's primary competitor is not other AI note-taking products but Apple Notes, since users have only about 500 milliseconds to decide whether to take notes during a meeting [2][10]

**Product Design Philosophy**
- Granola's design philosophy is based on "lizard brain design," which emphasizes simplicity and minimal intrusion to maximize utility in high-pressure meeting environments [4][9]
- Granola aims to be "invisible" during meetings, avoiding intrusive bots that could disrupt the user experience [10][11]

**User Context and Feedback**
- Understanding user context is crucial for AI to be helpful, and Granola prioritizes extensive user feedback to inform design decisions [4][14]
- The company conducts regular user interviews to stay aligned with user needs and avoid assumptions about what users want [15]

**AI Model Utilization**
- Granola starts with the best available third-party AI models, developing proprietary models only when necessary to improve the user experience [17][19]
- Integrating multiple AI models lets Granola tailor responses to user needs and meeting contexts [18][19]

**Target User Base**
- Granola initially targeted venture capitalists, given their frequent meetings and specific note-taking needs, later expanding to founders and other knowledge workers [29][30]
- The company believes that if it can serve founders well, it can meet the needs of a broader user base [29]

**Growth Mechanisms**
- Growth has been driven by user recommendations rather than aggressive marketing, with users often promoting the product in meetings [30][31]
- Sharing notes via links has become a significant growth driver, letting users introduce Granola to others seamlessly [30]

**Future Directions**
- Granola plans features for cross-meeting analysis and deeper insights based on accumulated context from past meetings [33][36]
- The company envisions AI tools that provide real-time insights and recommendations based on a user's entire meeting history [33][36]

**Competitive Landscape**
- Many established players have entered the AI note-taking space, yet Granola maintains a unique position through user-centric design [35][38]
- The company believes its personalized-tool approach will let it compete effectively against larger firms like OpenAI and Google [38][39]
From AI 3D Generation to an AI-Native Film Company: Utopai Studios Wants to "Slightly" Remake Hollywood
Founder Park · 2025-09-10 12:16
**Core Viewpoint**
- The article discusses the transformation of Utopai Studios, a company focused on AI-generated content, and its potential impact on the film industry, emphasizing a shift from traditional production paradigms to a more automated and imaginative approach [2][3]

**Group 1: Company Transformation**
- Utopai Studios, previously known for AI 3D generation, has pivoted to content production and launched two AI films [2]
- The studio is led by Cecilia Shen, a young female director who emphasizes storytelling and authenticity in its projects [3][5]

**Group 2: Film Projects**
- Utopai's first film, "Cortés," is based on a well-regarded story that major studios have pursued for decades, underscoring its potential [5][7]
- The second project, "Project Space," is an eight-episode sci-fi series that has already pre-sold in the European market [7]

**Group 3: AI Integration in Film Production**
- Utopai aims to build an end-to-end AI production architecture that significantly reduces costs and accelerates filmmaking without compromising quality [8]
- The company believes automation will revolutionize the industry, letting creators focus on artistic expression rather than budget constraints [9][15]

**Group 4: Challenges and Solutions**
- Utopai acknowledges the challenges in AI video production, particularly quality, consistency, and controllability, and is addressing them with specialized models [12][13]
- Its approach integrates 3D data into model training to improve the understanding of physical interactions in scenes, making generated content more realistic [14]

**Group 5: Future of Content Creation**
- Utopai envisions a future where storytelling and artistic vision take precedence over budget limitations, allowing a broader range of creative projects to be realized [18]
- AI in filmmaking is seen as a way to democratize the industry, enabling independent creators to produce high-quality content at lower cost [18]
No Legal Background, 100 Lawyer Conversations Later: He Built an AI Company Valued at $700 Million
Founder Park · 2025-09-09 12:53
**Core Insights**
- Legora is growing rapidly in legal tech, having expanded from Europe to the US and partnered with 250 law firms, including top firms like Cleary Gottlieb and Goodwin [2]
- The company recently raised $80 million in Series B funding at a $675 million valuation, positioning it as a strong competitor to Harvey [2]
- Founder Max Junestrand emphasizes humility and collaboration with early partners to navigate a rapidly changing legal industry [2][3]

**Product Development**
- Legora aims to build an AI-driven workspace for lawyers, focusing on a comprehensive end-to-end system rather than a single-point solution [3]
- The product has evolved from a simple chat function into a sophisticated agent that can manage complex workflows, such as drafting memos [4][7]
- Innovations include a "Tabular Review" feature that processes multiple queries simultaneously, improving efficiency in legal document analysis [7][10]

**Sales Strategy**
- The sales approach targets "star teams" within law firms, emphasizing high-level engagement rather than bottom-up tactics [12][20]
- Law firms are under pressure to adopt new technology as clients increasingly expect efficiency and cost-effectiveness [13][14]
- Legora positions itself as a long-term partner, helping firms navigate the technological revolution in the legal sector [12][13]

**Market Positioning**
- The legal industry is shifting toward lawyers acting as reviewers rather than executors, which requires collaboration with AI tools [5][32]
- Legora avoids dependency on single suppliers and avoids competing head-on with AI labs, focusing instead on its own value proposition [40][41]
- The company has built a team that can iterate and ship quickly, outpacing larger competitors [27][28]

**Future Outlook**
- AI capabilities are expected to enhance the roles of legal professionals and streamline operations [32][33]
- Legora's growth strategy includes expanding into new markets while maintaining a strong focus on product reliability and user experience [43][44]
- The company aims to be a strategic partner for large law firms, helping them adapt to fast-moving changes in technology and client demands [39][40]
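The "Tabular Review" idea, running many extraction queries across many documents at once and assembling the answers into a table, can be sketched as an asyncio fan-out. The extractor below is a stub standing in for a real model call; this is not Legora's implementation:

```python
import asyncio

# Toy sketch of a "tabular review": run every (document, question) pair
# concurrently and reshape the answers into a table of rows per document.

async def extract(doc: str, question: str) -> str:
    await asyncio.sleep(0)  # real code would await a model API call here
    return f"{question} @ {doc}"

async def tabular_review(docs: list[str], questions: list[str]):
    tasks = [extract(d, q) for d in docs for q in questions]
    answers = await asyncio.gather(*tasks)  # order matches task order
    n = len(questions)
    # One row per document, one cell per question.
    return [answers[i:i + n] for i in range(0, len(answers), n)]

table = asyncio.run(tabular_review(
    ["contract_a.pdf", "contract_b.pdf"],
    ["governing law?", "termination clause?"],
))
for row in table:
    print(row)
```

The efficiency gain over a chat interface comes from the fan-out: latency is bounded by the slowest single extraction rather than the sum of all of them.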
The Agent Is Built, but How Do You Control Costs?
Founder Park · 2025-09-09 12:53
**Core Insights**
- The article covers recent trends in AI entrepreneurship, focusing on agent development, product internationalization, overseas marketing, and cost control [2][3]

**Group 1: Overseas Growth and Profitability**
- Companies expanding overseas must prioritize profitability from day one, with a clear focus on revenue generation [5]
- AI companies exhibit three distinct traits: explosive product-driven growth, a "profit from day zero" mindset, and proactive use of AI tools for marketing [5]

**Group 2: AI Advertising Innovations**
- New AI-driven advertising formats are emerging, such as AI Overview and AI Mode, which integrate ads into AI-generated content to improve the user experience [9]
- Marketing strategies are evolving from keyword-based approaches toward understanding user intent, leading to more effective ad placement [9]
- AI significantly reduces the cost of producing creative material, enabling high-quality advertising content at scale [9]

**Group 3: AI Agent Development Techniques**
- AI agent development is shifting from deterministic programming to probabilistic orchestration: agents need to understand "what to do" rather than "how to do it" [10]
- Key challenges in building reliable agents include predictability, stability, and operational management (AgentOps) [10][11]
- Effective agent collaboration requires precise definitions of capabilities and skills through well-crafted Agent Cards [12]

**Group 4: Cost Control in AI Agent Operations**
- Upcoming discussions will cover using Cloud Run for cost-effective agent operation, including scaling instances with real-time demand and achieving a zero-cost idle model [15][20]
- Strategies for agent stability include adopting standardized protocols, implementing retry mechanisms, and maintaining human oversight [16]
- Monitoring and managing agent behavior requires detailed logging, automated evaluation systems, and tracking tools [16]
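Of the stability measures listed above, a retry mechanism is the simplest to show concretely. A minimal exponential-backoff wrapper in Python, with a stand-in for a flaky tool call:

```python
import time

# Minimal retry-with-exponential-backoff wrapper around an unreliable
# tool call, one of the agent-stability measures the article mentions.

def with_retries(fn, max_attempts: int = 4, base_delay: float = 0.01):
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the failure
            time.sleep(base_delay * (2 ** attempt))  # 0.01s, 0.02s, 0.04s

calls = {"n": 0}

def flaky_tool():
    calls["n"] += 1
    if calls["n"] < 3:          # fail twice, then succeed
        raise RuntimeError("transient error")
    return "ok"

print(with_retries(flaky_tool))  # succeeds on the third attempt
```

In production the wrapper would also distinguish retryable errors (timeouts, rate limits) from permanent ones, since blindly retrying a bad request only adds cost.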
What Are Enterprises and Vertical Applications Using AI Search For?
Founder Park · 2025-09-09 08:11
**Core Viewpoint**
- AI search has become a validated user demand and is now a standard feature in chatbot products [2]

**Group 1: AI Search Integration**
- Many AI products, including open-source ones, have produced unexpected and delightful use cases after integrating search capabilities [2]
- Users' understanding and use of search has evolved as search functions have been integrated into chatbot products [3]

**Group 2: Considerations for AI Entrepreneurs**
- For AI entrepreneurs, deciding whether and how to integrate search effectively is a crucial early product decision [4]
- Bocha Search, which holds a 60% share of the domestic market, provides search engine technology services for AI, with notable products like AiPPT and Dify using its services [4]

**Group 3: Upcoming Event**
- An online sharing session is scheduled for Thursday, September 11, from 20:00 to 22:00, with limited registration spots [5]
- The discussion will focus on the problems AI products aim to solve with search integration and the challenges enterprises face in building effective AI search systems [7][9]
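Search integration in chat products generally follows a retrieve-then-generate loop. A minimal sketch with stubbed search and generation steps; a real product would call a search API (such as a provider like Bocha) and an LLM where the stubs are:

```python
# Retrieve-then-generate: the basic pattern behind "AI search" features.
# search() and generate() are stubs; real code would call external APIs.

def search(query: str) -> list[str]:
    corpus = {
        "cloud run pricing": ["Cloud Run bills per request and CPU-second."],
    }
    return corpus.get(query.lower(), [])

def generate(question: str, snippets: list[str]) -> str:
    if not snippets:
        return "I could not find sources for that."
    context = " ".join(snippets)
    return f"Based on {len(snippets)} source(s): {context}"

def answer(question: str) -> str:
    snippets = search(question)          # 1. retrieve fresh evidence
    return generate(question, snippets)  # 2. generate a grounded answer

print(answer("Cloud Run pricing"))
```

The design question for entrepreneurs is less the loop itself than what sits in `search()`: index freshness, domain filtering, and snippet quality end up dominating answer quality.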
Anthropic Cuts Off Access, and the Midgame Battle Among Domestic Coding Models Begins
Founder Park · 2025-09-08 12:30
**Core Viewpoint**
- Anthropic has decided to stop providing services to companies under the jurisdiction of unapproved regions, such as China, significantly affecting developers and enterprises that rely on its models [3][4]

**Group 1: Impact of Anthropic's Decision**
- The announcement effectively blocks many developers and companies from accessing leading AI models, particularly for coding applications [4]
- The situation has prompted discussion of which domestic models can substitute for Claude, Anthropic's coding model and a dominant player in the enterprise market [5][6]

**Group 2: Domestic Alternatives and Competition**
- Several domestic models are positioning themselves as alternatives to Claude, with Kimi among the first to offer compatibility with Claude Code [11][22]
- Other companies, such as Alibaba and Zhipu, have also launched models that support Claude Code, signaling a competitive race in coding capability [12][13][14]

**Group 3: Model Capabilities and Development**
- Kimi's K2 model has improved in coding ability, context length, and tool use, making it a viable alternative to Claude Code [15][19][21]
- K2's context length has been upgraded from 128K to 256K, surpassing Claude's standard context length and improving performance on complex tasks [19][28]

**Group 4: Future Directions and Strategic Focus**
- Kimi's ultimate goal is a product with strong "Agentic" capabilities: not just coding skill but the ability to autonomously make decisions and execute complex tasks [40][44]
- Kimi's strategy includes open-sourcing its models to foster innovation and attract developers, positioning itself as a key player in the evolving AI ecosystem [41][42]
Lithography Giant ASML Leads a $1.5 Billion Round: Mistral AI Is Now the "Light of European AI"
Founder Park · 2025-09-08 07:30
**Group 1**
- ASML invested $1.5 billion in Mistral AI, becoming its largest shareholder and gaining a board seat, a significant partnership in the AI sector [2][6]
- Mistral AI, founded just two years ago, reached a $14 billion valuation after a $2 billion Series C round, making it the most valuable AI company in Europe [3][12]
- ASML's investment is seen as a strategic move to strengthen European technological sovereignty and to apply Mistral's AI capabilities to improving chip manufacturing precision and efficiency [7][11][36]

**Group 2**
- Mistral AI is positioned as the European counterpart to OpenAI, with technology comparable to ChatGPT, and has rapidly gained recognition in the AI landscape [12][13]
- The company promotes an open-source philosophy, letting developers freely use and modify its AI models, which has fueled its rapid growth and adoption [15][16][29]
- Mistral AI has released several notable models and applications, including Mistral Large 2 and Voxtral, and has built a comprehensive AI infrastructure called Mistral Compute [32][35]