Founder Park
Sequoia China launches Agent benchmark "xbench": a dual-track evaluation system focused on AI's utility in real-world scenarios
Founder Park· 2025-05-26 06:44
Core Insights
- Sequoia China has launched an internal AI and Agent benchmarking tool called "xbench" and published a corresponding paper titled "xbench: Tracking Agents Productivity, Scaling with Profession-Aligned Real-World Evaluations" [1][2]

Group 1: xbench Overview
- xbench employs a dual-track evaluation system to construct multidimensional assessment datasets, aiming to track both the theoretical capabilities of AI systems and the practical utility value of Agents in real-world applications [5][19]
- The initial release includes two core assessment sets: xbench-ScienceQA for scientific question answering and xbench-DeepSearch for deep search capabilities, along with comprehensive rankings of major products in these fields [5][25]

Group 2: Evaluation Methodology
- The xbench evaluation system is designed to address two core questions: the relationship between model capabilities and actual AI utility, and the comparability of capabilities across different time dimensions [10][11]
- The evaluation framework is dynamic, incorporating real-world application needs and continuously updating assessment content to ensure relevance and timeliness [5][17]

Group 3: AGI Tracking and Profession Aligned Evaluations
- xbench distinguishes between AGI Tracking evaluations, which verify whether models exhibit intelligent behavior in specific capability dimensions, and Profession Aligned evaluations, which focus on the delivery results and commercial value in real-world scenarios [19][20]
- The AGI Tracking assessments are foundational, while Profession Aligned evaluations represent advanced practices that align with actual business processes [19][20]

Group 4: Future Directions
- The company plans to expand the evaluation framework to include more professional fields such as finance, law, and sales, inviting industry experts to co-develop the assessment tasks [36][37]
- The long-term goal is to create a sustainable evaluation ecosystem that adapts to the rapid evolution of AI capabilities and market needs, ensuring that assessments remain relevant and effective [37][39]
Kotoko AI's Qiao Haixin: The C.AI story is over; we use OCs to connect with the post-05 generation
Founder Park· 2025-05-26 05:30
Core Insights
- The article discusses the rise of Original Characters (OC) in the virtual world, highlighting the growing interest from capital markets and the potential for new social interaction platforms [2][3][4].

Group 1: Market Overview
- The OC market has seen significant growth, with games like Gacha Life attracting over 200 million players, validating the Product-Market Fit (PMF) for OC [2][18].
- There is a clear demographic of young, highly engaged users who are eager to create and share their OCs, indicating a strong demand for new production tools and distribution platforms [3][10].

Group 2: Company Introduction
- Kotoko AI, founded in 2023, aims to create a social interaction platform called Bside that combines User-Generated Content (UGC) with gamified experiences for OC [7][8].
- The founder, Qiao Haixin, has a background in UGC gaming platforms and aims to leverage AI to enhance user engagement and creativity [7][8].

Group 3: User Demographics and Behavior
- The core OC user base is estimated to be around 10 million globally, with millions in both China and the U.S. actively sharing their OCs on platforms like TikTok and Instagram [8][10].
- Users exhibit diverse motivations for engaging with OCs, with females often viewing them as idealized versions of themselves, while males may adopt a more paternalistic approach [12][17].

Group 4: Product Features and Development
- Bside is designed to be an "OC playground," providing a comprehensive platform for users to create, nurture, and socialize with their OCs [12][13].
- The platform aims to reduce the barriers to entry for OC creation, allowing users to engage in a cycle of creation, nurturing, and social interaction [14][19].

Group 5: Market Potential and Future Outlook
- The OC phenomenon is not just an online trend but has real-world implications, with users engaging in OC-related activities offline, such as collecting dolls or custom figures [10][12].
- The potential for the OC market to grow into a multi-billion dollar industry is significant, with the possibility of reaching over 100 million daily active users (DAU) [10][12].

Group 6: AI Integration and User Experience
- AI is seen as a tool to enhance the OC experience, allowing for more personalized interactions and reducing the creative burden on users [40][41].
- The focus is on creating a sense of autonomy and individuality for OCs, enabling them to engage in social interactions and share experiences, akin to having a "life" of their own [47][62].

Group 7: Competitive Landscape
- The article compares Bside to existing platforms like Roblox and Gacha Life, emphasizing the need for a unique approach that combines social interaction with OC creation [72][76].
- The success of Bside will depend on its ability to foster a vibrant community and provide engaging experiences that resonate with users' desires for creativity and social connection [60][76].
Last year's hit Founder Show is back!
Founder Park· 2025-05-23 11:01
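Founder Show is the special founder-sharing session of the AGI Playground 2025 conference. Each founder gets 20 minutes to give a full account of product progress and entrepreneurial thinking, and to interact live with the "senior founders" on site. Through the stages "initial screening of materials - online pre-communication - project review - selection notice", we will choose 9 up-and-coming teams to take the Founder Show stage. Teams that pass the initial screening, and all teams that present in person, will receive a startup acceleration resource package provided by Founder Park & 变量资本.

Who takes part / Recruitment requirements / Time and place / Event format

Time: afternoon of June 20, 2025
Place: Beijing | 751 图书馆 (751 Library)

Timeline / Recruitment process
Materials submission - initial review of materials - online interview - project review - selection notice

- 9 up-and-coming founders; solo developers and founders with teams are both welcome
- Specially invited LPs of AGI Founders Fund, and Founder Park's "senior founders"
- Broad Gen AI track; any vertical scenario or product form; a demonstrable product demo is preferred
- If selected, able to fit in with the conference schedule for a product presentation of about 20 minutes and in-person interaction

Registration period: May 23 to June 10, 18:00
Final notice: June 13, 18:00 (no notice by then means not selected)

资 ...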
Targeting 100 million units shipped: what hardware is Altman and Ive's new company "io" actually going to build?
Founder Park· 2025-05-23 11:01
Core Insights
- Sam Altman and Jony Ive are collaborating to create a new hardware device, which aims to be the third core device on users' desks after the MacBook Pro and iPhone [1][4][5]
- OpenAI has announced the acquisition of Jony Ive's AI hardware startup "io" for nearly $6.5 billion, with plans to ship 100 million units of the new device [1][4][8]
- The device is designed to reduce users' reliance on screens and is not intended to be a smartphone or wearable technology [1][5][10]

Summary by Sections

Acquisition and Collaboration
- OpenAI's acquisition of "io" is seen as a significant opportunity, with Altman suggesting it could generate up to $1 trillion in additional value for the company [4][9]
- The collaboration between Altman and Ive has evolved over the past 18 months, with a focus on developing a device that serves as a core interaction point between users and OpenAI [10]

Device Concept and Design
- The new device will be pocket-sized and designed for easy placement on desks, emphasizing a low-profile design [5][10]
- Altman and Ive believe that existing devices do not meet user needs, and the new product aims to change how users interact with AI [10]

Market Context and Competition
- The announcement comes amid other tech giants like Google and Apple launching their own AI hardware products, including smart glasses [2][9]
- Altman acknowledges the challenges of entering the hardware market, especially against established companies like Apple and Google [8][9]
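Claude 4 released! A new benchmark for AI coding, 7 hours of continuous coding, and major breakthroughs in hybrid models and context capabilities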
Founder Park· 2025-05-23 01:42
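This article is reposted from「新智元」.

Claude 4 made its debut at Anthropic's developer conference in the early hours of this morning. CEO Dario Amodei took the stage in person to present Claude Opus 4 and Claude Sonnet 4, once again raising the standard for coding, advanced reasoning, and AI agents.

Of the two, Claude Opus 4 is billed as the world's top coding model. It excels at complex, long-running tasks and performs extremely well in AI agent workflows. Claude Sonnet 4 is a major upgrade over Sonnet 3.7, with stronger coding and reasoning and more precise responses to instructions.

At the same time, Anthropic released in one go the series of products it had been building up over this period:

- Two modes for the Claude Opus 4 and Sonnet 4 hybrid models: near-instant responses, and extended thinking for deeper reasoning.
- Extended thinking with tool use (beta): both models can use tools (such as web search) during extended thinking, letting Claude switch flexibly between reasoning and tool use to improve response quality.
- New model capabilities: both models can use tools in parallel, follow instructions more precisely, and (when developers grant them access to local files) show markedly stronger memory, extracting and saving key information to maintain continuity and accumulate tacit knowledge over time.

C ...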
a16z on AI coding: don't worry about being replaced; new players and new paradigms bring "lots of" opportunities
Founder Park· 2025-05-22 13:32
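The following article is from 四木相对论, by 关注AI的.

四木相对论: has met over a thousand CEOs, covered all five big tech companies, and connected with hundreds of institutions. Chatting about tech, keeping an eye on overseas.

AI Coding is currently the second-largest AI market, behind only chatbots, and may even become the largest single market.

That is the view of three investment partners, Matt Bornstein, Yoko Li, and Guido Appenzeller, on the a16z podcast.

Unlike other shows that discuss AI Coding, a16z is not so pessimistic about the future of programming. In their view, for example, today's Vibe Coding is like the blogs that began to take off 20 years ago: it makes "creating" something with no barrier to entry.

"How deep can software adoption really go? Honestly, I don't think it will go very deep, but that doesn't matter. It only has to be useful to people."

Programming languages will not be replaced, and senior engineers are still very much needed. The core point is that "new groups of people plus new methods are likely to give rise to entirely new forms of software and application scenarios."

That is a16z for you: all it sees is new opportunity.

Strongly recommended reading for founders and programmers alike.

Some interesting points:

AI Coding has become the second-largest AI market, behind only consumer-facing chatbots. AI-generated code is expected to bring trillions of dollars in productivity to the market ...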
Hedra raised $32 million from a16z on the back of "AI baby podcasts": what does it have going for it?
Founder Park· 2025-05-22 13:32
$6.5 billion! OpenAI acquires Jony Ive's AI hardware startup; Altman is getting into hardware
Founder Park· 2025-05-22 02:56
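This article is reposted from「AI寒武纪」, with minor adjustments.

In the early hours of this morning, OpenAI announced major news: it is acquiring io, the AI device startup co-founded by former Apple chief design officer Jony Ive, in an all-stock deal worth nearly $6.5 billion. Jony Ive will work closely with OpenAI CEO Sam Altman to build an entirely new company.

OpenAI published an announcement on its website to mark the deal. In it, Sam and Jony write: "This is an extraordinary moment. Computers are now seeing, thinking and understanding." Yet there is a reality that cannot be ignored: "Although AI has made unprecedented advances in capability, our experience is still largely shaped by traditional products and interfaces."

The announcement does not disclose specific details of the collaboration. It does say that Jony Ive and the creative collective LoveFrom quietly began working with Sam Altman and the OpenAI team as early as two years ago, and that Jony Ive, Scott Cannon, and others founded the company "io" one year ago.

Article link: https://openai.com/sam-and-jony/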
Interview with Microsoft's CPO: the prompt is the PRD of the AI era, and the way product managers work has completely changed
Founder Park· 2025-05-21 12:05
Core Insights
- The article emphasizes that in the AI era, "Prompt" is becoming the new Product Requirement Document (PRD), shifting the focus of product design towards prototype validation and practical experimentation [20][21][22]
- The concept of "Agent" is highlighted as a tool that can autonomously execute tasks, moving beyond simple operations to handle more complex responsibilities [5][11][12]
- The importance of taste and editorial skills for product managers is increasing, as the volume of creative ideas and prototypes rises, necessitating effective content curation [25][26]

Group 1: Product Development in the AI Era
- The transition from traditional PRD to Prompt signifies a need for teams to produce prototypes and corresponding prompts during project development [20][21]
- The development cycle is becoming uneven, with shorter times from idea to demo but longer times from demo to full launch, raising the bar for what constitutes an excellent product [21][22]
- The emergence of "full-stack builders" in product teams indicates a shift towards individuals who can navigate design, product, and engineering roles fluidly [21][22]

Group 2: Characteristics of Effective Agents
- Effective Agents should exhibit autonomy, complexity, and natural interaction, allowing them to handle advanced tasks and operate asynchronously [11][12][13]
- Natural Language Interfaces (NLI) are becoming the ultimate user experience, requiring thoughtful design beyond simple chat interactions [14][16]
- The design of interaction components, such as prompts and plans, is crucial for enhancing user experience with Agents [16][17]

Group 3: Key Considerations for Product Managers
- Product managers must focus on qualitative feedback and user actions rather than relying on traditional metrics too early in the development process [36][38]
- Understanding the three critical turning points (technological leaps, changes in user behavior, and shifts in business models) is essential for creating successful products [41][42]
- The role of product managers is evolving, with an increased emphasis on decision-making based on real expertise rather than title alone [25][26]

Group 4: Challenges in AI Product Development
- Companies must balance user experience with compliance and governance when developing enterprise-level products, which adds complexity to the product design process [44][45]
- The rapid pace of technological change necessitates a flexible approach to product development, allowing early adopters to experiment without hindering overall progress [46][47]
- The need for a robust system that integrates various functionalities is critical for the success of AI-driven products, as seen with GitHub's approach [52][53]
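With a million dollars in monthly revenue and a place among the category leaders, Meitu has built the video version of "美图秀秀" (Meitu Xiuxiu)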
Founder Park· 2025-05-21 12:05
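The following article is from 白鲸出海, by 白鲸小编.

白鲸出海: a service platform for internet companies going overseas, focused on the global expansion of industries, companies, products, and services with internet attributes, including apps, games, e-commerce, blockchain, smartphones and hardware, travel, online literature, film and television, animation, education, sports, and finance.

In March we published a detailed reading of Meitu's full-year 2024 financial report. Among its product lines aimed at consumers' everyday scenarios, revenue was still propped up mainly by image products launched years ago, plus AI. Entering 2025, however, a "rising star" with monthly revenue above one million US dollars has appeared on the consumer side, and it is in video, a direction Meitu is newly exploring.

[Figure: Wink global revenue across both platforms | Image source: 点点数据]

The MAU of Wink, Meitu's video editing product, grew from 6.42 million in December 2023 to 9 million, and has since fluctuated between 7.7 million and 9 million. Although user growth has wavered for now, revenue has stayed strong: after Wink's global monthly revenue passed one million US dollars last December, every month has stayed above that mark except February 2025, which dipped slightly.

| Rank | Product | Developer | Global revenue, last 30 days | Global MAU, March 2025 |
| --- | --- | --- | --- | --- |
...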