Founder Park

110,000 Users in Four Months: Claude Code Took Off, and AI Companies Should Take Dogfooding Seriously
Founder Park· 2025-07-22 12:27
Dogfooding (internal trial use) deserves serious attention from AI startups. For today's AI companies, "solving your own problems first can absolutely produce a breakthrough product that changes the entire market." Anthropic's Claude Code, released this year, is a case in point. Unlike other AI coding products, Claude Code was incubated from an internal company tool and released to the public only after intensive internal use and validation in real usage scenarios.

Precisely because it started from genuine user needs, Claude Code's feature design hits developers' core experience more squarely than competing products: every interaction can be completed from the command line, it understands the codebase globally, and so on. These distinctive strengths, combined with the powerful foundation model behind it and strong cost-effectiveness, won Claude Code 110,000 developer users just over four months after release. Claude Code's success arguably proves one point: deep, genuine internal use is the ultimate, unreplicable competitive advantage. The best products often come not from market research but from solving the real problems you face every day. For AI startups still struggling to find PMF, dogfooding matters. A recent blog post by Gennaro Cuofano gives a detailed account of Claud ...
The World's Best Open-Source Models Right Now Are Kimi, DeepSeek, and Qwen
Founder Park· 2025-07-21 13:26
Core Viewpoint
- Kimi K2 is recognized as a leading open-source model, outperforming other models and gaining significant traction in the AI community, particularly in China [1][12][13].

Group 1: Model Performance and Recognition
- Kimi K2 has achieved the highest ranking among open-source models on LMArena, surpassing DeepSeek R1 and becoming the most powerful open-source model globally [1][9].
- The model has received positive feedback from the international tech community, with Jack Clark, co-founder of Anthropic, calling it the best open-weights model available [12][15].
- K2's performance is comparable to top models from leading Western companies, indicating a significant advancement in Chinese AI technology [13][14].

Group 2: Community Engagement and Adoption
- Following its release, K2 quickly became the most popular model on Hugging Face, maintaining this status for over a week [5].
- The model has seen over 140,000 downloads and has inspired the development of 20 fine-tuned and quantized models within a short period [7].
- Major AI coding software platforms, such as VS Code and Cursor, have integrated K2, highlighting its growing adoption in practical applications [10].

Group 3: Strategic Implications for the Industry
- The success of K2 is seen as a pivotal moment for Chinese AI models, akin to the "DeepSeek moment," suggesting a shift in the competitive landscape of open-source models [11][16].
- The open-source strategy adopted by companies like Moonshot is viewed as essential for survival and competitiveness in the current market, allowing for rapid iteration and community support [21][22].
- The emergence of K2 and similar models indicates a growing gap between Western and Chinese open-source models, with the latter leading in practical applications and accessibility [17][19].
A Deep Dive into Meta's AI Dream Team: Nearly Half of the 44 Members Are Chinese Researchers
Founder Park· 2025-07-21 13:26
This article is reposted from 量子位 (QbitAI), with minor adjustments. Recently, a "secret list" of the top talents on Meta's AI team has been circulating on Twitter. The list contains 44 people; most of those with prior affiliations (40%) came from OpenAI, and Chinese researchers make up the majority.

| Name | Nationality | Tenure @ Meta | YoE | Current Job | Prior Roles | Expertise | Advanced Degree | Undergrad Degree |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Nat Friedman | American | 18 days | 28 | VP, Meta Superintelligence | NFDG; CEO, GitHub | Developer ecosystems | | BS, MIT (CS) |
| Daniel Gross | Israeli | 18 days | 15 | VP Product, Meta Superintelligence | Cofounder, SSI; NFDG | ...
16 Months and 450,000 RMB: A Post-Mortem of a Failed AI Social Product Startup
Founder Park· 2025-07-19 16:26
This article originally appeared on 老段的想法 (Lao Duan's Thoughts), by 尚书毅. Editor's note: today's piece is a post-mortem of a failed AI product. In hindsight, assembling such a large team to start up before confirming PMF and users' willingness to pay led to quite a few pitfalls.

The startup project was an AI social tool for couples, named 抱抱窝 ("Cuddle Nest"). The core product was a standalone app providing IM (instant messaging, comparable to WeChat chat) for the two partners, with an AI bot added in that could understand message history and join the conversation. It also offered notes the couple could edit collaboratively, which the AI could update autonomously based on chat content, organized by note topic. The project ran for one year and four months, involved about 35 part-time/intern contributors, plus two people working full-time for a combined 21 months, and consumed about 450,000 RMB. The app was completed, but there were no resources left to iterate, promote, or operate it further. This post-mortem follows the timeline, recounting the key actions taken at the time and how I evaluate them now.

01 Committing to the Startup
In February 2024, while making plans for the future during the Spring Festival trip home, I concluded that I still wanted to start a company. My personal finances could cover the cost of starting up, I had no loans, and I could absorb the risk. The AI industry's high ...
First-Hand Insights from Manus: How to Do Context Engineering for AI Agents
Founder Park· 2025-07-18 18:51
Yesterday the Manus website published a post sharing the lessons the team learned building the right context engineering for Manus. The author is 季逸超 (Peak), co-founder and Chief Scientist of Manus. The article was translated with Kimi K2, with some adjustments by us.

At the very start of the Manus project, my team and I faced a key decision: use open-source foundation models to train an end-to-end agent, or build the agent on top of the in-context learning abilities of frontier models? In my first decade in NLP, we did not have the luxury of that choice. Back when BERT came out (yes, that was seven years ago), a model had to be fine-tuned, and then evaluated, before it could transfer to a new task. Each iteration often took weeks, even though the models of that era were tiny compared to today's LLMs. For fast-iterating applications, especially pre-PMF, such a slow feedback loop is nearly fatal. That was the painful lesson of my previous startup: I trained models from scratch for open information extraction and semantic search, and then GPT-3 and Flan-T5 arrived and my homegrown models became irrelevant overnight. Ironically, it was those same new models that opened the door to in-context learning, and pointed us toward an entirely new path. That hard-won lesson made the choice clear: Manus would bet on context engineering. This ...
MiniMax Closed-Door Tech Session: Long Context Is the Game Changer for Agents
Founder Park· 2025-07-18 18:24
Core Insights
- The article discusses the advancements in Reinforcement Learning (RL) and its potential to enhance model capabilities, particularly in the context of limited context lengths and the importance of pre-training data diversity [6][8][10].

Group 1: RL and Model Capabilities
- RL can indeed provide new capabilities to models, especially when dealing with limited context lengths, by altering the output distribution and reducing the number of tokens needed to solve specific problems [6].
- The pass@k metric is highlighted as a useful measure for evaluating model capabilities, with the definition of k being crucial depending on the problem context [7].
- Reward modeling remains a significant challenge in RL, particularly for non-outcome-based rewards, which complicates the training process [7].

Group 2: Pre-training and Data Distribution
- Pre-training is essential for exposing models to diverse data distributions, which is currently more varied than the narrower distributions used in RL training [8].
- While RL can potentially fill gaps in pre-training, the quality and diversity of pre-training data are critical for effective model training [8].

Group 3: Long Context and Agent Workflows
- Long context windows are identified as game changers for agent workflows, allowing for the processing of extensive information in a single pass, which enhances output quality [15][16].
- The application of long context models is particularly beneficial in fields such as legal compliance analysis and customer research, where comprehensive data processing is required [17][18].

Group 4: Hybrid Architectures
- Hybrid attention mechanisms are positioned as the future of model design, combining the strengths of linear and full attention models to improve efficiency and performance [19][20].
- The effective deployment of hybrid architectures is currently limited by infrastructure challenges, despite their proven potential [20].

Group 5: Practical Applications and Challenges
- The implementation of hybrid architectures in real-world applications is crucial, especially for handling large-scale requests efficiently [22].
- Unified abstraction layers are needed to optimize both traditional and hybrid architectures in inference engines [21].

Group 6: Future Directions
- The exploration of latent reasoning and self-training models is highlighted as an exciting frontier in RL research, with implications for the development of more autonomous AI systems [13][14].
- Evaluating model performance based on computational budgets rather than fixed output lengths is emphasized as a more accurate assessment of efficiency [24].
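For reference on the pass@k metric mentioned above: the commonly used unbiased estimator (popularized by OpenAI's HumanEval evaluation; whether MiniMax computes it exactly this way is not stated in the article) can be sketched in a few lines:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: the probability that at least one of k
    samples, drawn without replacement from n generations of which c are
    correct, solves the problem."""
    if n - c < k:
        return 1.0  # too few failures to fill all k slots: success guaranteed
    return 1.0 - comb(n - c, k) / comb(n, k)

# pass@1 with 3 correct out of 10 generations
print(round(pass_at_k(10, 3, 1), 4))  # → 0.3
```

The choice of k matters, as the summary notes: pass@1 measures one-shot reliability, while large k measures whether the capability exists anywhere in the model's output distribution.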
An OpenAI Core Researcher: What Matters More Than Prompt Engineering Is Spec-Writing
Founder Park· 2025-07-18 11:37
The most valuable skill for programmers is no longer writing code; it is communicating intent to AI precisely. A well-crafted specification that captures complete intent is the real "source code." That is the argument OpenAI researcher Sean Grove made in his talk at AIEWF 2025. Not long ago, Andrej Karpathy also shared his view on prompting; the difference is that Karpathy focused on "feeding the AI more material" so it better understands your intent, arguing that providing complete, appropriate context usually matters more than writing a good prompt. Sean Grove's angle is instead how to produce a complete, executable "specification" that conveys intent to AI precisely. In a sense, both views make the same deeper point: generating code is no longer the focus; the essence of software engineering is the "communication" between humans and AI. It can also be read as a response to the verifier's law proposed by Jason Wei: a specification is itself a verifiable standard. In the talk, Sean Grove shared his view of software engineering from the angle of "new code" in the AI era. He argues that prompts are specifications and should not be discarded after a single use; capturing the intent and values within them is essential. The most valuable artifact is not the code but the source specification. In addition, Sean ...
A 4-Person Team Built Two AI Education Hits in a Row: A Small Team's Guide to Winning in the AI Era
Founder Park· 2025-07-18 11:37
Core Insights
- The article discusses the emergence of small, efficient teams in the AI era, exemplified by Oleve, which has achieved significant revenue with a minimal workforce [3][7].

Group 1: Company Overview
- Oleve is an AI startup with a team of only 4 people, generating an annual revenue of $6 million (approximately 43 million yuan) [3].
- The company has developed three products, including two educational applications, Quizard AI and Unstuck AI, with Quizard ranking third among educational apps [3][9].

Group 2: Product Success
- Quizard, launched in January 2023, quickly gained traction, achieving profitability within 9 months [9][17].
- The marketing strategy for Quizard included viral videos on TikTok, which garnered over 1 million views and converted into 10,000 users within 30 hours [13][15].
- Unstuck AI, the second product, reached 1 million users in just 2 months after its launch [19].

Group 3: Team and Growth Strategy
- Oleve employs a "lean growth" strategy with six principles, including hiring multi-skilled individuals and prioritizing profitability [26][28].
- The team focuses on continuous process improvement and utilizes "super tools" to streamline operations and enhance flexibility [31][32].
- Oleve's approach to automation includes using AI to analyze social media trends and inform product decisions, aiming for a three-phase automation system [36].

Group 4: Future Outlook
- The company is planning a third product that is not education-focused, which has already achieved profitability [35].
- Oleve's model showcases how small teams can leverage AI to create efficient workflows and innovative products, potentially leading to more "small and beautiful" AI startups in the education sector [36].
A Kimi Employee's K2 Post-Mortem: Why Focus on Agents, Why Open-Source, and Why the DSv3 Architecture?
Founder Park· 2025-07-18 09:39
Core Viewpoint
- The article discusses the launch and features of the K2 model, highlighting its advancements in coding capabilities and its recognition in the AI community, particularly as an open-source flagship model [1][4][13].

Group 1: Model Performance and Features
- K2 has become the top-ranked open-source model on LMArena, showcasing its strong performance in coding capabilities [1][3].
- The model architecture includes a trillion-parameter MoE (Mixture of Experts) design, emphasizing its innovative approach to agent tool use and coding abilities [2][4].
- K2's coding capabilities have been acknowledged by the various coding products integrating with it, indicating its effectiveness in practical applications [3].

Group 2: Development Insights
- The development of K2 involved significant research into model structure and scaling experiments, leading to the decision to inherit the successful structure of the DSv3 model while optimizing parameters for cost efficiency [20][21].
- The team focused on keeping training and inference costs comparable to DSv3, ensuring the model remains viable for a smaller company [20][21].
- K2's design includes specific adjustments, such as the number of experts and attention heads, aimed at improving performance while managing resource constraints [22][24][30].

Group 3: Open Source Strategy
- The decision to open-source K2 is driven by the desire for greater visibility and community engagement, which can enhance the model's technical ecosystem [13][14].
- Open-sourcing imposes higher technical standards, compelling the company to produce better models and aligning more closely with the goal of achieving AGI (Artificial General Intelligence) [14][15].
- Open-source models must demonstrate reproducibility and effectiveness, which can drive innovation and improvement in model development [15][13].

Group 4: Market Position and Competition
- The article reflects on the competitive landscape, noting that many agent products rely heavily on foundational models like Claude, indicating the importance of strong underlying technology [16][19].
- Despite challenges in visibility and market presence, the company remains committed to focusing on core model development rather than diverting resources to less impactful areas [19].
- The success of competitors like DeepSeek is viewed positively, reinforcing the belief that strong model performance is the best form of promotion in the market [19].
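As background to the trillion-parameter MoE design mentioned above, the sketch below shows a generic top-k expert-routing forward pass for a single token. It is purely illustrative: the shapes, expert count, gating details, and all names here are hypothetical, not K2's or DSv3's actual implementation.

```python
import numpy as np

def moe_forward(x, gate_w, experts, k=2):
    """Route one token embedding through the top-k experts (hypothetical sketch).

    x: (d,) token embedding; gate_w: (d, n_experts) gating matrix;
    experts: list of callables mapping (d,) -> (d,).
    """
    logits = x @ gate_w                           # one gating score per expert
    top = np.argsort(logits)[-k:]                 # indices of the k best-scoring experts
    w = np.exp(logits[top] - logits[top].max())
    w = w / w.sum()                               # softmax over the selected experts only
    # Only k experts run per token, so compute scales with k, not n_experts --
    # this is how total parameters can grow far beyond per-token compute.
    return sum(wi * experts[i](x) for wi, i in zip(w, top))

rng = np.random.default_rng(0)
x = rng.standard_normal(8)
gate_w = rng.standard_normal((8, 4))
experts = [lambda v, s=s: s * v for s in (0.5, 1.0, 1.5, 2.0)]
out = moe_forward(x, gate_w, experts, k=2)
assert out.shape == (8,)
```

The tunable knobs the post-mortem alludes to, such as the number of experts and how many are active per token, correspond here to `len(experts)` and `k`.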
OpenAI Launches ChatGPT Agent: Now Available to Paid Users, Similar to Manus
Founder Park· 2025-07-18 03:19
Core Viewpoint
- The article emphasizes that the major theme of AI in 2025 is the emergence of "Agent" capabilities, transitioning from AI merely "talking" to actively "doing" tasks [1][31].

Group 1: Introduction of Agent Mode
- OpenAI introduced the Agent mode, allowing users to directly request tasks from ChatGPT, such as purchasing items or generating presentations, with the AI autonomously executing these tasks in a virtual environment [2][5].
- The Agent mode can utilize three tools, a text browser, a visual browser, and a terminal, enabling it to perform complex tasks efficiently [6][7].

Group 2: User Experience and Interaction
- Users can interact with the Agent in real time, providing confirmations and new requirements during task execution, enhancing the collaborative experience [5][12].
- The Agent's ability to autonomously switch between tools and execute tasks significantly improves efficiency compared to traditional methods [6][30].

Group 3: Integration of Previous Tools
- The Agent mode combines two previously launched tools, Operator and Deep Research, which were integrated to enhance user experience and task execution capabilities [15][17].
- This integration allows the Agent to perform tasks that require both browsing and in-depth research, streamlining the process of generating comprehensive reports [18][22].

Group 4: Performance Metrics and Comparisons
- The Agent mode has shown significant improvements in performance metrics, achieving a score of 42% on Humanity's Last Exam, a substantial enhancement over previous models [22][30].
- While the Agent mode still falls short of human performance on certain tasks, it demonstrates a notable advance in web operation capabilities [30].

Group 5: Future Implications and Challenges
- The rise of Agent capabilities raises questions about user trust and the extent of permissions granted to AI as it begins to handle more complex real-world tasks [36][37].
- The article highlights the potential impact on the workforce, questioning whether AI will empower or threaten jobs as it takes on more responsibilities [37][38].