红杉汇
Today, We Are Launching xbench
红杉汇· 2025-05-25 23:20
Core Viewpoint - The article discusses the launch of a new AI benchmark testing tool called xbench by Sequoia China, aimed at creating a more scientific and effective evaluation system for AI capabilities, particularly in real-world applications [1][2].
Group 1: xbench Overview
- xbench employs a dual-track evaluation system that constructs a multidimensional dataset to track both the theoretical limits of AI models and the practical value of AI agents in real-world scenarios [2][3].
- The tool features an Evergreen Evaluation mechanism, ensuring continuous updates to testing content to maintain relevance and timeliness [2][3].
Group 2: Evaluation Methodology
- The initial release includes two core assessment sets: xbench-ScienceQA for scientific question answering and xbench-DeepSearch for deep search capabilities, with comprehensive rankings of major products in these fields [3][19].
- The evaluation methodology focuses on aligning assessments with real-world applications, particularly in recruitment and marketing sectors, to establish clear business value [3][12].
Group 3: Historical Context and Development
- xbench has been used internally by Sequoia China for over two years to track and evaluate foundational model capabilities, with significant improvements observed in model performance over time [5][7].
- The tool's question bank has undergone multiple updates to reflect increasing complexity and relevance to real-world tasks, demonstrating rapid advancements in AI model capabilities [5][7].
Group 4: Future Directions
- The article emphasizes the need for innovative task settings and evaluation methods that align with practical applications, moving beyond traditional assessment frameworks [8][22].
- Future evaluations will focus on dynamic, real-world tasks that reflect the evolving needs of various professional fields, with an emphasis on collaboration with industry experts to refine assessment criteria [24][27].
Group 5: Long-term Evaluation Strategy
- The Evergreen Evaluation approach aims to mitigate issues of question leakage and overfitting by maintaining a dynamic and continuously updated assessment pool [11][30].
- The article outlines a vision for ongoing assessments that adapt to the rapid evolution of AI technologies and their applications in diverse professional contexts [30][35].
Why Are Top Brands All in "Motion"? | Sequoia Loves Life
红杉汇· 2025-05-22 23:31
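Moreover, motion design involves a great many narrative elements. When visual storytelling blends seamlessly with graphic elements, strategic copywriting, and art direction, it achieves the best motion effect and moves the audience. Let's start with a few examples to see which touchpoints motion design can occupy: Today, pre-roll video ads, social media feed ads, app splash-screen ads, web user interfaces, micro-interaction design, digital billboards, outdoor screens: everything is moving! Screens are everywhere, and online content is growing explosively. Brands are adjusting their strategies, embracing digital touchpoints while holding on to traditional channels. From immersive brand experiences to streaming and broadcast content, motion design has become standard in many industries. In a world where "watching the big screen while scrolling on the small one" is the norm, content that does not move is the odd thing out. Now, let's focus on motion design for brands (Motion Design, sometimes also called "motion effects design") and talk about how to create motion that truly moves an audience. Motion design is the art and craft of bringing a design (or, more broadly, a "brand") to life through animation techniques. The purpose of motion design is to convey information consistently and effectively and to reinforce a project's concept and visual language, usually in order to encourage users to take action. Its subjects include brand logos, symbols, typefaces, images, video clips, illustrations, and icons: any visual element that conveys information can be a starting point for the design. Motion design touchpoints to consider ...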
Why Is Life So Colorful? | Sequoia Loves Science
红杉汇· 2025-05-21 15:21
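Today we once again mark the International Day for Biological Diversity. This year's theme is "Harmony with nature and sustainable development", which calls for following the path of harmonious coexistence between humanity and nature to build a harmonious and sustainable future, and thereby advancing a new stage in the global governance of biodiversity conservation. The earliest animals to have trichromatic vision were a group of invertebrates, the arthropods (such as insects, spiders, and crustaceans). Between 420 and 500 million years ago, vertebrates also began to acquire trichromatic vision, allowing them to identify prey and predators more precisely than organisms with monochromatic vision and improving their ability to navigate their environment. Fossil evidence provides further clues. For example, trilobites, a group of extinct marine arthropods that lived more than 500 million years ago, had compound eyes. This visual structure could detect multiple wavelengths of light, giving trilobites an evolutionary advantage in dim marine environments and improving their motion perception and their ability to compete for survival. This evidence suggests that organisms were able to perceive color before they themselves became colorful. Every pulse of Earth's community of life is closely tied to the fate of humanity. From muted grayish-brown base tones to today's riot of color, from the peacock's dazzling plumage to brilliantly varied flowers, the Earth has gone through a series of evolutionary steps: the colors of life are a concrete expression of the genetic code and a reflection of the underlying logic of species survival. Along the way, biodiversity has built the foundation of human existence. But one undeniable fact is that the genetic diversity of many species worldwide is being lost at an accelerating rate, and the genetic treasure ...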
First-Rate Managers Are All Masters of "Picking People" | Chief Talent Officer
红杉汇· 2025-05-19 13:15
Core Viewpoint - The article emphasizes the importance of human resource management from an organizational perspective rather than just a human resources function, highlighting that selecting the right people is paramount for long-term success [2].
Group 1: Importance of Employee Selection
- Research indicates that the performance difference between top employees and average employees is significantly larger than expected, with top performers in various industries outperforming their peers by substantial margins [3].
- Choosing the wrong employees can lead to inefficiencies in subsequent HR functions like training and retention, making it crucial to select the right individuals from the outset [3].
Group 2: Investment Focus
- Companies should shift their investment focus from training to recruitment, as the costs associated with hiring the wrong person can be much higher than those incurred during training [4].
Group 3: Five Rules for Improving Selection Probability
- **Rule 1: Define Key Selection Criteria** Companies should clearly outline the essential characteristics they seek in candidates, as exemplified by Amazon's focus on practicality, ownership, and resilience [6].
- **Rule 2: Conduct Proper Assessments** Effective selection tools include cognitive ability tests, work sample tests, and structured situational interviews, while relying solely on personality assessments may not yield reliable results [8].
- **Rule 3: Implement Scientific Interviews** Structured situational interviews, which follow the STAR principle, are shown to be the most effective method for assessing candidates [10].
- **Rule 4: Emphasize Probation Periods and Background Checks** Utilizing probation periods allows companies to evaluate candidates in real work situations, and thorough background checks are essential, especially for higher-level positions [12].
- **Rule 5: Have the Courage to Replace Unsuitable Employees** Companies should be prepared to part ways with employees who do not meet expectations, as maintaining a harmonious relationship relies on mutual suitability [14][16].
AI Voices | Stanford University Annual Report: Corporate AI Adoption Hits a Record
红杉汇· 2025-05-18 02:21
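Sometimes you may find yourself puzzled: how is it that the AI in the news writes research papers and drives cars on its own, while the AI tools in our own hands work only some of the time? Stanford University's latest "2025 AI Index Report" may clear up the confusion. This annual report of more than 400 pages includes in-depth analysis of the evolving AI hardware landscape, new estimates of inference costs, and new analysis of trends in AI academic publishing and patent filings. It also introduces new data on corporate adoption of responsible AI practices. We have compiled some of the report's key findings in the hope of helping readers better understand the development of AI technology and make full use of it to gain a first-mover advantage. [Note] The original report can be downloaded via "Read the original" at the end of the article. AI is increasingly embedded in everyday life. In fields from healthcare to transportation, AI is moving rapidly from the lab into daily life. In 2023, the U.S. Food and Drug Administration (FDA) approved 223 AI-enabled medical devices, compared with just 6 in 2015. On the roads, self-driving cars are no longer experimental: one autonomous-vehicle operator already provides more than 150,000 autonomous rides each week. Amid this AI boom, science and medicine have seen exciting new developments. Several newly released foundation models will support research in materials science, weather forecasting, quantum computing, and other areas. Many companies are trying to turn AI's predictive and generative capabilities into ...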
Can You Tell Whether This Is Real or AI-Generated? Here Is an Identification Guide for You
红杉汇· 2025-05-15 17:00
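Let's start with an image. If an AI were told to produce a casual selfie snapshot of Messi, Cristiano Ronaldo, and Neymar in a hotpot restaurant at night, it might generate something like this: Image source: Xiaohongshu. The following article is from 硅星人Pro, by 周一笑. 硅星人Pro: Silicon (Si) is the foundation for creating the future. Welcome to this planet. So, faced with AI's formidable "creative power", can ordinary people still tell real from fake? Of course: the flip side of technology always hides the "key to fighting back". Today we look at three forms of content (text, images, and video) and lay out some techniques, in the hope that everyone can learn to "spot AI at a glance". Identifying AI text: start with the "AI flavor". Have you ever read something online and muttered to yourself, "Was this ... written by a person or by AI?" As AI writing tools become more common, this has practically become an essential skill for browsing the web. Interestingly, as the scholar Ethan Mollick has said, when you can easily sense that a piece of content was written by AI, that usually means the content itself is not yet good enough. So learning to recognize AI text is not just about "catching fakes"; it is also about keeping your taste for good content in an age of information overload. Text written by some AIs has a distinctive "flavor" that becomes easier to recognize the longer you use them. One obvious trait is false precision. A human might write "the chill pierced her chest", whereas an AI will oddly write something unconventional like "the chill pierced her fourth thoracic vertebra". Then there is the vocabulary ...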
The First "Supergroup" of Scientific Research Agents Debuts! And More Recent AI News ...
红杉汇· 2025-05-14 14:05
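To boost human research efficiency, it is time for AI to step in. During the May Day holiday, FutureHouse launched four AI research agents, each officially named after an animal: Crow (a general-purpose agent), Falcon (an automated literature review agent), Owl (an investigation agent), and Phoenix (an experiment agent). These AI agents can access the full text of scientific literature and can also assess information quality. Image source: FutureHouse website. Among them, Crow, Falcon, and Owl have passed rigorous benchmark tests and already surpass today's top search models such as o3-mini, GPT-4.5, and Claude-3.7 in search precision and accuracy. These three models can access a large body of complete scientific texts, which means you can ask them more detailed questions about experimental protocols and research limitations. They can also use a variety of factors to distinguish source quality, ensuring that they do not rely on low-quality papers or popular science sources. Although these four scientist agents cannot yet carry out most scientific research autonomously, people can already use them to generate and evaluate new hypotheses and to plan new experiments, much faster than before. Moreover, the agents' reasoning process is fully transparent, with multi-stage, in-depth analysis of every information source. More importantly, users can clearly view the entire reasoning process and see the basis for each step the agent takes toward its conclusion. Add to that Fu ...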
Want to Become Kazuo Inamori? Here Is How You Can Try Building an Entrepreneur's Personal IP | 红杉汇 Insider Briefing
红杉汇· 2025-05-13 11:29
Core Viewpoint - The article emphasizes the importance of personal branding (personal IP) for entrepreneurs and company leaders, highlighting how it can enhance trust and connection with customers, suppliers, and investors [2][5][6].
Group 1: Importance of Personal IP
- Entrepreneurs and founders serve as the most persuasive "human billboards" for their brands, influencing consumer perceptions through their personal stories and professional expertise [2][5].
- A strong personal IP can humanize a company, making it more relatable and trustworthy to consumers, especially in a market where traditional advertising is losing effectiveness [5][6].
- Research indicates that 63% of respondents feel companies with active social media presence from their CEOs appear more "warm," and 64% appreciate personal content shared by CEOs [5][6].
Group 2: Building Personal IP
- Defining personal IP involves understanding one's true self and sharing personal experiences, education, and interests to resonate with potential customers [8][9].
- Targeting the right social media platforms is crucial; focusing on one or two platforms can yield better results than spreading efforts too thin [10].
- Continuously showcasing value through insightful content and personal stories can attract and retain audience attention [11][12].
Group 3: Networking and Consistency
- Establishing connections with colleagues, clients, and partners is vital for building a personal IP, as recommendations and endorsements can enhance credibility [14].
- Consistent content updates and maintaining a unified brand image across all platforms are essential for reinforcing personal IP and professional credibility [15].
Group 4: Employee Personal IP
- Employees' personal IPs can collectively shape a company's overall image, making it more approachable and relatable to potential clients [19].
- Engaging employees in personal branding efforts can significantly increase interaction rates on social media, often outperforming official company accounts [19].
公元: DeepSeek Has Only Opened One Door; Large Models Are Far from the Endgame | Investors Say
红杉汇· 2025-05-11 05:09
Core Viewpoint - The discussion highlights the evolving landscape of AI and embodied intelligence, emphasizing the importance of clear commercialization routes and the rapid pace of technological change in the industry [1].
Group 1: AI and Embodied Intelligence Landscape
- The current entrepreneurial models differ significantly from the internet era, with a focus on clear commercialization routes rather than solely on technological disruption [1].
- The market for embodied intelligence is likened to the AI landscape in 2018, suggesting that significant breakthroughs are yet to be seen, similar to the emergence of GPT [6].
- The emergence of DeepSeek has disrupted the existing narrative around AGI in the U.S. and reshaped the domestic large model landscape, leading to predictions that only a few companies will dominate the market [6].
Group 2: Investment Strategies and Market Dynamics
- Investors are increasingly challenged to keep pace with rapid model iterations, necessitating a deeper understanding of model boundaries and capabilities [7].
- The investment landscape is characterized by a shift in focus from traditional metrics like DAU and MAU to the capabilities of AGI models, which can lead to sudden user shifts [7].
- The belief in the future of AGI is crucial for investors, as the current state of embodied intelligence is still in its early stages, with no clear prototypes of general models yet available [9].
Group 3: Entrepreneurial Challenges and Opportunities
- Entrepreneurs in AI and embodied intelligence face difficulties in articulating clear applications, contrasting with previous business plans that clearly defined objectives [8].
- The need for a dual approach to both pre-training and post-training in model development is emphasized, indicating that both aspects are essential for progress in the field [6].
- The industry is still in the early stages of development, with significant time required before a universal model emerges [9].
Why Are Most Innovations "Sleeping Beauties"? | Sequoia Library
红杉汇· 2025-05-08 15:21
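Here is a truth you may not have noticed: all disruptive innovations start out unremarkable, and almost all innovations appear before their time. Whether in the evolution of species in nature or in human cultural innovation, many breakthroughs were not born in their golden age. They sprouted long before the environment was ready, then lay dormant for centuries because of limited understanding, technical bottlenecks, or bad timing, until some trigger awakened them and they reshaped the world. The Austrian evolutionary biologist Andreas Wagner explains this idea in detail in his new book 《唤醒创新睡美人》 (literally, "Awakening the Sleeping Beauties of Innovation"). With a wealth of examples, he shows that innovation does not depend on an individual genius's flash of inspiration but is the joint product of probability, environment, and chance. So how do you awaken innovation's "sleeping beauties"? Wagner points out that the greatest innovations may be hiding in the most inconspicuous corners, and what you need may be just patience, an eye for discovery, and the courage to break boundaries and embrace uncertainty. This article is excerpted from 《唤醒创新睡美人》. Recommended reading. 《唤醒创新睡美人》 Author: Andreas Wagner (Austria) Translator: 贾拥民 Publication date: March 2025 Publisher: 湛庐文化, 浙江科学技术出版社 What is the most successful organism on Earth? Many people would answer with an apex predator such as the lion or the great white shark; others might say birds, insects, or bacteria. Few would think that grass fully deserves to be called the most successful organism. Grass, as an organism, at least ...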