AI agents
Andrej Karpathy on why we still need humans in the loop
Y Combinator· 2025-06-24 21:46
I think GUIs are very useful for auditing systems, and visual representations in general. And I think GUIs, for example, are extremely important to this because a GUI utilizes the computer-vision GPU in all of our heads. Reading text is effortful and it's not fun. But looking at stuff is fun, and it's just kind of like a highway to your brain. We have to keep the AI on the leash. I think a lot of people are getting way overexcited with AI agents, and uh it's not useful to me to get a diff of 1,000 line ...
Salesforce launches Agentforce 3 with AI agent observability and MCP support
VentureBeat· 2025-06-23 21:03
Core Insights - Salesforce has launched significant enhancements to its AI agent platform, Agentforce 3, aimed at addressing enterprise challenges in deploying digital workers at scale, particularly in monitoring performance and ensuring security across corporate systems [1][2][9] Group 1: AI Agent Performance and Demand - The introduction of a "Command Center" in Agentforce 3 provides executives with real-time visibility into AI agent performance and supports interoperability with numerous external business tools [2][9] - There has been a 233% increase in AI agent usage within six months, with over 8,000 customers adopting the technology, leading to measurable returns such as a 15% reduction in customer case handling time for Engine and a 70% autonomous resolution rate for 1-800Accountant during peak tax season [3][22] Group 2: Enterprise Integration and Transformation - PepsiCo is leveraging Agentforce as part of its AI-driven transformation strategy, recognizing the need for better integration of data and systems to meet evolving customer demands [5][6][8] - The deployment of AI agents is seen as essential for enhancing customer engagement and driving backend efficiency, with PepsiCo's long-standing partnership with Salesforce facilitating a swift transition to AI technologies [7][8] Group 3: Operational Challenges and Solutions - The new observability platform in Agentforce 3 addresses the operational challenges that arise post-deployment, providing analytics on agent interactions and health monitoring with real-time alerts [11][12] - The system captures all agent activity using the OpenTelemetry standard, allowing integration with existing monitoring tools and ensuring oversight of AI agents within operational workflows [13] Group 4: Interoperability and Security - Salesforce's adoption of the Model Context Protocol (MCP) enhances AI agent interoperability, enabling connections with MCP-compliant servers without custom development [14][16] - The platform's enhanced architecture offers 50% lower latency and improved security for regulated industries by hosting AI models within Salesforce's infrastructure, ensuring sensitive data remains secure [18][19] Group 5: Industry-Specific Deployments and Pricing - Salesforce has developed over 200 pre-configured industry actions to expedite AI agent deployment, with significant results reported by clients such as OpenTable and Grupo Falabella [21][22] - The company has introduced flexible pricing models, including unlimited usage licenses for employee-facing agents and per-action pricing that scales with actual AI work performed [22] Group 6: Future of Work and Competitive Landscape - The rise of AI agents is transforming enterprise operations, with new roles emerging for managing these digital employees, indicating a shift in how work is organized [23][24] - As competition intensifies among major technology firms, Salesforce emphasizes its integration advantages, allowing for comprehensive tracking of work cycles within the enterprise ecosystem [24]
Building Agents with Amazon Nova Act and MCP - Du'An Lightfoot, Amazon (Full Workshop)
AI Engineer· 2025-06-19 02:04
Workshop Overview - The workshop focuses on building AI agents using Amazon's agent technologies [1] - Participants will gain hands-on experience in building sophisticated AI agents [1] - The workshop is two hours long [1] Technologies Highlighted - Amazon Nova Act is used for reliable web navigation [1] - Model Context Protocol (MCP) connects agents to external data sources and APIs [1] - Amazon Bedrock Agents orchestrates complex workflows [1] Skills Acquired - Participants will learn to build agents that can navigate the web like humans [1] - Participants will learn to perform complex multi-step tasks [1] - Participants will learn to leverage specialized tools through natural language commands [1]
How 11x Rebuilt Their Alice Agent: From ReAct to Multi-Agent with LangGraph | LangChain Interrupt
LangChain· 2025-06-16 16:36
Company Overview - 11x is building digital workers, including Alice, an AI SDR, and Julian, an AI voice agent [1] - The company relocated from London to San Francisco and rebuilt its core product, Alice, from the ground up [2] Alice Rebuild & Vision - The rebuild of Alice was driven by the belief that agents are the future [3] - The new vision for Alice centers on seven agentic capabilities: chat-based interaction, knowledge base training via document uploads, AI-driven lead sourcing, deep lead research, personalized emails, automatic handling of inbound messages, and self-learning [11][12][13] Development Process & Tech Stack - The rebuild of Alice 2 took only 3 months from the first commit to migrating the last business customer [3][14] - The company chose a vanilla tech stack and leveraged vendors like LangChain to move quickly [15][16][17] - LangChain was chosen as a key partner due to its AI dev tools, agent framework, cloud hosting, observability, TypeScript support, and customer support [18][19] Agent Architecture Evolution - The company experimented with three different architectures for campaign creation: ReAct, workflow, and multi-agent systems [21] - The final architecture was a multi-agent system with a supervisor and specialized sub-agents for research, positioning, LinkedIn messaging, and email writing [44][45][46] Results & Future Plans - Alice 2 went live in January and has sourced close to 2 million leads and sent close to 3 million messages [52] - Alice 2 has generated about 21,000 replies, with a reply rate of around 2%, on par with a human SDR [52] - Future plans include integrating Alice and Julian, implementing self-learning, and exploring new technologies like computer use, memory, and reinforcement learning [53][54]
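The supervisor-plus-sub-agents shape described above can be sketched in a few lines of plain Python. This is only an illustration of the pattern, not 11x's code or the LangGraph API: the sub-agent functions, the fixed `PLAN` order, and the state dictionary are all hypothetical, and a real system would have each sub-agent call a model rather than format a string.

```python
from typing import Callable, Dict, List

State = Dict[str, str]

# Hypothetical sub-agents. Each reads the shared state and returns its own
# contribution; real sub-agents would call an LLM instead of formatting text.
def research(state: State) -> str:
    return f"Research notes on {state['lead']}"

def positioning(state: State) -> str:
    return f"Positioning for {state['lead']} using: {state['research']}"

def linkedin_message(state: State) -> str:
    return f"LinkedIn note to {state['lead']} ({state['positioning']})"

def email(state: State) -> str:
    return f"Email to {state['lead']} ({state['positioning']})"

SUB_AGENTS: Dict[str, Callable[[State], str]] = {
    "research": research,
    "positioning": positioning,
    "linkedin_message": linkedin_message,
    "email": email,
}

# Assumed fixed routing order; a real supervisor might choose the next
# sub-agent dynamically based on the state so far.
PLAN: List[str] = ["research", "positioning", "linkedin_message", "email"]

def supervisor(lead: str) -> State:
    """Route the task through each sub-agent and collect results in one state."""
    state: State = {"lead": lead}
    for name in PLAN:
        state[name] = SUB_AGENTS[name](state)
    return state

if __name__ == "__main__":
    for key, value in supervisor("Acme Corp").items():
        print(f"{key}: {value}")
```

Keeping the routing in one place is what distinguishes this from a single ReAct loop: each sub-agent stays small and testable, and the supervisor owns the order of work.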
Databricks CEO on evaluating AI agents
CNBC Television· 2025-06-12 14:45
Bottleneck in AI Agent Adoption - The primary obstacle is the lack of proper evaluation and benchmarking for AI agents within companies [2] - Companies are essentially "flying blind" because they lack the ability to assess the performance and impact of their AI agents [2] - AI agents' ability to excel at programming contests or math Olympiads does not directly translate to effectiveness in specific job roles within a company [1] Importance of Evaluation - Evaluations or benchmarks are crucial for agent learning, enabling companies to teach AI agents and allow them to self-evaluate [2] - Without proper evaluation, companies risk deploying AI agents that could cause significant disruption or "wreak havoc" [2] - Companies need to know how AI agents are performing before fully integrating them into the workforce [2] Understanding AI Agent Capabilities - A fundamental issue is that companies often lack a clear understanding of what their AI agents are actually doing [3]
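As a rough illustration of what a company-specific evaluation might look like, the sketch below scores an agent against a small set of job-specific test cases. Everything here is hypothetical: `toy_agent` stands in for a real agent, the cases are invented, and exact-match scoring is the simplest possible grader rather than anything described in the interview.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class EvalCase:
    prompt: str    # a task drawn from the actual job the agent will do
    expected: str  # the answer a domain expert considers correct

def toy_agent(prompt: str) -> str:
    """Hypothetical stand-in for a deployed agent."""
    answers = {"refund policy?": "30 days", "support hours?": "9-5"}
    return answers.get(prompt, "unknown")

def run_eval(agent: Callable[[str], str], cases: List[EvalCase]) -> float:
    """Return the fraction of cases where the agent's answer matches exactly."""
    passed = sum(1 for case in cases if agent(case.prompt) == case.expected)
    return passed / len(cases)

CASES = [
    EvalCase("refund policy?", "30 days"),
    EvalCase("support hours?", "9-5"),
    EvalCase("shipping time?", "3 days"),
]

score = run_eval(toy_agent, CASES)

if __name__ == "__main__":
    print(f"pass rate: {score:.0%}")
```

The point of the sketch is the shape, not the grader: once cases come from real work rather than contest problems, the pass rate becomes a number a company can track before and after each change to the agent.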
From Prompt to Paris: How AI Agents Launch a Food Truck Dream
NVIDIA· 2025-06-11 13:11
AI Agents Overview - AI agents are digital assistants that use prompts to reason and break down problems into multi-step plans [1] - They utilize appropriate tools, collaborate with other agents, and leverage context from memory on NVIDIA accelerated systems [1] - The process begins with a simple prompt, exemplified by asking Perplexity to initiate a food truck business in Paris [1] Agent Collaboration and Functionality - Agents collaborate to address each step, utilizing various tools [2] - A market researcher analyzes trends and the competitive landscape through reviews and reports [2] - A concept designer explores local ingredients and proposes a menu with prep time estimates, researches palettes, and develops a brand identity [2] - A financial planner employs Monte Carlo simulations to project profitability and growth [3] - An operations planner creates a detailed launch timeline, covering equipment procurement and permit acquisition [3] - A marketing specialist develops a launch plan with a social media campaign and an interactive website featuring a map, menu, and online ordering [3] Proposal and Outcome - The collective work of each agent culminates in a final packaged proposal [3] - All of this originates from a single prompt [3]
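To make the financial planner's step concrete, here is a minimal Monte Carlo sketch of the kind of simulation the summary mentions. None of the figures come from the video: daily customers, average ticket, operating days, cost ratio, and fixed costs are invented illustration values, and the normal distributions are an assumed modeling choice.

```python
import random
import statistics
from typing import Dict

def simulate_annual_profit(rng: random.Random) -> float:
    """Draw one possible year for a hypothetical food truck (all inputs assumed)."""
    customers_per_day = max(rng.gauss(120, 30), 0.0)  # assumed demand
    avg_ticket_eur = max(rng.gauss(9.0, 1.5), 0.0)    # assumed spend per customer
    operating_days = 250                               # assumed
    variable_cost_ratio = 0.35                         # assumed share of revenue
    fixed_costs_eur = 150_000                          # assumed truck, permits, staff
    revenue = customers_per_day * avg_ticket_eur * operating_days
    return revenue * (1 - variable_cost_ratio) - fixed_costs_eur

def run_simulation(n_trials: int = 10_000, seed: int = 42) -> Dict[str, float]:
    """Repeat the draw many times and summarize the distribution of outcomes."""
    rng = random.Random(seed)
    profits = [simulate_annual_profit(rng) for _ in range(n_trials)]
    return {
        "mean_profit": statistics.mean(profits),
        "prob_profitable": sum(p > 0 for p in profits) / n_trials,
    }

if __name__ == "__main__":
    summary = run_simulation()
    print(f"mean annual profit: {summary['mean_profit']:,.0f} EUR")
    print(f"probability of a profitable year: {summary['prob_profitable']:.0%}")
```

The value of the approach is that it reports a probability of profit rather than a single point estimate, which is what lets a planner compare scenarios under uncertainty.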
Just do it. (let your tools think for themselves) - Robert Chandler
AI Engineer· 2025-06-10 17:30
Hi, I'm Robert. I'm the co-founder and CTO at Wordware. And at Wordware, I've personally helped hundreds of teams build reliable AI agents. I'm here to share a few of the insights that we got, especially when it comes to tools. Um, really agentic MCPs, giving your tools time to think. Before I worked on uh LLMs and agents, I used to work on self-driving cars, and really, you know, building highly reliable systems is in my blood. So, uh, yeah, here we go. The promise of agents are automated systems that can take ...
Is Okta's 15% Price Drop A Buying Opportunity?
Forbes· 2025-06-05 11:35
Core Insights - Okta, a leading cybersecurity firm specializing in identity and access management, has seen a stock decrease of approximately 10% over the last month despite reporting strong first-quarter earnings that exceeded analyst expectations [2][3] - The company's stock has increased nearly 30% year-to-date, presenting an attractive opportunity for investors [2] Financial Performance - In Q1, Okta's revenue grew 12% year-over-year to $688 million, surpassing the forecast of $678 million to $680 million [3] - Subscription revenue also rose 12% to $673 million, while adjusted EPS increased 24% year-over-year to $0.86 [3] - The company reported positive free cash flow of $238 million for the quarter, marking an 11% year-over-year growth [3] - The net dollar retention rate was 106%, down from 111% a year prior [3] Growth Forecast - Okta has maintained its fiscal 2026 revenue forecast of $2.85 billion to $2.86 billion, indicating a growth of 9-10% [4] - For Q2, the company projects revenue growth of 10% to $710-$712 million, with adjusted EPS of $0.83-$0.84 [4] Market Outlook - The overall cybersecurity market is expected to grow significantly, with investments projected to exceed $298 billion annually by 2028 [5] - Okta's identity management platform is crucial for securing access across various applications, especially as companies adopt cloud-based solutions [5] - Management has noted strong demand for new offerings, such as Identity Governance and Privileged Access [5] Valuation Analysis - Okta has a market capitalization of $17 billion and a price-to-sales (P/S) ratio of approximately 6x based on fiscal 2026 revenue estimates, which is reasonable compared to other cybersecurity stocks [6] - However, trading at 25 times its trailing free cash flow, Okta stock appears somewhat expensive given its low-teens revenue and free cash flow growth [6]
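The valuation figures above can be checked with simple arithmetic. The sketch below uses only numbers quoted in the summary; taking the midpoint of the revenue guidance and backing out trailing free cash flow from the 25x multiple are assumptions made here for illustration, not figures from the article.

```python
# Inputs quoted in the summary (USD).
market_cap_bn = 17.0
fy2026_revenue_low_bn, fy2026_revenue_high_bn = 2.85, 2.86
q1_revenue_mn = 688.0
q1_free_cash_flow_mn = 238.0
price_to_fcf_multiple = 25.0

# Assumption: use the midpoint of the guided revenue range.
fy2026_revenue_mid_bn = (fy2026_revenue_low_bn + fy2026_revenue_high_bn) / 2

# Price-to-sales on fiscal 2026 revenue; should land near the quoted ~6x.
price_to_sales = market_cap_bn / fy2026_revenue_mid_bn

# Share of Q1 revenue converted to free cash flow.
q1_fcf_margin = q1_free_cash_flow_mn / q1_revenue_mn

# Trailing free cash flow implied by a 25x multiple on the market cap.
implied_trailing_fcf_bn = market_cap_bn / price_to_fcf_multiple

if __name__ == "__main__":
    print(f"P/S on FY2026 revenue: {price_to_sales:.2f}x")
    print(f"Q1 free cash flow margin: {q1_fcf_margin:.1%}")
    print(f"Implied trailing FCF: ${implied_trailing_fcf_bn:.2f}bn")
```

The price-to-sales result rounds to the roughly 6x stated in the summary, which confirms the quoted multiple is consistent with the market cap and guidance given.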
Is Agentforce the Next Stop for SaaS? Salesforce Bets on an AI Workflow Revolution
36Kr· 2025-05-23 02:28
Group 1 - Marc Benioff, CEO of Salesforce, envisions a transformative era for enterprise software driven by AI agents and unified data architecture, transitioning from Software as a Service (SaaS) to Service as Software [1][2] - The "digital workforce" revolution is expected to be more disruptive than the cloud and mobile waves of 15 years ago, fundamentally redefining application functionalities [2] - Salesforce's Agentforce and Data Cloud strategies are central to its agentic vision, positioning the company as a potential "pure software hyperscaler" [2] Group 2 - Agentforce is a new AI-driven enterprise agent platform that integrates autonomous or semi-autonomous software assistants into all Salesforce applications, aiming to enhance human productivity [3][4] - Benioff claims that embedding these agents into workflows could lead to a 50% productivity increase across departments, a significant rise from a previously stated 30% [4] - Early customer deployments, such as Disney's use of AI agents for optimizing theme park operations, demonstrate the practical viability of this vision [4] Group 3 - The concept of "agent fluidity" allows AI agents to seamlessly operate across datasets and applications, exemplifying the Service as Software model [5] - Salesforce's Data Cloud serves as a unified real-time data platform, aggregating internal and external data sources into a comprehensive business state map [8][9] - The integration of Data Cloud with core applications like Tableau enhances the effectiveness of AI agents by providing unified real-time data and metadata frameworks [10] Group 4 - Salesforce's strategy emphasizes data fluidity, allowing for federated data integration without requiring all data to be migrated to Salesforce's storage [11][12] - Collaborations with third-party data platforms like Snowflake and Databricks enhance the capabilities of Data Cloud, allowing real-time data queries and integration [12][13] - This open integration strategy positions Salesforce as a key player in modern data architecture, avoiding the pitfalls of data silos [30] Group 5 - Salesforce aims to become the first pure software hyperscaler, leveraging its SaaS platform to achieve scale without the capital-intensive model of traditional hyperscalers [19][20] - The company anticipates reaching an annual revenue of approximately $50 billion this fiscal year, with a focus on maintaining healthy free cash flow [20] - By embedding agents, workflows, and federated datasets into daily operations, Salesforce seeks to establish itself as a neutral orchestration layer in heterogeneous environments [20][21] Group 6 - The competitive landscape includes major players like Microsoft, which poses a significant challenge to Salesforce's ambitions in the AI space [23][24] - Salesforce's strategy of integrating rather than competing with data infrastructure providers like Snowflake and AWS allows it to avoid direct confrontations while enhancing its offerings [29][30] - The company is experiencing strong market response to its AI-driven agents, with over 5,000 organizations deploying the technology shortly after its launch [6][32] Group 7 - Salesforce's ambitious goal is to drive overall productivity improvements exceeding 50% through AI agents, with plans to embed AI capabilities across its entire customer base [35][36] - The next 12 to 24 months are critical for validating Salesforce's strategy and its ability to redefine the cloud economy through software alone [35][36] - If successful, Salesforce could reshape the perception of cloud leaders and establish itself as the preferred platform for enterprise-level AI [34][36]
Z Product | Behind the Globally Viral Manus: A Key AI Product That Lets AI Agents Operate a Browser Like a Human
Z Potentials· 2025-05-18 03:43
Core Insights - The article discusses the innovative technology behind Browser Use, which enables AI agents to automate browser operations seamlessly, addressing challenges faced by AI in web interactions [2][3]. Group 1: Technology and Features - Browser Use is designed to connect AI agents with web pages, allowing for automated operations such as logging in and filling out forms [2]. - It supports automatic rotation of AI agents and allows users to run multiple parallel tasks on demand [3]. - The platform is open-source under the MIT license, making it customizable and free for users to integrate any model [2][3]. - Browser Use has gained significant traction, with over 60,000 stars on GitHub and active contributions from more than 15,000 developers [3][7]. Group 2: Market Potential and Growth - The AI agents market is projected to grow from $5.1 billion in 2024 to $47.1 billion by 2030, with around half of companies expected to deploy agents by 2027 [3]. - The founders of Browser Use are optimistic about the future of AI agents and browser automation, predicting that by the end of 2025, the number of agents on the web may surpass that of humans [3]. Group 3: Performance and Accuracy - Browser Use achieved a success rate of 89.1% in the WebVoyager benchmark across 586 different web tasks, indicating industry-leading accuracy [8]. - Specific success rates for various platforms include 100% on Huggingface, 95% on Google Flights, and 80% on Booking.com [10][11]. Group 4: Funding and Development - Browser Use secured $17 million in seed funding in March 2025, led by Felicis Ventures, with participation from several notable investors [22][23]. - The founders, Magnus Müller and Gregor Zunic, developed the prototype during their master's program at ETH Zurich, initially as a small project that gained rapid popularity [14][23].
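Two of the figures above invite a quick sanity check: the growth rate implied by the market projection, and the number of tasks behind the benchmark percentage. The sketch below derives both from the quoted numbers; the compound annual growth rate formula is standard, but treating 2024 to 2030 as six compounding periods is an assumption about how the projection was framed.

```python
# Market projection quoted in the summary (USD billions).
market_2024_bn = 5.1
market_2030_bn = 47.1
years = 2030 - 2024  # assumed six compounding periods

# Compound annual growth rate: (end / start) ** (1 / years) - 1
cagr = (market_2030_bn / market_2024_bn) ** (1 / years) - 1

# WebVoyager result quoted in the summary.
success_rate = 0.891
total_tasks = 586
approx_tasks_passed = round(success_rate * total_tasks)

if __name__ == "__main__":
    print(f"implied CAGR 2024-2030: {cagr:.1%}")
    print(f"approximate tasks passed: {approx_tasks_passed} of {total_tasks}")
```

Under that assumption the projection implies annual growth in the mid-40% range, and the benchmark score corresponds to roughly 522 of the 586 tasks.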