Responses API

Lessons from GPT-5's Bad Reviews: Users Still Interact with AI the Way They Did in the Last Era
36Kr · 2025-08-21 08:49
Core Insights
- GPT-5 has received mixed reviews since its launch on August 8, with users expressing dissatisfaction despite its technical advances [1][5][7]
- OpenAI's official position is that the problems stem from users not adapting to the interaction model GPT-5 requires, now that it has evolved into a more autonomous "digital mind" [9][78]
- OpenAI released a prompting guide to help users work with GPT-5 more effectively, emphasizing the need for updated communication methods [8][9]

Group 1: Performance and Capabilities
- GPT-5 shows significant improvements in mathematics, coding, and multimodal understanding, demonstrating its capabilities as a "full-stack engineer" [4][13]
- Despite its higher "IQ", GPT-5 is unstable: it sometimes makes errors on simple tasks and lacks emotional intelligence, which has raised concerns about its practical usability [5][6][10]
- OpenAI reported that its Tau-Bench score rose from 73.9% to 78.2%, indicating better efficiency and lower costs [23][24]

Group 2: User Interaction and Guidelines
- The prompting guide outlines four areas in which GPT-5 has evolved: agentic task performance, coding ability, raw intelligence, and steerability, all of which matter for effective user interaction [10][15][17]
- Users are encouraged to adjust parameters such as reasoning effort and verbosity to match GPT-5's behavior to the complexity of the task [53][70]
- The guide describes how users can either constrain or expand GPT-5's autonomy depending on the task, which calls for a more nuanced approach to AI interaction [29][32][36]

Group 3: Challenges and Solutions
- GPT-5's capabilities are double-edged: improper use leads to inefficiency, so users need to become adept "trainers" of the AI [26][27]
- OpenAI stresses clear, structured prompts, because conflicting instructions can degrade performance [54][56]
- The guide offers practical solutions to common user problems, such as managing verbosity and reasoning depth, to improve the overall interaction experience [50][52][68]
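The reasoning effort and verbosity parameters mentioned under Group 2 can be illustrated with a minimal sketch. It assumes the request shape of the OpenAI Responses API (`reasoning.effort` and `text.verbosity`); the helper name `build_request`, the prompts, and the lists of accepted values are illustrative assumptions, not taken from the summarized article, and should be checked against the current API reference.

```python
from typing import Any

# Values the Responses API is assumed to accept for GPT-5 (verify against current docs).
REASONING_EFFORTS = ("minimal", "low", "medium", "high")
VERBOSITY_LEVELS = ("low", "medium", "high")


def build_request(prompt: str, effort: str = "medium",
                  verbosity: str = "medium") -> dict[str, Any]:
    """Build keyword arguments for a Responses API call (hypothetical helper)."""
    if effort not in REASONING_EFFORTS:
        raise ValueError(f"unknown reasoning effort: {effort!r}")
    if verbosity not in VERBOSITY_LEVELS:
        raise ValueError(f"unknown verbosity: {verbosity!r}")
    return {
        "model": "gpt-5",
        "input": prompt,
        "reasoning": {"effort": effort},    # how much the model deliberates
        "text": {"verbosity": verbosity},   # how long the final answer is
    }


# Simple lookup: shallow reasoning, terse answer.
quick = build_request("What is the capital of France?",
                      effort="minimal", verbosity="low")
# Complex planning task: deep reasoning, detailed answer.
deep = build_request("Plan a migration from REST to gRPC.",
                     effort="high", verbosity="high")

# With the official SDK installed and OPENAI_API_KEY set, the call would be:
#   from openai import OpenAI
#   response = OpenAI().responses.create(**quick)
#   print(response.output_text)
```

Keeping the payload construction separate from the network call makes it easy to choose settings per task and to test them without spending tokens.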
AI Deployment Accelerates; High Certainty in the Computing Power Industry Chain
National Business Daily (Mei Ri Jing Ji Xin Wen) · 2025-05-27 00:50
Group 1
- The article's core view is that AI applications and capital expenditures by major companies are both accelerating, a positive trend for the industry [3][4].
- Major AI companies are releasing new models and applications; Google's Gemini series has been upgraded and is set to launch across multiple platforms [3].
- OpenAI's announcement that the Responses API supports MCP is expected to improve the efficiency of AI agent development and agents' interaction capabilities, further driving demand across the AI data center (AIDC) industry chain [3].

Group 2
- In Q1 2025, major overseas companies reported strong capital expenditures: Meta's CAPEX was $13.7 billion (up 104% YoY, down 8% QoQ), Amazon's was $26.3 billion (up 74% YoY, down 7% QoQ), and Google's was $17.2 billion (up 43% YoY, up 20% QoQ) [3].
- Domestic companies also increased capital expenditures significantly year over year: Alibaba's CAPEX was 24.6 billion yuan (up 120.6% YoY, down 22.6% QoQ), and Tencent's was 27.5 billion yuan (up 91% YoY, down 25% QoQ) [4].
- Continued investment in data center (IDC) construction by both domestic and international companies suggests a high level of certainty for the domestic AIDC computing power industry chain [4].
Tencent Research Institute AI Express 20250523
Tencent Research Institute · 2025-05-22 15:09
Group 1: OpenAI Innovations
- OpenAI's Responses API now supports MCP services, letting developers connect external services with a simple configuration and significantly reducing development complexity [1]
- The updated API strengthens security controls through the allowed_tools parameter and permission management, so that agents use tools safely [1]
- New features include image generation, Code Interpreter, file search, background mode, reasoning summaries, and encrypted reasoning items [1]

Group 2: Microsoft's Magentic-UI
- Microsoft launched the open-source web agent project Magentic-UI, which can browse the web, read and write files, and execute code automatically, all under user monitoring and control [2]
- The system uses a collaborative planning and execution mechanism: it generates a task plan for the user to confirm and allows real-time intervention during execution [2]
- The project integrates technologies such as neural style engines, component DNA mapping, and performance prediction for intelligent style conversion and component reuse [2]

Group 3: Mistral's Devstral Model
- Mistral, in collaboration with All Hands AI, released the open-source language model Devstral, which has 24 billion parameters and can run on a single RTX 4090 or a Mac with 32GB of RAM [3]
- Devstral scored 46.8% on the SWE-Bench Verified benchmark, outperforming GPT-4.1-mini and other open-source models and showing strong code understanding and problem-solving abilities [3]
- The model is released under the Apache 2.0 license, which permits commercial use, with pricing set at $0.10 per million input tokens and $0.30 per million output tokens [3]

Group 4: xAI's Live Search API
- xAI introduced the Live Search API, which gives Grok AI real-time data access, including the latest information from the X platform, web content, and breaking news [4][5]
- The API offers flexible search controls, including enabling or disabling search, limiting the number of results, and specifying time ranges and domains, and it works with DeepSearch to display the reasoning process [5]
- A Python SDK is available, with a free beta until June 5, 2025, so developers can implement real-time information queries and research assistance [5]

Group 5: OpenAI's Acquisition of Jony Ive's Team
- OpenAI acquired the AI device startup io for $6.5 billion, gaining a hardware team led by former Apple Chief Design Officer Jony Ive; the deal is expected to close by summer [6]
- io is developing new forms of AI devices aimed at reducing screen time, including headphones, wearables, and AI home devices, with a projected release in 2026 [6]
- The associated company LoveFrom will continue to operate independently while taking on more design responsibilities for OpenAI, including the ChatGPT interface and voice interaction products [6]

Group 6: Kunlun Wanwei's Skywork Super Agents
- Kunlun Wanwei launched Skywork Super Agents, which integrates five expert agents and one general agent for one-stop generation of documents, PPTs, and spreadsheets [7]
- The product is built on deep research technology and supports deep information retrieval and traceable content generation at only 40% of OpenAI's costs; the framework has been open-sourced [7]
- System features include automated requirement clarification, information tracing, and a personal knowledge base, to which users can upload files in various formats [7]

Group 7: Microsoft's Aurora Model
- Microsoft introduced Aurora, the first large-scale atmospheric foundation model, trained on millions of hours of atmospheric data and running 5000 times faster than the most advanced numerical forecasting systems [8]
- Aurora excels at predicting air quality, wave patterns, tropical cyclone trajectories, and high-resolution weather, and it remains accurate even in data-scarce regions and extreme weather [8]
- The model uses a 3D Swin Transformer architecture and can be fine-tuned for different application areas, with a training cycle of only 4-8 weeks; it is planned to expand into ocean circulation and seasonal weather prediction [8]

Group 8: Gartner's Principles for Intelligent Applications
- Gartner says GenAI will move enterprise software from auxiliary tools to intelligent agents, and it outlines five principles for building intelligent applications: adaptive experience, embedded intelligence, autonomous orchestration, interconnected data, and composable architecture [9]
- Intelligent applications emphasize personalized experiences and proactive services; they carry out cross-system tasks through natural language interaction, with AI capabilities embedded deeply in business logic to optimize processes [9]
- Enterprises need to invest evenly across the five principles while upgrading their foundational data, processes, architecture, and experiences, so that intelligent applications move from pilot demonstrations to value at scale [9]

Group 9: a16z's Insights on AI Programming
- AI coding has become the second-largest AI market after chatbots, valued at approximately $3 trillion, with developers, as early technology adopters, rapidly embracing these tools [10]
- AI programming will not completely replace traditional programming: understanding foundational abstractions and system architecture remains crucial, while developer roles shift toward product management or QA engineering [10]
- New users and methods are fostering a new software paradigm, similar to the WordPress era: AI lowers the barrier to "writing code," yet the depth and complexity of software development still require professional knowledge [10]
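The MCP configuration and the allowed_tools restriction described under Group 1 of this digest can be sketched as follows. This is a minimal illustration that assumes the Responses API's `mcp` tool shape (`server_label`, `server_url`, `allowed_tools`, `require_approval`); the helper `mcp_tool`, the server URL, and the tool names are hypothetical placeholders, not values from the digest.

```python
from typing import Any


def mcp_tool(server_label: str, server_url: str, allowed_tools: list[str],
             require_approval: str = "always") -> dict[str, Any]:
    """Describe a remote MCP server as a Responses API tool (hypothetical helper)."""
    if not server_url.startswith("https://"):
        raise ValueError("expected an https:// MCP server URL")
    if require_approval not in ("always", "never"):
        raise ValueError("require_approval must be 'always' or 'never'")
    return {
        "type": "mcp",
        "server_label": server_label,
        "server_url": server_url,
        # Only the tools listed here are exposed to the model.
        "allowed_tools": list(allowed_tools),
        # "always" asks the caller to approve each tool call before it runs.
        "require_approval": require_approval,
    }


# Placeholder server and tool names, for illustration only.
docs_server = mcp_tool(
    server_label="docs",
    server_url="https://mcp.example.com/sse",
    allowed_tools=["search_docs", "fetch_page"],
)

request = {
    "model": "gpt-4.1",
    "input": "Find the section on rate limits.",
    "tools": [docs_server],
}

# With the official SDK installed and OPENAI_API_KEY set, the call would be:
#   from openai import OpenAI
#   response = OpenAI().responses.create(**request)
```

Defaulting `require_approval` to "always" is a conservative choice for this sketch: an agent cannot call a remote tool without the caller's explicit consent unless that default is overridden.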
OpenAI Opens Its Toolkit, Accelerating Agent Deployment
Guotai Junan Securities · 2025-03-14 11:29
Investment Rating
- The report assigns an "Overweight" rating to the industry, unchanged from the previous rating [2].

Core Insights
- OpenAI has launched four AI agent tools, including the new Responses API and the open-source Agents SDK, which simplify the development of AI agent applications and strengthen product development capabilities [4].
- The new tools accelerate the spread of agents into diverse applications, answer competitors such as Google, Microsoft, and Alibaba, and reinforce the competitive edge OpenAI established with its first AI agent product, Operator [9].
- Recommended stocks include Dingjie Zhizhi, Foxit Software, and iFlytek; beneficiaries include Fanwei Network, Maifushi, and Rundamedical [9].

Summary by Sections

Investment Highlights
- The Responses API includes web search and document (file) search tools; the web search tool achieves higher accuracy than competitors and significantly reduces manual search costs [9].
- The computer use tool, powered by the CUA model, automates actions such as clicking and scrolling, extending how agents interact with the outside world [9].
- The open-source Agents SDK improves agent collaboration and supports customizable LLMs and tools, which simplifies multi-threaded task processing and reduces developer workload [9].
- OpenAI follows a "free API + paid tools" business model: free offerings expand its market while paid tools recoup R&D costs, providing a sustainable financial foundation for future growth [9].
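The web search and file search tools described under Investment Highlights are attached to a Responses API request as entries in a `tools` list. The sketch below assumes the tool type names `web_search_preview` (the name used at the March 2025 launch, which may differ in later API versions) and `file_search` with `vector_store_ids`; the helper `search_tools` and the vector store ID are hypothetical placeholders.

```python
from typing import Any, Optional


def search_tools(vector_store_ids: Optional[list[str]] = None,
                 web: bool = True) -> list[dict[str, Any]]:
    """Assemble built-in search tools for a Responses API call (hypothetical helper)."""
    tools: list[dict[str, Any]] = []
    if web:
        # Assumed launch-era type name; verify against the current API reference.
        tools.append({"type": "web_search_preview"})
    if vector_store_ids:
        # File search queries documents previously uploaded to vector stores.
        tools.append({"type": "file_search",
                      "vector_store_ids": list(vector_store_ids)})
    return tools


# "vs_example123" is a placeholder, not a real vector store ID.
request = {
    "model": "gpt-4o",
    "input": "Summarize what our internal docs and the web say about MCP.",
    "tools": search_tools(vector_store_ids=["vs_example123"]),
}

# With the official SDK installed and OPENAI_API_KEY set, the call would be:
#   from openai import OpenAI
#   response = OpenAI().responses.create(**request)
#   print(response.output_text)
```

Passing `web=False` or omitting the vector store IDs drops the corresponding tool, so the same helper covers web-only, file-only, and combined search.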