Gemini API
Gemini 3 Flash is now rolling out in the Gemini App, AI Mode in Search and our developer tools.
Google· 2025-12-17 20:31
With Gemini 3 Flash, we're combining a strong foundation of reasoning, multimodal, and vision understanding with greater speed and efficiency. In other words, it's the incredible reasoning of Gemini 3 Pro, which we launched last month, but with the speed of a Flash model. In Search, Gemini 3 Flash is now beginning to roll out to everyone as the default model for AI Mode. It can better understand the intent of your questions and uses real-time information and links from across the web to give you thoughtful, ...
AI Development Trends Through the Earnings Reports of Overseas Cloud Giants: Growth Logic and Transmission Path Amid Surging CAPEX
Sou Hu Cai Jing· 2025-11-18 09:28
Group 1: Capital Expenditure Analysis
- In Q3 2025, the four major cloud service providers (CSPs), Amazon (AWS), Microsoft (Azure), Google (GCP), and Meta, experienced unprecedented capital expenditure (CAPEX) expansion driven by AI, with a total CAPEX nearing $120 billion, reflecting a year-on-year growth rate exceeding 50% [1]
- Microsoft led with a CAPEX of $34.9 billion, a 75% increase year-on-year, focusing on AI data centers and GPU/CPU procurement [1]
- Google followed with $24 billion in CAPEX, an 83% increase, with 60% directed towards servers and chips [1]

Group 2: CAPEX to Revenue Transmission Path
- The transformation of cloud business capital expenditure into revenue is a multi-stage, non-linear process involving capacity construction, revenue conversion, and profit optimization [2]

Group 3: Capacity Building Phase
- The initial phase focuses on building physical infrastructure, with investments concentrated on data center construction, AI chip procurement, and high-speed network deployment [3]
- Key indicators in this phase are physical capacity metrics rather than financial data, highlighting the urgency of AI computing power demand [3]

Group 4: Revenue Conversion Phase
- Once capacity is built, the monetization phase begins, converting available capacity into revenue through traditional cloud services, AI infrastructure services, and AI application services [4][5]
- The efficiency in this phase is determined by capacity utilization and revenue conversion rates [4]

Group 5: Scale Effect Phase
- The third phase focuses on maximizing profits through scale effects, achieved by diluting fixed costs, increasing the share of high-margin services, and optimizing pricing strategies [6][7]
- The overall logic chain of cloud business CAPEX transmission is "capital investment → capacity formation → efficient monetization" [7]

Group 6: Cloud Business Performance
- In Q3 2025, cloud business growth was strong, with Microsoft reporting $30.9 billion in intelligent cloud revenue, a 28% year-on-year increase, driven by increased capacity and large client orders [8][10]
- Google Cloud's revenue reached $15.2 billion, a 33.5% increase, with a significant improvement in operating profit margin to 23.7% [8][10]
- Amazon AWS achieved $33 billion in revenue, a 20% increase, with a notable order backlog of $20 billion [9][11]

Group 7: Challenges in AI Cloud Services
- The industry faces a severe supply-demand imbalance, with AI computing power demand growing exponentially while infrastructure development lags [12]
- Profitability pressures are increasing, with varying operating profit margins among CSPs, highlighting concerns over the sustainability of high capital expenditures [13]
- Two strategic paths have emerged among leading AI cloud providers: "full-stack self-research" and "cloud + ecosystem," each with distinct advantages and challenges [14]

Group 8: Conclusions and Insights
- The global cloud computing industry is transitioning from "scale-driven" to "quality-driven," with AI significantly enhancing growth elasticity while testing capital efficiency [18]
- Short-term focus should be on AI conversion efficiency and profitability structure, while long-term considerations should include technology routes and strategic resilience [17][18]
- Future investment logic will favor companies with strong capital discipline and clear commercialization paths [18]
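The year-on-year figures in this summary can be turned back into implied prior-year values with simple arithmetic. The sketch below is only a back-of-the-envelope check on the numbers quoted above; the function names are illustrative and the results inherit the rounding of the reported percentages.

```python
def prior_year_value(current: float, yoy_growth_pct: float) -> float:
    """Back out the year-ago value implied by a year-on-year growth rate."""
    return current / (1 + yoy_growth_pct / 100)


def operating_profit(revenue: float, margin_pct: float) -> float:
    """Operating profit implied by revenue and an operating margin."""
    return revenue * margin_pct / 100


# Figures in billions of USD, taken from the summary above.
msft_capex_prior = prior_year_value(34.9, 75)     # Microsoft CAPEX a year earlier
goog_capex_prior = prior_year_value(24.0, 83)     # Google CAPEX a year earlier
gcloud_op_profit = operating_profit(15.2, 23.7)   # Google Cloud operating profit

print(f"Microsoft prior-year CAPEX: ~${msft_capex_prior:.2f}B")
print(f"Google prior-year CAPEX:    ~${goog_capex_prior:.2f}B")
print(f"Google Cloud op. profit:    ~${gcloud_op_profit:.2f}B")
```

These are implied values only; the underlying filings may report slightly different figures.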
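A Free, Open-Source Daily Report Generator: Captures Actions, Analyzes Activity, and Exports in One Click (Even Your Boss Will Approve)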
菜鸟教程· 2025-11-17 03:30
Core Insights
- The article introduces Dayflow, an AI tool designed to automatically record computer activities and summarize daily work, alleviating the burden of writing daily reports [2][5].

Features of Dayflow
- Dayflow records one frame per second, analyzing activities every 15 minutes to create a condensed timeline of daily tasks [5][8].
- The tool offers a customizable dashboard that allows users to track work-related trends and insights [7].
- It provides automatic time-lapse features, enabling users to review their day and identify moments of distraction [8].
- The application includes a daily journal feature for reflecting on captured highlights and adding notes or screenshots [10].

System Compatibility and Installation
- Currently, Dayflow is only compatible with macOS, and users can download it from the official GitHub page or install it via Homebrew [12].
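The capture cadence described above (one frame per second, analyzed in 15-minute batches) implies 900 frames per analysis window. The following is a minimal sketch of that batching idea only; it is not Dayflow's actual implementation, and every name in it is hypothetical.

```python
from collections import defaultdict

FRAME_INTERVAL_S = 1      # one frame per second (from the summary)
WINDOW_S = 15 * 60        # analysis runs every 15 minutes (from the summary)


def frames_per_window() -> int:
    """Number of captured frames that fall into one analysis window."""
    return WINDOW_S // FRAME_INTERVAL_S


def group_into_windows(timestamps: list[int]) -> dict[int, list[int]]:
    """Bucket frame timestamps (seconds since start) by 15-minute window index."""
    windows: dict[int, list[int]] = defaultdict(list)
    for ts in timestamps:
        windows[ts // WINDOW_S].append(ts)
    return dict(windows)


# Thirty minutes of capture yields two windows of 900 frames each.
half_hour = list(range(0, 30 * 60, FRAME_INTERVAL_S))
batches = group_into_windows(half_hour)
print({index: len(frames) for index, frames in batches.items()})
```

Each batch would then be handed to whatever analysis step summarizes the window; that step is outside the scope of this sketch.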
X @Demis Hassabis
Demis Hassabis· 2025-11-09 23:10
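Product Announcement
- The Gemini API has launched a File Search tool, a managed RAG solution that offers free storage and free query-time embeddings [1]
- The approach is intended to significantly simplify the path to context-aware AI systems [1]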
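To make concrete what a managed RAG service takes off a developer's hands, here is a toy version of the retrieval step written from scratch. It does not use or imitate the Gemini API: the bag-of-words "embedding", the function names, and the sample chunks are all assumptions chosen to keep the example self-contained.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list[str], k: int = 1) -> list[str]:
    """Return the k stored chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def build_prompt(query: str, chunks: list[str]) -> str:
    """Prepend the retrieved context to the question before calling a model."""
    context = "\n".join(retrieve(query, chunks))
    return f"Context:\n{context}\n\nQuestion: {query}"


chunks = [
    "Veo 3.1 outputs video at 24 frames per second",
    "URL Context accepts up to 20 URLs per request",
]
print(build_prompt("how many URLs per request", chunks))
```

In a hand-built pipeline, chunking, embedding, storage, and ranking are all the developer's responsibility; the summary describes File Search as a hosted alternative to that work.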
X @Demis Hassabis
Demis Hassabis· 2025-10-18 01:19
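Product Innovation
- The Gemini API has introduced grounding with Google Maps, combining Gemini with data on 250 million places to create entirely new experiences [1]
- Integrating capabilities such as Maps and Search into a single experience is powerful [1]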
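Just In: The Reigning AI Video Model Gets a Major Update, Taking On Sora Head-On, and Will Smith Eating Noodles Looks Better Than Ever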
创业邦· 2025-10-16 03:23
Core Insights
- OpenAI recently launched the Sora 2 video generation model, while Google upgraded its Veo model to version 3.1, indicating a competitive landscape in AI video generation technology [4][41].

Group 1: Google Veo 3.1 Upgrade
- The upgrade includes enhanced video editing capabilities, allowing users to make more precise adjustments to video segments [5].
- New features such as "Ingredients to Video," "Frames to Video," and "Extend" now incorporate audio, making audio a part of the creative process [7][11].
- Veo 3.1 shows significant improvements in prompt understanding and audiovisual quality, resulting in more natural transitions from images to videos [8].

Group 2: User Functionality
- Users can define characters and styles using multiple reference images, which the "Ingredients to Video" feature utilizes to generate final scenes [13].
- The "Frames to Video" feature allows for seamless transitions between starting and ending frames, beneficial for artistic projects [15].
- The "Extend" feature can generate content longer than one minute, maintaining narrative continuity based on previous segments [17].

Group 3: Output Formats and User Engagement
- Veo 3.1 now supports both horizontal and vertical video formats, adapting to current content consumption trends [19].
- Since the launch of Flow in May, users have created over 275 million videos, leading to the introduction of new editing features like "Insert New Elements" and "Remove Objects" for more flexible video editing [20].

Group 4: Application Scenarios
- Practical applications of Veo 3 include generating first-person perspective videos, ASMR fruit slicing, and night vision monitoring videos [24].
- The model has been used to create product advertisement videos, showcasing its ability to deliver high-quality visual content [30].

Group 5: Performance Comparison
- While Veo 3.1 excels in photo-realistic and commercial content generation, it still has room for improvement in accurately replicating specific artistic styles, such as anime [40].
- The rapid iteration of video generation models like Veo 3.1 and Sora 2 suggests a fast-evolving market, with potential for widespread adoption in various content creation platforms [41][42].
Just In: Google's Veo 3.1 Gets a Major Update, Taking On Sora 2 Head-On
机器之心· 2025-10-16 00:51
Core Insights
- Google has released its latest AI video generation model, Veo 3.1, which enhances audio, narrative control, and visual quality compared to its predecessor, Veo 3 [2][3]
- The new model introduces native audio generation capabilities, allowing users to better control the emotional tone and narrative pacing of videos during the creation phase [10]

Enhanced Audio and Narrative Control
- Veo 3.1 improves support for dialogue, environmental sound effects, and other audio elements, allowing for a more immersive video experience [5]
- Core functionalities in Flow, such as "Frames to Video" and "Ingredients to Video," now support native audio generation, enabling users to create longer video clips that can extend beyond the original 8 seconds to 30 seconds or even longer [6][9]

Richer Input and Editing Capabilities
- The model accepts various input types, including text prompts, images, and video clips, and supports up to three reference images to guide the final output [12]
- New features like "Insert" and "Remove" allow for more precise editing, although not all functionalities are immediately available through the Gemini API [13]

Multi-Platform Deployment
- Veo 3.1 is accessible through several existing Google AI services and is currently in a preview phase, available only in the paid tier of the Gemini API [15][16]
- The pricing structure remains consistent with the previous Veo model, charging only after successful video generation, which aids in budget predictability for enterprise teams [16][21]

Technical Specifications and Output Control
- The model supports video output at 720p or 1080p resolution with a frame rate of 24 frames per second [18]
- Users can upload product images to maintain visual consistency throughout the video, simplifying the creative production process for branding and advertising [19]

Creative Applications
- Google's Flow platform serves as an AI-assisted movie creation tool, while the Gemini API is aimed at developers looking to integrate video generation features into their applications [20]
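The constraints quoted in this summary (720p or 1080p output, 24 frames per second, at most three reference images, an 8-second base clip) are easy to check on the client side before sending a request. The sketch below is a hypothetical validator written for illustration; the class, field names, and functions are assumptions and do not mirror the real Gemini API request schema.

```python
from dataclasses import dataclass, field

# Limits as stated in the article summary above.
ALLOWED_RESOLUTIONS = {"720p", "1080p"}
FPS = 24
MAX_REFERENCE_IMAGES = 3


@dataclass
class VideoRequest:
    """Hypothetical request object; not the actual API payload."""
    prompt: str
    resolution: str = "720p"
    reference_images: list[str] = field(default_factory=list)

    def validate(self) -> None:
        """Raise ValueError if the request exceeds the documented limits."""
        if self.resolution not in ALLOWED_RESOLUTIONS:
            raise ValueError(f"unsupported resolution: {self.resolution}")
        if len(self.reference_images) > MAX_REFERENCE_IMAGES:
            raise ValueError("at most three reference images are supported")


def frame_count(duration_s: int) -> int:
    """Frames in a clip of the given length at the fixed frame rate."""
    return duration_s * FPS


VideoRequest("a product shot on a white table", "1080p", ["bottle.png"]).validate()
print(frame_count(8))  # frames in the original 8-second clip
```

Failing fast on the client avoids spending a round trip on a request the service would reject anyway.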
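"Free Credit" Turns Into a 400,000-Yuan Debt? Student Accidentally Leaks Gemini API Key and Is Hit With a Huge Bill: Developer Community Erupts, Google Ultimately Waives the Charges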
3 6 Ke· 2025-09-28 07:13
Core Points
- A student from Georgia accidentally leaked his Google Cloud Gemini API Key on GitHub, leading to a bill of $55,444 in just a few months due to malicious usage [1][3][9]
- The incident sparked discussions among developers regarding Google's lack of a hard spending cap and the need for better user protection mechanisms [2][6][8]

Incident Details
- The student registered for Google Cloud using a school email, intending to utilize the $300 free credit for learning experiments, but only consumed $80 before the leak occurred [3][4]
- The API Key was exposed on June 6, and the student was unaware of the issue until September 7, when he was alerted by another GitHub user [3][5]
- The bill accumulated in three phases: $732 in June, over $31,000 in August, and an additional $21,000 from September 1 to 7 [4][7]

Google's Response
- Upon discovering the issue, the student contacted Google Cloud support and provided evidence, but Google stated that the bill would not be canceled or modified [5][6]
- The student expressed that the bill represented decades of income for him, highlighting the severe financial impact of the situation [6]

Developer Community Reaction
- The incident led to widespread discussion among developers, questioning why Google does not implement a hard spending limit and only provides alerts [8]
- Some developers shared their own experiences and suggested best practices to prevent similar issues, such as limiting API call quotas and using tools to scan for leaked keys [8]

Resolution
- Ultimately, after increased attention from the developer community, Google Cloud's billing team reviewed the case again and waived the entire $55,444 bill on September 25 [9]
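One of the practices mentioned above, scanning a repository for leaked keys before pushing, can be sketched in a few lines. The pattern below (`AIza` followed by 35 URL-safe characters) is the shape commonly cited for Google API keys and is used here as an assumption, not an official specification; dedicated secret scanners cover far more cases than this minimal example.

```python
import re
from pathlib import Path

# Assumed key shape: "AIza" followed by 35 letters, digits, "_" or "-".
KEY_PATTERN = re.compile(r"AIza[0-9A-Za-z_\-]{35}")


def find_keys(text: str) -> list[str]:
    """Return every substring that looks like a Google API key."""
    return KEY_PATTERN.findall(text)


def scan_tree(root: str) -> dict[str, list[str]]:
    """Walk a directory and report files containing key-like strings."""
    hits: dict[str, list[str]] = {}
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        try:
            text = path.read_text(errors="ignore")
        except OSError:
            continue  # unreadable file; skip rather than abort the scan
        found = find_keys(text)
        if found:
            hits[str(path)] = found
    return hits


# A made-up string with the right shape, not a real credential.
fake_key = "AIza" + "A" * 35
print(find_keys(f'API_KEY = "{fake_key}"'))
```

Running a check like `scan_tree(".")` before each commit, and loading the real key from an environment variable instead of source code, addresses the failure mode described in the incident.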
Google - Communacopia + Technology Conference 2025 - Key Takeaways
2025-09-11 12:11
Summary of Alphabet Inc. (GOOGL) Conference Call

Company Overview
- **Company**: Alphabet Inc. (GOOGL)
- **Event**: Communacopia + Technology Conference 2025
- **Presenter**: Google Cloud CEO Thomas Kurian

Key Industry Insights
- **Cloud Adoption**: There is a long runway for cloud adoption and future migrations to public cloud, driven primarily by organizations seeking to transform their businesses through AI products and solutions offered in the cloud [2][5]
- **AI Systems**: Google Cloud's AI systems are designed for high performance, reliability, and scalability in both training and inference [2][5]
- **Revenue Diversification**: The company has developed a diversified revenue base with 13 product lines generating over $1 billion in annual revenue each [2][5]

Core Company Strategies
- **Monetization of AI**: Management outlined multiple monetization strategies for AI, including consumption, subscription, increased usage, value-based pricing, and premium upsell [2][5][6]
- **Product Development**: Focus on building domain-specific enterprise agents across five areas: code/data/security, creativity/collaboration, specific application domains, specific industries, and chat & agent platforms [5][6]
- **Generative AI**: Commitment to expanding enterprise access to models, offering a suite of 182 leading models, including large-scale models for generative AI applications [5][6]

Financial Performance and Projections
- **Operating Margins**: Improvement in operating margins and profitability as Google Cloud expands its customer base and product usage [6]
- **Cost Optimization**: Early decisions to develop proprietary chips and models have led to cost optimization and efficiency [6]
- **Price Target**: The 12-month price target for GOOGL is set at $234, with a current price of $239.63, indicating a downside potential of 2.3% [8]

Financial Metrics (Projected)
- **Revenue Growth**: Projected revenues of $295.1 billion in 2025, increasing to $424.4 billion by 2027 [8]
- **EBITDA**: Expected EBITDA growth from $127.7 billion in 2025 to $206.9 billion in 2027 [8]
- **EPS Growth**: Projected EPS growth from $8.04 in 2025 to $11.56 in 2027 [8]

Risks and Challenges
- **Competitive Landscape**: Risks include competition affecting product utility and advertising revenues [7]
- **Market Disruption**: Potential headwinds from industry disruption impacting monetizable search [7]
- **Regulatory Scrutiny**: Exposure to regulatory scrutiny and changes in industry practices that could alter business model prospects [7]
- **Macroeconomic Factors**: Vulnerability to global macroeconomic volatility and investor risk appetite for growth stocks [7]

Conclusion
- **Investment Rating**: The company is rated as a "Buy" with a focus on its strong growth potential in cloud and AI sectors, despite facing various risks and competitive challenges [6][7]
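The price-target gap and the multi-year projections above reduce to two small formulas: percentage change from the current price, and compound annual growth rate (CAGR) over the two years from 2025 to 2027. The sketch below only reproduces that arithmetic from the quoted figures; the helper names are illustrative.

```python
def pct_change(current: float, target: float) -> float:
    """Percentage move from the current price to the target price."""
    return (target - current) / current * 100


def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate, as a percentage."""
    return ((end / start) ** (1 / years) - 1) * 100


# Inputs from the summary: prices in USD, revenue and EBITDA in billions.
downside = pct_change(239.63, 234.0)
revenue_cagr = cagr(295.1, 424.4, 2)
ebitda_cagr = cagr(127.7, 206.9, 2)
eps_cagr = cagr(8.04, 11.56, 2)

print(f"Target vs. current price: {downside:.1f}%")
print(f"Revenue CAGR 2025-2027:   {revenue_cagr:.1f}%")
print(f"EBITDA CAGR 2025-2027:    {ebitda_cagr:.1f}%")
print(f"EPS CAGR 2025-2027:       {eps_cagr:.1f}%")
```

The first result matches the 2.3% downside stated in the summary; the growth rates are derived here and do not appear in the source.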
AI Reading Web Pages Is Truly Different This Time: Google Gemini Unlocks a New Skill for Explaining Web Pages in Detail
机器之心· 2025-09-02 03:44
Core Viewpoint
- Google is returning to its core business of search by introducing the Gemini API's URL Context feature, which allows AI to "see" web content like a human [1].

Group 1: URL Context Functionality
- The URL Context feature enables the Gemini model to access and process content from URLs, including web pages, PDFs, and images, with a content limit of up to 34MB [1][5].
- Unlike traditional methods where AI reads only summaries or parts of a webpage, URL Context allows for deep and complete document parsing, understanding the entire structure and content [5][6].
- The feature supports various file formats, including PDF, PNG, JPEG, HTML, JSON, and CSV, enhancing its versatility [7].

Group 2: Comparison with RAG
- URL Context Grounding is seen as a significant advancement over the traditional Retrieval-Augmented Generation (RAG) approach, which involves multiple complex steps such as content extraction, chunking, vectorization, and storage [11][12].
- The new method simplifies the process, allowing developers to achieve accurate results with minimal coding, eliminating the need for extensive data processing pipelines [13][14].
- URL Context can accurately extract specific data from documents, such as financial figures from a PDF, which would be impossible with just summaries [14].

Group 3: Operational Mechanism
- URL Context uses a two-step retrieval process to balance speed, cost, and access to the latest data, first attempting to retrieve content from an internal index cache [25].
- If the URL is not cached, it performs real-time scraping to obtain the content [25].
- The pricing model is straightforward, charging based on the number of tokens processed from the content, encouraging developers to provide precise information sources [27].

Group 4: Limitations and Industry Trends
- URL Context has limitations: it cannot access content behind paywalls or content that requires specialized tools, such as YouTube videos, and it can process at most 20 URLs at once [29].
- The emergence of URL Context indicates a trend where foundational models are increasingly integrating external capabilities, reducing the complexity previously handled by application developers [27].
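The two-step retrieval described above (try an index cache first, fall back to a live fetch) together with the stated limits of 20 URLs and 34MB can be illustrated with a small cache-aside sketch. This is not Google's implementation: the function names are hypothetical, the live fetcher is injected so the example runs offline, and treating 34MB as 34 × 1024 × 1024 bytes is an assumption.

```python
from typing import Callable

MAX_URLS = 20                     # per-request URL limit from the summary
MAX_BYTES = 34 * 1024 * 1024      # 34MB content limit; byte interpretation assumed


def retrieve(url: str, cache: dict[str, bytes],
             fetch_live: Callable[[str], bytes]) -> tuple[bytes, str]:
    """Step 1: look in the cache. Step 2: fetch live and store the result."""
    if url in cache:
        return cache[url], "cache"
    content = fetch_live(url)
    if len(content) > MAX_BYTES:
        raise ValueError(f"content from {url} exceeds the size limit")
    cache[url] = content
    return content, "live"


def retrieve_all(urls: list[str], cache: dict[str, bytes],
                 fetch_live: Callable[[str], bytes]) -> dict[str, tuple[bytes, str]]:
    """Apply the per-request URL limit, then resolve each URL in turn."""
    if len(urls) > MAX_URLS:
        raise ValueError(f"at most {MAX_URLS} URLs per request")
    return {url: retrieve(url, cache, fetch_live) for url in urls}


# Stand-in fetcher so the sketch needs no network access.
def fake_fetch(url: str) -> bytes:
    return f"<html>{url}</html>".encode()


cache: dict[str, bytes] = {"https://example.com/a": b"<html>cached</html>"}
results = retrieve_all(["https://example.com/a", "https://example.com/b"], cache, fake_fetch)
print({url: source for url, (_, source) in results.items()})
```

A second request for the same uncached URL is then served from the cache, which is the speed and cost benefit the summary attributes to the first step.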