AI Studio
Master Zang Uses Nano Banana Pro to Take You Anywhere You Want to Go
歸藏的AI工具箱· 2025-11-25 12:59
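A few days ago, shortly after Nano Banana Pro was released, I saw an official use case from DeepMind one morning: give Nano Banana Pro a latitude and longitude and have it generate a photo of that place directly. This relies mainly on Nano Banana Pro's real-time retrieval ability, which lets it work out the exact location from the coordinates. I tried it and it really works. I also added a matching watermark to the generated photos and had the model look up the current time and weather at that location, so the photos better match the real scene. Then I thought, since Gemini 3 is this capable, why not build a watermark camera? So I put one together with a single sentence in Build mode in AI Studio. After that a friend suggested adding portrait check-ins, so I added them, and from there I couldn't stop and kept adding all kinds of fun features. This "tap anywhere you want to go" camera is now completely finished, so let me introduce it. I also made an official website for it (https://bananacamera.trickle.host/). You can trigger image generation by searching for an address or by clicking a location on the map. Image generation currently has two modes. Scenery mode generates only the current scenery at the chosen location: it looks up the current time and weather there and generates a matching landscape photo. A brief introduction and ...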
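The coordinate-based prompting described above can be sketched in a few lines of Python. This is only an illustration: the function name `build_scenery_prompt` and the prompt wording are assumptions, not the author's actual prompt, and the resulting string would still have to be sent to an image model that has search enabled.

```python
def build_scenery_prompt(lat: float, lon: float, watermark: bool = True) -> str:
    """Build a text prompt asking an image model for a photo of a location.

    Hypothetical helper: the wording is illustrative, not the author's prompt.
    """
    # Reject coordinates outside the valid geographic ranges.
    if not -90.0 <= lat <= 90.0:
        raise ValueError(f"latitude out of range: {lat}")
    if not -180.0 <= lon <= 180.0:
        raise ValueError(f"longitude out of range: {lon}")

    parts = [
        f"Generate a realistic photo of the place at latitude {lat:.4f}, "
        f"longitude {lon:.4f}.",
        "Look up the current local time and weather there and make the "
        "lighting and sky match them.",
    ]
    # The watermark request is optional, mirroring the author's add-on step.
    if watermark:
        parts.append(
            "Add a small watermark showing the place name, local time, and weather."
        )
    return " ".join(parts)


prompt = build_scenery_prompt(48.8584, 2.2945)
print(prompt)
```

Validating the coordinates before building the prompt keeps obviously bad input (for example, swapped or mistyped values) from reaching the model.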
36-Month Turnaround: He Storms Back with Google's AI, and World Models Are Next
36Ke· 2025-11-20 23:53
Core Insights
- The competition in the AI model landscape is intensifying, with Google's Gemini 3 Pro recently surpassing Elon Musk's Grok 4.1 to claim the top spot in various rankings [1][3][7].

Group 1: Gemini 3's Capabilities and Impact
- Gemini 3 is highlighted for its advanced reasoning, multimedia processing, and coding abilities, enhancing Google's existing products, particularly its lucrative search business [7][8].
- The introduction of AI Overviews has led to a 10% increase in search query volume, while visual search capabilities have surged by 70% due to Gemini's photo analysis [8].
- Gemini 3 is positioned as a foundational model for Google's product ecosystem, integrating AI into various services like Google Maps, Gmail, and cloud services [8][12].

Group 2: Competitive Landscape and Market Position
- Google has made significant investments in AI, leading to breakthroughs that have allowed it to catch up with competitors like OpenAI, which initially disrupted its core search business [9][10].
- The monthly active users of Gemini applications have exceeded 650 million, indicating strong user engagement compared to ChatGPT's 700-800 million weekly active users [12].
- Gemini 3 has outperformed OpenAI's GPT-5 in several benchmarks, particularly in reasoning and long-term planning, enhancing its practical capabilities [12].

Group 3: Future Directions and AGI Aspirations
- Google aims to develop a comprehensive model that excels in various domains, which is seen as a crucial step towards achieving Artificial General Intelligence (AGI) [13][14].
- The company is focused on refining the Gemini model to improve its programming, reasoning, and mathematical capabilities, with future iterations expected to be more efficient and cost-effective [13][14].
- The timeline for achieving AGI is projected to be 5 to 10 years, with Gemini 3 serving as a pivotal platform for future advancements [14][15].

Group 4: Economic Viability and AI Bubble Concerns
- Despite concerns about an AI bubble, Google is well-positioned due to its solid revenue streams and the strategic role of DeepMind in enhancing its AI capabilities [15][17].
- The integration of AI into existing Google services is already yielding tangible returns, enhancing the performance of search, YouTube, and cloud services [16][17].
Vibe coding with Gemini 3 — live from Mountain View
Google· 2025-11-20 13:48
Gemini 3 Capabilities & Features
- Showcases Gemini 3's ability to bring any idea to life in AI Studio through vibe coding [1]
- Highlights various functionalities of AI Studio's build tab, including one-shot, iteration, deployment, sharing, and "I'm Feeling Lucky" features [1]
- Demonstrates features like Nano Banana, Maps Grounding, and Live API within AI Studio [1]
- Showcases Gemini 3's multimodal capabilities, including video understanding [1]
- Illustrates Gemini 3's multilingual and reasoning capabilities [1]

AI Studio Applications & Demos
- Presents a hand-tracking rhythm game demo [1]
- Features an interactive choreography app created through vibe coding [1]
- Demonstrates Google Maps Grounding with a running app demo [1]
- Showcases the use of Gemini 3 for web design and aesthetics [1]
- Presents a demo of bringing images to apps [1]
- Visualizes quantum research papers [1]
- Showcases interactive floor plans and 3D home design [1]
- Features a live vibe coding session building a fishing game [1]
- Introduces Antigravity, a next-generation agentic IDE [1]
- Demonstrates Antigravity features like artifacts and browser actuation [1]
- Presents a multiplayer collaborative whiteboard demo [1]
- Showcases a 3D globe travel app and chess learning application [1]
- Features a Veo 3.1 generative video app demo [1]
Slower and Deeper | Master Zang Shows You the Real Strength of Gemini 3
歸藏的AI工具箱· 2025-11-19 08:04
Core Insights
- The article discusses the performance of Gemini 3, highlighting its state-of-the-art (SOTA) capabilities across various benchmarks, significantly outperforming competitors in most categories [1][2].

Benchmark Performance
- Gemini 3 Pro achieved the highest scores in several benchmarks, including:
  - 91.9% in GPQA Diamond for scientific knowledge [2]
  - 95.0% in AIME 2025 for mathematics without tools [2]
  - 100% in AIME 2025 with code execution [2]
  - 87.6% in Video-MMMU for knowledge acquisition from videos [2]
  - 2,439 Elo Rating in LiveCodeBench Pro for competitive coding [2]
- In the ARC-AGI-2 visual reasoning puzzles, Gemini 3 scored 31.1%, significantly higher than its competitors [2].

Multimodal Understanding
- The article emphasizes Gemini 3's strong multimodal understanding capabilities, particularly in analyzing video content and generating detailed summaries [6][8].
- It successfully analyzed a complex video, providing detailed insights into each scene and suggesting design tools for implementation [7][8].

Design and Coding Capabilities
- Gemini 3 demonstrated advanced design capabilities by generating a complete design agent platform that can autonomously create images and videos based on user prompts [12][14].
- The AI was able to replicate complex design tasks, including logo design and packaging, showcasing its potential for practical applications in design [14][20].

Interactive Content Generation
- The AI's ability to generate interactive content was highlighted, with examples of creating interactive games and visual novels based on user-provided scripts [34][36].
- This capability opens up new opportunities for content creation, allowing users to develop engaging narratives and gameplay experiences with minimal input [35].

Technical Implementation
- The article provides detailed prompts for users to leverage Gemini 3's capabilities in web development, including creating a storytelling webpage and generating 3D voxel animations from images [26][44].
- The technical requirements emphasize the use of modern web technologies, ensuring that the generated content is visually appealing and functionally robust [28][43].
Vibe Coding with Gemini 3 in Google AI Studio
Google DeepMind· 2025-11-18 16:01
[Music] Vibe coding in AI Studio with Gemini 3 lets you build anything. Build entirely new experiences. Create apps that see. Make tools that listen. Build apps with a single prompt. Get inspired. [Music] Tell me what you want. [Music] Refine [Music] >> and refine some more. >> Try now. >> Share with a friend >> and remix. >> Then share with the world. [Music] Bring anything to life. Tell me what you want to do. ...
Anticipation for Google's Gemini 3 Release Peaks as a Historian Says It Has Solved Two of AI's Oldest Problems
36Ke· 2025-11-13 03:19
Core Insights
- The article discusses a significant breakthrough in AI, particularly in handwritten text recognition and symbolic reasoning, achieved by Google's AI model, potentially Gemini-3 [1][3][22]
- The findings suggest that the model not only excels in recognizing handwritten text but also demonstrates an ability to reason and understand the context behind the text, marking a potential shift in AI capabilities [2][19][21]

Group 1: AI Model Performance
- The AI model tested by Mark Humphries showed "almost perfect" handwriting recognition and the ability to perform "spontaneous, abstract, symbolic reasoning" [1][2]
- The model achieved a character error rate (CER) of 0.56% and a word error rate (WER) of 1.22%, indicating a significant improvement over previous models [7][19]
- This performance aligns with the "scaling laws," suggesting that as model parameters increase, capabilities in complex tasks improve exponentially [7][22]

Group 2: Historical Document Recognition
- Recognizing historical documents is more complex than standard text due to issues like spelling inconsistencies and semantic ambiguities [5][22]
- The model's ability to infer the author's intent and correct errors in historical documents indicates a level of understanding previously thought unattainable by AI [5][19]
- The implications for historical research are profound, as AI could automate the transcription and analysis of vast amounts of historical data [22][23]

Group 3: Theoretical Implications
- The findings challenge the long-held belief that symbolic reasoning is beyond the reach of deep learning models, suggesting a convergence of statistical learning and symbolic manipulation [20][21]
- The emergence of implicit reasoning capabilities in AI models raises questions about the nature of understanding and cognition in machines [21][22]
- This breakthrough could signify a move towards general intelligence in AI, as models begin to demonstrate understanding rather than mere pattern recognition [22][23]
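For readers unfamiliar with the two metrics cited in this summary: CER and WER are conventionally computed as the edit distance between the model's transcription and a reference transcription, divided by the reference length (in characters or in words). The following is a minimal Python sketch of that standard definition. The function names and sample strings are illustrative assumptions and are not taken from Humphries's evaluation.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or word lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete an element of the reference
                curr[j - 1] + 1,     # insert an element of the hypothesis
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: character edits per reference character."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word edits per reference word."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


reference = "the quick brown fox"
hypothesis = "the quick brwn fox"
print(f"CER = {cer(reference, hypothesis):.4f}")
print(f"WER = {wer(reference, hypothesis):.4f}")
```

In this toy example one missing character gives a CER of 1/19 but a WER of 1/4, which illustrates why a reported WER is typically higher than the CER for the same transcription.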
Meta's AI app has seen growth soar since launch of Vibes, but trails OpenAI's Sora
CNBC· 2025-10-28 11:00
Mark Zuckerberg, chief executive officer of Meta Platforms Inc., wears a pair of Meta Oakley Vanguard AI glasses during the Meta Connect event in Menlo Park, California, US, on Wednesday, Sept. 17, 2025. (David Paul Morris | Bloomberg | Getty Images)

Meta's AI app has seen a major jolt in downloads since launching its Vibes feed of AI-generated videos, giving investors a glimpse of the company's artificial intelligence strategy ahead of Wednesday's third-quarter earnings. Since releasing Vibes ...
Fusemachines Begins Trading on NASDAQ Marking the Start of a New Chapter
Globenewswire· 2025-10-23 15:00
Core Insights
- Fusemachines Inc. has commenced trading on the Nasdaq Stock Market under the symbol "FUSE", marking a significant milestone in its mission to democratize AI [1][2]
- The public listing positions the company to capture the growing global demand for scalable AI solutions [2]
- The CEO emphasized the commitment to innovation and execution, indicating a focus on disciplined growth and strategic investments to create sustainable value for shareholders [3]

Company Strategy
- Proceeds from the public listing will be utilized to strengthen the company's balance sheet and accelerate growth, with plans for strategic investments in product innovation, customer expansion, and sales and marketing [3]
- The company aims to enhance recurring revenue opportunities, expand margins, and deliver long-term value to shareholders through these initiatives [3]
- Fusemachines intends to explore strategic partnerships and targeted M&A opportunities to accelerate market expansion and enhance its competitive position [4]

Company Background
- Founded in 2013, Fusemachines is a global provider of enterprise AI products and services, focusing on democratizing AI [5]
- The company leverages proprietary AI technologies to assist clients in their AI transformation journeys across various industries, including healthcare, finance, retail, manufacturing, and government [5]
Fusemachines Announces Closing of Business Combination and Date for Commencement of NASDAQ Listing
Globenewswire· 2025-10-23 14:07
Core Points
- Fusemachines Inc. has completed its business combination with CSLM Acquisition Corp., and will begin trading on Nasdaq under the symbol "FUSE" starting October 23, 2025 [1][2]
- The public listing is expected to accelerate growth and innovation, with proceeds from the transaction aimed at developing products, expanding customer reach, and supporting enterprise clients [3]
- The leadership team of the combined company includes Sameer Maskey as CEO and Christine Chambers as CFO, with a board of directors comprising industry professionals [4]

Company Overview
- Founded in 2013, Fusemachines is a global provider of enterprise AI products and services, focused on democratizing AI and helping organizations implement and scale AI solutions [5]
- The company operates in various industries, including healthcare, finance, retail, manufacturing, and government, with a presence in North America, Asia, and Latin America [5]
- Fusemachines is committed to providing high-quality AI education in underserved communities and assisting organizations in maximizing their potential with AI [6]
No Funding and a Single Product, Yet Over $40 Million ARR in 2 Years: Another Turkish AI Studio
投资实习所· 2025-10-13 05:39
Core Insights
- Turkey has a favorable environment for building AI studios, with significant achievements made without external financing, exemplified by a studio that reached an ARR of $40 million in just four years [1]
- Another Turkish AI studio has launched over 60 products in five years, achieving an ARR exceeding $300 million without any financing [2]

Product Strategy
- The product strategy of the new AI studio appears straightforward, focusing on replicating successful AI products in the market rather than acquiring poorly performing ones [3]
- The studio has developed products similar to popular applications like ChatGPT, which have proven to be lucrative, with some generating monthly revenues of $5 million [3]
- The fastest-growing product focuses on image editing, achieving over $40 million in ARR within two years, distinguishing itself from traditional editing apps by redefining the image editing experience using AI [5][6]