NotebookLM
The most godlike use of Nano Banana Pro is actually one-click PPT generation.
数字生命卡兹克· 2025-11-24 01:21
Core Viewpoint
- The article highlights the innovative capabilities of NotebookLM in conjunction with Nano Banana Pro, particularly its ability to generate high-quality PowerPoint presentations from various input materials, showcasing a significant advancement in AI-driven productivity tools [1][12][41].

Group 1: NotebookLM and Nano Banana Pro Features
- NotebookLM allows users to upload various formats of data, including PDFs, Word documents, and images, facilitating seamless knowledge management and transformation into different formats [12][13].
- The integration with Nano Banana Pro enables the automatic generation of visually appealing PPTs, maintaining a consistent style and utilizing original data from the input materials [17][18].
- Users can customize the style of the generated PPTs, choosing from various themes such as clay, comic, and large-character styles, enhancing the visual appeal and engagement of presentations [5][8][22].

Group 2: User Experience and Benefits
- The article emphasizes the time-saving aspect of using NotebookLM and Nano Banana Pro, allowing users to focus on content rather than the tedious process of designing presentations [37][41].
- The generated PPTs are noted for their high quality, with minimal errors, making them nearly ready for immediate use after slight modifications [4][15].
- The combination of these tools is described as one of the most useful functionalities encountered in the year, significantly improving the efficiency of creating presentations [28][36].

Group 3: Limitations and Future Improvements
- Some limitations are mentioned, such as the inability to edit individual elements within the generated PPTs, which could hinder customization [28][31].
- There are also concerns regarding the quality of Chinese text in the presentations, which may not match the clarity of English text, indicating a need for further development in this area [34].
- The article suggests that future iterations of Nano Banana Pro could address these issues, enhancing the overall user experience and functionality [34].
Tencent Research Institute AI Express 20251124
Tencent Research Institute· 2025-11-23 16:01
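Generative AI

II. NotebookLM launches a "one-click slide generation" feature, and this time it's different
1. Google's AI note-taking tool NotebookLM has officially launched a "one-click slide generation" feature: users only need to upload their materials to get a logically clear presentation within minutes;
2. It offers two core modes, a detailed version and a presenter version, supports precise control of slide style, audience, and emphasis through prompts, and outputs multiple languages to meet cross-border reporting needs;
3. It supports several ways of managing decks, including online presentation, PDF download, and link sharing, and can be applied widely to student revision, workplace reporting, teacher training, and other scenarios.
https://mp.weixin.qq.com/s/A2DOsQLxGMtXU9h-rwJWlQ

III. Meta follows up with WorldGen: one sentence "builds" a 50×50-meter city

I. An AI group photo from Nano Banana Pro, indistinguishable to the naked eye, goes viral
1. Just 48 hours after release, Google's Nano Banana Pro took first place on both LMArena leaderboards; its AI-generated group photo of Silicon Valley CEOs went viral, realistic enough that the naked eye cannot tell it apart from a real one;
2. The model is based on Gemini 3 Pro; it leads the first generation by 84 points in text-to-image testing and by 41 points in image editing, and can generate historical events at specific locations from coordinates;
3. Google's full-stack advantage stands out, from DeepMind researchers ...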
NotebookLM's features are off the charts: how I use it to learn in depth
36Kr· 2025-11-23 00:06
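Shenyiju is 36Kr's translation and compilation team, covering technology, business, the workplace, and everyday life, with a focus on new technologies, new ideas, and new trends from abroad.

Editor's note: Stop waiting for AI to feed you knowledge. The key above all is that you first have to teach AI "how to teach you." This article is a translation.

Along the way, I'll share some practical prompts I use to customize my learning.

I had wanted to use LangChain to build a dedicated AI agent chatbot with RAG (retrieval-augmented generation) for my newsletter.

The problem was that I had no idea where to start.

Every tutorial assumed I already understood vector databases, embeddings, and retrieval pipelines. The documentation was written for developers who know Python inside out. Posts on Stack Overflow tossed around terms like "chunking strategy" and "similarity search" as if everyone should already know them.

I was stuck in that awkward place of knowing just enough to be dangerous and nowhere near enough to be useful. I understand ChatGPT and Claude. I have also built automation workflows with tools like Make.com, Zapier, n8n, and Relay, which felt fairly advanced.

But LangChain?

The gap between no-code automation and real AI agent development felt hopelessly wide. I couldn't make sense of it.

Then I remembered the tool that had once changed the way I learn.

Six months ago, I wrote a ...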
What sparks will fly between Nano Banana Pro and the top design agent Lovart?
歸藏的AI工具箱· 2025-11-22 12:50
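The night before last, Google released Nano Banana Pro, a model optimized on top of Gemini 3, with greatly improved capabilities and a fix for the multilingual problem.

I have been playing with it nonstop for two days and have built up a pile of examples. I happened to see that Lovart, always a big spender, is running another free promotion, so I took the chance to explore what stronger capabilities appear when such a powerful image generation and editing model is combined with an agent, and I did find a few.

It can generate photos that bring your 2D-character waifu into real-world scenes, produce check-in photos of wherever you want to go just by clicking on the place, and generate PPTs that blow NotebookLM's away.

First, a quick introduction to Lovart's promotion:
- From 11.21 to 11.23, Nano Banana Pro is free for everyone.
- Anyone who subscribes to Basic or a higher membership tier during this period gets 365 days of unlimited, zero-credit use of Nano Banana Pro for as long as the membership is valid.
- All existing members at Basic or above automatically receive the same 365-day zero-credit Banana Pro benefit.
- Nano Banana (NB1), Seedream 4, and Midjourney v7 now also come with 365 days of unlimited, zero-credit use.

During the last free period, many friends reported being charged credits because of accidental misoperation, so this time I'll start by showing how to use it in a way that avoids calling other models and causing ...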
A 36-month turnaround: he has led Google AI roaring back, and world models are next
36Kr· 2025-11-20 23:53
Core Insights
- The competition in the AI model landscape is intensifying, with Google's Gemini 3 Pro recently surpassing Elon Musk's Grok 4.1 to claim the top spot in various rankings [1][3][7].

Group 1: Gemini 3's Capabilities and Impact
- Gemini 3 is highlighted for its advanced reasoning, multimedia processing, and coding abilities, enhancing Google's existing products, particularly its lucrative search business [7][8].
- The introduction of AI Overviews has led to a 10% increase in search query volume, while visual search capabilities have surged by 70% due to Gemini's photo analysis [8].
- Gemini 3 is positioned as a foundational model for Google's product ecosystem, integrating AI into various services like Google Maps, Gmail, and cloud services [8][12].

Group 2: Competitive Landscape and Market Position
- Google has made significant investments in AI, leading to breakthroughs that have allowed it to catch up with competitors like OpenAI, which initially disrupted its core search business [9][10].
- The monthly active users of Gemini applications have exceeded 650 million, indicating a strong user engagement compared to ChatGPT's 700-800 million weekly active users [12].
- Gemini 3 has outperformed OpenAI's GPT-5 in several benchmarks, particularly in reasoning and long-term planning, enhancing its practical capabilities [12].

Group 3: Future Directions and AGI Aspirations
- Google aims to develop a comprehensive model that excels in various domains, which is seen as a crucial step towards achieving Artificial General Intelligence (AGI) [13][14].
- The company is focused on refining the Gemini model to improve its programming, reasoning, and mathematical capabilities, with future iterations expected to be more efficient and cost-effective [13][14].
- The timeline for achieving AGI is projected to be 5 to 10 years, with Gemini 3 serving as a pivotal platform for future advancements [14][15].

Group 4: Economic Viability and AI Bubble Concerns
- Despite concerns about an AI bubble, Google is well-positioned due to its solid revenue streams and the strategic role of DeepMind in enhancing its AI capabilities [15][17].
- The integration of AI into existing Google services is already yielding tangible returns, enhancing the performance of search, YouTube, and cloud services [16][17].
NotebookLM introducing Slide Decks
Google· 2025-11-20 22:21
Google launches Nano Banana Pro, an updated AI image generator powered by Gemini 3
CNBC· 2025-11-20 15:00
Core Insights
- Google has launched Nano Banana Pro, an advanced image editing and generation tool, building on the momentum from its new Gemini AI model release [1]
- The introduction of Nano Banana Pro follows the record-breaking stock highs attributed to the Gemini 3 Pro announcement [1]

Product Features
- Nano Banana Pro offers enhanced capabilities compared to its predecessor, including the ability to create infographics and slide decks, and maintain character consistency across multiple images [2]
- The tool can process up to 14 different images or five different characters simultaneously [2]

User Engagement
- Internal users have successfully utilized Nano Banana Pro for various applications, such as creating infographics from code snippets and LinkedIn resumes [3]
- The original Nano Banana gained significant popularity, adding 13 million new users to the Gemini app within four days of its launch [4]

Availability
- Nano Banana Pro is accessible through the Gemini app with limited free quotas, as well as in Google's writing assistant, NotebookLM, and other developer, enterprise, and advertising products [4]
- Google AI Pro and Ultra subscribers can access Nano Banana Pro through the AI Mode in Google's search features [5]
Tencent Research Institute AI Express 20251118
Tencent Research Institute· 2025-11-17 16:18
Group 1: Meta's AI Integration
- Meta will officially incorporate "AI-driven impact" into employee performance metrics starting in 2026, assessing how employees utilize AI to enhance work outcomes and team productivity [1]
- The company has launched the "Level Up" game project and AI performance assistant tools this year to encourage employees to use the internal AI chatbot Metamate as much as possible [1]
- Meta has begun allowing some job candidates to use AI assistants during coding interviews, believing this better represents a real development environment [1]

Group 2: Google NotebookLM Features
- Google NotebookLM introduced image data source functionality on November 15, enabling automatic OCR and semantic parsing, allowing users to retrieve content from images using natural language [2]
- The underlying multimodal model can distinguish between handwritten and printed areas, extract table structures, and automatically link with existing text, audio, and video notes [2]
- Within 48 hours of the feature launch, educational accounts uploaded over 500,000 pages of images, a 340% increase, with plans to integrate AR glasses for real-time "see and ask" capabilities next year [2]

Group 3: Alibaba's Qianwen App Launch
- Alibaba's Qianwen app public beta has launched, built on the Qwen3 model, providing an all-in-one entry point for users to experience a full suite of AI capabilities for free [3]
- The application will gradually cover various life scenarios including office work, maps, health, and shopping, aiming to make AI a daily companion [3]
- Qianwen will continue to evolve and integrate the latest Qwen models, currently available for search and download in major app stores in China [3]

Group 4: Zhipu GLM Coding Plan
- Zhipu has launched the "GLM Coding Plan·Special Edition" subscription package, offering a 50% discount for first-time buyers, with a minimum monthly cost of only 16 yuan [4]
- Powered by the flagship model GLM-4.6, it ranked first globally in the LMArena evaluation alongside Claude Sonnet 4.5 and GPT-5, supporting 200K long context [4]
- The model is officially compatible with over 10 mainstream AI programming tools, with several US tech companies like Cerebras and Vercel adopting GLM-4.6 [4]

Group 5: Xiaomi's Miloco Solution
- Xiaomi has launched its first "large model + smart home" solution, Miloco, using the Mijia camera as a visual information source, with the self-developed large language model MiMo-VL-Miloco-7B at its core, and the framework is open-sourced [5]
- Users can communicate with the smart home system through natural language, allowing the system to automatically fulfill various smart needs and rules while ensuring privacy through visual data understanding [5]
- Xiaomi's AIoT platform has connected nearly 1 billion IoT devices, and Miloco achieves interoperability between the Mijia ecosystem and Home Assistant ecosystem through standardized MCP protocols, supporting third-party IoT platform integration [5]

Group 6: MiroMind's MiroThinker v1.0
- MiroMind has officially launched the open-source intelligent agent base model MiroThinker v1.0, introducing a new dimension of "deep interaction scaling," supporting 256K context and 600 tool calls [6]
- In the BrowseComp test, it achieved an accuracy rate of 47.1%, nearing OpenAI DeepResearch's 51.5%, while surpassing DeepSeek-v3.2 by 7.7 percentage points in Chinese tasks [6]
- The model adopts a fully open-source architecture, providing all model weights, toolchains, and interaction frameworks, with the 72B version approaching or even surpassing OpenAI DeepResearch, promoting intelligent agents from passive execution to active learning evolution [6]

Group 7: MedGPT's Clinical Success
- The core model of Future Doctor AI Studio, MedGPT, has outperformed GPT-5 and other leading international models in a multi-model practical evaluation conducted by 32 top domestic clinical experts, achieving the global first in clinical safety and effectiveness assessment [7]
- It has launched two products: a clinical decision AI assistant and a patient follow-up AI assistant, providing safe and effective decision support during diagnosis and supporting patient follow-up for chronic disease management [7]
- MedGPT has been adopted by dozens of national discipline leaders for daily use and is recognized by experts as the "best practice" for AI empowering grassroots healthcare, aligning with the National Health Commission's guidelines for promoting and regulating AI in healthcare [7]

Group 8: Li Feifei on AGI
- Li Feifei stated in an interview that AGI is "more of a marketing term than a scientific term," emphasizing that the current AI's biggest shortcoming is the lack of spatial intelligence, which allows humans to navigate and manipulate in a three-dimensional world [8]
- She outlined three core capabilities of world models: generative, multimodal, and interactive, arguing that relying solely on data and computing power will not lead to the maturity of robots, which are physical systems needing bodies and application scenarios [8]
- The first large-scale world model product, Marble, released by World Labs, has been widely applied in film production, game development, scientific research, and robot training, reducing creation time by 40 times [8]
Tencent Research Institute AI Express 20251117
Tencent Research Institute· 2025-11-16 16:01
Group 1: openEuler and AI Operating Systems
- openEuler community has launched a new 5-year development plan, with the first AI-focused supernode operating system (openEuler 24.03 LTS SP3) set to be released by the end of 2025, involving over 2,100 member organizations and more than 23,000 global contributors [1]
- The operating system features global resource abstraction, heterogeneous resource integration, and a global resource view, aimed at maximizing the computational potential of supernodes and accelerating application innovation [1]
- The Lingqu Interconnection Protocol 2.0 will contribute support for supernode operating system plugins, providing key capabilities such as unified memory addressing and low-latency communication for heterogeneous computing [1]

Group 2: Google and AI Models
- Google CEO's cryptic response with two thoughtful emojis hints at the anticipated launch of Gemini 3.0 next week, with 69% of netizens betting on the release of this next-generation AI model, which is expected to be a significant turning point for Google [2]
- Early testing reveals that Gemini 3.0 can generate operating systems and build websites in seconds, showcasing impressive front-end design capabilities, leading to its label as the "end of front-end engineers" [2]
- Warren Buffett has invested $4.3 billion in Google stock, with high expectations for Gemini 3.0's performance, which will determine Google's potential to challenge for AI leadership [2]

Group 3: Gaming AI Developments
- Google DeepMind has introduced SIMA 2, an AI agent capable of playing games like a human by using virtual input devices, overcoming the limitations of simple command following and demonstrating reasoning and learning abilities [3]
- SIMA 2 can tackle new games without pre-training and understands multimodal prompts, enhancing its self-improvement through self-learning and feedback from Gemini [3]
- The system employs symbolic regression methods and integrates Gemini as its core engine, aiming to serve as a foundational module for future robotic applications, though it still faces limitations in complex tasks [3]

Group 4: Long-term Memory Operating Systems
- The EverMemOS, developed by Chen Tianqiao's team, has achieved high scores of 92.3% and 82% on LoCoMo and LongMemEval-S benchmarks, significantly surpassing state-of-the-art levels [4]
- Inspired by human memory mechanisms, the system features a four-layer architecture (agent layer, memory layer, index layer, interface layer) and employs "layered memory extraction" to address challenges in pure text similarity retrieval [4]
- An open-source version is available on GitHub, with a cloud service version expected to be released later this year, aimed at providing enterprises with data persistence and scalable experiences [4]

Group 5: AI Wearable Technology
- Sandbar has launched the Stream smart ring, priced at $249-$299, which eliminates health monitoring features to focus on AI voice interaction capabilities [5]
- The ring uses a "fist whisper" interaction method to activate recording and dynamically switch between multiple large models, but has a battery life of only 16-20 hours, which is inferior to traditional smart rings [5]
- The accompanying iOS app utilizes ElevenLabs to generate voice models that mimic user voices, ensuring end-to-end encryption of data without storing original audio, although privacy and value propositions remain questionable [5]

Group 6: NotebookLM and Research Tools
- Google NotebookLM has introduced the Deep Research feature, which can automatically gather multiple relevant web sources and organize them into a contextual list, creating a dedicated knowledge base within minutes [7]
- The system supports processing of 25 million tokens in context, ensuring that all responses are based on user-provided sources with citation, enhancing verifiability and reducing AI hallucination issues [7]
- Its video overview feature can convert documents, web pages, and videos into interactive videos, with Google committing not to use personal data for model training [7]

Group 7: AI in Physics
- A team from Peking University has developed the AI-Newton system, which employs symbolic regression methods to rediscover fundamental physical laws without prior knowledge [8]
- The system is supported by a knowledge base consisting of symbolic concepts, specific laws, and universal laws, identifying an average of about 90 physical concepts and 50 general laws in test cases [8]
- AI-Newton demonstrates progressive and diverse characteristics, currently in the research phase, but offers a new paradigm for AI-driven autonomous scientific discovery, with potential applications in embodied intelligence [8]

Group 8: OpenAI's Research on Explainability
- OpenAI has released new research on explainability, proposing sparse models with fewer neuron connections but more neurons, making the internal mechanisms of the model easier to understand [9]
- The research team identified the "minimal loop" for specific tasks, quantifying explainability through geometric averages of edge counts, finding that larger, sparser models can generate more powerful but simpler functional models [9]
- The paper's corresponding author, Leo Gao, is a former member of Ilya's super alignment team, but the research is still in early stages, with sparse models being significantly smaller and less efficient than cutting-edge models [9]

Group 9: Elon Musk's AI Vision
- Elon Musk is advancing xAI on the X and Tesla platforms, with the Colossus supercomputer data center deploying 200,000 H100 GPUs in 122 days for training Grok-4 and the upcoming Grok-5 [10]
- xAI follows a "truth-seeking, no taboos" approach, allowing AI to generate synthetic data to reconstruct knowledge systems, aiming to create a "Grok Encyclopedia," with Tesla's next-generation AI5 chip expected to enhance performance by 40 times [10]
- Grok is set to be integrated into Tesla vehicles, with Musk predicting that by 2030, AI capabilities may surpass those of all humanity, while xAI plans to open-source the Grok-2.5 model and release Grok-3 in six months [10]
X @Demis Hassabis
Demis Hassabis· 2025-11-14 00:12
RT NotebookLM (@NotebookLM)
Finally, the moment you’ve all been waiting for 🥁: Rolling out today, you can now create your own custom video overview styles by typing in a prompt in the customization box. Drop your favorite examples below (our expectations are SKY HIGH). https://t.co/fjCTcBXdGR ...