Multimodal capabilities
Gemini 3 Flash: Visual context in an instant
Google DeepMind· 2025-12-17 15:59
Gemini 3 Flash adds a contextual UI on image generations, showing its strong multimodal capabilities in understanding visual input and reasoning to describe the content of the image in a compelling and interactive way. Learn more at https://deepmind.google/models/gemini/flash/
Gemini 3 is now in the Gemini app. See what's new with these 3 prompts
Google· 2025-11-26 23:00
Gemini 3 is bringing new features and capabilities to the Gemini app. Here are three examples. Ask Gemini to plan a 3-day trip to Rome. Select Visual layout and Gemini will provide an immersive view of the information you need with a visual itinerary you can actually explore, all thanks to Gemini 3's multimodal capabilities. Ask Gemini about the Van Gogh Gallery, select Dynamic view and Gemini 3’s advanced agentic coding capabilities will provide an interactive interface that lets you tap, scroll, and learn. ...
Build beautiful frontends with OpenAI Codex
OpenAI· 2025-10-27 15:57
Hey everyone, I'm Roman. Codex is your AI teammate that you can pair with everywhere you code, whether it's on your computer with Codex CLI or the IDE extension, or Codex cloud that you can send tasks to anytime from the web or your mobile phone. But one superpower we really wanted to zoom in on today is its multimodal capabilities. It's even more magical when the model has not only vision understanding but also the ability to visually check its own work. Today I'm joined by Channing who helped train the model to ...