Image generation
Computer Use & Frontend UI with GPT-5.4 Thinking
OpenAI· 2026-03-05 18:18
We want models to be really good at checking their own work, especially as the things we ask our models to build become more complex. My name is SQ and I work on training the models to be better at web development, app development, anything that really requires some sort of user experience. Today we're talking about the launch of our new model, GPT-5.4 Thinking, and two of its app-development-related capabilities. One, its ability to use CUA, or computer use. And two, its ability to make great websi ...
X @xAI
xAI· 2026-01-29 06:07
We are excited about partnering with @fal on the new Grok Imagine API!
fal (@fal): fal is proud to partner with @xai as Grok Imagine’s day-0 platform partner
xAI's latest image & video gen + editing model
✨ Stunning photorealistic images/videos from text
⚡ Lightning-fast generation
🎥 Dynamic animations with precise control
🎨 Edit elements, styles & more
https://t.co/1RwkhlJA9w ...
Gemini 3 Flash: Visual context in an instant
Google DeepMind· 2025-12-17 15:59
Product Features
- Gemini 3 Flash enhances image generation with a contextual UI, showcasing strong multimodal capabilities [1]
- The system demonstrates understanding of visual input and reasons to describe image content interactively [1]
Company Information
- Google DeepMind promotes its Gemini Flash model [1]
- Google DeepMind encourages users to learn more via a provided link [1]
- Google DeepMind directs users to its social media channels (X, Instagram, LinkedIn) and YouTube channel [1]
X @Demis Hassabis
Demis Hassabis· 2025-08-26 14:56
The new Gemini 2.5 image model 🍌 is by far the best out there with a whopping +180 ELO point lead in image editing & it really excels at character consistency. Available for free in the @GeminiApp right now. Try uploading an image & playing around with it, it's pretty amazing! https://t.co/4PhzDzFr4p
Google DeepMind (@GoogleDeepMind): Image generation with Gemini just got a bananas upgrade and is the new state-of-the-art image generation and editing model. 🤯 From photorealistic masterpieces to mind-bending fantas ...
GPT-5 Fully Tested (INSANE)
Matthew Berman· 2025-08-07 18:00
GPT-5's Capabilities
- GPT-5 can generate interactive Rubik's Cube simulations of up to 20x20x20, including solving algorithms [2][3][4][5][6][7][8]
- GPT-5 can create functional clones of applications like Excel and Microsoft Word with features such as formula support, formatting, and image insertion [9][10][11]
- GPT-5 can implement complex browser-based games like Conway's Game of Life with 3D visualizations and Snake with enhanced visual effects [12][13][14][15][16][17][18][19][20]
- GPT-5 can generate physics simulations, including double pendulums, cloth simulations, fluid dynamics, and ray tracers [20][21][25][26][27][28][36][37][38][39][40]
- GPT-5 can create 3D environments such as a flight simulator and a Lego builder, though with some limitations [30][31][32][33][34][35]
GPT-5's Speed and Multimodal Functionality
- GPT-5 has two modes, GPT-5 and GPT-5 Thinking, with GPT-5 achieving speeds of approximately 60-80 tokens per second [22][23][24]
- GPT-5 is a multimodal model capable of interpreting images and generating new images based on input [7][49][50][51][52][53]
GPT-5's Front-End Development Prowess
- GPT-5 can rapidly generate front-end clones of websites like Twitter and create financial dashboards with functional elements [42][43][46][47][48]
- GPT-5 can create website front-ends with specific aesthetics, such as a '90s-style website [44][45]
GPT-5's Ethical Considerations
- GPT-5 can provide responsible and ethical responses to potentially harmful or reckless plans, offering alternative solutions and resources [54][55][56][57][58]