GPT Image 1
OpenAI's image generation model. GPT Image 1 brings the language understanding of GPT to image generation — following complex, nuanced instructions with high accuracy and rendering text within images correctly.
What is GPT Image 1?
GPT Image 1 is OpenAI's image generation model, built with the language understanding capabilities that define the GPT model family. Where many image models treat a prompt as a loose set of keywords, GPT Image 1 reads it as a full instruction and interprets it with deep natural language comprehension.
One of GPT Image 1's most distinctive capabilities is accurate text rendering within images. Generating readable, correctly spelled text inside images (signs, labels, captions, book covers, UI mockups) has long been a challenge for image models. GPT Image 1's language foundation gives it a significant advantage in tasks that involve text in the image itself.
The model also handles multi-step instructions, relative spatial descriptions, and nuanced style guidance more reliably than models without a strong language model backbone. For users who write descriptive, conversational prompts rather than keyword-heavy directives, GPT Image 1's natural language understanding produces more consistently accurate results.
GPT language model
Deep prompt comprehension
Text rendering
Accurate in-image text
Complex instructions
Multi-step prompt handling
Versatile output
Photo, illustration, and more
How to generate images with GPT Image 1 on project.video
Open the composer
Go to your project.video dashboard. GPT Image 1 is available under OpenAI models in the image generation section.
Select GPT Image 1
Choose GPT Image 1 for deep instruction-following and accurate text rendering. Best for complex, conversational prompts and any image that includes text.
Write your prompt naturally
Write your image description in natural language. GPT Image 1 excels with full sentences, multi-clause descriptions, and nuanced instructions — you don't need keyword syntax.
Generate and review
Generate your image and check that every element of your instruction appears in the output, including the exact spelling of any text. Because GPT Image 1 follows instructions closely, a more precise prompt should produce a more precise result.
Technical specs
Best use cases
Images containing readable text
Generate book covers, packaging labels, signs, UI mockups, poster designs, and any image where legible text is part of the composition. GPT Image 1's text rendering accuracy makes it a strong choice when the text in the image needs to be correct.
Complex multi-element scenes
Describe scenes with multiple subjects, specific spatial relationships, and detailed attributes. GPT Image 1's language comprehension handles 'the woman on the left wearing a red coat looking toward the camera' more reliably than models that parse prompts less thoroughly.
Conversational and instructional prompts
For users who describe images in natural language rather than keyword syntax, GPT Image 1 produces results that closely match the intent of the description. That makes it more accessible to people with no prompt engineering experience.
UI and product mockup generation
Generate app screens, website layouts, product packaging, and brand material mockups that include interface text and labels. The combination of layout understanding and text rendering accuracy makes GPT Image 1 well suited to design mockup generation.
Example prompts
GPT Image 1 handles natural language descriptions well. Write the way you'd describe an image to a person.
"A hardcover book lying flat on a light wood table. The cover is deep navy blue with the title 'The Long Silence' in gold serif type and the author name 'E. Marsh' below it in smaller gold text. Clean editorial photo style."
"A café menu board photographed on the wall of a warm, dimly lit bistro. The board reads 'Today's Special: Mushroom Risotto €14' in chalk lettering on a dark background."
"Three friends sitting at an outdoor café table — one on the left checking her phone, one in the middle laughing, one on the right looking off to the side. Late afternoon light, shallow depth of field, lifestyle photography."
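The example prompts above share a loose pattern: a scene described in a full sentence, any text that must appear in the image quoted exactly, and a closing style note. If you assemble prompts in a script before pasting them into the composer, that pattern could be sketched as below. This is only an illustration: the build_prompt helper and its parameter names are hypothetical and are not part of project.video or any OpenAI library.

```python
from typing import Mapping, Optional


def _sentence(text: str) -> str:
    """Trim whitespace and make sure a non-empty fragment ends with punctuation."""
    text = text.strip()
    if not text:
        return ""
    return text if text[-1] in ".!?" else text + "."


def build_prompt(
    scene: str,
    text_elements: Optional[Mapping[str, str]] = None,
    style: str = "",
) -> str:
    """Hypothetical helper: join a scene, exact in-image text, and a style note.

    text_elements maps a label (for example "title") to the exact wording
    that should appear in the image.
    """
    parts = [_sentence(scene)]
    for label, exact_text in (text_elements or {}).items():
        # Quote the wording so it reads as literal text to render.
        parts.append(f'The {label} reads "{exact_text}".')
    parts.append(_sentence(style))
    # Drop empty fragments (for example, when no style note is given).
    return " ".join(part for part in parts if part)


prompt = build_prompt(
    scene="A hardcover book lying flat on a light wood table",
    text_elements={"title": "The Long Silence", "author name": "E. Marsh"},
    style="Clean editorial photo style",
)
print(prompt)
```

Keeping the in-image text in its own argument makes it easy to compare the requested wording against the generated image when you review the result.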