Sora - Video Generation Models as World Simulators Spring 2024
We introduce Sora, OpenAI's first video generation model, which can create HD videos up to one minute long. We believe video generation models like Sora are a key step toward AGI that can simulate the physical world. Read our technical report here.
DALL-E 3 in ChatGPT Fall 2023
We created a text-to-image model that represents a leap forward in prompt-following ability, allowing users to easily translate their ideas into exceptionally accurate images directly in ChatGPT.
Putting People in Their Place: Affordance-Aware Human Insertion into Scenes Spring 2023
We train a generative model to produce a realistic image of a person in a scene, given an image of that person and an image of the scene as input.
InstructPix2Pix: Learning to Follow Image Editing Instructions Fall 2022
We teach a diffusion model to follow editing instructions by training on paired data generated with GPT-3 and Stable Diffusion.
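A minimal sketch of the paired-data idea follows; the helper names (`language_model.propose_edit`, `text_to_image`) are hypothetical stand-ins for the GPT-3 and Stable Diffusion components used in the paper.

```python
# Minimal sketch of the paired-data generation idea (helper names are hypothetical).
# For each source caption: (1) a language model writes an edit instruction and the
# caption of the edited image, (2) a text-to-image model renders the before/after pair.

def generate_training_example(caption, language_model, text_to_image):
    # Step 1: ask the language model for an instruction and an edited caption,
    # e.g. "photograph of a girl riding a horse" ->
    #      instruction: "have her ride a dragon"
    #      edited caption: "photograph of a girl riding a dragon"
    instruction, edited_caption = language_model.propose_edit(caption)

    # Step 2: render both captions; the paper uses Prompt-to-Prompt so the two
    # generations stay consistent apart from the instructed change.
    image_before = text_to_image(caption)
    image_after = text_to_image(edited_caption)

    # The resulting triple supervises an instruction-following diffusion model.
    return image_before, instruction, image_after
```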
Learning to Learn with Generative Models of Neural Network Checkpoints Fall 2022
We train loss-conditional diffusion models of neural network checkpoints that learn to optimize.
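The sketch below illustrates the conditioning scheme under simplified assumptions: a hypothetical `denoiser` module predicts a future checkpoint from a noised copy, conditioned on the starting parameters, the starting loss, and a prompted target loss.

```python
import torch

# Conceptual sketch of loss-conditional checkpoint diffusion (names and schedule are
# illustrative, not the paper's exact formulation). A denoiser learns to predict clean
# parameter vectors conditioned on a starting checkpoint, its loss, and a target loss;
# at test time, prompting with a low target loss asks the model to "optimize" in one step.

def diffusion_training_step(denoiser, theta_start, loss_start, theta_future, loss_future):
    """One denoising step on a pair of checkpoints taken from the same training run."""
    t = torch.rand(theta_future.shape[0], device=theta_future.device)    # diffusion time
    noise = torch.randn_like(theta_future)
    noisy = torch.sqrt(1 - t)[:, None] * theta_future + torch.sqrt(t)[:, None] * noise
    pred = denoiser(noisy, t, theta_start, loss_start, loss_future)
    return torch.mean((pred - theta_future) ** 2)                        # simple x0 objective
```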
Generating Long Videos of Dynamic Scenes NeurIPS 2022
We design a video generation model capable of producing new content over time while maintaining long-term consistency.
Hallucinating Pose-Compatible Scenes ECCV 2022
What does human pose tell us about a scene? We propose a task to answer this question: given human pose as input, hallucinate a compatible scene. We present a large-scale generative adversarial network for pose-conditioned scene generation.
Studying Bias in GANs through the Lens of Race ECCV 2022
We study how the performance and evaluation of generative image models are impacted by the racial composition of their training datasets.
Handheld Mobile Photography in Very Low Light Fall 2019
SIGGRAPH Asia, 2019
Mobile cameras struggle to capture images in very low light. We propose a complete image processing pipeline to address this problem, which powers Night Sight mode on Google Pixel smartphones.
Learning to Synthesize Motion Blur Fall 2018
CVPR, 2019 (Oral Presentation)
It is difficult to portray a sense of movement in a single image. We present a novel technique for synthesizing motion blur, a visual effect that summarizes movement, such as a car racing by or the commotion of a busy city intersection. It is also useful for temporally smoothing time-lapse videos and rendered animations.
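For intuition, here is a naive baseline rather than the paper's learned line-prediction model: averaging many interpolated frames between two inputs approximates the light integrated over an exposure. `interpolate_frame` is a hypothetical frame-interpolation helper.

```python
import numpy as np

# Naive motion blur from two frames: sample many intermediate frames and average them,
# approximating the scene radiance integrated over the exposure window.
# `interpolate_frame(frame_0, frame_1, t)` is assumed to return the frame at time t in [0, 1].

def naive_motion_blur(frame_0, frame_1, interpolate_frame, num_samples=17):
    times = np.linspace(0.0, 1.0, num_samples)
    stack = np.stack([interpolate_frame(frame_0, frame_1, t) for t in times])
    return stack.mean(axis=0)   # the temporal average is the blurred image
```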
Unprocessing Images for Learned Raw Denoising Fall 2018
CVPR, 2019 (Oral Presentation)
Photographs often exhibit noise, especially in low light. While denoising neural networks work well on synthetic inputs, they often fail on real noisy images. We present a new approach that "unprocesses" images, creating more realistic training data and producing state-of-the-art results on real photographs.
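A rough sketch of the unprocessing idea follows, stepping backwards through a simplified camera pipeline (tone curve, gamma, color correction, white balance, mosaicing) and adding shot/read noise; the matrices, gains, and noise levels are illustrative placeholders, not the paper's exact values.

```python
import numpy as np

# Rough sketch of "unprocessing": turn generic sRGB images into raw-like training data
# by inverting a simplified camera pipeline, then synthesize sensor-like noise.

def unprocess(srgb, ccm_inv, wb_gains, shot_noise=0.01, read_noise=0.0005):
    img = np.clip(srgb, 0.0, 1.0)
    img = 0.5 - np.sin(np.arcsin(1.0 - 2.0 * img) / 3.0)       # invert global tone curve
    img = img ** 2.2                                            # invert gamma (approximate sRGB)
    img = img @ ccm_inv.T                                       # sRGB -> camera RGB
    img = img / wb_gains                                        # invert white-balance gains

    bayer = np.zeros(img.shape[:2])                             # mosaic to an RGGB Bayer pattern
    bayer[0::2, 0::2] = img[0::2, 0::2, 0]
    bayer[0::2, 1::2] = img[0::2, 1::2, 1]
    bayer[1::2, 0::2] = img[1::2, 0::2, 1]
    bayer[1::2, 1::2] = img[1::2, 1::2, 2]

    variance = np.maximum(shot_noise * bayer + read_noise, 0.0)  # signal-dependent noise model
    noisy = bayer + np.random.normal(scale=np.sqrt(variance))
    return bayer, noisy                                          # clean / noisy raw training pair
```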
HDR+ Burst Processing Pipeline Fall 2016
Inspired by the Google Pixel's HDR+ burst photography mode, this technique combines multiple underexposed raw frames to decrease noise, then applies a sequence of standard image processing algorithms.
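As a toy illustration, the sketch below aligns and averages a burst of underexposed raw frames; real HDR+ uses tile-based alignment and a robust frequency-domain merge, and `align_to_reference` here is a hypothetical helper.

```python
import numpy as np

# Toy version of the burst idea: align several short-exposure raw frames to a reference
# and average them, which lowers noise without the motion blur of a single long exposure.

def merge_burst(raw_frames, align_to_reference):
    reference = raw_frames[0]
    aligned = [reference] + [align_to_reference(reference, f) for f in raw_frames[1:]]
    merged = np.mean(np.stack(aligned), axis=0)    # noise std drops roughly as 1/sqrt(N)
    return merged                                  # then demosaic, white balance, tone map, etc.
```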