Deep Learning Image Generation
This project implements a diffusion model for image generation trained using the CIFAR-10 dataset. Diffusion models are a class of generative models that learn to gradually denoise random noise into coherent images, and have become the foundation for state-of-the-art image generation systems like DALL-E and Stable Diffusion.
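To make the "gradually denoise" idea concrete, here is a minimal NumPy sketch of the *forward* (noising) side of a DDPM-style diffusion process. The linear beta schedule and its endpoints are illustrative assumptions, not necessarily the settings used in this project.

```python
import numpy as np

def forward_diffuse(x0, t, betas, rng):
    """Sample x_t ~ q(x_t | x_0) in closed form:
    x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps,
    where alpha_bar_t is the cumulative product of (1 - beta)."""
    alphas = 1.0 - betas
    alpha_bar = np.cumprod(alphas)[t]
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps

# Illustrative linear noise schedule (assumed values, 1000 timesteps).
betas = np.linspace(1e-4, 0.02, 1000)
rng = np.random.default_rng(0)
x0 = rng.uniform(-1.0, 1.0, size=(3, 32, 32))  # a fake normalized "image"

x_early = forward_diffuse(x0, 10, betas, rng)   # barely noised, still looks like x0
x_late = forward_diffuse(x0, 999, betas, rng)   # almost pure Gaussian noise
```

By the last timestep the cumulative product `alpha_bar` is close to zero, so `x_late` carries almost no trace of the original image; the model is trained to predict the added noise at every intermediate step.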
The model was developed as the final project for my Machine Learning class at UNC Charlotte. It is based on Hugging Face's Diffusion Course and demonstrates the core concepts of diffusion-based image generation on a smaller scale, generating 32x32 pixel images across 10 object categories (airplanes, cars, birds, cats, dogs, etc.).
Courses I followed during the development of this project:
Diffusion models work through a two-phase process: a forward process that gradually adds Gaussian noise to training images over many timesteps, and a learned reverse process in which a neural network removes that noise step by step.
Once trained, the model can start with random noise and iteratively denoise it to generate new, realistic images that resemble the training data.
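The iterative denoising loop can be sketched as below. The real project presumably uses a trained U-Net (e.g. via Hugging Face's tooling) to predict the noise at each step; here a stand-in predictor that returns zeros is used, purely so the DDPM-style loop runs end to end.

```python
import numpy as np

def ddpm_sample_step(x_t, t, eps_pred, betas, rng):
    """One DDPM reverse step: form the mean of p(x_{t-1} | x_t) from the
    predicted noise eps_pred, then add fresh noise (except at t = 0)."""
    alphas = 1.0 - betas
    alpha_bar = np.cumprod(alphas)
    coef = betas[t] / np.sqrt(1.0 - alpha_bar[t])
    mean = (x_t - coef * eps_pred) / np.sqrt(alphas[t])
    if t > 0:
        return mean + np.sqrt(betas[t]) * rng.standard_normal(x_t.shape)
    return mean

betas = np.linspace(1e-4, 0.02, 50)  # short illustrative schedule
rng = np.random.default_rng(0)
x = rng.standard_normal((3, 32, 32))  # start from pure random noise

for t in reversed(range(len(betas))):
    # Stand-in for a trained network: a real model would predict the noise
    # from (x, t); zeros are used here only so the loop is self-contained.
    eps_pred = np.zeros_like(x)
    x = ddpm_sample_step(x, t, eps_pred, betas, rng)
```

With a trained noise predictor in place of the zeros, the same loop turns random noise into an image resembling the training data.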
The CIFAR-10 dataset consists of 60,000 32x32 color images evenly split across 10 classes: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck (6,000 images per class).
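The project presumably loads the dataset with `torchvision.datasets.CIFAR10`; independent of the loader, diffusion training conventionally rescales the 8-bit pixel values into the [-1, 1] range. A small NumPy sketch of that preprocessing step (the function name here is my own):

```python
import numpy as np

def to_model_range(img_uint8):
    """Map an HxWxC uint8 image from [0, 255] to the [-1, 1] float range
    that diffusion models are commonly trained on."""
    return img_uint8.astype(np.float32) / 127.5 - 1.0

# A fake 32x32 RGB image standing in for one CIFAR-10 sample.
fake = np.random.default_rng(0).integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
x = to_model_range(fake)
```

The inverse mapping, `(x + 1) * 127.5`, converts generated samples back to displayable 8-bit pixels.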
The generated images are of mixed quality but are generally recognizable as CIFAR-10 images. Interestingly, the "car" and "truck" classes perform the best, likely due to the more rigid and predictable structure of these objects compared to organic subjects like animals.