03 | Everything Is a Remix

How AIs generate images? #

Artificial intelligence has evolved in its ability to generate images, with significant developments moving from simpler algorithms to more complex systems like Generative Adversarial Networks (GANs). GANs involve two neural networks—the generator and the discriminator—competing to produce increasingly realistic images. The generator tries to create convincing images, while the discriminator evaluates them against real images, enhancing the generator’s capabilities through this adversarial process.

Prior to GANs, methods like variational autoencoders (VAEs) were also popular. VAEs are a type of autoencoder that provides a probabilistic manner for describing an observation in latent space. Unlike traditional autoencoders, VAEs do not just compress the data; instead, they learn the parameters of the probability distribution that the data represents.

Building on this, diffusion models represent a newer approach in image generation. These models start with a pattern of random noise and gradually learn to reverse this process to form coherent images. They transform the noise into detailed visuals by predicting and subtracting the noise at each step, effectively ‘denoising.’ This method allows for generating high-quality images with intricate details, offering advancements over GANs by producing more varied and less artifact-prone outputs. The process is controlled and iterative, making diffusion models a powerful tool for realistic and diverse image generation.

Diffusion Process #

We will continue with the following tutorial to understand the diffusion process and start generating images using it:

Stable Diffusion Tutorial

Comfy UI and Auto1111 #

We will explore the Comfy UI and Auto1111 tools to generate images and videos. These tools offer a user-friendly interface for creating visuals and animations, making them accessible to artists and creators without extensive technical knowledge.

Workflows #

We will discuss different workflows for generating images and videos using ComfyUI, and how we can modify these workflows to fit our needs.

Experimentation #

The last few hours of the day will be dedicated to experimenting with the tools and workflows discussed, allowing students to create their own images and and perhaps videos too.

Training your own model #

We can customize the models to generate images that are more aligned with our artistic vision. This process involves training the model on specific datasets, allowing us to control the output and style of the generated images. We can train a Stable Diffusion model using our own images using the following links: