


Above: some good, awkward, and not-so-good images generated by our AI image generator
Hack The Planet leverages generative AI for our live Twitch stream. The chat has secret AI commands; any command that isn’t already registered in our main Twitch bot is sent to an LLM chatbot running on a separate computer. More overtly, the video stream features images generated in real time from the Artist and Track Title currently playing.
This documentation highlights two parts of that stream-integration ecosystem: the LLM chatbot and the AI image generation.
The main chatbot, which ties together all the fun features of our stream, controls:
- chat commands, like !cheers, !newdjname, and !mugtion
- LED light colors and patterns in the studio
- the dancing robot
- our text adventure
- the scavenger hunt
- various hacking-related games
- tons of easter eggs
- LLM chat (as mentioned above)
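Since any unregistered command gets handed off to the LLM machine, the dispatch logic boils down to something like this minimal sketch. The command set, server address, and helper names here are hypothetical, not the actual bot code:

```python
# Rough sketch of the command dispatch idea -- all names are hypothetical.
import requests

REGISTERED_COMMANDS = {"!cheers", "!newdjname", "!mugtion"}
LLM_SERVER_URL = "http://10.0.0.50:5000/chat"  # hypothetical address of the LLM box

def run_local_command(command: str, user: str, message: str) -> str:
    """Placeholder for the bot's built-in command handlers."""
    return f"{user} triggered {command}"

def handle_chat_message(user: str, message: str) -> str:
    """Known commands run locally; anything else that looks like a
    command is forwarded to the LLM chatbot on the other computer."""
    command = message.split()[0].lower()
    if command in REGISTERED_COMMANDS:
        return run_local_command(command, user, message)
    reply = requests.post(
        LLM_SERVER_URL, json={"user": user, "prompt": message}, timeout=30
    )
    return reply.json()["reply"]
```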
The AI image generation is handled by a separate program outside the chatbot. It reads the turntable data to find out which song is playing, then sends that info to another computer running the image generation model. Currently, that model is based on Stability AI’s models; we’ve messed around with SDXL, SD3 Medium, and others. The examples documented here focus on Stability AI’s SD3 Medium.
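For reference, here’s a minimal sketch of driving SD3 Medium with Hugging Face’s diffusers library. The actual server code in the repo may differ, and the prompt format is just an assumption:

```python
import torch
from diffusers import StableDiffusion3Pipeline

# Load SD3 Medium (requires accepting the model license on Hugging Face).
pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers",
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")

# These values would come from the turntables; the prompt wording is made up.
artist, title = "Hypothetical Artist", "Hypothetical Track"
prompt = f"album art inspired by the song '{title}' by {artist}"

image = pipe(
    prompt=prompt,
    num_inference_steps=28,
    guidance_scale=7.0,
).images[0]
image.save("now_playing.png")
```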
The client program reads the song data, then sends it to another computer with a dedicated GPU. The client lives on the streaming computer; we found that the stream suffered when client and server ran on the same machine.
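The hand-off itself can be as simple as a small JSON payload over the network. A rough sketch of the client side, assuming a made-up address and payload shape:

```python
# Hypothetical client-side sketch: ship the currently playing track
# to the GPU machine. The address and payload shape are assumptions.
import json
import socket

GPU_SERVER = ("10.0.0.60", 9999)  # hypothetical address of the GPU box

def send_now_playing(artist: str, title: str) -> None:
    payload = json.dumps({"artist": artist, "title": title}).encode("utf-8")
    with socket.create_connection(GPU_SERVER, timeout=5) as sock:
        sock.sendall(payload)

send_now_playing("Hypothetical Artist", "Hypothetical Track")
```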
With the parameters in the code, it takes about 30 seconds to generate an image on a relatively modest computer with an RTX 3060 GPU. We’ve been happy with its output, but we also like it when things look wrong: too many fingers, odd body parts, or animals with the wrong appendages. 🙂
The code has two sets of clients and servers. The clients send the prompts and wait for a response from the AI model. The servers sit on computers with adequate specs to run the models. Again, we got about 30-second image generation times on a GPU, and about 10-second text responses from a relatively small CPU-based LLM. In this code we used Stability AI’s StableLM 2 Zephyr 1.6B model.
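For the text side, here’s a minimal sketch of running StableLM 2 Zephyr 1.6B on the CPU with Hugging Face’s transformers library; the actual server wrapper in the repo may differ:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "stabilityai/stablelm-2-zephyr-1_6b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)  # runs on CPU by default

def chat(prompt: str) -> str:
    # StableLM 2 Zephyr is a chat-tuned model, so use its chat template.
    messages = [{"role": "user", "content": prompt}]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    )
    with torch.no_grad():
        outputs = model.generate(
            inputs, max_new_tokens=128, do_sample=True, temperature=0.7
        )
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)

print(chat("Say hi to Twitch chat!"))
```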
The code can be found here: https://github.com/pfeiffer3000/Stability-AI-for-stream
