


Above: some good, awkward, and not-so-good images generated by our AI image generator
Hack The Planet leverages generative AI for our live Twitch stream. The chat has secret AI commands; any command that isn’t already registered in our main Twitch bot is sent to an LLM chatbot running on a separate computer. More overtly, the video stream features images generated in real time from the Artist and Track Title currently playing.
This documentation highlights two parts of that stream-integration ecosystem: the LLM chatbot and the AI image generation.
The main chatbot, which ties together all the fun features of our stream, controls:
- chat commands, like !cheers, !newdjname, and !mugtion
- LED light colors and patterns in the studio
- the dancing robot
- our text adventure
- the scavenger hunt
- various hacking-related games
- tons of easter eggs
- LLM chat (as mentioned above)
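Since any unregistered command gets handed off to the LLM machine, the dispatch logic boils down to something like this minimal sketch. The command set, server address, and helper names here are hypothetical, not the actual bot code:

```python
# Rough sketch of the command dispatch idea -- all names are hypothetical.
import requests

REGISTERED_COMMANDS = {"!cheers", "!newdjname", "!mugtion"}
LLM_SERVER_URL = "http://10.0.0.50:5000/chat"  # hypothetical address of the LLM box

def run_local_command(command: str, user: str, message: str) -> str:
    """Placeholder for the bot's built-in command handlers."""
    return f"{user} triggered {command}"

def handle_chat_message(user: str, message: str) -> str:
    """Known commands run locally; anything else that looks like a
    command is forwarded to the LLM chatbot on the other computer."""
    command = message.split()[0].lower()
    if command in REGISTERED_COMMANDS:
        return run_local_command(command, user, message)
    reply = requests.post(
        LLM_SERVER_URL, json={"user": user, "prompt": message}, timeout=30
    )
    return reply.json()["reply"]
```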
The AI image generation is handled by a separate program outside the chatbot. It reads the turntable data to find out which song is playing, then sends that info to another computer running the image generation model. Currently, that model is based on Stability AI’s models; we’ve messed around with SDXL, SD3 Medium, and others. The examples documented here focus on Stability AI’s SD3 Medium.
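For reference, here’s a minimal sketch of driving SD3 Medium with Hugging Face’s diffusers library. The actual server code in the repo may differ, and the prompt format is just an assumption:

```python
import torch
from diffusers import StableDiffusion3Pipeline

# Load SD3 Medium (requires accepting the model license on Hugging Face).
pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers",
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")

# These values would come from the turntables; the prompt wording is made up.
artist, title = "Hypothetical Artist", "Hypothetical Track"
prompt = f"album art inspired by the song '{title}' by {artist}"

image = pipe(
    prompt=prompt,
    num_inference_steps=28,
    guidance_scale=7.0,
).images[0]
image.save("now_playing.png")
```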
The client program reads the song data, then sends it to another computer with a dedicated GPU. The client lives on the streaming computer; we found that the stream suffered when client and server ran on the same machine.
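The hand-off itself can be as simple as a small JSON payload over the network. A rough sketch of the client side, assuming a made-up address and payload shape:

```python
# Hypothetical client-side sketch: ship the currently playing track
# to the GPU machine. The address and payload shape are assumptions.
import json
import socket

GPU_SERVER = ("10.0.0.60", 9999)  # hypothetical address of the GPU box

def send_now_playing(artist: str, title: str) -> None:
    payload = json.dumps({"artist": artist, "title": title}).encode("utf-8")
    with socket.create_connection(GPU_SERVER, timeout=5) as sock:
        sock.sendall(payload)

send_now_playing("Hypothetical Artist", "Hypothetical Track")
```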
With the parameters in the code, it takes about 30 seconds to generate an image on a relatively modest computer with an RTX 3060 GPU. We’ve been happy with its output, but we also like it when things look wrong: too many fingers, odd body parts, or animals with the wrong appendages. 🙂
The code has two sets of clients and servers. The clients send the prompts and wait for a response from the AI model. The servers sit on computers with adequate specs to run the models. Again, we got about 30-second image generation times on a GPU, and about 10-second text responses from a relatively small CPU-based LLM. In this code we used Stability AI’s StableLM 2 Zephyr 1.6B model.
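For the text side, here’s a minimal sketch of running StableLM 2 Zephyr 1.6B on the CPU with Hugging Face’s transformers library; the actual server wrapper in the repo may differ:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "stabilityai/stablelm-2-zephyr-1_6b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)  # runs on CPU by default

def chat(prompt: str) -> str:
    # StableLM 2 Zephyr is a chat-tuned model, so use its chat template.
    messages = [{"role": "user", "content": prompt}]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    )
    with torch.no_grad():
        outputs = model.generate(
            inputs, max_new_tokens=128, do_sample=True, temperature=0.7
        )
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)

print(chat("Say hi to Twitch chat!"))
```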
The code can be found here: https://github.com/pfeiffer3000/Stability-AI-for-stream
