Stable diffusion for real-time music generation
Riffusion is an open-source AI model that composes music by visualizing it as spectrograms. It fine-tunes the Stable Diffusion v1.5 model on spectrogram images paired with text, so generating music becomes generating an image: the model produces a spectrogram from a text prompt, which is then converted back into audio.
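The spectrogram-to-audio step is the interesting part: a spectrogram image stores only frequency magnitudes, not phase, so the waveform has to be reconstructed with an iterative phase-estimation algorithm such as Griffin-Lim. Here is a minimal sketch using NumPy and SciPy; the STFT parameters are illustrative, not Riffusion's actual settings:

```python
import numpy as np
from scipy.signal import stft, istft

def griffin_lim(magnitude: np.ndarray, n_iter: int = 32,
                fs: int = 22050, nperseg: int = 512, noverlap: int = 384) -> np.ndarray:
    """Recover a waveform from an STFT magnitude by iterative phase estimation.

    Parameter values here are placeholders chosen for illustration.
    """
    rng = np.random.default_rng(0)
    # Start from a random phase estimate, then alternate between the
    # time domain and the frequency domain to make it self-consistent.
    phase = np.exp(2j * np.pi * rng.random(magnitude.shape))
    for _ in range(n_iter):
        _, audio = istft(magnitude * phase, fs=fs, nperseg=nperseg, noverlap=noverlap)
        _, _, spec = stft(audio, fs=fs, nperseg=nperseg, noverlap=noverlap)
        # Keep the estimated phase; discard the drifting magnitude.
        phase = np.exp(1j * np.angle(spec))
    _, audio = istft(magnitude * phase, fs=fs, nperseg=nperseg, noverlap=noverlap)
    return audio
```

In practice the pixel values of the generated image are first mapped back to linear magnitudes (Riffusion stores them on a log scale), and stereo or longer clips are produced by stitching multiple spectrogram tiles together.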
The app is built with Next.js, React, TypeScript, three.js, Tailwind, and Vercel.
The app communicates with a GPU server over an API to run inference. We used Truss to package the model and test it locally before deploying it to Baseten, which provided GPU-backed inference, autoscaling, and observability. In production we ran on NVIDIA A10G GPUs.
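From the client's side, an inference request is a single HTTP POST carrying the prompt and generation parameters, with the generated spectrogram/audio returned in the JSON response. The endpoint URL and payload schema below are hypothetical placeholders for illustration, not Baseten's or Riffusion's actual API:

```python
import json
from urllib import request

# Placeholder URL -- a real deployment would use its Baseten model endpoint.
API_URL = "https://example.com/model/predict"

def build_payload(prompt: str, seed: int = 42, denoising: float = 0.75) -> bytes:
    """Serialize generation parameters as JSON (hypothetical schema)."""
    return json.dumps({
        "prompt": prompt,        # text description of the desired audio
        "seed": seed,            # fixes the diffusion noise for reproducibility
        "denoising": denoising,  # how strongly to steer toward the prompt
    }).encode("utf-8")

def generate(prompt: str) -> dict:
    """POST the payload to the inference server and decode the JSON reply."""
    req = request.Request(API_URL, data=build_payload(prompt),
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.loads(resp.read())
```

Keeping the request stateless like this is what lets the platform autoscale: any GPU replica can serve any request.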
If you have a GPU powerful enough to generate Stable Diffusion results in under five seconds, you can run the experience locally using our test Flask server.