tempo.

Inspiration

Making music is a rewarding creative activity, whether pursued for education, as a passion, or as a hobby. However, many aspiring musicians struggle to find a place to start: education can be lacking, inspiration is hard to come by, and the creative process often needs a kickstart. This is where tempo. comes in. tempo lets people find musical inspiration in a fun, interactive way with their friends.

What it does

tempo. is a game that uses generative AI to let friends create music from text prompts and have the results judged. In a game of tempo, which supports 1-4 players, each player has a set amount of time to submit several short phrases describing the music they want to create. Each phrase is used as the input to an AI model that generates an audio clip, and the player’s overall track consists of all of their clips overlaid on each other. After every player finishes their track, the tracks are judged by Gemini AI, which takes the finished audio as input and outputs a review of each person’s final track.
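As a rough illustration of what "overlaid on each other" can mean in practice, the sketch below mixes several clips by summing their samples. It assumes each clip has already been decoded to mono PCM samples in the range [-1, 1] at the same sample rate; the `overlayClips` name and the hard clamp are illustrative choices, not necessarily how tempo mixes audio.

```typescript
// Hypothetical helper (not from the tempo codebase): overlay mono PCM clips
// by summing them sample by sample. Assumes all clips share one sample rate.
function overlayClips(clips: Float32Array[]): Float32Array {
  // The mixed track is as long as the longest clip.
  const length = clips.reduce((max, clip) => Math.max(max, clip.length), 0);
  const mix = new Float32Array(length); // zero-initialized

  for (const clip of clips) {
    for (let i = 0; i < clip.length; i++) {
      mix[i] += clip[i];
    }
  }

  // Summing can push samples outside [-1, 1]; clamp to avoid wrap-around
  // distortion when the result is later converted to integer PCM.
  for (let i = 0; i < length; i++) {
    mix[i] = Math.max(-1, Math.min(1, mix[i]));
  }
  return mix;
}
```

Hard clamping is the simplest option; scaling each clip down before summing would preserve more of the dynamics at the cost of a quieter result.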

tempo opens up music creation to people who may not have traditional musical training or access to music software, providing a platform for creative expression. tempo lets players explore different musical genres and soundscapes, offering a diverse range of possibilities in each game. Feedback from Gemini AI can also serve as another starting point for musical inspiration.

By turning music creation into an interactive, fun, and easy game, tempo not only acts as an accessible starting point for generating ideas for music, but also redefines how we engage with digital experiences and with others in an increasingly online world. tempo offers a unique and engaging way to bond with friends through collaborative music creation. Players can challenge each other with their musical ideas and experiment with different sounds, resulting in a shared creative experience. As each player contributes their short phrases to generate audio clips, everyone gets the chance to listen to and appreciate the artistic perspectives of their friends.

How we built it

tempo was built by a diverse group of hackers and developers. We began by brainstorming ideas around things we use daily; among them, interactive games and music were the most common. We then sketched out a user flow and a basic design.

We opted for a modern technology stack for this project. Our front end and UI were built with Next.js, React, and CSS. The backend, server, and database used Express and MongoDB. To drive the game logic, we implemented real-time communication with WebSockets. Finally, our generative AI features came from the Google Gemini model API and Hugging Face’s API for Meta’s MusicGen model.

Figure: Our tech stack, showing how the different technologies were linked together.

Challenges we ran into

We ran into plenty of challenges during development. As we iterated on an initial MVP, we struggled to pin down the user flow and game logic once implementation began. We pivoted a few times, based on what we felt was most intuitive and would result in the most fun gameplay experience.

On the technical side, we ran into implementation issues. Working with the Hugging Face API was more difficult than we initially expected; we had to explore using the transformers.js library instead in order to fully utilize it. It was simple to understand how we wanted sockets to work, but implementing them for a specific room of users was a challenge. We had to deal with accidentally initializing socket channels inside existing socket channels, synchronizing all users on common events, and broadcasting messages effectively.
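The room-scoped broadcasting problem described above can be sketched without any particular socket library. The example below is an assumption-laden illustration: `RoomHub`, `join`, `leave`, and `broadcast` are hypothetical names, and each connection is represented only by a `send` callback. Keying members by player ID means a repeated join replaces the earlier registration instead of stacking a second one, which is one way to avoid the duplicate-channel issue.

```typescript
// Hypothetical sketch, not tempo's actual code: a library-agnostic room registry.
// `Send` stands in for whatever a real socket connection uses to push a message.
type Send = (message: string) => void;

class RoomHub {
  // roomId -> (playerId -> send callback)
  private rooms = new Map<string, Map<string, Send>>();

  join(roomId: string, playerId: string, send: Send): void {
    let members = this.rooms.get(roomId);
    if (!members) {
      members = new Map<string, Send>();
      this.rooms.set(roomId, members);
    }
    // Keyed by player, so joining twice overwrites rather than duplicates.
    members.set(playerId, send);
  }

  leave(roomId: string, playerId: string): void {
    const members = this.rooms.get(roomId);
    if (!members) return;
    members.delete(playerId);
    if (members.size === 0) this.rooms.delete(roomId); // drop empty rooms
  }

  // Sends one event to every member of a single room; returns the recipient count.
  broadcast(roomId: string, event: string, payload: unknown): number {
    const members = this.rooms.get(roomId);
    if (!members) return 0;
    const message = JSON.stringify({ event, payload });
    members.forEach((send) => {
      send(message);
    });
    return members.size;
  }
}
```

With a registry like this, a common event such as a round starting can be pushed to one room with a single `broadcast` call, and players in other rooms never see it.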

Accomplishments that we're proud of

We're proud of the fact that we have a working prototype despite all the roadblocks we hit throughout the project! We can generate music clips and overlay them on top of each other to produce a completed track for each user. We can also view each user's clips at the end of the game.

What we learned

We used a diverse lineup of technologies; as a result, we learned quite a bit. The newer hackers on our team learned a lot about developing quickly and iterating. Sockets were a new technology choice; for those working on the backend, it was interesting to learn how to integrate real-time, bidirectional communication with traditional REST endpoints. Since this was an audio-file-based project, we also got to experiment with various ways of storing the data. We learned about blobs (Binary Large OBjects) and various ways of interacting with them on the frontend and storing them in MongoDB Atlas on the backend.
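One constraint worth keeping in mind when storing audio in MongoDB is that a single BSON document is limited to 16 MiB, and GridFS is the documented option for larger files. The sketch below shows that decision as a small helper; `chooseStorage` and the headroom value are hypothetical, and the writeup does not say which approach tempo actually took.

```typescript
// MongoDB caps a single BSON document at 16 MiB.
const MAX_BSON_DOCUMENT_BYTES = 16 * 1024 * 1024;
// Hypothetical allowance for the non-audio fields (player ID, prompt text, etc.).
const METADATA_HEADROOM_BYTES = 64 * 1024;

type StorageStrategy = "inline" | "gridfs";

// Hypothetical helper: decide whether an audio blob can live inside its own
// document as binary data or should be stored through GridFS instead.
function chooseStorage(audioByteLength: number): StorageStrategy {
  if (!isFinite(audioByteLength) || audioByteLength < 0) {
    throw new RangeError("audioByteLength must be a non-negative number");
  }
  const inlineLimit = MAX_BSON_DOCUMENT_BYTES - METADATA_HEADROOM_BYTES;
  return audioByteLength <= inlineLimit ? "inline" : "gridfs";
}
```

Short generated clips will typically fall well under the limit, so inline storage keeps reads simple; the check mainly guards against long or uncompressed tracks.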

We also learned the value of design and of establishing a strong user flow early on. Throughout the development process, we found ourselves catching small issues and edge cases, which slowed us down and forced us to reconsider and refactor code.

What's next for tempo

We want to make tempo even more interactive by allowing players to add to and modify each other’s audio clips before the final tracks are revealed. This would make for a more collaborative and engaging experience, with an element of surprise in how each track turns out. Additionally, we want to add more ways to customize the generated tracks. With more powerful music generation models, further tuning, or perhaps more robust hardware, we could make the music generation experience more diverse.

jasmine-dragons / tempo