Param-Harrison / audioinsight

AudioInsight processes audio, transcribes it, summarizes it, generates a title for the content, and allows users to ask questions about the related audio.

Home Page: https://audioinsight.gabrielsena.dev/

AudioInsight - Cloudflare AI Challenge Entry

This is an entry for the Cloudflare AI Challenge.

Live on: https://audioinsight.gabrielsena.dev/

How It Works

  1. On the application's homepage, the user uploads an audio file.
  2. The Whisper model transcribes the audio into text.
  3. The neural-chat-7b-v3-1-awq model generates a title from the transcribed content.
  4. The bart-large-cnn model summarizes the content.
  5. The user can then ask questions about the audio, and the neural-chat-7b-v3-1-awq model answers them.
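
The steps above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the project's actual code: the `AiRunner` interface is a stand-in for the Workers AI binding, the full model identifiers are assumed from the model names in the list, and the prompt wording and the `processAudio` name are hypothetical.

```typescript
// Minimal stand-in for the Workers AI binding; only `run` is used here.
interface AiRunner {
  run(model: string, inputs: Record<string, unknown>): Promise<Record<string, unknown>>;
}

interface AudioInsight {
  transcript: string;
  title: string;
  summary: string;
}

// Assumed full model identifiers, derived from the model names listed above.
const WHISPER = "@cf/openai/whisper";
const SUMMARIZER = "@cf/facebook/bart-large-cnn";
const CHAT = "@hf/thebloke/neural-chat-7b-v3-1-awq";

async function processAudio(ai: AiRunner, audio: Uint8Array): Promise<AudioInsight> {
  // Step 2: transcribe the uploaded audio into text.
  const transcription = await ai.run(WHISPER, { audio: Array.from(audio) });
  const transcript = String(transcription.text ?? "");

  // Step 3: ask the chat model for a short title (hypothetical prompt).
  const titled = await ai.run(CHAT, {
    messages: [
      { role: "system", content: "Write a short title for the following transcript." },
      { role: "user", content: transcript },
    ],
  });

  // Step 4: summarize the transcript.
  const summarized = await ai.run(SUMMARIZER, { input_text: transcript });

  return {
    transcript,
    title: String(titled.response ?? "").trim(),
    summary: String(summarized.summary ?? "").trim(),
  };
}
```

In a deployed application, the `ai` argument would be the AI binding exposed to the server-side code, and the result would be saved before the chat view is shown.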

Under the Hood

  • Cloudflare D1 stores each chat and its message history.
  • Cloudflare R2 stores the audio file for each chat.
  • Cloudflare Pages hosts the entire Next.js application, which provides both the front end and the back end.
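
As a rough illustration of how the two storage services could work together, the sketch below writes an audio file to R2 and records the chat in D1. The `DB` and `R2` binding names come from the wrangler.toml shown in the install steps; the `saveChat` helper, the `chats` table, its columns, and the key layout are hypothetical, since the real schema lives in src/database/schema.sql.

```typescript
// Minimal stand-ins for the parts of the D1 and R2 bindings used below.
interface D1Like {
  prepare(sql: string): { bind(...values: unknown[]): { run(): Promise<unknown> } };
}
interface R2Like {
  put(key: string, value: Uint8Array): Promise<unknown>;
}
interface StorageEnv {
  DB: D1Like; // binding name from wrangler.toml
  R2: R2Like; // binding name from wrangler.toml
}

async function saveChat(
  env: StorageEnv,
  chatId: string,
  title: string,
  audio: Uint8Array,
): Promise<string> {
  // Hypothetical key layout: one object per chat.
  const audioKey = `audios/${chatId}`;
  await env.R2.put(audioKey, audio);

  // Hypothetical table and columns; adapt to the real schema.
  await env.DB.prepare("INSERT INTO chats (id, title, audio_key) VALUES (?, ?, ?)")
    .bind(chatId, title, audioKey)
    .run();

  return audioKey;
}
```

Writing the audio first means a chat row never points at an object that was not uploaded; whether the project orders the writes this way is not stated in the README.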

Additional Features

  • Preserved conversations: Your chat and audio are stored remotely, so you can return later and continue the conversation about the audio.
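
One way a stored conversation could be resumed is to rebuild the model input from the saved transcript and earlier messages. The sketch below is an assumption about how that might look, not the project's implementation; `buildQuestionMessages`, the system prompt, and the `maxHistory` cap are all hypothetical.

```typescript
type Role = "system" | "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

// Builds the message list for a follow-up question from a stored
// transcript and the chat history loaded from the database.
function buildQuestionMessages(
  transcript: string,
  history: ChatMessage[],
  question: string,
  maxHistory = 10,
): ChatMessage[] {
  const system: ChatMessage = {
    role: "system",
    content: `Answer questions using only this transcript:\n${transcript}`,
  };
  // Keep only the most recent messages so the prompt stays bounded.
  const recent = maxHistory > 0 ? history.slice(-maxHistory) : [];
  const ask: ChatMessage = { role: "user", content: question };
  return [system, ...recent, ask];
}
```

The resulting array has the shape that chat-style models generally expect as a `messages` input, with the newest question last.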

How to Install

  1. Clone this repository:
       git clone git@github.com:gabrielsenadev/audioinsight.git
  2. Install dependencies:
       npm ci
  3. Create the D1 database:
       npx wrangler d1 create db-d1-audioinsight
  4. Apply the schema to the database:
       npx wrangler d1 execute db-d1-audioinsight --remote --file=./src/database/schema.sql
  5. Create the R2 bucket:
       npx wrangler r2 bucket create r2-audios
  6. Update wrangler.toml so it points at the database and bucket you just created. Replace database_id with the ID printed by the d1 create command in step 3:
       [[d1_databases]]
       binding = "DB"
       database_name = "db-d1-audioinsight"
       database_id = "d485c019-8021-4d08-88e6-e5a6ea66ad4e"

       [[r2_buckets]]
       binding = 'R2'
       bucket_name = 'r2-audios'
  7. Run the preview:
       npm run preview
  8. Deploy the application:
       npm run deploy
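
A misconfigured wrangler.toml usually shows up as a binding that is undefined at runtime. The helper below is a small hypothetical check, not part of the project, that reports which of the bindings named in the configuration above are absent from the environment object.

```typescript
// Binding names taken from the wrangler.toml fragment above.
const REQUIRED_BINDINGS = ["DB", "R2"] as const;

// Returns the names of required bindings that are not present on `env`.
function missingBindings(env: Record<string, unknown>): string[] {
  return REQUIRED_BINDINGS.filter(
    (name) => env[name] === undefined || env[name] === null,
  );
}
```

Calling it at the start of a request handler and failing when the result is non-empty gives a clearer error than a later failure on an undefined binding.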

Audio Examples

The examples/ directory contains sample audio files you can use to try the application.

Screenshots

Homepage

[Screenshot: homepage]

Chat

[Screenshot: chat]

License: MIT License


Languages

TypeScript 96.7%, JavaScript 2.6%, CSS 0.7%