Ai Voice Assistant

OpenAI-powered AI-voice-assistant for easy web navigation, answering and speaking.

OpenAI’s API provides access to GPT-3, which performs a wide variety of natural language tasks, and Codex, which translates natural language to code. The API is designed to allow users to try it on virtually and do any task in the English language. Our project aims to create a voice assistant for the chatbot in order to hold conversations that take voice as input and gives audio responses. The contents of this project are as follows :

Openai Library : Installing and importing Openai library helps us to create the chatbot. The OpenAI Node.js library provides convenient access to the OpenAI API from Node.js applications. Most of the code in this library is generated from our OpenAPI specification. The OpenAI Python library provides convenient access to the OpenAI API from applications written in the Python language. It includes a pre-defined set of classes for API resources that initialize themselves dynamically from API responses which makes it compatible with a wide range of versions of the OpenAI API. This library additionally provides an openai command-line utility which makes it easy to interact with the API from your terminal. Run openai api -h for usage. The library needs to be configured with our account's secret key.
API secret key : API Keys and Secrets are the credentials required to use an API Hook. For a request to an API Hook to be authorised, both the X-API-Key and X-API-Secret headers must be provided. The values of the API Key and Secret represent the values of these headers respectfully.
Pyttsx3 : pyttsx3 is a text-to-speech conversion library in Python. Unlike alternative libraries, it works offline and is compatible with both Python 2 and 3. An application invokes the pyttsx3.init() factory function to get a reference to a pyttsx3. Engine instance is a very easy to use tool which converts the entered text into speech. It supports 3 tts engines : sapi5, nsss, espeak.
Speech Recognition : Speech recognition is a machine's ability to listen to spoken words and identify them. You can then use speech recognition in Python to convert the spoken words into text, make a query or give a reply. You can even program some devices to respond to these spoken words. You can do speech recognition in python with the help of computer programs that take in input from the microphone, process it, and convert it into a suitable form. To decode the speech into text, groups of vectors are matched to one or more phonemes—a fundamental unit of speech. This calculation requires training, since the sound of a phoneme varies from speaker to speaker, and even varies from one utterance to another by the same speaker. A special algorithm is then applied to determine the most likely word (or words) that produce the given sequence of phonemes. Recognizing speech requires audio input, and SpeechRecognition makes retrieving this input really easy. Instead of having to build scripts for accessing microphones and processing audio files from scratch, SpeechRecognition will have you up and running in just a few minutes. The SpeechRecognition library acts as a wrapper for several popular speech APIs and is thus extremely flexible. One of these—the Google Web Speech API—supports a default API key that is hard-coded into the SpeechRecognition library. That means you can get off your feet without having to sign up for a service. The flexibility and ease-of-use of the SpeechRecognition package make it an excellent choice for any Python project. However, support for every feature of each API it wraps is not guaranteed. You will need to spend some time researching the available options to find out if SpeechRecognition will work in your particular case.

Webbrowser : In Python, webbrowser module is a convenient web browser controller. It provides a high-level interface that allows displaying Web-based documents to users. webbrowser can also be used as a CLI tool. The webbrowser module is a convenient web browser controller in the Python programming language. This module offers a high-level interface that enables showing the documents based on the web. Under most circumstances, we can call the open() function from the webbrowser module to perform the right thing. Under the Unix operating system, the graphical browsers are preferred under X11; however, the text-mode browsers will be utilized if graphical browsers are unavailable or an X11 display is unavailable. If the text-mode browsers are utilized, the calling process will block until the user exits the browser. Whenever we give a command like “Open Google” or “Open Youtube” this module assists us in opening the browser.
Flask : Flask is a web framework, it’s a Python module that lets you develop web applications easily. It’s has a small and easy-to-extend core: it’s a micro framework that doesn’t include an ORM (Object Relational Manager) or such features. It does have many cool features like url routing, template engine. It is a WSGI web app framework. Flask provides you with tools, libraries and technologies that allow you to build a web application. This web application can be some web pages, a blog, a wiki or go as big as a web-based calendar application or a commercial website. Flask is part of the categories of the micro-framework. Micro-framework are normally framework with little to no dependencies to external libraries.

SG-Akshay10 / ai-voice-assistant

Ai Voice Assistant

About

Languages