grupoarrfug.com

Transform ChatGPT into a Voice-Activated Chatbot Using 60 Lines of Code


Introduction to Voice-Enabled Chatbots

I recently stumbled upon an article detailing a project that integrated Zapier with Alexa to utilize ChatGPT. While innovative, I realized a third-party platform wasn’t necessary for voice interactions with ChatGPT; Chrome offers everything needed natively.

While working on a voice-controlled product in 2017, I became quite familiar with the web speech recognition and synthesis APIs. By using these APIs, along with a few workarounds for driving ChatGPT’s interface, I was able to talk to the AI by voice.

In this guide, I'll show you how to transform ChatGPT into a voice-responsive chatbot using merely 60 lines of code. The good news is that anyone can easily replicate the script.

Steps to Create Your Chatbot

To construct this voice-enabled chatbot, follow these three straightforward steps:

  1. Launch the ChatGPT website in your Chrome browser.
  2. Open the developer console by pressing Ctrl+Shift+I (Cmd+Option+I on macOS) or by right-clicking on the page and selecting "Inspect". Then, switch to the "Console" tab.
  3. Paste the JavaScript code provided and start your conversation! You can converse naturally without needing specific keywords.

The JavaScript Code

First, create a new instance of SpeechRecognition with your desired settings. Set recognition.lang to the language you want to speak in; the Web Speech API documentation describes the remaining options.

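// Chrome exposes the constructor under the webkit prefix.
const SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition;
const recognition = new SpeechRecognition();
recognition.lang = "en-US"; // language to recognize
recognition.continuous = true; // keep listening across pauses
recognition.maxAlternatives = 1; // only the best transcription
recognition.interimResults = false; // deliver final results only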
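Not every browser exposes the recognition constructor, so it may be worth checking for it before the rest of the script runs. The following is a minimal sketch; getRecognitionCtor is a hypothetical helper name and is not part of the original script.

```javascript
// Hypothetical helper: returns the speech recognition constructor, or null
// when the given global scope exposes neither the standard nor the prefixed name.
function getRecognitionCtor(scope) {
  return scope.SpeechRecognition || scope.webkitSpeechRecognition || null;
}

// In the browser console, globalThis is the window object.
const RecognitionCtor = getRecognitionCtor(globalThis);
if (RecognitionCtor === null) {
  console.warn("Speech recognition is not available in this environment.");
}
```

If the warning appears, the rest of the script cannot work in that browser, and stopping early gives a clearer message than the error raised by calling new on an undefined value.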

The ChatGPT interface lacks descriptive class names or element IDs, so we locate the input field and the submit button by their position in the page structure instead.

const formTextarea = document.querySelector("main form textarea");
const formSubmit = document.querySelector("main form button");
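Because these selectors depend on the page's markup, they return null if the layout changes, and the script then fails later with a less obvious error. A small sketch of a lookup that fails immediately instead; requireElement is a hypothetical helper, not part of the original script.

```javascript
// Hypothetical helper: looks up a selector and fails loudly when nothing matches.
function requireElement(root, selector) {
  const element = root.querySelector(selector);
  if (!element) {
    throw new Error(`No element matches "${selector}"; the page markup may have changed.`);
  }
  return element;
}

// Possible use in the console (assumes the same selectors as the script):
//   const formTextarea = requireElement(document, "main form textarea");
//   const formSubmit = requireElement(document, "main form button");
```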

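To manage the conversation flow, we define three global variables. The isSpeaking boolean keeps the recognizer from transcribing the speech that the script itself is synthesizing, while intervalRcg and intervalUtr hold the IDs of the two polling timers so they can be cleared later.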

let isSpeaking = false;
let intervalRcg, intervalUtr;

Once the recognizer has transcribed what you said, the recognition.onresult handler is triggered. It stops the recognition process, fills the recognized speech into the ChatGPT input, and submits it.

recognition.onresult = (event) => {
  recognition.stop();
  fillAndsubmitForm(event);
  setTimeout(pollResultStatus, 1000);
  setTimeout(startVoiceSynth, 1000);
};

The fillAndsubmitForm() function populates the ChatGPT input with the transcribed speech and submits it.

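function fillAndsubmitForm(event) {
  // Take the best alternative of the first result in this session.
  const result = event.results[0][0].transcript.trim();
  // Saying just "stop" skips the submission instead of sending it as a prompt.
  if (result.toLowerCase() === "stop") return;
  formTextarea.value = result;
  formSubmit.click();
}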
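The function above reads index 0 because the script stops recognition after every result, so each session holds a single entry. If the recognizer were left running in continuous mode, the results list would grow, and the event's resultIndex property would point at the lowest entry that changed. A sketch of reading that entry instead; latestTranscript is a hypothetical helper, not part of the original script.

```javascript
// Hypothetical helper: returns the best transcript of the result that
// triggered this event, trimmed of surrounding whitespace.
function latestTranscript(event) {
  const result = event.results[event.resultIndex];
  return result[0].transcript.trim();
}

// Stand-in object shaped like a recognition event, for illustration only.
const sampleEvent = {
  resultIndex: 1,
  results: [[{ transcript: "hello" }], [{ transcript: " what is the weather " }]],
};
console.log(latestTranscript(sampleEvent)); // prints "what is the weather"
```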

The onresult callback sets timeouts to execute two functions after a second: pollResultStatus() and startVoiceSynth().

Because ChatGPT disables the submit button while it is responding, we can poll the button every 500 ms and restart recognition once it is enabled again and the bot has finished speaking.

function pollResultStatus() {
  intervalRcg = setInterval(() => {
    if (formSubmit.disabled || isSpeaking) return;
    recognition.start();
    clearInterval(intervalRcg);
  }, 500);
}
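The same poll-then-act pattern can be written once and reused. The sketch below is one way to do that under stated assumptions: pollUntil and its timers parameter are hypothetical and are not part of the original script. The timers parameter exists only so the timing functions can be swapped out.

```javascript
// Hypothetical helper: calls isReady every intervalMs milliseconds and, the
// first time it returns true, stops polling and calls onReady exactly once.
function pollUntil(isReady, onReady, intervalMs = 500, timers = globalThis) {
  const id = timers.setInterval(() => {
    if (!isReady()) return; // not ready yet, try again on the next tick
    timers.clearInterval(id);
    onReady();
  }, intervalMs);
  return id;
}

// Possible use in the console, mirroring pollResultStatus():
//   intervalRcg = pollUntil(
//     () => !formSubmit.disabled && !isSpeaking,
//     () => recognition.start()
//   );
```

Clearing the interval before calling onReady means an exception thrown by the callback cannot leave the timer running.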

The startVoiceSynth() function converts the AI’s responses into audible speech using the speechSynthesis API.

When invoked, it sets the isSpeaking variable to true, indicating the bot is speaking. An inner function, sayResult(), vocalizes the latest response.

function startVoiceSynth() {
  isSpeaking = true;
  // Runs every 500 ms with `this` bound to the streaming response element.
  function sayResult() {
    // No new text since the last tick: treat the response as finished.
    if (this.innerText === this.spokenText) {
      clearInterval(intervalUtr);
      isSpeaking = false;
      return;
    }
    // Speak only the text that arrived since the last tick.
    speechSynthesis.speak(
      new SpeechSynthesisUtterance(
        this.innerText.slice((this.spokenText || "").length)
      )
    );
    this.spokenText = this.innerText;
  }
  intervalUtr = setInterval(
    sayResult.bind(document.querySelector(".result-streaming")),
    500
  );
}

It would be less engaging if the entire response had to arrive before the bot began to speak. On every tick, sayResult() compares the text it has already spoken with the text received so far and synthesizes only the newly added part.
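The comparison boils down to slicing off the part of the response that has already been spoken. A standalone sketch of that step follows; unspokenSuffix is a hypothetical name, and the startsWith guard is an added assumption that the original script does not make.

```javascript
// Hypothetical helper: returns the part of fullText that has not been spoken yet.
function unspokenSuffix(fullText, spokenText = "") {
  // Assumption: if the text was replaced rather than extended, speak all of it.
  if (!fullText.startsWith(spokenText)) return fullText;
  return fullText.slice(spokenText.length);
}

console.log(unspokenSuffix("Hello there, friend", "Hello there")); // prints ", friend"
console.log(unspokenSuffix("Hello there")); // prints "Hello there"
```

The slice works on character counts, so if a tick lands while a word is still being streamed, that word may be split across two utterances.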

Finally, don’t forget to start the recognition process:

recognition.start();
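The script as written has no off switch other than reloading the page. One possible teardown is sketched below; stopVoiceChat is a hypothetical helper that is not part of the original script, and its dependencies are passed in as arguments rather than read from globals.

```javascript
// Hypothetical helper: stops the polling timers, silences any queued speech,
// and stops listening.
function stopVoiceChat({ recognition, synth, intervalIds, clear = clearInterval }) {
  intervalIds.forEach((id) => clear(id));
  synth.cancel(); // removes queued utterances and stops the current one
  recognition.abort(); // stops listening without delivering a further result
}

// Possible use in the console, with the script's own variables:
//   stopVoiceChat({
//     recognition,
//     synth: speechSynthesis,
//     intervalIds: [intervalRcg, intervalUtr],
//   });
```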

Conclusion

Once you paste the code above into the console, the script starts listening to your voice. When you finish speaking, it sends the transcription to ChatGPT, waits for the response, and reads it aloud as it arrives.

In summary, transforming ChatGPT into a voice-enabled chatbot is quite straightforward using the speech recognition and synthesis APIs built into Chrome. The script presented here is a basic framework that can be expanded for more advanced applications.

You can find the complete script on my blog for easy copying and pasting. Check it out here: walkthrough.ai/voice-enable-chat-gpt.

Enjoy your experience!
