How to Implement Voice Commands in Your React App
Controlling embedded videos using voice commands
How do voice commands work? Can you use them to control a video? How can you implement voice commands as a feature in your React app? These are questions I asked myself while building an app with embedded video. I wanted to be able to control the video hands-free using my voice.
I ran into a common problem: I was not finding resources that walked through the steps required to set it up for my specific use case. I did not find a site that had voice command built in to control content, and I did not find many blogs walking through the process of setting up this kind of feature.
With that said, now that I have found a solution, here is my walkthrough on how to get a voice command feature set up to control video in your React app.
1. Integrate the Web Speech API
For voice recognition, I used react-speech-recognition, a React hook library that uses the Web Speech API behind the scenes. This gave me quick access to the computer’s microphone so I could track what a user is saying.
To install it, use the following command in your terminal:
npm install --save react-speech-recognition
Then at the top of your React file, import it using the following code:
import SpeechRecognition, { useSpeechRecognition } from 'react-speech-recognition'
My app is a recipe and cooking site where I have an embedded walkthrough video for each recipe. I wanted to control that video using voice commands, so I imported this speech recognition functionality into my Recipe Detail Page component. For your project, import it directly into the component that will use the speech recognition functionality.
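Because the library relies on the Web Speech API, which not every browser implements, it can help to check for support before showing any voice controls. The following is a minimal sketch of such a check in plain JavaScript. The function names and the `globalObject` parameter are hypothetical and are only there so the check can be run outside a browser.

```javascript
// Returns the browser's speech recognition constructor, or null if it is missing.
// `globalObject` is a hypothetical parameter (defaults to globalThis) so the
// check can also be exercised with a stand-in object.
function getSpeechRecognitionConstructor(globalObject = globalThis) {
  return (
    globalObject.SpeechRecognition ||
    globalObject.webkitSpeechRecognition || // prefixed name used by Chromium-based browsers
    null
  );
}

// True when some form of the speech recognition interface is available.
function supportsVoiceCommands(globalObject = globalThis) {
  return getSpeechRecognitionConstructor(globalObject) !== null;
}
```

The library also reports browser support itself through the value returned by `useSpeechRecognition`, so check its README before writing your own detection.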
2. Find and Save Video Element
The next step is to target the video that you want to control. In my app, I accessed the video file through my Rails API, and I used Cloudinary to manage video upload and storage. I stored the recipe object in my component’s state.
Using state, I was able to create a video element within the return statement of my component, using the video from state as the source.
Using the element’s ID, video, I created a variable:
let video = document.getElementById('video')
Now that we have a variable accessing the video element, we are ready to set up the commands.
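One thing to watch for: `document.getElementById('video')` returns null if it runs before the video element has been rendered. A minimal sketch of one way around this is to look the element up at the moment a command fires instead of when the component renders. The helper names `getVideo` and `withVideo`, and the `doc` parameter, are hypothetical.

```javascript
// Looks up the video element at call time rather than at render time.
// `doc` is a hypothetical parameter (defaults to the browser's document).
function getVideo(doc = globalThis.document) {
  if (!doc) return null; // not running in a browser
  return doc.getElementById('video');
}

// Runs `action` on the video only if the element exists.
// Returns true when the action ran and false when there was no video yet.
function withVideo(action, doc = globalThis.document) {
  const video = getVideo(doc);
  if (video === null) return false;
  action(video);
  return true;
}
```

In a React component, a ref created with `useRef` and attached to the video element is a common alternative to looking the element up by ID.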
3. Set Up Voice Commands
HTML video elements expose built-in controls for standard actions such as play, pause, seeking, and muting. In doing research, I found a sample site from W3 that had set up all the video commands and controlled a video using buttons. Upon inspecting each button, I found the calls that control the video, which I was able to apply in my app.
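As a rough illustration, the controls can be wrapped in small functions like the ones below. This is a sketch that assumes a standard HTML video element, which provides `play()`, `pause()`, `currentTime`, `duration`, and `muted`. The function names and the ten-second default are my own choices for this example.

```javascript
const SKIP_SECONDS = 10; // assumed skip size, matching the ten-second jumps described here

function play(video) {
  return video.play(); // in modern browsers play() returns a Promise
}

function pause(video) {
  video.pause();
}

function rewind(video, seconds = SKIP_SECONDS) {
  // Never seek before the start of the video
  video.currentTime = Math.max(0, video.currentTime - seconds);
}

function fastForward(video, seconds = SKIP_SECONDS) {
  const target = video.currentTime + seconds;
  // duration is NaN until metadata has loaded, so only clamp when it is known
  video.currentTime = Number.isFinite(video.duration)
    ? Math.min(video.duration, target)
    : target;
}

function mute(video) {
  video.muted = true;
}

function unmute(video) {
  video.muted = false;
}
```

Each function takes the video element as an argument, so it can be called with the `video` variable created in the previous step.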
First, I set up an array of commands. I decided I wanted play, pause, rewind ten seconds, fast-forward ten seconds, mute, and unmute. I also included a clear command for the voice command transcript, but that was more for me to use during testing and implementation. Each command pairs the words to listen for with the callback function that actually controls the embedded video.
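A commands array along these lines might look like the sketch below. It follows the `{ command, callback }` shape that react-speech-recognition documents, but the exact spoken phrases and the `buildCommands` and `getVideo` names are assumptions made for illustration, not the original code.

```javascript
// Builds the commands array. `getVideo` is a hypothetical function that
// returns the video element, or null if it is not on the page yet.
function buildCommands(getVideo) {
  // Wraps an action so it only runs when the video element exists
  const run = (action) => () => {
    const video = getVideo();
    if (video) action(video);
  };
  // Moves playback by `seconds` (negative to rewind), never before 0
  const skip = (seconds) =>
    run((video) => {
      video.currentTime = Math.max(0, video.currentTime + seconds);
    });

  return [
    { command: 'play', callback: run((video) => video.play()) },
    { command: 'pause', callback: run((video) => video.pause()) },
    { command: 'rewind', callback: skip(-10) },
    { command: 'fast forward', callback: skip(10) },
    { command: 'mute', callback: run((video) => { video.muted = true; }) },
    { command: 'unmute', callback: run((video) => { video.muted = false; }) },
    // The library passes an object containing resetTranscript to each callback
    { command: 'clear', callback: ({ resetTranscript }) => resetTranscript() },
  ];
}
```

Inside the component, the array would be passed to the hook, for example `useSpeechRecognition({ commands: buildCommands(() => document.getElementById('video')) })`.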
I then set up two callback functions (one called listen and one called stopListening) to use for toggling voice recognition on and off. When turned on, SpeechRecognition will pick up the transcript from the computer’s microphone and listen for the specific commands you set up. When turned off, SpeechRecognition will stop listening and will no longer track the transcript.
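The two callbacks can be sketched roughly as follows. `startListening` and `stopListening` are the methods react-speech-recognition exposes on its default export. The `createVoiceControls` wrapper and its parameters are hypothetical, and passing `continuous: true` is an assumption so the microphone stays on between commands.

```javascript
// `speechRecognition` stands in for the library's default export and
// `setVoiceOn` for a state setter; both are passed in so the sketch
// does not depend on React or a browser.
function createVoiceControls(speechRecognition, setVoiceOn) {
  const listen = () => {
    setVoiceOn(true);
    // continuous: true keeps listening after the user pauses (assumed setting)
    return speechRecognition.startListening({ continuous: true });
  };

  const stopListening = () => {
    setVoiceOn(false);
    return speechRecognition.stopListening();
  };

  return { listen, stopListening };
}
```

In the component this would be called with the imported `SpeechRecognition` object and the setter from `useState`.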
Within these two functions, I am also calling the following:
setVoiceOn(true)
And:
setVoiceOn(false)
Along with the recipe object, my recipe detail page keeps a piece of state for the voice command:
const [voiceOn, setVoiceOn] = useState(false);
In my return statement, I set up a toggle for which button to show. When voiceOn is false, a button appears in the app that allows you to “Turn On Voice Command.” When voiceOn is true, the button changes to allow you to “Turn Off Voice Command.” This is where the listen and stopListening functions are used to turn the microphone on and off.
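The toggle logic itself is small. Here is a minimal sketch of it as a plain function, with `voiceButton` as a hypothetical helper name. It picks the label and click handler based on voiceOn.

```javascript
// Chooses which button to show based on the voiceOn state.
// `listen` and `stopListening` are the two callbacks described in this step.
function voiceButton(voiceOn, listen, stopListening) {
  return voiceOn
    ? { label: 'Turn Off Voice Command', onClick: stopListening }
    : { label: 'Turn On Voice Command', onClick: listen };
}
```

In JSX, the result could be rendered as a single button that uses `onClick` as its click handler and `label` as its text.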
The npm guide for react-speech-recognition also includes a sample implementation using commands, so you can reference that as well.
And that’s it!
Once you’ve set up these commands associated with your video, you can run your app and test it out! Feel free to view my demo to see the voice command in action (skip to 2:53):