While exploring the ml5.js library, I wanted to get to know its sound classifier by building a simple project. A while ago I created a JavaScript bubble popper, which creates a bubble when the spacebar is pressed and deletes a bubble when it is clicked. I decided to modify that project to use the ml5.js sound classifier, because it seemed like a relatively easy way to experiment with it.

The ml5.js sound classifier uses a pre-trained model that only recognizes these words: the ten digits from “zero” to “nine”, “up”, “down”, “left”, “right”, “go”, “stop”, “yes”, and “no”. I decided to use “go” and “stop” for bubble generation and popping, because they were the closest words in the set to creating and popping bubbles.
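As a rough illustration of how such a classifier might be started, here is a minimal sketch that assumes the ml5.js 0.x-style API, where `ml5.soundClassifier` takes a model name, an options object, and a ready callback. The function name `startListening`, the `ml5lib` parameter, and the `probabilityThreshold` value of 0.7 are assumptions made for this example, not details from the original project.

```javascript
// Sketch only: loads the speech-commands model and starts classifying.
// `ml5lib` would be the global `ml5` object in the browser; it is passed
// in as a parameter here (an assumption) so the function is easy to try out.
function startListening(ml5lib, onResult) {
  // Assumed option: ignore predictions below this confidence.
  const options = { probabilityThreshold: 0.7 };

  // 'SpeechCommands18w' is the 18-word speech-commands model name.
  const classifier = ml5lib.soundClassifier('SpeechCommands18w', options, () => {
    // Called once the model has loaded; onResult receives (error, results).
    classifier.classify(onResult);
  });

  return classifier;
}
```

In a page that loads ml5.js, this could be called as `startListening(ml5, myCallback)`, where `myCallback` is whatever function handles the results.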

HTML

To create the structure of the project, I included external files for p5.js, jQuery, ml5.js, Google Fonts, and Bootstrap.

Within the HTML body, I added some instructional text and a card that displays the words the end user is saying. I created two empty divs with ids of word and wordConfidence. These are used later to display the word the user said and the confidence level the ml5.js sound classifier returns for that word. I also added two audio tags to hold the two audio files that are played when a bubble is created or popped.


CSS

In the CSS file, I added a background image of a jellyfish, along with some font and hr styling. I created a bubble class, which is applied to the bubble element I generate in the JavaScript file. I gave the bubble class an animation, defined with keyframes in the CSS file, so that each bubble appears to grow when it is created.


JavaScript

In the JavaScript file, I created a getRandomPosition function. This function assigns an element a random x and y coordinate within the screen view. It does this by grabbing the user’s screen width and height and subtracting 100 from each. I then use the random math function to generate the x and y coordinates, with the screen width minus 100 as the highest possible x value and the screen height minus 100 as the highest possible y value.
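The idea behind getRandomPosition can be sketched roughly as follows. This is not the original code: the parameter names, the returned `{x, y}` object, and the injectable `random` argument are assumptions made so the example is self-contained.

```javascript
// Sketch: pick a random top-left position that stays inside the view.
// viewWidth / viewHeight would come from the browser (for example
// window.innerWidth and window.innerHeight); they are parameters here
// as an assumption, so the function does not depend on a page.
function getRandomPosition(viewWidth, viewHeight, random = Math.random) {
  const margin = 100; // the "minus 100" described above

  // Highest values we allow, never below zero on very small screens.
  const maxX = Math.max(0, viewWidth - margin);
  const maxY = Math.max(0, viewHeight - margin);

  return {
    x: Math.floor(random() * maxX),
    y: Math.floor(random() * maxY),
  };
}
```

Because `Math.random` returns a value from 0 up to but not including 1, x always stays below the width minus 100 and y below the height minus 100.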

I then created a goResults function. If there is an error, the function returns an error message. If there is no error, the results arrive as an array, which I named results. I grab the first element in the array because that is the result with the highest confidence level. I then take that result’s label and confidence level and set them as the inner text of the two empty divs I created earlier, with ids of word and wordConfidence.
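A minimal sketch of the result handling described above might look like this. It assumes the classifier calls back with `(error, results)`, where `results` is an array of `{ label, confidence }` objects sorted by confidence. The helper name `topResult` and its return shape are hypothetical and chosen only for this example.

```javascript
// Sketch: reduce a classifier callback's arguments to the top prediction.
// Returns { error } on failure, null when there is nothing to show,
// or { label, confidence } for the highest-confidence result.
function topResult(error, results) {
  if (error) {
    return { error: String(error) };
  }
  if (!results || results.length === 0) {
    return null;
  }
  // The first entry is assumed to be the most confident prediction.
  const { label, confidence } = results[0];
  return { label, confidence };
}
```

In the page, the returned values could then be written into the two divs, for example with `document.getElementById('word').innerText = top.label` and the same pattern for `wordConfidence`.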

To generate a bubble when the user says “go”, I added an if statement that checks whether the result label is equal to go. If it is, I play my bubble wave sound file. I then create a div, assign the class bubble to it, and give it a random position using the getRandomPosition function. Finally, I append the bubble to the body of the document.
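The “go” branch could be sketched along these lines. The function names `createBubble` and `onGo`, the `doc` and `position` parameters, and the inline positioning styles are assumptions for illustration. Audio playback is left out to keep the sketch short.

```javascript
// Sketch: create a div with the "bubble" class at a given position and
// append it to the page. `doc` is the page's document and `position` is
// an {x, y} pair in pixels (both are assumptions of this example).
function createBubble(doc, position) {
  const bubble = doc.createElement('div');
  bubble.className = 'bubble';
  bubble.style.position = 'absolute';
  bubble.style.left = position.x + 'px';
  bubble.style.top = position.y + 'px';
  doc.body.appendChild(bubble);
  return bubble;
}

// Only the label "go" creates a bubble; any other label does nothing.
function onGo(label, doc, position) {
  if (label !== 'go') {
    return null;
  }
  return createBubble(doc, position);
}
```

In a browser this might be called as `onGo(label, document, { x: 120, y: 80 })`, with the position coming from whatever random-position helper the page uses.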

To delete a bubble when the user says “stop”, I added an if statement that checks whether the result label is equal to stop. If it is, I grab all the elements with the class name bubble and store them in a variable. I then find the last bubble. If there is more than one bubble in the document, I play the popping audio and remove the last bubble from the HTML.
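The “stop” branch could be sketched as below. Note one deliberate difference: this sketch removes a bubble whenever at least one exists, which is an assumption and may not match the exact condition in the original code. The function name `popLastBubble` and the `doc` parameter are also hypothetical, and audio playback is omitted.

```javascript
// Sketch: remove the most recently added element with class "bubble".
// Returns true if a bubble was removed, false if there was none.
// `doc` is the page's document (passed in as an assumption of this example).
function popLastBubble(doc) {
  const bubbles = doc.getElementsByClassName('bubble');
  if (bubbles.length === 0) {
    return false;
  }
  // Elements come back in document order, so the last one is the newest
  // when bubbles are always appended to the end of the body.
  const last = bubbles[bubbles.length - 1];
  last.remove();
  return true;
}
```

In a browser, `popLastBubble(document)` would be called inside the check for the “stop” label.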
