Today we'll talk about sound classifiers. A sound classifier lets the
computer recognize sounds such as clapping, whistling, or even a few
spoken words.
Here is an example of what you can do if you want to build
something that recognizes bird songs.
To create this kind of artificial intelligence, we will need ml5.js,
a JavaScript machine learning library.
But first, let's understand what machine learning is:
Machine learning ("apprentissage automatique" in French) is a concept
that is talked about more and more in the world of computing, and it
belongs to the field of artificial intelligence. Also called
"statistical learning", the term refers to the process of developing,
analyzing and implementing methods that let a machine learn
systematically from data. Put simply, it is a kind of program that
lets a computer learn automatically from examples, so that it can
perform very complex operations. The aim is to make the computer
capable of solving complicated problems by processing very large
amounts of information. This makes it possible to find and highlight
the correlations that exist between two or more given situations, and
to predict their outcomes.
Today, you will learn to use the ml5 library. It provides
ml5.soundClassifier(), which lets you classify audio. With the right
pre-trained model, you can detect whether a certain noise was made
(e.g. a clap or a whistle) or a certain word was said (e.g. "up",
"down", "yes", "no").
At the moment, ml5.soundClassifier() lets you use your own custom
pre-trained speech commands model or the built-in "SpeechCommands18w"
model, which can recognize the ten digits from "zero" to "nine", "up",
"down", "left", "right", "go", "stop", "yes" and "no", as well as the
additional categories "unknown word" and "background noise".
This is exactly what we want.
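To make this concrete, here is a minimal sketch of loading "SpeechCommands18w" and listening to the microphone. It assumes ml5.js is already loaded in a browser page through a script tag, and it uses the callback style of the 0.x releases, where the callback receives (error, results); newer releases may use a different signature. The helper describeTopResult and the 0.7 threshold are illustrative choices, not part of ml5.

```javascript
// Sketch only: assumes ml5.js (0.x API) is loaded in a browser page
// and that the user grants microphone access.

// Ignore predictions below this confidence (0 to 1). 0.7 is an arbitrary choice.
const options = { probabilityThreshold: 0.7 };

// Hypothetical helper: turn the results array into a short readable string.
// Assumes results are sorted by confidence, highest first,
// and that each entry looks like { label, confidence }.
function describeTopResult(results) {
  if (!Array.isArray(results) || results.length === 0) {
    return "no result";
  }
  const top = results[0];
  return `${top.label} (${(top.confidence * 100).toFixed(1)}%)`;
}

// Called each time the model produces a prediction (0.x style: error first).
function gotResult(error, results) {
  if (error) {
    console.error(error);
    return;
  }
  console.log(describeTopResult(results));
}

// Only start listening when ml5 is actually available (i.e. in the browser).
if (typeof ml5 !== "undefined") {
  const classifier = ml5.soundClassifier("SpeechCommands18w", options, () => {
    // The model has loaded: start classifying microphone input.
    classifier.classify(gotResult);
  });
}
```

Open the browser console, allow microphone access, and the top label should be logged each time a sound passes the threshold.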
HERE YOU GO! TEST IT AND MAKE IT YOURSELF!
Try speaking the following commands into your microphone: 'zero', 'one', 'two', 'three', 'four', 'five', 'six', 'seven', 'eight', 'nine', 'up', 'down', 'left', 'right', 'go', 'stop', 'yes', and 'no'. The model also reports two extra categories, 'background_noise' and 'unknown'.
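If you only want to react to real spoken commands, you can filter out the two extra categories. The sketch below is one possible way to do that. It assumes the classifier hands you an array of { label, confidence } objects sorted with the most likely label first, and it takes the label names from the list above. The exact strings the model reports may differ (for example, they may be wrapped in underscores), so the helper strips those defensively. The names spokenCommand and normalizeLabel are made up for this example.

```javascript
// Labels of the two non-speech categories, as written in the text above.
const NON_COMMAND_LABELS = new Set(["background_noise", "unknown"]);

// Strip leading/trailing underscores in case the model reports
// something like "_unknown_" instead of "unknown" (a defensive guess).
function normalizeLabel(label) {
  return String(label).replace(/^_+|_+$/g, "");
}

// Hypothetical helper: return the spoken command as a string,
// or null when the top result is background noise or an unknown word.
function spokenCommand(results) {
  if (!Array.isArray(results) || results.length === 0) {
    return null;
  }
  const label = normalizeLabel(results[0].label);
  return NON_COMMAND_LABELS.has(label) ? null : label;
}
```

For example, calling spokenCommand on results whose top label is 'up' gives back 'up', while a top label of 'background_noise' gives back null, so your page can simply ignore it.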