How to recognize incoming voice using speech to text conversion?

This article demonstrates how to develop voice recognition software in C# using Ozeki VoIP SIP SDK. The application recognizes spoken words by using a speech recognition algorithm. To use this feature, your application does not need to be registered to a PBX or listen to phone lines: instead of creating a softphone for this purpose, it only uses the tools provided by the SDK.
To use this example, you need to have Ozeki VoIP SIP SDK installed, and a reference to ozeki.dll must be added to your Visual Studio project.

Figure 1 - Speech to text conversion

What is voice recognition and speech recognition? When to use them?

Speech recognition (SR) is the translation of spoken words into text. It is also known as automatic speech recognition (ASR), computer speech recognition, or just speech to text. The performance of speech recognition is usually evaluated in terms of accuracy and speed. Speech recognition is a very complex problem: vocalizations vary in accent, articulation, roughness, volume, speed and so on, and speech can be distorted by background noise, echoes, and electrical characteristics.
With Ozeki VoIP SIP SDK, developers do not have to worry about these disturbing factors. They have the opportunity to configure the speech recognizer, but doing so is optional.
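
Accuracy is commonly reported as word error rate (WER): the number of word substitutions, deletions and insertions needed to turn the recognized text into the reference text, divided by the number of words in the reference. The sketch below shows one way to compute it. It is not part of Ozeki VoIP SIP SDK, and the names `WerSketch` and `WordErrorRate` are made up for this illustration.

```csharp
using System;

// Illustrative helper, not part of Ozeki VoIP SIP SDK.
static class WerSketch
{
    // Word-level edit distance between the two texts, divided by the
    // number of words in the reference text.
    public static double WordErrorRate(string reference, string hypothesis)
    {
        string[] r = SplitWords(reference);
        string[] h = SplitWords(hypothesis);
        if (r.Length == 0)
            throw new ArgumentException("Reference must contain at least one word.", nameof(reference));

        // d[i, j] = edits needed to turn the first j hypothesis words
        // into the first i reference words.
        int[,] d = new int[r.Length + 1, h.Length + 1];
        for (int i = 0; i <= r.Length; i++) d[i, 0] = i;
        for (int j = 0; j <= h.Length; j++) d[0, j] = j;

        for (int i = 1; i <= r.Length; i++)
        {
            for (int j = 1; j <= h.Length; j++)
            {
                bool same = string.Equals(r[i - 1], h[j - 1], StringComparison.OrdinalIgnoreCase);
                int substitution = d[i - 1, j - 1] + (same ? 0 : 1);
                int deletion = d[i - 1, j] + 1;
                int insertion = d[i, j - 1] + 1;
                d[i, j] = Math.Min(substitution, Math.Min(deletion, insertion));
            }
        }

        return (double)d[r.Length, h.Length] / r.Length;
    }

    static string[] SplitWords(string text)
    {
        return text.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
    }

    static void Main()
    {
        // One of three reference words is missing from the recognized text.
        Console.WriteLine(WordErrorRate("hello and welcome", "hello welcome"));
    }
}
```

In this example the recognizer dropped one word out of three, so the error rate is one third. A lower value means better accuracy, and 0 means the recognized text matches the reference exactly.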

Speech recognition applications can serve different goals. They can use voice recognition or speaker identification to find the identity of the person who is speaking, and they can include voice user interfaces such as voice dialing, call routing, simple data entry and speech-to-text processing.
Depending on their features and functions, these applications can be used for in-car systems, telephony, education, security systems, the military, daily life and much more.

How to implement a speech recognition feature using C#?

You can implement a speech recognition app easily in C# using Ozeki VoIP SIP SDK. Your application doesn't even need to be a softphone; it can be a simple application that can be built into a softphone later.
All you have to do is connect the microphone device to an instance of the SpeechToText class and talk into the microphone.

Speech recognition example in C#

using System;
using Ozeki.Media;

namespace Voice_Recognition
{
    class Program
    {
        static Microphone microphone;
        static MediaConnector connector;
        static SpeechToText speechToText;   // converts the incoming audio to text

        private static void Main(string[] args)
        {
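            microphone = Microphone.GetDefaultDevice();
            if (microphone == null)    // assumption: null is returned when no capture device is available
            {
                Console.WriteLine("No microphone found.");
                return;
            }
            connector = new MediaConnector();

            SetupSpeechToText();    // wires the microphone to the speech-to-text converter

            Console.ReadLine();     // keeps the application running until Enter is pressed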
        }

        static void SetupSpeechToText()
        {
            string[] words = {"Hello", "Welcome"};  // defines the words to be recognized

            speechToText = SpeechToText.CreateInstance(words);  // creates the converter for the given words
            speechToText.WordRecognized += SpeechToText_WordsRecognized;  // subscribes to get notified when a word is recognized

            connector.Connect(microphone, speechToText);    // connects the microphone to the speech-to-text converter
            microphone.Start(); // starts the microphone
        }

        // called whenever one of the defined words is recognized
        static void SpeechToText_WordsRecognized(object sender, SpeechDetectionEventArgs e)
        {
            Console.WriteLine(e.Word);  // displays the recognized word
        }
    }
}
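
In use cases such as voice dialing or call routing, a recognized word usually has to trigger an action. The following is a minimal sketch of that step, written independently of the SDK. `CommandDispatcher`, `Register` and `Dispatch` are hypothetical names, and the assumption is that `Dispatch` would be called with `e.Word` from the `WordRecognized` handler shown above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of Ozeki VoIP SIP SDK.
class CommandDispatcher
{
    // Maps a recognized word to the action it should trigger (case-insensitive).
    readonly Dictionary<string, Action> actions =
        new Dictionary<string, Action>(StringComparer.OrdinalIgnoreCase);

    public void Register(string word, Action action)
    {
        actions[word] = action;
    }

    // Runs the action registered for the word; returns false if there is none.
    public bool Dispatch(string word)
    {
        Action action;
        if (word != null && actions.TryGetValue(word, out action))
        {
            action();
            return true;
        }
        return false;
    }
}

static class DispatcherDemo
{
    static void Main()
    {
        var dispatcher = new CommandDispatcher();
        dispatcher.Register("Hello", () => Console.WriteLine("Greeting received."));
        dispatcher.Register("Welcome", () => Console.WriteLine("Welcome message received."));

        dispatcher.Dispatch("hello");    // matches "Hello" regardless of case
        dispatcher.Dispatch("Goodbye");  // no action registered, returns false
    }
}
```

Keeping the word-to-action mapping in a dictionary means the list of words passed to the recognizer and the list of supported commands can be maintained in one place.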
