These days, speech recognition is usually performed with some variety of machine learning algorithm.
First, a learning algorithm is trained on aligned audio data, that is, recordings paired with their transcripts, to produce a model. Once trained, the model is treated as a fully functional speech recognizer for most text in the language of its training data.
Finally, free-form audio is fed in, and out comes a transcription of what was previously only an audio signal.
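The train-then-transcribe flow described above can be sketched with a deliberately tiny toy. This is not how a production recognizer works: it assumes each audio segment has already been reduced to a short feature vector, and it swaps in a simple nearest-centroid classifier for the real learning algorithm. The names `train`, `transcribe`, and `aligned_data` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from math import dist


def train(aligned_data):
    """Build a toy model: one mean feature vector (centroid) per word."""
    grouped = defaultdict(list)
    for features, word in aligned_data:
        grouped[word].append(features)
    model = {}
    for word, vectors in grouped.items():
        # Average each feature dimension across all examples of this word.
        model[word] = [sum(column) / len(vectors) for column in zip(*vectors)]
    return model


def transcribe(model, audio_segments):
    """Label each segment with the word whose centroid is closest."""
    words = [
        min(model, key=lambda word: dist(model[word], segment))
        for segment in audio_segments
    ]
    return " ".join(words)


# Aligned data: made-up feature vectors paired with the word that was spoken.
aligned_data = [
    ([0.9, 0.1], "hello"),
    ([1.1, 0.0], "hello"),
    ([0.0, 1.0], "world"),
    ([0.1, 0.8], "world"),
]

model = train(aligned_data)

# Free-form input: segments the model has not seen, with no labels attached.
transcript = transcribe(model, [[1.0, 0.2], [0.05, 0.9]])
print(transcript)
```

The same two phases appear in real systems, only with far richer features and models: training consumes audio paired with text, while recognition consumes audio alone and produces the text.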