Blog With Speech Recognition on Windows

Janne Kemppainen | May 14, 2019

If you are a slow typist, it can take a lot of time to write blog posts. With the advances in voice recognition technology, it is now easier than ever to write content fast.

In this blog post I will show you how you can write blog posts using the integrated voice recognition on Windows 10.

Personally, I’m using the Hugo static site generator to publish this blog, which means that I use Visual Studio Code for writing content as Markdown files. However, most of these points should apply to creating any text-based content, so if your blog runs on WordPress, read on.

As I am a touch typist and not a native English speaker, it is actually easier for me to type than to speak to the computer. However, I think that after some practice I will be able to write faster using speech recognition. The biggest problem seems to be the need to think ahead about what I’m going to say next.

Most of this blog post has been written with Windows dictation. Naturally, adding Hugo shortcodes (and fixing my pronunciation errors) still requires the use of a keyboard.

Setting up voice recognition

Windows 10 comes bundled with voice recognition software. At the moment it only supports the major languages, so no Finnish recognition for me. This is where Google Docs (or “kugeln dogs” like Windows thinks it should be written) has the upper hand because it supports so many different languages. Still, I think that writing the Markdown straight in the editor is better than copying it back from Google Docs and fixing things afterwards.

If you are using a language other than English, then English speech recognition may not work at first. You might get the error “Speech Recognition could not start because the language configuration is not supported.”

Error message for speech recognition

To fix this, open the Start menu and search for “language”, then hit Enter. This will open the language options shown below.

Select the options for the English language and click the download button for speech. It will take a while to install the required speech packages.

English language speech package downloaded

If you have a non-English keyboard, you can change the keyboard layout to your own language as I have done in the image above. This way you have the correct layout even if you’re not using your native language.

If you have multiple languages installed, then you need to set English as the current language when you are dictating. Make sure that the taskbar reads ENG. If it does not, click on the current language and switch to English.

English selected as the current language in taskbar

Now you should be able to open the dictation window with Windows+H. The dictation toolbar will start in listening mode. If you are silent for a while, the computer will stop listening automatically. You can re-enable dictation with the microphone icon or by pressing Windows+H twice.

Using dictation with VS Code

Dictating in VS Code doesn’t really differ that much from any other application. However, voice commands are not fully supported. Therefore, commands like “go to the end of the paragraph” or “delete that” won’t work.

For basic usage, make sure that the cursor is at the correct position and just start speaking after enabling dictation. Because not all voice features are supported, you will need to fix possible errors manually.

While you can insert special characters by saying “start spelling” and then saying the name of the character (at symbol, dollar symbol, caret, etc.), I think it is still easier to use the keyboard for them. For example, creating subheadings is faster if you type the hash symbols normally and only then start to dictate. I also think that dictating headings is generally not as useful as dictating the body text.

Benefits of dictating

I think the biggest benefit of speech recognition is that you can just speak out what you want to say and then format the text later. This way you can get your thoughts fast on “paper” so that you don’t forget what you wanted to say while you were typing.

Dictating can also be great for slow typists. If your content is really text-heavy, then you will probably be much more productive. I think faster typists should also see some improvement, though.

After some practice you will probably become better at dictating and the error rate will drop significantly. Also, if you are the type who likes to walk around when speaking and thinking, then voice dictation will free you from the keyboard (until you need to verify that the speech was detected correctly).

While speaking you are also completely focused on the task at hand. Therefore, better concentration could mean that you also produce better-quality content.

And finally, it is fun! Dictating makes writing blog posts almost gamified. You see the computer work its magic to write the text for you while you are trying to maintain good accuracy.

Challenges

For me the most difficult thing with speech recognition is that I have to keep up with the computer. If I start thinking for too long, the speech recognition will stop and I have to press the microphone button again. Also, it seems that longer pauses while speaking cause the computer to capitalize the next word.
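The stray capital letters could probably be cleaned up afterwards with a small script instead of by hand. The following is only a rough Python sketch, not something Windows dictation provides: the function name `fix_stray_capitals` and the `KEEP_CAPITALIZED` allowlist are my own assumptions, and the rule (lowercase a capitalized word that directly follows a lowercase letter, comma, or semicolon and a space) is a simple heuristic that will miss some cases.

```python
import re

# Hypothetical allowlist: words that should stay capitalized mid-sentence.
# Extend it with the proper nouns you actually use.
KEEP_CAPITALIZED = {"Windows", "Hugo", "Markdown", "Google", "Docs", "English"}


def fix_stray_capitals(text, keep=KEEP_CAPITALIZED):
    """Lowercase capitalized words that appear in the middle of a sentence."""

    def lower_unless_kept(match):
        word = match.group(0)
        return word if word in keep else word.lower()

    # Match a capitalized word only when it follows a lowercase letter,
    # comma, or semicolon plus a space, so sentence starts are left alone.
    return re.sub(r"(?<=[a-z,;] )[A-Z][a-z]+\b", lower_unless_kept, text)


dictated = "I think that After some practice I will Become faster. Then it works."
print(fix_stray_capitals(dictated))
```

Because the heuristic cannot tell a proper noun from a stray capital, anything missing from the allowlist gets lowercased, so the result still needs a quick read-through.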

In general, I’m quite surprised at how well it works. However, if the computer misspells some words then I might forget what I actually wanted to say. So you have to be careful that the computer gets it right.
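If the same words keep getting misheard, a personal correction table might save some manual fixing. Below is a minimal Python sketch under my own assumptions: the `CORRECTIONS` table and the `apply_corrections` function are hypothetical names, and only the “kugeln dogs” entry comes from my actual experience in this post. The “mark down” entry is just an illustration.

```python
import re

# Hypothetical correction table: misheard phrase -> intended text.
CORRECTIONS = {
    "kugeln dogs": "Google Docs",  # the mistake mentioned earlier in this post
    "mark down": "Markdown",       # illustrative entry, not an observed error
}


def apply_corrections(text, corrections=CORRECTIONS):
    """Replace known misrecognized phrases, ignoring letter case."""
    for wrong, right in corrections.items():
        pattern = re.compile(r"\b" + re.escape(wrong) + r"\b", re.IGNORECASE)
        # Use a function as the replacement so that `right` is inserted literally.
        text = pattern.sub(lambda match, fixed=right: fixed, text)
    return text


print(apply_corrections("I copied the text from Kugeln dogs."))
```

A table like this only helps with mistakes that repeat, so one-off errors still have to be caught while the original thought is fresh.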

It seems that I can’t re-enable the speech recognition without pressing the keyboard shortcut twice. The first press will close the dictation window and the second one will start it again. While hitting H twice in rapid succession isn’t actually that bad, it would be nice if the dictation could be started with a single press.

The support in Visual Studio Code is not perfect. I can’t see the text being detected live, unlike in Google Docs or WordPad, for example. Therefore, I’m considering using WordPad for writing the text and then pasting it into the Markdown file. The problem with this approach is that I can’t easily add images or Hugo shortcodes and I can’t live preview the content in a web browser. After all, Markdown is code, which is best written with a proper code editor.

Conclusion

It turns out that using voice recognition on Windows is actually quite simple. However, you need to know what you are going to say beforehand. And if you are writing more technical text, then you might have to fix errors more often.

I think I will keep on trying to dictate my blog posts for a while to see how it works in the long run. If nothing else then at least I will get some practice at speaking English. Over time I think I will become better and better at speaking to the computer and I hope that it will improve my workflow.

Do you think you could be more productive using speech recognition? You should at least give it a try.
