Idea for a new productivity-booster product: Audio file to text

narayan · 2019-02-21 07:43

Hi Denis,

In Bangalore, we have a hot debate about the best way to handle our traffic jams.
Recently a few opinion-leaders in the city were interviewed, and the videos were posted (total time >4 hours).

I wanted to map their arguments using an argument map.
For that I need transcripts of their interviews.

But it is a painful task to transcribe four hours of audio.
So I looked for a speech recognition product that converts an mp4/mp3 file into text.
(It is easy to correct the few errors that may remain.)

I tried a few dictation programs (including a few online services), but their transcriptions were horrible.

Windows 10 is supposed to have an excellent built-in speech recognition feature.
But surprisingly it does not allow users to transcribe audio/video files.

Would it be possible for you to create such a product using the built-in capability of Windows?

den4b · 2019-02-21 11:15

narayan wrote:

Would it be possible for you to create such a product using the built-in capability of Windows?

Yes, it should be possible. Windows does indeed have APIs for speech recognition, namely System.Speech.Recognition (.NET) and the Microsoft Speech API (SAPI).

In my experience, anything to do with speech (recognition or synthesis) usually requires significantly more than just calling an API to get a good-quality result.

Have you looked around already? Are there any viable solutions out there?

For example:
List of speech recognition software @ Wikipedia
What free software can convert audio files into text files @ Quora
https://sonix.ai/

narayan · 2019-02-22 10:55

Oh, I did an extensive search and tried a large number of tools, including online ones.

I even played the mp3 file on my laptop, and tried to take notes on my mobile.
(Somehow there are many more Android apps for this than Windows apps.)

But the result is always very poor (too many mistakes).
(I tried to change the language to American, British and Indian English, but the result is bad.)
But the same Android apps give wonderfully accurate transcription when I speak.

The audio on the mp3 file is crystal clear, with no accent issues.

So, on the whole, I think there is a need for such apps.

BTW I came across a product called Simon, and it looks promising.
But I could not put it to any practical use.

Last edited by narayan (2019-02-22 10:56)

den4b · 2019-02-26 14:21

I created a small app for testing Microsoft's Speech API, using the SpeechRecognitionEngine class specifically. Then I processed a few sample speech recordings in English, but the recognition was very poor. Most words came out wrong, and that is even before trying to assemble them into real sentences.
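
One way to put a number on "most words came out wrong" is the word error rate (WER): the word-level edit distance between a hand-made reference transcript and the recognizer's output, divided by the number of reference words. The following is only a minimal Python sketch of that measure, not the test app mentioned above; the function name and the sample sentences are made up for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from the output
                curr[j - 1] + 1,     # extra word in the output
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: two of the four reference words are misrecognized.
print(word_error_rate("the traffic is bad", "the tragic is sad"))  # 0.5
```

A WER of 0.0 means a perfect transcript. Values can exceed 1.0 when the output contains many extra words. Comparing the WER of different tools on the same short, hand-transcribed clip would show which one leaves the least correction work.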

Another limitation of this Speech API is that the recognition language is bound to the installed version of Windows, so if you have a US version of Windows installed, then it will only have American English recognition.
