Last week, Shamir Allibhai, CEO of Simon Says, sent me a link to the company’s new website, which features automatic text transcription for audio and video files. I was intrigued. Here’s what I found.
Simon Says provides a very fast, automated transcription service that converts your media files into time-stamped transcripts. Excluding upload time, the transcripts themselves are essentially completed in real-time.
This web-based service is easy to use, delivers final transcripts in Word, CSV or SRT formats, and is very reasonably priced. It supports direct file uploads as well as files stored on Dropbox. The service handles 21 different media formats and five languages: US English, UK English, Spanish, French, and Brazilian Portuguese. The actual transcripts are done using third-party services; the Simon Says website does not say which service it uses.
Both pay-as-you-go pricing on a per-minute basis and monthly subscriptions are available. The most you spend for a transcript is $0.33/media minute. Supported payments include American Express, Visa and MasterCard.
The only cautions about the service are that all uploaded media files are retained for six months after transcription is complete, and that the resulting transcripts need to be reviewed by someone knowledgeable about the subject matter, because punctuation, technical terms, and off-beat pronunciations are not transcribed correctly.
Developer: Simon Says
Pricing: No more than $0.33/media minute
A 30-minute transcription credit is provided free to all new users to test the service
HOW IT WORKS
To see the true capability of the website, you need to register. Registration is free, but requires a Google ID. Once you register, this opening screen appears. To get started, create a new project. You can upload files directly through the web portal (up to 100 MB) or Dropbox (up to 5 GB).
NOTE: Project names are determined by you and you can upload multiple files at once for processing.
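If you are preparing a batch of files, the two size limits above can be checked ahead of time. This is only a minimal sketch: the function name and return labels are made up for illustration, it is not part of any Simon Says tool, and it assumes "MB" and "GB" mean decimal units (the site may count them differently).

```python
# Size limits as quoted in this review; decimal units are an assumption.
DIRECT_LIMIT_BYTES = 100 * 1000**2   # 100 MB via the web portal
DROPBOX_LIMIT_BYTES = 5 * 1000**3    # 5 GB via Dropbox


def upload_route(size_bytes):
    """Suggest how a file of the given size could be uploaded.

    Hypothetical helper: returns "direct", "dropbox", or "too large".
    """
    if size_bytes <= DIRECT_LIMIT_BYTES:
        return "direct"
    if size_bytes <= DROPBOX_LIMIT_BYTES:
        return "dropbox"
    return "too large"


if __name__ == "__main__":
    # Example sizes in bytes: a short audio clip, a long video, an oversized file.
    for size in (50_000_000, 2_000_000_000, 6_000_000_000):
        print(size, upload_route(size))
```

On a real file, `os.path.getsize(path)` would supply the size in bytes.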
Drag and drop a media file onto the Upload button and the system displays a series of animated messages showing the status of your file.
Since the first 30 minutes of transcripts were free, I transferred a 20-minute audio file from one of my recent webinars to test the system. The audio quality was excellent, with no background noise. However, the subject was fairly technical, which gave me a chance to see how well the system performed with challenging subject matter.
While transcriptions are done in real-time, upload time will vary based on the speed of your Internet connection.
When the transfer is complete, the system asks if you want to start the transcription process. Click the purple button to get started.
Next, and I think this is a good thing, the system confirms how much this transcript will cost and, if necessary, arranges payment before you even start. This allows you to know exactly how much you will be charged. Since my file was less than 30 minutes and the first 30 minutes are free, there was no cost for this test.
NOTE: Something I REALLY like was that Simon Says did not ask for a credit card before running this initial transcript. Many sites try to capture payment data even for free trials. Hats off to Simon Says for avoiding this.
Almost immediately after I clicked the Payments button, my media file appeared on the left and the transcript started typing out on the right.
At the top of the screen, a status bar showed how much of the program was transcribed.
As the transcript spells out, you can:
Correcting text is easy: double-click any word on screen to correct it or modify the punctuation. Changes are reflected in the final exported transcript. The web interface allows you to play the media file while transcription is running in the background so that you can listen to and review the text at the same time as new transcript text is displayed. I found that to be very easy and very helpful.
NOTE: Bookmarks and notes are flagged at the top of the Word version of the transcript, marked in separate columns in the CSV version, and ignored in the SRT version. Very clever.
When the transcript is complete – and it completes in very close to real-time – we can download it as a Word doc, CSV file for Excel or Numbers, or an SRT file to integrate using a variety of subtitle programs, including Adobe Premiere Pro CC.
NOTE: Apple Pages 6.2 and 4.1 are both unable to open the Word doc. Word 2011 opened the file with no problems. I assume more recent versions of Word will, as well.
The results are pretty good, but you’ll still need someone who knows the subject to perform a final review.
For an automated service, these results aren’t bad. The bulk of the transcription is accurate; this alone saves a huge amount of time over transcribing manually.
This is a screen shot of the Word doc. Remember, you have total control of text formatting in Word. However, I found no way to remove the timecode except manually.
Note that the system was unable to punctuate accurately, recognize words like “Hi,” “webinar,” “We’re going,” or break the text into sentences as narrated.
Still, this is a WHOLE lot faster than typing a transcript manually and, while it would be great to have every word perfect, most of the words are fine, which means we only need to concentrate on punctuation and correcting errors.
This is a screen shot of the CSV file. Note the column that indicates if a bookmark was added to that line of text.
This is a screen shot of the SRT file, suitable as is to import into most subtitle software.
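Because an SRT file is plain text (a cue number, a timecode line, then the caption text), it is also a convenient starting point if you want the words without the timecodes. The script below is a rough sketch of that idea, not a Simon Says feature. The sample captions are invented for illustration and are not actual output from the service.

```python
import re

# Matches a standard SRT timecode line, e.g. "00:00:01,000 --> 00:00:04,000".
TIMECODE = re.compile(
    r"^\d{2}:\d{2}:\d{2}[,.]\d{3}\s*-->\s*\d{2}:\d{2}:\d{2}[,.]\d{3}"
)


def srt_to_text(srt):
    """Return the caption text of an SRT document as one line of plain text.

    Caveat: a caption line consisting only of digits is treated as a cue
    number and dropped.
    """
    kept = []
    for raw in srt.splitlines():
        line = raw.strip()
        if not line or line.isdigit() or TIMECODE.match(line):
            continue  # skip blank lines, cue numbers, and timecodes
        kept.append(line)
    return " ".join(kept)


# Invented sample, not real Simon Says output.
SAMPLE = """1
00:00:00,000 --> 00:00:02,500
hi and welcome to the webinar

2
00:00:02,500 --> 00:00:05,000
today we are going to look at audio
"""

if __name__ == "__main__":
    print(srt_to_text(SAMPLE))
```

The result still needs the same human review for punctuation and technical terms as any other export.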
PRICING AND SUPPORTED FORMATS
Pricing is determined by the duration of your media file. The most you pay is $0.33/media minute.
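To budget a project, the worst-case cost is simple arithmetic: billable minutes times the $0.33 maximum rate. Here is a small sketch of that calculation. The function name is hypothetical, and it assumes any free credit is simply subtracted from the file's duration, which the site does not spell out.

```python
MAX_RATE_PER_MINUTE = 0.33  # highest per-minute price quoted in this review (USD)
FREE_TRIAL_MINUTES = 30     # one-time credit for new users, per this review


def estimate_max_cost(media_minutes, free_minutes=0.0):
    """Upper-bound cost in dollars for a file of the given duration.

    Assumption: free minutes are deducted from the duration before billing.
    """
    billable = max(0.0, media_minutes - free_minutes)
    return round(billable * MAX_RATE_PER_MINUTE, 2)


if __name__ == "__main__":
    # The 20-minute test file from this review, covered by the trial credit.
    print(estimate_max_cost(20, FREE_TRIAL_MINUTES))
    # A one-hour interview with no credit remaining.
    print(estimate_max_cost(60))
```

Subscription plans may bring the per-minute figure down, so treat the result as a ceiling rather than a quote.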
Simon Says supports 21 different media formats.
THINGS I’D LIKE TO SEE IMPROVED
The service is very easy to use and the web interface makes clear what is going on every step of the way. Creating a transcript is not easy. Audio quality, background noise and the ability of the speaker to enunciate all affect the final results. I’m not overly troubled by the errors I see in the text; all transcripts should be reviewed for accuracy regardless of how they were created.
However, after using the service, there are several things I’d like to see improved in the future:
No one likes to transcribe media files. It is time-consuming, boring and repetitive. The benefit of using actual people to transcribe your audio is that it tends to be more accurate. But accuracy costs time and money.
The benefit of automated transcription is that it is far faster and far cheaper than using human transcribers, but it requires that we carefully review the results for accuracy, including punctuation.
Because services like Simon Says are using super-computers and AI-based software in the back-end for the actual transcription, I suspect that accuracy will improve with time as the software processing the transcripts improves.
If you are looking for better ways to turn your media files into text, ESPECIALLY if you have lots of raw interviews to turn into a rough cut, Simon Says is a great place to start.