First Look: “Simon Says” Automated Transcripts

Posted by Larry

Last week, Shamir Allibhai, CEO of SimonSays, sent me a link to their new website which features automatic text transcription for audio and video files. I was intrigued. Here’s what I found.

EXECUTIVE SUMMARY

Simon Says provides a very fast, automated transcription service that converts your media files into time-stamped transcripts. Excluding upload time, the transcripts themselves are essentially completed in real-time.

This web-based service is easy to use, delivers final transcripts in Word, CSV or SRT formats, and is very reasonably priced. It supports direct file uploads as well as files stored on Dropbox. 21 different media formats are supported, along with five languages: US English, UK English, Spanish, French, and Brazilian Portuguese. The actual transcripts are done using 3rd-party services; the Simon Says website does not say which service they use.

Both pay-as-you-go pricing on a per-minute basis and monthly subscriptions are available. The most you spend for a transcript is $0.33/media minute. Supported payments include American Express, Visa and MasterCard.

The only cautions about the service are that all uploaded media files are retained for six months after transcription is complete, and that the resulting transcripts need to be reviewed by someone knowledgeable about the subject matter, because punctuation, technical terms, and off-beat pronunciations are not transcribed correctly.

Website: SimonSays.ai
Developer: Simon Says
Pricing: No more than $0.33/media minute
A 30-minute transcription credit is provided free to all new users to test the service

HOW IT WORKS

To see the true capability of the website, you need to register. Registration is free, but requires a Google ID. Once you register, the opening screen appears. To get started, create a new project. You can upload files directly through the web portal (up to 100 MB) or Dropbox (up to 5 GB).

NOTE: Project names are determined by you and you can upload multiple files at once for processing.

Drag and drop a media file onto the Upload button and the system displays a series of animated messages showing the status of your file.

Since the first 30 minutes of transcripts were free, I transferred a 20-minute audio file from one of my recent webinars to test the system. The audio quality was excellent, with no background noise. However, the subject was fairly technical, which gave me a chance to see how well the system performed with challenging subject matter.

While transcriptions are done in real-time, upload time will vary based on the speed of your Internet connection.

When the transfer is complete, the system asks if you want to start the transcription process. Click the purple button to get started.

Next, and I think this is a good thing, the system confirms how much this transcript will cost and, if necessary, negotiates payment before you even start. This allows you to know exactly how much you will be charged. Since my file was less than 30 minutes and the first 30 minutes are free, there was no cost for this test.

NOTE: Something I REALLY like was that Simon Says did not ask for a credit card before running this initial transcript. Many sites try to capture payment data even for free trials. Hats off to Simon Says for avoiding this.


Almost instantly after I clicked the Payments button, my media file appeared on the left and the transcript started typing out on the right.

At the top of the screen, a status bar showed how much of the program was transcribed.

As the transcript spells out, you can:

Correcting text is easy: double-click any word on screen to correct it or modify the punctuation. Changes are reflected in the final exported transcript. The web interface allows you to play the media file while transcription is running in the background so that you can listen to and review the text at the same time as new transcript text is displayed. I found that to be very easy and very helpful.

NOTE: Bookmarks and notes are flagged at the top of the Word version of the transcript, marked in separate columns in the CSV version, and ignored in the SRT version. Very clever.

When the transcript is complete – and it completes in very close to real-time – we can download it as a Word doc, CSV file for Excel or Numbers, or an SRT file to integrate using a variety of subtitle programs, including Adobe Premiere Pro CC.

NOTE: Apple Pages 6.2 and 4.1 are both unable to open the Word doc. Word 2011 opened the file with no problems. I assume more recent versions of Word will, as well.

THE RESULTS

The results are pretty good, but you’ll still need someone who knows the subject to perform a final review.

For an automated service, these results aren’t bad. The bulk of the transcription is accurate; this alone saves a huge amount of time over transcribing manually.


This is a screen shot of the Word doc. Remember, you have total control of text formatting in Word. However, I found no way to remove the timecode except manually.

Note that the system was unable to punctuate accurately, recognize words like “Hi,” “webinar,” or “We’re going,” or break the text into sentences as narrated.

Still, this is a WHOLE lot faster than typing a transcript manually and, while it would be great to have every word perfect, most of the words are fine, which means we only need to concentrate on punctuation and correcting errors.

This is a screen shot of the CSV file. Note the column that indicates if a bookmark was added to that line of text.
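
Because bookmarks travel in their own CSV column, pulling out just the flagged lines is straightforward. The Python sketch below shows one way to do it. The column names ("Timecode", "Text", "Bookmark") and the sample rows are hypothetical stand-ins, not the actual Simon Says export layout, so adjust them to match the header row of your own file.

```python
import csv
import io


def bookmarked_rows(csv_file, bookmark_column="Bookmark"):
    """Return the rows whose bookmark column is not empty.

    The column name is an assumption; check the header of a real export.
    """
    reader = csv.DictReader(csv_file)
    return [row for row in reader if (row.get(bookmark_column) or "").strip()]


# Hypothetical sample data standing in for an exported transcript.
SAMPLE = """Timecode,Text,Bookmark
00:00:01,Welcome to the webinar,
00:00:09,Today we look at codecs,yes
00:00:15,Let us get started,
"""

picked = bookmarked_rows(io.StringIO(SAMPLE))
for row in picked:
    print(row["Timecode"], row["Text"])
```

With a real export, replace the io.StringIO(SAMPLE) call with an opened file, for example open("transcript.csv", newline="").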

This is a screen shot of the SRT file, suitable as is to import into most subtitle software.
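
For readers who have not opened an SRT file before: it is plain text made up of numbered cues, each with a start and end time written as HH:MM:SS,mmm and one or more lines of text, separated by blank lines. The Python sketch below reads that general structure. The sample cues are made up for illustration and are not taken from the Simon Says output, and the function names are hypothetical.

```python
import re
from datetime import timedelta

# SRT times look like 00:01:02,345 (hours:minutes:seconds,milliseconds).
TIME_RE = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")


def parse_timestamp(stamp):
    """Convert one SRT timestamp into a timedelta."""
    h, m, s, ms = map(int, TIME_RE.fullmatch(stamp.strip()).groups())
    return timedelta(hours=h, minutes=m, seconds=s, milliseconds=ms)


def parse_srt(text):
    """Return a list of (start, end, text) tuples, one per cue."""
    cues = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        # Skip anything that is not "index / time range / text".
        if len(lines) < 2 or "-->" not in lines[1]:
            continue
        start, end = (parse_timestamp(p) for p in lines[1].split("-->"))
        cues.append((start, end, " ".join(lines[2:])))
    return cues


# Made-up sample cues for illustration only.
SAMPLE = """1
00:00:01,000 --> 00:00:04,200
Welcome to the webinar.

2
00:00:04,200 --> 00:00:09,500
Today we look at codecs.
"""

cues = parse_srt(SAMPLE)
for start, end, text in cues:
    print(start, end, text)
```

This sketch does no error recovery: a malformed timestamp raises an exception rather than being skipped.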

PRICING AND SUPPORTED FORMATS

Pricing is determined by the duration of your media file. The most you pay is $0.33/media minute.
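
To make the arithmetic concrete, here is a minimal Python sketch of the worst-case cost. It assumes the free 30-minute credit is simply subtracted from the file's duration and that billing is not rounded up to whole minutes. Neither detail is confirmed by the Simon Says website, and the function name is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

MAX_RATE_PER_MINUTE = Decimal("0.33")  # ceiling quoted in this review
FREE_TRIAL_MINUTES = Decimal("30")     # free credit for new users


def estimate_max_cost(media_minutes, free_minutes=FREE_TRIAL_MINUTES,
                      rate=MAX_RATE_PER_MINUTE):
    """Upper-bound cost in dollars for a file of the given length.

    Assumes the free credit offsets minutes one-for-one and that
    partial minutes are billed proportionally (both are assumptions).
    """
    minutes = Decimal(str(media_minutes))
    credit = Decimal(str(free_minutes))
    billable = max(Decimal("0"), minutes - credit)
    return (billable * rate).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


print(estimate_max_cost(20))                  # 20-minute file, covered by the credit
print(estimate_max_cost(60))                  # 30 billable minutes
print(estimate_max_cost(60, free_minutes=0))  # no credit left
```

Under these assumptions, a 20-minute file costs nothing during the trial, and a 60-minute file costs at most $9.90 with the credit or $19.80 without it.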

Simon Says supports 21 different media formats.

THINGS I’D LIKE TO SEE IMPROVED

The service is very easy to use and the web interface makes clear what is going on every step of the way. Creating a transcript is not easy. Audio quality, background noise and the ability of the speaker to enunciate all affect the final results. I’m not overly troubled by the errors I see in the text; all transcripts should be reviewed for accuracy regardless of how they were created.

However, after using the service, there are several things I’d like to see improved in the future:

SUMMARY

No one likes to transcribe media files. It is time-consuming, boring and repetitive. The benefit of using actual people to transcribe your audio is that it tends to be more accurate. But accuracy costs time and money.

The benefit to automated transcription is that it is far faster and far cheaper than using human transcribers, but requires that we carefully review the results for accuracy, including punctuation.

Because services like Simon Says are using super-computers and AI-based software in the back-end for the actual transcription, I suspect that accuracy will improve with time as the software processing the transcripts improves.

If you are looking for better ways to turn your media files into text, ESPECIALLY if you have lots of raw interviews to turn into a rough cut, Simon Says is a great place to start.


9 Responses to First Look: “Simon Says” Automated Transcripts

  1. Warren Nelson says:

    Oh my!

    So a quick question, is there a way to ingest a CSV file into FCP that would let me create a series of clips based on a “paper cut” of an interview?

    Even the ability to create markers?

    I’m going to look around but if you know of a magic XML tool, I’m in!

    • You should check out Speedscriber. It’s built to integrate directly with FCPX and will do close to what you’re asking for. In the Speedscriber app you can highlight ranges of your transcript as favorites. Back in FCPX you can just choose to show only those favorites in the browser, pull them into a timeline and just rearrange as needed. The notes field of the FCPX timeline index has all of the transcribed words for each clip, so it’s easy to select them in the index and rearrange by text without having to listen to the clips to verify content. I’ve been working this way with Speedscriber quite happily for many months now.

    • Hi @Warren

      We will support XML within 2 weeks so you will be able to bookmark sentences and import them easily into FCP or Premiere.

      Hi @Larry

      I had no idea you were writing a review and was touched to read this. Thank you for the comments and feedback.

      Couple points:
      1. It is important to mention for production people: most/all of your footage does NOT start at 00:00:00;00. We know this and have built in auto-sync that matches the timecode/frame rate in your uploaded files to the transcripts. You can also set this manually (via the clock button near the search bar).
      2. You mentioned you would like a Word doc without timecode – if you scroll to the bottom half of the Word doc, there is the transcript WITHOUT timecode. (The first half of the doc is transcript WITH timecode).
      3. The retaining of files for 6 months is actually for the users’ benefit, so that they can come back to the site and still find their transcribed projects months from now. If anyone prefers to delete their project (and all the associated media/files), they can do so from the dashboard by clicking the trash can in the top right of the thumbnail of the project they wish to remove.

      Happy to answer any q’s.

      Best
      Shamir

      • Larry says:

        Shamir:

        1. I enjoy reviewing software – especially software that is both well-written and useful.

        2. Good to know – you might add a note at the top of the Word doc that this is what you’ve done.

        3. Excellent. I wanted the option to delete anything that is sensitive/private.

        Thanks,

        Larry

  2. Tony says:

    Is there a quick easy way to convert these transcripts into an automatically synced Closed Caption file? Or would sticking with a dedicated closed captioning service like Rev still be the way to go?

    • Larry says:

      Tony:

      Yes.

      Most captioning software can import an SRT file, which Simon Says creates. Since the SRT file includes both text and timecode references, this can easily be added to a video to create closed captions.

      Larry

  3. Shamir says:

    Hi everyone

    It’s been a busy few months. A few updates:

    1. The price is now 8 cents/minute with a subscription plan or ~17 cents/minute on pay as you go.
    2. We have a plethora of additional post production integrations incl Avid’s Pro Tools and Media Composer; Adobe’s Audition and Premiere; FCP X; FCP 7; and Edius. Editing with transcripts in your NLE/DAW program is sooo much faster now.
    3. Publishing transcripts is now available: some users, like podcasters, want to post their transcripts like so:
    https://simonsays.ai/app.html#!/public/shamir-allibhai/npr2094OWeZ7/oprah-winfrey-at-the-2018-golden-globes/NZ50Bg6yQvjo

    As before, it is free to sign up and comes with free credits. Happy to answer any q’s and feel free to send any suggestions.
