Volume 30 ~ Issue 3 | July 2010



Column: Advice for Digital Immigrants

by Doug Bowen-Bailey
CIT Webmaster

Captioned Videos: The Excuses are Gone

In my work developing online and video resources, I have frequently had conversations with people who have expressed the desire that all videos be captioned. As someone who has spent literally hundreds of hours putting in time codes for captions, it always has made me sigh. I, too, really would like to have all videos captioned. Logistically, however, it just didn't seem possible. For example, to create a 5 minute video, it might take 15 minutes of filming, editing, and posting it on the web. Depending on the process and the level of production, it could actually take a lot less. However, to turn that 5 minutes of video into a captioned version can take well over an hour. First of all, you need to transcribe the video. This is relatively easy. The challenge comes in going through the transcript and inserting time codes to tell the video player when each segment of captioning should begin and end. It is a laborious task.

Or, I should say, was. Google, through its online video service, YouTube, has now created a free automated captioning service. Gone are the days of laboriously inserting all the time codes for a video. Here's how it works.

Click here for an example created by Google of how to create automated captions.

Machine Transcription

If you create a video in spoken English, you can request machine transcription. Through this process, YouTube uses the same computer process that Google Voice uses to transcribe voicemail into an email. YouTube essentially listens to the video, uses software to recognize the words, and creates a transcript. At the same time, it notes when each word is spoken and inserts the time codes into the script so that it can display the words in sync with the video track.

Advantages: Simple. Requires no more work than a click of a button. Saves the effort of having to do a transcription.

Drawbacks: Machine transcription sometimes fails. It also often makes mistakes in the transcription. In addition, the captions often are segmented in strange breaks - more dependent on the number of characters than on the actual separation of ideas. (Essentially, it is captioning without any discourse mapping.)

You can, however, download the transcript, make changes, and then upload it again.
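
To make the timing step more concrete, here is a minimal sketch of how word-level timings could be packed into captions purely by character count. It is not YouTube's actual process, which the article does not describe in detail. The `TimedWord` tuples, the sample timings, and the `max_chars` limit are all hypothetical. The only detail taken from this article is the H:MM:SS.mmm time code style seen in the downloaded caption file.

```python
from typing import List, Tuple

# Hypothetical recognizer output: (word, start_seconds, end_seconds).
TimedWord = Tuple[str, float, float]


def format_time(seconds: float) -> str:
    """Format seconds as H:MM:SS.mmm, e.g. 1.209 -> '0:00:01.209'."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours}:{minutes:02d}:{secs:02d}.{ms:03d}"


def group_words(words: List[TimedWord], max_chars: int = 40) -> List[str]:
    """Pack words into captions by length alone, ignoring where ideas end."""
    groups: List[List[TimedWord]] = []
    current: List[TimedWord] = []
    for word in words:
        candidate = " ".join(w[0] for w in current + [word])
        # Start a new caption once the next word would exceed the limit.
        if current and len(candidate) > max_chars:
            groups.append(current)
            current = []
        current.append(word)
    if current:
        groups.append(current)
    # Each caption runs from its first word's start to its last word's end.
    return [
        f"{format_time(g[0][1])},{format_time(g[-1][2])}\n"
        + " ".join(w[0] for w in g)
        for g in groups
    ]


# Made-up timings for the opening words of this article.
SAMPLE_WORDS: List[TimedWord] = [
    ("In", 1.209, 1.4), ("my", 1.4, 1.6), ("work", 1.6, 2.0),
    ("developing", 2.0, 2.7), ("online", 2.7, 3.2), ("and", 3.2, 3.4),
    ("video", 3.4, 3.8), ("resources,", 3.8, 4.6),
]

if __name__ == "__main__":
    print("\n\n".join(group_words(SAMPLE_WORDS, max_chars=25)))
```

Because the grouping looks only at length, a caption can end in the middle of a phrase, which is the kind of odd segmentation described in the drawbacks above.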

Create Your Own Transcript

You can type up your own transcript and then upload it to YouTube. YouTube's software then reads your transcript and listens to the audio track of your video and puts in the time code markers automatically.

Advantages: Creating your own transcript allows you to make sure the words are represented exactly as you want them to be. You also can manage the way that the captioning is segmented by putting in line breaks in your document at the end of segments.

Drawbacks: Takes more time to create a transcript. It is best to upload it as a text document, rather than a Word document or some other word processing format. A file that is formatted as a .doc or .docx has a lot of hidden information in the header of the document that YouTube might read and insert as captioning. So, it is really best if you can save it as a text file (.txt). If you have Microsoft Word, you can choose this format from "Save As..."

Here's an example of how the segmenting might work. Let's use the first few sentences of this article as the transcript of what I am sure would be an award winning YouTube video:

In my work developing online and video resources, I have frequently had conversations with people who have expressed the desire that all videos be captioned. As someone who has spent literally hundreds of hours putting in time codes for captions, it always has made me sigh.

If you upload the transcript like this, you may end up with a series of captions that look like this:

Caption 1: In my work developing online and video resources, I have frequently had conversations with people

Caption 2: who have expressed the desire that all videos be captioned. As someone who has spent literally

Caption 3: hundreds of hours putting in time codes for captions, it always has made me sigh. I, too

These are obviously not the best break points for keeping the ideas as intact as possible in the captions. You can control how YouTube segments the captions by uploading a transcript that has line breaks where you want new captions to start. Here's an example of more thoughtful breaking:

In my work developing online and video resources, I have frequently had conversations

with people who have expressed the desire that all videos be captioned.

As someone who has spent literally hundreds of hours putting in time codes for captions,

it always has made me sigh.

Uploading a transcript like this will produce 4 captioned segments, and these segments are more congruent with the ideas and utterance boundaries of the text. For those of you in interpreter education programs, this actually can be an excellent assignment for students to analyze a text. Doing a transcription, and then figuring out where the utterance boundaries are, can be very helpful in terms of building the linguistic awareness of English that is necessary for interpreting.
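
For readers who like to see the difference spelled out, the two kinds of segmentation can be sketched in a few lines of Python. This is only an illustration, not YouTube's actual algorithm: the function names are made up, and the 100-character width is an assumption chosen because it happens to break the sample text after "with people" and after "literally", much like the first set of captions above.

```python
import textwrap

# The first two sentences of this article, as one unbroken transcript.
TRANSCRIPT = (
    "In my work developing online and video resources, I have frequently "
    "had conversations with people who have expressed the desire that all "
    "videos be captioned. As someone who has spent literally hundreds of "
    "hours putting in time codes for captions, it always has made me sigh."
)

# The same text with line breaks placed at utterance boundaries.
TRANSCRIPT_WITH_BREAKS = """\
In my work developing online and video resources, I have frequently had conversations
with people who have expressed the desire that all videos be captioned.
As someone who has spent literally hundreds of hours putting in time codes for captions,
it always has made me sigh.
"""


def segment_by_length(text, width=100):
    """Break text wherever the character limit runs out (assumed behavior)."""
    return textwrap.wrap(text, width=width)


def segment_by_line_breaks(text):
    """Treat every non-empty line of the transcript as one caption."""
    return [line.strip() for line in text.splitlines() if line.strip()]


if __name__ == "__main__":
    for caption in segment_by_length(TRANSCRIPT):
        print("LENGTH:", caption)
    for caption in segment_by_line_breaks(TRANSCRIPT_WITH_BREAKS):
        print("BREAKS:", caption)
```

The length-based version has no idea where a thought ends, while the line-break version simply trusts the judgment of whoever prepared the transcript.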

Example Captioned Video

Below, there is an example video of the introduction to this article captioned. This video has two caption tracks - created in both of the ways described above. You can see how to access those captions from the explanation below the video.

Controlling the Captions

Once the video starts playing, you can click on the button on the far right of the control bar on the video above. This allows you to control the options for the captions. By hovering over the triangle to the left of the CC, you can select which caption track you would like to view.

  • "English" shows the transcript uploaded without line breaks allowing the software to break up the captions.
  • "English - With Line Breaks" shows the results of the transcript with more intentional segmentation of the captions.

You can also choose to Translate Captions into other languages. (Google doesn't make any great claims to dynamic translations. They essentially say, "Sometimes the translations are pretty decent. Sometimes, they are not." I would be curious to hear from CIT members who speak these languages to let me know how effective the translations actually were.)

Moving Beyond YouTube

What if you don't want to deliver your video in a YouTube format? Does YouTube require that you only use these captions on their video service?

Fortunately, the answer is no. While it becomes a somewhat more complicated process requiring more steps (though fortunately, not much more time and no more money) you can take the captions created by YouTube and translate them into other formats.

Here are the steps in that process:

  1. Upload a video on YouTube and caption it. If you do not want the video to be viewable here, but are just using it as a working space, you can choose to make the video private or unlisted. You can also delete it when you are finished with the captioning process.
  2. Download the caption file. YouTube creates a caption file with the extension .sbv. You can download this file and then open it with a word processing or text editing program that can handle .txt files. (For Windows, you can use a program like Notepad, and for Mac, a program like TextEdit.) It will look something like this:

    0:00:01.209,0:00:06.350
    In my work developing online and video resources,
    I have frequently had conversations

    0:00:06.350,0:00:11.090
    with people who have expressed the desire
    that all videos be captioned.

    0:00:11.090,0:00:15.639
    As someone who has spent literally hundreds
    of hours putting in time codes for captions,

    0:00:15.639,0:00:19.439
    it always has made me sigh.

    The numbers before the captions represent the In and Out time codes for the captions.

  3. Translate .sbv file to .srt file. The .sbv file is read by YouTube, but it is not yet a standard adopted by many other captioning programs. A more popular format is the .srt file, which can be used for captioning online video or subtitling DVD video. Gideon Goldberg created a simple script that converts this file, and the conversion can be done online for free. Click here to go to that page. You can copy the above example of .sbv text, paste it into the top window, and click the button to transform it into an .srt file. One problem is that the script converts every "," into "-->" because those are the symbols the two formats use to separate the start and end times of a caption. This means that a comma in the transcript itself will also be turned into "-->". You can use the "Find and Change" feature of a text editing program to selectively change the "-->" back into commas in the appropriate places.
  4. Clean up the .srt file. This step may not be necessary. However, there may be times when the .srt file is not saved in a format that you can open in a program that can convert the subtitles (as explained in the next step). There are free programs that can clean up the .srt file if needed. For Windows, you can use SubRip (free download here). For Mac OS, SubCleaner is a program that lets you simply drag a file onto it, and it then converts the file into the correct formatting for an .srt file that can be read as a subtitle.
  5. Convert .srt file to other format. As video formats move to become more accessible, there are a variety of formats of caption files. Flash files might require an .xml file while QuickTime videos require a QuickTime text track. In simple terms, these formats really differ in how the time codes are signified. To make these changes by hand would be extremely time consuming. However, there is an open-source program called Jubler, which can read an .srt file and convert it to 13 different formats. Jubler works on Mac, Windows and Linux. According to discussion forums on Jubler's web site, the developers of this program plan to add the .sbv format to the list of documents that Jubler can read and produce. This will make steps 3 and 4 unnecessary, and we will be able to create the time code transcript on YouTube and then translate it to other formats directly through Jubler.
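
For those comfortable running a small script, the conversion in step 3 can also be sketched in Python. This is not Gideon Goldberg's script, just a minimal illustration. It assumes the .sbv layout shown in step 2 (a timing line, the caption text, then a blank line) and the usual .srt layout of a caption number, a timing line in the form HH:MM:SS,mmm --> HH:MM:SS,mmm, and the caption text. The function name `sbv_to_srt` is made up. Because only the timing line is rewritten, commas inside the captions themselves are left alone.

```python
import re

# Matches an .sbv timing line such as "0:00:01.209,0:00:06.350".
TIMING = re.compile(
    r"^(\d+):(\d{2}):(\d{2})\.(\d{3}),(\d+):(\d{2}):(\d{2})\.(\d{3})$"
)


def sbv_to_srt(sbv_text: str) -> str:
    """Convert .sbv caption text to .srt text (sketch, assumed layouts)."""
    # Captions are separated by one or more blank lines.
    blocks = re.split(r"\n\s*\n", sbv_text.strip())
    converted = []
    for number, block in enumerate(blocks, start=1):
        lines = [line.strip() for line in block.splitlines()]
        match = TIMING.match(lines[0])
        if match is None:
            raise ValueError(f"Unrecognized timing line: {lines[0]!r}")
        h1, m1, s1, ms1, h2, m2, s2, ms2 = match.groups()
        # .srt pads the hour to two digits and uses a comma before milliseconds.
        timing = (
            f"{int(h1):02d}:{m1}:{s1},{ms1} --> "
            f"{int(h2):02d}:{m2}:{s2},{ms2}"
        )
        converted.append("\n".join([str(number), timing] + lines[1:]))
    return "\n\n".join(converted) + "\n"


SAMPLE_SBV = """\
0:00:01.209,0:00:06.350
In my work developing online and video resources,
I have frequently had conversations

0:00:06.350,0:00:11.090
with people who have expressed the desire
that all videos be captioned.
"""

if __name__ == "__main__":
    print(sbv_to_srt(SAMPLE_SBV))
```

The result can be saved with an .srt extension and then opened in a program like Jubler, as described in step 5.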

I recognize that the title of this article is hyperbolic. Quality captioning (like quality interpreting) is not something that can be simply automated. So, some excuses still exist. However, the impediment of having to manually insert time codes has been removed from the work, and this will hopefully lead to a significant increase in both the number and percentage of online videos that are accessible. I'm not sure if we'll get to the point where all videos are captioned, but I, for one, will be doing my part to contribute to the number of videos that are. I hope that you will join me.

For more information, you can view Google's announcement of this service.