Last week, on the Digital Production Buzz, we interviewed Evelyn McLellan, a professional archivist for Artefactual Systems.
(You can hear her interview here — 14 minutes, MP3 file.)
I found the interview to be an excellent orientation to getting our files prepped for permanent storage. We didn’t talk about archive hardware at all, just what we need to do with our media.
Well, after the show, John Mozzer and Evelyn McLellan had an email conversation that I want to share with you, as it is relevant to all of us. (And thanks to both John and Evelyn for allowing me to share this with you.)
John Mozzer asks:
I’m pretty confused by the Digital Production Buzz interview with Evelyn McLellan, Systems Archivist for Artefactual Systems.
I think I understand sustainability factors when choosing media file formats (adoption, non-proprietary, etc.), and the benefits of storing the media on a server (the software tools, etc).
But I don’t understand the reason for converting video to MPEG-2 with Intra-coded frames, even though it is high quality. To what extent does this involve re-encoding the original video? If it involves re-encoding, why do it?
I can understand needing to uncompress and re-compress Digital Betacam, for example, in order to store that video on a server. (Am I right about that?) But, for example, what about all the legacy standard definition video on tape in the DV format, which can be captured bit-for-bit?
Evelyn McLellan responds:
The purpose of re-encoding the video is to reduce a multitude of incoming formats, many with proprietary codecs, into a few device-independent formats for long-term preservation. Since different formats and codecs are likely to become obsolete at different times, it becomes very difficult, if not impossible, to monitor which video files are at risk at any given time. MPEG-2 is a non-proprietary, openly-specified codec and many heritage institutions are using it.
This means that, for a long time into the future, there will almost certainly be tools and support for MPEG-2 – in other words, we won’t have to reformat for a long time, if ever. The idea is to reformat only once if possible. So there may be some (imperceptible) data loss with the initial reformatting, but the alternative is to fail to reformat proprietary and/or obsolete formats until it is too late and thus lose the ability to render the video. Of course, as I mentioned during the interview, we keep all the original formats as well, in case a better preservation strategy comes along that we aren’t able to predict right now.
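As a rough illustration of the "reformat once, keep the original" idea, here is a minimal Python sketch. It is not Artefactual's actual workflow: the function name `plan_normalization`, the folder layout, the policy table, and the choice of ffmpeg as the encoder are all assumptions made for this example. The ffmpeg options shown (`-c:v mpeg2video`, `-g 1` for intra-coded frames only, `-q:v` for quality, `-c:a pcm_s16le` for uncompressed audio) are real options, but the values are only illustrative and the command is built, not run.

```python
from pathlib import Path

# Hypothetical policy: codecs already accepted as preservation formats.
# Everything else is normalized once to intra-frame MPEG-2 in an MXF wrapper.
PRESERVATION_CODECS = {"mpeg2video"}


def plan_normalization(source: Path, codec: str, archive_dir: Path) -> dict:
    """Decide what to store for one incoming file. The original is always kept."""
    plan = {
        "original": archive_dir / "originals" / source.name,
        "preservation": None,
        "command": None,
    }
    if codec in PRESERVATION_CODECS:
        # Already in an accepted format: store as-is, no re-encode.
        return plan

    target = archive_dir / "preservation" / (source.stem + ".mxf")
    plan["preservation"] = target
    plan["command"] = [
        "ffmpeg", "-i", str(source),
        "-c:v", "mpeg2video",  # non-proprietary, openly specified codec
        "-g", "1",             # GOP size 1: every frame is intra-coded
        "-q:v", "2",           # high fixed quality (illustrative value)
        "-c:a", "pcm_s16le",   # uncompressed audio
        str(target),
    ]
    return plan


if __name__ == "__main__":
    plan = plan_normalization(Path("tape01.mov"), "prores", Path("/archive"))
    print(" ".join(plan["command"]))
```

The point of the sketch is the shape of the decision: every incoming file yields a stored original, and at most one re-encode to a single target format.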
Generally we deal with device-independent end-state formats. My understanding of DV is that in order to render it device-independent you need to place it in some kind of wrapper (such as AVI, QuickTime or MXF) or store it as raw video (DV-DIF). The Library of Congress is investigating wrapper formats for DV, particularly MXF (which is the wrapper we use for our video files), since AVI and QuickTime are proprietary. Actually, the Library of Congress is an excellent source of information on this subject – please see
http://www.digitalpreservation.gov/formats/content/video.shtml.
Apologies if this is insufficient detail – I’m an archivist, not a video expert, and video files are just one of the types of digital objects we’re trying to preserve (the others are office documents, e-mail, audio files, raster and vector images, web sites, databases, etc.). However, similar principles apply across the board when it comes to digital preservation – accept or convert to a small number of non-proprietary, openly-specified, device-independent and widely used formats, and use redundant storage to allow for replacement of any damaged objects.
Thanks for your questions & I hope this helps.
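For readers who want to see what "redundant storage to allow for replacement of any damaged objects" can look like in practice, here is a minimal Python sketch of a checksum-based repair. It assumes you recorded a SHA-256 checksum when the file was first archived and that you keep several copies. The helper names `sha256_of` and `repair_copies` are hypothetical, and this is one simple approach, not a description of any particular archive's system.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, read in 1 MB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def repair_copies(copies: list[Path], expected: str) -> list[Path]:
    """Overwrite any missing or damaged copy with an intact one.

    `expected` is the checksum recorded at ingest (an assumption of this sketch).
    Returns the paths that had to be repaired.
    """
    good = [c for c in copies if c.exists() and sha256_of(c) == expected]
    if not good:
        raise RuntimeError("no intact copy left to restore from")

    repaired = []
    for copy in copies:
        if copy not in good:
            # The destination folder is assumed to exist already.
            shutil.copyfile(good[0], copy)
            repaired.append(copy)
    return repaired
```

The same checksum comparison also answers John's "bit-for-bit" question for DV captures: two files with identical checksums can be treated as identical copies.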
Larry adds: Thanks for allowing me to share this!