---
title: RFC on Video Processing Pipeline
subtitle: Time Machine RFC-0035
output: pdf_document
---
This RFC defines the general architecture for video. The document deals with how to produce high-quality video files that are suitable for research purposes and that can also be preserved for future uses.
Digitising and preserving video recordings is often challenging, because there is a great variety of different video formats. In addition, the lifespan of analogue and digital video carriers is limited by their physical and chemical stability and the availability of playback technology.
Digitising video recordings often requires a significant amount of resources, as not all work steps can be automated, especially in smaller cultural heritage organizations. The preservation strategy should therefore be able to balance the current and future needs of the users as well as the organisation’s conditions and resources.
Signal retrieval from the original video carriers is often the most significant work step in terms of quality. Furthermore, preserving digitised and born-digital video files requires regular transfers to a new storage medium and migrations to new file formats. These should be considered when the video material is transferred from the original storage medium for the first time.
A somewhat simpler video digitisation workflow aimed at small and medium-sized cultural heritage organizations can be found in Europeana Digitization Handbook 1.
Digitising video recordings is expensive and slow, and it takes a lot of human resources. If the video collection to be digitised is extensive, choices must be made regarding what to digitise and in what order. In the selection and prioritisation of the material, there are several different perspectives that should be taken into account. Various general factors related to the selection of material to be digitised are discussed in more detail in the document RFC-0008. Regarding video recordings, special attention should be paid to the risk of destruction. Often technical factors such as the condition, age and format of the original video carriers determine the urgency of digitisation, as especially the oldest cassettes are in danger of deteriorating to the point of being unusable.
The selection of videos to be digitised and the order of digitisation are also significantly influenced by the large number of magnetic and optical video formats: a wide range of playback equipment is needed, and much of it is no longer manufactured. Servicing tape recorders requires expertise, which is becoming less and less available, and the availability of spare parts is decreasing all the time. The situation is made somewhat easier by the fact that recorders are often backwards compatible. For example, Digital8 recorders can also play Video8 and Hi8 tapes.
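Backwards compatibility can be used to work out which recorders are able to play a given tape. The following is a minimal sketch: the table `PLAYS` and the function `recorders_for` are illustrative names, the entries cover only compatibilities stated in this document, and individual recorder models may differ.

```python
# Recorder family -> tape formats it can play, as stated in this document.
# Illustrative and incomplete; check the manual of the actual device.
PLAYS = {
    "Video8": {"Video8"},
    "Hi8": {"Hi8", "Video8"},
    "Digital8": {"Digital8", "Hi8", "Video8"},
    "VHS": {"VHS"},
    "S-VHS": {"S-VHS", "VHS"},
    "DVCAM": {"DVCAM", "DV"},
}


def recorders_for(tape_format):
    """Return the recorder families able to play the given tape format."""
    return sorted(rec for rec, formats in PLAYS.items() if tape_format in formats)


print(recorders_for("Video8"))  # ['Digital8', 'Hi8', 'Video8']
print(recorders_for("VHS"))     # ['S-VHS', 'VHS']
```

A lookup of this kind helps when planning equipment purchases: one well-maintained Digital8 or S-VHS recorder may cover several formats in a collection.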
The Museum of Obsolete Media has published a Video Media Timeline 2 and information about different video tape formats, video discs, and other video media 3, which can be used as an aid in identifying the necessary equipment. The California Audiovisual Preservation Project has published a guide to the identification of video tape formats 4. Furthermore, the Museum of Obsolete Media has published Media Stability Ratings 5 and Obsolescence Ratings 6, which can be used as an aid when planning the order of video digitisation.
There are three different video systems, the oldest of which is NTSC (National Television System Committee), which was developed for the needs of American television. PAL (Phase Alternating Line) was later developed on the basis of NTSC, as was the SECAM system (Séquentiel couleur à mémoire), which closely resembles PAL. It is possible to convert a video from one system to another, but quality drops with each conversion because the image size and the number of frames per second change.
NTSC 7 was developed in the early 1940s. The video signal is assigned to 525 horizontal lines (480 visible lines) and the transmission has 30 frames (60 interlaced half frames) per second. The video image of the NTSC system is weaker in terms of vertical sharpness than the video image according to the PAL system, because the maximum signal frequency of the luminance signal (Y-signal, luminance, luma) is lower than in the PAL system.
The PAL 8 system prevails in most of Europe. PAL was developed on the basis of the NTSC system in the mid-1960s. Improvements were made especially to the colour signal of the video image. In addition, the vertical resolution of the video image was improved by increasing the frequency range of the luminance signal over the chrominance signal (C-signal). In the PAL system, the video signal is assigned to 625 horizontal lines (576 visible) and the transmission has 25 frames (50 interlaced half frames) per second.
SECAM 9 resembles PAL: the video signal is also assigned to 625 horizontal lines (576 visible) and the transmission has 25 frames (50 interlaced half frames) per second. However, in the SECAM system the colour signal uses frequency modulation, resulting in only one signal for each line. Thus, SECAM does not have colour saturation disturbances like PAL. For the same reason, however, SECAM is not linear, and combining two synchronized SECAM signals may not produce a valid SECAM signal.
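The parameters of the three systems can be collected in a small lookup table. This is only a sketch: the class `VideoSystem` and the helper `changes_scan_format` are illustrative names, the values are the nominal figures given above, and differences in colour encoding (for example between PAL and SECAM) are not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VideoSystem:
    """Nominal scan parameters of an analogue television system."""
    name: str
    total_lines: int        # horizontal lines per frame
    visible_lines: int      # lines that carry picture content
    frames_per_second: int

    @property
    def fields_per_second(self) -> int:
        # Interlaced transmission sends two half frames (fields) per frame.
        return self.frames_per_second * 2


# Values as stated in this document.
SYSTEMS = {
    "NTSC": VideoSystem("NTSC", 525, 480, 30),
    "PAL": VideoSystem("PAL", 625, 576, 25),
    "SECAM": VideoSystem("SECAM", 625, 576, 25),
}


def changes_scan_format(source: str, target: str) -> bool:
    """True if converting between the systems changes line count or frame rate."""
    a, b = SYSTEMS[source], SYSTEMS[target]
    return (a.total_lines, a.frames_per_second) != (b.total_lines, b.frames_per_second)


print(SYSTEMS["PAL"].fields_per_second)     # 50
print(changes_scan_format("NTSC", "PAL"))   # True
print(changes_scan_format("PAL", "SECAM"))  # False
```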
An analogue video signal contains colour (chrominance) and lightness (luminance) signals, either combined (composite video) or separate (S-Video, component video). Unlike digital video, analogue video does not have a time code, unless it is added separately with digital TBC 10 (time base correction) technology.
When digitising videos, the highest-quality video signal that the video format supports should be used. The lowest quality is composite video 11, where all the data of the video signal travels along the same cable, causing interference and unevenness. Of the analogue video formats, VHS/VHS-C and Video8, for example, are based on a composite video signal. The second-best analogue signal is S-Video 12 (Separated video, also known as Y/C), where the Y and C signals are separated. For instance, S-VHS, Hi8, U-matic and Betamax support an S-Video signal. The best option is component video 13, where the video signal is split into three components: Y, Pʙ and Pʀ. Betacam and Betacam SP video players, among others, have a component video output.
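The rule of using the best signal that the format supports can be expressed as a simple ranking. The sketch below is illustrative only: `Signal`, `FORMAT_BEST` and `choose_signal` are hypothetical names, the table repeats the format assignments given above, and the outputs actually present on a given player must be checked on the device itself.

```python
from enum import IntEnum


class Signal(IntEnum):
    """Analogue signal types, ordered from lowest to highest quality."""
    COMPOSITE = 1
    S_VIDEO = 2
    COMPONENT = 3


# Best signal per format, as stated in this document.
FORMAT_BEST = {
    "VHS": Signal.COMPOSITE,
    "VHS-C": Signal.COMPOSITE,
    "Video8": Signal.COMPOSITE,
    "S-VHS": Signal.S_VIDEO,
    "Hi8": Signal.S_VIDEO,
    "U-matic": Signal.S_VIDEO,
    "Betamax": Signal.S_VIDEO,
    "Betacam": Signal.COMPONENT,
    "Betacam SP": Signal.COMPONENT,
}


def choose_signal(tape_format, player_outputs):
    """Pick the best player output that does not exceed what the format supports."""
    limit = FORMAT_BEST[tape_format]
    usable = [s for s in player_outputs if s <= limit]
    if not usable:
        raise ValueError(f"no usable output for {tape_format}")
    return max(usable)


outputs = {Signal.COMPOSITE, Signal.S_VIDEO}
print(choose_signal("Hi8", outputs).name)  # S_VIDEO
print(choose_signal("VHS", outputs).name)  # COMPOSITE
```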
U-matic 14 was introduced in 1971. U-matic cassettes are large, with a tape width of 3/4 inch. U-matic was aimed at consumers, but it did not become common in domestic use. However, the format found its way into TV production in the mid-1970s. Other U-matic versions include the smaller U-matic S, U-matic High Band (BVU, Broadcast Video U-Matic), and U-matic SP (Superior Performance) versions.
VHS 15 (Video Home System) was introduced in 1976. It is the best known and most common of the consumer analogue video systems. Disadvantages of VHS include small image size (240×576 PAL / 240×486 NTSC), poor image quality, and poor carrier life. S-VHS 16 (Super VHS) is significantly better than VHS in terms of image size (resolution 400 lines). In S-Video technology, the chrominance and luminance signals are separated and do not interfere with each other. S-VHS recorders can play standard VHS tapes, but VHS recorders cannot play S-VHS tapes. VHS-C (Compact VHS) is a smaller version of the VHS cassette designed for home video use. There are adapters for VHS-C cassettes that allow one to play them with a VHS (or S-VHS) recorder, but the rarer S-VHS-C cassette needs its own camera and adapter.
When it comes to image quality, Betamax 17 (Beta) is close to VHS, with a resolution of 250 lines. Whereas Betamax was developed for the consumer market, Betacam 18, based on Betamax, was aimed at professional use. Betacam and Betamax cassettes are almost identical in appearance, but they carry an indication of which format the cassette represents. Betacam was the first analogue video recorder system in which the video signal was recorded on tape in component form, with the chrominance and luminance signals on their own separate video tracks. Therefore, the high-frequency chrominance signal cannot soften the sharpness of the image when dubbing the tapes. Time code can also be recorded on Betacam tapes with a built-in TBC.
With Betacam SP 19 (Superior Performance) tapes, the recording capacity was increased to 90 minutes with L-sized tapes, which are considerably larger than conventional Beta tapes. Betacam SP recorders can play both cassette sizes without a separate adapter. The resolution of Betacam is 300 lines and of Betacam SP 340 lines.
Video8 20 was a home video format, from which the Hi8 and Digital8 formats were later developed. Video8 corresponds in quality to VHS video (resolution 240, composite output). The Hi8 21 video has a better image quality than its predecessor (resolution 400, S-video output), but the format is still analogue, even though digital audio can be recorded on the cassettes with more advanced cameras. Video8, Hi8 and Digital8 cassettes are the same size and backwards compatible.
A digital video signal contains significantly more data than an analogue signal. In addition to time code, digital video contains – depending on the type of video – various types of metadata. Although digital videos can theoretically be copied without any loss in quality, even cassettes and discs designed for professional use do not last indefinitely. Therefore, digital video cassettes and discs should be transferred as files to a digital preservation system.
In the PAL system, a digital video image can be recorded at 25 frames per second as a progressive image or at 50 half frames (fields) per second as an interlaced image. Since video recorded at 25 frames per second was considered too jerky for television, the interlaced image was introduced in TV production. In interlaced video, each frame is built from two fields of horizontal lines: every other line is recorded in the first field, and the remaining lines in the next one. This creates the illusion of smoother movement at the expense of image integrity. When watching an interlaced video image, the fields change every 1/50 of a second. Interlacing is not a problem as long as the image is viewed on the devices intended for it, for example a standard television. However, computer screens use a progressive image, so fast-moving interlaced content appears jagged. An interlaced image can be converted to a progressive format using the deinterlace function of editing software.
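The line structure of interlaced video can be illustrated with plain lists, where a frame is a list of scan lines. This is a deliberately naive sketch with illustrative function names; the deinterlacers found in real editing software use more elaborate interpolation than the line doubling shown here.

```python
def split_fields(frame):
    """Split a frame (a list of scan lines) into its two fields."""
    return frame[0::2], frame[1::2]


def weave(first_field, second_field):
    """Interleave two fields back into one full frame."""
    frame = []
    for a, b in zip(first_field, second_field):
        frame.extend([a, b])
    # With an odd line count the first field holds one extra line.
    if len(first_field) > len(second_field):
        frame.append(first_field[-1])
    return frame


def line_double(field):
    """Naive deinterlace: build a full-height frame by repeating each field line."""
    return [line for line in field for _ in range(2)]


frame = ["line0", "line1", "line2", "line3"]
first, second = split_fields(frame)
print(first)               # ['line0', 'line2']
print(weave(first, second) == frame)  # True
print(line_double(first))  # ['line0', 'line0', 'line2', 'line2']
```

Weaving restores the original frame only when both fields were captured at the same instant. With true interlaced material the two fields are 1/50 of a second apart, which is what produces the jagged edges on moving objects.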
DV (Digital Video) is both a codec and a physical cassette format. There are two sizes of cassettes, the smaller and more common of which is called MiniDV 22. MiniDV was primarily aimed at consumers. The resolution of DV video is 720×576 pixels and the bit rate is 25 Mbps.
DVCAM 23, developed on the basis of DV technology, was intended for professional use. It differs from standard DV in its durability and more precise synchronization of sound and image. DVCAM tapes also have fewer dropout errors than standard DV tapes. DVCAM recorders can play DV tapes, and some newer DVCAM recorders can also play DVCPRO tapes. The image resolution and bitrate are the same in all three formats.
Digital8 24 uses the same DV codec as DV and DVCAM, and in all its features resembles these formats more than its analogue predecessors Video8 and Hi8. In terms of technical characteristics, Digital8 tapes are almost identical to MiniDV tapes.
The DVCPRO 25 format is available in three different quality standards. DVCPRO (also called DVCPRO25) closely matches DV technology in terms of features, but it is more common in professional use due to slightly better image quality. DVCPRO50 26 is about twice as high quality. It corresponds to Digital Betacam in terms of image quality. There is also a version of DVCPRO for high-definition video, DVCPRO HD (DVCPRO100). The DVCPRO100, DVCPRO50, and DVCPRO cassettes are backwards compatible.
Digital Betacam 27 (Digibeta) is comparable in features to the DVCPRO50 level, but it uses the rarer DCT codec. The data recorded by Digibeta is of a high quality: the bit rate of the video is 90 Mbps and the sound is recorded uncompressed on four audio tracks, PCM encoded, 48 kHz / 20 bit.
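The bit rate of uncompressed PCM audio follows directly from the sample rate, the bit depth and the number of tracks. As a small worked example (the function name `pcm_bit_rate` is illustrative), the Digibeta audio figures above give:

```python
def pcm_bit_rate(sample_rate_hz: int, bit_depth: int, tracks: int) -> int:
    """Bit rate of uncompressed PCM audio in bits per second."""
    return sample_rate_hz * bit_depth * tracks


# Four tracks at 48 kHz / 20 bit, as stated for Digital Betacam.
digibeta_audio = pcm_bit_rate(48_000, 20, 4)
print(digibeta_audio / 1_000_000)  # 3.84 (Mbps)
```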
In addition to magnetic tapes, videos have been stored on optical media for decades. It is noteworthy that not only video tapes deteriorate, but also optical discs have a limited lifespan. This applies especially to self-recorded discs. Thus, it is important to transfer them without unnecessary delays to a better storage platform for long-term preservation.
LaserDisc 28 was the first optical video format. LaserDisc discs are double-sided, and the disc sizes are 12-inch and 8-inch. LaserDisc comes with a horizontal resolution of 440 lines (PAL), and the analogue output is S-VHS-quality. Based on LaserDisc and CD technologies, the CD Video 29 (CDV) format was developed. There are three sizes of discs: 5, 8 and 12 inches.
Video CD 30 (Compact Disc Digital Video, VCD) was an optical format based on CD technology. VCDs are playable in dedicated VCD players and in most DVD players. In the PAL system the resolution is 352×288 pixels at 25 frames per second, using the MPEG-1 codec at 1150 kbps. The audio is 44.1 kHz, 224 kbps.
DVD-Video 31 uses MPEG-2 Part 2 compression at up to 9.8 Mbps. A single-layer DVD-Video disc can hold 4.7 GB of data. Data compressed in MPEG-2 format is quite lossy, and combined with the limited shelf life of self-burned DVDs, this means that the DVD-Video format cannot be recommended for permanent storage of video recordings.
Blu-ray 32 was designed to supersede the DVD format and is capable of storing several hours of high-definition video. Blu-ray discs are identical in size to DVD discs, but information is stored at a much higher density due to the use of blue lasers. Thus, Blu-ray discs can store 25 GB of data on single-layer discs and 50 GB on dual-layer discs. As of November 2022, there are no fewer than five versions of Blu-ray Disc Recordable Erasable (BD-RE) and four versions of Blu-ray Disc Recordable (BD-R) 33.
The effect of analogue reproduction equipment on the quality of video digitisation is usually much greater than that of A/D converters and other digital equipment. Because of the vast number of different videotape specifications, some governed by standards and some not, digitising analogue video tapes is demanding, and this broad topic cannot be properly addressed within the scope of this document.
Instructions can be found, for example, in the IASA Guidelines for the Preservation of Video Recordings, which give comprehensive advice on how to replay obsolete videotapes 34. Each format-based chapter provides advice specific to that particular carrier and the video tape recorders needed to play those carriers back. There is also general advice on the assessment, preparation, cleaning, and heat treatment of videotapes. Additional carrier-specific advice on these topics, where relevant, will be found in the sections devoted to specific carriers.
For videos in digital format, it makes sense to preserve the original file format, codec, resolution and bitrate, because the quality will in most cases decrease if the file is converted to another format.
A codec (coder-decoder) is a device or program that converts (encodes) a video or audio signal into a storage and presentation format. The conversion can also produce a protected (encrypted) format, which is decoded back to the presentation format. The signal can also be compressed, either lossily (lossy codec) or losslessly (lossless codec), to save space. Losslessly compressed data takes up more space than lossily compressed data, but it can be unpacked in its original quality.
With lossy codecs, it is important to pay attention not only to the codec properties but also to the bit rate (data transfer speed), which determines how much data the video contains in relation to its duration. The bit rate is reported in kilobits per second (kbps, kb/s) or megabits per second (Mbps, Mb/s). For example, the bit rate of DVD-Video is 7–10 Mb/s, whereas the bit rate of DV video is 25 Mb/s.
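Because the bit rate relates data volume to duration, it can be used to estimate storage needs. The following sketch (with the illustrative function name `size_bytes`) multiplies bit rate by duration and converts bits to bytes. It covers the video bit rate only, so real files will be somewhat larger once audio and container overhead are included.

```python
def size_bytes(bit_rate_mbps: float, duration_s: float) -> float:
    """Approximate size of a stream: megabits per second times seconds, in bytes."""
    return bit_rate_mbps * 1_000_000 * duration_s / 8


# One hour of DV video at 25 Mb/s.
one_hour_dv = size_bytes(25, 3600)
print(one_hour_dv / 1_000_000_000)  # 11.25 (GB, decimal)
```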
The encoded video is stored in a container format 35, also called a wrapper. A container allows multiple data streams (audio, video, subtitles, tracks, and metadata tags) to be embedded into a single file, along with the synchronization information needed to play the different data streams together. Containers may identify how data is encoded, but they do not provide instructions about how to decode that data. Thus, software that can open a container must also use an appropriate codec to decode its contents.
There are many different possibilities for wrapping a video 36. Commonly used video wrappers include, among others, the freely licensed Matroska 37 (.mkv), MPEG-4 File Format 38 (Version 2, Part 14, .mp4), QuickTime File Format 39 (.mov), Audio Video Interleave 40 (.avi), and Material Exchange Format 41 (.mxf).
Due to the numerous video wrappers and codecs and their different combinations, choosing a video storage format for preservation purposes is not an easy task. The Library of Congress recommends 42 that videos be stored at the original production resolution and frame rate, in the file-based format that was delivered to the content distributor. The following video formats are recommended, in order of preference:
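The distinction between wrapper and codec becomes visible when a file is inspected. The sketch below assumes that the FFmpeg tool `ffprobe` is installed; this RFC does not prescribe any particular tool, and the function names are illustrative. The JSON keys used (`format_name`, `codec_type`, `codec_name`) are those that `ffprobe` reports with the options shown.

```python
import json
import subprocess


def probe_command(path: str) -> list:
    """Build an ffprobe call that reports container and stream details as JSON."""
    return [
        "ffprobe", "-v", "error",
        "-print_format", "json",
        "-show_format", "-show_streams",
        path,
    ]


def summarise(probe_json: str) -> dict:
    """Reduce ffprobe JSON output to the container name and per-stream codecs."""
    data = json.loads(probe_json)
    return {
        "container": data.get("format", {}).get("format_name"),
        "streams": [
            (s.get("codec_type"), s.get("codec_name"))
            for s in data.get("streams", [])
        ],
    }


def probe(path: str) -> dict:
    """Run ffprobe on a file and summarise its output."""
    result = subprocess.run(
        probe_command(path), capture_output=True, text=True, check=True
    )
    return summarise(result.stdout)
```

Calling `probe("example.mov")` on a hypothetical file would return one container name and a list of streams, each with its own codec, which shows that the file extension alone does not say how the content is encoded.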
- Interoperable Master Format 43 (IMF) consisting of essence files as MXF 44 tracks including video, audio, data and dynamic metadata essences, composition playlist, and packaging data
- ProRes (QuickTime .mov container with 4444 (XQ)/4444 45, or 422 HQ 46 codecs)
- MPEG-2 47
- XDCAM 48 (MXF container with HD422, SHD422, or HD codecs)
IASA's guidelines for video formats are more detailed, and they are based on six different categories depending on the format of the original video 49. IASA has also published a thorough Target Format Comparison Table 50. The video formats and codecs used for preservation purposes often require so much storage space that their distribution over the network is slow. Thus, it often makes sense to make one or more user copies for viewing and for other use via the network. User copies may be stored, for example, with the following codecs:
- H.264 51 (Advanced Video Coding AVC, MPEG-4 part 10)
- H.265 52 (High Efficiency Video Coding HEVC, MPEG-H Part 2)
H.265 is newer and more advanced than H.264, and it compresses information more efficiently. At comparable video quality, H.265 files are roughly half the size of H.264 files.
It is recommended to collect sufficient technical metadata about the original video and the equipment used, as well as the parameters of the digital video files. This helps, for example, to automate future migrations. Likewise, systematic errors in digitisation can be corrected: if it is later discovered that a certain piece of playback equipment has worked incorrectly, all videos digitised with that equipment can be digitised again. The technical metadata should contain, for example, the following information:
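A user copy could be produced, for instance, with FFmpeg. The sketch below only builds the command line; it assumes an FFmpeg build that includes the `libx264` and `libx265` encoders, and the function name, the AAC audio choice and the CRF value of 23 are illustrative assumptions rather than recommendations of this RFC.

```python
def user_copy_command(source, target, codec="h264", crf=23, deinterlace=False):
    """Build an ffmpeg command that encodes a viewing copy with H.264 or H.265."""
    encoders = {"h264": "libx264", "h265": "libx265"}
    if codec not in encoders:
        raise ValueError(f"unsupported codec: {codec}")
    cmd = ["ffmpeg", "-i", source]
    if deinterlace:
        # yadif is FFmpeg's deinterlacing filter, useful for interlaced sources.
        cmd += ["-vf", "yadif"]
    cmd += ["-c:v", encoders[codec], "-crf", str(crf), "-c:a", "aac", target]
    return cmd


print(" ".join(user_copy_command("master.mov", "access.mp4", deinterlace=True)))
# ffmpeg -i master.mov -vf yadif -c:v libx264 -crf 23 -c:a aac access.mp4
```

The file names are placeholders. The preservation master is left untouched, and the user copy can be regenerated at any time with different settings.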
- Type of the recording
- The playback equipment and playback parameters, especially if they differ from those specific to the media
- The A/D converter used
- Recording software and its version
- Responsible persons and dates
- The format of the digital recording, including the video codec and bit rate
- The characteristics of audio tracks such as format, number of tracks, sampling frequency and audio resolution
- Possible anomalies that occurred in the digitisation process
- The selection of material to be digitised is discussed in more detail in RFC-0008.
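The technical metadata items listed above could be captured in a structured record. The following is a minimal sketch, not a prescribed schema: all class and field names are hypothetical, and the example values are placeholders.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class AudioInfo:
    audio_format: str
    tracks: int
    sample_rate_hz: int
    bit_depth: int


@dataclass
class DigitisationRecord:
    recording_type: str
    playback_equipment: str
    playback_parameters: str
    ad_converter: str
    recording_software: str
    software_version: str
    responsible_person: str
    date: str                      # ISO 8601 date string
    container: str
    video_codec: str
    video_bit_rate_mbps: float
    audio: AudioInfo
    anomalies: list = field(default_factory=list)

    def to_json(self) -> str:
        """Serialise the record, including nested audio details, as JSON."""
        return json.dumps(asdict(self), indent=2)


def digitised_with(records, equipment):
    """Select records made with a given playback device, e.g. to redo them."""
    return [r for r in records if r.playback_equipment == equipment]


# Placeholder values for illustration only.
record = DigitisationRecord(
    recording_type="Hi8 cassette",
    playback_equipment="Example player A",
    playback_parameters="S-Video output",
    ad_converter="Example converter",
    recording_software="Example capture tool",
    software_version="1.0",
    responsible_person="N.N.",
    date="2022-11-30",
    container="mov",
    video_codec="example codec",
    video_bit_rate_mbps=25.0,
    audio=AudioInfo("PCM", 2, 48_000, 16),
    anomalies=["dropout at 00:12:31"],
)
print(len(digitised_with([record], "Example player A")))  # 1
```

Recording the playback equipment as a searchable field is what makes it possible to find every file affected by a faulty device, as described above.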
Footnotes
- http://calpreservation.org/wp-content/uploads/2013/10/2013-Audiovisual-Formats_draft_webversion-2013oct15.pdf
- https://obsoletemedia.org/media-preservation/media-stability-ratings/
- https://obsoletemedia.org/media-preservation/obsolescence-ratings/
- https://www.iasa-web.org/sites/default/files/publications/IASA-TC_06-C_v2019.pdf
- https://en.wikipedia.org/wiki/Comparison_of_video_container_formats
- https://www.loc.gov/preservation/digital/formats/fdd/fdd000342.shtml
- https://www.loc.gov/preservation/digital/formats/fdd/fdd000155.shtml
- https://www.loc.gov/preservation/digital/formats/fdd/fdd000052.shtml
- https://www.loc.gov/preservation/digital/formats/fdd/fdd000059.shtml
- https://www.loc.gov/preservation/digital/formats/fdd/fdd000013.shtml
- https://www.loc.gov/preservation/resources/rfs/moving.html
- https://www.loc.gov/preservation/digital/formats/fdd/fdd000013.shtml
- https://www.loc.gov/preservation/digital/formats/fdd/fdd000527.shtml
- https://www.loc.gov/preservation/digital/formats/fdd/fdd000403.shtml
- https://www.loc.gov/preservation/digital/formats/fdd/fdd000335.shtml
- https://www.iasa-web.org/sites/default/files/publications/IASA-TC_06-B_v2019.pdf
- https://www.iasa-web.org/sites/default/files/publications/IASA-TC_06-B-app_v2019.pdf
- https://www.loc.gov/preservation/digital/formats/fdd/fdd000081.shtml
- https://www.loc.gov/preservation/digital/formats/fdd/fdd000530.shtml