About Word Clock, SMPTE, and MIDI Clock


Before we start looking at how Cubase handles synchronization, it is important to understand the different types of synchronization, their terminology, and the basic concepts behind these terms. The idea behind synchronization is that there is always a sender/receiver relationship between the source of the synchronization signal and the devices that follow it. There can only be one sender, but there can be many receivers for this sender. There are three basic concepts here: timecode, MIDI Clock, and Word Clock.

Timecode

The concept behind timecode is simple: It is an electronic signal used to identify a precise location on time-based media, such as audiotape, videotape, or digital systems that support timecode. This electronic signal then accompanies the media that needs to be in sync with others. Imagine that a postal worker delivering mail is the locking mechanism, the houses on the street are location addresses on the timecode, and the letters carry matching addresses. The postal worker reads each letter and makes sure it gets to the correct address, the same way a synchronizing device compares the timecode from a source and a destination, making sure they are happening at the same time. This timecode is also known as SMPTE (Society of Motion Picture and Television Engineers) timecode, and it comes in three flavors:

MTC (MIDI Timecode). Normally used to synchronize audio or video devices to MIDI devices such as sequencers. MTC messages are an alternative to using MIDI Clocks (a tempo-based synchronization system) and Song Position Pointer messages (telling a device where it is in relation to a song). MTC is essentially SMPTE (time-based) adapted for transmission over MIDI. On the other hand, MIDI Clocks and Song Position Pointers are based upon musical beats from the start of a song, played at a specific tempo (meter-based). For many nonmusical cues, such as sound elements that are not part of a musical arrangement but are found on movie sound tracks (for example, Foley, dialogue, ADR, room tones, and sound effects), it's easier for humans to reference time in some absolute way (time-based) rather than in musical beats at a certain tempo (meter-based). This is because these events are regulated by images that depict time passing, whereas music is regulated by bars and beats in most cases.
VITC (Vertical Interval Timecode). Normally used by video machines to send or receive synchronization information to and from any type of VITC-compatible device. This type of timecode is best suited for working with a Betacam or VTR device. You will rarely use this type of timecode when transferring audio-only data back and forth. VITC may be recorded as part of the video signal in an unused line, which is part of the vertical interval. It has the advantage of being readable when the playback video deck is paused.
LTC (Longitudinal Timecode). Also used to synchronize video machines, but contrary to VITC, it is also used to synchronize audio-only information, such as a transfer between a tape recorder and Cubase. LTC usually takes the form of an audio signal that is recorded on one of the tracks of the tape. Because LTC is an audio signal, it is silent if the tape is not moving.

Each one of these timecodes uses an hours:minutes:seconds:frames format.
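To make the hours:minutes:seconds:frames address more concrete, here is a minimal Python sketch that converts a timecode string to a running frame count and back. The function names and the whole-number `fps` parameter are assumptions made for this illustration; they are not part of Cubase or of any SMPTE tool, and the sketch ignores drop-frame numbering.

```python
def timecode_to_frames(timecode: str, fps: int) -> int:
    """Convert 'hh:mm:ss:ff' to a frame count from 00:00:00:00 (non-drop)."""
    hours, minutes, seconds, frames = (int(part) for part in timecode.split(":"))
    if not (0 <= minutes < 60 and 0 <= seconds < 60 and 0 <= frames < fps):
        raise ValueError(f"invalid timecode {timecode!r} at {fps} fps")
    return ((hours * 60 + minutes) * 60 + seconds) * fps + frames


def frames_to_timecode(total_frames: int, fps: int) -> str:
    """Convert a frame count back to an 'hh:mm:ss:ff' address (non-drop)."""
    seconds_total, frames = divmod(total_frames, fps)
    minutes_total, seconds = divmod(seconds_total, 60)
    hours, minutes = divmod(minutes_total, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}:{frames:02d}"
```

For example, `timecode_to_frames("01:00:00:00", 25)` counts the 90,000 frames in one hour at 25 frames per second. A synchronizer compares two such addresses in much the same way, by reducing each one to a single position.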

NOTE

ABOUT MOVIE SOUND TRACKS

The sound track of a visual production is usually made up of different sounds mixed together. These sounds are divided into six categories: Dialogue, ADR, Foley, Ambiances, Sound Effects, and Music.

The dialogue is usually the part played by the actors or narrated off-screen.

The ADR (Automatic Dialogue Replacement) refers to the process of rerecording portions of dialogue that could not be properly recorded during the production stage for different reasons. For example, if a dialogue occurs while there is a lot of noise happening on location, the dialogue is not usable in the final sound track, so the actor rerecords each line from a scene in a studio. Another example of ADR is when a scene is shot in a restaurant where the atmosphere seems lively. Because it is easier to add the crowd ambiance later, the extras in the shot are asked to appear as if they are talking, but, in fact, they might be whispering to allow the audio boom operator to get a good clean recording of the principal actors' dialogue. Then, later in a studio environment, additional recordings are made to recreate the background chatter of people in this restaurant.

The Foley (from the name of the first person who used this technique in motion pictures) consists of replacing or enhancing human-related sounds, such as footsteps, body motions (for example, noises made by leather jackets, a cup being placed on a desk, and so on), or water sounds (such as ducks swimming in a pond, a person taking a shower, or drowning in a lake).

Ambiances, also called room tones, are placed throughout a scene requiring a constant ambiance. This is used to replace the current ambiance that was recorded with the dialogue, but due to the editing process, placement of camera, and other details that affect the continuity of the sound, it cannot be used in the final sound track. Take, for example, the rumbling and humming caused by a spaceship's engines. Did you think that these were actually engines running on a real spaceship?

The sound effects in a motion picture sound track are usually not associated with Foley sounds (body motion simulation sounds). In other words, "sound effects" are the sounds that enhance a scene in terms of sonic content. A few examples are explosions, gun shots, the "swooshing" sound of a punch, an engine revving, a knife slashing, etc. Sound effects aren't meant to mimic but to enhance.

And last but not least, the music, which comes in different flavors, adds the emotional backdrop to the story being told by the images.

Frame Rates

As the name implies, a frame rate is the number of frames a film or video signal has within a second. It is also used to identify different timecode uses. Frame rates are expressed in "fps," which stands for frames per second. There are different frame rates, depending on what you are working with:

24 fps. This is used by motion picture films and in most cases, working with this medium will not apply to you because you likely do not have a film projector hooked up to your computer running Cubase to synchronize sound.
25 fps. This refers to the PAL (Phase Alternating Line) video standard, used in most of Europe and in much of Asia, and to the SECAM (Sequential Color with Memory) video standard, used in France and parts of Eastern Europe. Timecode at this rate is also referred to as EBU (European Broadcasting Union) timecode. If you live in those areas, this is the format your VCR uses. A single frame in this format is made of 625 horizontal lines.
29.97 fps. Also known as 29.97 nondrop and may also be seen as 30 fps in some older two-digit timecode machines (but not to be mistaken with the actual 30 fps timecode; if you can't see the 29.97 format, chances are the 30 format is its equivalent). This refers to the NTSC (National Television System Committee) video standard used mostly in North America. If you live in this area, this is the format your VCR uses. A single frame in this format is made of 525 horizontal lines.
29.97 fps DF. Also known as 29.97 drop frame (hence the DF at the end). This can also be referred to as 30 DF on older video timecode machines. This is probably the trickiest timecode to understand because there is a lot of confusion about the drop frame. To accommodate the extra information needed for color when this format was first introduced, the black-and-white's 30 fps was slowed to 29.97 fps for color. Though not an issue for most of you, in broadcast, the small difference between real time (also known as the wall or house clock) and the time registered on the video can be problematic. Over a period of one SMPTE hour, the video runs 3.6 seconds, or 108 frames, longer than the wall clock. To overcome this discrepancy, drop frames are used. No picture frames are actually discarded; only frame numbers are skipped. This is calculated as follows: Frame numbers 00 and 01 are skipped at each minute change, except for minutes ending in 0 (00, 10, 20, 30, 40, and 50). Skipping two frame numbers every minute would represent 120 frames per hour, but the six minutes ending in zero keep their 12 frame numbers, so 120 - 12 = 108 frames. Setting your frame rate to 29.97 DF when it's not (in other words, when the source is 29.97 nondrop) causes your synchronization to be off by 3.6 seconds per hour.
30 fps. This format was used with the first black-and-white NTSC standard. It is still used sometimes in music or sound applications in which no video reference is required.
30 fps DF. This is not a standard timecode protocol and usually refers to older timecode devices that were unable to display the decimal points when the 29.97 drop frame timecode was used. Try to avoid this timecode frame rate setting when synchronizing to video because it might introduce errors in your synchronization. SMPTE does not support this timecode.
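The drop-frame numbering described above is easier to follow as arithmetic. The following Python sketch is an illustration under stated assumptions: the function and constant names are invented for this example, the semicolon before the frames field is a common display convention rather than something this chapter specifies, and the exact NTSC rate is taken as 30000/1001 (the value usually rounded to 29.97).

```python
NOMINAL_FPS = 30                      # frame numbers run 00-29 each second
FRAMES_PER_DROP_MINUTE = 60 * 30 - 2  # 1798 real frames in a minute that skips 00 and 01
FRAMES_PER_TEN_MINUTES = 10 * 60 * 30 - 18  # 17982: nine of every ten minutes skip 2


def skipped_numbers_per_hour() -> int:
    """Two numbers skipped per minute, except the six minutes ending in 0."""
    return 2 * 60 - 2 * 6  # 120 - 12


def nondrop_drift_seconds_per_hour() -> float:
    """How much longer than one real hour 108,000 frames last at 30000/1001 fps."""
    return 108000 / (30000 / 1001) - 3600


def frames_to_drop_frame_timecode(frame_count: int) -> str:
    """Label a running frame count with a 29.97 drop-frame address."""
    ten_minute_blocks, remainder = divmod(frame_count, FRAMES_PER_TEN_MINUTES)
    skipped = 18 * ten_minute_blocks
    if remainder >= 60 * NOMINAL_FPS:
        # Past the first (exempt) minute of the block: 2 numbers skipped per minute.
        skipped += 2 * ((remainder - 60 * NOMINAL_FPS) // FRAMES_PER_DROP_MINUTE + 1)
    label = frame_count + skipped
    frames = label % NOMINAL_FPS
    seconds = (label // NOMINAL_FPS) % 60
    minutes = (label // (NOMINAL_FPS * 60)) % 60
    hours = label // (NOMINAL_FPS * 3600)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d};{frames:02d}"
```

Frame 1799 is labeled 00:00:59;29 and the very next frame is labeled 00:01:00;02, which is the skip of numbers 00 and 01 at the minute change. No picture is lost; only the labels jump.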

Using the SMPTE Generator Plug-in

The SMPTE Generator (SX Only) is a plug-in that generates SMPTE timecode in one of two ways:

It uses an audio bus output to send a generated timecode signal to an external device. Typically, you can use this mode to adjust the level of SMPTE going to other devices and to make sure that there is a proper connection between the outputs of the sound card associated with Cubase and the input of the device for which the SMPTE was intended.
It uses an audio bus output to send a timecode signal that is linked with the play position of the project currently loaded. Typically, this tells another device the exact SMPTE location of Cubase at any time, allowing it to lock to Cubase through this synchronization signal.

This plug-in is not really an effect, and timecode is not what you could call "a pleasant sound," so using it on a two-output system is impractical. Because it uses an audio output to carry its signal, you need to use an audio output on your sound card that you don't use for anything else, or at least one channel (left or right) that you can spare for this signal. Placing the SMPTE Generator on an empty audio track is also necessary because you do not want to process this signal in any way, or the information the signal contains will be compromised.

How To

To use the SMPTE Generator plug-in:

Create a new audio track.
Open the Track Inserts section in the Inspector area.
From the Plug-ins Selection drop-down menu, select the SMPTE Generator.
Expand the audio channel section in the Inspector area.
Assign the plug-in to a bus that doesn't contain any other audio signal. If you don't have an unused bus, see if you can use one side in a left/right setup and then pan the plug-in on one side and whatever was assigned to that bus to the other side. For example, use a bass on the left and the SMPTE Generator on the right.
Click the Edit button to access its panel (see Figure 14.1).

Figure 14.1. The SMPTE Generator panel.
Make sure the Framerate field displays the same frame rate as your project. You can access your Project Setup dialog box by pressing the Shift+S keys to verify if this is the case. Otherwise, set the Framerate field to the appropriate setting.
Make the connections between the output to which the plug-in is assigned and the receiving device.
Click the Generate button to start sending timecode. This step verifies if the signal is connected properly to the receiving device.
Adjust the level in either the audio channel containing the plug-in or on the receiving device's end. To lock properly, the receiving device must receive an undistorted signal.
After you've made these adjustments, click the Link button in the Plug-in Information panel.
Start the playback of your project to lock the SMPTE Generator, the project, and the receiving device together.

MIDI Clock

MIDI Clock is a tempo-based synchronization signal used to synchronize two or more MIDI devices together with a beats-per-minute (BPM) guide track. As you can see, this is different from a timecode because it does not refer to a real-time address (hours:minutes:seconds:frames). In this case, it sends 24 evenly spaced MIDI Clocks per quarter note. So, at a speed of 60 BPM, it sends 1,440 MIDI Clocks per minute (one every 41.67 milliseconds), whereas at a speed of 120 BPM, it sends double that amount (one every 20.83 milliseconds). Because it is tempo-based, the MIDI Clock rate changes to follow the tempo of the master tempo source.
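The MIDI Clock figures above follow directly from the rate of 24 clocks per quarter note. This short Python sketch reproduces them; the function names are assumptions chosen for this example.

```python
CLOCKS_PER_QUARTER_NOTE = 24  # fixed rate of the MIDI Clock signal


def clocks_per_minute(bpm: float) -> float:
    """Number of MIDI Clocks sent in one minute at the given tempo."""
    return bpm * CLOCKS_PER_QUARTER_NOTE


def clock_interval_ms(bpm: float) -> float:
    """Time between two consecutive MIDI Clocks, in milliseconds."""
    return 60_000.0 / clocks_per_minute(bpm)
```

Calling `clock_interval_ms(60)` gives roughly 41.67 milliseconds and `clock_interval_ms(120)` gives roughly 20.83 milliseconds, which shows how the spacing of the clocks shrinks as the tempo rises.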

When a sender sends a MIDI Clock signal, it sends a MIDI Start message to tell its receiver(s) to start playing a sequence at the speed or tempo set in the sender's sequence. When the sender sends a MIDI Stop message, the receiver stops playing the sequence. Up until this point, all the receiver can do is start and stop playing MIDI when it receives these messages. If you want to tell the receiver where to start in the sequence, the sender has to send what is called a Song Position Pointer message, telling the receiver the location of the sender's song position. This message expresses the position as a count of sixteenth notes (six MIDI Clocks each) from the start of the song.
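For readers who want to see what these messages look like on the wire, here is a small Python sketch of the raw bytes. The status byte values come from the MIDI 1.0 specification, where the message that halts playback is named Stop (0xFC); the function names and the 4/4 default are assumptions for this example, and nothing here is specific to Cubase.

```python
MIDI_CLOCK = 0xF8             # sent 24 times per quarter note
MIDI_START = 0xFA             # start playing from the beginning
MIDI_CONTINUE = 0xFB          # resume from the current song position
MIDI_STOP = 0xFC              # stop playing
SONG_POSITION_POINTER = 0xF2  # followed by two 7-bit data bytes


def song_position_message(sixteenths: int) -> bytes:
    """Build a Song Position Pointer message.

    The position is counted in sixteenth notes (6 MIDI Clocks each) from
    the start of the song and sent as a 14-bit value, low 7 bits first.
    """
    if not 0 <= sixteenths <= 0x3FFF:
        raise ValueError("song position must fit in 14 bits")
    return bytes([SONG_POSITION_POINTER, sixteenths & 0x7F, (sixteenths >> 7) & 0x7F])


def bar_to_sixteenths(bar: int, beats_per_bar: int = 4) -> int:
    """Sixteenth-note offset of the first beat of a 1-based bar (quarter-note beats)."""
    return (bar - 1) * beats_per_bar * 4
```

To cue a receiver to bar 9 of a song in 4/4, a sender could transmit `song_position_message(bar_to_sixteenths(9))` followed by the single byte `MIDI_CONTINUE`, and then keep sending `MIDI_CLOCK` bytes at the current tempo.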

Using MIDI Clock should be reserved for use between MIDI devices only, not for audio. As soon as you add digital audio or video, you should avoid using MIDI Clock because it is not well-suited for these purposes. Although it keeps a good synchronization between similar MIDI devices, the audio requires much greater precision. Video, on the other hand, works with time-based events, which do not translate well in BPM.

MIDI Machine Control

Another type of MIDI-related synchronization is the MIDI Machine Control (MMC). The MMC protocol uses System Exclusive messages over a MIDI cable to remotely control hard disk recording systems and other machines used for recording or playback. Many MIDI-enabled devices support this protocol.

MMC sends MIDI messages to a device, giving it commands such as play, stop, rewind, go to a specific location, punch-in, and punch-out on a specific track.
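As an illustration of how such commands travel as System Exclusive messages, the following Python sketch builds a few MMC messages as raw bytes. The byte layout and command numbers reflect the MMC portion of the MIDI specification as commonly documented; the function names and the dictionary are assumptions for this example, and the sketch leaves the frame-rate bits of the hours byte at zero, so verify the details against your device's documentation before relying on it.

```python
SYSEX_START = 0xF0
SYSEX_END = 0xF7
UNIVERSAL_REAL_TIME = 0x7F
MMC_COMMAND = 0x06           # sub-ID marking a Machine Control command
ALL_DEVICES = 0x7F           # device ID addressing every listening machine

# A subset of single-byte MMC commands (names chosen for this sketch).
MMC_COMMANDS = {
    "stop": 0x01,
    "play": 0x02,
    "fast_forward": 0x04,
    "rewind": 0x05,
    "punch_in": 0x06,        # called Record Strobe in the specification
    "punch_out": 0x07,       # called Record Exit in the specification
}


def mmc_command(name: str, device_id: int = ALL_DEVICES) -> bytes:
    """Build a simple transport command such as play or stop."""
    return bytes([SYSEX_START, UNIVERSAL_REAL_TIME, device_id,
                  MMC_COMMAND, MMC_COMMANDS[name], SYSEX_END])


def mmc_locate(hours: int, minutes: int, seconds: int, frames: int,
               device_id: int = ALL_DEVICES) -> bytes:
    """Build a Locate (go to position) command; subframes are set to 0."""
    if not (0 <= hours < 24 and 0 <= minutes < 60 and 0 <= seconds < 60 and 0 <= frames < 30):
        raise ValueError("timecode field out of range")
    return bytes([SYSEX_START, UNIVERSAL_REAL_TIME, device_id, MMC_COMMAND,
                  0x44, 0x06, 0x01,  # Locate, 6 data bytes follow, target sub-command
                  hours, minutes, seconds, frames, 0x00, SYSEX_END])
```

A sequencer acting as the MMC controller might send `mmc_locate(0, 1, 30, 0)` followed by `mmc_command("play")` to move a recorder to 00:01:30:00 and start it rolling.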

To make use of MMC in a setup in which you are using a multitrack tape recorder and a sequencer, you need to have a timecode (SMPTE) track sending timecode to a SMPTE/MTC converter. Then you send the converted MTC to Cubase so that it can stay in sync with the multitrack recorder. Both devices are also connected through MIDI cables. It is the multitrack that controls Cubase's timing, not vice versa. Cubase, in return, can transmit MMC messages through its MIDI connection with the multitrack, which is equipped with a MIDI interface. These MMC messages tell the multitrack to rewind, fast forward, and so on. When you click Play in Cubase, it tells the multitrack to go to the position at which playback in Cubase's project begins. When the multitrack reaches this position, it starts playing the tape back. After it starts playing, it then sends timecode to Cubase, to which it then syncs.

Digital Clock

The digital clock is another way to synchronize two or more devices together by using the sampling frequency of the sender device as a reference. This type of synchronization is often used with MTC in a music application such as Cubase to lock both audio sound card and MIDI devices with video devices. In this case, the sender device is the sound card. This is by far the most precise synchronization mechanism discussed here. Because it uses the sampling frequency of your sound card, it is precise to 1/44,100th of a second when you are using a 44.1 kHz sampling frequency (or about 0.02 milliseconds). Compare this with the precision of SMPTE timecode (around 33 milliseconds at 30 fps) and MIDI Clock (20.83 milliseconds at 120 BPM), and you quickly realize that this synchronization is very accurate.
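The difference in resolution between these three references can be checked with a few lines of Python. This is only a back-of-the-envelope sketch; the function names are assumptions for this example.

```python
def sample_period_ms(sample_rate_hz: float) -> float:
    """Duration of one audio sample, in milliseconds."""
    return 1000.0 / sample_rate_hz


def frame_period_ms(fps: float) -> float:
    """Duration of one timecode frame, in milliseconds."""
    return 1000.0 / fps


def midi_clock_period_ms(bpm: float) -> float:
    """Spacing of MIDI Clocks (24 per quarter note), in milliseconds."""
    return 60_000.0 / (bpm * 24)


def samples_per_frame(sample_rate_hz: float, fps: float) -> float:
    """How many audio samples fit inside a single timecode frame."""
    return sample_rate_hz / fps
```

At 44.1 kHz and 30 fps, `samples_per_frame(44100, 30)` shows that a single timecode frame spans 1,470 samples, which is why a digital clock is needed when a transfer has to be accurate to the sample.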

When you make a digital transfer between two digital devices, the digital clock of the sender device is sent to the receiver device, making sure that every bit of the digital audio from the sender device fits with the receiver device. Failure to do so results in signal errors and will lead to signal degradation. When a receiver device receives a Word Clock (a type of digital clock) from its sender, it replaces its own clock with the one provided by this sender.

A digital clock can be transmitted on one of these cables:

S/PDIF (Sony/Philips Digital InterFace). This format is probably the most common way to connect two digital devices together. Although this type of connection transmits digital clock information, it is usually referred to by its name rather than Word Clock. S/PDIF cables typically have RCA connectors at each end (an optical variant also exists) and carry digital audio information with embedded digital audio clock information. You can transmit mono or stereo audio information on a single S/PDIF connection.
AES/EBU (Audio Engineering Society/European Broadcasting Union). This is another very common, yet not as popular, type of digital connection used to transfer digital information from one device to another. AES/EBU uses an XLR connector at each end of the cable; like the S/PDIF format, it carries the digital audio clock embedded in its data stream. You can also transmit mono or stereo audio information on this type of connection. Because this type of connection uses XLR connectors, it is less susceptible to creating clicks and pops when you connect them, but because they are more expensive, you won't find them on low-cost equipment.
ADAT (Alesis Digital Audio Technology). This is a proprietary format developed by Alesis that carries up to eight separate digital audio signals and Word Clock information over a single-wire, fiber-optic cable. Most sound cards do not provide ADAT connectors on them, but if yours does, use it to send and receive digital clock information from and to an ADAT-compatible device.
TDIF (Tascam Digital InterFace). This is a proprietary format developed by Tascam that also provides eight channels of digital audio in both directions, with up to 24-bit resolution. It also carries clocking signals that are used for synchronizing the transmission and reception of the audio; however, it does not contain Word Clock information, so you typically need to connect TDIF cables along with Word Clock cables (see below) if you want to lock two digital audio devices using this type of connection.
Word Clock. A digital clock is called Word Clock when it is sent over its own cable. Because Word Clock signals contain high frequencies, they are usually transmitted on 75-ohm coaxial cables for reliability. Usually, a coaxial BNC connector is used for Word Clock transfers.

To be able to transfer digital audio information in sync from one digital device to another, all devices have to support the sender's sampling rate. This is particularly important when using sampling frequencies other than 44.1 or 48 kHz, since those two are standard on most digital audio devices.

When synchronizing two digital audio devices together, the digital clock might not be the only synchronization clock needed. If you are working with another digital hard disk recorder or multitrack analog tape recorder, you need to send transport controls to and from these devices along with the timing position generated by this digital clock. This is when you have to lock both the digital clock and timecode together. Avoid using MIDI Clock at all costs when synchronizing with digital audio. The next section discusses different possibilities and how to set up Cubase to act as a sender or a receiver in the situations described previously.

When doing digital transfers between a digital multitrack tape and Cubase, it is important that both the Word Clock (digital clock) information and the timecode information be correlated to ensure a no-loss transfer and that for every bit on one end, there's a corresponding bit on the other. This high-precision task can be performed through ASIO Position Protocol (APP).

APP uses the ASIO driver provided for your sound card and a compatible APP digital device, such as Alesis' ADAT. In this type of setup, the ADAT provides the master (sender) Word Clock and the timecode information to Cubase. The ASIO 2.0 compatible driver of your sound card simply follows this information and stays accurate to the last sample.
