Dialogue workflows.

In a film, every sound track plays an important role, but dialogue obviously stands out from the other elements. Dialogue conveys information, and the actor's emphasis sets the tone. Without a practical example it is difficult to convey what to listen for, but I'll try.

You have the tracks and should start working on them. What should you start with? With the sound effects? Maybe with the music? Of course not. Dialogue always comes first. You have to analyse the quality of the tracks you have and start working on them accordingly.

I've received film material where the dialogue was relatively OK but had already been compressed. It was impossible to make the sound softer in the quieter scenes; only EQ and volume changes were left. The compression had also made the noise too loud and amplified the room reflections. For reducing noise and reverberation, the iZotope RX family of software provides a relatively tolerable solution, but you have to be very careful with it and really watch where the limit is, because even a tiny change in the Threshold, measured in dB, can make the sound harsh. Use this tool only where it is absolutely necessary, and think ahead about which background sounds will be present that can usefully mask the noise.

Eliminating noise is sometimes as simple as turning down the volume. Once the noise has been taken down, however, we find that the space becomes very empty. The easiest way to fill this in is to use pink noise, but we won't cover that in this part of the workflow.

Next, the significant volume differences between scenes are brought roughly to the same level. We compress the signal with a minimum of adjustment and even out any frequency differences with EQ. Conversations within a scene do not usually differ much, but one actor's voice may be too low, so it is a good idea to bring the two speakers closer together in both frequency and volume. Which voice to change is always a matter of individual judgement.

Try to work with as few tracks as possible! In other words, if one part of the conversation needs EQ but another does not, don't open a new track; apply the effect to that part only. I'm not familiar with every editing program in sufficient detail, but most of them let you put an effect on the section you're editing rather than on the whole track. That way the session stays easier to read. Of course, you can save CPU by keeping a single automated effect on the track, but this is not practical: you will get confused after a while, and automation does not give a clean, immediate change when you have to switch abruptly between two EQ settings.

Once all this is done, it will become apparent where there is still unwanted noise in the recording. That noise should be removed or reduced to an acceptable level.
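The levelling step described above, bringing two scenes or two speakers roughly to the same volume, can be sketched in a few lines. This is only an illustration, not how any particular editor does it: the function names are made up for the example, samples are assumed to be floats with full scale at 1.0, and plain RMS is used as a rough stand-in for perceived loudness (a real loudness meter applies frequency weighting and gating).

```python
import math


def rms_dbfs(samples):
    """RMS level of a clip in dBFS (assumes float samples, full scale = 1.0)."""
    if not samples:
        raise ValueError("clip is empty")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")  # digital silence
    return 10.0 * math.log10(mean_square)


def matching_gain_db(clip, reference):
    """Gain in dB that brings `clip` to the RMS level of `reference`."""
    return rms_dbfs(reference) - rms_dbfs(clip)


def apply_gain(samples, gain_db):
    """Scale a clip by a gain given in dB."""
    factor = 10.0 ** (gain_db / 20.0)
    return [s * factor for s in samples]


# Hypothetical clips: the second speaker sits at half the amplitude of the first.
speaker_a = [0.2, -0.2, 0.2, -0.2]
speaker_b = [0.1, -0.1, 0.1, -0.1]
gain = matching_gain_db(speaker_b, speaker_a)
speaker_b_matched = apply_gain(speaker_b, gain)
```

The number this produces is a starting point only. Which voice to move, and by how much, remains a decision for the ear.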

The dialogue may be recorded in a studio afterwards. In this case, the sound recorded on location and the sound recorded in the studio will not be the same. If the studio recording replaces a faulty location recording, it has to be matched to the surrounding location sound. EQ is the most important weapon for this. When the speech has to sit in an enclosed, reverberant space, a convolution reverb is also necessary. For an accurate simulation, a specialist records a short, sharp sound, such as a snap or a clap, on location. The recording of that sound and its decay is the impulse response. It can be loaded into the reverb software, which analyses the frequency content and length of the reverberation that follows the snap and adjusts itself accordingly. In the best case, we then only have to optimise the reverb volume. The reverb will be stereo, so it will need to be summed to mono if the scene you are working on is mono.
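What a convolution reverb does with the impulse response can be shown with a minimal sketch. The function names and the tiny three-sample "room" are assumptions made for the example; real impulse responses are tens of thousands of samples long, and real plug-ins use FFT-based convolution rather than the slow direct form shown here.

```python
def convolve(dry, impulse_response):
    """Direct-form convolution: every dry sample triggers a scaled copy of the room's response."""
    if not dry or not impulse_response:
        raise ValueError("both signals must be non-empty")
    out = [0.0] * (len(dry) + len(impulse_response) - 1)
    for i, x in enumerate(dry):
        for j, h in enumerate(impulse_response):
            out[i + j] += x * h
    return out


def add_reverb(dry, impulse_response, reverb_gain):
    """Mix the dry dialogue with the convolved (wet) signal at the chosen reverb volume."""
    wet = convolve(dry, impulse_response)
    out = [reverb_gain * w for w in wet]
    for n, x in enumerate(dry):
        out[n] += x  # the dry signal is passed through unchanged
    return out


# Hypothetical data: a single click as "dialogue" and a three-sample decay as the room.
dry_click = [1.0, 0.0]
room = [0.0, 0.5, 0.25]
result = add_reverb(dry_click, room, reverb_gain=0.5)
```

The `reverb_gain` parameter corresponds to the one control the text says is left to optimise once the impulse response is loaded.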

Perhaps the actor is narrating his own actions after the fact. Here an exact match matters less, and it may even be better to leave the difference clearly audible, which makes the narration feel more important.

Returning to stereo sound, it may happen that the dialogue you receive was recorded in stereo. Unless there is some special reason for this, it is unnecessary and, in cinema at least, can be harmful. If the recording also includes atmospheric noise, such as the low rumble of cars driving past or the roar of a city carried on the wind, this can cause problems in stereo in a cinema. It is not a law, of course, but almost identical sounds coming from both sides may partly cancel each other out. The end result can be a strange flanger-like or chorus-like sound if you're not careful. Also, those sitting on the left side of the cinema would hear the speech more from the left, and those sitting on the right would hear it more from the right. This is what the centre speaker in a cinema is for, and dialogue is mostly placed there in mono. Unless stereo is warranted, make the dialogue mono by discarding one channel and placing the other in the centre, and if necessary replace the ambience with separately recorded sound in the left and right speakers.
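The cancellation mentioned above is easy to reproduce with a contrived example. In the sketch below, the two channels carry the same 1 kHz tone, but the right channel arrives half a period late (an assumption chosen to make the effect extreme; real recordings cancel only partially and only at some frequencies, which is what produces the flanger-like colour). Summing the channels wipes the tone out, while keeping a single channel preserves it. All names are made up for the illustration.

```python
import math

SAMPLE_RATE = 48000
TONE_HZ = 1000
HALF_PERIOD = SAMPLE_RATE // TONE_HZ // 2  # 24 samples at these settings


def tone(delay_samples, length=480):
    """A 1 kHz sine, optionally delayed by a whole number of samples."""
    return [
        math.sin(2.0 * math.pi * TONE_HZ * (n - delay_samples) / SAMPLE_RATE)
        for n in range(length)
    ]


def sum_to_mono(left, right):
    """Fold stereo to mono by averaging both channels."""
    return [(l + r) / 2.0 for l, r in zip(left, right)]


def keep_one_channel(left, right, use_left=True):
    """Fold stereo to mono by discarding one channel, as the text recommends."""
    return list(left if use_left else right)


def peak(samples):
    return max(abs(s) for s in samples)


left = tone(delay_samples=0)
right = tone(delay_samples=HALF_PERIOD)  # same sound, arriving half a period later

summed_peak = peak(sum_to_mono(left, right))      # practically zero: the tone cancels
single_peak = peak(keep_one_channel(left, right))  # close to 1.0: the tone survives
```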

After the general improvements, we can move on to the other tracks. Obviously, from now on we can no longer concentrate on the dialogue alone or on the effects alone; we need to hear and judge the necessary changes as a whole. If a very important, heartbreaking piece of music is playing loudly, the volume of the dialogue needs to be adjusted to it, not the other way round. And a quieter, calmer passage doesn't need loud dialogue, so here you start to adjust the speech track to the other sounds.

How the compressor is used depends largely on the quality of the raw material. For a recording with a resonant sound, the quality of the compressor is very important. Always try to use soft settings, and avoid extremely short attack and release times. For more problematic parts, you may need to adjust the volume manually by cutting the clip at individual syllables. Don't let the threshold cut too deeply into the signal unless the sound material is of high enough quality to allow it! If possible, a minimal amount of compression can be applied to the whole track with an analogue opto compressor. These are soft and very well matched to human speech, and the sound remains natural. Note, however, that if you want to apply this, you should do so before using a stereo reverb, delay, or other stereo effect to simulate a room. The fact is that using analogue hardware is cumbersome and may not always add much to the sound, so of course you should try software first. But it has been my experience that even a moderately good hardware compressor can beat its software counterpart. DSP or UAD notwithstanding, software compressors still haven't caught up with the sound of the better analogue units. So if the raw material is very delicate, you can try an analogue compressor, as long as it is not a VCA compressor.
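To make "soft settings" a little more concrete, here is a sketch of the static gain curve of a compressor with a soft knee, using a commonly published textbook formulation. It does not model any specific hardware or plug-in, it leaves out attack and release entirely, and the threshold, ratio and knee values are arbitrary assumptions for the illustration.

```python
def compressed_level_db(level_db, threshold_db, ratio, knee_db):
    """Static compressor curve: output level in dB for a given input level in dB."""
    over = level_db - threshold_db
    if knee_db > 0.0 and 2.0 * abs(over) <= knee_db:
        # Inside the knee: blend gradually from no compression to the full ratio.
        return level_db + (1.0 / ratio - 1.0) * (over + knee_db / 2.0) ** 2 / (2.0 * knee_db)
    if over > 0.0:
        # Above the knee: the part over the threshold is divided by the ratio.
        return threshold_db + over / ratio
    # Below the knee: the signal passes unchanged.
    return level_db


def gain_reduction_db(level_db, threshold_db, ratio, knee_db):
    """How many dB the compressor takes off at this input level (0 or positive)."""
    return level_db - compressed_level_db(level_db, threshold_db, ratio, knee_db)


# Hypothetical gentle setting: threshold -20 dB, ratio 2:1, 6 dB knee.
quiet_line = gain_reduction_db(-40.0, -20.0, 2.0, 6.0)  # untouched
loud_line = gain_reduction_db(-10.0, -20.0, 2.0, 6.0)   # reduced by half of the 10 dB overshoot
```

A lower threshold or a higher ratio increases the gain reduction on every loud syllable, and with it the level of noise and reflections in the gaps relative to the speech, which is the risk the text warns about with lower-quality material.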

It's pretty much all coming together. Where do we set the dialogue volume? No matter how we start the project, at the beginning nothing is ever where it should be. But now that the dialogue is almost ready, we can finally set it to the right level. The right value depends on a number of factors. Some people have written online that they prefer a True Peak of 10-11 dB. I use a lower value. However you set it, everything else has to be set relative to it. When I'm done mixing, I finalise the volumes to the standard level according to what the loudness meter shows.

If the values are OK and we're hearing pretty much what we expect to hear in the cinema, but there are still one or two places where the dialogue is not intelligible enough, we can use EQ to improve those passages. Here we just apply a subtle boost at the frequencies where intelligibility improves.
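A subtle boost of this kind is typically a bell (peaking) filter. The sketch below builds one from the widely used "Audio EQ Cookbook" biquad formulas and checks its gain at a given frequency. The centre frequency of 3 kHz, the +2 dB gain and the Q of 1.0 are assumptions picked for the illustration only; the text does not name specific values, and the right frequency has to be found by ear for each voice.

```python
import cmath
import math


def peaking_eq_coeffs(sample_rate, centre_hz, gain_db, q):
    """Biquad coefficients (b, a) for a peaking EQ, normalised so that a[0] == 1."""
    amp = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * centre_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    b = [1.0 + alpha * amp, -2.0 * math.cos(w0), 1.0 - alpha * amp]
    a = [1.0 + alpha / amp, -2.0 * math.cos(w0), 1.0 - alpha / amp]
    return [x / a[0] for x in b], [x / a[0] for x in a]


def apply_biquad(samples, b, a):
    """Run the filter over a clip (direct form I)."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out


def gain_db_at(freq_hz, sample_rate, b, a):
    """Gain of the filter in dB at one frequency."""
    z_inv = cmath.exp(-1j * 2.0 * math.pi * freq_hz / sample_rate)
    h = (b[0] + b[1] * z_inv + b[2] * z_inv ** 2) / (a[0] + a[1] * z_inv + a[2] * z_inv ** 2)
    return 20.0 * math.log10(abs(h))


# Hypothetical setting: +2 dB bell at 3 kHz, Q = 1.0, 48 kHz sample rate.
b, a = peaking_eq_coeffs(48000, 3000.0, 2.0, 1.0)
boost_at_centre = gain_db_at(3000.0, 48000, b, a)  # the full boost
boost_at_dc = gain_db_at(0.0, 48000, b, a)         # no change far below the bell
```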
