Reasons for Poor Audio

Devices and Proximity

Many recordings are made with small handheld microcassette or digital voice recorders that have been concealed. When the recorder is hidden in a shirt pocket or a purse, the sound becomes muffled and the sibilant sounds of speech (esses, effs, etc.) are hard to hear. When the device is placed under a car seat, road vibrations shake it and the roar can become overpowering.

The distance from the device to the person speaking, compared with its distance from any noise source, plays a big role in intelligibility. In interior rooms the microphone picks up the sound reflections coming off the walls, floor and ceiling, blurring the speech of the person farther away. If the device is placed near an air conditioning vent, the blower noise will raise the noise floor and make the speech harder to hear.

Recording Methods for Phone Calls

Many people try to record phone conversations or voicemails by holding one device up to the other and recording through the air.  As a result the speech sounds very tinny and low in level. Any movement or noise in the room will overpower the speech.

Direct recording is always preferable. There are quite a few cell phone apps that promise two-way recording of phone calls. Be sure to test the app with your phone before recording something important. You might need to turn your speaker on in order to get both sides of the conversation.

Device Settings

Device settings also contribute to poor fidelity.  Digital devices only have so much memory.  The length of recording time available is inversely related to the recording quality you select.  The better the quality, the more storage is taken and the less time is available.
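The trade-off between quality and recording time can be put into rough numbers. The sketch below assumes a constant bitrate and a hypothetical recorder with 1000 MB of free storage; the quality labels and bitrates are illustrative and are not taken from any particular device.

```python
def recording_hours(storage_megabytes: float, bitrate_kbps: float) -> float:
    """Hours of audio that fit in the given storage at a constant bitrate."""
    storage_bits = storage_megabytes * 1_000_000 * 8   # megabytes -> bits
    seconds = storage_bits / (bitrate_kbps * 1000)     # bits / (bits per second)
    return seconds / 3600


# Hypothetical quality settings, for illustration only.
for label, kbps in [("long play", 8), ("standard", 32), ("high quality", 128)]:
    hours = recording_hours(1000, kbps)
    print(f"{label:>12} ({kbps:>3} kbps): {hours:6.1f} hours")
```

Each doubling of the bitrate halves the available recording time, which is why long-play modes are tempting and why they cost so much in fidelity.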

For digital devices this is determined by the sampling rate and the bitrate. Higher rates give more accurate results but take up more storage. Lower rates result in poorer quality audio. The first thing to suffer is the high-end sibilant sounds (S, F, Z). Other audible artifacts can appear, such as phasing (that “underwater” sound) and “musical” bell-like tones. Also, let’s not forget distortion caused by incorrectly set record levels or faulty automatic level compensating circuitry.
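Why the sibilants go first can be illustrated with the sampling theorem: a digital recording cannot capture frequencies above half its sampling rate. The sketch below assumes, purely for illustration, that most sibilant energy sits between roughly 4 kHz and 8 kHz; the real range varies with the speaker and the sound.

```python
# Rough band for sibilant energy in Hz. This is an assumption made
# for illustration, not a measured value.
SIBILANT_BAND_HZ = (4000, 8000)


def highest_captured_hz(sample_rate_hz: float) -> float:
    """Upper frequency limit (Nyquist frequency) for a given sampling rate."""
    return sample_rate_hz / 2


def sibilant_band_captured(sample_rate_hz: float) -> float:
    """Fraction of the assumed sibilant band that lies below the limit."""
    low, high = SIBILANT_BAND_HZ
    limit = highest_captured_hz(sample_rate_hz)
    return max(0.0, min(limit, high) - low) / (high - low)


for rate in (8000, 16000, 44100):
    print(f"{rate:>6} Hz sampling: captures up to {highest_captured_hz(rate):7.0f} Hz, "
          f"{sibilant_band_captured(rate):.0%} of the assumed sibilant band")
```

Under these assumptions, a recorder sampling at 8000 Hz captures none of the assumed sibilant band, while 16000 Hz and above cover all of it.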

Automatic Voice Activation

If your device has this available, DON'T USE IT. Voice activation can result in files where the device stops recording in the middle of a sentence and then picks up again, maybe a few minutes later. It is impossible to tell how long the pause is, so there is no continuity to the recording. You won't be able to trust what you are hearing.

Power Sources

If the recorder is placed near a power source with a strong transformer, the electrical hum can often be as loud as the voice signal.  In cheaper microcassette recorders, the noise of the motors that drive the cassette hubs can be picked up by the recording heads and heard on the tape.

External Devices

Other external devices can leave their mark as well. For instance, smartphones generate strong pulses that can be picked up by recorders located many feet away. Medical equipment can generate loud tones as well. Air conditioners create full-spectrum noise that is difficult to separate from low-level speech signals. Televisions and radios are particularly tough when it comes to noise removal because much of the broadcast contains actors’ and commentators’ dialog. The same algorithms that separate speech from noise will bring up the broadcast dialog right along with the targeted speech.

Location

The location itself can pose challenges for audio cleanup.  For example, the traffic that speeds along a highway makes loud swooshes as the cars pass by.  If a recording is made over dinner in a restaurant, there are other conversations going on simultaneously along with ambient music and the general noise associated with food service.