Wednesday, February 15, 2012

Digital Sound Recording

During recording, a microphone is placed near the sound source. The air pressure at the microphone is then sampled many times per second, and each sample is recorded as a fixed-point number. For stereo recordings, the sound is sampled at two positions in space, corresponding to the left and right sound channels, and the sampled amplitudes are stored separately. The number of times per second the sound is sampled is called the sampling frequency. The precision of the recording refers to the number of distinct values a pressure reading can take, which sets the dynamic range of the recording. For example, if the air pressure at each sample is stored as an 8-bit signed integer, then there are 2^8 = 256 possible values that a sampled pressure can take. If we instead use a 16-bit signed integer to represent the sampled pressure, then we have 2^16 = 65,536 possible values.
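
To make the bit-depth arithmetic concrete, here is a minimal Python sketch. The function names are made up for illustration, and the signed range assumes samples are stored as two's-complement integers, which the text above does not state.

```python
from typing import Tuple


def quantization_levels(bits: int) -> int:
    """Number of distinct values a sample with this many bits can take."""
    return 2 ** bits


def signed_range(bits: int) -> Tuple[int, int]:
    """Smallest and largest value, assuming a two's-complement signed integer."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


for bits in (8, 16):
    low, high = signed_range(bits)
    print(f"{bits}-bit: {quantization_levels(bits)} levels, from {low} to {high}")
```

Each extra bit doubles the number of pressure values that can be told apart, which is why going from 8 to 16 bits makes such a large difference.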

The more bits we use to represent the pressure, and the higher the sampling frequency, the better the quality of the digital recording. However, we pay for this improvement in sound quality with an increase in file size. In fact,

File Size = (Fs × t × d)/8 (bytes)

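where Fs is the sampling frequency in Hz, t is the length of the recorded sound in seconds, and each amplitude is recorded using d bits. We divide by 8 to convert from bits to bytes (1 byte = 8 bits), since file size is typically measured in bytes. This is the size of a single channel of uncompressed audio; a stereo recording stores two channels and is therefore twice as large.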
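
As a quick worked example of the formula, here is a small Python sketch. The function name and the `channels` parameter are additions for illustration; the formula above covers one channel, and the result ignores any file header or compression.

```python
def file_size_bytes(fs_hz: float, seconds: float, bits: int, channels: int = 1) -> float:
    """Uncompressed audio size in bytes: (Fs x t x d) / 8, per channel."""
    # 'channels' is an assumed extension of the formula for stereo recordings.
    return fs_hz * seconds * bits * channels / 8


# One minute of 16-bit audio sampled at 44.1 kHz.
mono = file_size_bytes(44_100, 60, 16)
stereo = file_size_bytes(44_100, 60, 16, channels=2)
print(f"mono:   {mono:,.0f} bytes")
print(f"stereo: {stereo:,.0f} bytes")
```

Doubling the sampling frequency, the bit depth, or the number of channels each doubles the file size.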

Additionally, the samples in a sound file can be encoded linearly or logarithmically. In linear encoding, the step in sound pressure between adjacent sample values is constant, whereas in logarithmic encoding that step grows as the sample value increases. The latter has the advantage of representing a greater range of sound levels with the same number of bits, albeit with higher noise levels. The µ-law and A-law variants of the AU format, which originated with Sun Microsystems and NeXT Computer, use logarithmic encoding. An 8-bit µ-law sample, for example, can provide roughly the same dynamic range as a 12-bit linearly encoded sample.
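
To illustrate what logarithmic encoding does, here is a rough Python sketch of the continuous µ-law companding curve with µ = 255. It is only the textbook formula applied to samples scaled to the range -1 to 1; it is not the bit-exact encoding used in real µ-law files, and the function names are made up for this example.

```python
import math

MU = 255  # assumed here: the value normally paired with 8-bit mu-law


def mu_law_compress(x: float, mu: int = MU) -> float:
    """Map a linear sample in [-1, 1] to a companded value in [-1, 1]."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_law_expand(y: float, mu: int = MU) -> float:
    """Inverse of mu_law_compress: recover the linear sample."""
    return math.copysign(((1 + mu) ** abs(y) - 1) / mu, y)


# Quiet samples are spread over a large share of the output range,
# so they keep more detail once the result is rounded to 8 bits.
for x in (0.001, 0.01, 0.1, 1.0):
    y = mu_law_compress(x)
    print(f"linear {x:<6} -> companded {y:.3f} -> back to {mu_law_expand(y):.3f}")
```

A sample at 1% of full scale lands at more than a fifth of the companded range, which is how the scheme covers a wide range of sound levels with few bits.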

The Nyquist–Shannon sampling theorem states that perfect reconstruction of a signal is possible when the sampling frequency is greater than twice the maximum frequency of the signal being sampled, or equivalently, when the Nyquist frequency (half the sample rate) exceeds the highest frequency of the signal being sampled. If lower sampling rates are used, the original signal's information may not be completely recoverable from the sampled signal. For example, if a signal has an upper band limit of 100 Hz, a sampling frequency greater than 200 Hz will avoid aliasing and allow theoretically perfect reconstruction.
The full range of human hearing is roughly 20 Hz to 20 kHz, so a sampling rate must exceed 40 kHz to satisfy the sampling theorem for this full bandwidth. The 44.1 kHz sampling rate used for Compact Disc was chosen for this and other technical reasons. In digital audio the most common sampling rates are 44.1 kHz, 48 kHz, and 96 kHz. A common rate in computer sound cards is 48 kHz; many work at only this rate, offering other sample rates only through resampling. The earliest add-in cards ran at 22 kHz. At the other extreme, rates as high as 192 kHz (192,000 Hz) belong to high-resolution sources such as DVD-Audio discs and to high-end audio equipment, such as SACD or DVD-Audio players or studio equipment.
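
The sampling theorem and the effect of sampling too slowly can be sketched in a few lines of Python. The function names are made up for illustration, and `alias_frequency` assumes the input is a single pure tone with no anti-aliasing filter in front of the sampler.

```python
def nyquist_rate(f_max_hz: float) -> float:
    """Sampling rate that must be exceeded to capture content up to f_max_hz."""
    return 2 * f_max_hz


def alias_frequency(f_hz: float, fs_hz: float) -> float:
    """Apparent frequency of a pure tone at f_hz after sampling at fs_hz."""
    # Sampling cannot tell f apart from f shifted by any multiple of fs,
    # so the tone shows up at the nearest such image, between 0 and fs/2.
    return abs(f_hz - fs_hz * round(f_hz / fs_hz))


print(nyquist_rate(100))              # threshold for a 100 Hz band limit
print(nyquist_rate(20_000))           # threshold for the full range of hearing
print(alias_frequency(100, 250))      # sampled fast enough: frequency preserved
print(alias_frequency(150, 200))      # sampled too slowly: tone appears lower
print(alias_frequency(25_000, 44_100))  # ultrasonic tone folds into the audible band
```

When a tone lies above half the sampling rate, it comes back at a lower frequency that was never in the original sound, which is what aliasing means in practice.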

Here is the complete list of common audio sample rates
