Digital audio is the most common way to represent sound inside a computer. Sound is stored as a sequence of samples taken from the audio signal at constant time intervals. Each sample represents the volume (amplitude) of the signal at the moment it was measured. The length of the interval determines the sampling rate. Commonly used sampling rates lie between 8 kHz (telephone quality) and 48 kHz (DAT tapes). In uncompressed digital audio each sample requires one or more bytes of storage. The number of bytes depends on the number of channels (mono, stereo) and on the sample format (8 or 16 bits, mu-Law, etc.).
The physical devices used in digital audio are called ADC (Analog to Digital Converter) and DAC (Digital to Analog Converter). A device containing both an ADC and a DAC is commonly known as a codec. The codec device used in Sound Blaster cards is called DSP, which is somewhat misleading since DSP also stands for Digital Signal Processor (the SB DSP chip is very limited when compared to "true" DSP chips).
Sampling parameters affect the quality of the sound that can be reproduced from the recorded signal. The most fundamental parameter is the sampling rate, which limits the highest frequency that can be stored. It is well known (Nyquist's Sampling Theorem) that the highest frequency that can be stored in a sampled signal is at most 1/2 of the sampling frequency. For example, an 8 kHz sampling rate permits recording of a signal in which the highest frequency is less than 4 kHz. Higher frequencies must be filtered out before the signal is fed to the ADC, otherwise they are recorded as false lower frequencies (aliasing).
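The effect of the sampling limit can be illustrated with a small sketch. The helper names nyquist_limit_hz and alias_hz are hypothetical and not part of the library. The sketch assumes a pure tone and integer frequencies in Hz.

```c
#include <assert.h>

/* Nyquist limit in Hz: stored frequencies must stay below this value. */
unsigned nyquist_limit_hz(unsigned rate_hz)
{
    return rate_hz / 2;
}

/* Frequency (in Hz) at which a pure tone of tone_hz appears after it has
 * been sampled at rate_hz without any filtering in front of the ADC. */
unsigned alias_hz(unsigned tone_hz, unsigned rate_hz)
{
    unsigned folded;

    assert(rate_hz > 0);
    folded = tone_hz % rate_hz;      /* the spectrum repeats every rate_hz */
    if (folded > rate_hz / 2)        /* the upper half mirrors onto the lower half */
        folded = rate_hz - folded;
    return folded;
}
```

With these assumptions, alias_hz(5000, 8000) yields 3000: a 5 kHz tone sampled at 8 kHz cannot be told apart from a 3 kHz tone, which is why it has to be filtered out first.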
Sample encoding limits the dynamic range of the recorded signal (the difference between the faintest and the loudest signal that can be recorded). In theory the maximum dynamic range of a signal is number_of_bits * 6 dB. This means that 8-bit sampling resolution gives a dynamic range of 48 dB and 16-bit resolution gives 96 dB.
Quality has a price. The number of bytes required to store an audio sequence depends on the sampling rate, the number of channels and the sampling resolution. For example, just 8000 bytes of memory are required to store one second of sound using 8 kHz/8 bits/mono, but 48 kHz/16 bits/stereo takes 192 kilobytes. A 64 kbps ISDN channel is required to transfer an 8 kHz/8 bits/mono audio stream, and about 1.5 Mbps is required for DAT quality (48 kHz/16 bits/stereo). Seen the other way, a megabyte of memory holds just 5.46 seconds of sound at 48 kHz/16 bits/stereo, while at 8 kHz/8 bits/mono the same amount of memory holds 131 seconds. It is possible to reduce memory and communication costs by compressing the recorded signal, but this is outside the scope of this document.
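The figures above follow from simple arithmetic, sketched below. The function names are hypothetical, "megabyte" is taken as 1048576 bytes, and the 6 dB per bit figure is the rule of thumb used in the text.

```c
#include <assert.h>

/* Bytes needed for one second of uncompressed audio. */
unsigned long bytes_per_second(unsigned rate_hz, unsigned bits, unsigned channels)
{
    assert(bits % 8 == 0);           /* whole bytes per sample only */
    return (unsigned long)rate_hz * (bits / 8) * channels;
}

/* Bit rate of the same stream in bits per second. */
unsigned long bits_per_second(unsigned rate_hz, unsigned bits, unsigned channels)
{
    return bytes_per_second(rate_hz, bits, channels) * 8;
}

/* Theoretical dynamic range in dB, using 6 dB per bit. */
unsigned dynamic_range_db(unsigned bits)
{
    return bits * 6;
}

/* Hundredths of a second of sound that fit into 1048576 bytes. */
unsigned long centiseconds_per_megabyte(unsigned rate_hz, unsigned bits, unsigned channels)
{
    return 1048576UL * 100 / bytes_per_second(rate_hz, bits, channels);
}
```

For 48 kHz/16 bits/stereo this gives 192000 bytes per second, 1536000 bits per second and 546 hundredths of a second per megabyte, matching the numbers quoted above.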
Audio devices are opened exclusively for the selected direction. A given direction of an audio device cannot be opened more than once, whether by one process or by several, but one open call for the playback direction and a second open call for the record direction can be made independently. The audio device returns the EBUSY error to the application when another application owns the requested direction.
Low-Level layer supports these formats:
#define SND_PCM_SFMT_MU_LAW 0
#define SND_PCM_SFMT_A_LAW 1
#define SND_PCM_SFMT_IMA_ADPCM 2
#define SND_PCM_SFMT_U8 3
#define SND_PCM_SFMT_S16_LE 4
#define SND_PCM_SFMT_S16_BE 5
#define SND_PCM_SFMT_S8 6
#define SND_PCM_SFMT_U16_LE 7
#define SND_PCM_SFMT_U16_BE 8
#define SND_PCM_SFMT_MPEG 9
#define SND_PCM_SFMT_GSM 10
#define SND_PCM_FMT_MU_LAW (1 << SND_PCM_SFMT_MU_LAW)
#define SND_PCM_FMT_A_LAW (1 << SND_PCM_SFMT_A_LAW)
#define SND_PCM_FMT_IMA_ADPCM (1 << SND_PCM_SFMT_IMA_ADPCM)
#define SND_PCM_FMT_U8 (1 << SND_PCM_SFMT_U8)
#define SND_PCM_FMT_S16_LE (1 << SND_PCM_SFMT_S16_LE)
#define SND_PCM_FMT_S16_BE (1 << SND_PCM_SFMT_S16_BE)
#define SND_PCM_FMT_S8 (1 << SND_PCM_SFMT_S8)
#define SND_PCM_FMT_U16_LE (1 << SND_PCM_SFMT_U16_LE)
#define SND_PCM_FMT_U16_BE (1 << SND_PCM_SFMT_U16_BE)
#define SND_PCM_FMT_MPEG (1 << SND_PCM_SFMT_MPEG)
#define SND_PCM_FMT_GSM (1 << SND_PCM_SFMT_GSM)
Constants with the prefix SND_PCM_FMT_ are bit masks used in the info structures (the formats field),
and constants with the prefix SND_PCM_SFMT_ are used in the format structure (the format field).
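A minimal sketch of how the two constant families relate, assuming the formats field of the info structures is a bit mask built from the SND_PCM_FMT_ values. The helpers format_supported and pick_format are hypothetical. A few constants are repeated from the list above only so that the sketch compiles on its own.

```c
/* Repeated from the list above so the sketch is self-contained. */
#define SND_PCM_SFMT_MU_LAW 0
#define SND_PCM_SFMT_U8     3
#define SND_PCM_SFMT_S16_LE 4
#define SND_PCM_FMT_MU_LAW  (1 << SND_PCM_SFMT_MU_LAW)
#define SND_PCM_FMT_U8      (1 << SND_PCM_SFMT_U8)
#define SND_PCM_FMT_S16_LE  (1 << SND_PCM_SFMT_S16_LE)

/* Nonzero if format number sfmt is present in the bit mask formats. */
int format_supported(unsigned int formats, unsigned int sfmt)
{
    return (formats & (1u << sfmt)) != 0;
}

/* Returns the first entry of preferred[] found in the mask, or -1. */
int pick_format(unsigned int formats, const unsigned int *preferred, int n)
{
    int i;

    for (i = 0; i < n; i++)
        if (format_supported(formats, preferred[i]))
            return (int)preferred[i];
    return -1;
}
```

An application could pass the formats value obtained from the playback or record info structure and a list ordered from most to least preferred format.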
The function creates a new handle and opens a connection to the kernel audio interface for soundcard number card (0-N) and audio device number device. It also checks that the protocol is compatible, to prevent old programs from being used with a new kernel API. The function returns zero on success, otherwise it returns a negative error code. The error code -EBUSY is returned when another process owns the selected direction.
The default format after open is mono mu-Law at 8000 Hz, so the device can be used directly for playback of standard .au (Sparc) files.
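For readers unfamiliar with mu-Law, the following sketch shows the usual expansion of one mu-Law byte to a 16-bit linear sample, as defined by G.711 and commonly implemented. It is not part of this API, and the function name ulaw_to_s16 is hypothetical.

```c
#include <stdint.h>

/* Expands one mu-Law byte to a signed 16-bit linear sample. */
int16_t ulaw_to_s16(uint8_t code)
{
    int sign, exponent, mantissa, sample;

    code = (uint8_t)~code;              /* mu-Law bytes are stored inverted */
    sign = code & 0x80;                 /* top bit: sign */
    exponent = (code >> 4) & 0x07;      /* next three bits: segment number */
    mantissa = code & 0x0F;             /* low four bits: step within segment */
    sample = (((mantissa << 3) + 0x84) << exponent) - 0x84;
    return (int16_t)(sign ? -sample : sample);
}
```

The byte 0xFF decodes to silence (0), and the bytes 0x80 and 0x00 decode to the largest positive and negative values (32124 and -32124).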
One of the following modes should be used for the mode argument (playback writes to the device, record reads from it):
#define SND_PCM_OPEN_PLAYBACK (O_WRONLY)
#define SND_PCM_OPEN_RECORD (O_RDONLY)
#define SND_PCM_OPEN_DUPLEX (O_RDWR)
The function frees all resources allocated with the audio handle and closes the connection to the kernel audio interface. It returns zero on success, otherwise it returns a negative error code.
The function returns the file descriptor of the connection to the kernel audio interface. It returns a negative error code if an error occurred.
The file descriptor should be used with the select synchronous multiplexer function for the read direction. When select reports that data is waiting to be read, or that a write can be performed, the application should call the snd_pcm_read or snd_pcm_write function rather than access the descriptor directly. Using these functions is strongly recommended, because it leaves room for them to do, for example, data conversions if needed.
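A minimal sketch of waiting on such a descriptor with select. The helper wait_for_fd is hypothetical. The for_write option is an assumption added for playback and is not something the text above promises. Any descriptor obtained from the library would be passed in as fd.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Waits until fd is readable (for_write == 0) or writable (for_write != 0).
 * Returns 1 if the descriptor is ready, 0 on timeout, -1 on error. */
int wait_for_fd(int fd, int for_write, long timeout_ms)
{
    fd_set set;
    struct timeval tv;
    int rc;

    if (fd < 0 || fd >= FD_SETSIZE)     /* FD_SET is undefined outside this range */
        return -1;
    FD_ZERO(&set);
    FD_SET(fd, &set);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    rc = select(fd + 1, for_write ? NULL : &set, for_write ? &set : NULL, NULL, &tv);
    if (rc < 0)
        return -1;
    return rc > 0;
}
```

After a return value of 1 the application would call snd_pcm_read or snd_pcm_write on the handle, not read or write on the descriptor itself.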
These functions select blocking (the default) or non-blocking mode. In blocking mode, a call to snd_pcm_read or snd_pcm_write suspends the program for as long as it takes to play or record the whole buffer. In non-blocking mode the program is not suspended, and these functions return immediately with the count of bytes that were read from or written to the driver. In this mode they may transfer only part of the buffer, so the application should call them again to continue the operation.
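The partial transfers of non-blocking mode call for a retry loop, sketched below. To keep the sketch compilable without the library, the writer is passed in as a function pointer with the same return convention as snd_pcm_write (bytes accepted, or a negative error code). The names pcm_write_fn, write_all and stub_write are hypothetical, and stub_write only stands in for a device.

```c
#include <stddef.h>

/* Hypothetical writer type: returns bytes accepted or a negative error code. */
typedef int (*pcm_write_fn)(void *handle, const char *buf, int count);

/* Calls write_fn until all count bytes are accepted.
 * Returns the number of bytes written, or a negative error code. */
int write_all(pcm_write_fn write_fn, void *handle, const char *buf, int count)
{
    int done = 0;

    while (done < count) {
        int n = write_fn(handle, buf + done, count - done);
        if (n < 0)
            return n;                   /* pass the error code on */
        if (n == 0)
            break;                      /* nothing accepted: a real program would wait here */
        done += n;
    }
    return done;
}

/* Stand-in device: accepts at most 3 bytes per call and counts them. */
int stub_write(void *handle, const char *buf, int count)
{
    int *total = handle;
    int n = count > 3 ? 3 : count;

    (void)buf;
    *total += n;
    return n;
}
```

In a real program the point marked "would wait here" is where select on the file descriptor fits, so that the loop does not spin while the driver queue is full.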
The function fills in the *info structure. It returns zero on success, otherwise it returns a negative error code.
#define SND_PCM_INFO_CODEC        0x00000001
#define SND_PCM_INFO_DSP          SND_PCM_INFO_CODEC
#define SND_PCM_INFO_MMAP         0x00000002  /* reserved */
#define SND_PCM_INFO_PLAYBACK     0x00000100
#define SND_PCM_INFO_RECORD       0x00000200
#define SND_PCM_INFO_DUPLEX       0x00000400
#define SND_PCM_INFO_DUPLEX_LIMIT 0x00000800  /* rate for playback & record are same */
struct snd_pcm_info {
  unsigned int type;            /* soundcard type */
  unsigned int flags;           /* see SND_PCM_INFO_XXXX */
  unsigned char id[32];         /* ID of this PCM device */
  unsigned char name[80];       /* name of this device */
  unsigned char reserved[64];   /* reserved for future... */
};
The SND_PCM_INFO_MMAP flag is reserved and should never be used. It remains for compatibility with the Open Sound System driver.
If the SND_PCM_INFO_DUPLEX_LIMIT bit is set, the rate must be the same for the playback and record directions.
The function fills in the *info structure. It returns zero on success, otherwise it returns a negative error code.
#define SND_PCM_PINFO_BATCH     0x00000001
#define SND_PCM_PINFO_8BITONLY  0x00000002
#define SND_PCM_PINFO_16BITONLY 0x00000004
struct snd_pcm_playback_info {
  unsigned int flags;               /* see SND_PCM_PINFO_XXXX */
  unsigned int formats;             /* supported formats */
  unsigned int min_rate;            /* min rate (in Hz) */
  unsigned int max_rate;            /* max rate (in Hz) */
  unsigned int min_channels;        /* min channels (probably always 1) */
  unsigned int max_channels;        /* max channels */
  unsigned int buffer_size;         /* playback buffer size */
  unsigned int min_fragment_size;   /* min fragment size in bytes */
  unsigned int max_fragment_size;   /* max fragment size in bytes */
  unsigned int fragment_align;      /* align fragment value */
  unsigned char reserved[64];       /* reserved for future... */
};
If the SND_PCM_PINFO_BATCH bit is set, the driver does double buffering for this device. This means that the chip used for data processing has its own memory, so output may be delayed more than when a traditional codec chip is used.
If the SND_PCM_PINFO_8BITONLY bit is set, the driver uses an 8-bit format for 16-bit samples and does the conversion in software. This bit is used with broken SoundBlaster 16/AWE soundcards, which can't do full 16-bit duplex. If this bit is set, the application or a higher digital audio layer should convert 16-bit samples to 8-bit samples itself rather than leave the driver to do it in the kernel.
If the SND_PCM_PINFO_16BITONLY bit is set, the driver uses a 16-bit format for 8-bit samples and does the conversion in software. This bit is used with broken SoundBlaster 16/AWE soundcards, which can't do full 8-bit duplex. If this bit is set, the application or a higher digital audio layer should convert 8-bit samples to 16-bit samples itself rather than leave the driver to do it in the kernel.
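The conversions that the application is asked to do when one of the 8BITONLY or 16BITONLY bits is set can be sketched as follows. The function names are hypothetical. The sketch assumes 8-bit data in the unsigned U8 format and 16-bit data as signed samples in host byte order, so a big-endian host handling S16_LE data would need a byte swap in addition.

```c
#include <stddef.h>
#include <stdint.h>

/* Signed 16-bit samples to unsigned 8-bit samples (keeps the top 8 bits). */
void s16_to_u8(const int16_t *in, uint8_t *out, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        out[i] = (uint8_t)(((int32_t)in[i] + 32768) >> 8);  /* shift to 0..65535 first */
}

/* Unsigned 8-bit samples to signed 16-bit samples (low byte is zero). */
void u8_to_s16(const uint8_t *in, int16_t *out, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        out[i] = (int16_t)(((int32_t)in[i] - 128) * 256);   /* centre on zero, then scale */
}
```

Both loops work on sample counts, not byte counts, so a stereo buffer is simply passed with twice the number of samples per frame.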
The function fills in the *info structure. It returns zero on success, otherwise it returns a negative error code.
#define SND_PCM_RINFO_BATCH     0x00000001
#define SND_PCM_RINFO_8BITONLY  0x00000002
#define SND_PCM_RINFO_16BITONLY 0x00000004
struct snd_pcm_record_info {
  unsigned int flags;               /* see SND_PCM_RINFO_XXXX */
  unsigned int formats;             /* supported formats */
  unsigned int min_rate;            /* min rate (in Hz) */
  unsigned int max_rate;            /* max rate (in Hz) */
  unsigned int min_channels;        /* min channels (probably always 1) */
  unsigned int max_channels;        /* max channels */
  unsigned int buffer_size;         /* record buffer size */
  unsigned int min_fragment_size;   /* min fragment size in bytes */
  unsigned int max_fragment_size;   /* max fragment size in bytes */
  unsigned int fragment_align;      /* align fragment value */
  unsigned char reserved[64];       /* reserved for future... */
};
If the SND_PCM_RINFO_BATCH bit is set, the driver does double buffering for this device. This means that the chip used for data processing has its own memory, so data may be delayed more than when a traditional codec chip is used.
If the SND_PCM_RINFO_8BITONLY bit is set, the driver uses an 8-bit format for 16-bit samples and does the conversion in software. This bit is used with broken SoundBlaster 16/AWE soundcards, which can't do full 16-bit duplex. If this bit is set, the application or a higher digital audio layer should do the conversion between 16-bit and 8-bit samples itself rather than leave the driver to do it in the kernel.
If the SND_PCM_RINFO_16BITONLY bit is set, the driver uses a 16-bit format for 8-bit samples and does the conversion in software. This bit is used with broken SoundBlaster 16/AWE soundcards, which can't do full 8-bit duplex. If this bit is set, the application or a higher digital audio layer should do the conversion between 8-bit and 16-bit samples itself rather than leave the driver to do it in the kernel.
The function sets the format, rate (in Hz) and number of channels for the playback direction. It returns zero on success, otherwise it returns a negative error code.
struct snd_pcm_format {
  unsigned int format;          /* SND_PCM_SFMT_XXXX */
  unsigned int rate;            /* rate in Hz */
  unsigned int channels;        /* channels (voices) */
  unsigned char reserved[16];
};
The function sets the format, rate (in Hz) and number of channels for the record direction. It returns zero on success, otherwise it returns a negative error code.
struct snd_pcm_format {
  unsigned int format;          /* SND_PCM_SFMT_XXXX */
  unsigned int rate;            /* rate in Hz */
  unsigned int channels;        /* channels (voices) */
  unsigned char reserved[16];
};
The function sets various parameters for the playback direction. It returns zero on success, otherwise it returns a negative error code.
struct snd_pcm_playback_params {
  int fragment_size;
  int fragments_max;
  int fragments_room;
  unsigned char reserved[16];   /* must be filled with zero */
};
fragment_size is the requested size of a fragment. This value should be aligned for the current format (for example to 4 if stereo 16-bit samples are used) and to the fragment_align variable from the snd_pcm_playback_info_t structure. The range is from min_fragment_size to max_fragment_size.
fragments_max is the maximum number of fragments in the queue for wakeup. This number doesn't count a partly used fragment. If the current count of filled playback fragments is greater than this value, the driver blocks the application, or returns immediately if non-blocking mode is active.
fragments_room is the minimum number of writeable fragments for wakeup. In most cases this value should be 1, which means control returns to the application as soon as at least one fragment is free for playback. This value includes a partly used fragment, too.
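The alignment rules for fragment_size can be sketched as below. The helper pick_fragment_size is hypothetical, and the sketch assumes that fragment_align is a multiple in bytes. It rounds the requested size down to a valid multiple and then moves it into the allowed range.

```c
#include <assert.h>

/* Greatest common divisor, used to build the least common multiple. */
static unsigned gcd_u(unsigned a, unsigned b)
{
    while (b != 0) {
        unsigned t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Returns a fragment size that is a multiple of both frame_bytes and
 * fragment_align and lies in [min_size, max_size], or 0 if none exists. */
unsigned pick_fragment_size(unsigned requested, unsigned frame_bytes,
                            unsigned fragment_align,
                            unsigned min_size, unsigned max_size)
{
    unsigned step, size;

    assert(frame_bytes > 0 && fragment_align > 0);
    step = frame_bytes / gcd_u(frame_bytes, fragment_align) * fragment_align;
    size = requested / step * step;                  /* round down */
    if (size < min_size)
        size = (min_size + step - 1) / step * step;  /* smallest valid multiple */
    if (size > max_size)
        size = max_size / step * step;               /* largest valid multiple */
    if (size == 0 || size < min_size)
        return 0;
    return size;
}
```

For stereo 16-bit samples frame_bytes is 4. With fragment_align 16 and a range of 64 to 4096, a request for 1000 bytes becomes 992.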
The function sets various parameters for the record direction. It returns zero on success, otherwise it returns a negative error code.
struct snd_pcm_record_params {
  int fragment_size;
  int fragments_min;
  unsigned char reserved[16];
};
fragment_size is the requested size of a fragment. This value should be aligned for the current format (for example to 4 if stereo 16-bit samples are used) and to the fragment_align variable from the snd_pcm_record_info_t structure. The range is from min_fragment_size to max_fragment_size.
fragments_min is the minimum number of filled fragments for wakeup. The driver blocks the application (if blocking mode is selected) until the number of filled fragments reaches this value.
The function fills in the *status structure. It returns zero on success, otherwise it returns a negative error code.
struct snd_pcm_playback_status {
  int fragments;
  int fragment_size;
  int count;
  int queue;
  int underrun;
  struct timeval time;
  unsigned char reserved[16];
};
fragments is the number of fragments currently allocated by the driver for the playback direction.
fragment_size is the current fragment size used by the driver for the playback direction.
count is the count of bytes writeable without blocking.
queue is the count of bytes in the queue. Note: (fragments * fragment_size) - queue is not necessarily equal to count.
underrun gives the application the count of underruns since the last call of snd_pcm_playback_status.
time is the time at which the first sample from the next write is going to play. This value should be used for time synchronization. The returned value is the same as the one obtained from the standard C function gettimeofday( &time, NULL ).
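One way the time field could be used for synchronization is to compute when a later sample of the next write will be heard. The sketch below is an illustration under the assumption that playback proceeds at exactly the nominal rate. The helper sample_play_time is hypothetical.

```c
#include <assert.h>
#include <sys/time.h>

/* Time at which sample number sample_index (0 = first sample of the next
 * write, counted per channel set) plays, given the status time field. */
struct timeval sample_play_time(struct timeval start,
                                unsigned long sample_index, unsigned rate_hz)
{
    struct timeval t = start;

    assert(rate_hz > 0);
    t.tv_sec += sample_index / rate_hz;              /* whole seconds */
    t.tv_usec += (long)((unsigned long long)(sample_index % rate_hz)
                        * 1000000ULL / rate_hz);     /* remaining fraction */
    if (t.tv_usec >= 1000000) {                      /* carry into seconds */
        t.tv_sec += 1;
        t.tv_usec -= 1000000;
    }
    return t;
}
```

At 8000 Hz, sample 4000 plays half a second after the time reported in the status structure.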
The function fills in the *status structure. It returns zero on success, otherwise it returns a negative error code.
struct snd_pcm_record_status {
  int fragments;                /* allocated fragments */
  int fragment_size;            /* current fragment size in bytes */
  int count;                    /* number of bytes readable without blocking */
  int free;                     /* bytes in buffer still free */
  int overrun;                  /* count of overruns from last status */
  struct timeval time;          /* time the next read was taken */
  unsigned char reserved[16];
};
fragments is the number of fragments currently allocated by the driver for the record direction.
fragment_size is the current fragment size used by the driver for the record direction.
count is the count of bytes readable without blocking.
free is the count of bytes in the buffer that are still free. Note: (fragments * fragment_size) - free is not necessarily equal to count.
overrun gives the application the count of overruns since the last call of snd_pcm_record_status.
time is the time at which the next sample to be read was taken. This value should be used for time synchronization. The returned value is the same as the one obtained from the standard C function gettimeofday( &time, NULL ).
This function drains the playback buffers immediately. It returns zero on success, otherwise it returns a negative error code.
This function flushes the playback buffers. It blocks the program until the last sample has been processed. It returns zero on success, otherwise it returns a negative error code.
This function flushes (destroys) the record buffers. It returns zero on success, otherwise it returns a negative error code.
The function writes samples to the driver. The samples must be in the format specified with the snd_pcm_playback_format function. The function returns zero or a positive value if playback succeeded (the value is the count of bytes successfully written to the device), or a negative error value if an error occurred. The function suspends the process if blocking mode is active.
The function reads samples from the driver. The samples are in the format specified with the snd_pcm_record_format function. The function returns zero or a positive value if recording succeeded (the value is the count of bytes successfully read from the device), or a negative error value if an error occurred. The function suspends the process if blocking mode is active.
The following example shows how to play the first 512 kB of the file /tmp/test.au on soundcard #0, device #0:
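/* Standard headers used by this fragment. The ALSA library header that */
/* declares the snd_pcm_* functions must be included as well. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>

/* The statements below form the body of a function returning void. */
int card = 0, device = 0, err, fd, count, size, idx;
void *handle;
snd_pcm_format_t format;
unsigned char *buffer;

buffer = malloc( 512 * 1024 );
if ( !buffer ) return;
if ( (err = snd_pcm_open( &handle, card, device, SND_PCM_OPEN_PLAYBACK )) < 0 ) {
  fprintf( stderr, "open failed: %s\n", snd_strerror( err ) );
  free( buffer );
  return;
}
memset( &format, 0, sizeof( format ) );   /* also clears the reserved field */
format.format = SND_PCM_SFMT_MU_LAW;
format.rate = 8000;
format.channels = 1;
if ( (err = snd_pcm_playback_format( handle, &format )) < 0 ) {
  fprintf( stderr, "format setup failed: %s\n", snd_strerror( err ) );
  snd_pcm_close( handle );
  free( buffer );
  return;
}
fd = open( "/tmp/test.au", O_RDONLY );
if ( fd < 0 ) {
  perror( "open file" );
  snd_pcm_close( handle );
  free( buffer );
  return;
}
idx = 0;
count = read( fd, buffer, 512 * 1024 );
if ( count <= 0 ) {
  if ( count < 0 ) perror( "read from file" );
  else fprintf( stderr, "file is empty\n" );
  close( fd );
  snd_pcm_close( handle );
  free( buffer );
  return;
}
close( fd );
/* Skip the .au header: bytes 4-7 hold the big-endian offset of the audio data. */
if ( count >= 8 && !memcmp( buffer, ".snd", 4 ) ) {
  unsigned int offset = ((unsigned int)buffer[4] << 24) | ((unsigned int)buffer[5] << 16) |
                        ((unsigned int)buffer[6] << 8) | (unsigned int)buffer[7];
  if ( offset > 128 ) offset = 128;
  if ( offset > (unsigned int)count ) offset = count;
  idx = offset;
}
size = snd_pcm_write( handle, &buffer[ idx ], count - idx );
if ( size < 0 )
  fprintf( stderr, "write failed: %s\n", snd_strerror( size ) );
else
  printf( "Bytes written %i of %i...\n", size, count - idx );
snd_pcm_close( handle );
free( buffer );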
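The header skipping in the example can be isolated into a small helper, sketched here. The name au_data_offset is hypothetical. The sketch relies only on the layout the example itself uses: the ".snd" magic in bytes 0-3 and a big-endian data offset in bytes 4-7. Unlike the example it does not cap the offset at 128 bytes.

```c
#include <stddef.h>
#include <string.h>

/* Returns the offset of the first audio byte in a buffer holding the start
 * of a .au file, or 0 when the ".snd" magic is missing (raw data).
 * The result never exceeds len. */
size_t au_data_offset(const unsigned char *buf, size_t len)
{
    unsigned long offset;

    if (len < 8 || memcmp(buf, ".snd", 4) != 0)
        return 0;
    offset = ((unsigned long)buf[4] << 24) | ((unsigned long)buf[5] << 16) |
             ((unsigned long)buf[6] << 8) | (unsigned long)buf[7];
    if (offset > len)                   /* damaged header: stay inside the buffer */
        offset = len;
    return (size_t)offset;
}
```

The returned value plays the role of idx in the example: playback starts at buffer + offset and covers count - offset bytes.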