Audio Class Stream Data Flow

The Audio Processing module manages playback and record streams using two internal tasks:

Playback task
Record task

These two tasks are the glue between the µC/USB-Device Core and the Audio Peripheral Driver.

From a host perspective, a stream lifetime always consists of:

Opening a stream,
Communicating on this stream,
Closing a stream.

The sections below describe the stream data flows in more detail.

Playback Stream

A playback stream carries audio data over an isochronous OUT endpoint. There is a one-to-one relation between an isochronous OUT endpoint, an AudioStreaming interface and a Terminal. Figure - Playback Stream Dataflow presents the audio data flow implemented inside the Audio Processing module. The playback path relies on a ring buffer queue to synchronize the playback task, the core task and the codec ISR.

Figure - Playback Stream Dataflow

(1) The host activates the AudioStreaming interface #X by selecting the operational interface (request SET_INTERFACE sent for alternate setting 1). This marks the opening of the playback. The core task will call the function USBD_Audio_AS_IF_Start(). The first isochronous OUT transfer is submitted to the USB device driver to prime the stream. An empty audio buffer is taken from the ring buffer queue.

(2) The host then sets the sampling frequency for a given isochronous OUT endpoint by sending a SET_CUR request. The function USBD_Audio_DrvAS_SamplingFreqManage() (not indicated in the figure) is called from the core task's context. This function is implemented by the audio peripheral driver and sets a DAC (Digital-to-Analog Converter) clock in the codec.

(3) The USB Device Controller fills the buffer with the isochronous audio data that have been sent by the host. The buffer is retrieved by the core task. As soon as one isochronous transfer is completed, the core task will call the callback USBD_Audio_PlaybackIsocCmpl() passed as a parameter of USBD_IsocRxAsync(). This callback notifies the audio class that a buffer with audio samples is ready for the audio codec.

Error Handling

If the isochronous transfer has completed with an error, USBD_Audio_PlaybackIsocCmpl() frees the buffer associated with the transfer.

(4) The received buffer is then added to the ring buffer queue.

(5) The core task will submit all the buffers it can to the USB device driver to feed the stream communication by calling USBD_IsocRxAsync() several times.

(5a) Once a certain number of buffers (the pre-buffering threshold) has accumulated, the playback stream is started on the codec side by calling the function StreamStart(). The pre-buffering threshold is always equal to (MaxBufNbr / 2). The field MaxBufNbr is part of the structure USBD_AUDIO_STREAM_CFG. Within Drv_API_Ptr->PlaybackStart(), the audio peripheral driver should signal the playback task N times by calling the function USBD_Audio_PlaybackTxCmpl(), where N is the number of buffers the driver can queue. The driver should support at least double-buffering and thus queue two buffers.

(6) Signalling the playback task consists of posting an AudioStreaming (AS) interface handle in a queue. The playback task wakes up and processes the handle. It submits a ready buffer taken from the ring buffer queue to the audio peripheral driver by calling the function StreamPlaybackTx(). Before being submitted to the audio peripheral driver, the received audio data may be corrected if the ring buffer queue is in an underrun or overrun situation. The playback stream correction is explained in section Playback Stream Correction. The audio peripheral driver should accumulate the ready buffers. Once at least two buffers have accumulated, the driver should send the first buffer to the codec, usually by preparing a DMA transfer.

DMA in Audio Peripheral Driver

DMA transfers are assumed for communication with the audio codec. DMA offloads the CPU and improves performance.

Error Handling

If the ring buffer queue is empty, the playback task waits 1 ms and signals itself to re-submit another ready buffer to the audio peripheral driver. At least one ready buffer should have been inserted in the waiting list during the delay.

(7) In the same way as the core task in USBD_Audio_PlaybackIsocCmpl(), the playback task submits all the buffers it can to the USB device driver by calling USBD_IsocRxAsync() several times.

Error Handling

If the submission with USBD_IsocRxAsync() fails by returning an error code, the buffer is freed back to the ring buffer queue.

(8) The buffer contains a chunk (1 ms of audio data) of audio stream. This audio chunk is encoded following a certain format. The audio peripheral driver might have to decode the audio chunk in order to correctly present the audio samples to the codec.

(9) Each time a playback buffer is consumed by the codec, the audio peripheral driver ISR signals the end of an audio transfer to the playback task by calling the function USBD_Audio_PlaybackTxCmpl(). This function posts an AS interface handle and frees the consumed buffer back to the ring buffer queue.

(10) Afterwards, steps 3 to 9 are repeated until the host stops the playback by selecting the default AudioStreaming interface (request SET_INTERFACE sent for alternate setting 0). At this time, the Audio Processing module stops the streaming on the codec side by calling the audio peripheral driver function StreamStop(). Basically, any playback DMA transfer is aborted. All the playback buffers attached to pending isochronous transfers are freed automatically by the core, which calls USBD_Audio_PlaybackIsocCmpl() for each aborted isochronous transfer.

Refer to page Audio Peripheral Driver Guide for more details about the audio peripheral driver processing.

The playback task supports multiple streams. If the audio function uses several USB OUT Terminal types, each USB OUT Terminal is associated with one AudioStreaming interface structure that the playback task manipulates and updates during stream communication.

OS Tick Rate

If the ring buffer queue is empty when the playback task is submitting a buffer to the audio peripheral driver, a retry mechanism re-submits the buffer 1 ms later. This delay allows other tasks to execute so that a new buffer becomes available in the ring buffer queue. The function USBD_Audio_OS_DlyMs() is used for this delay. Whenever possible, the OS tick rate should have a 1 ms granularity. This also helps the scheduling of the audio class tasks, as the audio class works on a 1 ms frame basis.
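
The effect of the tick rate on the 1 ms retry delay can be illustrated with a minimal sketch. The helper names app_ms_to_ticks() and app_ticks_to_ms() and the round-up policy are assumptions made for this example; they do not describe how USBD_Audio_OS_DlyMs() is implemented in a given OS port.

```c
#include <stdint.h>

/* Hypothetical helper: convert a delay in milliseconds to OS ticks,
 * rounding up so that a non-zero delay never becomes zero ticks. */
static uint32_t app_ms_to_ticks(uint32_t dly_ms, uint32_t tick_rate_hz)
{
    uint64_t ticks = ((uint64_t)dly_ms * tick_rate_hz + 999u) / 1000u;

    return (uint32_t)ticks;
}

/* Hypothetical helper: duration in milliseconds covered by a number of ticks. */
static uint32_t app_ticks_to_ms(uint32_t ticks, uint32_t tick_rate_hz)
{
    return (uint32_t)(((uint64_t)ticks * 1000u) / tick_rate_hz);
}
```

Under these assumptions, a 1000 Hz tick turns the 1 ms retry into exactly one tick, whereas a 100 Hz tick stretches the same request to one 10 ms tick, that is, ten USB frames during which no buffer is handed to the codec.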

Record Stream

Figure - Record Stream Dataflow

(1) The host activates the AudioStreaming interface #X by selecting the operational interface (request SET_INTERFACE sent for alternate setting 1). The host then sets the sampling frequency for a given isochronous IN endpoint by sending a SET_CUR request. The function USBD_Audio_DrvAS_SamplingFreqManage() (not indicated in the figure) is called from the core task's context. This function is implemented by the audio peripheral driver and sets an ADC (Analog-to-Digital Converter) clock in the codec.

(2) When processing the SET_CUR(sampling frequency) request, the audio class will also start the record stream on the codec side by calling StreamStart(). In this function, a DMA transfer will be prepared to get the first record buffer from the codec. The initial receive buffer will be obtained by calling USBD_Audio_RecordBufGet(). This step is not entirely represented in the figure. The audio class ensures that the record stream is started on the codec side after setting the sampling frequency as a codec needs the correct clock settings before getting record data.

DMA in Audio Peripheral Driver

DMA transfers are assumed for communication with the audio codec. DMA offloads the CPU and improves performance.

(3) Once the first DMA transfer has completed, the audio peripheral driver obtains the next receive buffer from the ring buffer queue by calling USBD_Audio_RecordBufGet(). This function also tells the audio peripheral driver the number of bytes to get from the codec.

(4) The buffer will be filled with audio samples given by one or more ADCs (one ADC per logical channel). The buffer will contain 1 ms worth of audio samples. This 1 ms of audio samples should be encoded, either directly by the codec (hardware) or by the audio peripheral driver (software). Most of the time, the codec will provide the chunk of audio stream already encoded. The driver signals the end of an audio transfer to the record task by calling the function USBD_Audio_RecordRxCmpl(). The signal represents an AudioStreaming interface handle.

(5) The record task wakes up and retrieves the ready buffer from the audio peripheral driver by calling the audio peripheral driver function StreamRecordRx(). The buffer is stored in the ring buffer queue.

(6) To prime the audio stream, the record task waits for a certain number of buffers to be ready. The pre-buffering threshold is always equal to (MaxBufNbr / 2). The field MaxBufNbr is part of the structure USBD_AUDIO_STREAM_CFG. Once the pre-buffering is done, the record task submits the initial isochronous IN transfer to the USB device driver via USBD_IsocTxAsync(). During the stream communication, the record task does not submit other isochronous transfers. All subsequent USB transfers are submitted by the core task.

There is one special situation in which the record task can submit a new transfer. When the stream communication loop is broken, that is, when there are no more ongoing isochronous transfers in the USB device driver, the record task restarts the stream with a new USB transfer.

(7) The USB device driver will send isochronous audio data to the host during a specific frame.

(8) Upon completion of the isochronous IN transfer, the core task calls the callback USBD_Audio_RecordIsocCmpl() provided by the Audio Processing module as an argument of USBD_IsocTxAsync(). This callback frees the buffer by returning it to the ring buffer queue. Before the buffer is returned to the ring buffer queue, a stream correction may be applied to the next record buffer to be filled by the codec. The record stream correction is explained in section Record Stream Correction.

Error Handling

If a transfer has completed with an error, the associated buffer is freed by USBD_Audio_RecordIsocCmpl().

(9) The core task submits all the ready buffers it can to the USB device driver by calling USBD_IsocTxAsync() several times. The core task is thus responsible for keeping the stream communication alive by repeating steps 7 and 8.

Error Handling

If the submission with USBD_IsocTxAsync() fails by returning an error code, the buffer is freed back to the ring buffer queue.

(10) Once the audio stream is initiated, steps 3 to 8 repeat until the host stops recording by selecting the default AudioStreaming interface (request SET_INTERFACE sent for alternate setting 0). At this time, the Audio Processing module stops the streaming on the codec side by calling the audio peripheral driver function StreamRecordStop(). Basically, any record DMA transfer is aborted. All empty buffers being processed and all ready buffers not yet retrieved by the record task are implicitly freed by the ring buffer queue reset. On the USB side, all the unconsumed record buffers are freed automatically by the core, which calls USBD_Audio_RecordIsocCmpl() for each aborted isochronous transfer.

Refer to page Audio Peripheral Driver Guide for more details about the audio peripheral driver processing.

The record task supports multiple streams. If the audio function uses several USB IN Terminal types, each USB IN Terminal is associated with one AudioStreaming interface structure posted in the record task's queue. Thus the record task can handle buffers from different streams.

The record data path takes care of the data rate adjustment. This is required for sampling frequencies that do not produce an integer number of audio samples per ms, because partial audio samples cannot be sent. For those sampling frequencies, Table - Data Rate Adjustment gives the required adjustment. The data rate adjustment is implemented in the isochronous IN transfer completion callback USBD_Audio_RecordIsocCmpl().

Table - Data Rate Adjustment

Samples per frame/ms	Typical Packet Size	Adjustment
11.025	11 samples	12 samples every 40 packets (i.e. ms)
22.050	22 samples	23 samples every 20 packets (i.e. ms)
44.1	44 samples	45 samples every 10 packets (i.e. ms)

For instance, considering a sampling frequency of 44.1 kHz and a mono microphone, the audio class sends the host isochronous transfers of 44 samples each frame. In order to deliver 44,100 samples every second, the audio class sends 45 samples every 10 frames (that is, every 10 ms). After one second, the host will have received 100 additional samples on top of the 44,000 samples carried by the 44-sample isochronous transfers.
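
The adjustment pattern in the table can be reproduced with a running remainder. The following is a minimal sketch, not the code of USBD_Audio_RecordIsocCmpl(): the type app_rate_adj_t and the functions app_rate_adj_next() and app_rate_adj_total() are hypothetical names introduced for illustration, and standard C integer types are used in place of the stack's own types.

```c
#include <stdint.h>

/* Hypothetical per-stream state for the data rate adjustment. */
typedef struct {
    uint32_t sampling_freq_hz;  /* e.g. 11025, 22050 or 44100              */
    uint32_t remainder;         /* accumulated fraction, in 1/1000 sample  */
} app_rate_adj_t;

/* Return the number of samples per channel to send in the next 1 ms frame. */
static uint32_t app_rate_adj_next(app_rate_adj_t *p_adj)
{
    uint32_t nbr_samples = p_adj->sampling_freq_hz / 1000u;

    /* Accumulate the fractional part; emit one extra sample on overflow. */
    p_adj->remainder += p_adj->sampling_freq_hz % 1000u;
    if (p_adj->remainder >= 1000u) {
        p_adj->remainder -= 1000u;
        nbr_samples++;
    }

    return nbr_samples;
}

/* Total number of samples produced over nbr_frames consecutive frames. */
static uint32_t app_rate_adj_total(uint32_t sampling_freq_hz, uint32_t nbr_frames)
{
    app_rate_adj_t adj   = { sampling_freq_hz, 0u };
    uint32_t       total = 0u;
    uint32_t       i;

    for (i = 0u; i < nbr_frames; i++) {
        total += app_rate_adj_next(&adj);
    }

    return total;
}
```

With 44.1 kHz the remainder grows by 100 per frame, so every tenth frame carries 45 samples; with 22.05 kHz and 11.025 kHz the extra sample falls on every 20th and 40th frame respectively, which matches the table.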

Stream Correction

Playback Built-In Stream Correction

The built-in playback stream correction is active only when the constant USBD_AUDIO_CFG_PLAYBACK_CORR_EN is set to DEF_ENABLED. As explained in section Playback Stream, the stream correction is evaluated before the playback task provides a ready buffer to the audio peripheral driver. The evaluation relies on monitoring the playback ring buffer queue. Two thresholds are defined: a lower limit and an upper limit, as shown in Figure - Playback Ring Buffers Queue Monitoring. The figure shows the four indexes used in the ring buffer queue. A buffer difference is computed between the indexes ProducerEnd and ConsumerEnd. For the playback path, ProducerEnd is linked to the USB transfer completion while ConsumerEnd is linked to the audio transfer completion. The buffer difference represents a circular distance between the two indexes. If the distance is less than the lower limit, there is an underrun situation, that is, the USB side does not produce audio samples as fast as the codec consumes them. Conversely, if the distance is greater than the upper limit, there is an overrun situation, that is, the USB side produces audio data faster than the codec can consume it. To keep the codec and USB in sync, a simple algorithm adds an audio sample in case of underrun and removes an audio sample in case of overrun.
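
The monitoring described above can be sketched as a circular distance followed by a comparison against the two limits. This is an illustration only: the function names, the enumeration and the way the limits are passed are assumptions, and only the index names ProducerEnd and ConsumerEnd come from the text.

```c
#include <stdint.h>

typedef enum {
    APP_STREAM_SAFE = 0,
    APP_STREAM_UNDERRUN,    /* USB produces too slowly for the codec */
    APP_STREAM_OVERRUN      /* USB produces faster than the codec    */
} app_stream_state_t;

/* Circular distance from ConsumerEnd to ProducerEnd in a queue of nbr_buf
 * entries, i.e. the number of buffers produced but not yet consumed. */
static uint16_t app_ring_buf_diff(uint16_t producer_end,
                                  uint16_t consumer_end,
                                  uint16_t nbr_buf)
{
    if (producer_end >= consumer_end) {
        return (uint16_t)(producer_end - consumer_end);
    }

    /* ProducerEnd has wrapped around the end of the queue. */
    return (uint16_t)(nbr_buf - consumer_end + producer_end);
}

/* Classify the playback stream from the buffer difference and the limits. */
static app_stream_state_t app_stream_eval(uint16_t buf_diff,
                                          uint16_t lower_limit,
                                          uint16_t upper_limit)
{
    if (buf_diff < lower_limit) {
        return APP_STREAM_UNDERRUN;
    }
    if (buf_diff > upper_limit) {
        return APP_STREAM_OVERRUN;
    }

    return APP_STREAM_SAFE;
}
```

For example, in an 8-entry queue with ProducerEnd at index 1 and ConsumerEnd at index 6, the distance is 3 buffers; with hypothetical limits of 2 and 6 the stream is in the safe zone.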

The frequency at which the playback stream correction is evaluated is configurable via the field CorrPeriodMs of the structure USBD_AUDIO_STREAM_CFG.

Figure - Playback Ring Buffers Queue Monitoring

Figure - Adding a Sample in Case of Underrun illustrates the algorithm to add an audio sample in case of underrun situation.

Figure - Adding a Sample in Case of Underrun

Adding a Sample in Case of Underrun

(1) Sample N is moved to N+1.

(2) Sample N is rebuilt and equal to the average of N-1 and N+1.

(3) The packet size is increased by one sample.
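
The three steps above can be written out for a mono stream of 16-bit signed PCM samples. This is a sketch under stated assumptions, not the audio class code: N is taken to be the last sample of the packet, the function name app_pcm16_add_sample() is hypothetical, and the buffer is assumed to have room for one more sample.

```c
#include <stddef.h>
#include <stdint.h>

/* Add one sample to a mono 16-bit signed PCM packet.
 * Returns the new number of samples, or nbr_samples unchanged when the
 * packet is too short or the buffer has no room for an extra sample. */
static size_t app_pcm16_add_sample(int16_t *p_samples,
                                   size_t   nbr_samples,
                                   size_t   max_samples)
{
    size_t n;

    if ((nbr_samples < 2u) || (nbr_samples >= max_samples)) {
        return nbr_samples;
    }

    n = nbr_samples - 1u;                       /* index of sample N        */

    p_samples[n + 1u] = p_samples[n];           /* (1) move N to N+1        */
    p_samples[n]      = (int16_t)(((int32_t)p_samples[n - 1u] +
                                   (int32_t)p_samples[n + 1u]) / 2);
                                                /* (2) rebuild N as average */
    return nbr_samples + 1u;                    /* (3) one more sample      */
}
```

A stereo or multi-channel stream would apply the same steps per channel on whole sample frames, and an 8-bit stream would use unsigned arithmetic.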

The stream correction supports signed PCM and unsigned PCM8 format.

This stream correction is convenient for low-cost audio designs. It gives good results as long as the incoming USB audio sampling frequency is very close to the DAC input clock frequency. However, if the difference between the two frequencies is large, the correction adds audio distortion.

Figure - Removing a Sample in Case of Overrun illustrates the algorithm to remove an audio sample in case of overrun situation.

Figure - Removing a Sample in Case of Overrun

Removing a Sample in Case of Overrun

(1) Sample N-2 is rebuilt and equal to the average of N, N-1, N-2 and N-3.

(2) Sample N is moved to N-1.

(3) The packet size is reduced by one sample.
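
The removal steps can be sketched in the same way for a mono stream of 16-bit signed PCM samples. As before, this is an illustration rather than the audio class code: N is assumed to be the last sample of the packet and the function name app_pcm16_remove_sample() is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Remove one sample from a mono 16-bit signed PCM packet.
 * Returns the new number of samples, or nbr_samples unchanged when the
 * packet holds fewer than the four samples needed for the average. */
static size_t app_pcm16_remove_sample(int16_t *p_samples, size_t nbr_samples)
{
    size_t  n;
    int32_t sum;

    if (nbr_samples < 4u) {
        return nbr_samples;
    }

    n   = nbr_samples - 1u;                     /* index of sample N         */
    sum = (int32_t)p_samples[n]      +
          (int32_t)p_samples[n - 1u] +
          (int32_t)p_samples[n - 2u] +
          (int32_t)p_samples[n - 3u];

    p_samples[n - 2u] = (int16_t)(sum / 4);     /* (1) rebuild N-2 as average */
    p_samples[n - 1u] = p_samples[n];           /* (2) move N to N-1          */

    return nbr_samples - 1u;                    /* (3) one sample fewer       */
}
```

Averaging over four neighbouring samples spreads the discontinuity caused by the dropped sample instead of producing an abrupt step.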

The playback stream correction offers the possibility to apply your own correction algorithm. If an underrun or overrun situation is detected, an application callback is called. Listing - Example of Playback Correction Callback Provided by the Application shows an example of playback correction callback prototype and definition provided by the application.

static  CPU_INT16U  App_USBD_Audio_PlaybackCorr(USBD_AUDIO_AS_ALT_CFG  *p_as_alt_cfg,        (1)
                                                CPU_BOOLEAN             is_underrun,
                                                void                   *p_buf,
                                                CPU_INT16U              buf_len_cur,
                                                CPU_INT16U              buf_len_total,
                                                USBD_ERR               *p_err);

CPU_BOOLEAN  App_USBD_Audio_Init (CPU_INT08U  dev_nbr,
                                  CPU_INT08U  cfg_hs,
                                  CPU_INT08U  cfg_fs)
{
    ...
    speaker_playback_as_if_handle = USBD_Audio_AS_IF_Cfg(&USBD_SpeakerStreamCfg,
                                                         &USBD_AS_IF1_SpeakerCfg,
                                                         &USBD_Audio_DrvAS_API_Template,
                                                          DEF_NULL,
                                                          IT2_ID,
                                                          App_USBD_Audio_PlaybackCorr,	(2)
                                                         &err);
    if (err != USBD_ERR_NONE) {
        /* $$$$ Handle the error. */
    }
    ...
    return (DEF_OK);
}
/*
*********************************************************************************************************
*                                     App_USBD_Audio_PlaybackCorr()
*
* Description : Apply user-defined correction algorithm to the playback stream.
*
* Argument(s) : p_as_alt_cfg        Pointer to AudioStreaming interface configuration structure.
*
*               is_underrun         Flag indicating if an underrun (audio clock faster than USB) or
*                                   overrun (audio clock slower than USB) situation has been detected
*                                   by the Audio class.
*
*               p_buf               Pointer to buffer to which the correction will be applied to.
*
*               buf_len_cur         Current length of the buffer.
*
*               buf_len_total       Total length of the buffer.
*
*               p_err               Pointer to variable that will receive the return error code from
*                                   this function :
*
*                                   USBD_ERR_NONE   Correction successfully applied to buffer.
* Return(s)   : New length of the buffer after correction.
*
* Caller(s)   : USBD_Audio_PlaybackCorr().
*
* Note(s)     : none.
*********************************************************************************************************
*/
                                                                              			(3)
static  CPU_INT16U  App_USBD_Audio_PlaybackCorr (USBD_AUDIO_AS_ALT_CFG  *p_as_alt_cfg,
                                                 CPU_BOOLEAN             is_underrun,
                                                 void                   *p_buf,
                                                 CPU_INT16U              buf_len_cur,
                                                 CPU_INT16U              buf_len_total,
                                                 USBD_ERR               *p_err)
{
    (void)&p_as_alt_cfg;
    (void)&is_underrun;
    (void)&p_buf;
    (void)&buf_len_cur;
    (void)&buf_len_total;

   *p_err = USBD_ERR_NONE;

    return (buf_len_cur);
}

(1) Prototype of your playback correction callback.

(2) When an AudioStreaming interface is configured with the function USBD_Audio_AS_IF_Cfg(), the callback function name is passed as an argument. You can define a different correction callback for each playback AudioStreaming interface composing your audio topology.

(3) Definition of your playback correction callback. Once the playback stream has been opened by the host and the built-in playback correction is enabled (USBD_AUDIO_CFG_PLAYBACK_CORR_EN set to DEF_ENABLED), your callback is called whenever the Audio Processing module detects an overrun or underrun situation. You have access to the structure USBD_AUDIO_AS_ALT_CFG associated with this playback stream through the pointer p_as_alt_cfg. Among the fields, you may be interested in:

TerminalID: ID of terminal associated to the playback stream.

NbrCh: Number of channels supported by the stream.

SubframeSize: Number of bytes occupied by one audio sample.

BitRes: Effectively used bits in an audio sample.

Besides the AudioStreaming alternate setting configuration structure, you know the situation type (underrun or overrun, via is_underrun), the current buffer length (buf_len_cur) and the total buffer length (buf_len_total). You can then apply your own correction algorithm to the buffer referenced by p_buf. If samples are added to or removed from the buffer, you must return the adjusted buffer length to the Audio Processing module. Note that you can add or remove only one sample at a time. You can also report an error code through p_err if something went wrong while applying your correction.

If p_as_alt_cfg->BitRes is equal to 8 bits, the audio data is encoded in PCM8 format (the legacy 8-bit wave format). In this format, audio data is represented as unsigned fixed point. Your correction algorithm must take into account both signed PCM and unsigned PCM8.
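
The difference between the two encodings matters as soon as a correction algorithm averages samples. The sketch below shows one way to keep the two cases apart; the helper names app_pcm8_avg() and app_pcm16_avg() are hypothetical and are not part of the audio class API.

```c
#include <stdint.h>

/* Average two samples stored in unsigned PCM8 format (silence is 0x80). */
static uint8_t app_pcm8_avg(uint8_t a, uint8_t b)
{
    return (uint8_t)(((uint16_t)a + (uint16_t)b) / 2u);
}

/* Average two samples stored in 16-bit signed PCM format (silence is 0). */
static int16_t app_pcm16_avg(int16_t a, int16_t b)
{
    return (int16_t)(((int32_t)a + (int32_t)b) / 2);
}
```

In PCM8, the bytes 0xF0 and 0x10 lie symmetrically around the 0x80 midpoint and average to 0x80. Reading the same bytes as signed values (-16 and +16) would average to 0x00, which is the most negative level in PCM8 and would be heard as a click.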

Record Built-In Stream Correction

There is also a built-in record stream correction, active only when the constant USBD_AUDIO_CFG_RECORD_CORR_EN is set to DEF_ENABLED. As explained in section Record Stream, the stream correction is evaluated when an isochronous IN transfer completes and the callback function USBD_Audio_RecordIsocCmpl() is called. The evaluation relies on monitoring the record ring buffer queue. Two thresholds are defined, a lower limit and an upper limit, based on the same principle as shown in Figure - Playback Ring Buffers Queue Monitoring. For the record path, ProducerEnd is linked to the audio transfer completion while ConsumerEnd is linked to the USB transfer completion. This is the opposite of the playback path. Moreover, the ring buffer queue scheme is common to the playback and record streams, and within the audio class the definition of overrun and underrun situations is "USB-centric".

Consequently, if the lower limit is reached, there is an overrun situation, that is, the USB side consumes slightly faster than the codec can produce. Conversely, the upper limit corresponds to an underrun situation, that is, the USB side does not consume audio samples as fast as the codec produces them. As opposed to the playback stream correction, no software algorithm is needed to add or remove an audio sample. The audio class adjusts the audio peripheral hardware through the number of required record data bytes indicated by USBD_Audio_RecordBufGet(). The correction is done implicitly by the audio peripheral hardware, which directly gets the right number of audio samples (-1 sample frame or +1 sample frame) to accommodate the overrun or underrun situation.
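
A minimal sketch of how the requested record length could be derived is shown below. It is not the audio class implementation: the function and enumeration names are hypothetical, and the mapping (one extra sample frame when the USB side consumes too fast, one fewer when it consumes too slowly) is inferred from the description above rather than stated by it. A sample frame is taken as NbrCh x SubframeSize bytes.

```c
#include <stdint.h>

typedef enum {
    APP_RECORD_SAFE = 0,
    APP_RECORD_UNDERRUN,    /* upper limit: USB consumes too slowly  */
    APP_RECORD_OVERRUN      /* lower limit: USB consumes too quickly */
} app_record_state_t;

/* Number of bytes to request from the codec for the next record buffer. */
static uint16_t app_record_req_len(uint16_t           nominal_len,
                                   uint8_t            nbr_ch,
                                   uint8_t            subframe_size,
                                   app_record_state_t state)
{
    uint16_t frame_len = (uint16_t)(nbr_ch * subframe_size);

    switch (state) {
        case APP_RECORD_OVERRUN:                /* codec must supply more   */
             return (uint16_t)(nominal_len + frame_len);

        case APP_RECORD_UNDERRUN:               /* codec must supply less   */
             if (nominal_len > frame_len) {
                 return (uint16_t)(nominal_len - frame_len);
             }
             return nominal_len;

        default:                                /* safe zone: no correction */
             return nominal_len;
    }
}
```

For a 48 kHz, 16-bit stereo stream the nominal length is 192 bytes per frame, so the corrected requests would be 196 or 188 bytes.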

The frequency at which the record stream correction is evaluated is configurable via the field CorrPeriodMs of the structure USBD_AUDIO_STREAM_CFG.

Playback Feedback Correction

The feedback correction (refer to section Feedback Endpoint for an overview of feedback) takes place when the configuration constant USBD_AUDIO_CFG_PLAYBACK_FEEDBACK_EN is set to DEF_ENABLED and the AudioStreaming interface uses an isochronous OUT endpoint with asynchronous synchronization. As explained in section Playback Stream, the stream correction is evaluated in the function USBD_Audio_PlaybackCorrSynch() before the playback task provides a ready buffer to the audio peripheral driver.

The feedback value evaluation relies on monitoring the playback ring buffer queue. Based on the same principle as the playback built-in correction, the buffer difference between the indexes ProducerEnd and ConsumerEnd is computed and reflects the relative rates at which the USB and codec clocks operate. The feedback monitoring starts only when the playback stream priming is done, that is, when the audio class calls the audio peripheral driver function USBD_Audio_DrvStreamStart. Once the feedback monitoring has started, the underrun or overrun situation requiring a feedback value to be sent to the host is evaluated using the method shown in Table - Feedback Monitoring.

Table - Feedback Monitoring

USB/Codec Clock Difference	USB << Codec		USB < Codec			USB = Codec			USB > Codec		USB >> Codec
Adjustment (in sample)	+1	+1/2	[+1 ; +1/2048]	-	-	-	-	-	[-1/2048 ; -1]	-1/2	-1
buffer difference	-5	-4	-3	-2	-1	0	1	2	3	4	5
Zone	Underrun	Underrun	Underrun	Underrun	Safe	Safe	Safe	Overrun	Overrun	Overrun	Overrun
Threshold	Heavy	Light	No adjustment	Light	Heavy

The underrun situation occurs when the USB side is slower than the codec. In that case, depending on how fast the codec is, the underrun situation can be light or heavy. The processing adjusts the feedback value by telling the host to add up to one sample per frame, depending on the degree of underrun. Similarly, the overrun situation occurs when the USB side is faster than the codec. In that case, depending on how slow the codec is, the overrun situation can be light or heavy. The processing adjusts the feedback value by telling the host to remove up to one sample per frame, depending on the degree of overrun.

When coming from the safe zone, a light underrun or overrun is corrected with a feedback value that takes into account the variation in the number of buffers over a certain number of elapsed frames. This corrects the stream deviation smoothly instead of overshooting. The feedback value adjustment lies between a minimum adjustment and a maximum adjustment:

Underrun situation: +1/2048 sample < adjustment < +1 sample
Overrun situation: -1 sample < adjustment < -1/2048 sample
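
The bounds above can be expressed in fixed point. The sketch below assumes the adjustment is held as a signed value with 14 fractional bits, so that one sample is 16384 and 1/2048 sample is 8; the macro and function names are hypothetical, and the bounds are treated as inclusive for simplicity. It does not describe how the audio class computes the adjustment itself.

```c
#include <stdint.h>

#define APP_FB_ADJ_ONE_SAMPLE   (1L << 14)                      /* 16384 */
#define APP_FB_ADJ_MIN          (APP_FB_ADJ_ONE_SAMPLE / 2048L) /*     8 */
#define APP_FB_ADJ_MAX          (APP_FB_ADJ_ONE_SAMPLE)

/* Clamp the magnitude of a non-zero adjustment to [1/2048 ; 1] sample.
 * A positive value is an underrun correction, a negative value an overrun
 * correction; zero means no correction and is returned unchanged. */
static int32_t app_fb_adj_clamp(int32_t adj)
{
    int32_t mag = (adj < 0) ? -adj : adj;

    if (mag == 0) {
        return 0;
    }
    if (mag < APP_FB_ADJ_MIN) {
        mag = APP_FB_ADJ_MIN;
    }
    if (mag > APP_FB_ADJ_MAX) {
        mag = APP_FB_ADJ_MAX;
    }

    return (adj < 0) ? -mag : mag;
}
```

The clamped adjustment would then be added to the nominal samples-per-frame value before the feedback value is sent to the host.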

The first feedback value sent by the device is always the nominal number of samples per frame corresponding to the sampling frequency. For instance, if the sampling frequency is 48.0 kHz, the nominal feedback value sent to the host is 48 samples per frame.

The feedback value update to the host is evaluated every refresh period. The refresh period is configurable via the field CorrPeriodMs of the structure USBD_AUDIO_STREAM_CFG. When the refresh period is reached, if there is a correction to apply, the feedback value update is sent to the host by calling the function USBD_IsocTxAsync(). If no correction is necessary, the audio class does not prepare an isochronous IN transfer. Thus, when the host sends an IN token, the device sends a zero-length packet. The host interprets this zero-length packet as "continue to apply the previous valid feedback value". The feedback value is sent in 10.14 format.
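
As an illustration of the 10.14 format, the nominal feedback value can be computed from the sampling frequency and serialized into the 3-byte little-endian payload used by full-speed feedback endpoints. The function names below are hypothetical and the sketch uses standard C integer types rather than the stack's own types.

```c
#include <stdint.h>

/* Convert a sampling frequency in Hz to samples per 1 ms frame,
 * expressed in 10.14 fixed point (14 fractional bits). */
static uint32_t app_fb_from_hz(uint32_t sampling_freq_hz)
{
    return (uint32_t)((((uint64_t)sampling_freq_hz) << 14) / 1000u);
}

/* Serialize a 10.14 feedback value into 3 bytes, least significant first. */
static void app_fb_to_bytes(uint32_t fb, uint8_t buf[3])
{
    buf[0] = (uint8_t)( fb         & 0xFFu);
    buf[1] = (uint8_t)((fb >>  8u) & 0xFFu);
    buf[2] = (uint8_t)((fb >> 16u) & 0xFFu);
}
```

With this encoding, 48 kHz gives 0x0C0000 (exactly 48 samples per frame) and 44.1 kHz gives 0x0B0666 (44 plus 1638/16384, just under 44.1 because the division truncates).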

The Audio 1.0 specification indicates that the feedback refresh period exponent (bRefresh) can range from 1 (2 ms) to 9 (512 ms). The refresh period is a power of 2: 2, 4, 8, 16, 32, 64, 128, 256 or 512 ms. A short bRefresh period results in tighter control of the stream data rate. A long bRefresh period may add some latency to the control of the stream data rate. Refresh periods such as 256 and 512 ms should be avoided as they can impair the data rate control. For instance, if bRefresh corresponds to 512 ms and the USB and codec clocks diverge quickly, updating the feedback value every 512 ms may not be fast enough to re-synchronize the USB and codec clocks.
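
The relation between the bRefresh exponent and the period in milliseconds can be sketched as follows. The function name is hypothetical; returning 0 for an out-of-range exponent is a choice made for this example.

```c
#include <stdint.h>

/* Convert a bRefresh exponent (1 to 9) to a refresh period in ms.
 * Returns 0 when the exponent is outside the range given above. */
static uint16_t app_fb_refresh_period_ms(uint8_t b_refresh)
{
    if ((b_refresh < 1u) || (b_refresh > 9u)) {
        return 0u;
    }

    return (uint16_t)(1u << b_refresh);     /* period = 2^bRefresh ms */
}
```

For instance, an exponent of 3 corresponds to a feedback update at most every 8 ms.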