This section presents the key characteristics of the audio 1.0 specification that should be understood in order to use the Micriµm audio class. Note that the MIDI interface is mentioned in this section but is not supported by the current audio class implementation. MIDI is mentioned only to give a complete picture of an audio 1.0 device.

...

  • One mandatory AudioControl (AC) interface
  • Zero or several optional AudioStreaming (AS) interfaces
  • Zero or several optional MIDI interfaces

Figure - Audio Function Global View presents a typical composite audio device:

Figure - Audio Function Global View


An AC interface is used to control and configure the audio function before and while playing a stream. For instance, the AC interface makes it possible to mute, change the volume, control tones (bass, mid, treble), select a certain path within the device to play the stream, mix streams, etc. It uses several class-specific requests to control and configure the audio function. An AS interface transports audio data via isochronous endpoints into and out of the function. An audio stream can be configured by using certain class-specific requests sent to the AS interface. The configurable settings are the sampling frequency and/or the pitch. A MIDI interface carries MIDI data via bulk endpoints.

...

  • One pair of control IN and OUT endpoints called the default endpoint.
  • One optional interrupt IN endpoint.
  • One or several isochronous IN and/or OUT endpoints (mandatory only if at least one AS interface is used). 
  • One or several bulk IN and/or OUT endpoints (mandatory only if at least one MIDI interface is used). 

Table - Audio Class Endpoints Usage describes the usage of the different endpoints:

...

Table - Audio Class Endpoints Usage

| Endpoint | Direction | Usage | Associated Interface |
|---|---|---|---|
| Control IN | Device-to-host | Standard requests for enumeration and class-specific requests. | AC, AS |
| Control OUT | Host-to-device | Standard requests for enumeration and class-specific requests. | AC, AS |
| Interrupt IN | Device-to-host | Status about different addressable entities (terminals, units, interfaces and endpoints) inside the audio function. | AC |
| Isochronous IN | Device-to-host | Record stream communication. | AS |
| Isochronous OUT | Host-to-device | Playback stream communication. | AS |
| Bulk IN | Device-to-host | Record stream communication. | MIDI |
| Bulk OUT | Host-to-device | Playback stream communication. | MIDI |



Besides the standard enumeration process, control endpoints can be used to configure all terminals and units. Terminals and units are described in the next section, Audio Function Topology. The interrupt IN endpoint is used to retrieve general status about any addressable entity. It is associated with the additional class-specific requests: Memory and Status requests. In practice, the interrupt IN endpoint is rarely implemented in audio 1.0 devices because Memory and Status requests are almost never used.

AS interfaces use isochronous endpoints to transfer audio data. Isochronous transfers were designed to deliver real-time data between host and device with a guaranteed latency. The host allocates a specific amount of bandwidth within each frame (i.e. 1 ms) for isochronous transfers, and these transfers have priority over control and bulk transfers. Hence, isochronous transfers are well suited to audio data streaming.

An audio device moving data through USB operates in a system where three different clocks are running: the audio sample clock, the USB bus clock and the service clock (refer to section 5.12.2 of the USB 2.0 specification for more details about these clocks). These three clocks must be synchronized at some point in order to deliver isochronous data reliably. Otherwise, clock synchronization issues (for instance, clock drift, jitter, clock-to-clock phase differences) may introduce unwanted audio artifacts. These issues are one of the major challenges when streaming audio data between the host and the device. To address this challenge, the USB 2.0 specification defines a synchronization scheme for delivering isochronous data. There are three types of synchronization:

  • Asynchronous: the endpoint produces or consumes data at a rate that is locked either to a clock external to the USB or to a free-running internal clock. The data rate can be fixed, limited to a certain number of sampling frequencies, or continuously programmable. Asynchronous endpoints cannot synchronize to Start-of-Frame (SOF) or any other clock in the USB domain.
  • Synchronous: the endpoint can have its clock system controlled externally through SOF synchronization. The hardware must provide a way to slave the sample clock of the audio part to the 1 ms SOF tick of the USB part to achieve perfect synchronization. Synchronous endpoints may produce or consume isochronous data streams at a fixed data rate, a limited number of data rates, or a continuously programmable data rate.
  • Adaptive: the endpoint is the most capable because it can produce or consume data at any rate within its operating range.

Refer to section '5.12.4.1 Synchronization Type' of USB 2.0 specification for more details about synchronization types.
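
As an illustration, the synchronization type of an isochronous endpoint can be read from the bmAttributes field of its standard endpoint descriptor: bits 1..0 give the transfer type and bits 3..2 give the synchronization type (USB 2.0 specification, endpoint descriptor definition). The sketch below is a minimal decoder under that assumption. The names sync_type_t and ep_sync_type() are hypothetical and are not part of the Micriµm audio class API.

```c
#include <stdint.h>

/* Synchronization type: bits 3..2 of bmAttributes (isochronous endpoints only). */
typedef enum {
    SYNC_NONE         = 0,
    SYNC_ASYNCHRONOUS = 1,
    SYNC_ADAPTIVE     = 2,
    SYNC_SYNCHRONOUS  = 3
} sync_type_t;

#define EP_XFER_TYPE_MASK  0x03u   /* Bits 1..0: transfer type.        */
#define EP_XFER_TYPE_ISOC  0x01u   /* 01b = isochronous transfer type. */

/* Hypothetical helper: returns 1 and stores the synchronization type in
 * *p_sync if the endpoint is isochronous, otherwise returns 0.          */
int ep_sync_type(uint8_t bm_attributes, sync_type_t *p_sync)
{
    if ((bm_attributes & EP_XFER_TYPE_MASK) != EP_XFER_TYPE_ISOC) {
        return 0;
    }
    *p_sync = (sync_type_t)((bm_attributes >> 2) & 0x03u);
    return 1;
}
```

For example, a bmAttributes value of 0x05 describes an isochronous asynchronous endpoint, and 0x09 an isochronous adaptive endpoint.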

An AS interface must have at least two alternate settings:

  • One default alternate setting declaring zero endpoints. The host selects it to temporarily relinquish USB bandwidth when the stream on this AS interface is not active.
  • One or several other alternate settings with at least one isochronous endpoint. These alternate settings are called operational interfaces, that is, interfaces on which the stream is active. Every alternate setting represents the same AS interface, but with the associated isochronous endpoint having a different characteristic (e.g. maximum packet size). When opening a stream, the host must select only one operational interface. The selection is based on the available resources the host can allocate for this endpoint.
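
To make the link between alternate settings and bandwidth concrete, the sketch below computes the largest packet a stream needs in one 1 ms frame and picks an operational alternate setting whose maximum packet size can carry it. It is only an illustration: the structure as_alt_setting_t, both function names and the "smallest setting that fits" policy are assumptions, and a real host also takes the remaining bus bandwidth into account.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one operational alternate setting. */
typedef struct {
    uint8_t  alt_setting_nbr;   /* Alternate setting number (0 is the default one). */
    uint16_t max_pkt_size;      /* Maximum packet size of its isochronous endpoint. */
} as_alt_setting_t;

/* Largest number of bytes a stream needs in a single 1 ms frame.
 * The sample count is rounded up: 44.1 kHz needs 45 samples in some frames. */
uint32_t as_max_bytes_per_frame(uint32_t sampling_freq_hz,
                                uint8_t  nbr_ch,
                                uint8_t  bytes_per_sample)
{
    uint32_t samples_per_frame = (sampling_freq_hz + 999u) / 1000u;

    return samples_per_frame * nbr_ch * bytes_per_sample;
}

/* Returns the alternate setting with the smallest maximum packet size able to
 * carry 'bytes_needed', or 0 (default, zero-endpoint setting) if none fits.   */
uint8_t as_alt_setting_select(const as_alt_setting_t *p_tbl,
                              size_t                  cnt,
                              uint32_t                bytes_needed)
{
    uint8_t  best_nbr  = 0u;
    uint32_t best_size = UINT32_MAX;
    size_t   i;

    for (i = 0u; i < cnt; i++) {
        if ((p_tbl[i].max_pkt_size >= bytes_needed) &&
            (p_tbl[i].max_pkt_size <  best_size)) {
            best_size = p_tbl[i].max_pkt_size;
            best_nbr  = p_tbl[i].alt_setting_nbr;
        }
    }
    return best_nbr;
}
```

With 16-bit stereo samples, a 48 kHz stream needs 192 bytes per frame and a 44.1 kHz stream needs up to 180 bytes per frame.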

...

An audio function is composed of units and terminals. Units and terminals form addressable entities that make it possible to manipulate the physical properties of the audio function.

Figure - Example of Audio Function Topology shows an example of audio function topology with some units and terminals:

Figure - Example of Audio Function Topology


A unit is the basic building block of an audio function. Connected together, units can fully describe most audio functions. A unit has one or more Input pins and one single Output pin. Each pin represents a cluster of logical audio channels inside the audio function. The unit model applies equally to audio functions whose internal signal is digital, fully analog, or even hybrid. Each unit has an associated descriptor with several fields to identify and characterize the unit. Audio 1.0 defines five units, presented in Table - Units and Terminals Description, Controls and Requests.

A terminal is an entity that represents a starting or ending point for audio channels inside the audio function. There are two types of terminals, presented in Table - Units and Terminals Description, Controls and Requests: Input and Output terminals. A terminal either provides data streams to the audio function (Input Terminal) or consumes data streams coming from the audio function (Output Terminal). Each terminal has an associated descriptor.

The functionality represented by a unit or a terminal is managed through audio controls. A control gives access to a specific audio property (e.g. volume, mute). A control is managed by using class-specific requests sent over the default control endpoint. Class-specific requests for a control of a unit or terminal are addressed to the AC interface. A control has a set of attributes that can be manipulated or that present additional information on the behavior of the control. The possible attributes are:

  • Current setting attribute (CUR)
  • Minimum setting attribute (MIN)
  • Maximum setting attribute (MAX)
  • Resolution attribute (RES)
  • Memory space attribute (MEM)

The class-specific requests are GET and SET requests whose general structure is shown in Table - Class-Specific Requests General Structure.
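
As a sketch of what such a request looks like on the wire, the code below fills the setup packet of a class-specific request addressed to a Feature Unit control through the AC interface. The field layout follows the audio 1.0 specification: the control selector and channel number go in wValue, and the unit ID and AC interface number go in wIndex. The request codes and control selectors are the values defined by that specification. The type usb_setup_req_t and the function audio_fu_req() are hypothetical names used for illustration only, not Micriµm API.

```c
#include <stdint.h>

/* Generic USB setup packet (hypothetical type name). */
typedef struct {
    uint8_t  bmRequestType;
    uint8_t  bRequest;
    uint16_t wValue;
    uint16_t wIndex;
    uint16_t wLength;
} usb_setup_req_t;

/* Audio 1.0 class-specific request codes (bit 7 set = GET). */
#define AUDIO_REQ_SET_CUR        0x01u
#define AUDIO_REQ_GET_CUR        0x81u
#define AUDIO_REQ_GET_MIN        0x82u
#define AUDIO_REQ_GET_MAX        0x83u
#define AUDIO_REQ_GET_RES        0x84u

/* bmRequestType values: class request, interface recipient. */
#define AUDIO_REQ_TYPE_SET_IF    0x21u
#define AUDIO_REQ_TYPE_GET_IF    0xA1u

/* Feature Unit control selectors. */
#define AUDIO_FU_MUTE_CONTROL    0x01u
#define AUDIO_FU_VOLUME_CONTROL  0x02u

/* Builds a request for one Feature Unit control.
 * wValue = control selector (high byte) | channel number (low byte)
 * wIndex = unit ID          (high byte) | AC interface   (low byte) */
usb_setup_req_t audio_fu_req(uint8_t  b_request,
                             uint8_t  ctrl_sel,
                             uint8_t  ch_nbr,
                             uint8_t  unit_id,
                             uint8_t  ac_if_nbr,
                             uint16_t len)
{
    usb_setup_req_t req;

    req.bmRequestType = ((b_request & 0x80u) != 0u) ? AUDIO_REQ_TYPE_GET_IF
                                                    : AUDIO_REQ_TYPE_SET_IF;
    req.bRequest      = b_request;
    req.wValue        = (uint16_t)(((uint16_t)ctrl_sel << 8) | ch_nbr);
    req.wIndex        = (uint16_t)(((uint16_t)unit_id  << 8) | ac_if_nbr);
    req.wLength       = len;

    return req;
}
```

For instance, reading the minimum volume of channel 1 on a Feature Unit with ID 5 behind AC interface 0 would be audio_fu_req(AUDIO_REQ_GET_MIN, AUDIO_FU_VOLUME_CONTROL, 1, 5, 0, 2), the volume parameter being two bytes long.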

Table - Units and Terminals Description, Controls and Requests

| Entity | Description | Control | Requests Supported |
|---|---|---|---|
| Input Terminal (IT) | Interface between the audio function’s ‘outside world’ and other units in the audio function. Refer to section '3.5.1 Input Terminal' of the audio 1.0 specification for more details. | Copy Protect | GET_CUR |
| Output Terminal (OT) | Interface between units inside the audio function and the ‘outside world’. Refer to section '3.5.2 Output Terminal' of the audio 1.0 specification for more details. | Copy Protect | SET_CUR |
| Mixer Unit (MU) | Transforms a number of logical input channels into a number of logical output channels. Refer to section '3.5.3 Mixer Unit' of the audio 1.0 specification for more details. | Input channel to mix | SET_CUR and GET_CUR/MIN/MAX/RES |
| Selector Unit (SU) | Selects from n audio channel clusters, each containing m logical input channels, and routes them unaltered to the single output audio channel cluster, containing m output channels. Refer to section '3.5.4 Selector Unit' of the audio 1.0 specification for more details. | Input pin selection | SET/GET_CUR |
| Feature Unit (FU) | Multi-channel processing unit that provides basic manipulation of the incoming logical channels. Refer to section '3.5.5 Feature Unit' of the audio 1.0 specification for more details. | Mute | SET/GET_CUR |
| Feature Unit (FU) | | Volume | SET_CUR and GET_CUR/MIN/MAX/RES |
| Feature Unit (FU) | | Bass | SET_CUR and GET_CUR/MIN/MAX/RES |
| Feature Unit (FU) | | Mid | SET_CUR and GET_CUR/MIN/MAX/RES |
| Feature Unit (FU) | | Treble | SET_CUR and GET_CUR/MIN/MAX/RES |
| Feature Unit (FU) | | Graphic Equalizer | SET_CUR and GET_CUR/MIN/MAX/RES |
| Feature Unit (FU) | | Automatic Gain | SET/GET_CUR |
| Feature Unit (FU) | | Delay | SET_CUR and GET_CUR/MIN/MAX/RES |
| Feature Unit (FU) | | Bass Boost | SET_CUR and GET_CUR/MIN/MAX/RES |
| Feature Unit (FU) | | Loudness | SET/GET_CUR |
| Processor Unit (PU) | Transforms a number of logical input channels, grouped into one or more audio channel clusters, into a number of logical output channels, grouped into one audio channel cluster, by applying a certain algorithm: Up/Down-mix, Dolby Prologic, 3D-Stereo Extender, Reverberation, Chorus, Dynamic Range Compressor. Refer to section '3.5.6 Processor Unit' of the audio 1.0 specification for more details. | Enable | SET/GET_CUR |
| Processor Unit (PU) | | Mode Select | SET_CUR and GET_CUR/MIN/MAX/RES |
| Processor Unit (PU) | | Spaciousness | SET_CUR and GET_CUR/MIN/MAX/RES |
| Processor Unit (PU) | | Reverberation Type | SET_CUR and GET_CUR/MIN/MAX/RES |
| Processor Unit (PU) | | Reverberation Level | SET_CUR and GET_CUR/MIN/MAX/RES |
| Processor Unit (PU) | | Reverberation Time | SET_CUR and GET_CUR/MIN/MAX/RES |
| Processor Unit (PU) | | Reverberation Feedback | SET_CUR and GET_CUR/MIN/MAX/RES |
| Processor Unit (PU) | | Chorus Level | SET_CUR and GET_CUR/MIN/MAX/RES |
| Processor Unit (PU) | | Chorus Rate | SET_CUR and GET_CUR/MIN/MAX/RES |
| Processor Unit (PU) | | Chorus Depth | SET_CUR and GET_CUR/MIN/MAX/RES |
| Processor Unit (PU) | | Compression Ratio | SET_CUR and GET_CUR/MIN/MAX/RES |
| Processor Unit (PU) | | Max Amplitude | SET_CUR and GET_CUR/MIN/MAX/RES |
| Processor Unit (PU) | | Threshold | SET_CUR and GET_CUR/MIN/MAX/RES |
| Processor Unit (PU) | | Attack Time | SET_CUR and GET_CUR/MIN/MAX/RES |
| Processor Unit (PU) | | Release Time | SET_CUR and GET_CUR/MIN/MAX/RES |
| Extension Unit (XU) | Allows a vendor-specific unit to be added. Refer to section '3.5.7 Extension Unit' of the audio 1.0 specification for more details. | Enable | SET/GET_CUR |


Table - Class-Specific Requests General Structure

| Request | Attribute | Recipient |
|---|---|---|
| SET_XXX | Current (CUR), Minimum (MIN), Maximum (MAX), Resolution (RES), Memory space (MEM) | AC interface, AS interface, Isochronous endpoint |
| GET_XXX | Current (CUR), Minimum (MIN), Maximum (MAX), Resolution (RES), Memory space (MEM) | AC interface, AS interface, Isochronous endpoint |



As shown in Table - Class-Specific Requests General Structure, there are also class-specific requests addressed to the AS interface or to an isochronous endpoint that manage some other controls. These controls are presented in Table - AudioStreaming Interface Controls and Requests.

Table - AudioStreaming Interface Controls and Requests

| Recipient | Control | Requests Supported |
|---|---|---|
| AS Interface | Depends on the audio data format | Depends on the audio data format |
| Endpoint | Sampling Frequency | SET_CUR and GET_CUR/MIN/MAX/RES |
| Endpoint | Pitch | SET/GET_CUR |
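
As an example of an endpoint control, the audio 1.0 specification carries the sampling frequency as a 3-byte little-endian value expressed in Hz. The sketch below encodes and decodes that parameter block. The function names are hypothetical and chosen for illustration only.

```c
#include <stdint.h>

#define AUDIO_SAMPLING_FREQ_LEN  3u   /* Length of the parameter block in bytes. */

/* Encodes a sampling frequency in Hz as 3 bytes, least significant byte first. */
void audio_sampling_freq_encode(uint32_t freq_hz, uint8_t buf[AUDIO_SAMPLING_FREQ_LEN])
{
    buf[0] = (uint8_t)( freq_hz        & 0xFFu);
    buf[1] = (uint8_t)((freq_hz >>  8) & 0xFFu);
    buf[2] = (uint8_t)((freq_hz >> 16) & 0xFFu);
}

/* Decodes the 3-byte parameter block back into a sampling frequency in Hz. */
uint32_t audio_sampling_freq_decode(const uint8_t buf[AUDIO_SAMPLING_FREQ_LEN])
{
    return  (uint32_t)buf[0]
         | ((uint32_t)buf[1] <<  8)
         | ((uint32_t)buf[2] << 16);
}
```

A SET_CUR request for 48000 Hz therefore carries the bytes 0x80, 0xBB, 0x00 in its data stage.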



Unit and terminal descriptors allow the USB audio device to describe the audio function topology. By retrieving these descriptors, the host is able to rebuild the audio function topology because the interconnection between units and terminals is fully defined. Unit and terminal descriptors form class-specific descriptors associated with the AC interface. There are also class-specific descriptors associated with the AS interface and its isochronous endpoint (refer to the audio 1.0 specification, section 4 'Descriptors', for more details about AC and AS class-specific descriptors and their content). These AS class-specific descriptors give details about the audio data format manipulated by the AS interface. The audio 1.0 specification defines three audio data formats which encompass some uncompressed and compressed audio formats:

...