Performance Issues

Network and Device Configuration

Number of RX & TX Buffers to Configure

The number of large receive, small transmit, and large transmit buffers to configure for a specific interface depends on several factors:

  1. Desired level of performance.
  2. Amount of data to be either transmitted or received.
  3. Ability of the target application to either produce or consume transmitted or received data.
  4. Average CPU utilization.
  5. Average network utilization.
  6. Type of connection (UDP or TCP).
  7. Number of simultaneous connections.
  8. Application/connection priorities.

The discussion on the bandwidth-delay product is always valid. In general, the more buffers the better. However, the number of buffers can be tailored to the application. For example, if an application receives a lot of data but transmits very little, then it may be sufficient to define a small number of small transmit buffers for operations such as TCP acknowledgements and to allocate the remaining memory to large receive buffers. Similarly, if an application transmits a lot of data and receives little, then the buffer allocation emphasis should be on defining more transmit buffers. However, there is a caveat:

If the application is written such that the task that consumes receive data runs infrequently, or if CPU utilization is high and the receiving application tasks become starved for CPU time, then more receive buffers will be required.

To ensure the highest level of performance possible, it makes sense to define as many buffers as possible and use the interface and pool statistics data in order to refine the number after having run the application for a while. A busy network will require more receive buffers in order to handle the additional broadcast messages that will be received.

In general, at least two large and two small transmit buffers should be configured. This assumes that neither the network nor the CPU is very busy.

Many applications will receive properly with four or more large receive buffers. However, for TCP applications that move a lot of data between the target and the peer, this number may need to be higher.

Specifying too few transmit or receive buffers may lead to stalls in communication and possibly even deadlock. Care should be taken when configuring the number of buffers. µC/TCP-IP is often tested with configurations of 10 or more small transmit, large transmit, and large receive buffers.

Number of DMA Descriptors to Configure

If the hardware device is an Ethernet MAC that supports DMA, then the number of configured receive descriptors will play an important role in determining overall performance for the configured interface.

For applications with 10 or fewer large receive buffers, it is desirable to set the number of receive descriptors to 60% to 70% of the number of configured receive buffers.

For example, with 10 receive buffers, 60% yields six receive descriptors, which leaves four receive buffers available to the stack while they wait to be processed by application tasks. While the application is processing data, the hardware may continue to receive additional frames up to the number of configured receive descriptors.

There is, however, a point at which configuring additional receive descriptors no longer greatly impacts performance. For applications with 20 or more buffers, the number of descriptors can be configured to 50% of the number of configured receive buffers. After this point, only the number of buffers remains a significant factor, especially for slower or busy CPUs and for networks with higher utilization.

In general, if the CPU is not busy and the µC/TCP-IP Receive task has the opportunity to run often, the ratio of receive descriptors to receive buffers may be reduced further for very high numbers of available receive buffers (e.g., 50 or more).

The number of transmit descriptors should be configured such that it is equal to the number of small plus the number of large transmit buffers.

These numbers only serve as a starting point. The application and the environment that the device will be attached to will ultimately dictate the number of transmit and receive descriptors required to achieve maximum performance.

Specifying too few descriptors can cause communication delays. See Listing F-2 for descriptor configuration.
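The starting-point ratios above can be sketched as a small helper. This is only an illustration: `suggest_desc_counts()` and `desc_counts_t` are hypothetical names and not part of the µC/TCP-IP API. The text does not give a ratio for 11 to 19 receive buffers, so the sketch assumes 60% for that range as well.

```c
#include <stdint.h>

/* Hypothetical result type, not part of the uC/TCP-IP API. */
typedef struct {
    uint16_t rx_desc;   /* suggested number of receive descriptors  */
    uint16_t tx_desc;   /* suggested number of transmit descriptors */
} desc_counts_t;

/* Suggest starting-point descriptor counts from the configured buffer counts. */
desc_counts_t suggest_desc_counts(uint16_t rx_large_bufs,
                                  uint16_t tx_large_bufs,
                                  uint16_t tx_small_bufs)
{
    desc_counts_t d;
    uint32_t      pct;

    /* 50% for 20 or more receive buffers, otherwise 60% (the low end of the
     * 60% to 70% range). Using 60% for 11..19 buffers is an assumption.     */
    pct = (rx_large_bufs >= 20u) ? 50u : 60u;

    /* Integer division rounds down, so the result stays below the number
     * of receive buffers.                                                  */
    d.rx_desc = (uint16_t)(((uint32_t)rx_large_bufs * pct) / 100u);

    /* One transmit descriptor per transmit buffer, small plus large. */
    d.tx_desc = (uint16_t)(tx_large_bufs + tx_small_bufs);

    return d;
}
```

With 10 large receive buffers, 6 large transmit buffers and 2 small transmit buffers, this suggests 6 receive descriptors and 8 transmit descriptors. Pool and interface statistics from the running application should then be used to refine the values.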

LF-2(7) Number of receive descriptors. For DMA-based devices, this value is utilized by the device driver during initialization in order to allocate a fixed-size pool of receive descriptors to be used by the device. The number of descriptors must be less than the number of configured receive buffers. Micrium recommends setting this value to approximately 60% to 70% of the number of receive buffers. Non-DMA-based devices may configure this value to zero.

LF-2(8) Number of transmit descriptors. For DMA-based devices, this value is utilized by the device driver during initialization in order to allocate a fixed-size pool of transmit descriptors to be used by the device. For best performance, the number of transmit descriptors should be equal to the number of small plus the number of large transmit buffers configured for the device. Non-DMA-based devices may configure this value to zero.

Configuring Window Sizes

The receive and transmit queue sizes must be properly configured to optimize performance. Each represents the number of bytes that can be queued by one socket. It is important that the sockets, taken together, cannot queue more data than the device can hold in its buffers. Each size should also be a multiple of the maximum segment size (MSS) to optimize performance. UDP MSS is 1470 and TCP MSS is 1460.

The maximum RX and TX queue sizes are configured using #define constants in net_cfg.h; see Socket Layer Configuration.

The RX and TX queue sizes can be reduced at run time using the socket option API (NetTCP_ConnCfgRxWinSize and NetTCP_ConnCfgTxWinSize).

The following listing shows a calculation example:

    Number of TCP connections  : 2
    Number of UDP connections  : 0
    Number of RX large buffers : 10
    Number of TX large buffers : 6
    Number of TX small buffers : 2
    Size of RX large buffer    : 1518
    Size of TX large buffer    : 1518
    Size of TX small buffer    : 60
 
    TCP MSS RX                = 1460
    TCP MSS TX large buffer   = 1460
    TCP MSS TX small buffer   = 0
 
    Maximum receive  window   = (10 * 1460)           = 14600 bytes
    Maximum transmit window   = (6  * 1460) + (2 * 0) = 8760  bytes
 
 
    RX window size per socket = (14600 / 2) =  7300 bytes
    TX window size per socket = (8760  / 2) =  4380 bytes
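The same calculation can be expressed in C. The sketch below is illustrative only: `calc_win_sizes()` and `win_sizes_t` are hypothetical names, not µC/TCP-IP functions. It assumes, as in the listing, that small transmit buffers carry no TCP payload and that buffers are shared evenly between TCP connections. It also rounds each result down to a multiple of the MSS, following the recommendation above.

```c
#include <stdint.h>

#define TCP_MSS_OCTETS  1460u   /* TCP MSS quoted in the text */

/* Hypothetical result type, not part of the uC/TCP-IP API. */
typedef struct {
    uint32_t rx_win;   /* per-socket receive window, in bytes  */
    uint32_t tx_win;   /* per-socket transmit window, in bytes */
} win_sizes_t;

/* Split the payload capacity of the large buffers evenly between the
 * TCP connections. Small transmit buffers are ignored because, as in
 * the listing, their usable TCP MSS is taken to be 0.                 */
win_sizes_t calc_win_sizes(uint32_t tcp_conns,
                           uint32_t rx_large_bufs,
                           uint32_t tx_large_bufs)
{
    win_sizes_t w = { 0u, 0u };

    if (tcp_conns == 0u) {
        return w;                       /* nothing to size */
    }

    w.rx_win = (rx_large_bufs * TCP_MSS_OCTETS) / tcp_conns;
    w.tx_win = (tx_large_bufs * TCP_MSS_OCTETS) / tcp_conns;

    /* Round down to a whole number of segments. */
    w.rx_win -= w.rx_win % TCP_MSS_OCTETS;
    w.tx_win -= w.tx_win % TCP_MSS_OCTETS;

    return w;
}
```

Calling `calc_win_sizes(2u, 10u, 6u)` reproduces the values in the listing: 7300 bytes for the receive window and 4380 bytes for the transmit window. With three connections the receive share of 4866 bytes is rounded down to 4380 bytes, which is three full segments.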

Reducing the Number of Transitory Errors (NET_ERR_TX)

To reduce the number of transitory transmit errors, increase the number of transmit buffers. Additionally, it may be helpful to add a short delay between successive calls to socket transmit functions.
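One possible way to apply such a delay is to back off briefly whenever a transmit call reports a transitory error and then try again. The sketch below is generic C and does not use the µC/TCP-IP API: the `TX_*` codes, `tx_with_retry()`, and the callback types are hypothetical, and the two stub functions only simulate a transmit pool that is exhausted for the first two calls. In a real application the transmit callback would wrap the socket transmit function and the delay callback would wrap an RTOS delay service.

```c
#include <stddef.h>

/* Hypothetical status codes for this sketch only. */
enum { TX_OK = 0, TX_ERR_TRANSITORY = 1, TX_ERR_FATAL = 2 };

typedef int  (*tx_fn_t)(const void *data, size_t len);
typedef void (*delay_fn_t)(unsigned ms);

/* Retry a transmit call while it reports a transitory error, waiting
 * delay_ms between attempts. Any other result is returned at once.   */
int tx_with_retry(tx_fn_t tx, delay_fn_t delay,
                  const void *data, size_t len,
                  unsigned max_retries, unsigned delay_ms)
{
    unsigned attempt;
    int      rc = TX_ERR_TRANSITORY;

    for (attempt = 0u; attempt <= max_retries; attempt++) {
        rc = tx(data, len);
        if (rc != TX_ERR_TRANSITORY) {
            return rc;                  /* success or non-transitory error */
        }
        if (attempt < max_retries) {
            delay(delay_ms);            /* give the stack time to free buffers */
        }
    }
    return rc;                          /* still transitory after all retries */
}

/* Stubs that stand in for the real transmit and delay services. */
unsigned stub_tx_calls    = 0u;
unsigned stub_delay_calls = 0u;

int stub_tx(const void *data, size_t len)
{
    (void)data;
    (void)len;
    stub_tx_calls++;
    /* Simulate no free transmit buffer for the first two calls. */
    return (stub_tx_calls <= 2u) ? TX_ERR_TRANSITORY : TX_OK;
}

void stub_delay(unsigned ms)
{
    (void)ms;
    stub_delay_calls++;
}
```

With the stubs, `tx_with_retry(stub_tx, stub_delay, "x", 1, 5, 2)` succeeds on the third attempt after two delays. Retrying only hides a shortage of transmit buffers, so increasing the buffer count remains the primary fix.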