Sunday, July 10, 2011

Bandwidth Sharing within CBWFQ/LLQ


Kevin Davis, Senior Network Consultant, NetQoS, Inc.

Often we are asked how Class Based Weighted Fair Queuing (CBWFQ) works and how bandwidth sharing is accomplished when Low Latency Queuing (LLQ) is configured. As the name implies, a weight determines the relative frequency with which each configured queue is given access to the network interface to transmit its contents. Typically this weight is derived from the amount of bandwidth (or number of bytes) assigned to each queue and the size of the packets being transmitted, to create fairness between small and large conversations.
An important but often overlooked part of CBWFQ/LLQ is how the algorithm behaves before bandwidth is allocated among the configured queues. Understanding how packets are classified and how they find their way into each configured queue is critical to understanding proper configuration and operation of CBWFQ/LLQ.
First, let us deal with these common misconceptions about how CBWFQ works so that we can then discuss its theory of operation:
  • When a configured classification’s queue fills, packets marked for that queue are NOT sent down to the next queue in the order in which they are configured.
  • When a configured classification’s queue fills, packets marked for that queue are NOT sent to the default queue either.
  • When the bandwidth assigned to the LLQ is consumed, it is NOT allowed to share in the unused bandwidth with other non-LLQ classes.
What happens when a configured classification’s queue fills? As shown with Packet 12 destined for BWQ3 in Figure 1, additional packets are tail dropped (unless RED is configured) when the queue is full. The tail drop causes the sender to retransmit the packet and TCP to throttle back, which decreases the amount of data for that traffic flow and its overall throughput.
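The per-class tail-drop behavior (and the first two misconceptions above) can be sketched in a few lines of Python. The queue depth and packet names below are illustrative, not taken from any actual IOS implementation:

```python
from collections import deque

class ClassQueue:
    """A per-class queue with a finite depth; when full, new packets are tail dropped."""
    def __init__(self, depth):
        self.depth = depth
        self.packets = deque()
        self.drops = 0

    def enqueue(self, pkt):
        if len(self.packets) >= self.depth:
            self.drops += 1   # tail drop: the packet is NOT moved to another queue
            return False
        self.packets.append(pkt)
        return True

# Hypothetical setup: BWQ3 holds at most 3 packets; the default queue is separate.
bwq3 = ClassQueue(depth=3)
default_q = ClassQueue(depth=64)

for n in range(1, 6):         # five packets classified for BWQ3
    bwq3.enqueue(f"pkt{n}")

print(len(bwq3.packets), bwq3.drops, len(default_q.packets))  # → 3 2 0
```

Note that the two excess packets are simply dropped; the default queue stays empty, matching the behavior described above.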
Understanding how this technology works conceptually is a key to successfully implementing Layer 3 QoS, which provides the expected results: low jitter and latency for VoIP applications and consistent latency for business-critical applications. Misunderstanding how CBWFQ/LLQ works can result in problems with VoIP implementations and cause performance problems for business-critical applications.

CBWFQ w/LLQ: Packets destined for a network that is reachable from the network interface in question are forwarded (routed) to its output buffers. Often this involves merely passing a pointer in the router’s memory. If CBWFQ/LLQ queuing has been configured, the interface First-In-First-Out (FIFO) HW buffer is checked to see if it is full, which indicates that it has become congested.
Note: Congestion with respect to Layer 3 QoS is not a function of utilization measurement over time (it does not represent average utilization of 80% or 90% over 30 second, 5 minute, or 15 minute intervals), but whether a certain hardware buffer on the interface is full at any moment.
If the FIFO buffer is not full, the packets are forwarded directly to it and are transmitted in the order in which they are received. Given that the FIFO HW buffer must be full to invoke the CBWFQ queues, it is important that the hardware FIFO queue is small enough to invoke LLQ to manage jitter and latency for VoIP, but large enough to not reduce overall throughput.
The size of this FIFO hardware buffer depends on each vendor’s implementation, but might be manually adjusted by configuration commands (depending on interface type and Layer 2 encapsulation) or auto-adjust if a QoS policy such as CBWFQ is configured on the interface.
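On some Cisco platforms, for example, the transmit ring can be tuned with an interface-level command (availability and units vary by platform and Layer 2 encapsulation); the interface and value below are purely illustrative:

```
interface Serial0/0
 tx-ring-limit 3
```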
When the FIFO HW buffer becomes full, newly arriving packets are sent to a classification process to be assigned to specific queues configured within the CBWFQ/LLQ service policy. The classification process determines to which class the packet belongs by reading the IP Precedence or DSCP value in the IP header and attempts to forward the packet to the queue servicing that particular class. If the queue servicing the class is full, the packet is tail dropped. The packet is not forwarded to the next queue or the default queue.
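A minimal Cisco IOS configuration sketch of the classification and queuing described above might look as follows; the class names, DSCP values, and bandwidth figures are illustrative assumptions, not values from this article:

```
class-map match-any VOICE
 match ip dscp ef
class-map match-any CRITICAL-APPS
 match ip dscp af31
!
policy-map WAN-EDGE
 class VOICE
  priority 256          ! LLQ: strict priority, policed to 256 kbps
 class CRITICAL-APPS
  bandwidth 512         ! CBWFQ: guaranteed 512 kbps during congestion
 class class-default
  fair-queue
!
interface Serial0/0
 service-policy output WAN-EDGE
```

Packets marked EF are steered to the LLQ, AF31 to the CRITICAL-APPS queue, and everything else to the default queue.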
This process is repeated for each packet as it arrives at the interface during periods of congestion; that is, when the HW FIFO buffer is full. At each respective queue, the packets wait for the scheduler to forward them to the HW FIFO buffer for transmission on the network circuit. In each configured class and its attendant queue, the packets are serviced FIFO to ensure packets are not delivered out of order to the receiving end.
If a packet is forwarded to the Low Latency Queue (strict priority queue), it is controlled by a policing mechanism to make sure the LLQ is not consuming more than the amount of bandwidth it has been assigned. If the amount of assigned bandwidth for the LLQ has already been consumed, the packet is dropped. If the bandwidth has not been consumed, the packet is forwarded to the scheduler. The policing mechanism prevents the LLQ from starving the other queues from bandwidth. It also tightly manages jitter and latency within the queue. If you do not configure sufficient bandwidth for your LLQ, your VoIP implementation will have performance problems.
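The LLQ policer behaves conceptually like a token bucket. The sketch below is a simplified illustration under assumed rate and burst parameters, not Cisco's actual implementation:

```python
class LLQPolicer:
    """Token-bucket sketch of the LLQ policer: packets exceeding the
    provisioned rate are dropped, never queued behind other classes."""
    def __init__(self, rate_bps, burst_bytes):
        self.rate = rate_bps / 8.0   # refill rate in bytes per second
        self.burst = burst_bytes
        self.tokens = burst_bytes
        self.last = 0.0

    def allow(self, size_bytes, now):
        # refill tokens for the elapsed time, capped at the burst size
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= size_bytes:
            self.tokens -= size_bytes
            return True   # forward to the scheduler
        return False      # drop: the LLQ may not borrow unused bandwidth

# 64 kbps LLQ with a 1500-byte burst: only the first of three
# back-to-back 1500-byte packets fits within the allowance.
p = LLQPolicer(rate_bps=64_000, burst_bytes=1500)
print([p.allow(1500, t) for t in (0.0, 0.0, 0.0)])  # → [True, False, False]
```

This is why under-provisioning the LLQ shows up directly as dropped voice packets rather than as extra queuing delay.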
How is bandwidth sharing accomplished? Each class is configured with a weight, which is usually the bandwidth. An algorithm uses this weight to determine when each class is scheduled. When a configured class is not using its assigned bandwidth (for example, when its queue is empty), the scheduler visits the other bandwidth queues (except the LLQ) according to their weighted value to create a statistical fairness for transmission among the various queues. In Figure 1, BWQ2 does not contain packets; therefore, the 10% of bandwidth assigned to it will be used by the other bandwidth queues, but not the LLQ because it is strictly policed.
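The redistribution of an idle queue's share can be sketched with a simple weighted scheduler. The queue names, weights, and per-round capacity below are assumptions for illustration; real CBWFQ scheduling is byte-based and more sophisticated:

```python
from collections import deque

def schedule(queues, weights, rounds, slots_per_round=6):
    """Sketch of weighted sharing: each round the link transmits a fixed
    number of packets, divided among the *backlogged* queues in proportion
    to their weights, so an idle queue's share is redistributed, not wasted."""
    sent = {name: 0 for name in queues}
    for _ in range(rounds):
        active = [n for n, q in queues.items() if q]
        if not active:
            break
        total_w = sum(weights[n] for n in active)
        for name in active:
            share = round(slots_per_round * weights[name] / total_w)
            for _ in range(min(share, len(queues[name]))):
                queues[name].popleft()
                sent[name] += 1
    return sent

# BWQ2 is empty (as in Figure 1); its share is absorbed by BWQ1 and BWQ3.
queues = {"BWQ1": deque(range(100)), "BWQ2": deque(), "BWQ3": deque(range(100))}
weights = {"BWQ1": 3, "BWQ2": 1, "BWQ3": 2}
print(schedule(queues, weights, rounds=10))  # → {'BWQ1': 40, 'BWQ2': 0, 'BWQ3': 20}
```

With BWQ2 idle, BWQ1 and BWQ3 split all six slots per round 3:2 instead of receiving only their nominal 3-of-6 and 2-of-6 shares. An LLQ would not participate in this redistribution because it is strictly policed.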
CBWFQ can dynamically adapt to mete out unused bandwidth assignments to the other configured classes. Bandwidth assigned to each queue is only “carved out” when applications are actually in need of such bandwidth. This flexible and workable approach provides consistent network resources to key applications without permanently committing resources that might go unused and impact non-preferred applications.
Note: Before implementing a QoS policy, conduct an analysis to determine if the application performance problem is a result of inconsistent resource availability in the network; for example, network congestion and packet drop. QoS policies such as CBWFQ are intended to bring consistent performance from the network to applications. They do not create additional bandwidth, bring two sites physically closer together to remove delay because of distance, or reduce the round trips necessary to fulfill an application transaction.
Before implementing CBWFQ/LLQ, it is important to do the following:
  • Use Netflow data to determine the utilization signature of business-critical applications that get precedence in CBWFQ. Do this on each interface where CBWFQ/LLQ is applied.
  • Use Netflow data to baseline interface utilization as a whole and to identify rogue applications that might cause performance problems for applications assigned to the default queue. A rogue application transfers large amounts of data, which creates significant latency and packet drop for other applications in the default queue. You should shape, police, or reschedule these applications to minimize impact on the default queue.
  • Use a network tool or agent such as a sniffer or IPSLA to baseline network latency for business-critical applications and therefore determine the effect on end-to-end latency and packet drop before and after implementation of CBWFQ/LLQ.
  • Use a network tool such as a sniffer or IPSLA to baseline network latency for several applications that will be placed in the default queue and determine the impact on end-to-end latency for such applications; remember, QoS is a zero-sum game and one application’s gain will be another application’s loss. Microsoft’s file sharing protocols (NetBIOS SSN 139 and MS DS 445) are good candidates for such baselines.
  • Establish initial SLA values for utilization, latency, and packet drop on a per interface/application basis.
  • Obtain feedback from users on performance of business-critical applications that receive CBWFQ/LLQ.
After implementing CBWFQ/LLQ it is important to do the following:
  • Use data from Netflow, IPSLA, application monitors, or SNMP pollers to detect changes in application utilization/latency and make proactive changes to CBWFQ policies before end users’ experience is negatively impacted.
  • Use SNMP monitors to detect packet drops in CBWFQ queues. Packet drop should equal exactly zero in all configured queues; all packet drops should occur in the default queue.
  • Use network toolsets to set and monitor SLAs in terms of utilization, latency, and packet drop.
  • Obtain feedback from users on business-critical application performance that received CBWFQ/LLQ to ensure that consistent application performance is received across application users.
While this discussion specifically addresses implementing CBWFQ/LLQ (a Cisco technology), these queuing algorithms share similar characteristics: finite queue depths, weighting algorithms, scheduling algorithms, and key events that invoke the queuing behavior. Other white papers conceptually describe weighted fair queuing, priority queuing, and weighted round robin queuing, as well as enhancements to CBWFQ/LLQ such as the use of RED and scavenger queues.

Online Tutorial : http://www.cisco.com/web/learning/le31/le46/cln/qlm/CCVP/qos/congestion-management-configuring-cbwfq-and-llq-2/player.html