Saturday, August 16, 2014

Optimized Roaming, RSSI Low Check, RX-SOP, Oh My!

In the Cisco landscape today, there are three features that usually come up in the same conversation. They all solve what I'd call "related" problems, but are not the same. They are incredibly useful features and do share one thing in common: you must know your RF environment before implementing them. I'll provide use-cases and examples below, but it should be noted that in the case of "Optimized Roaming", this is based on public information currently available and could change prior to the WLC AirOS version 8.0 release.

Optimized Roaming

The problem:
The well-known "sticky client" issue. For the uninitiated, when a client refuses to roam to an assumedly "better" AP (closer, stronger RSSI, better SNR etc.) that client is being "sticky". Why is this bad? Consider the following example of a lecture hall:
As the client enters the room, it associates to AP-1. As it moves farther away from AP-1 its RSSI gets weaker, SNR gets worse, retransmissions occur, dynamic rate-shifting happens, and you end up with a client communicating at a much lower data-rate. A lower data-rate consumes more air-time to transfer the same information, resulting in higher channel utilization. Ideally, the client would roam to AP-3 and the resulting RF space would be better for everyone.
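
To put a rough number on the air-time point, here is a minimal sketch in Python. The 1500-byte payload and the 54 Mbps and 6 Mbps rates are assumptions picked for illustration, and the calculation deliberately ignores preambles, headers, ACKs and contention, so treat it as a lower bound rather than a measurement.

```python
def airtime_ms(payload_bytes: int, rate_mbps: float) -> float:
    """Time on air for the payload bits alone, in milliseconds.

    Ignores preamble, MAC/PHY headers, ACKs and backoff (an assumption
    made to keep the arithmetic visible).
    """
    bits = payload_bytes * 8
    return bits / (rate_mbps * 1_000_000) * 1000


payload = 1500                      # bytes, illustrative value
near = airtime_ms(payload, 54.0)    # client close to its AP
far = airtime_ms(payload, 6.0)      # same client after rate-shifting down

print(f"{near:.3f} ms at 54 Mbps, {far:.3f} ms at 6 Mbps "
      f"({far / near:.0f}x the air-time)")
```

The same frame occupies the channel roughly nine times longer at the lower rate, and every other station on that channel has to wait for it.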

The Solution:
With Optimized Roaming, once the client reaches either a certain RSSI, Data-Rate, or both, the AP will send an 802.11 Disassociation Frame. Ideally, after receiving a disassociation frame, the client will then associate to the closer AP (AP-3 in our example). The RSSI is from the perspective of the AP. Both RSSI & Data-rate are configurable. 
If the situation were reversed and our client is leaving the building, optimized roaming can also help. If this same client, or even a less "sticky" one, were to exit the building it may still be at or around -81 for quite some time. Considering that AP-5 is in a lobby, next to glass doors, it's possible for clients to remain connected as they approach a parking lot some distance away. While they hang on to their WiFi connection for dear life, it would be a far better experience for these clients if they drop off WiFi and pick up their 3G/4G connection.

Optimized Roaming has a built-in hysteresis. This prevents "thrashing": a client idling at or around the threshold being disassociated, re-associating, getting disassociated again, etc. By default this hysteresis is 6 dB. For example, if your threshold is -75, and a client gets disassociated, that client will not be able to associate (or re-associate) to that AP until it reaches at least -69. Remember, this is from the perspective of the AP.
You can use Data-rate, instead of or in addition to, RSSI. If both are configured, both must pass in order for the client to associate.
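
The hysteresis behaviour can be sketched as a tiny state machine. This is only a model of the description above, not Cisco's implementation: the class name, the method names, and the choice to apply the hysteresis only to clients this AP has already disassociated are all assumptions, as is the exact behaviour at the boundary values. The data-rate check is left out for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class OptimizedRoamingSketch:
    """Hypothetical model of the RSSI threshold plus hysteresis."""
    rssi_threshold_dbm: int = -75   # example threshold from the text
    hysteresis_db: int = 6          # default hysteresis from the text
    kicked: set = field(default_factory=set)

    def should_disassociate(self, client: str, rssi_dbm: int) -> bool:
        # RSSI is measured from the AP's perspective.
        if rssi_dbm < self.rssi_threshold_dbm:
            self.kicked.add(client)
            return True
        return False

    def may_associate(self, client: str, rssi_dbm: int) -> bool:
        required = self.rssi_threshold_dbm
        if client in self.kicked:
            # A previously disassociated client must come back stronger.
            required += self.hysteresis_db
        if rssi_dbm >= required:
            self.kicked.discard(client)
            return True
        return False


ap = OptimizedRoamingSketch()
print(ap.should_disassociate("client-1", -76))  # below -75
print(ap.may_associate("client-1", -72))        # still weaker than -69
print(ap.may_associate("client-1", -69))        # reaches -69
```

A client hovering at -74 to -76 gets disassociated once and then stays off this AP until it is clearly back in range, which is the whole point of the hysteresis.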

RSSI Low Check

The Problem: 
In our last example, we outlined the issue of clients leaving a building but remaining on WiFi for far too long. A related issue is that of clients who are merely walking by the building, on their way to somewhere else. They never get stronger than a -80 RSSI, but their mobile device prefers WiFi over 3G/4G. The client tries to connect; sometimes it succeeds, sometimes it fails. Either way, it's a poor connection at best, leading to a poor user experience. I personally experience this often; consider the following example:
The main paths of this Campus have no outdoor coverage, however bleed-over from each building is enough for a Mobile device to "try". 

The Solution: 
With RSSI Low Check enabled, a client's association requests will be ignored, unless the AP hears them stronger than -80 (configurable). To be more accurate, the 802.11 Association Request will be followed by an Association Response with status code 0x0022, meaning "poor conditions". Attached is a .pcap of this occurring (with a threshold of -60 set). Here is what it looks like:

RSSI Low Check Vs. Optimized Roaming...What's the difference? Aside from configuration specifics, RSSI Low Check does not let a "weak" client connect, where Optimized Roaming actively disassociates a "weak" client.

The takeaway here is that you do not need RSSI Low Check, if you are using Optimized Roaming. Reason being: a client must pass the Data RSSI value, configured for Optimized Roaming, in order to associate. Therefore, Optimized roaming is really a replacement for, and better than, RSSI Low Check. Thankfully, it's also configurable in an RF Profile.

Why would you use RSSI Low Check? Well, for one, RSSI Low Check is available today. At the time of this writing Optimized Roaming is not (WLC 8.0 code). Secondly, I've personally not tested Optimized Roaming in a full production environment with thousands of client-types. It's possible that certain clients, upon receiving a disassociation frame, will not immediately rejoin the ESS leading to "client issues". I think that second possibility will be rare, but possible.

If, for some reason, you have both RSSI Low Check & Optimized Roaming configured, both must pass in order for a client to associate. For example, if RSSI Low Check is set to -65 and Optimized Roaming is set to -75, the client must be at -65 or stronger to associate.
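
Since both checks must pass, the stricter (numerically higher) threshold is the one that matters. A minimal sketch of that rule follows; the function names are hypothetical and the "at or stronger" boundary follows the example above.

```python
from typing import Optional


def effective_threshold_dbm(low_check_dbm: Optional[int],
                            optimized_roaming_dbm: Optional[int]) -> Optional[int]:
    """Strictest configured association threshold, or None if neither is set."""
    configured = [t for t in (low_check_dbm, optimized_roaming_dbm) if t is not None]
    # -65 is "stricter" than -75, so the strictest value is the maximum.
    return max(configured) if configured else None


def may_associate(rssi_dbm: int,
                  low_check_dbm: Optional[int] = None,
                  optimized_roaming_dbm: Optional[int] = None) -> bool:
    """True if the client, as heard by the AP, passes every configured check."""
    threshold = effective_threshold_dbm(low_check_dbm, optimized_roaming_dbm)
    return threshold is None or rssi_dbm >= threshold


print(effective_threshold_dbm(-65, -75))
print(may_associate(-70, low_check_dbm=-65, optimized_roaming_dbm=-75))
print(may_associate(-65, low_check_dbm=-65, optimized_roaming_dbm=-75))
```

With the values from the example, a client heard at -70 would pass Optimized Roaming on its own but is still refused because of the -65 Low Check setting.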

Receive Start of Packet (RX-SOP)

This is arguably the coolest of the three, and surprisingly, has been around the longest. In its first form (7.2 code I believe) it was hidden. It was then unhidden (7.5 I believe) and finally with 8.0 will be part of the GUI. You can't find any reference to RX-SOP that doesn't also include something along the lines of "if you use this, beware, it may destroy you". I can understand Cisco's trepidation about releasing this to the populace. It is powerful. It is dangerous. Remember that sentence before about knowing your RF environment? Ok disclaimer over. I'd highly recommend you go read the NSA Show review of RX-SOP, by @samuel_clements and @blakekrone. They did a great job with it, including experiments & graphs.

The Problem:
High Co-channel contention & channel utilization in high-capacity environments. The rules of co-channel interference should always be followed (avoid it!), but in HD environments it is sometimes unavoidable. What typically results is a situation where an AP is holding off transmitting to its clients, due to CSMA/CA. More specifically, the CCA-Carrier Sense will kick off at anything above -85 for a STA (AP or Client), and the medium is determined 'busy' for the time specified by the Length value of the SIG field in the PLCP Header. Further, CCA-Energy Detect will determine the medium 'busy' at anything 20 dB stronger (-65). I'm skipping over Virtual Carrier Sense (NAV) and sticking to just the PHY for this discussion. For a description of both Physical & Virtual mechanisms of CSMA/CA check here, here, here or here

Consider the following crude drawing:
This presents a situation where AP-1 "could" successfully transmit to Client-1, assuming sufficient SINR, but it does not, due to CSMA/CA.

The Solution:
RX-SOP essentially takes any frame received below the set threshold and dumps it in the Noise bucket. It's been described as tuning the AP Receive Sensitivity, or applying "Ear Muffs". Taking our example, if RX-SOP is configured at -80, AP-1 does not "hold-off", because it doesn't determine the medium as "busy" due to AP-2's transmissions. As far as AP-1 is concerned, the Medium is free to use (albeit a bit more noisy). You can see how this could greatly improve performance in the Downlink direction.
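
The "noise bucket" idea can be modelled in a few lines. This is a sketch of the description above and nothing more: how RX-SOP is really implemented inside the AP is not something I can confirm, the function name is hypothetical, the -85 and -65 values are the ones quoted earlier in this post, and the -82 level at which AP-1 hears AP-2 is an assumption for the example.

```python
from typing import Optional

CCA_CS_DBM = -85   # carrier-sense level quoted in this post
CCA_ED_DBM = -65   # energy-detect level quoted in this post


def medium_busy(rssi_dbm: int, is_80211_frame: bool,
                rx_sop_dbm: Optional[int] = None) -> bool:
    """Physical carrier sense only; the NAV is deliberately left out."""
    # Energy detect applies to any signal, 802.11 or not.
    if rssi_dbm >= CCA_ED_DBM:
        return True
    if not is_80211_frame:
        return False
    # Assumed RX-SOP behaviour: frames below the threshold count as noise.
    if rx_sop_dbm is not None and rssi_dbm < rx_sop_dbm:
        return False
    # Otherwise normal carrier sense on a decodable 802.11 frame.
    return rssi_dbm > CCA_CS_DBM


ap2_heard_at = -82  # assumed level of AP-2's frames at AP-1
print(medium_busy(ap2_heard_at, True))                  # no RX-SOP
print(medium_busy(ap2_heard_at, True, rx_sop_dbm=-80))  # RX-SOP at -80
```

In this model energy detect is untouched, so a strong enough signal still holds the AP off no matter where RX-SOP is set.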

Client behavior does not change. If a client determines the medium busy, with or without RX-SOP yields the same result (it will back-off). In other words, this does not improve the Uplink direction.
If you configure just RX-SOP, without Optimized Roaming, it is possible that a "sticky client" will fail hard. If the client does nothing (but retransmit), in the absence of any ACKs, it could take a while for it to roam. Probably not a common occurrence, but possible.
The success of RX-SOP is dependent on SINR. Your environment will determine whether, or how much, RX-SOP will help and what level it should be set to.
I can only speculate on exactly how RX-SOP is implemented. Does the determination happen at CCA-CS; the receipt of the Short Training Field (STF) in the Preamble, or is it after receiving the full PLCP Header? Or is it happening before that? Not sure, but it's been a tool in the bag for quite some time. Check the further reading section for more.

Further Reading:


  1. Just to clarify. RX-SOP doesn't change the CCA value the AP uses to determine when and when not to transmit. It uses the value of incoming packets from clients, to determine if it ACKs the packet. So if RX-SOP is set at -78dBm then any packet the AP receives at -79dBm or worse, it ignores and does not process. The CCA value of the AP, which is -85 by default stays the same after RX-SOP tuning.

    1. Travis, your comment is interesting. Does this mean that even WITH RX-SOP implemented that a Cisco AP will still show CCA as busy when neighboring AP or client frames are below the RX-SOP threshold but above -85dBm (or whatever Rx Sensitivity allows proper preamble and PLCP decoding)? So RX-SOP doesn't necessarily help alleviate CCI in an HD environment if that is true.

    2. Good point; so if CCA-CS stays the same, and all we are really doing is not sending an ACK, that doesn't help with determining the medium as free. It would not help with the AP's perception of CU either, assuming the client moved to a different AP on same channel. You mention it does not change the value the AP uses to determine when & when not to transmit, but you also mention that any packet weaker than SOP threshold is "ignored/not processed". I guess the question is "how much" ignoring does the AP do? If it doesn't ignore enough to determine medium "free" (i.e. only implements a "no ACK" as you said), that would be contrary to the notion of treating the frame as noise, tuning sensitivity, ear-muffs etc. Pg. 90/91 of BRKEWN-3010 for example.

  2. RX-SOP is not meant to assess the medium, it's really another tool to reduce cell size, like trimming data rates and limiting power level. The practical application of RX-SOP is to keep clients on the AP/Cell they are supposed to be on. So for example in a stadium, you wouldn't want clients sitting in a lower bowl, attaching to an AP in the upper bowl. So setting the RX-SOP at a value that will ignore packets with a certain RSSI value from being processed, makes the client search for the better AP. From an AP perspective, the CU does go down slightly, because it thinks there are fewer 802.11 frames on the medium. From a practical perspective, the CU doesn't really go down because the transmissions are still there. In order to address CCI you need to change the CCA value, so that the AP can transmit at values higher than -85dBm.

  3. OK, so I need to backpedal a bit and correct myself. I did some digging on internal documentation on this, to verify my understanding. RX-SOP seems to modify the CS value for the AP, which will allow it to transmit downstream, provided it can pass the ED portion of CCA. I've had a hard time finding explicit docs saying this, but based on other ones I can find, this is what I have to assume it's doing. My misunderstanding was that SOP was completely separate from the CCA ED/CS portion and really only addressed the incoming frames and allowed the AP to disregard the ones that didn't meet the defined threshold. It seems, based on what I can find, that it does tie into the CS portion of CCA. I'm going to keep digging to see if I can find stuff expressly confirming this.

  4. Great article @Mike, your simple yet informative explanation made things clear. I am interested to know when RX-SOP was first introduced. You mentioned "7.2 code", but it would be more interesting to know the exact date. Maybe @Travis can help to clear this up.
    Thank you.

  5. RX-SOP does indeed happen at CCA, so the solution is quite simple and effective. If the received frame falls below the RX-SOP threshold, it is discarded as noise and is not even demodulated. We spend no time processing the packet.

    The No Strings Attached Show recently published results of their independent testing of this feature here:

    And Cisco posted an RX-SOP deep dive video from WFD7 here:
    The question of "how is RX-SOP implemented" is addressed at 15:52 in the video.

  6. So, does RX-SOP impact RRM algorithms at all?

    1. Hi Rupert,
      I would think it does impact RRM since RRM neighbor packets from other APs might be ignored if they are below the RX-SOP threshold and therefore not demodulated. It would likely impact both transmit power control (TPC) and dynamic channel assignment (DCA). However, we need more information from Cisco as to how RX-SOP impacts RRM to be certain.