
Monday, March 14, 2011

Aerohive HiveAP Provisioning Basics

Honeycomb modular furniture, courtesy Disney.
(This is what I imagine Aerohive's
corporate offices look like :)
Aerohive offers an innovative wireless product that does not require wireless controllers, which seems ideally suited for branch, remote, and home offices where duplicating hardware controllers is prohibitively expensive. One of the advantages that makes this possible is the "ground-up" design effort that has gone into the product, unencumbered by design choices made 5-7 years ago when the controller-based model emerged to provide the system-wide coordinated intelligence required for many wireless LAN functions (radio management, configuration management, key caching, etc.). This gives Aerohive a fresh swing at innovation, without the R&D and customer support burden of legacy products weighing on budgets and resources.

For more information about Aerohive and their solution, see their website and resources. Part of the excitement of working in Wi-Fi is the current state of the market, which is extremely innovative, fast-paced, and heating up competitively.

Recently, I've had the opportunity to gain some exposure to Aerohive's line of equipment. As I work through learning and understanding their solution, I'd like to take the opportunity to share what I find.

What is a "Hive"?
In Aerohive's architecture, a "Hive" simply refers to a logical collection of wireless access points that exchange control plane information with one another to facilitate distributed system-wide network intelligence. The HiveAPs coordinate the following functions:
  • Consistent QoS policy enforcement
  • Seamless layer 2 and layer 3 roaming
  • Dynamic best-path routing of client data
  • Automatic radio frequency and power selection
  • Client traffic tunneling between APs for layer 3 roaming and guest termination
HiveAP Provisioning
The first (post-design) implementation step in deploying a network of HiveAPs is to set up the management console and provision APs.

HiveManager is the company's management console, which I will dive into piecemeal throughout my exposure to their product line and features. Therefore, I won't go into much depth on HiveManager in this post, except to note that HiveManager is strictly a platform for centralized monitoring and management. It is not in the critical path for network operation (control plane or data plane). The product is available either as an enterprise appliance hosted on the customer premises or as a virtual appliance hosted by Aerohive as a managed service in the cloud.

Once HiveManager is operational, provisioning of APs can begin. HiveAPs can be configured individually or can be connected to the HiveManager for easier configuration deployment, especially for networks of more than a few access points. I will cover provisioning using HiveManager, as that is the likely scenario for most customers.

If using HiveManager Online (HMOL), upon purchase of a HiveAP customers will want to ensure that the AP serial numbers or MAC addresses are properly registered with the HMOL Staging Server. The staging server acts as a landing pad for HiveAPs when they cannot discover a local HiveManager appliance through the discovery mechanisms listed below, or when they need to connect to an HMOL hosted server. The staging server redirects HiveAPs to the correct HiveManager appliance, either internal or hosted, as configured by the administrator. To ensure HiveAP registration in these instances, log in to your HMOL account and navigate to the Staging Server section.

In the "Monitor > HiveAP Access Control List" sub-tree, ensure the purchased APs are imported using either the AP serial numbers or MAC addresses.


In this instance, I am using an HMOL hosted server and the APs were registered with the staging server automatically when purchased. I'm not sure if this is an automatic process for all customers or is unique to my account at this time.

HiveAPs follow a discovery and connection process for HiveManager that is very similar to the controller discovery used by most thin-AP architectures. HiveAPs use the CAPWAP protocol for management, and perform the following steps for HiveManager discovery:
  1. Manual Configuration - Manually configure the HiveManager IP address or domain name through the AP command line interface. Log in to the AP via console using the default username "admin" and password "aerohive". Configure the HiveManager with the command "capwap client server name <name_or_IP>".
  2. Layer 2 Broadcast - HiveAPs broadcast within the layer 2 domain (subnet) for a HiveManager appliance or Virtual HiveManager (VHM) appliance.
  3. DHCP Options 225 & 226 - Specify the HiveManager server in a DHCP option returned to the AP. Option 225 specifies a domain name and option 226 specifies an IP address. Only one option is required (see the example after this list).
  4. DNS Resolution - If no HiveManager server was specified by DHCP options, then the HiveAP attempts to resolve "hivemanager.localdomain", where the local domain name assigned through DHCP is appended. If this name resolves to a valid IP address, the HiveAP attempts to join the HiveManager at that address.
  5. HMOL Staging Server - HiveAPs attempt to reach the HMOL Staging Server at staging.aerohive.com. If the AP is registered with the staging server, it will be redirected to the configured HiveManager server. If the customer is using a VHM cloud appliance, the AP is redirected to hm-online.aerohive.com.
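
As an example of the DHCP option method above, here is a minimal Cisco IOS DHCP pool sketch that returns the HiveManager address via option 226. The pool name and all addresses are hypothetical placeholders; option 225 with an ascii value could be used to return a domain name instead.

ip dhcp pool hiveaps
   network 10.10.10.0 255.255.255.0
   default-router 10.10.10.1
   dns-server 10.10.10.2
   ! Option 226 carries the HiveManager IP address (option 225 would carry a domain name as an ascii string)
   option 226 ip 10.10.10.5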

Basic HiveManager Connection Protocols (not an exhaustive list):
  • CAPWAP (UDP port 12,222) - Used between the HiveAPs and HiveManager appliance for AP management.
  • SCP (TCP port 22) - Used by HiveManager for full configuration and image uploads to HiveAPs.
  • NTP (UDP port 123) - Used by HiveAPs for time synchronization with HiveManager.
Determine the method you will use for HiveManager discovery, deploy the necessary configuration to DHCP, DNS, or firewalls, then power on the HiveAP. The AP should then discover the server and connect.
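
If a firewall sits between the HiveAPs and HiveManager, the protocol list above translates into permits roughly like the following sketch; the ACL name and AP subnet are hypothetical and the HiveManager destination should be narrowed to your actual server addresses.

ip access-list extended HIVEAP_TO_HIVEMANAGER
 remark CAPWAP management from the AP subnet to HiveManager
 permit udp 10.10.10.0 0.0.0.255 any eq 12222
 remark SCP for configuration and firmware transfer
 permit tcp 10.10.10.0 0.0.0.255 any eq 22
 remark NTP time synchronization
 permit udp 10.10.10.0 0.0.0.255 any eq 123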

If using the HMOL staging server, check the redirection status of the AP on the "Monitor > HiveAPs" sub-tree:


Once the HiveAP has discovered a HiveManager appliance, verify that the HiveAP has connected using CAPWAP by logging into HiveManager and looking at the "Monitor > Access Points > HiveAPs" sub-tree section:


Verification can also be performed from the HiveAP with the following CLI command:
HiveAP#show capwap client
CAPWAP client: Enabled
CAPWAP transport mode: UDP
RUN state: Connected securely to the CAPWAP server
CAPWAP client IP: 10.10.10.60
CAPWAP server IP: 168.143.86.118
HiveManager Primary Name:hm-online.aerohive.com

HiveManager Backup Name:
CAPWAP Default Server Name: staging.aerohive.com
Virtual HiveManager Name: XYZ_Company
Server destination Port: 12222
CAPWAP send event: Enabled
CAPWAP DTLS state: Enabled
CAPWAP DTLS negotiation: Enabled
     DTLS next connect status: Enable
     DTLS always accept bootstrap passphrase:Enabled
     DTLS session status:Connected
     DTLS key type: passphrase
     DTLS session cut interval: 5 seconds
     DTLS handshake wait interval: 60 seconds
     DTLS Max retry count: 3
     DTLS authorize failed: 0
     DTLS reconnect count: 0
Discovery interval: 5 seconds
Heartbeat interval: 30 seconds
Max discovery interval: 10 seconds
Neighbor dead interval:105 seconds
Silent interval: 15 seconds
Wait join interval: 60 seconds
Discovery count: 0
Max discovery count: 3
Retransmit count: 0
Max retransmit count: 2
Keepalives lost/sent: 1/3185
Event packet drop due to buffer shortage: 0
Event packet drop due to loss connection: 1
HiveAP#
Now that the HiveAP is connected to HiveManager, we're ready to begin configuration!

Overall, the HiveAP provisioning process is straightforward and simple. Check back for more information on the Aerohive solution as I work through my lab deployment!

Cheers,
Andrew

Friday, January 7, 2011

[QoS] It's Tricky, Tricky, Tricky, Tricky...

You may have noticed somewhat of a recurring theme across several of my posts - Quality of Service. Since wireless networks are inherently a shared medium, and with Wi-Fi in particular using distributed contention protocols (DCF, EDCA), it stands to reason that implementing QoS controls and having some form of differentiated access to the network is just a bit more critical than on a switched LAN.

Most of the time, Wi-Fi engineers such as ourselves focus on over-the-air QoS since that is typically more critical to performance than wired QoS. However, a recent support incident highlighted the need for careful wired QoS policy definition, especially when supporting an LWAPP/CAPWAP wireless environment.

Shiny New Equipment
A recent project at our organization involved the deployment of several hundred new Cisco 3502 CleanAir access points which run on the Cisco Unified Wireless Network using the CAPWAP protocol. (For an overview of the CAPWAP protocol, see my previous blog posts here, here, here, here, and here.)

This project involved replacing existing 802.11a/b/g "legacy" access points with new 802.11n access points, as well as installation of a few hundred net-new additional APs. The replacement APs were to be installed and patched into the existing AP switch ports, while the new APs were to be patched into open switch ports in the existing data VLAN which provides DHCP services. This would allow the new APs to be deployed with zero-touch configuration, simply taken out of the box and installed by the contractor, minimizing operational expense. After the net-new APs were installed and registered to the controller, an administrator would then move them to the management VLAN and apply the standard port configuration settings for APs in our environment.

Help Desk, "We Have a Problem"
However, almost immediately after the new APs began to be installed, support tickets started rolling in. Users were reporting horribly slow wireless network performance, severe enough to make the network unusable.

A quick trip to the affected building (only 5 min. away) confirmed the issue. A simple ping from a wireless client to the default gateway would drop numerous packets, sometimes as bad as 10% packet loss. And that was when the client was otherwise idle without other applications running. The issue would get even worse when attempting to load more traffic over the connection, such as pulling down a file over an SMB share or browsing a webpage with video content, spiking upwards of 25-30% packet loss. Clearly something was going on.

Sample pings from the distribution switch (housing the WiSM controller) to the wireless client showed the same symptoms in the reverse direction as well:
CAT6K#ping        
Protocol [ip]:
Target IP address: 172.16.10.20
Repeat count [5]: 100
Datagram size [100]: 1400
Timeout in seconds [2]:
Extended commands [n]:
Sweep range of sizes [n]:
Type escape sequence to abort.
Sending 100, 1400-byte ICMP Echos to 172.16.10.20, timeout is 2 seconds:
!!!!!!.!!!!!!.!!!!!!.!!!!!!.!!!!!!.!!!!!!.!!!!!!.!!!!!!.!!!!!!.!!!!!!.
!!!!!!.!!!!!!.!!!!!!.!!!!!!.!!
Success rate is 86 percent (86/100), round-trip min/avg/max = 1/7/84 ms
CAT6K#
Initial suspicions fell on the wireless network; after all, with change comes the opportunity for new problems. However, a parallel deployment of the new CleanAir APs in several other office buildings, as well as a previous warehouse deployment, had gone off without a hitch and no issues were experienced in those locations. Numerous wireless packet captures were performed that showed little to no issues over the air. Issues were experienced on both the 2.4 GHz and 5 GHz frequency bands, very few retransmissions were present, and no interference was observed. Configuration changes were backed out and testing was performed with the legacy APs, but the issue persisted.

Additionally, the packet loss experienced to/from wireless clients was not observed when communicating directly with the AP (ping, SSH, etc.). It appeared that the wired network was performing normally.

Even stranger, we had one brand-new 3502 access point that performed great, without issue. So we tried moving this good AP to a switch port where another AP experiencing problems had been connected. Still no issue with this single AP.

How Wired QoS Killed Our CAPWAP Tunnels
Reviewing the gathered data, we began investigating switches between the APs and controller for interface errors and packet drops. All counters appeared normal, and no dropped packets were seen in the queues. However, given the predictable pattern of packet loss (as shown above) the issue smelled of rate-limiting of some sort.

Our support case with Cisco was raised to TAC Escalation engineers (aka the Big Dogs in house), who proceeded to run through numerous hidden commands on our Cat6500 switches, looking at ASIC performance and low-level debugs.

Still finding nothing, we took a shot in the dark. We disabled QoS globally on one of the access layer switches that had APs attached with issues.

no mls qos

Immediately... multi-megabit performance! SMB file transfers took seconds where they took close to an hour previously. No packet loss. We've found our culprit! 

(Some of you may be thinking, if they suspected QoS issues before, why wasn't this tested earlier? In a large enterprise, testing changes to an existing configuration in a production environment is risky business. Established processes governing change management don't exactly allow for this type of activity by administrators. It's one thing to have a back-out plan for new changes, but an entirely different scenario when changing established configuration baselines.)

Questions still remained. What QoS settings were causing the problems and why were the QoS queue counters not showing any dropped packets?

Analysis of the data also revealed that only CAPWAP encapsulated packets were being dropped, not packets or traffic destined directly to the access point outside of the tunnel (pings or SSH, as mentioned). So what is unique to the CAPWAP tunnel? Well, we know that CAPWAP uses UDP packets with destination ports 5246 and 5247 to the controller. But the source ports used by the APs are randomly selected ephemeral (high-numbered) ports above 1,024.

What other application traffic uses ephemeral UDP ports? ... Voice Bearer Traffic (RTP)! A quick review of the QoS policy revealed a fairly typical looking configuration:
ip access-list extended QOS_VOICE_BEARER
 permit udp any range 16384 32767 any
 permit udp any any range 16384 32767


class-map match-all MARK_EF
  match access-group name QOS_VOICE_BEARER


policy-map MARK_VLAN_TRAFFIC
  class MARK_EF
     police flow 128000 8000 conform-action set-dscp-transmit ef exceed-action drop


interface Vlan589
 description **Wireless Management**
 service-policy input MARK_VLAN_TRAFFIC
In the policy, voice bearer traffic is identified according to typical design-guide best practices through an ACL matching UDP ports 16,384 through 32,767. Matching voice traffic is then marked with the EF classification and policed to 128 kbps per flow.

A quick verification of CAPWAP traffic from previous wired packet captures taken during troubleshooting efforts revealed overlapping port usage between the two applications.


And that one AP that never had an issue had simply chosen a source port below the lower bound of the ACL entry. There were likely other unaffected APs as well, but this would be highly variable based on the port chosen during the CAPWAP join process.

The workaround was to re-enable QoS globally on the switch and prevent the switch from re-classifying traffic from the AP ports. This is accomplished by trusting the DSCP value of packets received on switch ports connected to wireless access points, using the following command:

mls qos trust dscp
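
Applied at the interface level, the AP-facing switchport configuration ends up looking roughly like the sketch below; the interface number and access VLAN are hypothetical placeholders for your own AP port template.

interface GigabitEthernet1/1
 description **Wireless Access Point**
 switchport access vlan 589
 switchport mode access
 ! Trust the DSCP markings applied by the AP so CAPWAP traffic is not re-marked and policed
 mls qos trust dscp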


Additionally, the packet drops were eventually found by looking at the policed packet counter through the QoS netflow statistics on the switch.


A Cautionary Example
This incident highlights the importance of complete end-to-end QoS configuration for wireless networks. Although not directly a wireless issue, the wireless network relies on other network components to deliver traffic in a manner consistent with the unique application and traffic characteristics found in various wireless architectures.

Having a thorough understanding of both wireless and wired QoS configuration and best practices is critical for network engineers designing and deploying these systems. In addition, best practices don't work in every environment. They are a rule of thumb, and engineers should examine their unique requirements and adjust accordingly.

For wireless networks, this means, at a minimum, that wired QoS policies should have explicit provisions to handle wireless traffic appropriately. This may be as simple as trusting QoS values coming out of the APs, as implemented with our workaround. Or it may mean rewriting the base QoS policy to ensure correct identification and classification of traffic. ACLs and port matching are a broad brush that can easily snag innocent application traffic. We will be reviewing the method(s) by which voice traffic is identified within our organization.
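
For instance, one way to carve CAPWAP out of the voice class would be to add deny entries for the CAPWAP control and data ports ahead of the RTP port-range permits. This is only a sketch of the idea, not necessarily the policy we will settle on:

ip access-list extended QOS_VOICE_BEARER
 remark Exclude CAPWAP control (5246) and data (5247) in both directions so AP traffic is not policed as voice
 deny   udp any any eq 5246
 deny   udp any any eq 5247
 deny   udp any eq 5246 any
 deny   udp any eq 5247 any
 remark Original RTP port-range matches
 permit udp any range 16384 32767 any
 permit udp any any range 16384 32767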

My recommendation: be fluent in end-to-end QoS and best practices, and know the applications flowing across your network so you can make sound design decisions.

QoS is a tricky trickster. Put your game face on when dealing with him!

Cheers,
Andrew

Wednesday, December 1, 2010

CAPWAP Connection Maintenance

Continuing down our CAPWAP journey: once the access point has joined a controller, a mechanism is required to verify that the correct code version is running on the AP, download the operational configuration, and maintain the connection, especially during periods of idle WLAN activity. Collectively, I'll call this the CAPWAP connection maintenance process. Connection maintenance consists of six operational states for the access point, the first two of which have already been discussed regarding the controller discovery and join processes.

Connection Maintenance Operational States
1. Discovery State is used when the AP is performing discovery to identify potential controllers on the network
2. Join State is used when the AP is actively joining a WLC
3. Image Data State is used if code version is out of sync with the WLC:
  • LWAPP Image Data Request(s) will be sent by the AP to request chunks of code
  • LWAPP Image Data Response(s) will be sent by the controller containing chunks of code
  • The AP will install new code, reboot, perform discovery, selection, and join a WLC
4. Config State is used by the WLC to provision the AP with the appropriate configuration
5. RUN State is when the access point is ready to serve clients
6. Reset State is used when the access point has been issued a command to reboot

Note - These operational states are performed behind the scenes and are not the same as the "Operational Status" field displayed in the WLC wireless access point list. That list defines the registered status of an AP with the controller (REG / DEREG).

Once the access point is in the RUN state, a heartbeat process between the AP and controller is initiated to identify the loss of connection so that the access point can attempt to failover to another controller.

LWAPP/CAPWAP Heartbeat Process:
  1. LWAPP/CAPWAP Echo Request Sent
    When the access point's heartbeat timer expires, it sends an LWAPP Echo Request to the WLC. By default, the heartbeat timer is 30 seconds. It is administrator configurable in code version 5.0 and later.

  2. Starts the 'NeighborDeadInterval' Timer
    • The AP expects an LWAPP Echo Response from WLC before the timer expires.
    • Once a response is received, the NeighborDeadInterval timer is reset and the Heartbeat Timer is restarted.
    • If no response is received and the timer expires, the AP sends additional LWAPP Echo Requests up to 5 more times at 1-second intervals.
    • If there is still no response received after 5 retries, the AP releases and renews its IP address, transitions back into the Discovery State, and attempts to discover a new controller.

      This behavior changed in code version 5.0; if the AP has a valid controller in its backup controller list, then it will immediately transition into the Join State and attempt to join the next controller in the list. The backup list is maintained by sending periodic LWAPP/CAPWAP discovery packets to each discovered controller (dictated by the AP Primary Discovery Timeout value).
The WLC also maintains a heartbeat timer and expects an LWAPP/CAPWAP Echo Request packet from the AP before its timer expires. Once the WLC heartbeat timer expires, the AP connection is flushed from the controller's active AP list. If you have ever moved APs between primary, secondary, and tertiary controllers and noticed a short period of time during which the previously connected controller still shows the AP as connected, it is due to this timer.


In versions 5.0 and later of WLC code, a fast heartbeat timer can be configured to detect failed controllers faster than 30 seconds. Separate timers can be specified for APs in local and H-REAP modes. The valid fast heartbeat interval is 1 – 10 seconds.

config advanced timers ap-fast-heartbeat { local | hreap | all } { enable | disable } interval
show advanced timers
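
For example, enabling a 10-second fast heartbeat for local-mode APs would look like this (a sketch; choose an interval appropriate for your environment):

config advanced timers ap-fast-heartbeat local enable 10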

From the AP, verification can be done using these commands:

show lwapp client ha
show lwapp client config


Tweaking the heartbeat timer allows quicker identification of failed controllers and faster failover necessary in a highly available environment. In my next post, I'll detail some additional measures that can be taken to implement a highly available wireless network.

Cheers,
Andrew

Monday, November 29, 2010

CAPWAP AP Join Process

Once the LWAPP/CAPWAP access point has discovered and selected a controller, the next step in the process is for the AP to join the selected controller. The join process verifies the identity of both the Cisco access point and controller, ensuring that only valid Cisco APs with either a Manufacturer Installed Certificate (MIC) or Self-Signed Certificate (SSC) from an autonomous AP conversion to lightweight mode can join the controller. This process also establishes a secure communication path for the LWAPP/CAPWAP control channel to ensure that only the current controller can configure and manage the access point.

The LWAPP and CAPWAP protocol join process is built on existing asymmetric and symmetric cryptography, hashing, and digital signatures. For an introduction to these concepts, see Public Key Cryptography, SSL and TLS protocols.


To join the controller, the access point and controller perform the following process:
1. AP sends Join Request
   a. Random Session ID
   b. The AP's X.509 certificate
2. Controller Verification
   a. Verifies the AP's X.509 certificate was signed by a trusted CA
   b. Generates a random AES encryption key for LWAPP control traffic
   c. Encrypts the AES key using the AP's public key
   d. Concatenates the key ciphertext with the Session ID from the Join Request
   e. Encrypts the concatenated string with the controller's private key
3. Controller sends Join Response
   a. Ciphertext (Session ID and encrypted AES key)
   b. Controller's X.509 certificate
4. AP Verification
   a. Verifies the controller's X.509 certificate was signed by a trusted CA
   b. Decrypts the concatenated string using the controller's public key
   c. Validates the Session ID
   d. Decrypts the AES key using the AP's private key
5. Join process is now complete
6. AES key lifetime is 8 hours
   a. The AP sends an LWAPP Key Update Request (containing a new Session ID)
   b. The controller generates a new AES key and encrypts it as stated above
   c. The controller sends an LWAPP Key Update Response


All Cisco lightweight APs manufactured after July 18, 2005 have Manufacturer Installed Certificates (MICs) burned into protected flash memory. Upgraded access points manufactured prior to this date will have Self-Signed Certificates (SSCs) installed during the upgrade process. The Cisco Upgrade Tool must be used during the upgrade of older APs in order to generate the self-signed certificate. SSCs are not trusted by the WLCs by default, so a mapping of AP MAC addresses to SSC public key hashes is created at the time of upgrade by the Cisco Upgrade Tool. This list can then be imported into WCS and pushed to the WLC.


Access points can also be restricted from joining a controller based on the AP Policies settings in the Security tab of the WLC. This allows more granular control of APs allowed to join a controller if the organization does not want to allow any valid Cisco AP to join for security reasons. 


Select the type(s) of certificates to accept (SSC, MIC, LSC) when authorizing APs against the AP authorization list. SSC certificates always require valid AP entries in the AP authorization list. MIC and LSC are accepted by default, and will only be checked against the AP authorization list if their respective authorization check boxes are enabled.




Debugging the LWAPP Discovery and Join processes can be accomplished with the following commands:

LWAPP Console Port Commands
debug ip udp
debug lwapp client events
show crypto ca certificates

WLC Commands:
debug lwapp events enable
debug lwapp packet enable
debug lwapp error enable
debug pm pki enable
show time

It is VERY important that the WLC have the correct time set, otherwise it may reject the LWAPP Certificate during the Join process because it is outside the validity interval. To set the correct time on the controller, issue the config time CLI command.

If APs are using Self-Signed Certificates, ensure that the WLC is configured to accept the SSCs:
show auth-list
config auth-list ap-policy ssc { enable | disable }
config auth-list add { mic | ssc } ap-mac ap-keyhash
config auth-list delete ap-mac
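
As a usage sketch, enabling SSC acceptance and adding a single converted AP might look like the following; the MAC address and key hash are hypothetical placeholders:

config auth-list ap-policy ssc enable
config auth-list add ssc 00:1a:2b:3c:4d:5e 1234567890abcdef1234567890abcdef12345678
show auth-list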



Cheers,
Andrew

CAPWAP Controller Selection Process

Once a CAPWAP access point has discovered one or more available wireless LAN controllers and built its candidate list, it is ready to select the optimal controller to join.

Controller selection is based on the following information embedded in the Discovery Response from each candidate controller:
1. Controller sysName
2. Controller Type
3. AP Capacity
4. Current AP Load
5. Master Controller Status
6. AP-Manager IP Address(es)
7. Current AP Load on each AP-Manager IP Address

The AP will select a controller based on the following precedence rules:
1. Primary configured controller preference
2. Secondary configured controller preference
3. Tertiary configured controller preference
4. Primary Backup controller (WLC version 5.0 and later)
5. Secondary Backup controller (WLC version 5.0 and later)
6. Master Controller status of WLC
7. Controller with greatest excess capacity
   a. Ratio of current AP load versus maximum capacity, expressed as a percentage

Note – Once an access point joins a controller, it learns the information of all controllers within the Mobility Group and stores their IP addresses in NVRAM. The AP may join another controller with greater excess capacity if no controller preferences are configured.

Cheers,
Andrew

Tuesday, November 23, 2010

CAPWAP Controller Discovery Process

In a controller-based architecture, CAPWAP access points depend on a wireless controller to provide the software image, configuration, centralized control, and optionally data forwarding functions. Therefore, it is necessary for the access point to find a list of available controllers with which it can associate.

The following layer 3 CAPWAP discovery options are supported:
1. Broadcast on the local subnet
2. Local NVRAM list of the previously joined controller, previous mobility group members, and administrator-primed controller (through the console port)
3. Over-the-Air Provisioning (OTAP) (subsequently removed in version 6.0.170.0 code)
4. DHCP Option 43 returned from the DHCP server
5. DNS lookup for "CISCO-CAPWAP-CONTROLLER.localdomain"

Broadcast
The CAPWAP AP sends broadcast discovery requests on the local subnet. Controllers with management interfaces on the same subnet receive the discovery request and send a discovery reply back to the CAPWAP AP.

If no controllers are on the local subnet, broadcasts may be forwarded to the controller management interface by the local router using the Cisco “forward-protocol” and “ip helper-address” features. Use these commands to configure the router:

ip forward-protocol udp 12223
ip forward-protocol udp 5246
interface interface-name
     ip helper-address wlc-management-ip-address

When using the forward-protocol, the default gateway modifies the CAPWAP discovery packet that is broadcast on the local subnet by replacing the broadcast destination IP address 255.255.255.255 with the WLC management IP address configured as an IP helper-address, then routes the packet to the controller. The downside to this approach is that the WLC will also receive all other forwarded protocols such as DHCP discovery packets. Also, other configured IP helpers will receive the CAPWAP discovery packets. Since this behavior is likely undesired, be sure to use the forward-protocol method only temporarily.

Local NVRAM
The local NVRAM of the access point stores a list of controllers, gathered from the following sources:

• Primary, Secondary, and Tertiary controller preferences previously configured for the AP

If the access point was previously associated to a controller, the IP addresses of the primary, secondary, and tertiary controllers are stored in the access point’s non-volatile memory. This process of storing controller IP addresses on access points for later deployment is called priming the access point.

To verify locally stored controller preferences:

show ap config general ap_name

Primary Cisco Switch Name........................ WLC001
Primary Cisco Switch IP Address.................. Not Configured
Secondary Cisco Switch Name...................... WLC002
Secondary Cisco Switch IP Address................ Not Configured
Tertiary Cisco Switch Name....................... BACKUP-WLC
Tertiary Cisco Switch IP Address................. Not Configured
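
For reference, these preferences are primed from the controller CLI with commands along these lines; this is a sketch reusing names from the output above, and exact syntax varies slightly by code version:

config ap primary-base WLC001 ccielwap 10.108.50.20
config ap secondary-base WLC002 ccielwap 10.108.50.18
config ap tertiary-base BACKUP-WLC ccielwap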

• Mobility Group Members from the previous controller connection

The AP also maintains a list of WLC IP addresses learned from previously joined controllers in NVRAM, including all of the WLCs in the previously joined controllers' mobility groups. The AP sends a unicast CAPWAP Discovery Request to each of these WLC IP addresses.

To verify locally stored controllers learned through mobility groups, console into the access point and enter the following command:

show capwap client config

mwarName                CCIETEST
mwarName                backupwlc
mwarName
numOfSlots              2
spamRebootOnAssert      1
spamStatTimer           180
randSeed                0x9640
transport               SPAM_TRANSPORT_L3(2)
transportCfg            SPAM_TRANSPORT_DEFAULT(0)
initialisation          SPAM_PRODUCTION_DISCOVERY(1)
ApMode                  Local
Discovery Timer         10 secs
Heart Beat Timer        30 secs
Led State Enabled       1
AP ILP Pre-Standard Switch Support Enabled
AP Power Injector Disabled
Infrastructure MFP validation Enabled
Configured Switch 1 Addr 10.127.78.5
Configured Switch 2 Addr 10.108.50.20


Note – mwarName entries are the controller preference settings (primary, secondary, tertiary). Configured Switch entries are the learned mobility group members.

• Manually primed controller IP address through the console

Manual configuration can be used to “prime” the CAPWAP AP if network services for address assignment and WLC discovery do not exist. If the AP has previously joined a controller, or is currently joined to a controller, these commands will be disabled.

To stage an access point, use the commands:
capwap ap controller ip address wlc-mgmt-ip
show capwap ip config

OTAP
If this feature is enabled on the controller, all associated access points transmit wireless RRM neighbor messages, and un-joined access points can receive the controller IP address from these messages. This feature is disabled by default and should only be enabled when necessary for AP deployment.

Note – OTAP does not work with new APs out of the box or APs upgraded using the Upgrade Tool, because the radios are disabled from the manufacturer.

Configure OTAP:
config network otap-mode { enable | disable }
show network summary

Note - OTAP was removed from the wireless controller feature set in code version 6.0.170.0 due to a vulnerability.

DHCP Option 43
The IP address that should be configured as DHCP option 43 is the address of the controller Management interface.

Cisco 1000 series access points use a string format for option 43.
Cisco Aironet access points use the type-length-value (TLV) format for option 43.

DHCP servers must be programmed to return the option based on the access point’s DHCP Vendor Class Identifier (VCI) string (DHCP option 60). The VCI strings for Cisco access points capable of operating in lightweight mode are listed in the following table:


The format of the Option 43 TLV block is:
     Type: 0xf1 (decimal 241)
     Length: Number of controller IP addresses * 4
     Value: List of WLC management interfaces

Configuration of option 43 will vary by DHCP server platform. Here is an example configuration using the built-in Cisco IOS DHCP server:

ip dhcp excluded-address start-ip end-ip
ip dhcp pool pool-name
     network ip-address netmask
     default-router ip-address
     dns-server ip-address … ip-address
     domain-name domain
     lease days hours
     option 60 ascii VCI string
     option 43 hex hex-value

An example of a finished IOS DHCP server configuration will look similar to this:

interface Vlan192
 ip address 192.168.1.1 255.255.255.0

ip dhcp excluded-address 192.168.1.1 192.168.1.10

ip dhcp pool lwapp
   network 192.168.1.0 255.255.255.0
   default-router 192.168.1.1
   dns-server 192.168.1.2
   domain-name test.lab
   lease 7
   option 60 ascii "Cisco AP c1240"
   option 43 hex f108.0a6c.3214.0a6c.3212

In this example, the hex value is obtained from these TLV values:

Type = 241 (0xf1)
Length = 2 IP addresses * 4 bytes each = 8 bytes (0x08)
Value = 10.108.50.20 (0x0a6c3214) and 10.108.50.18 (0x0a6c3212)

Note – Periods are added automatically to the hex value by Cisco IOS and should not be entered by the administrator when entering configuration commands.

DNS
The AP will attempt to resolve the DNS name “CISCO-CAPWAP-CONTROLLER.localdomain”. When the AP is able to resolve this name to one or more IP addresses, the AP sends a unicast CAPWAP Discovery Request to the resolved IP address(es). The DNS entries can be either A (address) or CNAME (alias) records.

If the AP received a DHCP address, ensure the DHCP server is configured to return valid DNS servers and a valid domain name suffix to the AP.

If the AP is using a static IP address, configure the domain name and DNS name servers from the controller. WLC version 4.2 requires configuration from the CLI, whereas 6.0 and later allow configuration from the GUI. Once applied, the AP will disconnect and re-join the controller.

config ap static-IP { enable | disable } ap_name ip_address netmask gateway
config ap static-IP { add | delete } domain { all | ap_name } domain_suffix
config ap static-IP { add | delete } nameserver { all | ap_name } dns_server_ip_address
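
For example, assigning a static address, domain, and name server to a single AP named ccielwap might look like this (values match the verification output below and are illustrative only):

config ap static-IP enable ccielwap 10.108.51.54 255.255.254.0 10.108.50.1
config ap static-IP add domain ccielwap ccietest.com
config ap static-IP add nameserver ccielwap 10.10.10.25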

Verify the configuration of the AP:

(Cisco Controller) > show ap config general ccielwap

IP Address Configuration......................... Static IP assigned
IP Address....................................... 10.108.51.54
IP NetMask....................................... 255.255.254.0
Gateway IP Addr.................................. 10.108.50.1
Domain........................................... ccietest.com
Name Server...................................... 10.10.10.25

Verification of Method Used
To view the method an AP used to discover the controller, watch the console output of the AP as it searches, or issue the following command on a controller that receives the discovery request and look for IE 58 from the AP, which indicates the discovery method used to resolve the controller IP address:

debug capwap packet enable

CAPWAP Discovery Packet IE 58 values:

0 = Broadcast
1 = Local NVRAM
2 = OTAP
3 = DHCP
4 = DNS

Example 1 – AP Console Log indicates DHCP discovery

*Mar  1 00:00:30.287: Logging LWAPP message to 255.255.255.255.
%DHCP-6-ADDRESS_ASSIGN: Interface FastEthernet0 assigned DHCP address 192.168.1.20, mask 255.255.255.0, hostname AP0018.7361.e702
Translating "CISCO-LWAPP-CONTROLLER.test.lab"...domain server (10.97.40.216)
%LWAPP-3-CLIENTEVENTLOG: Performing DNS resolution for CISCO-LWAPP-CONTROLLER
%LWAPP-3-CLIENTERRORLOG: DNS Name Lookup: could not resolve CISCO-LWAPP-CONTROLLER
%LWAPP-3-CLIENTEVENTLOG: Controller address 10.108.50.20 obtained through DHCP
%LWAPP-5-CHANGED: LWAPP changed state to JOIN
%LWAPP-5-CHANGED: LWAPP changed state to CFG
%LWAPP-5-CHANGED: LWAPP changed state to DOWN
%LWAPP-5-CHANGED: LWAPP changed state to UP
%LWAPP-3-CLIENTEVENTLOG: AP has joined controller CCIETEST

Example 2 – WLC LWAPP Packet Debug indicates DHCP discovery

(Cisco Controller) > debug lwapp packet enable

Mon Feb 22 09:55:32 2010: Start of Packet
Mon Feb 22 09:55:32 2010: Ethernet Source MAC (LRAD): 00:17:DF:96:9F:90
Mon Feb 22 09:55:32 2010: Msg Type       :
Mon Feb 22 09:55:32 2010:    DISCOVERY_REQUEST
Mon Feb 22 09:55:32 2010: Msg Length     :   31
Mon Feb 22 09:55:32 2010: Msg SeqNum     :   0
Mon Feb 22 09:55:32 2010:
IE            :   UNKNOWN IE 58
Mon Feb 22 09:55:32 2010: IE Length     :   1
Mon Feb 22 09:55:32 2010: Decode routine not available, Printing Hex Dump
Mon Feb 22 09:55:32 2010: 00000000: 03                                       Mon Feb 22 09:55:32 2010:

Prior to selecting a controller, the access point always performs discovery using the methods listed above. From the discovery responses, the AP builds a list of candidate controllers and selects the optimal controller.

In my next post, I will describe the controller selection process used by the wireless access point to determine which controller to establish a connection with when multiple controllers have been discovered.

Cheers,
Andrew

CAPWAP Split-MAC Architecture Overview

One of the key principles behind the LWAPP and CAPWAP protocol architecture is the notion of a split 802.11 media access control. Since the real processing power and smart feature set of the architecture is implemented in controllers, some functions need to be performed in the controller instead of the access point. This concept is called "Split-MAC" by Cisco and most other controller-based vendors.

The AP and controller are linked by the CAPWAP protocol using both a "control" channel for access point management, configuration, and control, and a "data" channel for forwarding user traffic between the two entities in cases where user traffic is tunneled all the way to the controller (central bridging). These two channels are nothing more than CAPWAP-encapsulated UDP packets using ports 5246 (control) and 5247 (data) since Cisco code version 5.2. Earlier versions of code used the LWAPP protocol, CAPWAP's predecessor, and used UDP ports 12223 (control) and 12222 (data).

It is important for wireless engineers designing, deploying, administering, and troubleshooting solutions using this type of architecture to understand the functions carried out by the controller versus the access point.

The industry is currently in a transition back to a de-centralized model, with local data bridging coming into higher demand as 802.11n data rates strain controller bandwidth capacity and branch offices struggle to cost-justify the additional expense of controllers. This is evident with the emergence of Cisco H-REAP, Aruba RAP, Motorola Adaptive APs, and taken to the extreme by Aerohive in their controller-less architecture. This trend will only continue, but engineers will still be required to fully understand the split-MAC concept even under these circumstances as the large vendors are likely to require centralized controllers for some control-plane functions.

The split-MAC functionality is divided between controller and AP in the following fashion:

Controller Responsibilities:

  • Security management (policy enforcement, rogue detection, etc.)
  • Configuration and firmware management
  • Northbound management interfaces
  • Non real-time 802.11 MAC functions
    • Association, Dis-Association, Re-Association
    • 802.11e/WMM Resource Reservation (CAC, TSPEC, etc.)
    • 802.1x/EAP Authentication
    • Encryption Key Management
  • 802.11 Distribution Services
  • Wired and Wireless Integration Services

Access Point Responsibilities:

  • Real-Time 802.11 MAC Functions
    • Beacon generation
    • Probe responses
    • Informs WLC of client probe requests
    • Power management and packet buffering
    • 802.11e/WMM scheduling and queuing
    • MAC layer data encryption and decryption
    • 802.11 control messages (ACK, RTS/CTS)
  • Data encapsulation and de-capsulation via CAPWAP
  • Fragmentation and re-assembly
  • RF spectral analysis
  • WLAN IDS signature analysis

In future posts, I'll detail how CAPWAP APs discover, select, join, and maintain association with a controller.

Cheers,
Andrew

Friday, July 2, 2010

Fragmentation in Controller Architectures

*Updated Dec. 2010 - Added CAPWAP information.


Why IP Fragmentation is a Concern on Wireless Networks
A big concern with controller-based architectures is reduced performance from IP fragmentation caused by tunnel overhead between the AP and controller. Typically, most lightweight APs tunnel all client traffic back to the controller through a layer 3 tunnel. This tunnel adds overhead, as the AP has to encapsulate the original client frame within a new IP packet.

Almost all vendors now offer options to locally switch (or bridge) client frames onto the wired network from the AP, without tunneling the frame back to the controller. Look up Cisco H-REAP and Aruba RAP; Ruckus does this by default on all APs; Aerohive doesn’t even have a physical controller; you get my point.

Tunnel overhead reduces the amount of data available in the packet payload that can carry client data. If the client sends frames that are above a certain threshold, then adding the additional tunnel headers increases the packet size beyond the Ethernet MTU (1500 bytes unless using jumbo frames), resulting in IP fragmentation. Ultimately, this results in increased network latency and jitter, and decreased throughput.

Bad for data, worse for voice!

LWAPP Overhead
In Layer 3 LWAPP mode, additional IP, UDP, and LWAPP headers are added to encapsulate the original client 802.11 frame between the access point and controller.

LWAPP overhead for tunneled client frames is 52 bytes due to these additional frame headers which encapsulate the original frame taken out of the air:

  • 18 – Ethernet header and checksum
  • 20 – IP header (without IP options)
  • 8 – UDP header
  • 6 – LWAPP header (CAPWAP header is 8 bytes)

The contents of the LWAPP frame also contain the original 802.11 client data frame, with a few notable portions that are stripped out. Let’s add these into the tunneled frame contents:

  • 24 – Original 802.11 MAC header, without QoS, encryption, or checksum fields
  • 8 – 802.2 LLC and SNAP headers (3 LLC, 5 SNAP)
  • 20 – Original IP header
  • Variable – Transport layer header (20 TCP, 8 UDP, 8 ICMP, etc.) and data payload



Notice that I said the whole 802.11 frame, not just the client’s data payload, gets tunneled. The 802.11 MAC header is tunneled so that the controller can de-multiplex the original client frame, determine the source and destination MAC addresses, forward the frame to the appropriate destination host, and perform the non-real-time 802.11 MAC functions in the Split-MAC architecture.

Split-MAC Responsibilities
In Cisco’s LWAPP architecture, split-MAC is the term coined to differentiate what MAC functions get handled by the controller versus the access point.

Since the controller needs to handle some MAC functions, it needs the client’s 802.11 MAC header tunneled back to it to perform its responsibilities. Since the AP handles real-time MAC functions, such as QoS and encryption/decryption, those fields are stripped out of the header before tunneling the packet to the controller.

Controller Responsibilities:
  • Security management (policy enforcement, rogue detection, etc.)
  • Configuration and firmware management
  • Northbound management interfaces
  • Non real-time 802.11 MAC functions
    • Association, dis-association, re-association
    • 802.11e/WMM resource reservation (CAC, TSPEC, etc.)
    • 802.1x/EAP authentication
    • Key management
  • 802.11 distribution services
  • Wired and wireless integration services

Lightweight Access Point Responsibilities:
  • Real-time 802.11 MAC functions
    • Beacon generation
    • Probe responses
    • Informs WLC of probe requests via LWAPP control channel
    • Power management and packet buffering
    • 802.11e/WMM scheduling and queuing
    • MAC layer data encryption and decryption
    • 802.11 control messages (ACK, RTS/CTS)
  • Data encapsulation and de-capsulation to LWAPP
  • Fragmentation and re-assembly
  • RF spectral analysis
  • WLAN IDS signature analysis

LWAPP Fragmentation
Let’s take a look at how Cisco’s LWAPP protocol handles IP fragmentation in code versions prior to 5.2, when the switch to CAPWAP is made.

Layer 3 LWAPP is able to perform full fragmentation and reassembly of tunnel packets using standard IP fragmentation, thereby allowing client traffic to make use of a full 1500 byte MTU and not to have to adjust for any tunnel overhead. This simplifies administration of wireless clients, since you don’t have to touch them to modify MTU settings for the wireless adapter. Clients simply assume their IP packets have an MTU of 1500 bytes and move happily along. It’s not until the frame reaches the AP that fragmentation needs to occur because of the tunnel overhead.


* Note - CAPWAP includes fragmentation and re-assembly support and does not rely on the standard IP fragmentation. Therefore, in fragmented CAPWAP tunnel packets, fragmentation will be performed on the CAPWAP data, with each fragment duplicating the outer IP, UDP, and CAPWAP headers.

Adding up the tunnel overhead, we find that the threshold for the original client’s IP packet that causes fragmentation of the LWAPP IP packet on an Ethernet segment with a 1500 byte MTU is at a size of 1434 bytes. We get this by adding the tunnel headers (IP, UDP, LWAPP) and the tunneled 802.11 frame headers (802.11, LLC, SNAP), and subtracting those amounts from the 1500 byte MTU. This leaves only the original client IP packet which must be under a 1434 byte threshold to prevent fragmentation. The outer Ethernet header is not included because the MTU size does not consider the layer 2 MAC header, which is appended later.


(Threshold is 1432 bytes for CAPWAP encapsulated frames)

Here is a packet capture to illustrate this fragmentation. Remember, the outer IP header for the tunnel will handle the fragmentation, so the UDP and LWAPP headers will only be found in the first fragment.

Testing LWAPP Fragmentation
An easy way to test LWAPP fragmentation is to connect to the wireless network and ping the subnet gateway above and below the maximum ICMP payload size of 1406 bytes assuming a 1500 byte MTU. How we get to 1406 bytes for this test:

1500 byte MTU
     Subtract 34 bytes for LWAPP encapsulation (subtract 36 bytes for CAPWAP)
     Subtract 24 bytes for the tunneled 802.11 header
     Subtract 8 bytes for the tunneled 802.2 LLC/SNAP header
     Subtract 20 bytes for the tunneled IP header (without options)
     Subtract 8 bytes for the ICMP header
= 1406 bytes remaining for ICMP payload (use 1404 bytes for CAPWAP)

Set up a tap or port mirror on the switch port attached to the access point. Connect a workstation running a protocol analyzer to the mirrored port or tap, filter on UDP port 12222 (LWAPP data) or UDP port 5247 (CAPWAP data), and start capturing packets.
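
If the workstation runs a command-line analyzer such as tcpdump, a capture filter along these lines accomplishes the same thing; the interface name is a placeholder for your capture interface:

tcpdump -n -i eth0 udp port 12222 or udp port 5247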

From the wireless client, try out these commands and watch the results in the protocol analyzer.

Test 1
ping -f -l 1406 ip-address

Result will be successful, no fragmentation.

Test 2
ping -f -l 1407 ip-address

Result will be successful, but LWAPP tunnel packets will be fragmented. LWAPP tunnel packets do not carry over the original packet’s don’t fragment flag.

Test 3
ping -f -l 1473 ip-address

Result will fail, as the original IP packet needs to be fragmented by the client but the don’t fragment flag is set.

Note
Many VPN client software packages will change the default interface MTU to a lower value. Verify the interface MTU to adjust the test payload size properly. For example, the Cisco Systems VPN client software for Windows XP modifies the MTU from 1500 bytes to 1300 bytes.

To view the current interface MTU:
  • Windows XP: view the registry key below (if the key does not exist, the default MTU of 1500 bytes is used)
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\Interfaces\interface-id\MTU

  • Windows Vista/7: use the command "netsh interface ipv4 show subinterfaces"

Preventing Fragmentation
Unfortunately, there is no method to prevent LWAPP fragmentation in the controller or access points until code version 6.0 with the CAPWAP protocol.

To avoid fragmentation of client packets, enable TCP Maximum Segment Size (MSS) control on the controller and lightweight access points in code versions 6.0 and later.

This feature allows connected CAPWAP access points to re-write client TCP SYN packets with the configured maximum segment size to optimize large data transfers and prevent fragmentation when CAPWAP encapsulated. The valid range is from 536 to 1363 bytes, and it is recommended to use a 1363 byte value.

Configure it from the CLI using this command:
config ap tcp-adjust-mss { enable | disable } { all | ap-name } tcp_mss_value
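
For example, to apply the recommended value to all joined APs (a sketch; adjust the value to your environment):

config ap tcp-adjust-mss enable all 1363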


- Andrew