Tuesday, May 24, 2011

Aerohive Credential Caching Improves Branch Office Availability

The Necessity for Highly Available WLANs
Wireless LANs are mission-critical, and have been for a while. More than ever, organizations depend on the availability of the wireless network to run the business, educate students, and interact with consumers. Organizations across many industries have realized the benefits of mobility, users increasingly expect ubiquitous network access, and machine-to-machine (M2M) communications are poised to grow demand exponentially. In such an atmosphere, the wireless network must be high performing, resilient, adaptable, and highly available.

Many organizations provide highly available WLANs by using pre-shared key (PSK) security, since network access then has no reliance on external systems. EAP authentication provides higher security than PSK deployments, but suffers from higher complexity and a reliance on AAA services external to the WLAN. Modern wireless networks typically rely on RADIUS as this service. However, deploying local authentication services at each remote site can be cost-prohibitive due to software licensing and server resources. This, coupled with the higher latency of EAP authentication across a WAN circuit, leads many organizations to adopt central AAA services for non-mission-critical applications, but to forgo stronger EAP security and implement PSK for mission-critical applications such as VoWiFi or transaction processing. The result is a trade-off of security for availability, which is sub-optimal and introduces much more risk to the organization.

Can't we have both high security using EAP with dynamic per-user authentication and keying coupled with the high availability of PSK networks? We can, using Aerohive credential caching!

Credential Caching Overview
Aerohive authentication cache provides remote branch offices the ability to cache successful client authentications from central directory services for use when the central site or services are unavailable, such as a WAN outage. This provides high availability of the WLAN for remote sites, enabling continuity of service for locally deployed applications, collaboration among users, and local Internet access (if deployed locally at branches and not tunneled back to the corporate head-end).

It does this by providing local RADIUS service within select HiveAPs and integrating with corporate directory services, including Active Directory, Open Directory, and native LDAP systems. In this scenario, the HiveAP provides EAP authentication for clients and functions as both the authenticator and authentication server simultaneously. EAP types supported include PEAP, EAP-TLS, EAP-TTLS, and LEAP. On the back-end the HiveAP is configured with directory access credentials to authenticate users and cache their login information locally from the directory server for offline use (similar to a Windows computer object joined to the domain). This may include Kerberos v5 if using Active Directory as the back-end directory service.

Aerohive Credential Caching Logical Diagram with Central Directory Services


Note - I know some of you will be wondering why I'm not discussing Private PSK (PPSK) or Dynamic PSK (DPSK) offered by Aerohive and Ruckus. While these are novel approaches for small-scale solutions involving user-interactive device platforms, they cannot address many deployment scenarios involving embedded devices or non-user devices, as is typical with voice handsets and vendor transaction processing systems. A more robust and scalable solution is required for such scenarios.

Deployment Requirements
To deploy credential caching, administrators must complete the following steps for HiveAPs at the branch sites:
  1. Determine the remote HiveAPs that will provide local AAA server (RADIUS) services
  2. Assign static IP addresses to the HiveAPs providing AAA services
  3. Create certificate(s) for HiveAP AAA servers to use during EAP authentication
  4. Integrate the HiveAP AAA servers with user directory services
  5. Create the HiveAP AAA server (RADIUS) configuration
  6. Create AAA client settings for all other NAS client APs, which point to the HiveAP AAA server(s) for RADIUS authentication
  7. Deploy configurations to all HiveAPs
First, determine which HiveAPs at each remote site will be AAA servers providing RADIUS and EAP termination. One or multiple HiveAPs may provide AAA services for other HiveAPs. Multiple APs are recommended to provide fault tolerance at each branch location.

Second, the selected HiveAPs must be configured with static IP addresses since they will be providing services that other NAS client APs will rely on for user authentication. Assign static IP addresses from within individual AP configuration settings in the "Interface and Network Settings" section (be sure to also configure DNS servers in the WLAN profile when using static IP addresses to ensure name resolution for HiveManager discovery).

Next, create one or multiple RADIUS server-side certificates for use by the HiveAP AAA servers. The server-side certificates are required during TLS tunnel setup between the RADIUS server and client during EAP authentication. This can be accomplished by creating a Certificate Signing Request (CSR) either from HiveManager or using an external utility such as OpenSSL. Using HiveManager is a simple process; navigate to Advanced Configuration > Keys and Certificates > Server CSR. Fill out the form and click "Create". 
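For the external-utility route, here is a minimal sketch of generating the private key and CSR with OpenSSL, driven from Python. The host name, file names, organization, and country are placeholders I made up for illustration. The `openssl req` options used are standard ones, but check them against your installed OpenSSL version before relying on this.

```python
import shlex

def build_csr_command(common_name, key_path, csr_path,
                      organization="Example Corp", country="US", key_bits=2048):
    """Return the argv list for an `openssl req` call that creates an
    unencrypted RSA private key and a matching CSR in one step.
    All default values here are illustrative placeholders."""
    subject = f"/C={country}/O={organization}/CN={common_name}"
    return [
        "openssl", "req",
        "-new",                        # create a new certificate request
        "-newkey", f"rsa:{key_bits}",  # generate a new RSA key alongside it
        "-nodes",                      # do not encrypt the private key file
        "-keyout", key_path,           # where to write the private key
        "-out", csr_path,              # where to write the CSR
        "-subj", subject,              # subject DN, so there are no prompts
    ]

# Hypothetical names for a HiveAP acting as AAA server
cmd = build_csr_command("hiveap-aaa01.example.com",
                        "hiveap-aaa01.key", "hiveap-aaa01.csr")
print(shlex.join(cmd))
```

To actually run it, pass the list to `subprocess.run(cmd, check=True)` on a machine with OpenSSL installed. With this route the private key is created outside HiveManager, so it has to be imported along with the issued certificate.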

Then send off the CSR file to the Certificate Authority for verification and certificate creation. Once the CA has issued the certificate, navigate to Advanced Configuration > Keys and Certificates > Certificate Mgmt to import the certificate into HiveManager. The private key file is generated automatically with the CSR and listed in this section. No manual merging of the private key and issued certificate is required, but it may be performed if desired. Also import the CA certificate so that the complete certificate chain is sent to clients during outer TLS tunnel establishment. It will also be used to verify trust of client certificates if using EAP-TLS authentication. The certificate, private key, and CA certificate files will be pushed to HiveAP AAA servers later during the configuration.

Aerohive Certificate Management, Showing the Private Key,
CA Certificate, and Issued Certificate Files

Integration with AAA user directory services is accomplished in the Advanced Settings > Authentication > AAA User Directory Settings section. In this section, you will define a profile that HiveAP AAA servers will use to query a back-end directory to authenticate users. This may include Active Directory, Open Directory, or native LDAP. In this example we will use Active Directory since this is very common. Create a new profile to get started.

Aerohive HiveManager AAA User Directory Settings
Select the directory type from the option buttons and select or define a new IP address / host name network object for the Directory Server. Multiple directory servers may be configured for redundancy.

For Active Directory integration, specify the default domain in which the HiveAP computer object, AD server, and user objects to be authenticated reside (select the root domain of the forest). If users in multiple domains need to be supported, configure the "Multiple Domain Info" section at the bottom. Configure the BindDN account (user object) that the HiveAP AAA server(s) will use to authenticate to the directory server and look up user accounts. This allows the HiveAP RADIUS service to authenticate wireless users. Additionally, in order to perform credential caching, configure the Admin User account, which the HiveAP uses to log in to Active Directory and add itself as a computer object in the domain or computer OU specified. This allows the HiveAP to cache user NTLM credentials locally in access point RAM for use when the directory server(s) are not available. The admin user specified must have rights to add computer objects into the domain or OU specified.

Note - Using this method, the admin user account information is stored in HiveManager and in the flash configuration file of HiveAPs. If this is not desirable, HiveAPs may be joined individually to the domain from the CLI which does not get stored in any configuration files. Issue the following command: "exec aaa net-join { primary | backup1 | backup2 | backup3 } username USER password PASSWORD".
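The BindDN is an ordinary LDAP distinguished name built on the base DN of the domain. As a small sketch of how the two relate, the helper below derives a base DN from a DNS domain name and assembles a bind DN from it. The function names are mine, and the sample values are the lab values that appear in the log output later in this post.

```python
def domain_to_base_dn(domain):
    """Convert a DNS domain name such as 'ldaplab.abccompany.com'
    into an LDAP base DN made of dc= components."""
    labels = [label for label in domain.strip(".").split(".") if label]
    if not labels:
        raise ValueError("empty domain name")
    return ",".join(f"dc={label.lower()}" for label in labels)

def bind_dn(account_cn, container, domain):
    """Assemble a bind DN from an account CN, its container (for example
    'ou=people'), and the domain. No RFC 4514 escaping is done here, so
    this only suits simple names without special characters."""
    return f"cn={account_cn},{container},{domain_to_base_dn(domain)}"

print(domain_to_base_dn("ldaplab.abccompany.com"))
print(bind_dn("HiveAPBind", "ou=people", "ldaplab.abccompany.com"))
```

In a real directory the bind account may live in a different container, so treat the `ou=people` value as specific to this lab.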

Next, create a HiveAP AAA Server profile for the access points providing RADIUS service from the Advanced > Authentication > HiveAP AAA Server Settings section. Here you will define what EAP methods are supported, the certificate files to be used during EAP, database access settings, and NAS clients. For the database access settings, select the configured directory service type and previously configured AAA user directory profile. Enable RADIUS Server Credential Caching and set a cache lifetime which determines how long cached credentials will be used when the directory server(s) are not available. The remaining three interval timers determine how long an individual directory server is marked down before being retried (default 600 sec.), how long to use the local database if all directory servers are down before retrying (default 300 sec.), and how long to keep retrying an unresponsive remote directory server in the list before moving on to the next server in the list (default 30 sec.).
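To make the role of the cache lifetime concrete, here is a toy model of the behavior described above: authenticate against the directory when it is reachable, remember successful logins, and fall back to the cache only when the directory cannot be contacted and the entry is younger than the cache lifetime. This is my own simplified sketch, not Aerohive's implementation. The class and function names are hypothetical, and it leaves out the three retry timers entirely.

```python
import time

class CredentialCache:
    """Toy credential cache with a fixed lifetime (illustrative only)."""

    def __init__(self, lifetime_s=86400, clock=time.monotonic):
        self.lifetime_s = lifetime_s
        self._clock = clock
        self._entries = {}  # username -> (credential_hash, stored_at)

    def store(self, username, credential_hash):
        # Record (or refresh) the entry at the current time
        self._entries[username] = (credential_hash, self._clock())

    def check(self, username, credential_hash):
        entry = self._entries.get(username)
        if entry is None:
            return False
        cached_hash, stored_at = entry
        if self._clock() - stored_at > self.lifetime_s:
            # Entry outlived the cache lifetime: drop it and refuse
            del self._entries[username]
            return False
        return cached_hash == credential_hash


def authenticate(username, credential_hash, directory_lookup, cache):
    """Ask the directory first; use the cache only if it is unreachable.
    `directory_lookup` is a stand-in callable that returns True/False, or
    raises ConnectionError when the directory cannot be contacted."""
    try:
        ok = directory_lookup(username, credential_hash)
    except ConnectionError:
        return cache.check(username, credential_hash)
    if ok:
        cache.store(username, credential_hash)
    return ok
```

The point of the model is that a user who has never authenticated at the branch, or whose last successful login is older than the cache lifetime, cannot get on during an outage. That is worth keeping in mind when choosing the lifetime value.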

Enable Credential Caching in the HiveAP AAA Server Settings

In the NAS settings section be sure to configure the IP address of every NAS client AP that needs to use the HiveAP RADIUS server to authenticate clients, or configure an IP network object for broader access by multiple APs in the same subnet range.
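The choice between listing individual NAS addresses and using a network object comes down to a simple membership test. The sketch below illustrates it with Python's standard `ipaddress` module. The function name and the 10.108.30.0/24 subnet are assumptions for illustration, and this is not how the HiveAP RADIUS service is implemented internally.

```python
import ipaddress

def is_allowed_nas(source_ip, allowed_entries):
    """Return True if source_ip matches any allowed entry. An entry may be
    a single address ('10.108.30.18') or a network ('10.108.30.0/24')."""
    addr = ipaddress.ip_address(source_ip)
    for entry in allowed_entries:
        # A bare address is treated as a /32 (single-host) network
        network = ipaddress.ip_network(entry, strict=False)
        if addr in network:
            return True
    return False

# Per-AP entries: every NAS client AP must be listed individually
print(is_allowed_nas("10.108.30.18", ["10.108.30.18", "10.108.30.19"]))
# Network object: any AP in the (assumed) branch subnet is accepted
print(is_allowed_nas("10.108.30.50", ["10.108.30.0/24"]))
```

A network object saves maintenance as APs are added, at the cost of accepting RADIUS requests from any host in that range that knows the shared secret.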

Finally, create a AAA Client Settings profile which will be deployed to all of the other NAS client APs instructing them to contact the HiveAP AAA Server(s) for RADIUS authentication services. This is configured from the Advanced > Authentication > AAA Client Settings section.

Once all configuration is complete, assign the HiveAP AAA Server Settings profile to APs designated to be RADIUS servers from within individual AP configuration settings in the "Service Settings" section. Also embed the AAA Client Settings profile within SSID profile(s) as the assigned RADIUS server(s), and deploy configurations to all APs.

Verification
Once the configurations for HiveAP RADIUS servers and HiveAP NAS clients have been deployed to access points, verify directory integration by reviewing access point and directory server logs. Logs in both locations will show a successful bind to the LDAP server and computer object creation.

Active Directory Logs Showing Successful Bind, and HiveAP Computer Object Creation

Next, verify correct credential caching operation by performing client authentication while directory services are available, and then re-testing against the local AP cache once directory services are unavailable. Issue the "show auth" command to view currently authenticated clients as well as the cache entries created. Below we can see one current user session followed by an entry for the same client in the local cache table.

HiveAP02#show auth
Authentication Entities:
if=interface; UID=User profile group ID; AA=Authenticator Address;
idx=index; PMK=Pairwise Master Key; PTK=Pairwise Transient Key;
GMK=Group Master Key; GTK=Group Transient Key;

if=wifi1.1; idx=11; AA=0019:7729:57a0; SSID=HiveMind; default-UID=0;
PTK-rekey=0; GTK-rekey=0; GMK-rekey=0; strict=no; preauth=no; replay-window=0;
proactive-pmkid-response=disabled;
Reauth-period=n/a; PTK-timeout=4000; PTK-retry=4; GTK-timeout=4000; GTK-retry=4;
Local-cache-timeout=86400
Protocol-suite=WPA2-AES-802.1X;

No. Supplicant     UID PMK   PTK   Life  State   Reauth-itv Cipher    User-Name
--- -------------- --- ----- ----- ----- ------- ---------- --------- --------------------
0   0019:7e9d:871d 0   4f76* ec43* -1    done    0          WPA2/CCMP HiveUser

Local Cache Table:
No. Supplicant     UID PMK   PMKID Life  TLC   User-Name
--- -------------- --- ----- ----- ----- ----- --------------------
0   0019:7e9d:871d 0   4f76* 0850* -1    86232 HiveUser


Issue the same command after disconnecting the client and the current session entry will be removed, but the local cache table entry will remain (up to the configured cache lifetime).

HiveAP02#show auth
Authentication Entities:
if=interface; UID=User profile group ID; AA=Authenticator Address;
idx=index; PMK=Pairwise Master Key; PTK=Pairwise Transient Key;
GMK=Group Master Key; GTK=Group Transient Key;

if=wifi1.1; idx=11; AA=0019:7729:57a0; SSID=HiveMind; default-UID=0;
PTK-rekey=0; GTK-rekey=0; GMK-rekey=0; strict=no; preauth=no; replay-window=0;
proactive-pmkid-response=disabled;
Reauth-period=n/a; PTK-timeout=4000; PTK-retry=4; GTK-timeout=4000; GTK-retry=4;
Local-cache-timeout=86400
Protocol-suite=WPA2-AES-802.1X;

Local Cache Table:
No. Supplicant     UID PMK   PMKID Life  TLC   User-Name
--- -------------- --- ----- ----- ----- ----- --------------------
0   0019:7e9d:871d 0   f7ad* b08f* -1    86386 HiveUser
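If you want to track cache entries across several APs, the Local Cache Table rows are easy to parse. The sketch below splits a row into named fields and estimates how long ago the entry was cached. It assumes the TLC column is the number of seconds the entry has left in the cache, counting down from Local-cache-timeout. That reading fits the values shown but is my interpretation, not documented behavior, and the function names are my own.

```python
def parse_cache_row(line):
    """Split one Local Cache Table data row into a dict of named fields.
    Column order follows the table header in the `show auth` output."""
    fields = line.split(None, 7)  # User-Name is last and kept whole
    if len(fields) != 8:
        raise ValueError(f"expected 8 columns, got {len(fields)}: {line!r}")
    no, supplicant, uid, pmk, pmkid, life, tlc, user_name = fields
    return {
        "no": int(no),
        "supplicant": supplicant,
        "uid": int(uid),
        "pmk": pmk,
        "pmkid": pmkid,
        "life": int(life),
        "tlc": int(tlc),
        "user_name": user_name.strip(),
    }

def seconds_since_cached(row, local_cache_timeout):
    """Estimate entry age, ASSUMING TLC counts down from the timeout."""
    return local_cache_timeout - row["tlc"]

row = parse_cache_row("0 0019:7e9d:871d 0 f7ad* b08f* -1 86386 HiveUser")
print(row["user_name"], seconds_since_cached(row, 86400))
```

Under that assumption, the entry above was created or refreshed 14 seconds before the command was run, against a Local-cache-timeout of 86400 seconds (24 hours).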


Finally, disable the back-end directory services and re-test client authentication using the credential cache on the access point. The following log output is in reverse chronological order (newest entry first). It shows the HiveAP AAA server unable to bind to the directory server at 10.22.33.44, then successfully authenticating the client "HiveUser" via RADIUS from the local credential cache. The NAS client address of 10.108.30.18 is the AAA server's own IP address, since this HiveAP is acting as both NAS and RADIUS server.

2011-05-10 10:11:20 notice ah_rt_sta_modify: 0019:7e9d:871d(ip=10.108.30.96)
2011-05-10 10:11:20 notice Station 0019:7e9d:871d is authenticated to 0019:7729:57a0 thru SSID HiveMind
2011-05-10 10:11:18 info [mesh]: set proxy : 0019:7e9d:871d 0019:7729:5780 wifi1.1 flag 0x1c03
2011-05-10 10:11:18 info set proxy route: 0019:7e9d:871d -> 0019:7729:5780 ifp wifi1.1 upid 0 flag 0x1c03 monitor(0/0) pkt/sec ok
2011-05-10 10:11:18 info receive event STA JOIN: 0019:7e9d:871d associate wifi1.1 upid 0 vlan 1 flag 0x00000000
2011-05-10 10:11:18 notice ah_rt_sta_add: 0019:7e9d:871d(ip=10.108.30.96)
2011-05-10 10:11:18 info [Auth]STA(0019:7e9d:871d) login to SSID(wifi1.1) by user_name=HiveUser
2011-05-10 10:11:18 notice ah_auth_radius_check_nas_ip: updated nas_ip(10.108.30.18)
2011-05-10 10:11:18 info radius_msg_get_vendor_attr: select vhdr(type=17, len=52)
2011-05-10 10:11:18 info radius_msg_get_vendor_attr: select vhdr(type=16, len=52)
2011-05-10 10:11:18 info RADIUS: The RADIUS server accepted user 'HiveUser' through the NAS at 10.108.30.18.
2011-05-10 10:11:18 info RADIUS: eap auth for STA=0019:7e9d:871d user=HiveUser successfully with type peap
2011-05-10 10:11:18 warn RADIUS: User 'LDAPLAB\HiveUser' do LDAP_search under baseDN 'dc=ldaplab,dc=abccompany,dc=com' from server '10.22.33.44' failed
2011-05-10 10:11:18 warn rlm_ldap: (re)connection attempt failed
2011-05-10 10:11:18 err rlm_ldap: cn=HiveAPBind,ou=people,dc=ldaplab,dc=abccompany,dc=com bind to 10.22.33.44:636 failed: Can't contact LDAP server
2011-05-10 10:11:16 notice ah_auth_radius_check_nas_ip: updated nas_ip(10.108.30.18)
2011-05-10 10:11:16 info RADIUS: eap auth for STA=0000:0000:0000 user=HiveUser successfully with type mschapv2
2011-05-10 10:11:16 warn RADIUS: User 'LDAPLAB\HiveUser' do LDAP_search under baseDN 'dc=ldaplab,dc=abccompany,dc=com' from server '10.22.33.44' failed
2011-05-10 10:11:16 warn rlm_ldap: (re)connection attempt failed
2011-05-10 10:11:16 err rlm_ldap: cn=HiveAPBind,ou=people,dc=ldaplab,dc=abccompany,dc=com bind to 10.22.33.44:636 failed: Can't contact LDAP server
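Because the log is newest-first, it can help to flip it into chronological order before looking for the pattern of interest. The sketch below does that and flags the case where an LDAP bind failure is followed by a RADIUS accept. Treating that sequence as evidence of a cache hit is my own heuristic based on the messages shown here, not a documented log signature, and the function names are hypothetical.

```python
from datetime import datetime

def parse_line(line):
    """Split a log line into (timestamp, level, message)."""
    date_s, time_s, level, message = line.split(None, 3)
    ts = datetime.strptime(f"{date_s} {time_s}", "%Y-%m-%d %H:%M:%S")
    return ts, level, message

def chronological(lines):
    """Return parsed entries oldest-first. The input is newest-first, so
    reverse it, then stable-sort by timestamp so that entries sharing the
    same second keep their true relative order."""
    parsed = [parse_line(line) for line in lines if line.strip()]
    parsed.reverse()
    parsed.sort(key=lambda entry: entry[0])
    return parsed

def accepted_after_bind_failure(lines):
    """Heuristic: True if an LDAP bind failure is later followed by a
    RADIUS accept, which suggests the local cache answered."""
    bind_failed = False
    for _ts, _level, message in chronological(lines):
        if "rlm_ldap" in message and "bind to" in message and "failed" in message:
            bind_failed = True
        elif bind_failed and "RADIUS server accepted user" in message:
            return True
    return False

# Three lines copied from the log output above (newest first)
sample = [
    "2011-05-10 10:11:18 info RADIUS: The RADIUS server accepted user 'HiveUser' through the NAS at 10.108.30.18.",
    "2011-05-10 10:11:18 warn rlm_ldap: (re)connection attempt failed",
    "2011-05-10 10:11:18 err rlm_ldap: cn=HiveAPBind,ou=people,dc=ldaplab,dc=abccompany,dc=com bind to 10.22.33.44:636 failed: Can't contact LDAP server",
]
print(accepted_after_bind_failure(sample))
```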

Therefore, using the Aerohive credential caching feature, EAP authentication services can be maintained during a WAN outage or central site service disruption. This feature bridges the gap for branch sites, allowing continued WLAN network access by clients during temporary service disruption. The cache lifetime setting dictates the tolerable duration of a service outage.

Revolution or Evolution? - Andrew's Take
Industries from retail and education to hospitality, healthcare, and transportation rely on mission-critical wireless networks to run their businesses. They expect a highly secure and highly available network at reasonable cost and with minimal complexity, and they won't tolerate trade-offs between these features. The status quo from most WLAN vendors is to provide basic RADIUS integration for client authentication. Aerohive has gone further, integrating native LDAP and Kerberos functionality to provide user credential caching, which enables a highly available WLAN without compromising security to get there.

Aerohive isn't rewriting the book on RADIUS, LDAP or Kerberos. These are existing, mature protocols. However, Aerohive has applied them in a new way that can dramatically improve WLAN availability and provide tremendous benefits for distributed organizations with branch offices.

Cheers,
Andrew



6 comments:

  1. Andrew,

    Thank you for blogging about Aerohive's LDAP credential caching. Aerohive is best-known for our controller-free architecture, however, the credential caching at the edge is a feature that also differentiates us from other Wi-Fi vendors. Credential caching has become very popular with Aerohive customers with multiple branch locations. I will be referring customers and partners alike to your excellent tutorial.

    Thanks so much!

  2. Joe Fraher - Manager of Tech Pubs - Aerohive, May 26, 2011 at 10:23 AM

    Seriously, Andrew, if you are interested, there is currently an open position in our tech pubs group: http://www.aerohive.com/company/careers/STWI_us.html. :) Great explanation!

  3. Motorola claims to have RADIUS caching of credentials for quite a long time. Can you run a comparison please?

  4. Is there a difference between Aerohive Credential Caching and Cisco WDS (Cisco SWAN http://www.cisco.com/en/US/docs/wireless/technology/swan/deployment/guide/swandg.html)?

    The whole Aerohive architecture looks to me like SWAN 2.0.

  5. Yes, there are similarities to the Cisco SWAN architecture. However, the Aerohive Cooperative Control architecture and related protocols go much further than Cisco SWAN. BTW - Cisco SWAN is also no longer a customer shipping product, because the WLSE is end of life.

    Here are a few differences:
    - Aerohive CC runs on every AP and dynamically discovers other nearby APs, both over the air and over the wire. The Cisco WLCCP protocol required manual group definition, and discovery worked only over the wire on the same Layer 2 VLAN.

    - Aerohive CC incorporates dynamic tunneling between APs for Layer 3 client roaming. Cisco SWAN provides no such service. Cisco tried to create such a service with their WLSM module, but it never took off and has been end of support for a long time now.

    - Aerohive CC implements many other control features that are shared between all APs discovered in the same Hive, including client session state, client QoS policy, client security policy, automatic radio channel and power coordination, and fast roaming (PMK, OKC). Cisco SWAN does not perform most of these tasks. The only one that comes close is dynamic radio management with the WLSE, but it is not nearly as robust or real-time.

    I'm sure that I'm missing some other key differences, but you get the idea. I'll let some of the Aerohive folks comment on items I may have missed :)

    All in all, Cisco SWAN was actually a really innovative idea. Unfortunately, Cisco got caught up in the controller craze of the early 2000s and dropped SWAN product development. It will be interesting to see what Cisco does, since the market is moving back to intelligent edge access points.

    Cheers,
    Andrew

  6. AndrewHive that has a nice BUZZ to it! :-)
