Tuesday, February 21, 2012

Mesh Network Performance Impact

I've been "noodling" (yes, a scientific term, I assure you) on Wi-Fi mesh networking of late. There is a "rule-of-thumb" often quoted in Wi-Fi circles that goes something like this:

For every mesh hop (after the first hop), throughput degrades by 50% [or more] per hop.

Not being the type to take statements like that at face value, I dug in and did a bit of research. The statement, on its face, is very plausible. Given a single-radio mesh backhaul where all ingress and egress traffic must share the same frequency, throughput will degrade because each frame transmission must be repeated to the next hop on the same channel.

Note - I will not discuss multi-radio mesh backhaul solutions that use separate radios and frequencies for simultaneous ingress and egress transmission within the mesh network. These solutions are more mature and do not result in such dramatic throughput degradation.

This is very logical, but in my estimation, also a bit misleading.

My line of thought went something like this:

Yes, the 2nd mesh hop results in a 50% throughput degradation because each frame must be transmitted a second time across the backhaul from the 1st hop to the 2nd hop. Hence 50% loss.

But then we get to larger mesh networks, which have 3 or more hops. If a frame must be transmitted across 3 hops with all backhaul radios on the same frequency, then the total throughput should be 1/n of the original, where 'n' represents the number of mesh hops the frame must traverse. The corresponding throughput degradation is 1 - 1/n. For 3 hops, this works out to 1/3, or about 33% of original throughput, which is roughly 67% degradation.

However, the rule-of-thumb dictates that throughput degradation in a 3 hop mesh network would be 50% at the 2nd hop and another 50% at the 3rd hop, for a net 1/4 or 25% of original throughput, resulting in 75% degradation.

This is the gap in analysis that prompted my research. What should engineers be using as planning values for single-radio backhaul mesh networks? 50% loss per hop, or 1 - 1/n loss based on the total number of hops a frame must traverse?

Both Calculations are Valid [based on your assumptions]
After researching it, I found that both throughput models are technically accurate, depending on the assumptions made during the analysis and design.

Single-Radio Mesh Backhaul Bandwidth Degradation
(courtesy of Strix Systems)

The best-case scenario is 1 - 1/n degradation. This assumes the mesh hops are deployed linearly and can only hear their two neighboring mesh nodes, one upstream and one downstream.

The worst-case scenario is 1 - 1/( 2^(n-1) ) degradation, which is 50% loss per hop (our rule-of-thumb). This assumes that the mesh network is designed as more of a web of mesh nodes that can hear three, four, or more neighboring nodes for failover and self-healing capabilities.
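To make the gap between the two models concrete, here is a minimal Python sketch that tabulates both formulas side by side. The function names are my own for illustration, and the numbers are purely theoretical; they ignore frame overhead, retransmissions, and everything else a real network adds.

```python
def best_case_fraction(hops: int) -> float:
    """Fraction of original throughput remaining under the 1/n (linear) model."""
    if hops < 1:
        raise ValueError("hops must be at least 1")
    return 1.0 / hops


def worst_case_fraction(hops: int) -> float:
    """Fraction remaining under the 1/(2^(n-1)) model (50% loss per extra hop)."""
    if hops < 1:
        raise ValueError("hops must be at least 1")
    return 1.0 / (2 ** (hops - 1))


if __name__ == "__main__":
    # Print remaining throughput and degradation for 1 through 5 hops.
    for n in range(1, 6):
        best = best_case_fraction(n)
        worst = worst_case_fraction(n)
        print(
            f"{n} hop(s): best case {best:.0%} remaining ({1 - best:.0%} loss), "
            f"worst case {worst:.0%} remaining ({1 - worst:.0%} loss)"
        )
```

The two models agree for 1 and 2 hops and only start to diverge at the 3rd hop, which is exactly where my original question came from.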

Using either method, it's clear that single-radio backhaul mesh networks result in substantial throughput degradation and should only be used when alternatives such as wired networking or traditional wireless access point deployments cannot be used.

Best-Case Mesh Throughput Degradation
(courtesy of Strix Systems)

Although the single-radio mesh backhaul rule-of-thumb gives the desired impression of performance degradation when mesh solutions are discussed, it should be used cautiously for explaining mesh performance issues from a conceptual point of view.

However, for real-world network planning, use either method as a starting point, and then verify through testing in your environment.
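As a starting point for that kind of planning, the two formulas can be turned into a rough throughput range per hop count. This is only a sketch: the function name is hypothetical, and the 20 Mbps first-hop figure is a made-up input, not a measurement. Substitute a value you have actually measured on your own backhaul.

```python
from typing import Tuple


def planning_range_mbps(first_hop_mbps: float, hops: int) -> Tuple[float, float]:
    """Return (worst_case, best_case) theoretical throughput in Mbps.

    worst case: first_hop_mbps / 2^(hops - 1)   (50% loss per extra hop)
    best case:  first_hop_mbps / hops           (1/n model, linear deployment)
    """
    if hops < 1:
        raise ValueError("hops must be at least 1")
    worst = first_hop_mbps / (2 ** (hops - 1))
    best = first_hop_mbps / hops
    return worst, best


if __name__ == "__main__":
    # Assumed, illustrative first-hop throughput; replace with a measured value.
    first_hop = 20.0
    for n in range(1, 5):
        worst, best = planning_range_mbps(first_hop, n)
        print(f"{n} hop(s): plan for roughly {worst:.1f} to {best:.1f} Mbps")
```

Treat the output as a theoretical bracket, not a prediction. Real results still need to be verified by testing in your environment.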

Now you know :)



  1. Andrew,

    It could potentially be even worse... what if multiple mesh nodes could see each other on the same channel - whenever one talks... since they are all in a contention domain - they'd all have to stop and wait for the packet to finish.

    With single-radio mesh - after one hop, it just starts to get ugly.

    By the way - thanks for linking to my blog post.


  2. Hi Keith,
    From what I have read, my understanding is that the "worst-case" scenario listed above, with 50% per-hop degradation, accounts for multiple mesh nodes using the same backhaul channel.

    Now, that doesn't include other variables that could decrease performance further, such as frame overhead, fragmentation, re-transmissions, etc.

    But, I have only deployed small mesh networks, usually for "coverage" not "capacity". So, I've never done rigorous testing of these figures. If you have real-life examples, I would love to hear about your experiences!