Introduction

Edge AI is enabling capabilities that weren’t previously possible by putting AI functionality directly on smartphones, IoT and other mobile devices, industrial machines, robots, vehicles, and servers located where the data is generated and used. This allows immediate feedback based on real-time interaction with the local environment, eliminating the latency involved in transmitting data to a distant cloud for processing.

Most of us already use edge AI. For example, virtual assistants like Amazon Alexa rely on it to respond quickly to voice commands. When you ask Alexa to turn on the lights or play music, the wake word is detected on the device itself, and many common requests can be handled locally to minimize delay; information is sent to the cloud only when needed.

In addition to reducing latency, key advantages of edge AI include:

  • Improved Privacy and Security – Since data can be processed locally without being transmitted over the internet, edge AI offers enhanced data privacy and security, and reduces vulnerabilities from potential data breaches of cloud-based systems. This is particularly important for sensitive applications like healthcare or personal data management.
  • Reduced Bandwidth Requirements – By minimizing the need to send large volumes of data to the cloud, edge AI reduces bandwidth requirements, which is especially valuable in areas with limited or expensive internet connectivity.
  • Operation in Remote Areas –  Edge AI systems can function with limited or no connectivity, enabling them to operate continuously. This is critical for remote locations or sensitive real-time systems where an internet connection isn’t always available or advisable.
  • Energy and Cost Efficiency – Processing data locally can be more energy-efficient and cost-effective, especially for large-scale deployments of IoT devices.
  • Scalability – As AI-enabled devices proliferate, edge AI will be crucial for scaling up and managing all the data and models in a sustainable way. Sending all data to the cloud is not feasible long-term, though edge AI applications may still rely on the cloud for some degree of processing and model training, particularly in hybrid cloud/edge deployment scenarios.

In theory, edge deployments should also help to minimize the performance impact of packet delay variation (PDV), more commonly referred to as jitter, as fewer hops are needed between different points on the network. However, theory and reality don’t always line up.  In some ways, jitter can be even more prevalent at the edge than in a centralized cloud. There are four main reasons for this: (1) the behavior of AI and other applications typically deployed at the edge; (2) the application architectures employed; (3) the virtualized server environments often used for deployment; and (4) the nature of the Wi-Fi, 5G and satellite networks edge AI applications typically rely on.
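
To make the metric concrete, here is a minimal Python sketch of how jitter is commonly quantified, using the smoothed interarrival jitter estimator from RFC 3550; the timestamps and delays are illustrative, not measurements from any particular network.

```python
# Minimal sketch: estimating jitter (packet delay variation) from packet timestamps,
# using the smoothed interarrival jitter estimator described in RFC 3550.
# The timestamps below are illustrative, not measurements from a real network.

def interarrival_jitter(send_times, recv_times):
    """Return the running RFC 3550 jitter estimate (same units as the inputs)."""
    jitter = 0.0
    prev_transit = None
    for sent, received in zip(send_times, recv_times):
        transit = received - sent              # one-way transit time of this packet
        if prev_transit is not None:
            d = abs(transit - prev_transit)    # delay variation vs. previous packet
            jitter += (d - jitter) / 16.0      # exponential smoothing per RFC 3550
        prev_transit = transit
    return jitter

# Packets sent every 20 ms; one-way delays vary between 5 and 35 ms.
send_times = [i * 0.020 for i in range(6)]                # seconds
delays     = [0.005, 0.030, 0.010, 0.035, 0.008, 0.025]   # variable one-way delay
recv_times = [s + d for s, d in zip(send_times, delays)]

print(f"estimated jitter: {interarrival_jitter(send_times, recv_times) * 1000:.2f} ms")
```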

Edge AI Application Behavior and Architecture

Real-time and near real-time IoT and other mobile device applications running at the edge are jitter generators.  They can transmit data in unpredictable bursts with variable payload sizes, resulting in irregular transmission and processing times. These effects are multiplied as devices move around, and more devices are added to a network. 

Moreover, many edge AI applications are composed of containerized microservices distributed across multiple servers at cloud and edge locations in a hybrid deployment model: some components of the AI application run at the edge to perform initial analysis and ensure fast response times, while others are deployed in the cloud for tasks that require more processing power or storage capacity. The edge handles immediate local processing needs, while the cloud provides the resources for scaling up the application. A practical example is a video surveillance system where initial video processing and analysis such as motion detection is done at the edge, while long-term storage, data aggregation, and more complex analysis such as facial recognition are handled in the cloud.
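
As a rough illustration of this split (not a description of any particular product), the following Python sketch runs a cheap motion check at the edge and forwards frames to a hypothetical cloud service only when something changes; the function names, threshold, and frame data are placeholders.

```python
# Minimal sketch of the hybrid split described above: lightweight motion detection runs
# at the edge, and frames are forwarded to a cloud service only when needed.
# The functions, threshold, and endpoint below are hypothetical placeholders.

import numpy as np

MOTION_THRESHOLD = 12.0   # mean absolute pixel difference that counts as "motion" (tunable)

def detect_motion(prev_frame: np.ndarray, frame: np.ndarray) -> bool:
    """Cheap edge-side check: compare consecutive grayscale frames."""
    diff = np.abs(frame.astype(np.int16) - prev_frame.astype(np.int16))
    return diff.mean() > MOTION_THRESHOLD

def send_to_cloud(frame: np.ndarray) -> None:
    """Placeholder for an upload to a cloud analysis service (e.g., facial recognition)."""
    print(f"uploading {frame.nbytes} bytes for deeper analysis...")

def process_stream(frames) -> None:
    prev = None
    for frame in frames:
        if prev is not None and detect_motion(prev, frame):
            send_to_cloud(frame)        # only interesting frames leave the edge
        prev = frame

# Simulated 8-frame grayscale stream with a burst of "motion" in the middle.
rng = np.random.default_rng(0)
frames = [np.full((120, 160), 100, dtype=np.uint8) for _ in range(8)]
frames[4] = rng.integers(0, 255, size=(120, 160), dtype=np.uint8)
process_stream(frames)
```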

While the hybrid model makes sense for many AI applications, it also increases the number of network hops between cloud and edge environments. In addition, some AI models adapt in real time to improve responses based on new data and interactions as they occur. This leads to unpredictable changes in packet transmission rates, driven by frequent synchronization of data models and configurations across edge and cloud components to maintain consistency and reliability.

Virtualization’s Impact

Jitter resulting from AI application behavior is compounded by the virtualized server environments these applications often run in, at the edge as well as in the cloud.

Competition between hosted applications for virtual and physical CPU, memory, storage and network resources creates random delays. This resource competition also drives VM scheduling conflicts and hypervisor packet delays that don’t necessarily go away when applications are container-based, since containers are often deployed in VMs for security and manageability.  In addition, data movement between virtual and physical subnets relies on cloud network overlays such as VXLAN and GRE that introduce packet encapsulation/decapsulation delays, adding still more jitter.
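
To illustrate the encapsulation cost, the sketch below tallies the nominal per-packet header overhead that VXLAN and basic GRE add (assuming IPv4 outer headers with no options). Every one of these extra bytes must be added and later stripped in software, which consumes CPU and contributes variable delay, and the smaller the payload, the larger the relative penalty.

```python
# Rough illustration of the per-packet overhead added by common overlay encapsulations.
# Header sizes are nominal (IPv4, no options); real deployments vary.

OUTER_ETHERNET = 14   # bytes
OUTER_IPV4     = 20
OUTER_UDP      = 8
VXLAN_HEADER   = 8
GRE_HEADER     = 4    # base GRE header, no optional fields

VXLAN_OVERHEAD = OUTER_ETHERNET + OUTER_IPV4 + OUTER_UDP + VXLAN_HEADER   # 50 bytes
GRE_OVERHEAD   = OUTER_IPV4 + GRE_HEADER                                  # 24 bytes

def effective_payload_share(payload_bytes: int, overhead: int) -> float:
    """Fraction of each encapsulated packet that is original payload."""
    return payload_bytes / (payload_bytes + overhead)

for payload in (1400, 512, 64):   # large, medium, and small (telemetry-sized) packets
    vx  = effective_payload_share(payload, VXLAN_OVERHEAD)
    gre = effective_payload_share(payload, GRE_OVERHEAD)
    print(f"{payload:5d}-byte payload: VXLAN efficiency {vx:.1%}, GRE efficiency {gre:.1%}")
```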

Edge AI Networks

Another contributing factor is the nature of the networks edge AI applications typically use. Wi-Fi networks at the edge are often subject to fading and RF interference that result in jitter. In addition, edge AI applications increasingly use 5G networks to take advantage of the high data volumes and low latency they support. 5G’s smaller cells, higher frequencies and mmWave technology have poorer propagation characteristics than LTE, causing signals to fade in and out. mmWave signals in particular require a clear line-of-sight path between transmitter and receiver. Any obstacle can cause signals to be reflected, refracted, or diffracted, resulting in multiple signal paths with different lengths and different transmission times, leading to variation in packet delivery times.

5G networks can use various technologies to address these sources of jitter, such as beamforming, which directs the signal more precisely toward the receiver, and MIMO (Multiple Input Multiple Output), which uses multiple antennas at both the transmitter and receiver to improve signal quality and reduce the effects of multipath interference. However, these technologies only mitigate jitter’s impact; they don’t eliminate it.

Moreover, 5G’s small cell architecture has much heavier infrastructure requirements than LTE. This has driven many network providers to the cloud to reduce costs. However, the shift to cloud-native 5G networks adds the virtualization jitter described above on top of the jitter driven by 5G’s architecture.

Starlink and other satellite networks used by edge AI applications at remote sites that lack reliable terrestrial network connectivity can also struggle with jitter caused by:

  • Propagation Delay – Satellite communication involves signals traveling long distances, typically from the Earth’s surface to satellites in orbit and back. The sheer distance these signals must cover inherently introduces a significant propagation delay, and because the satellites are constantly moving, that delay varies over time (see the sketch after this list).
  • Atmospheric Conditions – Variations in atmospheric conditions, such as rain, snow, or heavy clouds, can affect signal strength and quality, leading to fluctuations.
  • Network Congestion – High traffic volumes can cause delays and packet loss, contributing to jitter. This is especially relevant for satellite networks like Starlink, which aim to provide high-speed internet access to large numbers of users, including those in remote and underserved areas.
  • Satellite Handovers – As satellites move across the sky, user terminals may need to switch communication from one satellite to another. These handovers can introduce variations in latency.
  • Physical Obstructions – Obstacles like buildings, trees, or terrain can interfere with the line-of-sight needed for satellite communication, causing signal disruptions and variations in latency.
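
A back-of-the-envelope calculation shows why propagation delay, and its variation as satellites move, matters. The sketch below assumes a satellite directly overhead, so the figures are lower bounds that grow at lower elevation angles.

```python
# Back-of-the-envelope propagation delay for satellite links at different altitudes.
# Assumes a straight up-and-down path (satellite directly overhead); real paths are
# longer at low elevation angles, so these figures are lower bounds.

SPEED_OF_LIGHT_KM_S = 299_792  # km/s (vacuum; close enough for this estimate)

def round_trip_ms(altitude_km: float) -> float:
    """Ground -> satellite -> ground propagation delay, in milliseconds."""
    return (2 * altitude_km / SPEED_OF_LIGHT_KM_S) * 1000

for name, altitude in [("Starlink LEO (~550 km)", 550),
                       ("Typical MEO (~8,000 km)", 8_000),
                       ("GEO (~35,786 km)", 35_786)]:
    print(f"{name:26s} minimum round trip: {round_trip_ms(altitude):7.1f} ms")
```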

Jitter’s Serious Knock-On Effect

Jitter has a far more serious knock-on effect on network throughput and application performance than the latency-inducing random delays outlined above. This knock-on effect can render AI applications, especially those requiring real-time or near-real-time responsiveness, virtually unusable, and even dangerous if critical systems are involved.

TCP, the transport protocol widely used by applications that require guaranteed packet delivery, as well as by public cloud services such as AWS and Microsoft Azure, consistently treats jitter as a sign of congestion. To prevent data loss, TCP responds by retransmitting packets and throttling traffic, even when plenty of bandwidth is available. Even modest amounts of jitter can cause throughput to collapse and applications to stall. And not only TCP traffic is affected. For operational efficiency, applications using TCP generally share the same network infrastructure, and compete for bandwidth and other resources, with applications using UDP and other protocols. More bandwidth than would otherwise be needed is often allocated to applications using TCP to compensate for its reaction to jitter, especially under peak load. This means bandwidth that could be available for applications using UDP and other protocols is wasted, and the performance of all applications sharing the network suffers.
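
The scale of the problem can be seen with the classic Mathis et al. approximation for the steady-state throughput of loss-based TCP, rate ≈ (MSS / RTT) × (C / √p). When jitter triggers spurious retransmissions, the congestion control algorithm behaves as if the effective loss rate p were higher, and the achievable ceiling collapses even though link capacity hasn’t changed. The parameter values in the sketch below are illustrative.

```python
# Illustrative use of the Mathis et al. approximation for the steady-state throughput
# ceiling of loss-based TCP:  rate ≈ (MSS / RTT) * (C / sqrt(p)).
# Jitter-induced spurious retransmissions act like a higher effective loss rate p,
# collapsing the ceiling even though link capacity is unchanged. Values are illustrative.

from math import sqrt

def mathis_throughput_mbps(mss_bytes: int, rtt_s: float, loss_rate: float, c: float = 1.22) -> float:
    """Approximate TCP throughput ceiling in Mbps."""
    return (mss_bytes * 8 / rtt_s) * (c / sqrt(loss_rate)) / 1e6

MSS = 1460      # bytes
RTT = 0.030     # 30 ms round-trip time

for p in (0.0001, 0.001, 0.01):   # effective loss / retransmission-signal rate
    print(f"effective loss rate {p:>6.2%}: ~{mathis_throughput_mbps(MSS, RTT, p):7.1f} Mbps ceiling")
```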

This response to jitter is triggered in the network transport layer by TCP’s congestion control algorithms (CCAs). However, the solutions network administrators typically use to address performance problems caused by jitter, such as increasing bandwidth and adding jitter buffers, have no impact on how TCP’s CCAs respond to jitter, and in some cases make the response worse:

  • Increasing bandwidth is just a temporary fix; as network traffic grows to match the added bandwidth, the incidence of jitter-induced throughput collapse increases in tandem, leading to yet another round of increasingly expensive and disruptive upgrades.
  • Jitter buffers, commonly used to mitigate jitter’s effect on network and application performance, can sometimes exacerbate the issue. Jitter buffers work by reordering and realigning packets for consistent timing before delivering them to an application. However, this reordering and realignment introduces additional, often random, delays, which can worsen jitter and negatively impact performance for real-time applications like live video streaming.
  • QoS techniques can offer some benefit by prioritizing packets and controlling the rate of data transmission for selected applications and users. But performance tradeoffs will be made, and QoS does nothing to alter TCP’s behavior in response to jitter. In some cases, implementing QoS adds jitter, because packet prioritization can create variable delays for lower priority application traffic.
  • TCP optimization solutions that do focus on the CCAs rely on techniques such as increasing the size of the congestion window, using selective ACKs, and adjusting timeouts. However, improvements are limited, generally in the range of 10-15%, because these solutions, like all the others, don’t address the fundamental problem: TCP’s CCAs have no way to determine whether jitter is due to congestion or to other factors like application behavior, virtualization, or wireless network issues (see the sketch after this list).
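
The sketch below illustrates the last point with the standard retransmission timeout calculation from RFC 6298: the timeout is driven entirely by the measured round-trip time and its variance, so jitter from any source inflates it, slowing detection of real losses, while a sudden delay spike that exceeds the current timeout still triggers a spurious retransmission and a congestion response. The RTT samples are illustrative.

```python
# Sketch of why timeout tuning alone can't fix the problem: the standard retransmission
# timeout (RFC 6298) is driven entirely by measured RTT and its variance, so jitter from
# any source inflates it, while a delayed-but-undropped packet can still exceed the
# current timeout and trigger a spurious retransmission. RTT samples are illustrative.

ALPHA, BETA, K = 1/8, 1/4, 4   # standard RFC 6298 constants

def update_rto(samples_s):
    """Run the RFC 6298 SRTT/RTTVAR/RTO update over a list of RTT samples (seconds)."""
    srtt = samples_s[0]
    rttvar = samples_s[0] / 2
    for r in samples_s[1:]:
        rttvar = (1 - BETA) * rttvar + BETA * abs(srtt - r)   # variance first, per the RFC
        srtt = (1 - ALPHA) * srtt + ALPHA * r
    # RTO = SRTT + K * RTTVAR  (the RFC also imposes a minimum RTO, omitted here)
    return srtt, rttvar, srtt + K * rttvar

steady  = [0.030] * 10                                   # stable 30 ms RTT
jittery = [0.030, 0.080, 0.025, 0.090, 0.030, 0.075,
           0.028, 0.085, 0.030, 0.070]                   # highly variable RTT

for label, samples in (("steady", steady), ("jittery", jittery)):
    srtt, rttvar, rto = update_rto(samples)
    print(f"{label:8s} SRTT={srtt*1000:5.1f} ms  RTTVAR={rttvar*1000:5.1f} ms  RTO={rto*1000:6.1f} ms")
```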

Clearly, TCP’s consistent treatment of jitter as a sign of congestion is not a trivial problem to overcome. MIT researchers recently cited TCP’s CCAs as having a significant and growing impact on network performance because of their response to jitter, but were unable to offer a practical solution.1 TCP’s CCAs would have to be modified or replaced to remove the bottleneck created by their inability to differentiate between jitter caused by congestion and jitter caused by other factors. However, to be acceptable and to scale in a production environment, a viable solution can’t require any changes to the TCP stack itself, or to any client or server applications that rely on it. It must also co-exist with ADCs, SD-WANs, VPNs and other network infrastructure already in place.

There Is a Proven and Cost-Effective Solution

Only Badu Networks’ patented WarpEngine™ carrier-grade optimization technology meets the key requirements outlined above for eliminating jitter-induced throughput collapse. WarpEngine’s single-ended transparent proxy architecture means no modifications to client or server applications or network stacks are required. It works with existing network infrastructure, so there’s no rip-and-replace. WarpEngine determines in real time whether jitter is due to congestion, and prevents throughput from collapsing and applications from stalling when it’s not. As a result, bandwidth that would otherwise be wasted is recaptured. WarpEngine builds on this with other performance and security enhancing features that benefit not only TCP, but also GTP, UDP and other traffic. These capabilities enable WarpEngine to deliver massive network throughput improvements, ranging from 2-10x or more, for some of the world’s largest mobile network operators, cloud service providers, government agencies and businesses of all sizes,2 at a small fraction of the cost of upgrades. WarpEngine’s gains are so large in great part because of the huge and rapidly growing impact jitter is having on today’s network environments, coupled with the lack of viable alternatives.

WarpEngine can be deployed at core locations as well as the network edge, either as a hardware appliance or as software installed on a server provided by the customer or partner. It can be installed in a carrier’s core network, or in front of hundreds or thousands of servers in a corporate or cloud data center. WarpEngine can also be deployed at cell tower base stations, or with access points supporting public or private Wi-Fi networks of any scale. Enterprise customers can implement WarpEngine on-prem with their Wi-Fi access points, or at the edge of their networks between the router and the firewall, for dramatic WAN, broadband and FWA throughput improvements.

WarpVM™, the VM form factor of WarpEngine, is designed specifically for cloud and virtualized edge environments where AI and other applications are deployed. WarpVM installs in minutes in AWS, Azure, VMware, or KVM environments. WarpVM has also been certified by Nutanix™ for use with their multicloud platform, achieving performance results similar to those cited above.3 AHV enables virtualization for Nutanix’s multicloud platform, and supports their recently announced GPT-in-a-Box™ AI solution.

Conclusion

As AI, IoT, AR, VR and similar applications combine with 5G and other new network technologies to drive innovation and transformation at the edge, jitter-related performance issues will only grow. WarpEngine is the only network optimization solution that tackles TCP’s reaction to jitter head-on at the transport layer, and incorporates other performance enhancing features that benefit not only TCP, but also GTP, UDP and other traffic. By deploying WarpEngine in the form factor that best suits your use case, you can ensure your edge AI applications and the networks they rely on always operate at their full potential.

To learn more and request a free trial, click the button below. 

Notes

1. Starvation in End-to-End Congestion Control, August 2022: https://people.csail.mit.edu/venkatar/cc-starvation.pdf

2. Badu Networks Performance Case Studies: https://www.badunetworks.com/wp-content/uploads/2022/11/Performance-Case-Studies.pdf

3. https://www.nutanix.com/partners/technology-alliances/badu-networks