Delivering Distributed AI at the Edge with Bridge

· 6 min read
Amar Kapadia
Product
Sriram Rupanagunta
Engineering

Not all AI is created equal. Centralized inference serves use cases where long thinking times are acceptable, but newer use cases such as physical AI, real-time agentic AI chatbots, digital avatars holding real-time dialog, and computer vision require faster response times. Network latency is only part of the problem: compute latency matters as well. Together, these constraints mandate computation closer to data sources and lower bandwidth usage across the network in order to scale cost-effectively.

These applications can't tolerate the latency of round trips to centralized data centers, nor can they afford the cost of constantly transferring large volumes of data. Instead, they require inference that is geographically distributed, dynamically orchestrated, and tightly optimized for latency and bandwidth.