Distributed vLLM — Multi-Node LLM Serving on DGX Spark and Jetson Thor

A guide to running vLLM across multiple nodes, including DGX Spark and Jetson Thor systems, for distributed large language model serving.

Full tutorial: Distributed vLLM

What You’ll Learn

  • Setting up a Ray cluster across DGX Spark and Jetson Thor nodes
  • Configuring NCCL environment variables for multi-node communication
  • Serving large models like Nemotron Super 120B across distributed GPU resources
  • Tuning network configuration for performance
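The steps above can be sketched as a command sequence. This is a minimal, hypothetical outline of the usual vLLM-on-Ray multi-node flow, not the tutorial's exact commands: the head-node IP, network interface name, model ID, and parallelism sizes are placeholders you would replace for your own cluster.

```shell
# Hypothetical sketch; IPs, interface names, and the model ID are
# placeholders, not values from the tutorial.

# On the head node (e.g. the DGX Spark), start the Ray cluster:
ray start --head --port=6379

# On each worker node (e.g. a Jetson Thor), join the cluster:
ray start --address=HEAD_NODE_IP:6379

# On every node, point NCCL (and Gloo) at the interface that carries
# cluster traffic so collectives bind to the right NIC:
export NCCL_SOCKET_IFNAME=eth0   # substitute your interface
export GLOO_SOCKET_IFNAME=eth0

# From the head node, serve the model with Ray as the distributed
# executor; tensor x pipeline parallel sizes should equal the total
# GPU count across the cluster:
vllm serve MODEL_ID \
  --tensor-parallel-size 2 \
  --pipeline-parallel-size 2 \
  --distributed-executor-backend ray
```

With this layout, tensor parallelism typically splits a layer across the GPUs within a node, while pipeline parallelism spans the nodes, keeping the heaviest collectives off the slower inter-node link.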