Pliops claims its XDP LightningAI card and FusIOnX software accelerate large language model inference by offloading context data to SSDs, reducing redundant computation, and boosting vLLM throughput by up to eight times while avoiding the need for additional GPUs.