Engine-Agnostic Model Hot-Swapping for Cost-Effective LLM Inference
Radostin Stoyanov, Viktória Spišaková, Adrian Reber, Wesley Armour, Marcin Copik and Rodrigo Bruno
7th International Workshop on Containers and New Orchestration Paradigms for Isolated Environments in HPC (CANOPIE-HPC)