Towards On-the-Fly Snapshot Memory Compression for Low-Latency Elastic Inference Serving Systems

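Recommended citation: Radostin Stoyanov, Viktória Spišaková, Adrian Reber, Andrei Vagin, and Rodrigo Bruno. "Towards On-the-Fly Snapshot Memory Compression for Low-Latency Elastic Inference Serving Systems." 6th Workshop on Machine Learning and Systems (EuroMLSys).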
