DeepSpeed
DeepSpeed enables the world's most powerful language models, such as MT-530B and BLOOM. It is an easy-to-use deep learning optimization software suite that powers unprecedented scale and speed for both training and inference. With DeepSpeed you can:

- Train or run inference on dense or sparse models with billions or trillions of parameters
- Achieve excellent system throughput and scale efficiently to thousands of GPUs
- Train or run inference on resource-constrained GPU systems
- Achieve unprecedented low latency and high throughput for inference
- Achieve extreme compression for unparalleled reductions in inference latency and model size at low cost