Cloud Computing

Spotnik: Designing Distributed Machine Learning for Transient Cloud Resources

To achieve higher utilisation, cloud providers offer VMs with GPUs as lower-cost transient cloud resources. Transient VMs can be revoked at short notice and vary in their availability. This poses challenges to distributed machine learning (ML) jobs, …

Towards a Network Marketplace in a Cloud

Virtually all public clouds today are run by single providers, which creates near-monopolies and inefficient markets and hinders innovation at the infrastructure level. There are current proposals to change this by creating open architectures that …

Exploiting Time-malleability in Cloud-based Batch Processing Systems

Existing cloud provisioning schemes allocate resources to batch processing systems at deployment time and only change this allocation at run-time due to unexpected events such as server failures. We observe that MapReduce-like jobs are time- …