Efficient Replication of Queued Tasks for Latency Reduction in Cloud Systems

Author name not available (Why is that?)

Publication date: 15 October 2015

Abstract: In cloud computing systems, assigning a job to multiple servers and waiting for the earliest copy to finish is an effective method to combat the variability in response time of individual servers. Although adding redundant replicas always reduces service time, the total computing time spent per job may be higher, thus increasing waiting time in queue. The total time spent per job is also proportional to the cost of computing resources. We analyze how different redundancy strategies, for eg. number of replicas, and the time when they are issued and canceled, affect the latency and computing cost. We get the insight that the log-concavity of the service time distribution is a key factor in determining whether adding redundancy reduces latency and cost. If the service distribution is log-convex, then adding maximum redundancy reduces both latency and cost. And if it is log-concave, then having fewer replicas and canceling the redundant requests early is more effective.

This page was built for publication: Efficient Replication of Queued Tasks for Latency Reduction in Cloud Systems

Report a bug (only for logged in users!)Click here to report a bug for this page (MaRDI item Q6266478)