Abstract:
Providing performance QoS (Quality-of-Service) guarantees in a distributed data center environment poses unique challenges. In a typical setting, a client must be guaranteed a minimum amount of service (its contractual reservation) aggregated over all servers on which it has demand for service. A server may receive service demands from multiple clients, which may in aggregate exceed its service capacity. The system must decide how much service to provide to each client on each server that it has demand, in order to satisfy all clients’ reservations. In case there is no feasible allocation of the service, the amount of reservation that is fulfilled must be maximized. In practice, the number of clients is several orders of magnitude larger than the number of servers.We describe a simple iterative algorithm called pTrans for reservation allocation based on a directed graph model; servers are vertices and edges are characterized by a vector of feasible service transfers between adjacent servers. We provide formal proofs that the algorithm converges to the global optimal and has a polynomial worst-case running time. Moreover, pTrans can be easily parallelized using multiple threads. Empirical evaluation results show pTrans has low runtime overheads.