•   When: Monday, September 24, 2018 from 09:00 AM to 11:00 AM
  •   Speakers: Venkat Tadakamalla
  •   Location: ENGR 4201

Internet datacenters and cloud computing environments consist of a multitude of servers that process requests arriving from a population of customers. Incoming requests that find all servers busy have to wait until a server becomes idle. This type of queuing system is known as a G/G/c system, and it has been extensively studied in the queuing literature under steady-state conditions, i.e., when the average arrival rate of requests is smaller than the system capacity, the maximum rate at which the system can perform work. In this dissertation, we studied multi-server systems that are subject to finite-duration traffic surges, during which the average arrival rate of requests exceeds the system’s capacity.
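As a rough illustration of the steady-state condition described above, the sketch below computes the capacity of a system with c identical servers and checks whether a given arrival rate exceeds it. The function names and the example numbers are hypothetical and are not taken from the dissertation; the only assumption is that each server completes work at a rate of one request per mean service time.

```python
import math


def system_capacity(num_servers, mean_service_time):
    """Maximum rate (requests per time unit) at which c identical servers can complete work."""
    return num_servers / mean_service_time


def utilization(arrival_rate, num_servers, mean_service_time):
    """Ratio of the average arrival rate to the system capacity (below 1 means steady state is possible)."""
    return arrival_rate / system_capacity(num_servers, mean_service_time)


def min_servers_for_stability(arrival_rate, mean_service_time):
    """Smallest number of servers whose capacity strictly exceeds the arrival rate."""
    return math.floor(arrival_rate * mean_service_time) + 1


if __name__ == "__main__":
    # Illustrative numbers only: 4 servers, 0.5 time units of work per request.
    servers, service_time = 4, 0.5
    for rate in (6.0, 10.0):  # a normal rate and a surge rate
        rho = utilization(rate, servers, service_time)
        state = "surge (overloaded)" if rho >= 1 else "steady state possible"
        print(f"arrival rate {rate}: utilization {rho:.2f}, {state}")
    print("servers needed during the surge:", min_servers_for_stability(10.0, service_time))
```

During a surge the utilization rises above 1, so the queue keeps growing until either the surge ends or enough servers are added to push the capacity back above the arrival rate.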

Traffic surges generate very high response times that can be orders of magnitude higher than the corresponding steady-state values; they can be very disruptive to users and damaging to organizations that provide computing services. Cloud providers, such as those offering Infrastructure as a Service (IaaS), allow resources in the form of virtual machines to be dynamically added to or removed from the set of available resources. This helps the system cope with variability in traffic intensity and keeps response times within expected values. This capability is called elasticity. To mitigate the impacts of traffic surges, we designed, implemented, and extensively evaluated several autonomic controllers for multi-server elasticity that use the analytic estimate equations derived in this dissertation.

The contributions of the dissertation fall broadly into two main categories: (1) Analysis of the impacts of traffic surges of various shapes by proposing several metrics and deriving analytic estimates for them; and (2) Design of several autonomic controllers using the derived formulas to mitigate the negative impacts of such traffic surges.

We analyzed the impact of rectangular workload surges by proposing several metrics and deriving closed-form approximate analytic expressions for them. We then extended the research to other surge shapes, including generic trapezoidal and triangular surges. To validate the derived equations, we developed a G/G/c simulator and used it to carry out extensive experiments. These experiments showed that the error between the analytic estimates and the simulation results is very small in all cases analyzed.
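To make the simulation setup more concrete, here is a minimal sketch of a first-come, first-served multi-server queue driven by a rectangular surge. It is not the dissertation's simulator: the function names, the exponential interarrival and service times (one particular choice of the "G" distributions), and the parameter values are all assumptions made for illustration.

```python
import heapq
import random


def simulate_ggc(arrival_times, service_times, c):
    """Return per-request response times for a FCFS queue with c identical servers.

    arrival_times must be sorted in increasing order.
    """
    # Min-heap holding the time at which each server next becomes free.
    free_at = [0.0] * c
    heapq.heapify(free_at)
    response_times = []
    for arrival, service in zip(arrival_times, service_times):
        earliest_free = heapq.heappop(free_at)
        start = max(arrival, earliest_free)  # wait if every server is busy
        finish = start + service
        heapq.heappush(free_at, finish)
        response_times.append(finish - arrival)
    return response_times


def rectangular_surge_arrivals(base_rate, surge_rate, surge_start, surge_end, horizon, rng):
    """Arrival times whose rate jumps to surge_rate during [surge_start, surge_end).

    Simplification: each interarrival gap uses the rate in effect when the gap
    begins, so the rate change at the surge boundaries is only approximate.
    """
    t, arrivals = 0.0, []
    while True:
        rate = surge_rate if surge_start <= t < surge_end else base_rate
        t += rng.expovariate(rate)
        if t >= horizon:
            return arrivals
        arrivals.append(t)


if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed so runs are repeatable
    arrivals = rectangular_surge_arrivals(6.0, 12.0, 100.0, 150.0, 400.0, rng)
    services = [rng.expovariate(2.0) for _ in arrivals]  # mean service time 0.5
    responses = simulate_ggc(arrivals, services, c=4)    # capacity 8 requests per time unit
    print("requests:", len(responses), "max response time:", round(max(responses), 2))
```

With these illustrative values the base rate is below the capacity of the four servers and the surge rate is above it, so response times should grow during the surge and drain afterward.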

We designed and implemented several autonomic controllers that dynamically control the elasticity of a multi-server system to mitigate the negative effects of temporary traffic surges. The results showed that our autonomic controller successfully varied the number of servers to mitigate the impacts of the traffic surges. We also conducted experiments using workload surges from Google traces, in lieu of distribution-driven request arrival data, with and without the autonomic controller. In addition, we proposed several new autonomic controllers that incorporate server startup delays; these controllers were evaluated through simulation experiments. Finally, we proposed a new generalized method for workload characterization for elasticity control of arbitrary-shaped workload surges. The effectiveness of the resulting controller was then evaluated.

In summary, this dissertation’s main contributions are: (1) Proposal of several metrics to estimate the impacts of traffic surges; (2) Derivation of approximate closed-form expressions for the proposed metrics for rectangular, trapezoidal, and triangular traffic surges; (3) Design and implementation of a simulator for a G/G/c system to evaluate the accuracy of the equations in (2); (4) Design, implementation, and extensive evaluation of several autonomic controllers for multi-server elasticity that use the equations derived in (2); and (5) Proposal and use of a generalized method for workload characterization for elasticity control under traffic surges.
