Optimizing Cloud Cost-Effectiveness

Enterprises are increasingly moving their data and computation to the cloud to reduce costs without sacrificing application performance. Cloud service providers offer tenants a wide range of storage options, and this flexibility makes choosing a storage deployment nontrivial. Crafting deployments that use these options cost-effectively, under the cloud's distinctive pricing models and multi-tenancy dynamics, presents new challenges in designing cloud-based data analytics frameworks.

Pricing Games for Hybrid Cloud Object Store

To keep the storage service cost-effective, cloud object stores use hard disk drives (HDDs). However, the lower performance of HDDs affects tenants who have strict performance requirements for their big data applications. Tenants would therefore prefer faster storage devices such as solid state drives (SSDs), but these incur significant maintenance costs for the provider. We design a tiered object store for the cloud that comprises both fast and slow storage devices. The resulting hybrid store exposes the tiering to tenants through a dynamic pricing model based on the tenants’ usage and the provider’s desire to maximize profit. Tenants use knowledge of their workloads and current pricing information to select a data placement strategy that meets application requirements at the lowest cost.
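The tenant side of this decision can be illustrated with a minimal sketch. It is not the model from the publications listed below. It assumes a linear runtime model, a single deadline as the application requirement, and made-up prices and throughputs. The names `Tier` and `tenant_best_ssd_fraction` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Tier:
    """One storage tier as seen by a tenant (all values are hypothetical)."""
    name: str
    price_per_gb: float         # price announced by the provider for this period
    throughput_gb_per_s: float  # effective throughput the tenant expects


def tenant_best_ssd_fraction(
    data_gb: float,
    deadline_s: float,
    ssd: Tier,
    hdd: Tier,
    steps: int = 100,
) -> Optional[Tuple[float, float]]:
    """Return (fraction_on_ssd, cost) for the cheapest placement that meets
    the deadline, or None if no placement is fast enough.

    Assumption: runtime grows linearly with the data read from each tier.
    """
    best = None
    for i in range(steps + 1):
        f = i / steps  # fraction of the data set placed on the SSD tier
        runtime = data_gb * (
            f / ssd.throughput_gb_per_s + (1 - f) / hdd.throughput_gb_per_s
        )
        if runtime > deadline_s:
            continue  # this placement violates the performance requirement
        cost = data_gb * (f * ssd.price_per_gb + (1 - f) * hdd.price_per_gb)
        if best is None or cost < best[1]:
            best = (f, cost)
    return best
```

With a 100 GB data set, an SSD tier at 0.5 GB/s and an HDD tier at 0.125 GB/s, a 500-second deadline forces at least half of the data onto the SSD tier. If the provider then raises the SSD price for the next period, the tenant can rerun the same search with the new prices, which is the kind of interaction a dynamic pricing model sets up.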


CAST: Cloud Analytics Storage Tiering

CAST is a Cloud Analytics Storage Tiering solution that cloud tenants can use to reduce the monetary cost and improve the performance of analytics workloads. It takes a first step toward providing storage tiering support for data analytics in the cloud. CAST performs offline workload profiling to build job performance prediction models for different cloud storage services, then combines these models with workload specifications and high-level tenant goals to generate a cost-effective data placement and storage provisioning plan. We also build CAST++, which enhances CAST's optimization model by incorporating the data reuse patterns and cross-job dependencies common in realistic analytics workloads.
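The pipeline described above (profiled models, workload specification, tenant goal, placement plan) can be sketched roughly as follows. This is an illustration under stated assumptions, not CAST's actual optimizer. The profiled rates, the prices, the service names `"ssd"` and `"hdd"`, and the greedy per-job choice are all hypothetical, and the sketch ignores the data reuse and cross-job dependencies that CAST++ addresses.

```python
from typing import Dict, List, Tuple

# Hypothetical output of offline profiling: processing rate in GB/s
# for each (job type, storage service) pair.
PROFILED_RATE: Dict[Tuple[str, str], float] = {
    ("sort", "ssd"): 0.5,
    ("sort", "hdd"): 0.125,
    ("grep", "ssd"): 0.5,
    ("grep", "hdd"): 0.4,
}

# Hypothetical prices; real cloud pricing has more dimensions.
STORAGE_PRICE_PER_GB_HOUR: Dict[str, float] = {"ssd": 0.01, "hdd": 0.002}
VM_PRICE_PER_HOUR = 2.0


def predict_runtime_s(job_type: str, input_gb: float, service: str) -> float:
    """Predict job runtime from the profiled rate (assumed linear in input size)."""
    return input_gb / PROFILED_RATE[(job_type, service)]


def job_cost(job_type: str, input_gb: float, service: str) -> float:
    """Monetary cost of one job: compute plus storage held for the job's duration."""
    hours = predict_runtime_s(job_type, input_gb, service) / 3600.0
    storage_per_hour = input_gb * STORAGE_PRICE_PER_GB_HOUR[service]
    return hours * (VM_PRICE_PER_HOUR + storage_per_hour)


def plan_placement(jobs: List[Tuple[str, str, float]]) -> Dict[str, str]:
    """Pick the cheapest storage service for each job independently.

    `jobs` is a list of (job name, job type, input size in GB). The tenant
    goal assumed here is minimum monetary cost; a greedy per-job choice
    stands in for a real solver.
    """
    services = list(STORAGE_PRICE_PER_GB_HOUR)
    return {
        name: min(services, key=lambda s: job_cost(job_type, input_gb, s))
        for name, job_type, input_gb in jobs
    }
```

For example, `plan_placement([("sort-1", "sort", 360.0), ("grep-1", "grep", 360.0)])` places the I/O-sensitive sort job on the faster tier and the grep job on the cheaper one. A faster tier can be cheaper overall when it shortens the time for which compute is billed.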


Related Publications

Provider versus Tenant Pricing Games for Hybrid Object Stores in the Cloud

Yue Cheng, M. Safdar Iqbal, Aayush Gupta, Ali R. Butt

IEEE Internet Computing (Special issue: May/June 2016 - Cloud Storage)    

Pricing Games for Hybrid Object Stores in the Cloud: Provider vs. Tenant

Yue Cheng, M. Safdar Iqbal, Aayush Gupta, Ali R. Butt

USENIX HotCloud '15

CAST: Tiering Storage for Data Analytics in the Cloud

Yue Cheng, M. Safdar Iqbal, Aayush Gupta, Ali R. Butt

ACM HPDC '15