Moving your data warehouse to the cloud: Look before you jump

By Ravi Dharnikota

Where’s your data warehouse? Is it still on-premises? If so, you’re not alone. Way back in 2011, in its IT predictions for 2012 and beyond, Gartner said, “At year-end 2016, more than 50 percent of Global 1000 companies will have stored customer-sensitive data in the public cloud.”

While it’s hard to find an exact statistic on how many enterprise data warehouses have migrated, cloud warehousing is increasingly popular as companies struggle with growing data volumes, service-level expectations, and the need to integrate structured warehouse data with unstructured data in a data lake.

Cloud data warehousing provides many benefits but getting there isn’t easy. Migrating an existing data warehouse to the cloud is a complex process of moving schema, data, and ETL. The complexity increases when restructuring of database schema or rebuilding of data pipelines is needed.

This post is the first in a “look before you leap” three-part series on how to jump-start your migration of an existing data warehouse to the cloud. As part of that, I’ll also cover how cloud-based data integration solutions can significantly speed your time to value.

Beyond basic: The benefits of cloud data warehousing

Cloud data warehousing is a Data Warehouse as a Service (DWaaS) approach that simplifies time-consuming and costly management, administration, and tuning activities that are typical of on-premises data warehouses. But beyond the obvious – data warehouses being stored in the cloud - there’s more. Processing is also cloud-based, and all major solution providers charge separately for storage and compute resources, both of which are highly scalable.

All of which leads us to a more detailed list of key advantages:

  • Scale up (and down): The volume of data in a warehouse typically grows at a steady pace as time passes and history is collected. Sudden upticks in data volume occur with events such as mergers and acquisitions, and when new subjects are added. The inherent scalability of a cloud data warehouse allows you to adapt to growth, adding resources incrementally (via automated or manual processes) as data and workload increase. The elasticity of cloud resources allows the data warehouse to quickly expand and contract data and processing capacity as needed, with no impact to infrastructure availability, stability, performance, and security.
  • Scale out: Adding more concurrent users requires the cloud data warehouse to scale out. You will be able to add more resources – either more nodes to an existing cluster or an entirely new cluster, depending on the situation – as the number of concurrent users rises, allowing more users to access the same data without query performance degradation.
  • Managed infrastructure: Eliminating the overhead of data center management and operations for the data warehouse frees up resources to focus where value is produced: using the data warehouse to deliver information and insight.
  • Cost savings: On-premises data centers themselves are extremely expensive to build and operate, requiring staff, servers, and hardware, networking, floor space, power, and cooling. (This comparison site provides hard dollar data on many data center elements.) When your data warehouse lives in the cloud, the operating expense in each of these areas is eliminated or substantially reduced.
  • Simplicity: Cloud data warehouse resources can be accessed through a browser and activated with a payment card. Fast self-service removes IT middlemen and democratizes access to enterprise data.

In my next post, I’ll do a quick review of additional benefits and then dive into data migration. If you’d like to read all the details about the benefits, techniques, and challenges of migrating your data warehouse to cloud, download the Eckerson Group white paper, “Jump-Start Your Cloud Data Warehouse: Meeting the Challenges of Migrating to the Cloud.

Ravi Dharnikota is Chief Enterprise Architect at SnapLogic. Follow him on Twitter @rdharn1