Today's Risk Options with a Lift & Shift Project

A pattern that's become quite regular these days is the dilemma organizations have with what to do with their previous generation data warehouses and data marts.  These are the databases that were designed over a decade ago quite often with a BI use cases in mind that was also from a decade ago.  The dilemma arises because the business is interested in doing new things with the data but the decade old technology and architecture can't support the new workload or additional capacity without a significant additional spend.  The problem I'm thinking of is most prevalent with data warehouses that were just small enough that they were able to be built with a traditional DBMS like Oracle or SQL Server using a shared-everything architecture with SAN for storage.  

The technology solution to this dilemma is not new.  MPP and in-memory architectures were around as far back as when these databases were first built.  Such vendors have long proposed the "Lift & Shift" where you lift your existing data model and Shift it to their product.  What is new is that performing the "Lift & Shift" to the correct technology today can be approached without the project risks of the past.  Previously an organization had to gain a level of confidence in a new or different DW technology to warrant an investment to buy the servers and licenses prior to moving the data before any.

Read more: Today's Risk Options with a Lift & Shift Project

Lift & Shift Scenario

Today I thought we'd run through a typical scenario for performing a Lift & Shift of a traditional data warehouse using an on-premise RDBMS to AWS Redshift.  

An organization has a 2TB dimensional-model data warehouse built on Oracle, originally on 9i in the mid-to-late 2000s.  The data warehouse was designed with detailed reporting in mind but now folks want to do analytics off the data with Tableau.  The Oracle version is significant because Oracle introduced some key data warehousing features good for analytics with 10g after this data warehouse was first designed.  Consequently, the solution is insanely slow or even non-responsive when attempted for analytic use.  Specifically, large multi-year aggregate queries from tools like Tableau along the original, low-cardinality dimensions just don't respond.  Furthermore, this data warehouse has too much data to take an extract approach into Tableau: the data needs to be pre-aggregated before reaching a BI tool.
To keep queries manageable, and the cost down the data warehouse has old data archived to save on storage costs (think SAN and the focus a decade ago).  The customer upgraded to the latest DB version (Oracle 11g).  But the app is still way too slow for analytics.  

So what next, change the data model?  Build aggregate marts with those 11g features (bitmap indexes)?  Buy ODA or even Exadata?  

Read more: Lift & Shift Scenario