Pivot, iterate and be agile. These are words rarely applied to data warehouses, yet this is precisely what you can expect to do with WhereScape RED, the tool which automates the creation, management and control of data warehouses.
At the outset, I’ll just go ahead and say that other toolsets which claim to automate data warehouses are substandard. Nothing else enables rapid creation of a flexible data warehouse quite like RED does. With this toolset, your data warehouse isn’t a stodgy, static deployment.
It is a dynamic entity which evolves to match the changing requirements of the business. It does things that you simply will not get from traditional ETL (extract, transform and load) toolsets.
Let’s look at how it does that:
- RED automates development operations and workflows for data warehouses
Where ETL is like a Swiss Army knife with multiple bits and pieces for different tasks (typically related to moving or syncing data), RED is solely focused on building the data warehouse as efficiently as possible. This is a scalpel.
It’s also the first scalpel of its kind, purpose-built for data warehouse developers.
RED works off the premise that many of the processes in data warehouse construction are repeatable. The way a data store is built is practically the same regardless of the database being used (the minor variances are like the difference between the English spoken in Yorkshire and in London’s East End; it’s still English).
That means RED automates and accelerates development with repeatable frameworks, working in the native language of databases (SQL).
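To make the "same build, different accent" point concrete, here is a minimal Python sketch of template-driven SQL generation. It is an illustration of the general idea only, not RED's actual templates or generated code; the table, the column names, the `TYPE_MAP` entries and the `create_table_sql` helper are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Column:
    name: str
    logical_type: str  # dialect-neutral type: "integer", "text" or "timestamp"


# Hypothetical mapping from logical types to each dialect's spelling.
# The structure of the table is identical; only the "accent" differs.
TYPE_MAP = {
    "sqlserver": {"integer": "INT", "text": "NVARCHAR(255)", "timestamp": "DATETIME2"},
    "postgres": {"integer": "INTEGER", "text": "VARCHAR(255)", "timestamp": "TIMESTAMP"},
}


def create_table_sql(table: str, columns: list[Column], dialect: str) -> str:
    """Render one CREATE TABLE statement from dialect-neutral metadata."""
    types = TYPE_MAP[dialect]
    cols = ",\n".join(f"    {c.name} {types[c.logical_type]}" for c in columns)
    return f"CREATE TABLE {table} (\n{cols}\n);"


# Illustrative staging table, described once and rendered per dialect.
stage_customer = [
    Column("customer_id", "integer"),
    Column("customer_name", "text"),
    Column("loaded_at", "timestamp"),
]

for dialect in TYPE_MAP:
    print(f"-- {dialect}")
    print(create_table_sql("stage_customer", stage_customer, dialect))
```

The table is described once as metadata, and the same template produces the statement for each database, which is the kind of repetition that lends itself to automation.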
- RED is ready for the cloud
RED is ‘cloud ready’ because it has specific functionality for understanding different SQL dialects, and it supports multiple databases with a light architecture that doesn’t require an additional ETL server. That makes it simple to plug RED into cloud Platform as a Service (PaaS) infrastructure; in fact, when working in RED, there is very little difference between on-premises and cloud.
What’s more, any SQL developer can easily get started with RED. Once the core concepts are understood, it is a simple matter to transition from on-premises to SQL on Azure, AWS or other platforms.
Another key point is RED’s push-down, low-data-movement architecture, in which the work is done by the database itself; that is perfect for cloud and bursty workloads. ETL tools, by comparison, generate data movement and must be spec’d for peak loads.
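The push-down idea can be sketched in a few lines of Python. This is a generic illustration using an in-memory SQLite database from the standard library, not RED's own generated code; the table names `load_orders` and `fact_customer_sales` are assumptions made up for the example.

```python
import sqlite3

# In-memory database standing in for the warehouse platform.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE load_orders (order_id INTEGER, customer_id INTEGER, amount REAL);
    CREATE TABLE fact_customer_sales (customer_id INTEGER, total_amount REAL);
    INSERT INTO load_orders VALUES (1, 10, 25.0), (2, 10, 75.0), (3, 20, 40.0);
""")

# Push-down: the transformation is a single SQL statement executed by the
# database engine. No source rows are fetched into this Python process;
# only the statement text is sent to the database.
conn.execute("""
    INSERT INTO fact_customer_sales (customer_id, total_amount)
    SELECT customer_id, SUM(amount)
    FROM load_orders
    GROUP BY customer_id
""")
conn.commit()

# Read back the (small) result only to show what was built.
rows = conn.execute(
    "SELECT customer_id, total_amount FROM fact_customer_sales ORDER BY customer_id"
).fetchall()
print(rows)  # [(10, 100.0), (20, 40.0)]
```

An ETL-style alternative would select every row out of `load_orders`, aggregate it on a separate server and write the results back, so that server has to be sized for the largest load it will ever see.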
- RED is flexible and consistent
Given that many data warehousing and analysis projects are exploratory in nature, the ability to turn on a dime is even more valuable. It saves money and means your project is more likely to succeed. With best practice built into the tool, and the flexibility to adapt rapidly retained, RED delivers consistency to the development process. That means no more ‘key man dependency’; one developer can pick up where the last left off, without having to deal with the personal idiosyncrasies which inevitably come with hand-coding.
There’s a further crucial benefit, to which I’ve already alluded. In the past, data warehouses were monolithic, expensive things, and building one was something like chiselling into granite. However, the business picture changes, often quickly. Hammering out a new tablet just takes too long and costs too much. RED’s automation means you can adjust, amend, even experiment. It’s the modern word processor versus scratching on rocks.
RED also provides lifecycle management, essential to maintaining the value of the data warehouse as those business requirements inevitably change. It manages code deployments between environments. It has a scheduler which reports back on activity and workflow, identifies issues or errors, and alerts administrators. No need to build that yourself; just configure it in the tool.
- RED does the documentation
While nobody is going to dispute the value and importance of documentation for those developing and using the system, by the same token nobody is going to step up and proclaim their love for creating it. Documentation should include where data comes from, how it is combined and what rules have been applied. Here’s the really good news, then: RED generates all of that and more. And it keeps all the documentation in a single place, making it readily accessible to those who need it.
It shows them how things are calculated, what they mean, how to use them and where the data comes from. That drives up confidence, in other words, that the outputs from the data warehouse can be trusted and, by extension, allows the business to become more productive.
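As a rough illustration of why documentation can be generated rather than written by hand, the Python sketch below renders a plain-text description from the same kind of metadata used to build a table. It is a simplified assumption of how such a feature could work, not a description of RED's documentation output; `ColumnMeta`, `render_docs` and the sample columns are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ColumnMeta:
    name: str    # column in the warehouse table
    source: str  # where the data comes from
    rule: str    # transformation or business rule applied


def render_docs(table: str, columns: list[ColumnMeta]) -> str:
    """Build a plain-text description of a table from its metadata."""
    lines = [f"Table: {table}"]
    for c in columns:
        lines.append(f"  {c.name}: from {c.source}; rule: {c.rule}")
    return "\n".join(lines)


# Illustrative metadata for a made-up customer dimension.
dim_customer = [
    ColumnMeta("customer_key", "crm.customers.id", "copied as-is"),
    ColumnMeta("full_name", "crm.customers.first_name, last_name", "joined with a space"),
]

doc = render_docs("dim_customer", dim_customer)
print(doc)
```

Because the description is derived from the metadata, it changes whenever the definition changes, instead of drifting out of date.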
- RED is a one stop shop
One of the problems with ETL tools is that they tend to be focused on broad data movement, rather than being the data warehouse scalpel that RED is. As a single tool designed to handle the creation and control of a data warehouse from both the operations and the developer side, RED is a complete toolset. That means one interface, one set of documentation, one place for the lot. If it is data warehousing, it is RED. No wondering about which tool is right for which operation.
Why are these features and capabilities important? Fundamentally, because expensive and rare skills are in demand and, in most businesses, the people who make data warehouses (and serve up analytics) are overworked and constantly on the back foot. With the automation of development operations and workflows, you get to focus on the important stuff: that 20 percent of the code where 80 percent of the value is created. No need to waste your time on the basics, because RED does it for you.
The combination of code generation, best practice and the flexibility to change quickly and painlessly as feedback comes in from the business means you get to focus on the business problem and not the code itself. You’ve got the time to understand the business, build relationships and ascertain where you as a developer can build a better solution to add value.
And that’s the power of good software.