Perhaps the more relevant question is: do you need a data warehouse before you can derive value and insight from the data produced by your BI solution?
The answer most certainly is “No!” While a traditional, structured data warehouse provides much value and governance over data assets, the value derived can sometimes come at a steep cost from both a time and resource perspective.
Compounding these challenges, the platforms and tools for developing and maintaining data warehouses increase the total cost of ownership over the lifetime of the project. It’s also challenging to decide which tools to implement and whether to use vendor-specific or open-source tools. Below are some of the reasons not to build a data warehouse:
Decisions around tooling are often ill-informed or biased toward the comfort level and experience of the resources assigned to the project. At the same time, little regard is paid to ongoing costs associated with licensing, subscriptions, support and maintenance.
Data warehouse solutions can also be overengineered, the age-old case of “using a sledgehammer when a tack hammer will suffice.” This can lead to technical overreach, where the data architects and engineers are more interested in building their “playground” on the latest and greatest open-source, gluten-free data tools available.
Instead, to ensure your BI solution ultimately meets your needs, architects and engineers should focus on delivering incremental business value. Your project will be on safer ground using proven, mature platforms that can achieve the same desired results at a much lower TCO, and with a faster time-to-market.
Making the Ultimate Decision
Even with all these considerations, many enterprise-level organizations still take on the expense and risk of embarking upon a full-blown enterprise data warehouse. They believe anything less would be short-sighted and more costly in the long run.
In some cases, this may be true. However, a practical, cost-driven approach to BI that delivers incremental value can address most use cases while also providing a solid foundation for progressing toward a future data warehouse, if desired.
Every circumstance is different. The solution to business intelligence is not one-size-fits-all. A well-designed, purpose-built data warehouse certainly has its advantages. But there are ways to approach the BI challenge that help you progress to your stated end goal without sacrificing months or even years of development, and without risking failed projects due to cost overruns and little-to-no perceived business value.
BI Journey Considerations
As you embark on your BI journey, there are several considerations to keep in mind. Begin with a very small slice. This will give you time to conduct real-world analysis of available tools and help you evaluate the pros and cons of a platform that may not be readily identifiable without going through a true proof-of-concept.
Be sure to also check the boxes around performance, security, and quality; no approach should sacrifice any of these values. This may require a little more effort than a mock-up of a simple report, but it will ensure you don’t forgo the key tenets of a BI reporting platform.
As you develop the solution, abstract as much of the business logic as possible. This creates a layer between presentation and data storage. For example, if your presentation layer can connect to a tabular data source (such as a SQL database), it’s best to create views that hide the underlying schema structure. This not only keeps your data platform portable (if needed in the future), but also lets you perform simple transformations and restructuring without having to modify your ETL when new requirements surface.
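As a minimal sketch of this abstraction layer, the example below uses Python's built-in sqlite3 module as a stand-in for whatever SQL platform you choose. The table, view, and column names (stg_orders, rpt_orders, and so on) are hypothetical and exist only for illustration.

```python
import sqlite3

# In-memory database used purely for illustration.
conn = sqlite3.connect(":memory:")

conn.executescript("""
    -- Physical table with storage-oriented (hypothetical) names.
    CREATE TABLE stg_orders (
        ord_id    INTEGER PRIMARY KEY,
        cust_nm   TEXT,
        amt_cents INTEGER,
        ord_dt    TEXT
    );

    INSERT INTO stg_orders VALUES
        (1, 'Acme',   12500, '2024-01-15'),
        (2, 'Globex',  9900, '2024-01-16');

    -- The view is the only object the reporting tool queries.
    -- It renames columns and applies a simple transformation,
    -- so the physical schema can change without breaking reports.
    CREATE VIEW rpt_orders AS
    SELECT ord_id            AS order_id,
           cust_nm           AS customer_name,
           amt_cents / 100.0 AS order_amount,
           ord_dt            AS order_date
    FROM stg_orders;
""")

# The presentation layer sees only the business-friendly shape.
rows = conn.execute(
    "SELECT customer_name, order_amount FROM rpt_orders ORDER BY order_id"
).fetchall()
print(rows)
```

If the underlying table is later renamed, split, or moved to another platform, only the view definition needs to change; the report queries stay the same.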
From there, leverage an ELT pattern (Extract, Load, Transform) rather than an ETL (Extract, Transform, Load) for moving data between a source and target. A great deal of cost in a traditional data warehouse project can be attributed to ETL development.
In contrast, the ELT patterns that have emerged in recent years help alleviate the pain of modifying the complex data transformation procedures that are sometimes present in ETL solutions. With ELT, data is loaded in a raw, untransformed state and transformed later, for example when views over the raw data are read at query time.
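The ELT flow described above can be sketched roughly as follows, again using Python's sqlite3 as a stand-in for the target platform. The sample rows and the raw_sales and sales_clean names are assumptions made up for this example.

```python
import sqlite3

# Hypothetical raw extract: every value arrives as an untyped string,
# exactly as the source system produced it.
raw_rows = [
    ("1001", " north ", "1,250.50", "2024-03-01"),
    ("1002", "SOUTH",   "980.00",   "2024-03-01"),
    ("1003", "North",   "410.25",   "2024-03-02"),
]

conn = sqlite3.connect(":memory:")

# Extract + Load: land the data untouched in a raw table.
conn.execute(
    "CREATE TABLE raw_sales (sale_id TEXT, region TEXT, amount TEXT, sale_date TEXT)"
)
conn.executemany("INSERT INTO raw_sales VALUES (?, ?, ?, ?)", raw_rows)

# Transform: expressed as a view and applied when the data is read.
# Changing a rule means redefining the view, not reloading the data.
conn.execute("""
    CREATE VIEW sales_clean AS
    SELECT CAST(sale_id AS INTEGER)               AS sale_id,
           UPPER(TRIM(region))                    AS region,
           CAST(REPLACE(amount, ',', '') AS REAL) AS amount,
           sale_date
    FROM raw_sales
""")

totals = conn.execute(
    "SELECT region, ROUND(SUM(amount), 2) FROM sales_clean "
    "GROUP BY region ORDER BY region"
).fetchall()
print(totals)
```

Because the raw table is never altered, a new requirement (say, a different region mapping) can be met by adjusting the view alone.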
Another key consideration is the use of a mature product platform and tools. For the data platform, consider leveraging cloud resources to minimize the maintenance and support that accompany on-premises deployments. The actual tools you choose will be based on several factors, but the less you have to worry about infrastructure, the faster your time-to-value.
Also utilize a standard approach. Traditional relational data stores and data stores that support standard SQL syntax give you the most flexibility with respect to supporting leading visualization tools. SQL is the language of data, and almost all reporting tools can work with it. Platforms with a proprietary language ultimately try adding a SQL layer on top to support the broader market of reporting tools, but most fall short in some fashion.
Many reporting tools also offer an embedded data store that can pull data either directly from a source or from an intermediate data store. If your use cases can tolerate some latency in data freshness, and there are no requirements around temporal storage of the data, this can be very economical, as the data in the reporting tool is efficiently compressed, stored and optimized.
Finally, the pay-as-you-go compute model of the cloud can be an effective means of reducing platform costs. Many solutions opt to leverage the power of the reporting tool’s data engine as opposed to the SQL backend. This allows you to use the SQL instance as the intermediate data store and programmatically scale back its resources when they are not in use (which is most of the time). This, in turn, provides significant cost savings.
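One way to picture the programmatic scale-back is a small scheduler that decides which tier the SQL instance should run at. The sketch below is illustrative only: the tier names, the refresh window, and the scale_fn callback are all hypothetical, and the actual scaling call depends entirely on your cloud provider's API or CLI.

```python
from datetime import datetime, time

# Hypothetical tier names and refresh window; real values depend on
# your provider and on when the reporting tool refreshes its data.
LOW_TIER = "small"
HIGH_TIER = "large"
REFRESH_WINDOWS = [(time(1, 0), time(3, 0))]  # nightly load, 01:00 to 03:00


def target_tier(now: datetime) -> str:
    """Return the tier the SQL instance should run at for the given moment."""
    current = now.time()
    for start, end in REFRESH_WINDOWS:
        if start <= current < end:
            return HIGH_TIER
    return LOW_TIER


def reconcile(now: datetime, current_tier: str, scale_fn) -> str:
    """Call scale_fn only when the tier must change; return the resulting tier."""
    wanted = target_tier(now)
    if wanted != current_tier:
        # Placeholder for a provider-specific scaling call (assumption).
        scale_fn(wanted)
    return wanted


# Usage: at midday the instance is idle, so it is scaled down.
requested = []
new_tier = reconcile(datetime(2024, 1, 1, 12, 0), HIGH_TIER, requested.append)
print(new_tier, requested)
```

Run on a timer, a check like this keeps the instance at the larger tier only while the reporting tool is refreshing its embedded data store.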
BI Without Data Warehouse May Be All You Require
By following these recommended best practices, your business can derive significant value from your BI initiative. An iterative approach gains you a faster time to insights while the right level of abstraction allows your solution to grow with your business. And the use of cloud technologies will significantly reduce your costs.
Author: Mike Kennie