While doing consulting work for a “start-up” client, I came across a situation where multiple sources of external data (from various geographies) needed to be aggregated quickly for internal analysis and management decision making. Because the client was a start-up, they had no database infrastructure in place. The long-term solution would be to build a data warehouse/BI solution fully integrated with APIs from the data providers. However, that would take time: on the order of 1+ years, given the number of data sources across many countries. Due to the urgent business need, I proposed a quick-win solution that pieced together a few technologies:
- A flexible, nimble, automated ETL solution (Alteryx)
- Instead of creating a standard “database”, we had the ETL produce a “cleaned” master Excel file, with interim files hosted on cloud storage (e.g. Box, Dropbox) and the software running on a virtual server on AWS (Amazon Web Services)
- Lastly, we output the normalized “master” to various data analytics/visualization tools (e.g. Tableau / Qlik)
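The client's pipeline was built in Alteryx, which is a visual tool rather than code, so the following is only a rough sketch of the same normalize-and-merge idea in plain Python. The column names, the `COLUMN_MAPS` mapping, the sample rows, and the function names are all hypothetical and made up for illustration; a CSV stands in for the master Excel file.

```python
import csv
import io

# Hypothetical mappings from each source's column names to the master schema.
COLUMN_MAPS = {
    "us": {"Date": "date", "Country": "country", "Sales (USD)": "revenue"},
    "de": {"Datum": "date", "Land": "country", "Umsatz": "revenue"},
}
MASTER_COLUMNS = ["date", "country", "revenue", "source"]

# Invented sample extracts standing in for files pulled from cloud storage.
SAMPLE_SOURCES = {
    "us": "Date,Country,Sales (USD)\n2020-01-31,US,1200.50\n",
    "de": "Datum,Land,Umsatz\n2020-01-31,DE,980\n",
}


def normalize(source_name, rows):
    """Rename columns to the master schema and tag each row with its source."""
    mapping = COLUMN_MAPS[source_name]
    for row in rows:
        clean = {master: row[src].strip() for src, master in mapping.items()}
        clean["revenue"] = float(clean["revenue"])
        clean["source"] = source_name
        yield clean


def build_master(sources):
    """Parse every source extract and combine the rows into one master list."""
    master = []
    for name, text in sources.items():
        reader = csv.DictReader(io.StringIO(text))
        master.extend(normalize(name, reader))
    return master


def write_master(rows, handle):
    """Write the master rows as CSV for a visualization tool to read."""
    writer = csv.DictWriter(handle, fieldnames=MASTER_COLUMNS)
    writer.writeheader()
    writer.writerows(rows)


if __name__ == "__main__":
    with open("master.csv", "w", newline="") as f:
        write_master(build_master(SAMPLE_SOURCES), f)
```

Adding a new geography in this sketch means adding one entry to the mapping rather than changing a database schema, which is the kind of flexibility the ETL layer is meant to provide.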
The output looks great in the visualization tools, and no one can tell that it is working off an Excel spreadsheet. The key to this is the automated ETL tool, Alteryx. Years ago, the focus of IT was on creating a centralized, standard database. Now, as data sources multiply and data standards change rapidly, I posit that the key focus for IT should be a flexible, nimble, and automated ETL. Please click here to download my slide templates illustrating the typical challenges of multiple data sources and the resulting solution archetypes.
There is also a guidebook, which you can find on Amazon here.
But you can probably learn more by viewing this quick product intro video:
Better yet, download a free trial and play with it yourself via the link below. If you have any questions, feel free to email me (email@example.com).