In this story, I will try to shed some light on the benefits of modern data warehouse (DWH) solutions compared to other types of data platform architecture. I would venture to say that the DWH is the most popular platform among data engineers right now. It offers clear benefits over other types of solutions, but it also has some well-known limitations. Do you want to learn data engineering? This story is a good starting point because it explains data engineering at its core: the DWH solution at the center of the architecture diagram. We will see how data can be ingested into, and transformed in, the different DWH solutions available on the market.
I would also like to open up the discussion to experienced users. It would be great to hear your opinions on this topic.
Key Features of a Data Warehouse
A distributed, often serverless SQL engine (BigQuery, Snowflake, Redshift, Microsoft Azure Synapse, Teradata) is what we call a modern data warehouse (DWH). It is a SQL-first data architecture (1) where the data is stored in a data warehouse, and we can take full advantage of denormalized star schema data sets (2). Because most modern data warehouses are distributed and scale well, there is usually no need to worry about table keys and indexes. This makes them suitable for ad hoc analytical queries on big data.
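To make the idea of a denormalized data set and an ad hoc query more concrete, here is a minimal sketch in Python. It is only an illustration under a few assumptions: the built-in `sqlite3` module stands in for a real warehouse (it is neither distributed nor serverless), and the table `orders_denormalized`, its columns, the sample rows, and the helper `revenue_by` are all hypothetical names I made up for this example.

```python
import sqlite3

# Assumption: sqlite3 is only a local stand-in for a real DWH engine
# such as BigQuery or Snowflake; it is not distributed.
conn = sqlite3.connect(":memory:")

# Hypothetical denormalized (wide) table: attributes that a normalized model
# would keep in separate dimension tables (country, product_category)
# are stored directly on every row. No primary keys or indexes are defined.
conn.execute(
    """
    CREATE TABLE orders_denormalized (
        order_date       TEXT,
        country          TEXT,
        product_category TEXT,
        revenue          REAL
    )
    """
)

# Made-up sample rows, for illustration only.
rows = [
    ("2023-01-01", "US", "books", 20.0),
    ("2023-01-01", "US", "games", 35.0),
    ("2023-01-02", "DE", "books", 15.0),
    ("2023-01-02", "US", "books", 10.0),
]
conn.executemany("INSERT INTO orders_denormalized VALUES (?, ?, ?, ?)", rows)


def revenue_by(column):
    """Run an ad hoc aggregation over one attribute, without any joins."""
    allowed = {"order_date", "country", "product_category"}
    if column not in allowed:
        raise ValueError(f"unsupported column: {column}")
    query = (
        f"SELECT {column}, SUM(revenue) "
        f"FROM orders_denormalized "
        f"GROUP BY {column} "
        f"ORDER BY {column}"
    )
    return conn.execute(query).fetchall()


revenue_by_country = revenue_by("country")
revenue_by_category = revenue_by("product_category")
print(revenue_by_country)
print(revenue_by_category)
```

Because every attribute already sits on the row, the analyst can slice revenue by any column with a plain `GROUP BY` and no joins. In a real distributed warehouse the engine would scan and aggregate the table in parallel, which is why keys and indexes matter much less there than in a traditional transactional database.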
Most modern data warehouse solutions can process structured and unstructured data and are very convenient for data analysts with good SQL skills.
Modern data warehouses integrate easily with business intelligence solutions such as Looker, Tableau, Sisense, and Mode, which use ANSI SQL to process data. In the diagram below, I tried to map out a common data transformation journey and the tools used (not a complete list, of course). We can see that…