
Delta Lake and Azure Databricks enable a modern data architecture that simplifies and accelerates data and AI solutions at any scale. By implementing this architecture, Relogix cut the cost of wasted compute resources by 80% while further empowering its data team.

 

Figure 1. Modern data architecture with Delta Lake and Azure Databricks

 

The medallion architecture (shown in the following diagram) allows for flexible access and extensible data processing. The Bronze tables handle data ingestion and provide quick access, without the need for data modeling, to a single source of truth for incoming IoT and transactional events. As data flows to the Silver tables, data transformations and feature engineering refine it and optimize it for business intelligence and data science use cases. The Bronze and Silver tables also act as Operational Data Store (ODS) style tables, allowing for agile modifications and reproducibility of downstream tables. Deeper analysis is done on the Gold tables, where analysts are free to use their method of choice to derive new insights and formulate queries; PySpark, Koalas, SQL, BI tools, and Excel all support business analytics at Relogix.

 


Figure 2. Architecting your Delta Lake with the medallion data quality data flow
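
To make the Bronze, Silver, and Gold flow described above more concrete, here is a minimal sketch in plain Python. It is a stand-in for illustration only and not Relogix's pipeline: in practice each stage would be a Delta table written from Azure Databricks, and the event fields (`sensor_id`, `ts`, `occupied`) and the occupancy-rate metric are hypothetical assumptions.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical raw IoT events; the field names are illustrative assumptions.
RAW_EVENTS = [
    {"sensor_id": "s1", "ts": "2021-03-01T09:00:00", "occupied": "1"},
    {"sensor_id": "s1", "ts": "2021-03-01T09:00:00", "occupied": "1"},  # duplicate
    {"sensor_id": "s2", "ts": "2021-03-01T09:05:00", "occupied": "0"},
    {"sensor_id": "s2", "ts": "not-a-timestamp", "occupied": "1"},      # malformed
    {"sensor_id": "s1", "ts": "2021-03-01T10:00:00", "occupied": "0"},
]


def to_bronze(raw_events):
    """Bronze: keep every incoming event as-is, with no data modeling."""
    return [dict(event) for event in raw_events]


def to_silver(bronze):
    """Silver: parse types, drop malformed rows, and deduplicate."""
    seen = set()
    silver = []
    for row in bronze:
        try:
            ts = datetime.fromisoformat(row["ts"])
            occupied = bool(int(row["occupied"]))
        except (KeyError, ValueError):
            continue  # malformed rows remain available in Bronze only
        key = (row["sensor_id"], ts)
        if key in seen:
            continue  # skip exact repeats of the same sensor reading
        seen.add(key)
        silver.append({"sensor_id": row["sensor_id"], "ts": ts, "occupied": occupied})
    return silver


def to_gold(silver):
    """Gold: aggregate to an analyst-facing occupancy rate per sensor."""
    counts = defaultdict(lambda: [0, 0])  # sensor_id -> [occupied readings, total readings]
    for row in silver:
        counts[row["sensor_id"]][0] += int(row["occupied"])
        counts[row["sensor_id"]][1] += 1
    return {sensor: occ / total for sensor, (occ, total) in counts.items()}


bronze = to_bronze(RAW_EVENTS)
silver = to_silver(bronze)
gold = to_gold(silver)
print(len(bronze), len(silver), gold)
```

Because Bronze keeps every raw event untouched, the Silver and Gold results can be rebuilt at any time by rerunning `to_silver` and `to_gold`, which mirrors the reproducibility of downstream tables noted above.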



 
