Data Warehousing

Importance of Cross-Platform Data Consistency

Arguably the most important part of data warehousing is tying together data from multiple sources to create a single version of the truth. Joining that data sometimes requires lookup tables or data conversions, and in those cases we depend on the data being consistent across platforms.
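
As a rough illustration of the lookup-table case, here is a minimal Python sketch. The two source systems, their field names (crm_id, account_no, balance), and the mapping itself are hypothetical and exist only for this example. It joins billing rows to CRM rows through a lookup table and reports any keys that fail to match, which is where cross-platform inconsistencies tend to surface.

```python
# Hypothetical rows from two source systems that identify customers differently.
crm_rows = [
    {"crm_id": "C-001", "name": "Acme Ltd"},
    {"crm_id": "C-002", "name": "Globex"},
]
billing_rows = [
    {"account_no": 9001, "balance": 120.50},
    {"account_no": 9002, "balance": 0.00},
    {"account_no": 9003, "balance": 75.25},  # has no entry in the lookup table
]

# Hypothetical lookup table: billing account number -> CRM id.
account_to_crm = {9001: "C-001", 9002: "C-002"}


def join_sources(crm_rows, billing_rows, lookup):
    """Join billing rows to CRM rows via the lookup table.

    Returns (joined_rows, unmatched_account_numbers).
    """
    crm_by_id = {row["crm_id"]: row for row in crm_rows}
    joined, unmatched = [], []
    for bill in billing_rows:
        crm_id = lookup.get(bill["account_no"])
        crm = crm_by_id.get(crm_id)
        if crm is None:
            # Either the lookup has no mapping or the CRM row is missing.
            unmatched.append(bill["account_no"])
            continue
        joined.append(
            {"crm_id": crm_id, "name": crm["name"], "balance": bill["balance"]}
        )
    return joined, unmatched


joined, unmatched = join_sources(crm_rows, billing_rows, account_to_crm)
```

Collecting the unmatched keys rather than silently dropping them is a deliberate choice in this sketch: a growing unmatched list is an early sign that the sources have drifted out of sync.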

Diagnosing Airflow’s Auto-Scaling Flaw in AWS MWAA

Apache Airflow is a fantastic platform for scheduling the ETL workflows of a data warehouse. MWAA (Amazon Managed Workflows for Apache Airflow) is the AWS managed implementation, which makes workers easier to manage and scale, among other improvements. The problem with MWAA, however, is that the automatic scaling of workers is currently broken, the result of a major architectural oversight.