Garrett Burroughs

ETL Data Pipeline

An Extract -> Transform -> Load data pipeline for my

GitHub Link

Summary

This project is an ETL process that takes data from multiple sources (a MySQL database, a MongoDB cluster, and an API endpoint) and consolidates it into a data warehouse that uses a star schema built around a rental fact table.

Languages Used: Python, SQL

Technologies Used: Pandas

Takeaways

This project gave me hands-on experience with several kinds of data systems, including SQL databases, NoSQL databases, APIs, and dataframes. It also required me to think carefully about data design, particularly how to translate data from normal form into a star schema.
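To make the normal-form-to-star-schema step concrete, here is a small sketch of the kind of flattening involved. The customer, address, and city tables and all of their columns are hypothetical examples, not the project's real schema. The idea is that lookup tables chained together by foreign keys in the normalized source collapse into a single wide dimension table.

```python
import pandas as pd

# Hypothetical normalized tables: customer -> address -> city.
customer = pd.DataFrame({
    "customer_id": [1, 2],
    "name": ["Ana", "Ben"],
    "address_id": [7, 8],
})
address = pd.DataFrame({
    "address_id": [7, 8],
    "street": ["1 Main St", "2 Oak Ave"],
    "city_id": [70, 80],
})
city = pd.DataFrame({
    "city_id": [70, 80],
    "city": ["Lima", "Oslo"],
})

# Follow the foreign keys, then drop them: the dimension keeps the
# descriptive attributes but no longer needs the intermediate ids.
dim_customer = (
    customer
    .merge(address, on="address_id", how="left")
    .merge(city, on="city_id", how="left")
    .drop(columns=["address_id", "city_id"])
)
```

The left joins keep every customer even if an address or city lookup is missing, which is usually the safer default when building a dimension.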