In this project I will create an automated pipeline that takes in scraped data from Wikipedia and IMDB, then transforms it and loads it into an already existing PostgreSQL database.
- Read three data files (IMDB, Wikipedia, Ratings).
- Extract and transform data.
- Load data to a PostgreSQL Movie Database.
Software: Python, Anaconda Navigator, Conda, Jupyter Notebook, PostgreSQL, pgAdmin 4.
The ETL Jupyter notebook collects and cleans movie data from different sources (a Wikipedia JSON file, plus Kaggle and ratings CSV files). It transforms and merges the data and loads it into two updatable PostgreSQL database tables.
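
As a rough illustration of the extract, transform, and load steps described above, here is a minimal sketch using only the Python standard library. It is not the notebook's actual code: the function names, the `imdb_id` join key, and the `title` and `director` fields are all assumptions made for the example. The `load` step accepts any DB-API connection, so a psycopg2 connection to PostgreSQL would use the default `%s` placeholder.

```python
import csv
import io
import json


def extract_json(text):
    """Parse a JSON array of scraped Wikipedia movie records."""
    return json.loads(text)


def extract_csv(text):
    """Parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(wiki_rows, kaggle_rows):
    """Merge Wikipedia and Kaggle rows on a shared IMDb id.

    The "imdb_id", "title", and "director" field names are hypothetical.
    """
    kaggle_by_id = {row["imdb_id"]: row for row in kaggle_rows}
    merged = []
    for wiki in wiki_rows:
        match = kaggle_by_id.get(wiki.get("imdb_id"))
        if match is None:
            continue  # keep only movies present in both sources
        merged.append((wiki["imdb_id"], match["title"], wiki.get("director")))
    return merged


def load(conn, table, columns, rows, placeholder="%s"):
    """Insert rows through any DB-API connection.

    psycopg2 uses "%s" placeholders; sqlite3 uses "?".
    Table and column names must come from trusted code, not user input,
    because they are interpolated directly into the SQL string.
    """
    marks = ", ".join([placeholder] * len(columns))
    sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({marks})"
    cur = conn.cursor()
    cur.executemany(sql, rows)
    conn.commit()
```

With a psycopg2 connection, a call might look like `load(conn, "movies", ["imdb_id", "title", "director"], rows)`, where `movies` is a placeholder table name. Keeping each stage as a separate function makes it possible to test the cleaning logic without a running database.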