ETL (Extract, Transform, and Load) is a process used to gather and prepare data for analysis. This dataset was built in three stages:
- Stage 1: Extracted raw nested JSON data by crawling GPT pages.
- Stage 2: Transformed nested JSON data into flat CSV files.
- Stage 3: Normalized the data by processing the full history into simpler tables that hold the latest details for each GPT and a timeline of its performance metrics (Conversations, Stars, Reviews).
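
The stages above can be illustrated with a minimal sketch. It is not the pipeline used to build this dataset. The field names (`id`, `crawled_at`, `metrics`) and the helper functions `flatten`, `latest_per_id`, and `to_csv` are hypothetical, chosen only to show how nested JSON might be flattened (Stage 2) and then reduced to a latest-details table and a timeline (Stage 3).

```python
import csv
import io


def flatten(record, parent_key="", sep="."):
    """Flatten nested dicts into one level, joining keys with `sep`."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


def latest_per_id(rows, id_field="id", time_field="crawled_at"):
    """Keep only the most recent row for each id."""
    latest = {}
    for row in rows:
        current = latest.get(row[id_field])
        # ISO 8601 date strings sort chronologically as plain strings.
        if current is None or row[time_field] > current[time_field]:
            latest[row[id_field]] = row
    return list(latest.values())


def to_csv(rows):
    """Serialize flat rows to CSV text with a sorted, stable header."""
    fieldnames = sorted({key for row in rows for key in row})
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical crawl snapshots standing in for the Stage 1 output.
raw = [
    {"id": "g-1", "crawled_at": "2024-01-01",
     "metrics": {"conversations": 100, "stars": 4.5}},
    {"id": "g-1", "crawled_at": "2024-02-01",
     "metrics": {"conversations": 250, "stars": 4.6}},
    {"id": "g-2", "crawled_at": "2024-01-15",
     "metrics": {"conversations": 10, "stars": 3.9}},
]

# Stage 2: nested JSON -> flat rows.
flat_rows = [flatten(record) for record in raw]

# Stage 3: a per-GPT timeline and a latest-details table.
timeline = sorted(flat_rows, key=lambda r: (r["id"], r["crawled_at"]))
latest = latest_per_id(flat_rows)

print(to_csv(latest))
```

Joining nested keys with a dot (for example `metrics.stars`) keeps the CSV column names traceable back to the original JSON structure. The real files may use a different naming scheme.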
The files for ETL stages 1 and 2 are too large to be hosted on GitHub. To access the complete raw data, download the files from the following Google Drive link:
https://drive.google.com/drive/folders/1hUGnQ_AWeL2wi5UhUTt05dMHYb_FIvz4?usp=sharing