OpenAI Custom GPTs Dataset

ETL Stages

ETL (Extract, Transform, and Load) is a process used to gather and prepare data for analysis.

  • Stage 1: Extracted raw nested JSON data by crawling GPT pages.
  • Stage 2: Transformed nested JSON data into flat CSV files.
  • Stage 3: Normalized the data by processing the full history into simpler tables that hold the latest details for each GPT and a timeline of its performance metrics (Conversations, Stars, Reviews).
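
As a rough illustration of what Stages 2 and 3 involve, the sketch below flattens nested records into CSV rows and then derives a "latest details" table and a metrics timeline. It is not the repository's actual pipeline: the field names (`id`, `crawled_at`, `name`, `metrics`), the sample records, and the helper functions `flatten` and `to_csv` are all hypothetical and chosen only for this example.

```python
import csv
import io


def flatten(record, parent_key="", sep="."):
    """Collapse nested dicts into one level, joining keys with `sep`."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


def to_csv(rows):
    """Serialize a list of flat dicts to CSV text with a sorted header."""
    fieldnames = sorted({key for row in rows for key in row})
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical crawl snapshots (assumed shape, not the real schema).
snapshots = [
    {"id": "g-1", "crawled_at": "2024-01-01", "name": "Demo GPT",
     "metrics": {"conversations": 100, "stars": 4.2, "reviews": 10}},
    {"id": "g-1", "crawled_at": "2024-02-01", "name": "Demo GPT v2",
     "metrics": {"conversations": 250, "stars": 4.4, "reviews": 30}},
    {"id": "g-2", "crawled_at": "2024-01-15", "name": "Other GPT",
     "metrics": {"conversations": 5, "stars": 3.9, "reviews": 1}},
]

# Stage 2 (sketch): nested JSON-like records become flat rows.
rows = [flatten(snapshot) for snapshot in snapshots]
flat_csv = to_csv(rows)

# Stage 3 (sketch): keep the most recent details per GPT...
latest = {}
for row in sorted(rows, key=lambda r: r["crawled_at"]):
    latest[row["id"]] = {"id": row["id"], "name": row["name"]}

# ...and a timeline of performance metrics across all snapshots.
timeline = [
    {
        "id": row["id"],
        "crawled_at": row["crawled_at"],
        "conversations": row["metrics.conversations"],
        "stars": row["metrics.stars"],
        "reviews": row["metrics.reviews"],
    }
    for row in rows
]
```

Sorting by `crawled_at` before filling `latest` means later snapshots overwrite earlier ones, which works here only because the assumed dates are ISO-formatted strings that sort chronologically.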

Download Large Files from Google Drive

The files for ETL stages 1 and 2 are too large to be hosted on GitHub. To access the complete raw data, download the files from the following Google Drive link:

https://drive.google.com/drive/folders/1hUGnQ_AWeL2wi5UhUTt05dMHYb_FIvz4?usp=sharing