dime-worldbank/mega-boost

mega-boost

ELT pipelines that take BOOST-coded microdata for each country, clean and harmonize it, then store the result in a single Delta table. Aggregated budget and expenditure data are also created at the following levels for visualization in PowerBI:

  • by year by country
  • by year by country by adm1
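
To illustrate the two aggregation levels, here is a minimal sketch in plain Python rather than the Spark/DLT code the pipelines actually run on. The column names (`country_name`, `year`, `adm1_name`, `approved`, `executed`) and the sample rows are assumptions made up for this example and may not match the real harmonized schema.

```python
from collections import defaultdict

# Made-up rows standing in for harmonized microdata.
# Column names are hypothetical, chosen only for illustration.
rows = [
    {"country_name": "Country A", "year": 2020, "adm1_name": "Region One", "approved": 100.0, "executed": 90.0},
    {"country_name": "Country A", "year": 2020, "adm1_name": "Region Two", "approved": 50.0, "executed": 40.0},
    {"country_name": "Country A", "year": 2021, "adm1_name": "Region One", "approved": 120.0, "executed": 110.0},
    {"country_name": "Country B", "year": 2020, "adm1_name": "Region X", "approved": 80.0, "executed": 75.0},
]


def aggregate(rows, keys, measures=("approved", "executed")):
    """Sum each measure over the rows that share the same values for `keys`."""
    totals = defaultdict(lambda: dict.fromkeys(measures, 0.0))
    for row in rows:
        group = tuple(row[k] for k in keys)
        for measure in measures:
            totals[group][measure] += row[measure]
    return dict(totals)


# The two levels listed above.
by_year_country = aggregate(rows, ("year", "country_name"))
by_year_country_adm1 = aggregate(rows, ("year", "country_name", "adm1_name"))
```

In the actual pipeline the same grouping would presumably be expressed as a Spark `groupBy` over the Delta table; the sketch only shows which keys define each level.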

How to add a new country

  • Create a new folder with the country name in CamelCase
  • Code and test the ELT scripts in Databricks (set up a DLT workflow if necessary)
  • Add the script or DLT workflow as a step in the "BOOST Harmonize" job (Workflows > Jobs)
  • Add an ELT pipeline for the country's adm1-level subnational population to the mega-indicator repo, since that dataset is a public good in its own right
  • Add the intermediate table (containing only the new country's cleaned microdata) as a source for stacking in cross_country_aggregate_dlt.py
  • Check the PowerBI report to ensure the new country is reflected
  • If any of its adm1 areas are missing from the map, check that the adm1_name values match between the subnational population dataset and the BOOST dataset
  • Check that the by-year, by-country expenditure matches the aggregated data in the BOOST source file
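
The last two checks can be sketched in plain Python. This is an illustration under assumptions, not the repo's own validation code: the function names, the exact-match comparison of `adm1_name`, and the relative tolerance are all hypothetical choices.

```python
import math


def unmatched_adm1(boost_names, population_names):
    """Return adm1 names found in only one of the two datasets.

    Names are compared exactly, so differences in case, spacing, or
    spelling are reported rather than hidden.
    """
    boost, population = set(boost_names), set(population_names)
    return {
        "only_in_boost": sorted(boost - population),
        "only_in_population": sorted(population - boost),
    }


def mismatched_years(pipeline_totals, source_totals, rel_tol=1e-6):
    """Return the years whose totals are missing on one side or differ.

    Both arguments map year -> total expenditure. rel_tol is an assumed
    tolerance for floating-point noise, not a documented threshold.
    """
    years = set(pipeline_totals) | set(source_totals)
    return sorted(
        year
        for year in years
        if year not in pipeline_totals
        or year not in source_totals
        or not math.isclose(pipeline_totals[year], source_totals[year], rel_tol=rel_tol)
    )
```

An empty result from both functions suggests the new country's names and totals line up. Any names or years returned are the places to inspect first.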
