mega-boost

ELT pipelines that take BOOST-coded microdata for each country, clean and harmonize it, then store it in a single Delta table. Aggregated budget and expenditure data are also created at the following levels for visualization in PowerBI:

by year by country
by year by country by adm1
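
The two aggregation levels can be illustrated with a minimal pure-Python sketch. It is not the repository's implementation, which runs on Databricks. The field names `country_name`, `year`, `budget` and `expenditure`, the helper `aggregate`, and the sample figures are all assumptions made for illustration; only `adm1_name` appears in this README.

```python
from collections import defaultdict

# Hypothetical harmonized microdata rows. Field names and values are
# illustrative assumptions, not the actual schema of the Delta table.
rows = [
    {"country_name": "Kenya", "year": 2020, "adm1_name": "Nairobi", "budget": 100.0, "expenditure": 90.0},
    {"country_name": "Kenya", "year": 2020, "adm1_name": "Nairobi", "budget": 50.0, "expenditure": 45.0},
    {"country_name": "Kenya", "year": 2020, "adm1_name": "Mombasa", "budget": 80.0, "expenditure": 70.0},
    {"country_name": "Albania", "year": 2020, "adm1_name": "Tirana", "budget": 60.0, "expenditure": 55.0},
]


def aggregate(rows, keys):
    """Sum budget and expenditure over every distinct combination of `keys`."""
    totals = defaultdict(lambda: {"budget": 0.0, "expenditure": 0.0})
    for row in rows:
        group = tuple(row[k] for k in keys)
        totals[group]["budget"] += row["budget"]
        totals[group]["expenditure"] += row["expenditure"]
    return dict(totals)


# Level 1: by year by country
by_year_country = aggregate(rows, ["year", "country_name"])

# Level 2: by year by country by adm1
by_year_country_adm1 = aggregate(rows, ["year", "country_name", "adm1_name"])
```

Because both levels are derived from the same stacked rows, the adm1-level figures for a country and year sum to that country's national figure for the same year.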

How to add a new country

Create a new folder with the country name in CamelCase
Code and test the ELT scripts in Databricks (set up a DLT workflow if necessary)
Add the script/DLT workflow as steps to the "BOOST Harmonize" Job (Workflows > Jobs)
Add an ELT pipeline for the country's adm1-level subnational population to the mega-indicator repo, because that dataset is a public good in its own right.
In cross_country_aggregate_dlt.py, add the intermediate table (containing only the new country's cleaned microdata) as a source for stacking.
Check the PowerBI report to ensure the new country is reflected.
If some of its adm1 areas are missing from the map, check that the adm1_name values are aligned between the subnational population dataset and the BOOST dataset.
Check that the by year by country expenditure matches the aggregated data in the BOOST source file.
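
The last two checks can be sketched in plain Python. This is only an illustration: the helper names `missing_adm1_names` and `totals_match`, the relative tolerance, and the sample values are assumptions, not part of the repository.

```python
import math


def missing_adm1_names(boost_names, population_names):
    """Return adm1 names found in the BOOST data but not in the population data."""
    return sorted(set(boost_names) - set(population_names))


def totals_match(pipeline_total, source_total, rel_tol=1e-6):
    """Compare a pipeline total with the aggregate reported in the BOOST source file.

    The relative tolerance is an assumed value that absorbs rounding differences.
    """
    return math.isclose(pipeline_total, source_total, rel_tol=rel_tol)


# Hypothetical example: one adm1 name is spelled differently in the two datasets,
# so it would not join and would be missing from the map.
boost_adm1 = ["Nairobi", "Mombasa", "Taita Taveta"]
population_adm1 = ["Nairobi", "Mombasa", "Taita-Taveta"]
unmatched = missing_adm1_names(boost_adm1, population_adm1)

# Hypothetical example: compare one year's expenditure total for one country.
expenditure_ok = totals_match(205.0, 205.0)
```

Any name returned by `missing_adm1_names` points to a spelling or formatting mismatch to reconcile before the map can show every adm1 area.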

Folders and files

Albania
Bangladesh
Bhutan
Burkina_Faso
Colombia
Congo_DR
Kenya
Mozambique
Nigeria
Pakistan
Paraguay
Tunisia
auxiliary_data
quality
README.md
cross_country_aggregate_dlt.py
utils.py
