forked from JohnPaton/airbase

🌬 An easy downloader for the AirBase air quality data.

avaldebe/airbase

🌬 AirBase

An easy downloader for the AirBase air quality data.

AirBase is an air quality database provided by the European Environment Agency (EEA). The data is available for download at the portal, but the interface makes bulk downloads time-consuming. Hence, an easy Python-based interface.

Read the full documentation at https://airbase.readthedocs.io.

🔌 Installation

To install airbase, simply run

$ pip install airbase

🚀 Getting Started

🗺 Get info about available countries and pollutants:

>>> import airbase
>>> client = airbase.AirbaseClient()
>>> client.all_countries
['GR', 'ES', 'IS', 'CY', 'NL', 'AT', 'LV', 'BE', 'CH', 'EE', 'FR', 'DE', ...

>>> client.all_pollutants
{'k': 412, 'CO': 10, 'NO': 38, 'O3': 7, 'As': 2018, 'Cd': 2014, ...

>>> client.pollutants_per_country
{'AD': [{'pl': 'CO', 'shortpl': 10}, {'pl': 'NO', 'shortpl': 38}, ...

>>> client.search_pollutant("O3")
[{'pl': 'O3', 'shortpl': 7}, {'pl': 'NO3', 'shortpl': 46}, ...
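
The `search_pollutant("O3")` output above returns both `O3` and `NO3`, which suggests a substring match over the pollutant names. Below is a minimal offline sketch of that behaviour, assuming a case-insensitive substring match that lists shorter names first. The helper `search_pollutant_names` and the sample mapping are hypothetical and are not part of airbase:

```python
def search_pollutant_names(pollutants, query, limit=None):
    """Return pollutants whose name contains `query` (case-insensitive).

    `pollutants` maps name -> numeric ID, shaped like `client.all_pollutants`.
    Shorter names come first, so an exact match ranks at the top.
    """
    needle = query.lower()
    names = [name for name in pollutants if needle in name.lower()]
    names.sort(key=lambda name: (len(name), name))
    if limit is not None:
        names = names[:limit]
    return [{"pl": name, "shortpl": pollutants[name]} for name in names]


# Hypothetical sample built from the IDs shown in the examples above
sample = {"k": 412, "CO": 10, "NO": 38, "O3": 7, "NO3": 46}
matches = search_pollutant_names(sample, "O3")
```

The sort order is an assumption; the real client may rank or truncate results differently.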

🗂 Request download links from the server and save the resulting CSVs into a directory:

>>> r = client.request(country=["NL", "DE"], pl="NO3", year_from=2015)
>>> r.download_to_directory(dir="data", skip_existing=True)
Generating CSV download links...
100%|██████████| 2/2 [00:03<00:00,  2.03s/it]
Generated 12 CSV links ready for downloading
Downloading CSVs to data...
100%|██████████| 12/12 [00:01<00:00,  8.44it/s]

💾 Or concatenate them into one big file:

>>> r = client.request(country="FR", pl=["O3", "PM10"], year_to=2014)
>>> r.download_to_file("data/raw.csv")
Generating CSV download links...
100%|██████████| 2/2 [00:12<00:00,  7.40s/it]
Generated 2,029 CSV links ready for downloading
Writing data to data/raw.csv...
100%|██████████| 2029/2029 [31:23<00:00,  1.04it/s]
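
For readers curious what concatenating many CSVs into one file involves, here is a rough standard-library sketch that writes the header once and appends the data rows of every part. It is not airbase's actual implementation; `concatenate_csv` and the column names in the sample are made up for illustration:

```python
import csv
import io


def concatenate_csv(sources, destination):
    """Append every CSV in `sources` to `destination`, keeping one header.

    `sources` is an iterable of readable text streams and `destination`
    is a writable text stream. Returns the number of data rows written.
    """
    writer = csv.writer(destination, lineterminator="\n")
    header = None
    rows_written = 0
    for source in sources:
        reader = csv.reader(source)
        first = next(reader, None)
        if first is None:
            continue  # empty part, nothing to copy
        if header is None:
            header = first
            writer.writerow(header)
        elif first != header:
            raise ValueError("CSV headers differ, refusing to concatenate")
        for row in reader:
            writer.writerow(row)
            rows_written += 1
    return rows_written


# In-memory stand-ins for downloaded files (hypothetical columns)
part_a = io.StringIO("station,value\nA,1.0\nA,2.0\n")
part_b = io.StringIO("station,value\nB,3.0\n")
combined = io.StringIO()
n_rows = concatenate_csv([part_a, part_b], combined)
```

Refusing mismatched headers is a design choice of this sketch; it avoids silently misaligning columns when parts come from different schemas.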

📦 Download the entire dataset (not for the faint of heart):

>>> r = client.request()
>>> r.download_to_directory("data")
Generating CSV download links...
100%|██████████| 40/40 [03:38<00:00,  2.29s/it]
Generated 146,993 CSV links ready for downloading
Downloading CSVs to data...
  0%|          | 299/146993 [01:50<17:15:06,  2.36it/s]
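
To put the progress bar above in perspective, a quick back-of-the-envelope calculation of the remaining time, assuming the download rate stays constant (the helper `estimate_hours` is illustrative only):

```python
def estimate_hours(total_files, files_per_second, done=0):
    """Estimate the remaining download time in hours at a constant rate."""
    if files_per_second <= 0:
        raise ValueError("files_per_second must be positive")
    remaining = total_files - done
    return remaining / files_per_second / 3600


# Figures taken from the progress bar above
hours_left = estimate_hours(146_993, 2.36, done=299)
```

With these figures the estimate comes out at a little over 17 hours, in line with the progress bar's own projection.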

🌑 Don't forget to get the metadata about the measurement stations:

>>> client.download_metadata("data/metadata.tsv")
Writing metadata to data/metadata.tsv...
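
Once saved, the metadata is a tab-separated file, so it can be read with the standard library. The sketch below parses an in-memory sample; the column names `country` and `station` are placeholders, so check the header of the real file before relying on them:

```python
import csv
import io


def read_tsv(stream):
    """Parse a tab-separated text stream into a list of dicts keyed by header."""
    return list(csv.DictReader(stream, delimiter="\t"))


# Hypothetical sample; the real metadata file has its own column names
sample = io.StringIO("country\tstation\nNL\tSTA-1\nDE\tSTA-2\nNL\tSTA-3\n")
rows = read_tsv(sample)
stations_nl = [row["station"] for row in rows if row["country"] == "NL"]
```

For the real file, pass an open file handle instead, for example `open("data/metadata.tsv", newline="", encoding="utf-8")`; the UTF-8 encoding is an assumption.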

🛣 Roadmap

  • Parallel CSV downloads (contributed by @avaldebe)
  • CLI to avoid using Python altogether (contributed by @avaldebe)
  • Data wrangling module for AirBase output data
