key error for query with empty result #29

Closed
JGuetschow opened this issue Dec 1, 2021 · 6 comments · Fixed by #30
Labels
invalid This doesn't seem right

Comments

@JGuetschow

JGuetschow commented Dec 1, 2021

  • UNFCCC DI API version: 2.0.1
  • Python version: 3.8.10
  • Operating System: Linux

Description

When using query (the single-category interface) and no data is available, a KeyError is raised instead of returning an empty dataframe, None, or a message saying that there are no results.

What I Did

import unfccc_di_api

reader = unfccc_di_api.UNFCCCApiReader()
# party_codes_nai is defined elsewhere; its definition is not part of this report
test = reader.non_annex_one_reader.query(party_codes=party_codes_nai, category_ids=[14817])

JGuetschow added the "invalid" (This doesn't seem right) label Dec 1, 2021

@mikapfl
Member

mikapfl commented Dec 1, 2021

Hm, but that is what I would expect, no? KeyError means "no data for this key", and can be handled programmatically. None would be terribly wrong (the query function usually returns a dataframe, returning None will just lead to confusing downstream errors after df = query(…)), returning a message is also not type-safe and will be confusing. The only other option I see would be an empty dataframe. But why? Usually, the user can't really do anything useful with an empty dataframe, and failing early with a KeyError ensures that the user doesn't waste their time trying any analysis on the empty results.
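
To illustrate the point about None, here is a minimal sketch of the kind of downstream error it would cause. The function `query_returning_none` is a hypothetical stand-in, not part of unfccc_di_api; it only mimics a query that signals "no data" by returning None.

```python
def query_returning_none():
    # Hypothetical stand-in for a query that returns None when there is no data.
    return None


df = query_returning_none()

try:
    # The caller expects a dataframe and indexes it as usual.
    df["party"]
except TypeError as err:
    # The failure surfaces here, far from the query, and does not mention
    # that the query result was empty.
    message = str(err)
```

The resulting TypeError says nothing about the query, which is why an exception raised at the query itself is easier to diagnose.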

@mikapfl
Member

mikapfl commented Dec 1, 2021

We could have our own class NoDataError inheriting from KeyError to make it easy to distinguish this error from other KeyErrors. This has the advantage that it is easy to catch this specific error, and the disadvantage that the meaning of unfccc_di_api.NoDataError is less immediately obvious to Python people than KeyError.
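
A minimal sketch of that idea, assuming nothing about how the library is actually implemented: `NoDataError` is the name proposed above, while `lookup` and `data` are hypothetical stand-ins for a query and its backing data.

```python
class NoDataError(KeyError):
    """Sketch of the proposed error: the query was valid but matched no data."""


def lookup(table, key):
    # Hypothetical stand-in for a query: translate the generic KeyError
    # into the more specific error with a descriptive message.
    try:
        return table[key]
    except KeyError:
        raise NoDataError(f"no data for {key!r}") from None


data = {"first": [2, 3]}

try:
    lookup(data, "third")
except KeyError as err:
    # Code that already catches KeyError keeps working, because
    # NoDataError is a subclass of KeyError.
    caught_as_no_data = isinstance(err, NoDataError)
```

Callers who care about the distinction can write `except NoDataError:` instead, and other KeyErrors will then propagate unchanged.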

@mikapfl
Member

mikapfl commented Dec 1, 2021

Compare what we do with what pandas does:

In [1]: import pandas as pd

In [3]: df = pd.DataFrame([{"a": 2, "b": 3}, {"a": 4, "b": 12}], index=["first", "second"])

In [4]: df
Out[4]: 
        a   b
first   2   3
second  4  12

In [5]: df.loc["third"]
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
~/.local/lib/python3.9/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   3360             try:
-> 3361                 return self._engine.get_loc(casted_key)
   3362             except KeyError as err:

~/.local/lib/python3.9/site-packages/pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

~/.local/lib/python3.9/site-packages/pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'third'

@JGuetschow
Author

My problems with the current key error are
1.) If you run stuff for several queries in a row you would need to catch it, else your code fails. That of course is doable. You might not want to delete the query from your list as you don't want to manually check before every run if any of the formerly empty categories now have data.
2.) The error thrown is for key "party" so it's not really obvious that your query result is empty, I think. As said in the original issue text, I would be happy with an error that says "no data".
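
The workaround described in point 1 could look roughly like this. It is only a sketch: `fake_query` and `_fake_data` are hypothetical stand-ins for the real reader, and category id 1 is an arbitrary placeholder (14817 is the id from the report above).

```python
def run_queries(query, category_ids):
    """Run query once per category; collect results and note empty categories."""
    results = {}
    empty = []
    for category_id in category_ids:
        try:
            results[category_id] = query(category_id)
        except KeyError:
            # No data (yet) for this category: record it and move on,
            # so the category is tried again on the next run.
            empty.append(category_id)
    return results, empty


# Hypothetical stand-in data: only category 1 has any rows.
_fake_data = {1: ["row"]}


def fake_query(category_id):
    # Raises KeyError for categories without data, like the current behaviour.
    return _fake_data[category_id]


results, empty = run_queries(fake_query, [1, 14817])
```

Note that catching a bare KeyError this way would also swallow unrelated KeyErrors, which is one reason a dedicated error with a clear message is easier to work with.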

@mikapfl
Member

mikapfl commented Dec 1, 2021

True, KeyError for party looks like the party doesn't exist, that's bad. So a good solution would be the unfccc_di_api.EmptyQueryResultError, I guess? Or would you prefer an empty dataframe?

@JGuetschow
Author

I think an error is best, as empty dataframes can also pose problems when working with them, so the error would just occur a bit later.

2 participants