Return categorical dtypes from (JDBC)Backend to reduce memory usage #228

Open
zikolach opened this issue Nov 28, 2019 · 1 comment
Labels
backend.jdbc Interaction with ixmp_source via JDBCBackend & JPype

Comments

@zikolach
Contributor

Whenever possible, switch to categorical dtypes instead of object, as this can significantly reduce memory usage when more than 50% of the values in a column are repeated.
Take extra care, because comparison of dataframes (e.g. in tests) is sensitive to dtypes (e.g. the order of the values in a category).
Here is an article providing more information about the internal structures of dataframes in pandas.
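
As a rough illustration of the memory effect described above, here is a minimal sketch using plain pandas. The column name, the data, the helper `to_categorical`, and the `max_unique_ratio` parameter are all hypothetical and not part of ixmp. The 0.5 threshold is only one possible reading of the "more than 50%" rule of thumb.

```python
import pandas as pd

# Hypothetical data: 4 distinct strings repeated 25,000 times each,
# stored explicitly with object dtype.
values = ["AT", "DE", "FR", "IT"] * 25_000
df = pd.DataFrame({"region": pd.Series(values, dtype=object)})


def to_categorical(frame, max_unique_ratio=0.5):
    """Return a copy in which object columns with few distinct values
    (relative to the number of rows) are converted to categorical dtype."""
    out = frame.copy()
    for col in out.columns:
        if out[col].dtype != object:
            continue
        # Share of distinct values; guard against empty frames
        ratio = out[col].nunique() / max(len(out), 1)
        if ratio <= max_unique_ratio:
            out[col] = out[col].astype("category")
    return out


converted = to_categorical(df)

# deep=True also counts the Python string objects behind an object column
before = df.memory_usage(deep=True).sum()
after = converted.memory_usage(deep=True).sum()
print(f"object: {before} bytes, category: {after} bytes")
```

The exact byte counts depend on the pandas version and platform, so none are quoted here. The categorical column stores small integer codes plus one copy of each distinct string, which is where the saving comes from.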

@khaeru khaeru changed the title Use categorical dtypes to optimize memory itilization Return categorical dtypes from (JDBC)Backend to reduce memory usage Nov 29, 2019
@khaeru
Member

khaeru commented Nov 30, 2019

It should be decided whether this will be something that is:

  1. allowed by—that is, not specified by, but compatible with—the Backend API, and implemented by JDBCBackend, or
  2. specified as part of the Backend API, and then implemented by JDBCBackend.

The changes to the tests and documentation will differ depending on whether (1) or (2) is chosen.
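
To make the effect on tests concrete, here is a hedged sketch of what a comparison might look like under option (1), where a backend may return either object or categorical columns. The frames and the helper `decategorize` are hypothetical and do not come from the ixmp test suite. Under option (2), a test would presumably assert the categorical dtype directly instead.

```python
import pandas as pd
from pandas.testing import assert_frame_equal

# Hypothetical expected data, with an object-dtype string column
expected = pd.DataFrame(
    {
        "node": pd.Series(["a", "b", "a"], dtype=object),
        "value": [1.0, 2.0, 3.0],
    }
)

# Stand-in for what a backend might return: same values, categorical dtype
returned = expected.copy()
returned["node"] = returned["node"].astype("category")

# A strict comparison is sensitive to the dtype difference
try:
    assert_frame_equal(expected, returned)
    strict_ok = True
except AssertionError:
    strict_ok = False


def decategorize(frame):
    """Return a copy with categorical columns cast back to object dtype."""
    out = frame.copy()
    for col in out.columns:
        if isinstance(out[col].dtype, pd.CategoricalDtype):
            out[col] = out[col].astype(object)
    return out


# A dtype-agnostic comparison: normalize first, then compare values
assert_frame_equal(decategorize(returned), expected)
```

Normalizing before comparing keeps such a test independent of whether a given backend chooses to return categoricals, at the cost of no longer checking the dtype at all.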

@khaeru khaeru added the backend.jdbc Interaction with ixmp_source via JDBCBackend & JPype label Apr 27, 2020