fix: Create indices with object dtype #1084

Merged (4 commits) on Sep 6, 2024

effigies (Collaborator) commented Sep 6, 2024

Pandas is currently generating FutureWarnings:

.../.venv/lib/python3.12/site-packages/bids/variables/variables.py:535: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'False' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  df.iloc[prev_ix:prev_ix + reps, col_ix] = v

    def _create_index(all_keys, all_reps, all_ents):
        all_keys = np.array(sorted(all_keys))
        df = pd.DataFrame(np.zeros((sum(all_reps), len(all_keys))), columns=all_keys)
        prev_ix = 0
        for i, reps in enumerate(all_reps):
            for k, v in all_ents[i].items():
                col_ix = np.where(all_keys == k)[0][0]
                df.iloc[prev_ix:prev_ix + reps, col_ix] = v
            prev_ix = reps
        return df

After debugging, the issue is that we create the dataframe from np.zeros(), so its dtype is float, but entities and metadata may be strings, lists, etc. Using dtype=object resolves this.
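
A minimal sketch of the idea described above, assuming the fix amounts to allocating the frame with dtype=object. The function name `create_index_object` and the sample entities are hypothetical, and advancing the row offset with `+=` is an assumption about the intended behaviour, not a copy of the merged code.

```python
import numpy as np
import pandas as pd


def create_index_object(all_keys, all_reps, all_ents):
    """Hypothetical variant of _create_index using object-dtype columns."""
    all_keys = np.array(sorted(all_keys))
    # Assumption: object-dtype columns accept strings, bools, and other
    # non-float values without pandas complaining about incompatible dtypes.
    df = pd.DataFrame(
        np.zeros((sum(all_reps), len(all_keys)), dtype=object),
        columns=all_keys,
    )
    prev_ix = 0
    for reps, ents in zip(all_reps, all_ents):
        for k, v in ents.items():
            col_ix = np.where(all_keys == k)[0][0]
            df.iloc[prev_ix:prev_ix + reps, col_ix] = v
        # Assumption: move the offset past the rows just filled.
        prev_ix += reps
    return df


# Made-up entities: two rows for the first run, one row for the second.
index = create_index_object(
    all_keys={"subject", "task", "flag"},
    all_reps=[2, 1],
    all_ents=[
        {"subject": "01", "task": "rest", "flag": False},
        {"subject": "02", "task": "nback", "flag": True},
    ],
)
```

With a float-dtype frame, the same assignments of `"01"` or `False` are the kind that trigger the FutureWarning quoted above.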

codecov bot commented Sep 6, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 89.75%. Comparing base (03a1af5) to head (31f0441).
Report is 5 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #1084   +/-   ##
=======================================
  Coverage   89.75%   89.75%           
=======================================
  Files          63       63           
  Lines        7123     7123           
  Branches     1363     1364    +1     
=======================================
  Hits         6393     6393           
  Misses        531      531           
  Partials      199      199           


effigies (Collaborator, Author) commented Sep 6, 2024

Have resolved a few more warnings while I'm here.

@adelavega I'm done working on these for now. Care to review? Happy to explain changes more in case the commit messages and comments are insufficient.

@@ -156,7 +156,7 @@ def _transform(self, var, constraint='none', ref_level=None, sep='.'):
            continue
        name = ''.join([var.name, sep, str(lev)])
        lev_data = data.copy()
        lev_data['amplitude'] = new_cols[lev].astype(float)
Collaborator

Why did we have this in the first place, and why is it no longer necessary?

Collaborator Author

get_dummies() creates bools by default. This brought it back to float when reinserting into the dataframe. Instead, I make them floats at the beginning, since there's nothing that depends on them being bools above.
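
To illustrate the reply above, here is a sketch that assumes a recent pandas in which `get_dummies()` returns boolean indicator columns by default. Passing `dtype=float` is one way to get floats up front; whether the merged change uses this argument or an early `astype(float)` call is not shown in this thread, and the sample data is made up.

```python
import pandas as pd

# Made-up categorical variable with two levels.
levels = pd.Series(["a", "b", "a"], name="condition")

# Default indicator columns (boolean in recent pandas versions).
default_cols = pd.get_dummies(levels)

# Request float indicators when the dummies are created,
# so no cast is needed when they are reinserted later.
float_cols = pd.get_dummies(levels, dtype=float)

data = pd.DataFrame({"onset": [0.0, 1.0, 2.0]})
lev_data = data.copy()
lev_data["amplitude"] = float_cols["a"]
```

Either approach leaves `amplitude` as a float column; the difference is only where the conversion happens.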

adelavega (Collaborator)

Looks good, just one question

@effigies effigies merged commit bc93815 into bids-standard:master Sep 6, 2024
24 checks passed
@effigies effigies deleted the fix/pandas_deprecation branch September 6, 2024 15:37
2 participants