Baseline model raise exception when using float32 value #496

Closed
moshe-rl opened this issue Nov 30, 2021 · 12 comments
Labels
bug Something isn't working good first issue Good for newcomers help wanted Extra attention is needed

@moshe-rl

In 'Explain' mode, for a regression task, the Baseline model raises an exception.
Other models complete training successfully.

errors.md file content:

Error for 1_Baseline
Object of type 'float32' is not JSON serializable
Traceback (most recent call last):
  File "/home/moshe/.local/lib/python3.6/site-packages/supervised/base_automl.py", line 970, in _fit
    trained = self.train_model(params)
  File "/home/moshe/.local/lib/python3.6/site-packages/supervised/base_automl.py", line 312, in train_model
    mf.save(model_path)
  File "/home/moshe/.local/lib/python3.6/site-packages/supervised/model_framework.py", line 395, in save
    fout.write(json.dumps(desc, indent=4))
  File "/usr/lib/python3.6/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
  File "/usr/lib/python3.6/json/encoder.py", line 201, in encode
    chunks = list(chunks)
  File "/usr/lib/python3.6/json/encoder.py", line 430, in _iterencode
    yield from _iterencode_dict(o, _current_indent_level)
  File "/usr/lib/python3.6/json/encoder.py", line 404, in _iterencode_dict
    yield from chunks
  File "/usr/lib/python3.6/json/encoder.py", line 437, in _iterencode
    o = _default(o)
  File "/usr/lib/python3.6/json/encoder.py", line 180, in default
    o.__class__.__name__)
TypeError: Object of type 'float32' is not JSON serializable

@pplonski
Contributor

Thank you for reporting. Please provide a minimal code example to reproduce the issue.

@moshe-rl
Author

moshe-rl commented Dec 2, 2021

import numpy as np
import pandas as pd
from supervised.automl import AutoML
import tempfile
import os

# create random input data
data = np.random.randn(1000, 11)

# convert to dataframe
data_df = pd.DataFrame(data)
data_df.columns = [f'f{idx}' for idx in range(data.shape[1]-1)] + ['y']

# separate x and y
y_vec = data_df.pop('y').values
x_df = data_df

# train model
model = AutoML(
    results_path=os.path.join(tempfile.gettempdir(), 'aml_results__6'), 
    ml_task='regression',
    mode='Explain', 
    eval_metric='rmse',
    explain_level=2
)

##### HERE IS THE IMPORTANT PART #####
model.fit(x_df, y_vec.astype(np.float32))

Some things I've noticed:

  1. Some warnings are raised:
Numerical issues were encountered when centering the data and might not be solved. Dataset may contain too large values. You may need to prescale your features.
Numerical issues were encountered when scaling the data and might not be solved. The standard deviation of the data is probably very close to 0. 
Numerical issues were encountered when centering the data and might not be solved. Dataset may contain too large values. You may need to prescale your features.
  2. When using y_vec as is, or even y_vec.astype(np.float64), it works fine; the only problem is with float32.
  3. I dug a bit into the code: the problem seems to be with the 'final_loss' key of the dict (desc) that the code tries to store as JSON.
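
The float32 vs. float64 observation above is consistent with how the standard json module treats NumPy scalars: np.float64 subclasses Python's float, so json.dumps accepts it, while np.float32 does not. A minimal sketch of that behaviour follows; the dictionaries desc64 and desc32 are stand-ins I made up for illustration, not the actual desc object from model_framework.py.

```python
import json

import numpy as np

# Hypothetical stand-ins for the dict that gets written to disk.
desc64 = {"final_loss": np.float64(0.5)}
desc32 = {"final_loss": np.float32(0.5)}

# np.float64 inherits from the built-in float; np.float32 does not.
print(issubclass(np.float64, float))
print(issubclass(np.float32, float))

# The float64 value serializes like an ordinary Python float.
ok = json.dumps(desc64)
print(ok)

# The float32 value is rejected by the default encoder.
try:
    json.dumps(desc32)
    failed = False
except TypeError as exc:
    failed = True
    print(exc)
```

This only illustrates the json module's behaviour; it does not show where in the library the float32 value originates.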

@pplonski
Contributor

pplonski commented Dec 3, 2021

@moshe-rl great job! Thank you!

@pplonski pplonski added bug Something isn't working good first issue Good for newcomers help wanted Extra attention is needed labels Dec 3, 2021
@pplonski pplonski added this to the 0.11.1 milestone Feb 14, 2022
DanielR59 added a commit to DanielR59/mljar-supervised that referenced this issue Feb 28, 2022
@pplonski
Contributor

Should be fixed with @DanielR59's PR.

@csetzkorn

just used latest version and encountered this issue as well ...

@pplonski
Contributor

pplonski commented Sep 4, 2023

Hi @csetzkorn,

could you provide minimal code to reproduce the issue?

@pplonski pplonski reopened this Sep 4, 2023
@JacobMarley

JacobMarley commented Sep 25, 2023

Yep. Getting this too. Problem caused by using xgb.

@DanielR59
Contributor

Hi @pplonski, the solution provided before is no longer in the code. I've been looking for the error, and it seems to be at line 711 of base_automl.py.

state["all_params"] contains objects that are not JSON serializable.

Still looking for a solution...
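
One way to narrow down which entries of a nested parameter dict are the problem is to walk it and try json.dumps on every leaf. The sketch below is only an illustration: find_unserializable is a hypothetical helper, and the params dict is made-up sample data, not the real contents of state["all_params"].

```python
import json

import numpy as np


def find_unserializable(obj, path="root"):
    """Yield (path, type name) for every leaf value that json.dumps rejects."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            yield from find_unserializable(value, f"{path}[{key!r}]")
    elif isinstance(obj, (list, tuple)):
        for idx, value in enumerate(obj):
            yield from find_unserializable(value, f"{path}[{idx}]")
    else:
        try:
            json.dumps(obj)
        except TypeError:
            yield path, type(obj).__name__


# Made-up sample data for illustration only.
params = {
    "learner": {"eta": np.float32(0.1), "max_depth": 6},
    "seeds": [1, np.int64(2)],
}

bad = list(find_unserializable(params))
for location, type_name in bad:
    print(location, type_name)
```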

@pplonski
Contributor

Hi @DanielR59,

Thank you for the information. Could you please provide code to reproduce the issue? We have an intern until the end of the week who can look into it: @Bocianski

@DanielR59
Contributor

DanielR59 commented Sep 26, 2023

This issue can be reproduced using the code provided earlier: #496 (comment)

@pplonski
Contributor

@DanielR59 @Bocianski I confirm, I can reproduce the issue with the above code.

@pplonski
Contributor

I fixed it myself :) A custom JSON encoder that can handle numpy types has been added.
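
For readers who hit the same error in their own code, a custom encoder along these lines is a common approach. This is a minimal sketch under my own assumptions, not the encoder that was actually added to the library: the class name NumpyJSONEncoder and the set of handled types are hypothetical.

```python
import json

import numpy as np


class NumpyJSONEncoder(json.JSONEncoder):
    """Hypothetical encoder that converts NumPy values to built-in types."""

    def default(self, o):
        # default() is only called for objects json cannot serialize itself.
        if isinstance(o, np.integer):
            return int(o)
        if isinstance(o, np.floating):
            return float(o)
        if isinstance(o, np.bool_):
            return bool(o)
        if isinstance(o, np.ndarray):
            return o.tolist()
        # Fall back to the base class, which raises TypeError.
        return super().default(o)


# Made-up sample data for illustration only.
desc = {"final_loss": np.float32(0.5), "n_iter": np.int64(3)}
encoded = json.dumps(desc, cls=NumpyJSONEncoder)
print(encoded)
```

Passing the encoder through the cls argument keeps the conversion in one place, so call sites such as json.dumps(desc, indent=4) only need one extra keyword.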

5 participants