TypeError: Object of type 'int64' is not JSON serializable during trainer.save('bah') #21

Open
ljmartin opened this issue May 17, 2019 · 1 comment

ljmartin commented May 17, 2019

Hi,
Looking forward to using FastXML. This is not quite a bug, but it might be worth handling, so I thought I'd report it in case anyone else comes across it. Python's json module can't serialize NumPy scalar types such as int64, so when the label lists are built from NumPy arrays, the values in Y have to be converted to plain int first.
This is my setup:

import numpy as np
from scipy.sparse import csr_matrix
from sklearn.datasets import make_multilabel_classification

from fastxml import Trainer, Inferencer

X, Y = make_multilabel_classification(n_classes=10, n_labels=1,
                                      allow_unlabeled=True,
                                      random_state=1)

X = [X[i].astype('float32') for i in range(X.shape[0])]
X_sparse = [csr_matrix(b) for b in X]

# With this line, trainer.save('bah') fails: np.where returns numpy.int64
# indices, which are not JSON serializable.
Y_list = [list(np.where(i==1)[0]) for i in Y]

# This line converts the indices to plain Python ints, so trainer.save('bah')
# works further down.
Y_list = [[int(k) for k in list(np.where(i==1)[0])] for i in Y]

trainer = Trainer(n_trees=10, n_jobs=1)
trainer.fit(X_sparse, Y_list)

trainer.save('bah')
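
For anyone hitting the same error, here is a minimal sketch of the underlying cause that does not need FastXML at all. It assumes the labels end up in the standard json module, as the error message suggests. `to_native_labels` is a hypothetical helper name, not part of FastXML, and the small `Y` matrix is made up for illustration.

```python
import json

import numpy as np


def to_native_labels(Y):
    """Turn a binary indicator matrix into per-row lists of plain Python ints.

    Hypothetical helper for illustration; not part of FastXML.
    """
    # ndarray.tolist() converts NumPy scalars to native Python types.
    return [np.flatnonzero(row == 1).tolist() for row in Y]


# Small made-up indicator matrix: 3 samples, 3 classes.
Y = np.array([[1, 0, 1],
              [0, 0, 0],
              [0, 1, 0]])

# list() over a NumPy array keeps NumPy integer scalars, which json rejects.
numpy_labels = [list(np.where(row == 1)[0]) for row in Y]
try:
    json.dumps(numpy_labels)
    numpy_failed = False
except TypeError:
    numpy_failed = True

# Native ints serialize without any custom encoder.
native_labels = to_native_labels(Y)
encoded = json.dumps(native_labels)
```

Using `.tolist()` should be equivalent to the `int(k)` comprehension above; it just does the conversion in one call per row.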

Refefer (Owner) commented May 17, 2019

Thanks for the find; good ol' NumPy. I'll make a note in the README that labels need to be JSON serializable.
