dump RGF and FastRGF to the JSON file #167

Open
StrikerRUS opened this issue Mar 17, 2018 · 6 comments
@StrikerRUS
Member

Initial support for dumping the RGF model is already implemented in #161. At present, it's possible to print the model to the console. But it would be a good idea to add the ability to dump the model to a file (e.g. JSON).

@StrikerRUS:

Really like new features introduced in this PR. But please think about "real dump" of a model. I suppose it'll be more useful than just printing to the console.

@fukatani:

For example dump in JSON format like lightGBM.
It's convenient and we may support it in the future, but we should do it with another PR.

@StrikerRUS
Member Author

@fukatani Is there any progress on the real dump to JSON?

@fukatani
Member

Unfortunately, I don't think I can get to this for a while.
RGF on LightGBM is likely to take time. I am stuck.

@StrikerRUS
Member Author

StrikerRUS commented Jun 11, 2018

Oh, OK. I just wondered 😃 .

Then I can start refining the repo structure, so you'll be needed only for reviews.

@StrikerRUS
Member Author

It seems that FastRGF already has this functionality:

print_forest.insert(prefix+"print-forest","","if nonempty, print forest to this file",this);

@StrikerRUS StrikerRUS changed the title dump RGF to the JSON file dump RGF and FastRGF to the JSON file Jul 20, 2018
@StrikerRUS
Member Author

What do you think about the following JSON schema for the RGF model?

Code for the model:

from sklearn.datasets import load_iris
from rgf import RGFClassifier

# Iris has 3 classes, so the classifier trains one forest per class.
data = load_iris()

clf = RGFClassifier()
clf.fit(data.data, data.target)

# Prints the text dump of the model to the console.
clf.dump_model()

The beginning of the model:

"dump_model": 
   model_fn=D:\rgf\temp\26cf8e70-a394-4982-8169-e3d2f66be1941.model-10
Sat Jul 28 02:36:18 2018: Dump model ... 
constant=0, orgdim=4, #tree=500
tree[0]

[  0], depth=0, gain=85.2273, F2, 2.45
  [  1], (0.0111), depth=1, gain=0
  [  2], (-0.0123), depth=1, gain=0
tree[1]

[  0], depth=0, gain=23.5202, F2, 2.45
  [  1], (0.0111), depth=1, gain=0
  [  2], (-0.0122), depth=1, gain=0
tree[2]

[  0], depth=0, gain=9.90365, F2, 2.45
  [  1], (0.0111), depth=1, gain=0
  [  2], (-0.0122), depth=1, gain=0
tree[3]

[  0], depth=0, gain=5.03139, F2, 2.45
  [  1], (0.0111), depth=1, gain=0
  [  2], (-0.0122), depth=1, gain=0
tree[4]

JSON sample:

{
  "num_forests": "3",
  "forests": [
    {
      "num_trees": "500",
      "trees": [
        {
          "tree_index": "0",
          "nodes": [
            {
              "node_index": "0",
              "is_leaf": "False",
              "depth": "0",
              "gain": "85.2273",
              "feature": "F2",
              "threshold": "2.45",
              "value": "NA"
            },
            {
              "node_index": "1",
              "is_leaf": "True",
              "depth": "1",
              "gain": "0",
              "feature": "NA",
              "threshold": "NA",
              "value": "0.0111"
            },
            {
              "node_index": "2",
              "is_leaf": "True",
              "depth": "1",
              "gain": "0",
              "feature": "NA",
              "threshold": "NA",
              "value": "-0.0123"
            }
          ]
        },
        {
          "tree_index": "1",
          "nodes": [
            {
              "node_index": "0",
              "is_leaf": "False",
              "depth": "0",
              "gain": "23.5202",
              "feature": "F2",
              "threshold": "2.45",
              "value": "NA"
            },
            {
              "node_index": "1",
              "is_leaf": "True",
              "depth": "1",
              "gain": "0",
              "feature": "NA",
              "threshold": "NA",
              "value": "0.0111"
            },
            {
              "node_index": "2",
              "is_leaf": "True",
              "depth": "1",
              "gain": "0",
              "feature": "NA",
              "threshold": "NA",
              "value": "-0.0122"
            }
          ]
        }
      ]
    },
    {
      "num_trees": "264",
      "trees": []
    },
    {
      "num_trees": "349",
      "trees": []
    }
  ]
}

https://jsoneditoronline.org/?id=5da3e57a6ae24cb5a6b583c8f41dd307
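
To make the mapping from the console dump to the JSON sample above concrete, here is a minimal sketch of a converter. It is not part of rgf_python: the names `parse_forest` and `dumps_to_json` are hypothetical, and the regular expressions assume the dump always looks like the excerpt above (a `#tree=` header, `tree[i]` markers, split nodes ending in `F<n>, <threshold>`, and leaf nodes carrying a value in parentheses). All values are kept as strings, as in the sample.

```python
import json
import re

# Patterns inferred from the console excerpt above (an assumption, not a spec).
HEADER_RE = re.compile(r"#tree=(\d+)")
TREE_RE = re.compile(r"^tree\[(\d+)\]$")
NODE_RE = re.compile(
    r"^\[\s*(?P<index>\d+)\],\s*"
    r"(?:\((?P<value>[^)]+)\),\s*)?"            # leaf value, e.g. (0.0111)
    r"depth=(?P<depth>\d+),\s*gain=(?P<gain>[^,\s]+)"
    r"(?:,\s*(?P<feature>F\d+),\s*(?P<threshold>\S+))?$"  # split, e.g. F2, 2.45
)


def parse_forest(dump_text):
    """Parse one text dump into a dict shaped like one entry of "forests"."""
    num_trees = None
    trees = []
    for raw_line in dump_text.splitlines():
        line = raw_line.strip()
        header = HEADER_RE.search(line)
        if header:
            num_trees = header.group(1)
            continue
        tree = TREE_RE.match(line)
        if tree:
            trees.append({"tree_index": tree.group(1), "nodes": []})
            continue
        node = NODE_RE.match(line)
        if node and trees:
            trees[-1]["nodes"].append({
                "node_index": node.group("index"),
                # A node without a split feature is treated as a leaf.
                "is_leaf": str(node.group("feature") is None),
                "depth": node.group("depth"),
                "gain": node.group("gain"),
                "feature": node.group("feature") or "NA",
                "threshold": node.group("threshold") or "NA",
                "value": node.group("value") or "NA",
            })
        # Any other line (model_fn=..., timestamps, blanks) is ignored.
    if num_trees is None:
        num_trees = str(len(trees))
    return {"num_trees": num_trees, "trees": trees}


def dumps_to_json(dump_texts):
    """Combine one text dump per forest into the proposed JSON document."""
    forests = [parse_forest(text) for text in dump_texts]
    return json.dumps(
        {"num_forests": str(len(forests)), "forests": forests}, indent=2
    )


SAMPLE = """constant=0, orgdim=4, #tree=500
tree[0]

[  0], depth=0, gain=85.2273, F2, 2.45
  [  1], (0.0111), depth=1, gain=0
  [  2], (-0.0123), depth=1, gain=0
"""

if __name__ == "__main__":
    print(dumps_to_json([SAMPLE]))
```

For a multiclass model, `dumps_to_json` would be called with one dump text per class; how those texts are captured from `dump_model()` is left open here.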
