How to register a MultiProcessDatasetReader and config it in config.json? #3794

haoyuan80s · 2020-02-17T19:07:01Z

Hi,

I have the following dataset reader config in a config.json file:

"dataset_reader" : {
    "type": "my_data_reader"
}

I am trying to make it multi-process. I am able to do this in Python as follows:

reader_ = DatasetReader.by_name(config['dataset_reader'].pop('type'))()
reader = MultiprocessDatasetReader(reader_, num_workers=32)

How can I configure a MultiprocessDatasetReader in my config.json file?

Thanks!

DeNeutoy · 2020-02-18T16:59:53Z

Hi @haoyuan80s,

Unfortunately, the correct answer at the moment is not to bother, because you won't see any effective speedup. We are working very hard on fixing this problem; see #3386, #3529, #3700, etc.

If you have found it to be faster, then you should just be able to use something like this config file:
https://github.com/allenai/allennlp/blob/v0.9.0/training_config/bidirectional_language_model.jsonnet#L41
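Applied to the reader from the question, the pattern in that file would look roughly like the sketch below. It assumes the wrapper is registered under the name "multiprocess" and accepts "base_reader" and "num_workers" keys, as the linked v0.9.0 config appears to do; please check those names against the linked file before relying on them. The JSON is embedded in a short Python snippet only so that it can be parsed and sanity-checked.

```python
import json

# Sketch of a config.json fragment. The "multiprocess" type name and the
# "base_reader" / "num_workers" keys are assumptions taken from the linked
# v0.9.0 training config, not verified against other AllenNLP versions.
CONFIG_JSON = """
{
    "dataset_reader": {
        "type": "multiprocess",
        "base_reader": {
            "type": "my_data_reader"
        },
        "num_workers": 32
    }
}
"""

# Parse the fragment to confirm it is valid JSON and has the intended shape.
config = json.loads(CONFIG_JSON)
reader_config = config["dataset_reader"]

print("wrapper type:", reader_config["type"])
print("wrapped reader:", reader_config["base_reader"]["type"])
print("workers:", reader_config["num_workers"])
```

The original reader config moves unchanged under "base_reader", so any extra parameters "my_data_reader" takes would go there rather than at the top level.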

haoyuan80s · 2020-02-19T02:40:12Z

There is actually no speedup. OK, I will just wait.

Thanks

dirkgr · 2020-02-21T20:02:53Z

I will close this issue then, since there is no action other than what @DeNeutoy is already working on.

dirkgr closed this as completed Feb 21, 2020
