Is your feature request related to a problem? Please describe.
It looks like when multiple datasets are configured, e.g. in local_setup.yml, each dataset is sampled in proportion to its weight rather than used in full. For example, with two datasets of 100 samples each and weights [1, 2], training ends up with 33 samples from dataset A and 66 samples from dataset B.
Is there a way to keep 100 samples of dataset A and 100 samples of dataset B?
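To make the arithmetic concrete, here is a minimal sketch of weight-proportional sampling. It is not the code from helper.cpp; the function names are hypothetical, and it assumes a total draw of 100 samples, which is what the numbers in my example imply.

```python
def expected_counts(weights, total_samples):
    """Split total_samples across datasets in proportion to their weights.

    Hypothetical helper for illustration only; fractional counts are
    truncated, which is why [1, 2] over 100 samples gives 33 and 66.
    """
    norm = sum(weights)
    return [int(total_samples * w / norm) for w in weights]


def size_proportional_weights(sizes):
    """Weights that match each dataset's share of the combined data."""
    norm = sum(sizes)
    return [s / norm for s in sizes]


# The situation described above: weights [1, 2], 100 samples drawn in total.
print(expected_counts([1, 2], 100))

# Assumed workaround: weights proportional to dataset size, with the total
# draw equal to the combined size, so every sample can be used once.
sizes = [100, 100]
print(expected_counts(size_proportional_weights(sizes), sum(sizes)))
```

If the real sampler behaves like this sketch, setting the weights proportional to the dataset sizes and training for enough samples to cover the combined size would give 100 from A and 100 from B, but I have not confirmed that this is how the ratio is applied.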
Describe the solution you'd like
A flag, or instructions on which code needs to be changed.
Describe alternatives you've considered
Overwrite the ratio in helper.cpp.
Additional context
I have a very large dataset, which I have to partition into multiple smaller ones in order to convert it to mmap files. I want to train the model on all of the data.