The `load_chunk_data` method is aggressively consuming huge amounts of RAM when concatenating np arrays.
I am currently trying to implement something that will reduce the RAM consumption.
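One option along these lines is to stream the chunks into a disk-backed memmap so the full array never has to live in RAM at once. A rough sketch only, assuming the chunks are stored as `.npy` files (the repo's actual `load_chunk_data` may read a different format):

```python
import numpy as np

def concat_to_memmap(chunk_files, out_path, axis=1):
    """Concatenate .npy chunks into a disk-backed array so the full
    result (e.g. (7, 1920000, 200)) never has to fit in RAM at once."""
    # mmap_mode='r' exposes shape/dtype without loading the data
    arrays = [np.load(f, mmap_mode='r') for f in chunk_files]
    shape = list(arrays[0].shape)
    shape[axis] = sum(a.shape[axis] for a in arrays)
    out = np.lib.format.open_memmap(
        out_path, mode='w+', dtype=arrays[0].dtype, shape=tuple(shape))
    offset = 0
    for a in arrays:
        sl = [slice(None)] * out.ndim
        sl[axis] = slice(offset, offset + a.shape[axis])
        out[tuple(sl)] = a  # copies chunk-by-chunk, disk to disk
        offset += a.shape[axis]
    out.flush()
    return out
```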
@karnwatcharasupat @thomeou I am happy to open a PR when I am done, if that's acceptable to you.
PS: I noticed that the previous method never worked, and I apologize for not properly testing it; I am trying something new now.
@karnwatcharasupat The splitting idea didn't work, even after I fixed it to actually concatenate the chunks, because in the end I am still concatenating np arrays that will eventually reach the shape (7, 1920000, 200), which is unmanageable anyway. I had an idea to not concatenate them at all, but to export them to the `db_data` in the `get_split` method, like this for example:
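(A minimal sketch of the export, with a hypothetical `chunk_files` list standing in for however the chunks are produced; `db_data` is assumed to be a plain dict here:)

```python
import numpy as np

def get_split_chunked(chunk_files):
    """Sketch: put the four chunks into db_data as separate keys
    instead of concatenating them into one giant array."""
    # mmap_mode='r' keeps each chunk on disk until it is actually read
    chunks = [np.load(f, mmap_mode='r') for f in chunk_files]
    db_data = {
        'features': chunks[0],
        'features_2': chunks[1],
        'features_3': chunks[2],
        'features_4': chunks[3],
    }
    return db_data
```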
where `features`, `features_2`, `features_3`, and `features_4` are just `features` split into 4 chunks, and then to adjust the use of `features` throughout the project to consume the chunks sequentially. I have already developed such a method to export 4 arrays, but I am still exploring the code to better understand it before changing how it works. Currently, I can see that the `get_split` method is called when training in the `datamodule.py` file, specifically in … and in …
The call from the `train_db` variable is currently my problem. If you have an idea how to add the chunks part to the code, please let me know.
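For reference, this is roughly how I picture the sequential consumption on the reading side — a sketch only, assuming `db_data` is a dict and the four arrays are split along axis 1 (the actual dataset code in `datamodule.py` will look different):

```python
import numpy as np

class ChunkedFeatures:
    """Present the four chunk arrays as one sequential array-like
    without ever concatenating them."""
    KEYS = ('features', 'features_2', 'features_3', 'features_4')

    def __init__(self, db_data, axis=1):
        self.parts = [db_data[k] for k in self.KEYS]
        self.axis = axis
        # cumulative lengths along the split axis, for index translation
        self.bounds = np.cumsum([p.shape[axis] for p in self.parts])

    def __len__(self):
        return int(self.bounds[-1])

    def take(self, i):
        """Return slice i along the split axis from the chunk that owns it."""
        part = int(np.searchsorted(self.bounds, i, side='right'))
        local = int(i - (self.bounds[part - 1] if part > 0 else 0))
        return np.take(self.parts[part], local, axis=self.axis)
```

With something like this, whatever currently indexes the big concatenated array could call `take(i)` instead, and only the chunk containing index `i` is ever touched.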