Trading With RL Pickle Error - TypeError: can't pickle _thread.RLock objects #8

Open
windowshopr opened this issue Nov 20, 2020 · 5 comments

@windowshopr

Love the work you've posted! Super thorough.

I'm trying to run the Trading with RL notebook in Google Colab but I'm running into an issue. I'll document my steps here for reproducibility.

Basically, I copied, cell for cell, the notebook from this repository into a Google Colab notebook.

First, backtrader has to be installed in the environment. I did this by putting the following at the very beginning of the imports:

try:
    import backtrader as bt
except ImportError:
    print('Backtrader not installed yet. Installing now...')
    !pip install backtrader
    print('Backtrader installed.')
    print('Restart and Run All now.')
    exit()

This way, it'll prompt the user to Restart and Run All once Backtrader is installed.

The second issue I figured out was that the notebook needs to be run with eager execution disabled, so I added:

from tensorflow.compat import v1
v1.disable_eager_execution()

...to the imports at the top as well.

Now my issue is this error:

TypeError                                 Traceback (most recent call last)
<ipython-input-21-e8262df5188b> in <module>()
     38 
     39     if e and (e+1) % agent.save_interval == 0:
---> 40         agent.save()
     41 
     42 elapsed_time = time.time() - start_time

<ipython-input-20-21179c258b19> in save(self)
    149         self.predict_model.save("%s_predict.h5" % fullname)
    150         # can't save / load train model due to custom loss
--> 151         pickle.dump(self, open("%s.p" % fullname, "wb"))
    152 
    153     def load(filename, memory=True):

TypeError: can't pickle _thread.RLock objects

This happens when I run the cell that begins with the following (i.e. the cell after the class REINFORCE_Agent(Agent): cell):

N_EPISODES = 2000
ticks_per_episode = 1256
nstocks = 1
lag = 1

Googling that error, I found this answer that might help, but I'm hoping to get some help troubleshooting it here. I'd greatly appreciate it, as I can't wait to get this working online.
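If it helps, my rough understanding is that the agent object holds references to the compiled Keras models (and their TensorFlow session and locks), which is what pickle chokes on. A totally untested sketch of a workaround would be to drop those attributes from the pickled state by adding something like this to the agent class (predict_model comes from the traceback above; train_model and any other unpicklable attributes are guesses):

def __getstate__(self):
    # Copy the agent's attributes, but drop the Keras models, which hold
    # thread locks (_thread.RLock) that pickle can't serialize.
    state = self.__dict__.copy()
    for key in ('predict_model', 'train_model'):  # attribute names are guesses
        state.pop(key, None)
    return state

The weights would still be saved via the existing predict_model.save(...) call, and load() would presumably rebuild the models from the .h5 file.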

Thanks!

@druce
Owner

druce commented Nov 21, 2020 via email

@windowshopr
Author

Ah, you're right, I don't see any instance of backtrader actually being used in the script. I've commented out that install section for now.

Oh, and I also added the script below in a cell just under the imports:

    # If model save directory isn't made yet, make it
    if not os.path.exists('model_output'):
        os.makedirs('model_output')
    if not os.path.exists('model_output/trading'):
        os.makedirs('model_output/trading')

... as the folders don't get created automatically in Colab, so this just makes them for you if they don't already exist.

I commented out the pickle.dump() portion of the agent.save() function and it seems to be running now, saving some kreinforceXXXX_predict.h5 files. Was the pickling just a backup to the regular save method? It would be awesome to make sure the models are getting saved properly. Will advise if I run into anything else in the meantime. :D
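As a quick sanity check on the .h5 files (just a sketch; the path below reuses the placeholder pattern above, with XXXX standing in for the episode number), reloading the predict model should work even with the pickle step disabled:

from tensorflow.keras.models import load_model

# Reload one of the saved predict networks; compile=False sidesteps any
# custom loss, since the model is only needed for inference here.
predict_model = load_model('model_output/trading/kreinforceXXXX_predict.h5', compile=False)
predict_model.summary()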

@windowshopr
Author

@druce I also noticed that using multiple nstocks does not seem to work in Colab either. I don't know if this functionality was just left out when the notebook was created, but it would be cool to get that working too. When I try to change the nstocks variable everywhere from 1 to 2, I get this error:

ValueError                                Traceback (most recent call last)
<ipython-input-24-a5939e1fbe15> in <module>()
     34     if not os.path.exists('model_output/trading'):
     35         os.makedirs('model_output/trading')
---> 36     agent.run_episode()
     37     agent.score_episode(e, N_EPISODES)
     38 

1 frames
<ipython-input-22-b8da4b29deae> in run_episode(self, render)
     67                 env.render()
     68             self.action = self.act(self.state.reshape([1, self.state_size]))
---> 69             self.next_state, self.reward, self.done, _ = env.step(self.action)
     70             self.total_reward += self.reward
     71 

<ipython-input-20-0b69e1f91c9e> in step(self, action)
     45         # map actions 0 1 2 to positions -1, 0, 1
     46         position = action - 1
---> 47         reward = position @ stock_delta
     48         self.total_reward += reward
     49         self.t += 1

ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 2 is different from 1)

I'm not sure exactly what this error is trying to tell me, but it seems to point at the position @ stock_delta step. My guess from the message is that the position (coming from a single action) still has length 1 while stock_delta now has one entry per stock, i.e. length 2, so the matrix product shapes don't line up; I've put a rough sketch of what I mean below the quoted passage. I'm just now getting into training a model to take multiple actions in one step, given a state, so it would be cool to work this one out so I can apply it elsewhere as well. I read the write-up on your site and it mentioned:

The Ritter paper applies reinforcement learning to a multiple-stock portfolio. This is fairly straightforward from here by changing the input to be the states from multiple stocks, adding multiple outputs for multiple stocks, and computing the reward as a portfolio return. The Ritter paper also uses Sharpe ratio as the reward, and finds that the algorithm successfully optimizes it, which is a very nice result. The model empirically maximized risk-reward without knowing anything about the stock market dynamics, a covariance matrix, normality, or even how the reward is computed.

The input to where? The environment? Same for output?
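To make the question concrete, here is the rough, untested sketch I had in mind for the multi-stock step reward (the names follow the cells above, but the shapes and the per-stock action encoding are my guesses):

import numpy as np

# With a single action the position has length 1, but stock_delta now has one
# price change per stock (length 2), so position @ stock_delta raises the
# "size 2 is different from 1" matmul error above.

# What I think is needed instead: one action (and one position) per stock.
actions = np.array([2, 0])            # one action in {0, 1, 2} per stock
positions = actions - 1               # map 0/1/2 -> -1/0/+1 per stock
stock_delta = np.array([0.5, -1.2])   # price change per stock at this step

reward = positions @ stock_delta      # portfolio return: sum of position * delta
print(reward)                         # 1*0.5 + (-1)*(-1.2) = 1.7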

Thanks! Love working with the notebook!

@druce
Owner

druce commented Dec 9, 2020 via email

@windowshopr
Author

Best answer I've received in a long time, lol. Thank you for taking the time; it makes a lot of sense now.

I am interested in that portfolio optimization problem with correlated prices. I read the Ritter paper last night and it seems interesting. I have a modified Colab notebook that takes your primary code, trims out everything not related to the OU part, and also restricts it to one long position at a time, just to see how it performs. It still finds profit, which is nice, but my main worry is the idea of mean reversion itself: buying when the price is below its long-term mean is risky, since prices can easily break out of mean reversion and keep moving down. I've also added the Hurst exponent to that notebook (a quick sketch of the estimate I'm using is below), so I may play around with incorporating it into the correlation stuff you mentioned, such that it only trades the mean reversion when the series starts showing a lower Hurst exponent, hopefully capturing the short-term mean reversion. Maybe?
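For reference, the Hurst estimate I'm using is basically the standard lagged-differences version (a sketch; it expects prices as a 1-D numpy array, and max_lag is an arbitrary choice):

import numpy as np

def hurst_exponent(prices, max_lag=20):
    # The std of lagged differences scales roughly like lag**H, so H is the
    # slope of log(std) vs log(lag); H < 0.5 hints at mean reversion.
    lags = np.arange(2, max_lag)
    tau = [np.std(prices[lag:] - prices[:-lag]) for lag in lags]
    return np.polyfit(np.log(lags), np.log(tau), 1)[0]

On a plain random walk, e.g. hurst_exponent(np.cumsum(np.random.randn(2000))), this comes out around 0.5, which is a handy sanity check.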

I'll also check out your other repository next weekend and see what I can come up with. Thanks a lot for the insight, and I'll let you know if I come up with anything!
