Approximate dynamic programming using Bellman residual elimination and Gaussian process regression
Bethke et al., 2009
- Document ID
- 2934555757130940576
- Author
- Bethke B
- How J
- Publication year
- 2009
- Publication venue
- 2009 American Control Conference
Snippet
This paper presents an approximate policy iteration algorithm for solving infinite-horizon, discounted Markov decision processes (MDPs) for which a model of the system is available. The algorithm is similar in spirit to Bellman residual minimization methods. However, by …
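The snippet is truncated, so the paper's own method (Bellman residual elimination with Gaussian process regression) is not reproduced here. As a rough illustration of the terms it uses, the following minimal sketch runs plain tabular policy iteration on a known model and uses the Bellman residual of a fixed policy as the stopping test for policy evaluation. The two-state MDP, the names `P`, `R`, `GAMMA`, and all helper functions are hypothetical and chosen only for this example.

```python
# Toy illustration only: a hypothetical 2-state, 2-action MDP (not from the paper).
GAMMA = 0.9  # discount factor of the infinite-horizon problem

# Known model: P[s][a] lists (next_state, probability); R[s][a] is the reward.
P = {
    0: {0: [(0, 1.0)], 1: [(1, 1.0)]},
    1: {0: [(0, 1.0)], 1: [(1, 1.0)]},
}
R = {
    0: {0: 0.0, 1: 1.0},
    1: {0: 0.0, 1: 2.0},
}


def backup(values, s, a):
    """One-step Bellman backup of `values` for taking action a in state s."""
    return R[s][a] + GAMMA * sum(p * values[s2] for s2, p in P[s][a])


def bellman_residual(values, policy):
    """Largest absolute Bellman residual of `values` under a fixed policy."""
    return max(abs(backup(values, s, policy[s]) - values[s]) for s in P)


def evaluate(policy, tol=1e-10):
    """Policy evaluation: repeat the backup until the residual is below tol."""
    values = {s: 0.0 for s in P}
    while bellman_residual(values, policy) > tol:
        values = {s: backup(values, s, policy[s]) for s in P}
    return values


def policy_iteration():
    """Alternate policy evaluation and greedy improvement until stable."""
    policy = {s: 0 for s in P}
    while True:
        values = evaluate(policy)
        greedy = {s: max(P[s], key=lambda a: backup(values, s, a)) for s in P}
        if greedy == policy:
            return policy, values
        policy = greedy


if __name__ == "__main__":
    policy, values = policy_iteration()
    print(policy, values)
```

Here the residual is driven toward zero at every state by exact iteration, which is only feasible because the toy state space is tiny. Approximate methods instead work with a limited set of states and a function approximator.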
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/12—Computer systems based on biological models using genetic models
- G06N3/126—Genetic algorithms, i.e. information processing using digital simulations of the genetic system
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/04—Architectures, e.g. interconnection topology
- G06N3/0454—Architectures, e.g. interconnection topology using a combination of multiple neural nets
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computer systems based on specific mathematical models
- G06N7/02—Computer systems based on specific mathematical models using fuzzy logic
- G06N7/023—Learning or tuning the parameters of a fuzzy system
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/02—Knowledge representation
- G06N5/022—Knowledge engineering, knowledge acquisition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computer systems based on specific mathematical models
- G06N7/005—Probabilistic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/04—Inference methods or devices
Similar Documents
Publication | Title |
---|---|
Su et al. | Sample-efficient actor-critic reinforcement learning with supervised data for dialogue management |
Moerland et al. | A0c: Alpha zero in continuous action space |
Parisi et al. | Multi-objective reinforcement learning through continuous pareto manifold approximation |
Li et al. | Periodogram estimation based on LSSVR-CCPSO compensation for forecasting ship motion |
Song et al. | Air quality prediction based on LSTM-Kalman model |
Bouin et al. | Travelling waves for the cane toads equation with bounded traits |
JPH04160463A (en) | Optimizing method by neural network |
CN110286586A (en) | A kind of MR damper hybrid modeling method |
Wang et al. | Dynamic representation of fuzzy knowledge based on fuzzy petri net and genetic-particle swarm optimization |
CN116933948A (en) | Prediction method and system based on improved seagull algorithm and back propagation neural network |
Bethke et al. | Approximate dynamic programming using Bellman residual elimination and Gaussian process regression |
Chen | Learning symbolic expressions via gumbel-max equation learner networks |
Rahman et al. | Implementation of artificial neural network on regression analysis |
Cheng et al. | Robust Actor-Critic With Relative Entropy Regulating Actor |
Shemyakin et al. | Online identification of large-scale chaotic system |
Wu et al. | Intelligent forecasting system based on integration of electromagnetism-like mechanism and fuzzy neural network |
Su et al. | A combined model based on secondary decomposition technique and grey wolf optimizer for short-term wind power forecasting |
Mottaghi-Kashtiban et al. | Optimization of rational-powered membership functions using extended Kalman filter |
CN115345303A (en) | Convolutional neural network weight tuning method, device, storage medium and electronic equipment |
Xu et al. | Adaptive dynamic programming for optimal control of discrete-time nonlinear systems with trajectory-based initial control policy |
Zhu | Generative Adversarial Network and Score-Based Generative Model Comparison |
Zhang et al. | Aoam: Automatic optimization of adjacency matrix for graph convolutional network |
Leoni et al. | Explainable data-driven modeling via mixture of experts: towards effective blending of grey and black-box models |
Yue et al. | Parameter estimation for Choquet fuzzy integral based on Takagi–Sugeno fuzzy model |
Hosino | Variational Bayesian parameter-based policy exploration |