Bethke et al., 2009 - Google Patents

Approximate dynamic programming using Bellman residual elimination and Gaussian process regression

Document ID: 2934555757130940576
Authors: Bethke B; How J
Publication year: 2009
Publication venue: 2009 American Control Conference

Snippet

This paper presents an approximate policy iteration algorithm for solving infinite-horizon, discounted Markov decision processes (MDPs) for which a model of the system is available. The algorithm is similar in spirit to Bellman residual minimization methods. However, by …

Full text: dspace.mit.edu (PDF)

Classifications

- G—PHYSICS
  - G06—COMPUTING; CALCULATING; COUNTING
    - G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
      - G06N3/00—Computer systems based on biological models
        - G06N3/02—Computer systems based on biological models using neural network models
          - G06N3/04—Architectures, e.g. interconnection topology
            - G06N3/0454—Architectures, e.g. interconnection topology using a combination of multiple neural nets
          - G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
          - G06N3/08—Learning methods
        - G06N3/12—Computer systems based on biological models using genetic models
          - G06N3/126—Genetic algorithms, i.e. information processing using digital simulations of the genetic system
      - G06N5/00—Computer systems utilising knowledge based models
        - G06N5/02—Knowledge representation
          - G06N5/022—Knowledge engineering, knowledge acquisition
        - G06N5/04—Inference methods or devices
      - G06N7/00—Computer systems based on specific mathematical models
        - G06N7/005—Probabilistic networks
        - G06N7/02—Computer systems based on specific mathematical models using fuzzy logic
          - G06N7/023—Learning or tuning the parameters of a fuzzy system
      - G06N99/00—Subject matter not provided for in other groups of this subclass
        - G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
