Bethke et al., 2009

Approximate dynamic programming using Bellman residual elimination and Gaussian process regression

Document ID
2934555757130940576
Author
Bethke B
How J
Publication year
2009
Publication venue
2009 American Control Conference

Snippet

This paper presents an approximate policy iteration algorithm for solving infinite-horizon, discounted Markov decision processes (MDPs) for which a model of the system is available. The algorithm is similar in spirit to Bellman residual minimization methods. However, by …
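
To make the terms in the snippet concrete, the following is a minimal sketch of exact (not approximate) policy iteration on a tiny discounted MDP with a known model, together with the Bellman residual of a value estimate. It is not the paper's algorithm: the Gaussian process regression step is not shown, and the two-state MDP, the reward table `R`, the transition table `P`, the discount `GAMMA`, and all function names are hypothetical choices made only for illustration.

```python
GAMMA = 0.9  # discount factor (assumed value)

# Hypothetical model. P[a][s][t] is the probability of moving from
# state s to state t under action a; R[s][a] is the immediate reward.
P = [
    [[0.9, 0.1], [0.2, 0.8]],  # action 0
    [[0.1, 0.9], [0.7, 0.3]],  # action 1
]
R = [[1.0, 0.0], [0.0, 2.0]]


def backup(J, s, a):
    """One-step Bellman backup of value estimate J at state s, action a."""
    return R[s][a] + GAMMA * sum(P[a][s][t] * J[t] for t in range(len(J)))


def bellman_residual(J, policy):
    """Largest absolute gap between J and its backup under the policy."""
    return max(abs(J[s] - backup(J, s, policy[s])) for s in range(len(J)))


def evaluate(policy, tol=1e-10):
    """Policy evaluation by repeated backups until the residual is below tol."""
    J = [0.0] * len(policy)
    while bellman_residual(J, policy) > tol:
        J = [backup(J, s, policy[s]) for s in range(len(J))]
    return J


def improve(J):
    """Greedy policy improvement with respect to J."""
    actions = range(len(P))
    return [max(actions, key=lambda a: backup(J, s, a)) for s in range(len(J))]


def policy_iteration():
    """Alternate evaluation and improvement until the policy stops changing."""
    policy = [0] * len(R)
    while True:
        J = evaluate(policy)
        new_policy = improve(J)
        if new_policy == policy:
            return policy, J
        policy = new_policy


if __name__ == "__main__":
    policy, J = policy_iteration()
    print(policy, J, bellman_residual(J, policy))
```

With only two states the value function can be stored exactly, so the residual can be driven to numerical zero at every state. Approximate methods of the kind the snippet describes are aimed at state spaces that are too large for this.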

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computer systems based on biological models
    • G06N3/02 Computer systems based on biological models using neural network models
    • G06N3/04 Architectures, e.g. interconnection topology
    • G06N3/0454 Architectures, e.g. interconnection topology using a combination of multiple neural nets
    • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/08 Learning methods
    • G06N3/12 Computer systems based on biological models using genetic models
    • G06N3/126 Genetic algorithms, i.e. information processing using digital simulations of the genetic system
    • G06N5/00 Computer systems utilising knowledge based models
    • G06N5/02 Knowledge representation
    • G06N5/022 Knowledge engineering, knowledge acquisition
    • G06N5/04 Inference methods or devices
    • G06N7/00 Computer systems based on specific mathematical models
    • G06N7/005 Probabilistic networks
    • G06N7/02 Computer systems based on specific mathematical models using fuzzy logic
    • G06N7/023 Learning or tuning the parameters of a fuzzy system
    • G06N99/00 Subject matter not provided for in other groups of this subclass
    • G06N99/005 Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
