- Surveys
- Books
- Datasets
- Programming Frameworks
- Learning to Compute
- Natural Language Processing
- Convolutional Neural Networks
- Recurrent Neural Networks
- Convolutional Recurrent Neural Networks
- Autoencoders
- Restricted Boltzmann Machines
- Biologically Plausible Learning
- Supervised Learning
- Unsupervised Learning
- Reinforcement Learning
- Theory
- Quantum Computing
- Training Innovations
- Numerical Optimization
- Motion Planning
- Numerical Precision
- Hardware
- Cognitive Architectures
- Computational Creativity
- Cryptography
- Distributed Computing
- Clustering
- Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks
- The Ubuntu Dialogue Corpus: A Large Dataset for Research in Unstructured Multi-Turn Dialogue Systems
- Caffe: Convolutional Architecture for Fast Feature Embedding
- Theano: A CPU and GPU Math Compiler in Python
- Theano: new features and speed improvements
- Blocks and Fuel: Frameworks for deep learning
- [Announcing Computation Graph Toolkit](<https://joschu.github.io/index.html#Announcing CGT> "John Schulman")
- Torch7: A Matlab-like Environment for Machine Learning
- cuDNN: Efficient Primitives for Deep Learning
- Fast Convolutional Nets With fbfft: A GPU Performance Evaluation
- Probabilistic Programming in Python using PyMC
- Neural Turing Machines
- Memory Networks
- Learning to Transduce with Unbounded Memory
- Inferring Algorithmic Patterns with Stack-Augmented Recurrent Nets
- Pointer Networks
- Learning to Execute
- Grammar as a Foreign Language
- Deep Learning, NLP, and Representations
- Language Models for Image Captioning: The Quirks and What Works
- Zero-Shot Learning Through Cross-Modal Transfer
- Natural Language Processing (almost) from Scratch
- Efficient Estimation of Word Representations in Vector Space
- GloVe: Global Vectors for Word Representation
- Learning to Understand Phrases by Embedding the Dictionary
- Inverted indexing for cross-lingual NLP
- Random walks on discourse spaces: a new generative language model with applications to semantic word embeddings
- Breaking Sticks and Ambiguities with Adaptive Skip-gram
- Language Recognition using Random Indexing
- Distributed Representations of Sentences and Documents
- A Fixed-Size Encoding Method for Variable-Length Sequences with its Application to Neural Network Language Models
- Skip-Thought Vectors
- Finding Function in Form: Compositional Character Models for Open Vocabulary Word Representation
- Character-Aware Neural Language Models
- Modeling Order in Neural Word Embeddings at Scale
- Improved Transition-Based Parsing by Modeling Characters instead of Words with LSTMs
- Sequence to Sequence Learning with Neural Networks
- Neural Machine Translation by Jointly Learning to Align and Translate
- Ask Me Anything: Dynamic Memory Networks for Natural Language Processing
- Neural Transformation Machine: A New Architecture for Sequence-to-Sequence Learning
- Teaching Machines to Read and Comprehend
- Investigation of Recurrent-Neural-Network Architectures and Learning Methods for Spoken Language Understanding
- Language Understanding for Text-based Games Using Deep Reinforcement Learning
- Large-scale Simple Question Answering with Memory Networks
- Deep Learning for Answer Sentence Selection
- Neural Responding Machine for Short-Text Conversation
- A Neural Conversational Model
- VQA: Visual Question Answering
- Question Answering with Subgraph Embeddings
- Hierarchical Neural Network Generative Models for Movie Dialogues
- A Convolutional Neural Network for Modelling Sentences
- Convolutional Neural Networks for Sentence Classification
- Text Understanding from Scratch
- DeepWriterID: An End-to-end Online Text-independent Writer Identification System
- Encoding Source Language with Convolutional Neural Network for Machine Translation
- Long Short-Term Memory Over Tree Structures
- Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks
- Spatial Transformer Networks
- Striving for Simplicity: The All Convolutional Net
- Very Deep Convolutional Networks for Large-Scale Image Recognition
- Network In Network
- Going Deeper with Convolutions
- Learning to Generate Chairs with Convolutional Neural Networks
- Deep Convolutional Inverse Graphics Network
- Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks
- Long-term Recurrent Convolutional Networks for Visual Recognition and Description
- A Machine Learning Approach for Filtering Monte Carlo Noise
- Image Super-Resolution Using Deep Convolutional Networks
- Learning to Deblur
- Monocular Object Instance Segmentation and Depth Ordering with CNNs
- FlowNet: Learning Optical Flow with Convolutional Networks
- DeepStereo: Learning to Predict New Views from the World's Imagery
- Deep convolutional filter banks for texture recognition and segmentation
- FaceNet: A Unified Embedding for Face Recognition and Clustering
- DeepFace: Closing the Gap to Human-Level Performance in Face Verification
- Deep Karaoke: Extracting Vocals from Musical Mixtures Using a Convolutional Deep Neural Network
- 3D ConvNets with Optical Flow Based Regularization
- DeepPose: Human Pose Estimation via Deep Neural Networks
- Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification
- Rotation-invariant convolutional neural networks for galaxy morphology prediction
- Deep Fried Convnets
- Fractional Max-Pooling
- Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps
- Invariant backpropagation: how to train a transformation-invariant neural network
- Recommending music on Spotify with deep learning
- Conv Nets: A Modular Perspective
- Training recurrent networks online without backtracking
- Modeling sequential data using higher-order relational features and predictive training
- Recurrent Neural Network Regularization
- Long Short-Term Memory (ftp://ftp.idsia.ch/pub/juergen/lstm.pdf)
- Learning Longer Memory in Recurrent Neural Networks
- A Simple Way to Initialize Recurrent Networks of Rectified Linear Units
- A Clockwork RNN
- DRAW: A Recurrent Neural Network For Image Generation
- Gated Feedback Recurrent Neural Networks
- A Recurrent Latent Variable Model for Sequential Data
- ReNet: A Recurrent Neural Network Based Alternative to Convolutional Networks
- Translating Videos to Natural Language Using Deep Recurrent Neural Networks
- Unsupervised Learning of Video Representations using LSTMs
- Visualizing and Understanding Recurrent Networks
- Advances in Optimizing Recurrent Networks
- Learning Stochastic Recurrent Networks
- Understanding LSTM Networks
- Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting
- Describing Multimedia Content using Attention-based Encoder-Decoder Networks
- Auto-Encoding Variational Bayes
- Analyzing noise in autoencoders and deep networks
- MADE: Masked Autoencoder for Distribution Estimation
- k-Sparse Autoencoders
- Zero-bias autoencoders and the benefits of co-adapting features
- Importance Weighted Autoencoders
- Generalized Denoising Auto-Encoders as Generative Models
- Marginalized Denoising Auto-encoders for Nonlinear Representations
- Real-time Hebbian Learning from Autoencoder Features for Control Tasks
- Is Joint Training Better for Deep Auto-Encoders?
- Towards universal neural nets: Gibbs machines and ACE
- Transforming Auto-encoders
- Discovering Hidden Factors of Variation in Deep Networks
- The wake-sleep algorithm for unsupervised neural networks
- An Infinite Restricted Boltzmann Machine
- Quantum Deep Learning
- Quantum Inspired Training for Boltzmann Machines
- How Auto-Encoders Could Provide Credit Assignment in Deep Networks via Target Propagation
- Random feedback weights support learning in deep neural networks
- Fast Label Embeddings via Randomized Linear Algebra
- Locally Non-linear Embeddings for Extreme Multi-label Learning
- Index-learning of unsupervised low dimensional embedding
- An Analysis of Unsupervised Pre-training in Light of Recent Advances
- Is Joint Training Better for Deep Auto-Encoders?
- Semi-Supervised Learning with Ladder Network
- Semi-Supervised Learning with Deep Generative Models
- Rectified Factor Networks
- An Analysis of Single-Layer Networks in Unsupervised Feature Learning
- Deep Unsupervised Learning using Nonequilibrium Thermodynamics
- Human-level control through deep reinforcement learning
- Playing Atari with Deep Reinforcement Learning
- Universal Value Function Approximators
- On the Number of Linear Regions of Deep Neural Networks
- On the saddle point problem for non-convex optimization
- The Loss Surfaces of Multilayer Networks
- Qualitatively characterizing neural network optimization problems
- An exact mapping between the Variational Renormalization Group and Deep Learning
- Why does Deep Learning work? - A perspective from Group Theory
- Exact solutions to the nonlinear dynamics of learning in deep linear neural networks
- On the Stability of Deep Networks
- Over-Sampling in a Deep Neural Network
- A theoretical argument for complex-valued convolutional networks
- A Probabilistic Theory of Deep Learning
- Deep Convolutional Networks on Graph-Structured Data
- Calculus on Computational Graphs: Backpropagation
- Understanding Convolutions
- Groups & Group Convolutions
- Neural Networks, Manifolds, and Topology
- Neural Networks, Types, and Functional Programming
- Causal Entropic Forces
- Physics, Topology, Logic and Computation: A Rosetta Stone
- Analyzing Big Data with Dynamic Quantum Clustering
- Quantum algorithms for supervised and unsupervised machine learning
- Entanglement-Based Machine Learning on a Quantum Computer
- A quantum speedup in machine learning: Finding a N-bit Boolean function for a classification
- The Effects of Hyperparameters on SGD Training of Neural Networks
- Empirical Evaluation of Rectified Activations in Convolutional Network
- Gradient-based Hyperparameter Optimization through Reversible Learning
- Scale-invariant learning and convolutional networks
- No Regret Bound for Extreme Bandits
- Accelerating Stochastic Gradient Descent via Online Learning to Sample
- Deeply-Supervised Nets
- Weight Uncertainty in Neural Networks
- Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
- Highway Networks
- Improving neural networks by preventing co-adaptation of feature detectors
- Maxout Networks
- Regularization of Neural Networks using DropConnect
- Distilling the Knowledge in a Neural Network
- Random Walk Initialization for Training Very Deep Feedforward Networks
- Domain-Adversarial Neural Networks
- Compressing Neural Networks with the Hashing Trick
- Probabilistic Backpropagation for Scalable Learning of Bayesian Neural Networks
- Recursive Decomposition for Nonconvex Optimization
- Beating the Perils of Non-Convexity: Guaranteed Training of Neural Networks using Tensor Methods
- Graphical Newton
- Gradient Estimation Using Stochastic Computation Graphs
- Equilibrated adaptive learning rates for non-convex optimization
- Path-SGD: Path-Normalized Optimization in Deep Neural Networks
- Deep learning via Hessian-free optimization
- On the importance of initialization and momentum in deep learning
- Adaptive Subgradient Methods for Online Learning and Stochastic Optimization
- ADADELTA: An Adaptive Learning Rate Method
- ADASECANT: Robust Adaptive Secant Method for Stochastic Gradient
- Adam: A Method for Stochastic Optimization
- A sufficient and necessary condition for global optimization
- Unit Tests for Stochastic Optimization
- A* Sampling
- Solving Random Quadratic Systems of Equations Is Nearly as Easy as Solving Linear Systems
- Automatic differentiation in machine learning: a survey
- Continuous Character Control with Low-Dimensional Embeddings
- End-to-End Training of Deep Visuomotor Policies (youtu.be/Q4bMcUk6pcw)
- Sampling-based Algorithms for Optimal Motion Planning (youtu.be/r34XWEZ41HA)
- Planning biped locomotion using motion capture data and probabilistic roadmaps (youtu.be/cKrcjrdnD-M)
- Deep Learning with Limited Numerical Precision
- Low precision storage for deep learning
- 1-Bit Stochastic Gradient Descent and Application to Data-Parallel Distributed Training of Speech DNNs
- Training and operation of an integrated neuromorphic network based on metal-oxide memristors
- AHaH Computing–From Metastable Switches to Attractors to Machine Learning
- Derivation of a novel efficient supervised learning algorithm from cortical-subcortical loops
- A Minimal Architecture for General Cognition
- Inceptionism: Going Deeper into Neural Networks
- A Neural Algorithm of Artistic Style
- The Unreasonable Effectiveness of Recurrent Neural Networks
- GRUV: Algorithmic Music Generation using Recurrent Neural Networks
- Composing Music With Recurrent Neural Networks