Reactive Probabilistic Programming (arxiv.org)
110 points by matt_d on July 20, 2020 | 24 comments

From my robotics point of view, probabilistic programming looks really attractive, and this paper seems to give very interesting/neat examples for control and Kalman filtering.

I wonder what has prevented these languages from being widely adopted by the robotics community? My suspicion is that it's always easier to work with general purpose languages, but even then "probabilistic programming libraries" for Python, e.g. Pyro or Edward, haven't really taken off either... Most people write SLAM algorithms in C++ and don't pay much attention to what the PPL people are doing.

I think we're still on the path to making (efficient) inference work for broader classes of problems (expressive problem formulation). One of the most interesting recent projects I've come across is "Gen" by the probcomp group at MIT (BTW, a lot of interesting work related to PPL seems to be happening in the Julia language).

I’m happy someone mentioned the community in Julia!

We’re always looking for interested people to join and try out some of the systems. For easy access to some of the active PP frameworks:

Turing.jl https://turing.ml/dev/

Gen.jl https://www.gen.dev/

Soss.jl https://github.com/cscherrer/Soss.jl

Jaynes.jl https://github.com/femtomc/Jaynes.jl

Why so many? The design space is just beginning to be well-explored! And the community has welcomed experimentation!

Gen.jl and Jaynes.jl are research projects aiming to push the boundary of what you can express inside a PP framework - inference here is sample-based, with incremental computation providing most of the optimization. Gen.jl is much more mature than the latter (which I develop) and has tons of great and thought-provoking documentation. Turing.jl is also very mature, and they have a great number of resources showing how to express a number of classic models in their language, as well as how to compose sampling-based inference algorithms. The Turing folks also implemented the de facto standard version of HMC in Julia: https://github.com/TuringLang/AdvancedHMC.jl. Soss.jl is also very cool - it works by AST rewriting and integrates functionality from SymPy in Python. I’m not sure about the state of the docs for Soss.jl - but the researchers for each of these systems are always willing to discuss the systems on the Julia Slack or Zulip!

The reason Julia is a good fit is it has both good numerics support (arrays, matrices, autodiff, samplers, etc) and good metaprogramming support. Probabilistic programming systems really need both to feel natural and well-integrated while still being able to have nice inference algorithms.

A PPS in an environment without those two things is at a disadvantage.

I am pretty new to all of this but I get the impression that Stan and PyMC3 are the leaders in this area and I don't see them as having great meta-programming support. Maybe I am wrong? Are they currently hitting limitations in that regard? Or is this in a particular area of PPL such as non-parametrics?

I think you’re right! I haven’t explored these systems enough to comment with certainty.

I think Stan and PyMC3 both focus on optimized implementations of Hamiltonian Monte Carlo - a Markov chain Monte Carlo algorithm which requires that you express models with target variables whose log probability densities are differentiable with respect to their sample spaces. I think Stan provides other algorithms as well, and possibly also PyMC3 but I know that these systems are known for their implementations of HMC (similar to NumPyro, which has a highly optimized version of HMC).
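To make the differentiability requirement concrete, here is a minimal sketch of a single-variable HMC sampler in plain Python. It is not how Stan, PyMC3 or NumPyro implement it; the target (a standard normal), the step size and the function names are all assumptions chosen for illustration. The point is that the leapfrog integrator calls the gradient of the log density at every step, which is why the model has to be differentiable.

```python
import math
import random

def log_density(x):
    # Unnormalized log density of the assumed target: a standard normal.
    return -0.5 * x * x

def grad_log_density(x):
    # Hand-written gradient; a PPL would normally get this from autodiff.
    return -x

def hmc_step(x, rng, step_size=0.2, n_leapfrog=10):
    p0 = rng.gauss(0.0, 1.0)  # resample the auxiliary momentum
    x_new, p = x, p0
    # Leapfrog integration: half momentum step, full steps, half momentum step.
    p += 0.5 * step_size * grad_log_density(x_new)
    for i in range(n_leapfrog):
        x_new += step_size * p
        if i < n_leapfrog - 1:
            p += step_size * grad_log_density(x_new)
    p += 0.5 * step_size * grad_log_density(x_new)
    # Metropolis correction based on the change in total energy.
    h_old = -log_density(x) + 0.5 * p0 * p0
    h_new = -log_density(x_new) + 0.5 * p * p
    accept_prob = math.exp(min(0.0, h_old - h_new))
    return x_new if rng.random() < accept_prob else x

def run_chain(n_samples, rng, x0=0.0):
    xs, x = [], x0
    for _ in range(n_samples):
        x = hmc_step(x, rng)
        xs.append(x)
    return xs

rng = random.Random(1)
samples = run_chain(5000, rng)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(round(mean, 2), round(var, 2))  # should land near 0 and 1
```

A discrete latent variable has no such gradient, which is one reason models with discrete choices need other inference algorithms or marginalization.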

This certainly does not cover all probabilistic programs - but many popular models do tend to fall in this category (where you can use HMC for inference). In both cases, I’m also unsure if these systems are “universal” in the sense that you can express any stochastic computable function which halts with probability 1. Similar to Turing completeness, if the system does not allow you to express control flow whose bounds are only known at runtime, or disallows stochastic bounds, it’s not universal. This is not typically a bad thing, because most classical models don’t require this feature, but it does tend to separate frameworks for PP.
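A small illustration of what "stochastic bounds" means, as a sketch in plain Python rather than in any particular PPL: the loop below has no fixed upper limit on its iteration count, because the bound is itself a random variable, yet it halts with probability 1. The function name and the parameter value are assumptions for the example.

```python
import random

def geometric(p, rng):
    """Count Bernoulli(p) failures before the first success.

    The number of loop iterations is random and unbounded, but the
    loop terminates with probability 1 whenever p > 0.
    """
    n = 0
    while rng.random() >= p:  # failure, with probability 1 - p
        n += 1
    return n

rng = random.Random(0)
draws = [geometric(0.25, rng) for _ in range(20000)]
mean = sum(draws) / len(draws)
print(mean)  # theoretical mean is (1 - p) / p = 3
```

A system that requires the number of random variables to be fixed before the program runs cannot express this model directly, since the number of random choices differs from run to run.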

As a shameless plug, my own library doesn’t rely on AST metaprogramming (e.g. macros) but instead relies upon dynamic compiler interception - which is metaprogramming of a different sort.

PS for another PP system which doesn’t rely on metaprogramming, you could explore Probabilistic C: http://proceedings.mlr.press/v32/paige14.pdf

I think systems built around effect handlers like Pyro and Edward2 form a nice corner in the design space by being less reliant on metaprogramming, being fairly composable, and giving nicer UIs.

The downside is an API that is trickier to write and debug on the inference developer's side. Do link your library as I try to evaluate every PPS I encounter.

I agree. Turing.jl, which is one of the major PPLs in Julia, is also based around effect handlers and does not rely on metaprogramming for the inference part. This allows the composition of inference algorithms and makes it easier to overload functions for specific behaviour.

It's in the earlier comment: Jaynes.jl

PyMC3 does a lot of AST manipulation. The more metaprogramming support a language has, the easier this is and the more at home it feels. The easier it is to inspect and modify the AST, the smaller the disconnect between the modeling language and the host language.

I don't know quite enough about the field, but it's possible they're the leaders (like Python) because they're just "good enough" and they were the best options widely available when they were adopted + took off?

One of the arguments for Python is the exceptional support for automatic differentiation and GPU computing through deep learning libraries. Most Python-based PPLs focus on static models with differentiable log joints, allowing the application of HMC or variational inference. Unfortunately, support for efficient automatic differentiation in Julia is still in its infancy. But I hope that with some more work by the community and the Turing team, this will change sooner rather than later.

I thought with libraries like Zygote there is some really nice stuff already in Julia. I'd say it's still early days for good autodiff libraries in general and I think we still haven't really explored what they can do.

The reality is that most of a modelling task is preprocessing your data before it can be passed to a probabilistic model and postprocessing to make decisions using it. The code is usually written in R or Python so there is a strong pressure for your library to be in that language as well.

And being rough around the edges is an ok price to pay for not losing an ecosystem.

My take is that most modern PPLs have language bindings in JavaScript/Python/R because they are explicitly courting analysts/data-scientists/applied-statisticians, or they are taking advantage of modern technical stacks implementing auto-diff and co-processor routines.

Most pre- and post- processing (with the exception of visualization) should probably be a part of your model!

Thanks for mentioning Turing.

In comparison to other PPLs in Julia, Turing is less a single probabilistic programming library and more a framework for probabilistic programming by providing a large collection of exchangeable libraries.

AdvancedHMC, which you mentioned, is simply one of the many projects for probabilistic machine learning that we actively develop. AdvancedHMC specifically is meant as a research platform for HMC algorithms and implements state-of-the-art algorithms for HMC-based inference. Other Turing libraries focus more on variational approximations of static models, Bayesian learning in neural networks, and universal probabilistic programming with support for dynamic models.

It's probably just me being dense, but I don't understand how much prob. prog. differs from “traditional” programming.

I just read the tutorials & examples for Turing and Gen, and to me, it seems that they are more or less nifty DSLs for expressing statistical models, but their functionalities could be easily replicated, albeit in an uglier way, with standard Julia.

Is there some deeper theoretical divergence I missed, or are they just strongly facilitating the expression of statistical models?

This basically applies to all programming? Reconsider this thought in the context of automatic differentiation, where a similar argument can be made. I doubt, however, that people would contest the utility of automatic differentiation systems.

I also disagree that their functionality could be easily replicated in standard Julia. What you see is the easiest way to provide this functionality in a model-agnostic way.

> This basically applies to all programming?

That's true. But if you had to describe the essence of probabilistic programming to someone used only to “classical” scientific computing, what would be the key point(s)?

BTW, I'm sorry if I came off as dismissive, I'm only frustrated not to grasp why it's considered to be such a step forward. AD I get it, all the differentials are automagically imputed directly from the source code, which is something that practically could not be done otherwise. But what I get from Turing or Gen are just nifty DSLs.

AD does automatic differentiation; PPLs transform a generative model into a form suitable for automatic Bayesian inference, e.g. by using AD and black-box variational inference. Put differently, in a PPL you specify the forward simulation of a generative process, and the PPL helps to automatically invert this process using black-box algorithms and suitable transformations.

Without a PPL, you would traditionally write your code for your model and would have to implement a suitable inference algorithm yourself. With a PPL you only specify the generative process and don't have to implement the inference side of things nor care about an implementation of your model that is suitable for inference.
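A rough sketch of that separation in plain Python, under some loud assumptions: the coin-flip model, the function names and the choice of self-normalized importance sampling are mine for illustration, and real PPLs use far better algorithms. What matters is that `posterior_mean` knows nothing about coins; it only needs a way to draw from the prior and a way to score the data.

```python
import random

# --- Model side: the only part a PPL user would write. ---
def prior(rng):
    return rng.random()  # coin bias theta ~ Uniform(0, 1)

def likelihood(theta, flips):
    prob = 1.0
    for heads in flips:
        prob *= theta if heads else (1.0 - theta)
    return prob

# --- Inference side: generic, reusable for any prior/likelihood pair. ---
def posterior_mean(prior, likelihood, data, rng, n=50000):
    """Self-normalized importance sampling with the prior as proposal."""
    weighted_sum = 0.0
    weight_total = 0.0
    for _ in range(n):
        theta = prior(rng)
        w = likelihood(theta, data)
        weighted_sum += w * theta
        weight_total += w
    return weighted_sum / weight_total

flips = [True] * 7 + [False] * 3  # 7 heads, 3 tails
rng = random.Random(0)
estimate = posterior_mean(prior, likelihood, flips, rng)
exact = (7 + 1) / (10 + 2)        # mean of the Beta(8, 4) posterior
print(estimate, exact)
```

Swapping in a different model means rewriting only the two model functions; the inference routine stays untouched, which is the division of labour a PPL automates.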

If you want something pithy: AD makes it easy to compute derivatives, PPLs make it easy to compute integrals. In particular, the kind that come from taking the expectation of a function with respect to some probability distribution.
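To spell out the "integrals" half with the simplest possible case: an expectation E[f(X)] is the integral of f(x) p(x) dx, and it can be approximated by averaging f over samples of X. The sketch below is plain Monte Carlo in Python, with a standard normal and an indicator function picked as assumptions so the answer can be checked against the closed form.

```python
import math
import random

def expectation(f, sampler, n, rng):
    """Monte Carlo estimate of E[f(X)], with X drawn by sampler(rng)."""
    return sum(f(sampler(rng)) for _ in range(n)) / n

rng = random.Random(0)
# P(X > 1) for X ~ Normal(0, 1), written as the expectation of an indicator.
estimate = expectation(
    lambda x: 1.0 if x > 1.0 else 0.0,
    lambda r: r.gauss(0.0, 1.0),
    100000,
    rng,
)
exact = 0.5 * (1.0 - math.erf(1.0 / math.sqrt(2.0)))  # 1 - Phi(1)
print(estimate, exact)
```

The hard part in practice is that the distribution of interest is usually a posterior you cannot sample from directly, which is where the inference machinery of a PPL comes in.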

Most of them are from engineering fields, not from computer science. Their interest is in designing and implementing engineering models rather than in programming itself. So they are not as adventurous as computer science people when it comes to investing in new programming tools (languages).

Also, I am pretty sure more than 90% of them have never learned functional programming or related computational theories, even in schools.

You're replying to a guy who's been in 2 computer science labs focused on robotics! It is true that a lot of people have an EE/ME background, but I think you'd be surprised at the number of CS people in robotics. Especially in SLAM, which I mentioned, given that it is a probabilistic inference problem.

Typed probabilistic real-time machine learning? Hell yeah!
