It's probably just me being dense, but I don't understand how much prob. prog. differs from “traditional” programming.

I just read the tutorials & examples for Turing and Gen, and to me, it seems that they are more or less nifty DSLs for expressing statistical models, but their functionalities could be easily replicated, albeit in an uglier way, with standard Julia.

Is there some deeper theoretical divergence I missed, or are they just strongly facilitating the expression of statistical models?

This basically applies to all programming? Reconsider this thought in the context of automatic differentiation, where a similar argument can be made. I doubt, however, that people would contest the utility of automatic differentiation systems.

I also disagree that their functionality could be easily replicated in standard Julia. What you see is the easiest way to provide this functionality in a model-agnostic way.

That's true. But if you should describe the essence of probabilistic programming to someone used only to “classical” scientific computing, what would be the key point(s)?

BTW, I'm sorry if I came off as dismissive, I'm only frustrated not to grasp why it's considered to be such a step forward. AD I get it, all the differentials are automagically imputed directly from the source code, which is something that practically could not be done otherwise. But what I get from Turing or Gen are just nifty DSLs.

AD does automatic differentiation, PLLs transform a generative model into some suitable form to perform automatic Bayesian inference, e.g. by using AD and black-box variational inference. Or said differently, in a PPL you specify the forward simulation of a generative process and the PPL helps to automatically invert this process using black-box algorithms and suitable transformations.

Without a PPL, you would traditionally write your code for your model and would have to implement a suitable inference algorithm yourself.
With a PPL you only specify the generative process and don't have to implement the inference side of things nor care about an implementation of your model that is suitable for inference.

If you want something pithy. AD makes it easy to compute derivatives, PPLs make it easy to compute integrals. In particularly, the kind that come from taking the expectation of a function with respect to some probability distribution.

I just read the tutorials & examples for Turing and Gen, and to me, it seems that they are more or less nifty DSLs for expressing statistical models, but their functionalities could be easily replicated, albeit in an uglier way, with standard Julia.

Is there some deeper theoretical divergence I missed, or are they just strongly facilitating the expression of statistical models?