I think Stan and PyMC3 both focus on optimized implementations of Hamiltonian Monte Carlo (HMC), a Markov chain Monte Carlo algorithm that requires you to express models whose target log probability densities are differentiable with respect to their sample spaces. Stan provides other algorithms as well, and possibly PyMC3 does too, but both systems are best known for their HMC implementations (similar to NumPyro, which has a highly optimized version of HMC).
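To make the differentiability requirement concrete, here is a minimal sketch (plain NumPy, not the API of Stan, PyMC3, or NumPyro) of the leapfrog integrator at the heart of HMC. The whole scheme hinges on being able to evaluate the gradient of the log density; here the target is a standard normal, so `grad_logp` is available in closed form:

```python
import numpy as np

# Minimal HMC leapfrog sketch for a target with a differentiable
# log density. Target: standard normal, so the gradient is just -q.

def logp(q):
    return -0.5 * np.sum(q ** 2)            # log N(0, I), up to a constant

def grad_logp(q):
    return -q                               # gradient of logp w.r.t. q

def leapfrog(q, p, step_size=0.1, n_steps=20):
    p = p + 0.5 * step_size * grad_logp(q)  # initial half step on momentum
    for _ in range(n_steps - 1):
        q = q + step_size * p               # full step on position
        p = p + step_size * grad_logp(q)    # full step on momentum
    q = q + step_size * p
    p = p + 0.5 * step_size * grad_logp(q)  # final half step on momentum
    return q, p

rng = np.random.default_rng(0)
q = rng.normal(size=3)
p = rng.normal(size=3)
q_new, p_new = leapfrog(q, p)
# The trajectory approximately conserves the Hamiltonian
# H(q, p) = -logp(q) + 0.5 * p @ p, which is what makes HMC
# proposals efficient - and why the gradient must exist.
```

If `logp` involved discrete latent variables or non-differentiable control flow, `grad_logp` would not exist and this machinery would not apply, which is exactly the restriction described above.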
This certainly does not cover all probabilistic programs, but many popular models do fall in this category (where you can use HMC for inference). In both cases, I'm also unsure whether these systems are "universal" in the sense that you can express any stochastic computable function which halts with probability 1. Analogous to Turing completeness: if the system does not let you express control flow with runtime-determined bounds, or disallows stochastic bounds, it is not universal. This is not typically a bad thing, because most classical models don't require this feature, but it does tend to separate PP frameworks.
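As a concrete illustration of what a stochastic loop bound means (plain Python here, no PPL), consider a model where the number of sample statements executed is itself random. This program halts with probability 1, but its latent dimension varies from run to run, which a fixed differentiable-density framework cannot express directly:

```python
import random

# A model with stochastic control flow: the loop bound is decided
# by coin flips at runtime, so the number of latent draws (the
# dimension of the sample space) is itself a random quantity.

def stochastic_length_model(rng):
    xs = []
    while rng.random() < 0.5:        # stochastic loop bound
        xs.append(rng.gauss(0, 1))   # one more latent variable
    return xs                        # length varies run to run

draws = stochastic_length_model(random.Random(42))
```

A universal PPL must admit programs like this; an HMC-centric system typically requires the set of latent variables (and hence the density's domain) to be fixed up front.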
As a shameless plug, my own library doesn't rely on AST metaprogramming (e.g. macros) but instead relies on dynamic compiler interception, which is metaprogramming of a different sort.
PS: for another PP system which doesn't rely on metaprogramming, you could explore Probabilistic C: http://proceedings.mlr.press/v32/paige14.pdf
The downside is an API that is trickier to write and debug on the inference developer's side. Do link your library; I try to evaluate every PPS I encounter.
And being rough around the edges is an ok price to pay for not losing an ecosystem.
Most pre- and post- processing (with the exception of visualization) should probably be a part of your model!