We, the PyMC core development team, are incredibly excited to announce the release of a major rewrite of PyMC3 (now called just PyMC): 4.0. Internally, we have already been using PyMC 4.0 almost exclusively for many months and found it to be very stable and better in every aspect. Every user should upgrade, as there are many exciting new updates that we will talk about in this and upcoming blog posts.

*Graphic by Ravin Kumar*
## Full API compatibility for model building
To get the main question out of the way: Yes, you can just keep your existing PyMC modeling code without having to change anything (in most cases) and get all the improvements for free. The only thing most users will have to change is the import: from `import pymc3 as pm` to `import pymc as pm`. For more information, see the quick migration guide. If you are using more advanced features of PyMC beyond the modeling API, you might have to change some things.
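In code, the whole migration for most models looks like this:

```python
# Before (PyMC3)
import pymc3 as pm

# After (PyMC 4.0)
import pymc as pm
```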
## It’s now called PyMC instead of PyMC3
First, the biggest news: PyMC3 has been renamed to PyMC. PyMC3 version 3.x will stay under the current name so as not to break production systems, but future versions will use the PyMC name everywhere. While there were a few reasons for this, the main one is that “PyMC3 4.0” looks quite confusing.
## Theano → Aesara
While evaluating other tensor libraries like TensorFlow and PyTorch as new backends, we realized how amazing and unique Theano really was. It has a mature and hackable code base and a simple graph representation that allows easy graph manipulations, something that’s very useful for probabilistic programming languages. In addition, TensorFlow and PyTorch focus on a dynamic graph, which is useful for some things, but for a probabilistic programming package a static graph is actually much better, and Theano is the only library that provided this.
So, we went ahead and forked the Theano library and undertook a massive clean-up of the code base (a charge led by Brandon Willard), removing swaths of old and obscure code and restructuring the entire library to be more developer friendly. This rewrite motivated renaming the package to Aesara (Theano’s daughter in Greek mythology). A new developer team quickly formed around improving Aesara independently of PyMC.
## What’s new in PyMC 4.0?
Alright, let’s get to the good stuff. What makes PyMC 4.0 so awesome?
### New JAX backend for faster sampling
By far the shiniest new feature is the new JAX backend and the associated speed-ups.
How does it work? Aesara provides a representation of the model logp graph in the form of various Aesara `Op`s (operators), which represent the computations to be performed. For example, `exp(x + y)` would be an `Add` `Op` with two input arguments `x` and `y`. The result of the `Add` `Op` is then fed into an `exp` `Op`.
This computation graph doesn’t say anything about how we actually execute it, just what operations we want to perform. The functionality inherited from Theano is to transpile this graph to C code, which then gets compiled, loaded into Python as a C extension, and executed very fast. But instead of transpiling the graph to C, Aesara can now also target JAX.
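As a sketch of what that looks like (reusing the graph `z` from above, and assuming you have jax installed), the backend is chosen when the graph is compiled into a function:

```python
# Default: aesara transpiles the graph to C, compiles it, and loads
# it into Python as an extension
f_c = aesara.function([x, y], z)

# The same graph, compiled through JAX instead
f_jax = aesara.function([x, y], z, mode="JAX")

# Both functions compute exp(x + y)
print(f_c(1.0, 2.0), f_jax(1.0, 2.0))
```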
This is very exciting, because JAX (through XLA) is capable of a whole bunch of low-level optimizations which lead to faster model evaluation, in addition to being able to run your PyMC model on the GPU.
Even more exciting is that this allows us to combine the JAX code that executes the PyMC model with an MCMC sampler also written in JAX. That way, the model evaluation and the sampler form one big JAX graph that gets optimized and executed without any Python call overhead. We currently support NUTS implementations provided by numpyro as well as blackjax.
Early experiments and benchmarks show impressive speed-ups. Here is a small example of how much faster this is on a fairly small and simple model: the hierarchical linear regression of the famous Radon example.
In order to do side-by-side comparisons, let’s import both the old PyMC3 and Theano as well as the new PyMC 4.0 and Aesara. For your own use case, you will only need the new `pymc` package, of course.
```python
# PyMC3 imports
import pymc3 as pm3
import theano.tensor as tt
import theano

# PyMC 4.0 imports
import pymc as pm
import aesara.tensor as at
import aesara

# Used for data loading and preprocessing below
import numpy as np
import pandas as pd
```
Load in the radon dataset and preprocess it:
```python
data = pd.read_csv(pm.get_data("radon.csv"))
county_names = data.county.unique()
data["log_radon"] = data["log_radon"].astype(theano.config.floatX)
county_idx, counties = pd.factorize(data.county)
coords = {"county": counties, "obs_id": np.arange(len(county_idx))}
```
Next, let’s define a hierarchical regression model inside of a function (see this blog post for a description of this model). Note that we pass `pm`, our PyMC library, in as an argument. This is a bit unusual but allows us to create this model in `pymc3` or `pymc` 4.0, depending on which module we pass in. It also demonstrates that most models that work in `pymc3` work in `pymc` 4.0 without any code change; you only need to change your imports.
```python
def build_model(pm):
    with pm.Model(coords=coords) as hierarchical_model:
        # Intercepts, non-centered
        mu_a = pm.Normal("mu_a", mu=0.0, sigma=10)
        sigma_a = pm.HalfNormal("sigma_a", 1.0)
        a = pm.Normal("a", dims="county") * sigma_a + mu_a

        # Slopes, non-centered
        mu_b = pm.Normal("mu_b", mu=0.0, sigma=2.0)
        sigma_b = pm.HalfNormal("sigma_b", 1.0)
        b = pm.Normal("b", dims="county") * sigma_b + mu_b

        # Model error
        eps = pm.HalfNormal("eps", 1.5)

        radon_est = a[county_idx] + b[county_idx] * data.floor.values

        # Data likelihood
        radon_like = pm.Normal(
            "radon_like", mu=radon_est, sigma=eps, observed=data.log_radon, dims="obs_id"
        )
    return hierarchical_model
```
Create and sample the model in `pymc3`; nothing special:
```python
model_pymc3 = build_model(pm3)
```
```python
%%time
with model_pymc3:
    idata_pymc3 = pm3.sample(target_accept=0.9, return_inferencedata=True)
```
```
100.00% [8000/8000 00:12<00:00 Sampling 4 chains, 1 divergences]
CPU times: user 1.8 s, sys: 229 ms, total: 2.03 s
Wall time: 21.5 s
```
Create and sample the model in `pymc` 4.0, also nothing special (but note that `pm.sample()` now returns an `InferenceData` object by default):
```python
model_pymc4 = build_model(pm)
```
```python
%%time
with model_pymc4:
    idata_pymc4 = pm.sample(target_accept=0.9)
```
```
100.00% [8000/8000 00:07<00:00 Sampling 4 chains, 0 divergences]
CPU times: user 4.08 s, sys: 315 ms, total: 4.39 s
Wall time: 16.9 s
```
Now, let’s use a JAX sampler instead. Here we use the one provided by numpyro. These samplers live in a separate submodule, `sampling_jax`, but the plan is to integrate them into `pymc.sample(backend="JAX")`.
```python
%%time
with model_pymc4:
    idata = pm.sampling_jax.sample_numpyro_nuts(target_accept=0.9, progress_bar=False)
```
```
Compiling...
Compilation time = 0:00:00.648311
Sampling...
Sampling time = 0:00:04.195409
Transforming variables...
Transformation time = 0:00:00.025698
CPU times: user 7.51 s, sys: 108 ms, total: 7.62 s
Wall time: 5.01 s
```
That’s a 3x speed-up for a single-line code change (although we’ve seen much more impressive speed-ups, in the 20x range)! And this is just running things on the CPU; we can just as easily run this on the GPU, where we saw even more impressive speed-ups (especially as we scale the data).
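One way to try the GPU yourself (a sketch on our part, not part of the benchmark above; it assumes a CUDA-enabled jax installation) is to ask jax for the GPU platform before sampling:

```python
import jax

# Ask jax to place computations on the GPU
# (assumes a CUDA-enabled jax installation)
jax.config.update("jax_platform_name", "gpu")

with model_pymc4:
    idata_gpu = pm.sampling_jax.sample_numpyro_nuts(target_accept=0.9)
```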
Again, for a more proper benchmark that also compares this to Stan, see this blog post.
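As an aside, the blackjax-based NUTS sampler mentioned earlier lives in the same submodule; a minimal sketch (assuming blackjax is installed):

```python
with model_pymc4:
    idata_blackjax = pm.sampling_jax.sample_blackjax_nuts(target_accept=0.9)
```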
### Better integration into aesara

The next feature we are excited about is a better integration of PyMC into `aesara`.
In PyMC3 3.x, the random variables (RVs) created by e.g. calling `x = pm.Normal('x')` were not truly Theano `Op`s, so they did not integrate as nicely with the rest of Theano. This created a lot of issues, limitations, and complexities in the library.
Aesara now provides a proper `RandomVariable` `Op` which integrates seamlessly with all the other `Op`s.
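For example (a minimal sketch, not from the original post), random variables can now be created and composed like any other tensor directly in aesara:

```python
import aesara.tensor as at

# A RandomVariable Op in the graph, just like Add or Exp
x = at.random.normal(0, 1, name="x")

# It composes with other tensor Ops like any other node
y = at.exp(x) + 1

# Evaluating the graph draws a value for x
print(y.eval())
```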
This is a major change in 4.0 and led to huge swaths of brittle code in PyMC3 being removed or greatly simplified. In many ways, this change is much more exciting than the different co