Python is a slow language, so computation is best delegated to code written in something faster.
You can do this with existing libraries like NumPy and SciPy, but what happens when you need to implement a new algorithm, and you don’t want to write code in a lower-level language?
For certain types of computation, in particular array-focused code, the Numba library can significantly speed up your code.
Sometimes you’ll need to tweak your code a bit; sometimes it’ll just work with no changes.
And when it works, it’s a very transparent speed fix.
In this article we’ll cover:
- Why using NumPy on its own is sometimes not enough.
- The basics of using Numba.
- How Numba works, at a high level, and the difference that makes to how your code runs.
When NumPy doesn’t help
Let’s say you have a very large array, and you want to calculate the monotonically increasing version: values can go up, but never down.
For example:
[1, 2, 1, 3, 3, 5, 4, 6] → [1, 2, 2, 3, 3, 5, 5, 6]
Here’s a straightforward in-place implementation:
def monotonically_increasing(a):
    max_value = 0
    for i in range(len(a)):
        if a[i] > max_value:
            max_value = a[i]
        a[i] = max_value
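As a quick sanity check, here is a minimal sketch that runs the function on the example array from above. One caveat, noted in the comments: because `max_value` starts at 0, this version assumes the array holds non-negative values.

```python
import numpy as np

def monotonically_increasing(a):
    # Same in-place loop as above. Since max_value starts at 0,
    # any negative entries would be replaced by 0.
    max_value = 0
    for i in range(len(a)):
        if a[i] > max_value:
            max_value = a[i]
        a[i] = max_value

a = np.array([1, 2, 1, 3, 3, 5, 4, 6])
monotonically_increasing(a)  # modifies a in place, returns None
print(a)  # [1 2 2 3 3 5 5 6]
```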
There’s a problem, though.
NumPy is fast because it can do all its calculations without calling back into Python.
Since this function involves looping in Python, we lose all the performance benefits of using NumPy.
For a 10,000,000-entry NumPy array, this function takes 2.5 seconds to run on my computer.
Can we do better?
Numba can speed things up
Numba is a just-in-time compiler for Python specifically focused on code that runs in loops over NumPy arrays.
Exactly what we need!
All we have to do is add two lines of code:
from numba import njit

@njit
def monotonically_increasing(a):
    max_value = 0
    for i in range(len(a)):
        if a[i] > max_value:
            max_value = a[i]
        a[i] = max_value
This runs in 0.19 seconds, about 13× faster; not bad for just reusing the same code!
Of course, it turns out that NumPy already has a function that does this: numpy.maximum.accumulate.
Using that, the run takes only 0.03 seconds.
| Implementation | Runtime |
|---|---|
| Python for loop | 2560ms |
| Numba for loop | 190ms |
| np.maximum.accumulate | 30ms |
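For reference, a minimal sketch of the NumPy call on the example array from earlier. Unlike the hand-written loop, it returns a new array and leaves the input untouched:

```python
import numpy as np

a = np.array([1, 2, 1, 3, 3, 5, 4, 6])
result = np.maximum.accumulate(a)  # running maximum along the array

print(result)  # [1 2 2 3 3 5 5 6]
print(a)       # [1 2 1 3 3 5 4 6] (input is unchanged)
```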
Introducing Numba
When you can find a NumPy or SciPy function that does what you want, problem solved.
But what if numpy.maximum.accumulate hadn’t existed?
At that point, the other way to get a fast result would be to write some low-level code, but that means switching programming languages, adopting a more complex build system, and taking on more complexity in general.
With Numba you can:
- Run the same code both in normal Python, and in a faster compiled version, from inside the normal interpreter runtime.
- Easily and quickly iterate on algorithms.
Numba parses the code, and then compiles it in a j