fastplotlib: a new GPU-accelerated, fast, and interactive scientific plotting library that leverages WGPU
Introduction
Scientific visualization is hard — but it doesn’t have to be.
What makes scientific visualization so challenging?
- High-dimensional, massive datasets — often spanning terabytes of information
- Computational bottlenecks — efficiently utilizing resources is non-trivial
- Limited interactive tools — most are designed for static plotting or do not scale to large data
- Real-time analysis barriers — visualization lags behind modern data generation speeds
Enter fastplotlib, which is built for high-performance, interactive scientific visualization.
fastplotlib is a very new open-source, Python-based, GPU-accelerated scientific plotting library.
What can you do with fastplotlib?
- GPU-accelerated visualization (a modern integrated GPU is sufficient for most use cases)
- Rapid prototyping and algorithm design
- Exploration and fast rendering of large-scale data
- Creation of real-time acquisition systems for instruments
Note: this is by no means an exhaustive list, just the highlights :)
Agenda
In the remainder of this article, I am going to cover the following topics that highlight why fastplotlib
is a powerful tool that can be used to drive scientific discovery through data visualization:
- Scientific visualization is more than just static plots
- API design matters
- Leveraging new hardware is critical
Scientific visualization is more than just static plots
While scientific visualization has traditionally relied on static plots, dynamic and interactive visualizations are the key to enhancing data exploration and analysis.
For example, consider the interactive plot depicted below.
This plot shows a simple interactive visualization of a covariance matrix in fastplotlib. As a formal definition, a covariance matrix gives a measure of how pairs of random variables change together.
In the left subplot, the covariance matrix for the Olivetti faces dataset is shown. In this case, each entry in the covariance matrix indicates how the intensities of a pair of pixels vary together across the faces in the dataset. In the right subplot is the reconstructed row image from the currently selected row of the covariance matrix. We can easily change which row of the covariance matrix we are looking at by simply moving the selector.
It is very clear visually that looking at the reconstructed image for each row of the covariance matrix is much more informative than having a static plot of just the covariance matrix by itself. By looking at the reconstructed row image, we can get a sense of how the pixel intensity is changing across faces in the dataset.
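To make the idea of a "row image" concrete, here is a minimal NumPy-only sketch. It assumes the row image is simply one row of the pixel-by-pixel covariance matrix reshaped back to the image dimensions, and it uses small random images as a stand-in for the Olivetti faces so that it runs without downloading anything. The variable names are illustrative and are not taken from fastplotlib.

```python
import numpy as np

# Synthetic stand-in for an image dataset (assumption: random 8x8 images
# replace the real face images so the example runs offline).
rng = np.random.default_rng(0)
n_images, height, width = 50, 8, 8
images = rng.random((n_images, height, width))

# Flatten each image into one row: shape (n_images, n_pixels).
flat = images.reshape(n_images, height * width)

# rowvar=False treats each column (pixel) as a variable, so the result
# is an (n_pixels, n_pixels) covariance matrix.
cov = np.cov(flat, rowvar=False)

# Pick one row of the covariance matrix and reshape it to image dimensions.
# Each value says how that pixel co-varies with the selected pixel.
row_index = 10
row_image = cov[row_index].reshape(height, width)

print(cov.shape, row_image.shape)
```

In an interactive plot, moving the selector would amount to changing `row_index` and redrawing `row_image`.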
The purpose of displaying this plot is not to conduct a detailed analysis of the covariance matrix but to demonstrate how even a small degree of interactivity can enhance our understanding of the data, ultimately transforming the types of analysis we may pursue in the future.
The goal of fastplotlib is to expand the field of scientific visualization by providing a mechanism that allows the creation of high-level interactive plots, ultimately driving scientific discovery.
API design matters
The ecosystem for scientific visualization has come a long way since the early 2000s.
These days, there are many open-source Python visualization tools available. However, one limiting factor for scientists and other users is the high barrier to entry in learning how to use some of these libraries. Often, users are forced to learn complicated APIs that make it difficult to focus on their data or research questions.
In fastplotlib, we aim to provide fast interactive visualization via an easy-to-use and intuitive API.
1) Data interaction
The premise behind our API design is that you should never have to think of your data as anything but an array.
If the data in our visualization maintains an array-like structure we are familiar with, interacting with the visualization becomes much easier.
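As a rough illustration of what "array-like" interaction means, the sketch below uses plain NumPy only. The `colors` array is a hypothetical stand-in for per-point visual properties, not fastplotlib's actual internals. The point is that an ordinary slice assignment is all it takes to restyle a subset of points.

```python
import numpy as np

# Line data as an (n_points, 2) array of x/y pairs.
xs = np.linspace(-10, 10, 100)
ys = xs ** 2
data = np.column_stack([xs, ys])

# Hypothetical stand-in for per-point colors: one RGBA row per data point,
# initialised to opaque white.
colors = np.ones((len(data), 4))

# Ordinary slice assignment recolors every third point red.
colors[::3] = [1.0, 0.0, 0.0, 1.0]

print(data.shape, colors[:4])
```

If a plotting object exposes its properties with the same indexing semantics, no new vocabulary is needed beyond standard array slicing.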
Consider the following example:
Suppose we want to plot a simple sine wave.
import fastplotlib as fpl
import numpy as np

# generate some data
xs = np.linspace(-10, 10, 100)
ys = np.sin(xs)
data = np.dstack([xs, ys])[0]
# create a figure
figure = fpl.Figure(
18 Comments
ZeroCool2u
Seems like a nice library, but I have a hard time seeing myself using it over plotly. The plotly express API is just so simple and easy. For example, here's the docs for the histogram plot: https://plotly.com/python/histograms/
This code gives you a fully interactive, and performant, histogram plot:
```python
import plotly.express as px
df = px.data.tips()
fig = px.histogram(df, x="total_bill")
fig.show()
```
asangha
>sine_wave.colors[::3] = "red"
I never knew I needed this until now
sfpotter
Very cool effort. That said, and it's probably because of the kind of work that I do, but I have almost never found the four challenges to be any kind of a problem for me. Although I do think there is some kind of contradiction there. Plotting (exploratory data analysis ("EDA"), really) is all about distilling key insights and finding features hidden in data. But you have to have some kind of intuition about where the needle in the haystack is. IME, throwing up a ton of plots and being able to scrub around in them never seems to provide much insight. It's also very fast: usually the feedback loop is like "make a plot, go away and think about it for an hour, decide what plot I need to make next, repeat". If there is too much data on the screen it defeats the point of EDA a little bit.
For me, matplotlib still reigns supreme. Rather than a fancy new visualization framework, I'd love for matplotlib to just be improved (admittedly, fastplotlib covers a different set of needs than what matplotlib does… but the author named it what they named it, so they have invited comparison. ;-) ).
Two things for me at least that would go a long way:
1) Better 3D plotting. It sucks, it's slow, it's basically unusable, although I do like how it looks most of the time. I mainly use PyVista now but it sure would be nice to have the power of a PyVista in a matplotlib subplot with a style consistent with the rest of matplotlib.
2) Some kind of WYSIWYG editor that will let you propagate changes back into your plot easily. It's faster and easier to adjust your plot layout visually rather than in code. I'd love to be able to make a plot, open up a WYSIWYG editor, lay things out a bit, and have those changes propagate back to code so that I can save it for all time.
(If these features already exist I'll be ecstatic ;-) )
paddy_m
Really nice post introducing your library.
When would you reach for a different library instead of fastplotlib?
How does this deal with really large datasets? Are you doing any type of downsampling?
How does this work with pandas? I didn't see it as a requirement in setup.py
Does this work in Jupyter notebooks? What about marimo?
carabiner
GPU all the things! GPU-accelerated Tableau would be incredible.
pama
I know 3D is in the roadmap. Once the basic functionality is in place, it would be great to also consider integrating molecular visualization or at least provide enough fast primitives to simplify the integration of molecular visualization tools with this library.
theLiminator
Do you have any numbers for the rough number of datapoints that can be handled? I'm curious if this enables plotting many millions of datapoints in a scatterplot for example.
CreRecombinase
Every two weeks or so I peruse github looking for something like this and I have to say this looks really promising. In statistical genetics we make really big scatterplots called Manhattan plots https://en.wikipedia.org/wiki/Manhattan_plot and we have to use all this highly specialized software to visualize at different scales (for a sense of what this looks like: https://my.locuszoom.org/gwas/236887/). Excited to try this out
abdullahkhalids
Is it possible to put the interactive plots on your website? Or is this a Jupyter notebook only tool.
meisel
One of the big bottlenecks of plotting libraries is simply the time it takes to import the library. I’ve seen matplotlib being slow to import, and in Julia they even have a “time to first plot” metric. I’d be curious to see how this library compares.
749402826
"Fast" is a bold claim, given the complete lack of benchmarks and the fact that it's written entirely in Python…
klaussilveira
Very cool to see imgui empowering so many different things.
rossant
Shameless plug: I'm actively working on a similar project, Datoviz [1], a C/C++ library with thin Python bindings (ctypes). It supports both 2D and 3D but is currently less mature and feature-complete than fastplotlib. It is also lower level (high-level capabilities will soon be provided by VisPy 2.0 which will be built on top of Datoviz, among other possible backends).
My focus is primarily on raw performance, visual quality, and scalability for large datasets—millions, tens of millions of points, or even more.
[1] https://datoviz.org/
qoez
This smells of claude generated code for some reason.
roter
Very interesting and promising package.
I especially like that there is a PyQt interface which might provide an alternative to another great package: pyqtgraph[0].
[0] https://github.com/pyqtgraph/pyqtgraph
gooboo
Yeah, many browsers have WebGPU turned off by default, so you're stuck with wasm (wasm SIMD if you're lucky). Hopefully both are implemented.
crazygringo
Sounds really compelling.
But it doesn't seem to answer how it works in Jupyter notebooks, or if it does at all. Is the GPU acceleration done "client-side" (JavaScript?) or "server-side" (in the kernel?) or is there an option for both?
Because I've used supposedly fast visualization libraries in Google Colab before, but instead of updating at 30 fps, it takes 2 seconds to update after a click, because after the new image is rendered it has to be transmitted via the Jupyter connector and network and that can turn out to be really slow.