16 Comments
dchuk
I’ve been tinkering with Marimo; it’s pretty sweet (and you can use Cursor or other AI IDEs with it pretty easily).
On running notebooks as scripts: I can’t find in the docs what happens if you have plotting and other notebook-oriented code. For example, I’m using pygwalker to explore data through transformation steps, ending with a save to CSV. If I just run the notebook as a script, is all of the plotting automatically skipped?
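One way to make the behavior explicit, whatever the default is: guard display-only code behind a mode check. My understanding from marimo's docs is that `mo.app_meta().mode` reports how the notebook was launched — treat that accessor as an assumption; the guard pattern itself is sketched here with plain Python:

```python
def should_show_plots(mode):
    """Show heavy interactive output only when editing in the notebook UI."""
    return mode == "edit"

# Inside marimo, `mode` would come from mo.app_meta().mode (an assumption on
# my part). Run as `python notebook.py`, the mode would not be "edit", so a
# pygwalker explorer behind this guard is skipped while the data
# transformations and the CSV save still execute.
if should_show_plots("script"):
    print("launching interactive explorer")
else:
    print("plots skipped; transformations and CSV save still run")
```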
floathub
One approach to this is org-mode with babel.
You can have a plaintext file which is also the program which is also the documentation/notebook/website/etc. It's extremely powerful, and is a compelling example of literate programming.
Decent overview here:
https://www.johndcook.com/blog/2022/08/02/org-babel-vs-jupyt…
[edit: better link]
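For anyone who hasn't seen it, a minimal sketch of the org-babel shape (syntax per the Org Babel manual; the embedded code is illustrative):

```org
#+TITLE: One file: prose, program, and document
* Exploration
Narrative text lives in Org markup, then an executable block:
#+begin_src python :results output
print(2 + 2)
#+end_src
Evaluating the block (C-c C-c) inserts its result right below it, and the
same file can be exported to HTML/PDF or tangled out to plain .py files.
```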
florbnit
> When working with Jupyter, too often you end up with directories strewn with spaghetti-code notebooks, counting up to Untitled12.ipynb or higher. You the notebook author don’t know what’s in these notebooks
This is such a small UX thing, but it’s so damn important. The simple fix is to not auto-name notebooks untitled-#: when the user clicks “new notebook”, ask for a name straight away, and if they can’t name it, don’t create it. It might add the smallest amount of friction to the UX, but it’s so damn important.
Also, the choice of JSON as the file format is just plain wrong. Why the project hasn’t abandoned it entirely and done a JSON ↔ Python conversion back and forth when writing to file is beyond me. There are extensions that do this, but that’s a really clunky interface, and while I can set it up for myself, it’s difficult to force upon others in a corporate environment.
Great to see someone taking up the seemingly small things, because they make a world of difference to the overall ecosystem.
paddy_m
I develop an open source notebook widget. Working with marimo has been a joy compared to developing on top of any other notebook environment.
The team is responsive and they care about getting it right. Having a sane file format for serializing notebooks is an example of this. They are thinking about core problems. They are also building in the open.
The core Jupyter team is very unresponsive and unfocused. When you have a bug, you need to figure out which of the many interrelated projects caused it, and issues go weeks without a response. It's a mess.
Then there are the proprietary notebook-like environments, VSCode notebooks and Google Colab in particular. They frequently rely on opaque, undocumented APIs and are also very unresponsive.
jdaw0
I wanted to like marimo, but the best notebook interface I've tried so far is VS Code's interactive window [0]. The important thing is that it's a Python file first, but you can divide the code into cells to run in the Jupyter kernel, either all at once or interactively.
0: https://code.visualstudio.com/docs/python/jupyter-support-py
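For anyone who hasn't tried it, the format is just an ordinary .py file with `# %%` markers, which is the convention VS Code documents:

```python
# %% [markdown]
# # Exploration notes
# This is a plain Python file; "# %%" lines mark cell boundaries.

# %%
import statistics

data = [3, 1, 4, 1, 5, 9]

# %%
# Send this cell alone to the interactive window, or run the whole
# file as an ordinary script -- same source either way.
mean = statistics.mean(data)
print(mean)
```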
abdullahkhalids
One design decision they made is that outputs are not stored. This means these notebooks are not a suitable replacement for heavy computation routines, where the notebook is a record of the final results. Other people are not expected to re-run minutes- or hours-long computations to see what the author intended.
You can work around it by storing the results in separate file(s) and writing the boilerplate to let the reader load them. Or they let you export to ipynb – which still means sharing two files.
Presumably the reason for this decision is making git diffs short. But to me the solution is to fix git diff to operate on JSON nicely, rather than changing the entire notebook format.
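That fix largely exists already: nbdime teaches git to diff and merge notebook JSON cell by cell rather than line by line (commands per the nbdime docs; assumes `pip install nbdime`):

```shell
# register nbdime's notebook-aware diff/merge drivers with git
nbdime config-git --enable --global

# subsequent diffs of .ipynb files render per-cell changes, not raw JSON
git diff analysis.ipynb
```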
dmadisetti
Very excited for the top-level functions change coming up!
TheAlchemist
This looks really, really neat.
One (not great) workflow I have is that I use notebooks as quick UIs to visualize some results.
1. Run a simulation that outputs results to some file
2. Load results in a notebook and do some quick processing + visualization
Very often I want to quickly compare two different runs, and I end up copying the visualization cell down, then just re-running the data load + processing + visualization and comparing them.
My understanding is that this would not be possible with marimo, since it will automatically re-run the cell with my previous data, right?
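In a reactive model the usual workaround is to make the run a parameter instead of duplicating the cell, so one cell computes both runs side by side. A rough sketch — `load_results` and `process` are hypothetical stand-ins for the real pipeline:

```python
def load_results(path):
    # hypothetical stand-in for reading a simulation output file
    fake_files = {"run_a.json": [1.0, 2.0, 3.0], "run_b.json": [2.0, 4.0, 6.0]}
    return fake_files[path]

def process(values):
    # hypothetical stand-in for the quick processing step
    return [v * 10 for v in values]

def compare(paths):
    # one cell produces every requested run, ready to plot side by side
    return {p: process(load_results(p)) for p in paths}

results = compare(["run_a.json", "run_b.json"])
print(results)
```

Adding a third run then means appending one path, not copy-pasting another cell.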
randomNumber7
I recently started in data science. I like to refactor my stuff out into normal Python files that I import in notebooks.
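That pattern keeps notebooks thin — e.g. a hypothetical `helpers.py` holding the reusable bits, so each notebook cell shrinks to an import plus a call:

```python
# helpers.py (hypothetical module refactored out of a notebook)
def clean(values):
    """Drop None entries and coerce the rest to float."""
    return [float(v) for v in values if v is not None]

# in the notebook, the cell becomes `from helpers import clean` plus a call
cleaned = clean([1, None, "2.5", None])
print(cleaned)
```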
lostdog
They kinda skip over it, but jupytext is underrated. You store the notebook primarily as text, and your version control handles it seamlessly.
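Pairing is a one-time command (per the jupytext CLI docs; assumes jupytext is installed):

```shell
# keep an .ipynb and a plain-text .py twin in lockstep
jupytext --set-formats ipynb,py:percent notebook.ipynb
jupytext --sync notebook.ipynb   # re-run after editing either file
```

The .py twin is what you commit; git then diffs it like any other source file.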
idanp
For lightweight calculations, jupad re-executes everything while typing cells: https://github.com/idanpa/jupad
epistasis
I have looked at Marimo in the past, and read this blog post with great interest, but I still don't "get" Marimo. What it does well: providing a sane way to create and interact with widgets. Lots of widget authors and tooling authors, people I respect a lot, admire Marimo and like how it does stuff.
However, I'm not sure what the use case is for Marimo. I see Jupyter notebooks being used in two primary use cases: 1) prototyping new code and interactions with services and databases and datasets, as a record of the REPL used to understand something, with interactive notes and plots and pasted-in images from docs, etc.; 2) a record of how a calculation was performed or experimental data was analyzed, as a permanent artifact that others can look up later. For both of these, outputs and markdown/image cells are just as important as the code cells. These are both "write once" types of things where changes in git are rare, and ideally would never happen.
With Marimo, can I check the outputs directory into version control in a reasonable way and have it stored for posterity? Is that .ipynb?
Is there a way to convert a stored .ipynb checkpoint back into the marimo format?
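My understanding from marimo's CLI docs is that a one-way converter exists — treat the exact invocation as an assumption:

```shell
# convert a Jupyter notebook into a marimo notebook (a plain .py file)
marimo convert notebook.ipynb > notebook.py
```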
And why does a small .ipynb change lead to many lines of change in the git diff? It's because the outputs changed. Deciding to not store outputs in version control and counting it as a win for pretty git diffs is saying "this core feature of .ipynb should be ignored because it's inconvenient". I'd much rather educate people about turning on GitHub's visual Jupyter diff rather than switch to an environment where I can no longer store outputs inline.
Similarly, being able to import one cell into a different notebook seems like the wrong direction to solve the problem of "it's time to turn the prototype notebook into a reusable module." If it's time to reuse a cell, it's time to make a cleaned-up Python code module file, not have the code interspersed with all the rest of the stuff.
I'd like to learn more about the use cases where Marimo is useful. As a scientist, it's not useful to me. I don't care about smaller git diffs on a notebook; in fact, if a notebook is getting changed and re-checked into version control, then a big awkward diff is not a problem and probably a feature, because notebooks should not be getting changed. They are a notebook: something that you write in once, and then it's done!
cjohnson318
I sometimes use notebooks mainly for taking notes, with a few code samples. In these cases, dealing with ipykernel and firing up a notebook is kind of a pain. Being able to open a "notebook" and make changes in vim sounds great.
Kydlaw
I discovered Marimo a couple of weeks/months ago, here, IIRC. It really lands on a sweet spot for me for data exploration. For me, the features that really nail it are the easy imports from other modules, the integrated UI components, and the app mode.
Being able to build models/simulations easily and share them with others, who can then even interact with the results, has truly motivated me to try more stuff and build more. I've been deploying more and more of these apps as PoCs to prospects, and people really like them as well.
Big thanks to the team!
ayhanfuat
Unfortunately they don’t have Jupyter’s command mode. I wanted to switch a few times but not being able to create/delete/copy/move cells as easily is a big issue for me.
stared
I am surprised they didn't mention RMarkdown (https://rmarkdown.rstudio.com/), which was developed in parallel to Jupyter Notebooks, with lots of convergent evolution.
RMarkdown is essentially Markdown with executable code blocks. While it comes from an R background, code blocks can be written in any language (and you can mix multiple languages).
The biggest difference (and, I would say, advantage) is that it separates code from output, making it work well with version control.
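A minimal sketch of the format (chunk syntax per the rmarkdown docs; file contents, not meant to run here):

````markdown
---
title: "Mixed-language example"
output: html_document
---

Prose lives in Markdown; outputs are generated at render time, not stored:

```{r}
summary(cars)
```

```{python}
print("chunks can use other engines too")
```
````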