16 Comments
dchuk
I’ve been tinkering with Marimo; it’s pretty sweet (and you can use Cursor or other AI IDEs with it pretty easily).
On running notebooks as scripts: I can’t find in the docs what happens if you have plotting and other notebook-oriented code. For example, I’m using pygwalker to explore data through transformation steps, ending with a save to CSV. If I just run the notebook as a script, is all of the plotting automatically skipped?
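One way to make the behavior explicit, whatever the default is: guard display-only code behind a mode check. My understanding from marimo's docs is that `mo.app_meta().mode` reports how the notebook was launched — treat that accessor as an assumption; the guard pattern itself is sketched here with plain Python:

```python
def should_show_plots(mode):
    """Show heavy interactive output only when editing in the notebook UI."""
    return mode == "edit"

# Inside marimo, `mode` would come from mo.app_meta().mode (an assumption on
# my part). Run as `python notebook.py`, the mode would not be "edit", so a
# pygwalker explorer behind this guard is skipped while the data
# transformations and the CSV save still execute.
if should_show_plots("script"):
    print("launching interactive explorer")
else:
    print("plots skipped; transformations and CSV save still run")
```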
floathub
One approach to this is org-mode with babel.
You can have a plaintext file which is also the program which is also the documentation/notebook/website/etc. It's extremely powerful, and is a compelling example of literate programming.
Decent overview here:
https://www.johndcook.com/blog/2022/08/02/org-babel-vs-jupyt…
[edit: better link]
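For anyone who hasn't seen it, a minimal sketch of the org-babel shape (syntax per the Org Babel manual; the embedded code is illustrative):

```org
#+TITLE: One file: prose, program, and document
* Exploration
Narrative text lives in Org markup, then an executable block:
#+begin_src python :results output
print(2 + 2)
#+end_src
Evaluating the block (C-c C-c) inserts its result right below it, and the
same file can be exported to HTML/PDF or tangled out to plain .py files.
```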
florbnit
> When working with Jupyter, too often you end up with directories strewn with spaghetti-code notebooks, counting up to Untitled12.ipynb or higher. You the notebook author don’t know what’s in these notebooks
This is such a small UX thing, but it’s so damn important. The simple fix is to not auto-name notebooks untitled-#: when the user clicks “new notebook”, ask for a name straight away, and if they can’t name it, don’t create it. It might add the smallest amount of friction to the UX, but it’s so damn important.
Also, the choice of JSON as the file format is just plain wrong. Why the project hasn’t abandoned it entirely and done a JSON ↔ Python conversion back and forth when writing to file is beyond me. There are extensions that do this, but that’s a really clunky interface, and while I can set it up for myself, it’s difficult to force upon others in a corporate environment.
Great to see someone taking up the seemingly small things, because they make a world of difference to the overall ecosystem.
paddy_m
I develop an open source notebook widget. Working with marimo has been a joy compared to developing on top of any other notebook environment.
The team is responsive and they care about getting it right. Having a sane file format for serializing notebooks is an example of this. They are thinking about core problems. They are also building in the open.
The core Jupyter team is very unresponsive and unfocused. When you have a bug, you need to figure out which of the many interrelated projects caused it, and issues go weeks without a response. It's a mess.
Then there are the proprietary notebook-like environments, VSCode notebooks and Google Colab in particular. They frequently rely on opaque, undocumented APIs and are also very unresponsive.
jdaw0
I wanted to like marimo, but the best notebook interface I've tried so far is VS Code's interactive window [0]. The important thing is that it's a Python file first, but you can divide the code into cells to run in the Jupyter kernel, either all at once or interactively.
0: https://code.visualstudio.com/docs/python/jupyter-support-py
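For anyone who hasn't tried it, the format is just an ordinary .py file with `# %%` markers, which is the convention VS Code documents:

```python
# %% [markdown]
# # Exploration notes
# This is a plain Python file; "# %%" lines mark cell boundaries.

# %%
import statistics

data = [3, 1, 4, 1, 5, 9]

# %%
# Send this cell alone to the interactive window, or run the whole
# file as an ordinary script -- same source either way.
mean = statistics.mean(data)
print(mean)
```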
abdullahkhalids
One design decision they made is that outputs are not stored. This means these notebooks are not a suitable replacement for heavy computation routines, where the notebook is a record of the final results. Other people are not expected to re-run minutes- or hours-long computations to see what the author intended.
You can work around it by storing the results in separate file(s) and writing the boilerplate to let the reader load them. Or they let you export to ipynb – which still means sharing two files.
Presumably the reason for this decision is making git diffs short. But to me the solution is to fix git diff to operate on JSON nicely, rather than changing the entire notebook format.
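That fix largely exists already: nbdime teaches git to diff and merge notebook JSON cell by cell rather than line by line (commands per the nbdime docs; assumes `pip install nbdime`):

```shell
# register nbdime's notebook-aware diff/merge drivers with git
nbdime config-git --enable --global

# subsequent diffs of .ipynb files render per-cell changes, not raw JSON
git diff analysis.ipynb
```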
dmadisetti
Very excited for the top-level functions change coming up!
TheAlchemist
This looks really, really neat.
One (not great) workflow I have is that I use notebooks as quick UIs to visualize some results.
1. Run a simulation that outputs results to some file
2. Load results in a notebook and do some quick processing + visualization
Very often I want to quickly compare two different runs, and I end up copying the visualization cell down, then just re-running the data load + processing + visualization and comparing them.
My understanding is that this would not be possible with marimo, since it will automatically re-run the cell with my previous data, right?
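In a reactive model the usual workaround is to make the run a parameter instead of duplicating the cell, so one cell computes both runs side by side. A rough sketch — `load_results` and `process` are hypothetical stand-ins for the real pipeline:

```python
def load_results(path):
    # hypothetical stand-in for reading a simulation output file
    fake_files = {"run_a.json": [1.0, 2.0, 3.0], "run_b.json": [2.0, 4.0, 6.0]}
    return fake_files[path]

def process(values):
    # hypothetical stand-in for the quick processing step
    return [v * 10 for v in values]

def compare(paths):
    # one cell produces every requested run, ready to plot side by side
    return {p: process(load_results(p)) for p in paths}

results = compare(["run_a.json", "run_b.json"])
print(results)
```

Adding a third run then means appending one path, not copy-pasting another cell.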
randomNumber7
I recently started in data science. I like to refactor my stuff out into normal Python files that I import in notebooks.
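That pattern keeps notebooks thin — e.g. a hypothetical `helpers.py` holding the reusable bits, so each notebook cell shrinks to an import plus a call:

```python
# helpers.py (hypothetical module refactored out of a notebook)
def clean(values):
    """Drop None entries and coerce the rest to float."""
    return [float(v) for v in values if v is not None]

# in the notebook, the cell becomes `from helpers import clean` plus a call
cleaned = clean([1, None, "2.5", None])
print(cleaned)
```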
lostdog
They kinda skip over it, but jupytext is underrated. You store the notebook primarily as text, and your version control handles it seamlessly.
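Pairing is a one-time command (per the jupytext CLI docs; assumes jupytext is installed):

```shell
# keep an .ipynb and a plain-text .py twin in lockstep
jupytext --set-formats ipynb,py:percent notebook.ipynb
jupytext --sync notebook.ipynb   # re-run after editing either file
```

The .py twin is what you commit; git then diffs it like any other source file.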
idanp
For lightweight calculations, jupad re-executes everything while typing cells: https://github.com/idanpa/jupad
epistasis
I have looked at Marimo in the past, and read this blog post with great interest, but I still don't "get" Marimo. What it does well: providing a sane way to create and interact with widgets. Lots of widget authors and tooling authors, people I respect a lot, admire Marimo and like how it does stuff.
However, I'm not sure what the use case is for Marimo. I see Jupyter notebooks being used in two primary use cases: 1) prototyping new code and interactions with services and databases and datasets, as a record of the REPL used to understand something, with interactive notes and plots and pasted-in images from docs, etc.; 2) a record of how a calculation was performed or experimental data was analyzed, as a permanent artifact that others can look up later. For both of these, outputs and markdown/image cells are just as important as the code cells. These are both "write once" types of things where changes in git are rare, and ideally would never happen.
With Marimo, can I check the outputs directory into version control in a reasonable way and have it stored for posterity? Is that .ipynb?
Is there a way to convert a stored .ipynb checkpoint back into the marimo format?
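My understanding from marimo's CLI docs is that a one-way converter exists — treat the exact invocation as an assumption:

```shell
# convert a Jupyter notebook into a marimo notebook (a plain .py file)
marimo convert notebook.ipynb > notebook.py
```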
And why does a small .ipynb change lead to many lines of change in the git diff? It's because the outputs changed. Deciding to not store outputs in version control and counting it as a win for pretty git diffs is saying "this core feature of .ipynb should be ignored because it's inconvenient". I'd much rather educate people about turning on GitHub's visual Jupyter diff rather than switch to an environment where I can no longer store outputs inline.
Similarly, being able to import one cell into a different notebook seems like the wrong direction to solve the problem of "it's time to turn the prototype notebook into a reusable module." If it's time to reuse a cell, it's time to make a cleaned-up Python code module file, not have the code interspersed with all the rest of the stuff.
I'd like to learn more about the use cases where Marimo is useful. As a scientist, it's not useful to me. I don't care about smaller git diffs on a notebook; in fact, if a notebook is getting changed and re-checked into version control, then a big awkward diff is not a problem and probably a feature, because notebooks should not be getting changed. They are a notebook: something that you write in once, and then it's done!
cjohnson318
I sometimes use notebooks mainly for taking notes, with a few code samples. In these cases, dealing with ipykernel and firing up a notebook is kind of a pain. Being able to open a "notebook" and make changes in vim sounds great.
Kydlaw
I discovered Marimo a couple of weeks/months ago, here, IIRC. It really lands on a sweet spot for me for data exploration. For me, the features that really nail it are the easy imports from other modules, the integrated UI components, and the app mode.
Being able to build models/simulations easily and share them with others, who can then even interact with the results, has truly motivated me to try more stuff and build more. I've been deploying more and more of these apps as PoCs to prospects, and people really like them as well.
Big thanks to the team!
ayhanfuat
Unfortunately they don’t have Jupyter’s command mode. I wanted to switch a few times but not being able to create/delete/copy/move cells as easily is a big issue for me.
stared
I am surprised they didn't mention RMarkdown (https://rmarkdown.rstudio.com/), which was developed in parallel to Jupyter Notebooks, with lots of convergent evolution.
RMarkdown is essentially Markdown with executable code blocks. While it comes from an R background, code blocks can be written in any language (and you can mix multiple languages).
The biggest difference (and, I would say, advantage) is that it separates code from output, making it work well with version control.
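A minimal sketch of the format (chunk syntax per the rmarkdown docs; file contents, not meant to run here):

````markdown
---
title: "Mixed-language example"
output: html_document
---

Prose lives in Markdown; outputs are generated at render time, not stored:

```{r}
summary(cars)
```

```{python}
print("chunks can use other engines too")
```
````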