-
@Ubehebe , sorry for the late response -- just saw your post.
I've thought about this insofar as the indirection bothers me, too. But I haven't spent time trying to design something better. In particular, number 2 resonates with me -- it would be great to make editing in an IDE/text editor easier. Definitely open to hearing your suggestions. Thanks so much for the thoughtful message!
-
I think the main question I have is: why does marimo have to serialize the dataflow graph into the source file? Why can't it be an in-memory data structure on the server?
The first problem is that marimo needs some way to partition a Python source file into cells. The edges of the dataflow graph (the parameters of the synthetic functions) are what confuse non-marimo tools. Why do they need to be in the source file? Couldn't marimo run the dataflow analysis once on startup and keep the graph in memory? This might slow down the initial time to interactive, but I doubt it would be significant.
If we're able to make marimo notebooks more idiomatic Python so that other tools can work on them seamlessly, I think that's a good tradeoff.
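To make the "run the dataflow analysis once on startup" idea concrete, here is a minimal sketch of how dependency edges could be derived from plain cell sources using only the standard library. This is not marimo's actual implementation; the function names (cell_defs_and_refs, build_graph) and the cell names are made up for illustration, and the analysis deliberately ignores scoping (it treats every name bound anywhere in a cell as a definition of that cell).

```python
import ast


def cell_defs_and_refs(source: str) -> tuple[set[str], set[str]]:
    """Return (names the cell binds, names it reads but does not bind itself)."""
    defs: set[str] = set()
    refs: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                defs.add(node.id)
            elif isinstance(node.ctx, ast.Load):
                refs.add(node.id)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            # "import a.b" binds "a"; "import a.b as c" binds "c".
            for alias in node.names:
                defs.add((alias.asname or alias.name).split(".")[0])
    return defs, refs - defs


def build_graph(cells: dict[str, str]) -> dict[str, set[str]]:
    """Map each cell name to the set of cells that define names it reads."""
    analysis = {name: cell_defs_and_refs(src) for name, src in cells.items()}
    defined_by = {}
    for name, (defs, _) in analysis.items():
        for var in defs:
            defined_by[var] = name
    # Names nobody defines (builtins such as sum) simply produce no edge.
    return {
        name: {defined_by[var] for var in refs if var in defined_by}
        for name, (_, refs) in analysis.items()
    }


# Hypothetical notebook: three cells written as ordinary Python.
cells = {
    "cell_1": "import json",
    "cell_2": "data = json.loads('[1, 2, 3]')",
    "cell_3": "total = sum(data)",
}
graph = build_graph(cells)
```

Here graph records that cell_2 depends on cell_1 (through json) and cell_3 depends on cell_2 (through data), with none of that information written into the cell sources themselves. A real implementation would also have to handle nested scopes, redefinitions across cells, and deletions.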
-
I renamed this topic to reflect the most important issue (and to focus less on specific solutions). marimo's text format is good compared to other notebook formats, but I think it can be even better.
-
marimo could also just have a flat script version. Exports are already possible; it would just mean handling imports. The difficulty is that there are then 3 formats (normal Python, script Python, and Markdown) to manage. Maybe this is worth it? But it also removes the non-linearity of marimo notebooks.
Alternatively, deeper code editor integration (think a VS Code or vim plugin) could be set up to use marimo's LSP directly and overcome the issues described. marimo's current imports are actually good, because they are lazy.
I thought about the following alternative (see below) before I realized maybe this is an editor issue and not a marimo issue. I'm only including it because it's already written, but I think it's a lot of engineering effort vs. payoff (not my decision to make, just my thoughts).
marimo could parse out import statements and put in relevant stubs, e.g.
    # Cell 1
    import pandas as pd
    y = pd.DataFrame(...)
Gets turned into
    import marimo
    import pandas as pd
    app = marimo.App()
    __import_manager__ = app.__import_manager__
    @app.cell
    def __():
        __import_manager__("pandas", _as="pd")
        y = pd.DataFrame(...)
        return y
and behind the scenes,
Benefits:
sanity check:
    # Cell 2
    if isinstance(y, list):
        import numpy as np
        y = np.array(y)
    x = pd.DataFrame(np.sqrt(y))
should fail if y is not a list, as
    import marimo
    import pandas as pd
    import numpy as np
    app = marimo.App()
    __import_manager__ = app.__import_manager__
    @app.cell
    def cell2(y):
        if isinstance(y, list):
            __import_manager__("numpy", _as="np")
            y = np.array(y)
        x = pd.DataFrame(np.sqrt(y))
        return x
Thanks to the decorator,
    import inspect
    import sys

    def __import_manager__(module, _as=None):
        # Caller's frame locals. Caveat: in CPython, writes to a function
        # frame's f_locals may not rebind names the function body reads.
        _locals = inspect.currentframe().f_back.f_locals
        if module not in sys.modules:
            __import__(module)
        _locals[_as or module] = sys.modules[module]
which will put
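For comparison, the lazy-import idea above can also be sketched without touching the caller's frame: the helper returns the module and the cell binds it explicitly. This is a different technique from the frame-based proposal, not something marimo provides; import_manager and cell2 are hypothetical names, and math stands in for a heavier dependency such as numpy so the example runs with the standard library alone.

```python
import importlib


def import_manager(module: str):
    """Hypothetical helper: import `module` on first use and return it.

    importlib.import_module consults sys.modules first, so repeated calls
    for the same module are cheap.
    """
    return importlib.import_module(module)


def cell2(y):
    # The import only happens on the branch that needs it.
    if isinstance(y, list):
        math = import_manager("math")
        y = [math.sqrt(v) for v in y]
    return y


roots = cell2([4.0, 9.0])      # takes the branch, imports math lazily
untouched = cell2((4.0, 9.0))  # not a list, so nothing is imported here
```

The tradeoff is that the cell body has to assign the result itself, but the binding is then an ordinary local variable that linters and type checkers can follow.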
-
One of the main advantages of marimo compared to other notebook formats is that marimo notebooks are syntactically valid Python files. This means that tools that analyze Python files (linters, formatters, type-checkers, IDEs) can generally do something useful with marimo notebooks without any setup.
As I've used marimo more, I've discovered some exceptions. marimo notebooks are syntactically valid Python, but they aren't idiomatic Python. This means that some tools can't analyze marimo notebooks in a useful way.
Here's an example. When you use an import in a notebook:
marimo serializes that to disk as something like:
This is basically a serialized DAG: the nodes (cells) are represented by top-level functions decorated with @app.cell, and the edges (dependencies) are represented by function params/return values.
The serialization is elegant, but tools other than marimo can't understand the indirection -- for example, they can't understand that the DataFrame constructor comes from the pandas import. This means that:
import pandas as pd statement remains.
I can see a few approaches we might take to improve this situation, but before proposing anything specific, I wanted to start a discussion. Maintainers, have you thought about this? How important do you think it is to improve?
My own view is that it's medium importance. For (1), unused imports can significantly slow down notebook execution. And for (2), being able to use IDE features to edit marimo notebooks would make large codebases significantly more maintainable (refactoring, etc.).
Thanks for your time!
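For readers who haven't seen the file format, here is a rough, self-contained sketch of the "serialized DAG" shape described above: cells as decorated top-level functions, edges as parameters and return values. The App class below is a hypothetical stand-in written for this example, not marimo's own class, and it simplifies by having cells return dicts and by running them in file order. json is used in place of pandas so the example needs nothing beyond the standard library.

```python
import inspect


class App:
    """Hypothetical stand-in for marimo.App, just enough to show the structure."""

    def __init__(self):
        self.cells = []

    def cell(self, func):
        # Register the function as a cell; its parameter names are its inputs.
        self.cells.append(func)
        return func

    def run(self):
        """Run cells in file order, passing each one the variables it asks for."""
        variables = {}
        for func in self.cells:
            params = inspect.signature(func).parameters
            outputs = func(*(variables[name] for name in params))
            variables.update(outputs or {})
        return variables


app = App()


@app.cell
def _():
    import json  # the import lives in one cell...
    return {"json": json}


@app.cell
def _(json):
    # ...and arrives here as a function parameter, not as an import.
    data = json.loads("[1, 2, 3]")
    return {"data": data}


result = app.run()
```

Inside the second cell, a tool that only sees the function body knows json as an untyped parameter, which is the indirection this post is about.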