build: remove docstring examples when publishing packages? #1742
Comments
@MarcoGorelli reducing the size of Personally, I quite like having examples in my IDE and with |
Thanks @dangotbanned for your insights! The wheel size is currently about 1% of DuckDB's, and on disk it's about 6%. Which is a bit more than what I'd like it to be... One compromise could be to just have much shorter docstring examples, such as

>>> import polars as pl
>>> import narwhals as nw
>>> df_native = pl.DataFrame({'a': [1,2,3], 'b': [4,5,-6]})
>>> df = nw.from_native(df_native)
>>> df.with_columns(nw.all().abs().name.suffix('_abs')).to_native()
shape: (3, 4)
┌─────┬─────┬───────┬───────┐
│ a   ┆ b   ┆ a_abs ┆ b_abs │
│ --- ┆ --- ┆ ---   ┆ ---   │
│ i64 ┆ i64 ┆ i64   ┆ i64   │
╞═════╪═════╪═══════╪═══════╡
│ 1   ┆ 4   ┆ 1     ┆ 4     │
│ 2   ┆ 5   ┆ 2     ┆ 5     │
│ 3   ┆ -6  ┆ 3     ┆ 6     │
└─────┴─────┴───────┴───────┘

(with a roughly even split between backends, e.g. 1/3 of eager functions use pandas, 1/3 polars, 1/3 pyarrow) and then leave the longer "here's how to write a dataframe-agnostic function"-style ones to the website 🤔 not sure yet, curious to hear thoughts |
ooh I see 🤔 You might be able to cut down on the
I'm not 100% on if this would be compatible with the |
I think that just removing a single |
On the content of docstring examples, it reminded me of a little writeup I did here: vega/altair#3500 (comment). Shout out to https://diataxis.fr/ |
I'm not sure if I'm misunderstanding here, but what #1742 (comment) meant (or I suppose implied) was you'd have 1 import per file vs 1 per docstring |
Yup, clear, thanks! I just tried measuring that, and removing all the lines which match For contrast, removing all docstring examples reduces the size to 1078 bytes. We can probably infer that if we were to rewrite all docstrings in the style of #1742 (comment), then the size would reduce to ~1300 bytes. |
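For anyone wanting to reproduce this kind of measurement on a single module, here is a minimal sketch using only the standard library. It is not the script used for the numbers above; `example_bytes` is a hypothetical helper name. It counts only the example code and expected output (prompts and indentation are excluded), so treat the result as a lower bound.

```python
import ast
import doctest

# Node types that can carry a docstring.
_DOC_NODES = (ast.Module, ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)


def example_bytes(source: str) -> tuple[int, int]:
    """Return (bytes taken by doctest examples, total bytes) for one module.

    Hypothetical helper, for illustration only.
    """
    parser = doctest.DocTestParser()
    in_examples = 0
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, _DOC_NODES):
            continue
        doc = ast.get_docstring(node)  # cleaned docstring, or None
        if doc is None:
            continue
        for ex in parser.get_examples(doc):
            # ex.source is the code after the prompts,
            # ex.want is the expected output that follows it.
            in_examples += len((ex.source + ex.want).encode("utf-8"))
    return in_examples, len(source.encode("utf-8"))
```

Summing the two numbers over every `.py` file in the package would give a rough share of the source taken up by examples. Note that `DocTestParser` can raise `ValueError` on docstrings with inconsistent prompt indentation, so a real script would probably want to catch that.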
Been thinking about this further. I think that:
This way
So, my suggestion would be:
Any thoughts / objections? |
Docstring examples account for 40% (edited from 30%) of Narwhals' size
We love docs, but maybe this is a bit excessive 😄
I think it's pretty important to ship docstrings with source code, so that people can inspect docstrings while they're coding. But do we also need the docstring examples? I think they're kind of hard to read just from the IDE's preview anyway
Perhaps only removing the examples from the docstrings when building the wheel strikes a balance between:
The docstring examples would still be visible in the API Reference
I just tried this out, and removing all examples from functions is enough to reduce the package size by 40% (edited from 25%).
I also tried removing the "single-line-imports" rule, but that made a barely detectable difference (less than 0.1 MB) - sometimes it actually becomes longer to have imports on multiple lines