Add doctests to functions to help humans and agents #1400

Open
ntjohnson1 wants to merge 5 commits into apache:main from rerun-io:nick/doctests

ntjohnson1 commented Feb 26, 2026

Which issue does this PR close?

Related to #1394, but I can file a separate issue if that is preferred.
I wrote some examples by hand, then asked Claude to extend the pattern.
I've looked over the examples; they all seem reasonably minimal, and doctest checks them for correctness.

Rationale for this change

Generally, when I am using the dataframe API, the docstring or definition alone isn't always sufficient to show how or when to use all of the different options. I often find myself writing a small standalone example to verify the behavior.

As for agents, they do a great job interpolating, so having more small standalone examples for various functionality should help them, though I don't have a reference for that.

What changes are included in this PR?

  • Turns on doctest so pytest verifies docstring examples
  • Makes a minor update to test.yml to fix test discovery with doctest turned on, and fixes a GitHub lint that triggered because I touched that file
  • Fixes a few existing examples to be compliant
  • Generates examples for everything in functions
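
To illustrate the first bullet, here is a minimal sketch of a docstring example that doctest can verify. The function `add_one` is hypothetical and not part of this PR, and the sketch drives the standard-library `doctest` module directly. pytest can collect the same examples through its `--doctest-modules` option, though the exact setting this PR uses is not shown here.

```python
import doctest


def add_one(x: int) -> int:
    """Return ``x`` plus one.

    Examples:
        >>> add_one(2)
        3
    """
    return x + 1


# Parse the docstring into a DocTest and run it, as a test runner would.
parser = doctest.DocTestParser()
test = parser.get_doctest(
    add_one.__doc__,          # text containing the ">>>" examples
    {"add_one": add_one},     # names visible to the examples
    "add_one",                # test name used in failure reports
    None,                     # filename (unknown here)
    0,                        # starting line number
)
runner = doctest.DocTestRunner(verbose=False)
runner.run(test)

# `results` has `failed` and `attempted` counts.
results = runner.summarize(verbose=False)
```

If the expected output in the docstring drifts from what the function returns, `results.failed` becomes non-zero and the runner prints the mismatch.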

Are there any user-facing changes?

No

>>> builtins.round(
... result.collect_column("rad")[0].as_py(), 6
... )
3.141593

If you don't like the builtins.round call, we can use doctest's ellipsis approach instead.
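
For comparison, the two options mentioned in this comment can be sketched side by side. This is an illustration only: `half_turn` is a hypothetical stand-in that returns `math.pi` rather than a DataFusion query result, and the directive shown is the standard `doctest` `ELLIPSIS` option.

```python
import doctest
import math


def half_turn() -> float:
    """Return pi radians (hypothetical stand-in for the query result).

    Option 1: round explicitly so the expected value is exact.

    >>> round(half_turn(), 6)
    3.141593

    Option 2: let "..." absorb the trailing digits.

    >>> half_turn()  # doctest: +ELLIPSIS
    3.14159...
    """
    return math.pi


# Run both examples with the standard-library runner.
test = doctest.DocTestParser().get_doctest(
    half_turn.__doc__, {"half_turn": half_turn}, "half_turn", None, 0
)
runner = doctest.DocTestRunner(verbose=False)
runner.run(test)
results = runner.summarize(verbose=False)
```

The rounding form pins the value to a fixed number of decimal places, while the ellipsis form is shorter but accepts any output that starts with the given prefix.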

Comment on lines +1473 to +1474
>>> import datafusion as dfn
>>> ctx = dfn.SessionContext()

I think doctest allows specifying a prefix that always runs, in case the import and ctx setup feel like too much bloat.
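
One way to get that effect with the standard library is to pre-seed the namespace the examples run in. The sketch below is an assumption about how this could look, not what the PR does: `ctx` is a hypothetical `SimpleNamespace` stand-in for `dfn.SessionContext()`, and `describe` is an invented function.

```python
import doctest
from types import SimpleNamespace

# Hypothetical stand-in for the shared `ctx = dfn.SessionContext()` setup.
ctx = SimpleNamespace(name="shared-session")


def describe(context: SimpleNamespace) -> str:
    """Return a short label for a context.

    The example uses ``ctx`` without creating it first.

    >>> describe(ctx)
    'context: shared-session'
    """
    return f"context: {context.name}"


# Names placed in `shared_globals` are visible to every example,
# so the docstring can skip the setup lines.
shared_globals = {"describe": describe, "ctx": ctx}
test = doctest.DocTestParser().get_doctest(
    describe.__doc__, shared_globals, "describe", None, 0
)
runner = doctest.DocTestRunner(verbose=False)
runner.run(test)
results = runner.summarize(verbose=False)
```

When the examples are collected by pytest, its `doctest_namespace` fixture serves the same purpose: an autouse fixture in `conftest.py` can add entries such as `dfn` and `ctx` to that dictionary. The trade-off is that the docstring examples are then no longer copy-paste runnable on their own.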

@ntjohnson1 ntjohnson1 marked this pull request as ready for review February 26, 2026 16:38