Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[EPIC] Automatically generate all function documentation from code #12740

Closed
24 of 26 tasks
alamb opened this issue Oct 3, 2024 · 12 comments · Fixed by #13395
Closed
24 of 26 tasks

[EPIC] Automatically generate all function documentation from code #12740

alamb opened this issue Oct 3, 2024 · 12 comments · Fixed by #13395
Labels
enhancement New feature or request

Comments

@alamb
Copy link
Contributor

alamb commented Oct 3, 2024

Is your feature request related to a problem or challenge?

Currently, when we add a new function to DataFusion library we have to remember to document that function in the documentation, but currently we have to remember this during code review.

This has a few downsides:

  1. we have forgotten to document some functions
  2. the documentation has drifted over time,
  3. help text for various functions can only be found on the DataFusion website, and not, for example within the function itself.

@Omega359 has created the basic framework for automatically generating content in #12668

This ticket tracks the work required to port the rest of the documentation to programatic form

Tasks

Aggregate Function Documentation Migration

Scalar Function Documentation Migration

Window Function Documentation Migration

@alamb
Copy link
Contributor Author

alamb commented Oct 3, 2024

Update is I have a few follow on PRs

Then I figure I will file one or two other tickets to see how easy it is to port other functions

And if all goes well I'll make a ticket storm to port the docs

@Omega359
Copy link
Contributor

Omega359 commented Oct 4, 2024

FYI I've finished up the string expression migration locally - just waiting on the updated code from #12742 to land before pushing a PR.

@alamb
Copy link
Contributor Author

alamb commented Oct 4, 2024

FYI I've finished up the string expression migration locally - just waiting on the updated code from #12742 to land before pushing a PR.

Thanks, I'll merge those as soon as they get a review (I can't merge them until they get an approval from another committer)

@Omega359
Copy link
Contributor

Omega359 commented Oct 5, 2024

No rush I just wanted to make sure no one was duplicating work needlessly

@jonathanc-n
Copy link
Contributor

@alamb I think it should be mentioned here that ./dev/update_function_docs.sh should be ran, as it wasn't immediately obvious, and when I did my first migration I took a bit of time to find it.

@alamb
Copy link
Contributor Author

alamb commented Oct 11, 2024

@alamb I think it should be mentioned here that ./dev/update_function_docs.sh should be ran, as it wasn't immediately obvious, and when I did my first migration I took a bit of time to find it.

Thank you @jonathanc-n -- I added it to the description of #12859 and will add it on subsequent tickets

@alamb
Copy link
Contributor Author

alamb commented Oct 15, 2024

Thanks to @jonathanc-n @Omega359 and others we are cranking right along with this. I expect it to be done sometime later this week

@alamb alamb changed the title [EPIC] Automatically generate all function content from code [EPIC] Automatically generate all function documentation from code Oct 21, 2024
@alamb
Copy link
Contributor Author

alamb commented Oct 23, 2024

I think we are pretty close here

After we merge these PRs from @jonathanc-n and myself

We then have:

Then we have just a few more functions to document (lead, lag) and we can turn on a CI check to ensure all new functions are documented

(venv) andrewlamb@Andrews-MacBook-Pro-2:~/Software/datafusion$ ./dev/update_function_docs.sh
/Users/andrewlamb/Software/datafusion
Inserting header
Running CLI and inserting aggregate function docs table
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.34s
     Running `target/debug/print_functions_docs aggregate`
Running prettier
docs/source/user-guide/sql/aggregate_functions_new.md 59ms
'docs/source/user-guide/sql/aggregate_functions_new.md' successfully updated!
Inserting header
Running CLI and inserting scalar function docs table
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.09s
     Running `target/debug/print_functions_docs scalar`
INFO: The following functions do not have documentation:
  - map_values
  - map_keys
  - map
  - map_extract
Running prettier
docs/source/user-guide/sql/scalar_functions_new.md 208ms
'docs/source/user-guide/sql/scalar_functions_new.md' successfully updated!
Inserting header
Running CLI and inserting window function docs table
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.09s
     Running `target/debug/print_functions_docs window`
INFO: The following functions do not have documentation:
  - lag
  - lead
Running prettier
docs/source/user-guide/sql/window_functions_new.md 26ms
'docs/source/user-guide/sql/window_functions_new.md' successfully updated!

@Omega359
Copy link
Contributor

lead/lag PR @ #13095

Other functions that still exist on scalar functions page - #13036

@alamb
Copy link
Contributor Author

alamb commented Oct 24, 2024

Here is a PR to make sure no new functions are added without documentation: #12938

@alamb
Copy link
Contributor Author

alamb commented Oct 29, 2024

🎉 I think all we need is #13171 and we can claim this project is complete. Thanks again @Omega359 @buraksenn and @jonathanc-n for pushing it along

@alamb
Copy link
Contributor Author

alamb commented Nov 13, 2024

🎉 epic

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants