Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initialize documentation for v1.2 #1619

Merged
merged 25 commits into from
Jun 29, 2022
Merged

Initialize documentation for v1.2 #1619

merged 25 commits into from
Jun 29, 2022

Conversation

dbeatty10
Copy link
Contributor

@dbeatty10 dbeatty10 commented Jun 24, 2022

Continues #1609

Things to review

Link to Netlify Preview

Location within navigation

Questions

  • The content is currently within Jinja Reference > dbt Jinja functions > cross-database macros. Is this where we want it? Or somewhere else?

Screenshot

image
Group by category or alphabetical listing (or both)

Questions

  • The first draft contains both an alphabetical listing and grouped into categories. Do we favor one over the other?
    • I prefer retaining a grouping by functional category as it makes it easier for me understand the usage of each macro when it is surrounded by more context.
    • Snowflake includes both types of listings: by category (here and here) and alphabetical.
  • Single page listing all macros? Or separate pages based on functional category?
    • My only preference is that the detailed entires are grouped into categories rather than alphabetical. An alphabetical listing of links feels desirable too.

Screenshots

Alphabetical:

image

Functional grouping:

image
Format of examples

Questions

  • Include dbt. prefix for macros? Or drop?
    • The result feels more readable without the prefix to me.
  • Convert to a SQL query that can actually execute?
    • It could be hard to create examples that will work cross-database without making them a bit complicated with common table expressions (CTE) and usage of other macros
  • Include the filename wrappers (i.e., models/example.sql)?
    • this feels a bit inaccurate unless we supply SQL examples that are fully executable
  • Multiple variations per example? Or just a single variation per example?
    • I strongly prefer multiple variations to make it easier to see how to express column names vs. SQL string literals vs. an expression containing a function.

Screenshot

image
Versioning this page

Questions

  • I added an entry to dbt-versions.js, but it doesn't appear to be taking effect. Versioning didn't appear to take effect for this page either (even though it is listed within dbt-versions.js).
    • Do we know how to get this to work?

Screenshots

Versioning working:

  • image

Versioning not working:

  • image

Description & motivation

We want to document the macros introduced here so that:

  • package maintainers (and other users) know what they do and how to use them
  • adapter maintainers know how to implement them

These macros came from dbt_utils, but few of them had pre-existing documentation. Existing documentation was lifted-and-shifted when available, but but most of this content is net new.

The most novel feature is organization into functional groups (similar to the groupings within Snowflake docs).

To-do before merge

  • Determine if current_timestamp* macros should be included or not.
  • Determine if data type macros should be included or not (i.e., whatever abstraction on top of api.Column.translate_type() that we go with).

Prerelease docs

If this change is related to functionality in a prerelease version of dbt (delete if not applicable):

Checklist

If you added new pages (delete if not applicable):

  • The page has been added to website/sidebars.js is within a category listed within website/sidebars.js
  • The new page has a unique filename

@netlify
Copy link

netlify bot commented Jun 24, 2022

Deploy Preview for docs-getdbt-com ready!

Name Link
🔨 Latest commit 75f3171
🔍 Latest deploy log https://app.netlify.com/sites/docs-getdbt-com/deploys/62bc46ecb4486c000a7eac4a
😎 Deploy Preview https://deploy-preview-1619--docs-getdbt-com.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site settings.

@github-actions github-actions bot added content Improvements or additions to content size: large This change will more than a week to address and might require more than one person labels Jun 24, 2022
@dbeatty10 dbeatty10 requested review from jtcohen6 and jasnonaz June 27, 2022 00:02
@dbeatty10 dbeatty10 marked this pull request as ready for review June 27, 2022 00:03
@jtcohen6 jtcohen6 mentioned this pull request Jun 28, 2022
Copy link
Collaborator

@jtcohen6 jtcohen6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fantastic work here @dbeatty10! I left a handful of comments (mostly structural), and responded to your questions below. I didn't appreciate that most of these macros were effectively undocumented in dbt-utils, and so this is the first time we have real docs for them. Nice work!

(Food for far-future thought: Is there a better way for us to embed macro/method descriptions within dbt, so we don't need to hand-roll these docs every time?)

I'm for shipping this PR as soon as possible, and then swinging back with updates once we tackle the type_* + current_timestamp macros. It would be better to get the v1.2 docs live sooner.

questions

The content is currently within Jinja Reference > dbt Jinja functions > cross-database macros. Is this where we want it? Or somewhere else?

This feels fine to me

The first draft contains both an alphabetical listing and grouped into categories. Do we favor one over the other?

I agree with categorical grouping as the primary point of entry. I think an alphabetical list of every dbt-Jinja function is also warranted, but we may want that to live on its own separate page. Not a problem we must solve in this PR.

Single page listing all macros? Or separate pages based on functional category?

From my point of view, as long as there are clear anchors (ways to link to specific categories + macros), one single page is an okay way to start. We can always split them up later on if/when the page gets unwieldy.

Include dbt. prefix for macros? Or drop?

I prefer without the prefix, too.

Convert to a SQL query that can actually execute?

I agree this gets complicated if we want the query to work across databases. I know this has given us a hard time for the macros documented in the dbt-utils README. Out of scope for this PR, I'd say.

Include the filename wrappers (i.e., models/example.sql)?

I hear you on, if they aren't real/full SQL, it shouldn't be a full model file. I don't have such a strong feeling here. Let's keep it more generic for now.

Multiple variations per example? Or just a single variation per example?

I buy what you're saying — multiple variations is good by me!


### For maintainers of adapter plugins

We added a collection of ["cross-database macros"](cross-database-macros) to dbt Core v1.2. Default implementations are automatically inherited by adapters and included in the testing suite. Adapter maintainers may need to override the implementation of one or more macros to align with database-specific syntax or optimize performance. For details on the testing suite, see: ["Testing a new adapter"](testing-a-new-adapter).
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's also reflect this in https://docs.getdbt.com/docs/contributing/building-a-new-adapter (or wherever you + @dataders feel is appropriate).

I agree that we shouldn't re-list every macro there, but we should at least link to the list and encourage adapter authors/maintainers to add support for them, and test them accordingly. In fact, our recommendation to adapter maintainers should probably look like:

  • Inherit + run all utility tests, like so
  • For any failing tests, where the default implementation does not work, first try to reimplement the macro
  • If it's not possible to get the test passing with a custom macro: mark the test to skip/xfail. Either the database doesn't support that functionality, or the core test has been written in a way that's not extensible.

website/dbt-versions.js Outdated Show resolved Hide resolved
<File name='models/example.sql'>

```sql
{{ dbt.except() }}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should include just the default "output" for each of these macros? In this case, it's hard to tell from just looking at the macro that all it does is return the string except (or database-specific equivalent)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might also help adapter authors see "at a glance" if they need to reimplement a given macro (though starting by inheriting the functional tests and seeing what fails is definitely still the Right Way To Go)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Included the default output from the Postgres adapter for each of these macros.

Is it pretty in all cases? Nope.

We've got options for how to handle that:

  • update the whitespace handling on a per-macro basis and re-generate the example output
  • format the example output however we want it to look even if it is not the exact output

Do you have a preference @jtcohen6 ?

Here's the input file I used:

/analyses/output.sql
{{ except() }}

{{ intersect() }}

{{ concat(["column_1", "column_2"]) }}
{{ concat(["year_column", "'-'" , "month_column", "'-'" , "day_column"]) }}
{{ concat(["first_part_column", "'.'" , "second_part_column"]) }}
{{ concat(["first_part_column", "','" , "second_part_column"]) }}

{{ hash("column") }}
{{ hash("'Pennsylvania'") }}

{{ length("column") }}

{{ position("substring_column", "text_column") }}
{{ position("'-'", "text_column") }}

{{ replace("string_text_column", "old_chars_column", "new_chars_column") }}
{{ replace("string_text_column", "'-'", "'_'") }}

{{ right("string_text_column", "length_column") }}
{{ right("string_text_column", "3") }}

{{ escape_single_quotes("they're") }}
{{ escape_single_quotes("ain't ain't a word") }}

select {{ string_literal("Pennsylvania") }}

{{ any_value("column_name") }}

{{ bool_or("boolean_column") }}
{{ bool_or("integer_column = 3") }}
{{ bool_or("string_column = 'Pennsylvania'") }}
{{ bool_or("column1 = column2") }}

{{ listagg(measure="column_to_agg", delimiter_text="','", order_by_clause="order by order_by_column", limit_num=10) }}

{{ cast_bool_to_text("boolean_column_name") }}
{{ cast_bool_to_text("false") }}
{{ cast_bool_to_text("true") }}
{{ cast_bool_to_text("0 = 1") }}
{{ cast_bool_to_text("1 = 1") }}
{{ cast_bool_to_text("null") }}

{{ safe_cast("column_1", api.Column.translate_type("string")) }}
{{ safe_cast("column_2", api.Column.translate_type("integer")) }}
{{ safe_cast("'2016-03-09'", api.Column.translate_type("date")) }}

{{ dateadd(datepart="day", interval=1, from_date_or_timestamp="'2016-03-09'") }}
{{ dateadd(datepart="month", interval=-2, from_date_or_timestamp="'2016-03-09'") }}

{{ datediff("column_1", "column_2", "day") }}
{{ datediff("column", "'2016-03-09'", "month") }}
{{ datediff("'2016-03-09'", "column", "year") }}

{{ date_trunc("day", "updated_at") }}
{{ date_trunc("month", "updated_at") }}
{{ date_trunc("year", "'2016-03-09'") }}

{{ last_day("created_at", "month") }}
{{ last_day("'2016-03-09'", "year") }}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dbeatty10 Genius move to use dbt to quickly & efficiently get the compiled output! Let's not talk about how I was thinking of doing this......

I think it's okay to clean up a little for display in the docs. I think it's also okay if it's a bit ugly (truth in advertising). All the more reason to motivate us doing The Right Thing, which is definitely this:

update the whitespace handling on a per-macro basis and re-generate the example output

Great opportunities for community contributors!

How will we keep this up to date going forward? Short of... the semantic layer? :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A handful of ideas for keeping up-to-date going forward (generally from most manual to most automated):

  1. Use the same strategy as this PR:
    • dbt init a dbt project
    • copy-paste /analyses/output.sql into the project
    • execute dbt compile --select output
    • copy-paste from output in target directory into each relevant section of the docs
  2. Create a git repo for the dbt project I used. Put instructions for generating the sample output in the README.
  3. Build an elaborate mechanism that is mostly hands-off
  4. Throw up our hands and declare the current examples in the docs are "good enough" and leave them as-is.

Option 2 (or 4!) might hit the sweet spot for us for the time being.

@@ -656,6 +656,7 @@ const sidebarSettings = {
slug: '/guides/migration/versions',
},
items: [
Copy link
Collaborator

@runleonarun runleonarun Jun 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jtcohen6 and @dbeatty10 what do you think about autogenerating this sidebar? Then we don't need to update sidebar.js every release. It would looks something like this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a preference. Whatever you both think is best.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of auto-generating! I don't think that needs to block this PR if it's work we can pick up separately.

@dbeatty10
Copy link
Contributor Author

@jtcohen6 I think I've addressed all the outstanding feedback. Will you take a look?

Copy link
Collaborator

@jtcohen6 jtcohen6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fit to ship! Left two (non-blocking) comments. One's just a nit, the other is a TODO for updating the docs targeting adapter maintainers, which can happen in a subsequent PR.


### For maintainers of adapter plugins

We added a collection of ["cross-database macros"](cross-database-macros) to dbt Core v1.2. Default implementations are automatically inherited by adapters and included in the testing suite. Adapter maintainers may need to override the implementation of one or more macros to align with database-specific syntax or optimize performance. For details on the testing suite, see: ["Testing a new adapter"](testing-a-new-adapter).
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dataders Could you take a note for updating the adapter documentation, to notify adapter maintainers that there are:

  • more tests to start running ("utils")
  • likely more macros that need reimplementing

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update: opened #1634 to track this work

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
content Improvements or additions to content size: large This change will more than a week to address and might require more than one person
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants