-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: introduce virtual sequences and use with SERIAL #28575
Conversation
I tested locally and this seems to work well. Please have a look already. However I still need to iterate to iron out a couple of test failures. |
7ac0ae7
to
3aa8703
Compare
oh it so appears that there are no test failures. Wowza |
28573: roachtest: fix queue failure message r=petermattis a=tschottdorf It was using a global variable instead of the one it wanted. Touches #28372. Release note: None 28576: sql: cache sequence descriptors r=knz a=knz Required for the tests in #28575. `nextval()` was using uncached sequence descriptors. There is no reason to do so anymore. Release note (performance improvement): SQL sequences receive a slight performance boost in the `nextval()` built-in function. Co-authored-by: Tobias Schottdorf <[email protected]> Co-authored-by: Raphael 'kena' Poss <[email protected]>
3aa8703
to
1f76c1c
Compare
1f76c1c
to
149d85f
Compare
I just completed prototype support for sequences with SERIAL in CREATE TABLE and ALTER TABLE. The thing is not (yet) configurable. I am pushing this early to get a CI run and see how much tests need adjusting. Feel free to look but it's not ready yet. Note to self: ensure that ALTER TABLE SET TYPE with SERIAL types does the same as pg. |
@mjibson since you plan to look at this let me flesh out the configuration options that andy W, Bram and I have discussed. There would be a new cluster setting (name TBD) to serve as cluster-wide default. The setting would have the following 3 values:
|
We're thinking about having the cluster setting default to |
Reading over the code, it looks like these new virtual sequences are a sugar wrapper over unique_rowid. The IMPORT stuff calls that same function, so I think the actual bytes generated by IMPORT are compatible with this. It appears that IMPORT just needs to be taught about SERIAL or something. Overall I think this will be an easy fix for IMPORT. |
149d85f
to
4ed2ef7
Compare
@mjibson you may have glanced only at the first commit. Either a virtual or real sequence needs to be created as a descriptor, get a table ID, and serve name resolution requests. I am not sure how to best integrate that and where it needs to be integrated. |
I think we have two options here. The first and easiest one is to have IMPORT convert SERIAL like the The second option is a full solution. The current IMPORT PGDUMP code supports sequences
cockroach/pkg/ccl/importccl/import_stmt.go Line 197 in 575ecbc
I don't think it'd be too hard to do similar SERIAL processing in the IMPORT PGDUMP stuff. A difficulty may arise when processing CSV, though, since it's never had to make more than 1 descriptor and so lacks some of the plumbing. I'm looking at CSV now to see what would be required for this. |
@mjibson I modified Also there's another thing. PostgreSQL has the SERIAL pseudo-type always imply NOT NULL (we didn't do that, incorrectly, previously). This patch changes this, so I had to modify the import CSV test. Please have a look at that too. Thanks |
699d2b8
to
3a0860a
Compare
pkg/ccl/importccl/import_stmt.go
Outdated
@@ -191,6 +211,9 @@ func MakeSimpleTableDescriptor( | |||
Sequence: &importSequenceOperators{}, | |||
} | |||
affected := make(map[sqlbase.ID]*sqlbase.TableDescriptor) | |||
|
|||
log.Warningf(ctx, "WOO %q", create.String()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove.
@@ -46,11 +46,14 @@ func TestMakeSimpleTableDescriptorErrors(t *testing.T) { | |||
stmt: "create table a (i int, constraint a foreign key (i) references c (id))", | |||
error: `this IMPORT format does not support foreign keys`, | |||
}, | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this test still accurate? I think this and the next test should be reverted.
if err != nil { | ||
return nil, err | ||
} | ||
*refTable = tree.MakeUnqualifiedTableName(refTable.TableName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for fixing this.
3a0860a
to
33f3636
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained
pkg/ccl/importccl/import_stmt.go, line 215 at r8 (raw file):
Previously, mjibson (Matt Jibson) wrote…
Remove.
Done.
pkg/ccl/importccl/csv_internal_test.go, line 49 at r8 (raw file):
Previously, mjibson (Matt Jibson) wrote…
Is this test still accurate? I think this and the next test should be reverted.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CCL stuff LGTM.
28629: sql: better acknowledge fresh sequence descs in the current txn r=knz a=knz To be used with #28575. Release note: None Co-authored-by: Raphael 'kena' Poss <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a few nits and comments.
Reviewed 8 of 11 files at r2, 1 of 1 files at r3, 20 of 20 files at r4, 9 of 29 files at r5, 8 of 11 files at r6, 1 of 1 files at r7, 27 of 27 files at r8, 2 of 2 files at r9.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale)
pkg/sql/serial.go, line 116 at r4 (raw file):
return nil, nil, nil, err } if res != nil {
Can you add the first attempt to the loop to?
Might make sense to pull it out into a function.
pkg/sql/logictest/testdata/logic_test/sequences, line 833 at r4 (raw file):
user root # Test virtual sequences.
subtest please
pkg/sql/logictest/testdata/logic_test/serial, line 88 at r8 (raw file):
DROP TABLE serials; DROP TABLE smallbig; DROP TABLE serial
lint: extra lines
pkg/sql/logictest/testdata/logic_test/serial, line 91 at r8 (raw file):
subtest serial_virtual_sequence
Please add a test that has to loop 2 times to get a sequence name.
pkg/sql/logictest/testdata/logic_test/serial, line 182 at r8 (raw file):
statement ok DROP TABLE serials; DROP TABLE smallbig; DROP TABLE serial
this can be a single drop table statement
pkg/sql/logictest/testdata/logic_test/serial, line 276 at r8 (raw file):
statement ok DROP TABLE serials; DROP TABLE smallbig; DROP TABLE serial
can be a single drop table
pkg/sql/sqlbase/structured.proto, line 699 at r4 (raw file):
// Start value of the sequence. optional int64 start = 4 [(gogoproto.nullable) = false]; // Whether the sequence is virtual.
lint: tabs
All tests pass :) Let me clean this up for you then we'll let the tires kick the (master) road. |
Suggested/recommended by @BramGruneir. This patch introduces **virtual sequences**, a CockroachDB-specific extension to SQL sequences. Virtual sequences are sequences that do not generate monotonically increasing values and instead produce values like the built-in function `unique_rowid()`. They are intended for use in combination with the PostgreSQL column pseudo-type "SERIAL", where clients expect a sequence object to be created but usually don't really care about the particular values generated by the sequence. Virtual sequences buy us a behavior that's compatible schema-wise with PostgreSQL but without the performance scalability bottleneck of regular sequences. Technically, a virtual sequence is supported by a regular sequence descriptor with just an extra option called "virtual". Therefore it behaves as a database object much alike a regular SQL sequence and can be manipulated with the same DDL (CREATE/DROP/SHOW) etc. and thus appears very much alike a regular SQL sequence to clients -- to ensure they can be used as a substitute to regular SQL sequences without clients noticing. The difference between a virtual sequence and a regular sequence is only to be found in the behavior of the `nextval()` built-in function: with virtual sequences this skips the KV increment operation and instead runs the code of `unique_rowid()`. Release note (sql change): A CockroachDB-specific experimental extension to SQL sequences, called "virtual sequences", is introduced as an experiment to more transparently support PostgreSQL applications using the SERIAL pseudo-type.
This will enable reuse in other places. Release note: None
33f3636
to
5c5ad5c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained
pkg/sql/serial.go, line 116 at r4 (raw file):
Previously, BramGruneir (Bram Gruneir) wrote…
Can you add the first attempt to the loop to?
Might make sense to pull it out into a function.
Done.
pkg/sql/logictest/testdata/logic_test/sequences, line 833 at r4 (raw file):
Previously, BramGruneir (Bram Gruneir) wrote…
subtest please
Done.
pkg/sql/logictest/testdata/logic_test/serial, line 88 at r8 (raw file):
Previously, BramGruneir (Bram Gruneir) wrote…
lint: extra lines
I liked the little boost to readability of the logic test file. Oh well. Done.
pkg/sql/logictest/testdata/logic_test/serial, line 91 at r8 (raw file):
Previously, BramGruneir (Bram Gruneir) wrote…
Please add a test that has to loop 2 times to get a sequence name.
Done.
pkg/sql/logictest/testdata/logic_test/serial, line 182 at r8 (raw file):
Previously, BramGruneir (Bram Gruneir) wrote…
this can be a single drop table statement
Done.
pkg/sql/logictest/testdata/logic_test/serial, line 276 at r8 (raw file):
Previously, BramGruneir (Bram Gruneir) wrote…
can be a single drop table
Done.
pkg/sql/sqlbase/structured.proto, line 699 at r4 (raw file):
Previously, BramGruneir (Bram Gruneir) wrote…
lint: tabs
Done.
Prior to this patch, SERIAL in CockroachDB was always alias for `INT DEFAULT unique_rowid()`. However 3rd party apps developed for PostgreSQL expect SERIAL to associate a column with a SQL sequence. Although the particular values produced for the SERIAL column rarely matter to the client app, we have found in practice that client apps (or ORM frameworks) are likely to assert that there is a sequence object created in the namespace with a particular name, and that they are able to call `nextval()` / `currval()` on that object while populating new rows on the table. This patch increases compatibility with those apps by optionally associating SERIAL columns to sequences like PostgreSQL. This behavior is driven by a new session variable called `experimental_serial_normalization`. Its default for new SQL sessions is configured via the new cluster setting `sql.defaults.serial_normalization`. The three values are: - `rowid` (the default): SERIAL behaves like before and expands to `INT NOT NULL DEFAULT unique_rowid()`. The SERIAL width (SMALLSERIAL/SERIAL4/etc) is ignored. - `virtual_sequence`: SERIAL creates a virtual sequence and expands to `INT NOT NULL DEFAULT nextval(...seqname...)`. The SERIAL width (SMALLSERIAL/SERIAL4/etc) is ignored. - `sql_sequence`: This is the PostgreSQL "stricter" compatibility mode. SERIAL creates a regular SQL sequence and expands to `...inttype... NOT NULL DEFAULT nextval(...seqname...)`. The SERIAL width (SMALLSERIAL/SERIAL4/etc) is respected. Finally, this patch ensures that SERIAL always implies NOT NULL, as required for PostgreSQL compatibility, even when SERIAL is not used as PRIMARY KEY. Release note (sql change): CockroachDB now supports two experimental compatibility modes with how PostgreSQL handles SERIAL and sequences, to ease reuse of 3rd party frameworks or apps developed for PostgreSQL. These modes can be enabled with the session variable `experimental_serial_normalization` (per client) and cluster setting `sql.defaults.serial_normalization` (cluster-wide). The first mode `virtual_sequence` enables compatibility with many applications using SERIAL with maximum performance and scalability. The second mode `sql_sequence` enables maximum PostgreSQL compatibility but thus uses regular SQL sequences and is thus subject to performance constraints.
5c5ad5c
to
46e0449
Compare
ok, let's let these tires kick the road bors r+ |
28575: sql: introduce virtual sequences and use with SERIAL r=knz a=knz Very good idea by @BramGruneir New features: virtual sequences (opt-in) and create-sequence-upon-SERIAL (opt-in). See individual commits for details and release notes. For reference: - I made everything opt-in here (i.e. default to previous CockroachDB behavior) to minimize risk. - UX with Hibernate will thus not automatically be improved. We need to inform users (and upgrade our tutorials/docs) that this experimental feature is available and how to use it to unblock problems. - It is possible that making the new default be `virtual_sequence` will both be backward-compatible with performance and unlock UX for Hibernate users. This has @awoods187 preference I think. However I propose to only flip that switch after we've seen a week or two pass with this patch merged. Fixes #22607 (using the new setting set to `sql_sequence`). Fixes #24062 (using the new setting set to `sql_sequence`). Fixes #26730 (using the new setting set to either `virtual_sequence` or `sql_sequence`). Co-authored-by: Raphael 'kena' Poss <[email protected]>
Build succeeded |
Nice work; so speedy! It's not on by default, right? Have to use the |
@vilterp read the release notes :) |
I'm confused - what does that cluster/session setting really do? Does it affect runtime behavior of existing SERIAL datatypes? Or does it only affect what's created when you make a table with SERIAL? |
The setting only affects new tables created inside that session. |
Very good idea by @BramGruneir
New features: virtual sequences (opt-in) and create-sequence-upon-SERIAL (opt-in). See individual commits for details and release notes.
For reference:
virtual_sequence
will both be backward-compatible with performance and unlock UX for Hibernate users. This has @awoods187 preference I think. However I propose to only flip that switch after we've seen a week or two pass with this patch merged.Fixes #22607 (using the new setting set to
sql_sequence
).Fixes #24062 (using the new setting set to
sql_sequence
).Fixes #26730 (using the new setting set to either
virtual_sequence
orsql_sequence
).