Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update tutorials to be more cognizant of CRDB vs. application language numeric types (esp. JavaScript) #4298

Open
rmloveland opened this issue Jan 28, 2019 · 7 comments
Labels
O-external Origin: Issue comes from external users. P-2 Normal priority; secondary task T-missing-info

Comments

@rmloveland
Copy link
Contributor

rmloveland commented Jan 28, 2019

Richard Loveland (rmloveland) commented:

According to the int_size logic test, the return type of unique_rowid() is INT8.

According to the PG numeric data type docs, INT8 is a bigint type with a max value of 9223372036854775807.

However, JS numbers are 64-bit, which means they can only represent numbers up to 9007199254740991.

This is a pretty big mismatch! E.g.,

9223372036854775807
0009007199254740991

Estimated scope of work:

  • Update JavaScript docs to be aware of this issue
  • Consider updating the other "build an app" docs since (1) those languages may also run into this issue depending on their numeric types, and/or (2) those languages may be used on the backend with a JS frontend, in which case this issue will rear its head.

Related issues:

(h/t @markthethomas for this tweet that prompted filing this issue.)

Jira Issue: DOC-227

@rmloveland rmloveland added O-external Origin: Issue comes from external users. A-sql P-2 Normal priority; secondary task T-missing-info labels Jan 28, 2019
@rolandcrosby
Copy link

I recall this was an issue for the Twitter API when they moved to their current identifier system -- here's their doc on the topic.

@markthethomas
Copy link

💯🙌

@markthethomas
Copy link

The way that I was solving this was by using Google’s UUID go library, but I’d guess snowflake or anything with a strong uniqueness guarantee that doesn’t have overflow/float issues would work (even mongo’s objectid)

@markthethomas
Copy link

Also thank you guys for hopping on this! Love the response 👌

@jseldess
Copy link
Contributor

@rmloveland, is this SQL FAQ related?

@rmloveland
Copy link
Contributor Author

is this SQL FAQ related?

Sounds like a good idea, thanks @jseldess

rmloveland added a commit that referenced this issue Jan 31, 2019
Fixes #4183, #4254.

Addresses part of #3420, #4298.

Summary of changes:

- Add `default_int_size` to session vars page

- Update `INT` data type page with new subsections of the **Size**
  section:

  - Description of compatibility issues caused by 64-bit integers
    vs. e.g. JavaScript runtimes, including when using autogenerated JS
    clients

  - Why you should avoid integer primary keys, with links to more
    information in Perf Best Practices and the Architecture docs
rmloveland added a commit that referenced this issue Feb 1, 2019
Fixes #4183, #4254.

Lays some groundwork for future work on #4298.

Summary of changes:

- Add `default_int_size` to session vars page

- Update `INT` data type page with additions to the **Size**
  section, specifically:

  - A description of possible compatibility issues caused by 64-bit
    integers vs. e.g. JavaScript runtimes, including when using
    frameworks that autogenerate frontend code.

  - Mention and link to `default_int_size` session var and cluster
    setting.
rmloveland added a commit that referenced this issue Feb 1, 2019
Fixes #4183, #4254.

Lays some groundwork for future work on #4298.

Summary of changes:

- Update `INT` page to include examples of actual min/max integers
  supported by each type for easier reference.

- Update `INT` data type page with additions to the **Size**
  section, specifically:

  - A description of possible compatibility issues caused by 64-bit
    integers vs. e.g. JavaScript runtimes, including when using
    frameworks that autogenerate frontend code.

  - Mention and link to `default_int_size` session var and cluster
    setting.

- Add `default_int_size` to session vars page
rmloveland added a commit that referenced this issue Feb 1, 2019
Fixes #4183, #4254.

Lays some groundwork for future work on #4298.

Summary of changes:

- Update `INT` page to include examples of actual min/max integers
  supported by each type for easier reference.

- Update `INT` data type page with additions to the **Size**
  section, specifically:

  - A description of possible compatibility issues caused by 64-bit
    integers vs. e.g. JavaScript runtimes, including when using
    frameworks that autogenerate frontend code.

  - Mention and link to `default_int_size` session var and cluster
    setting.

- Add `default_int_size` to session vars page
rmloveland added a commit that referenced this issue Feb 4, 2019
Fixes #4183, #4254.

Lays some groundwork for future work on #4298.

Summary of changes:

- Update `INT` page to include examples of actual min/max integers
  supported by each type for easier reference.

- Update `INT` data type page with additions to the **Size**
  section, specifically:

  - A description of possible compatibility issues caused by 64-bit
    integers vs. e.g. JavaScript runtimes, including when using
    frameworks that autogenerate frontend code.

  - Mention and link to `default_int_size` session var and cluster
    setting.

- Add `default_int_size` to session vars page
rmloveland added a commit that referenced this issue Feb 6, 2019
Fixes #4183, #4254.

Lays some groundwork for future work on #4298.

Summary of changes:

- Update `INT` page to include examples of actual min/max integers
  supported by each type for easier reference.

- Update `INT` data type page with additions to the **Size**
  section, specifically:

  - A description of possible compatibility issues caused by 64-bit
    integers vs. e.g. JavaScript runtimes, including when using
    frameworks that autogenerate frontend code.

  - Mention and link to `default_int_size` session var and cluster
    setting.

- Add `default_int_size` to session vars page
@bobvawter
Copy link
Contributor

bobvawter commented Oct 2, 2020

I just had to solve this for a customer. They have an existing schema using INT8 ids that cannot be changed and have downstream JS clients that require the ids to be numeric (i.e. not a stringified number nor a UUD). We also want to avoid the use of sequences, since they are performance bottlenecks.

As we've documented, the maximum numeric value that JS can represent without loss of precision is 2^53. unique_rowid() isn't particularly high-entropy, so it's not safe to just carve off the desired number of bits. Instead, we'll wash it through a hash function first, to maximize the entropy in any given bit. The fnv64a() function built into CockroachDB fits the bill, since we don't need cryptographic-levels of security here.

All of the usual caveats apply here about hash collisions, but you'd have to insert 111,000,000 values before there's a 50% chance of a collision occurring.

CREATE TABLE js_friendly_id (
  id INT8
    CHECK (id >=0 and id < (1<<53) )
    PRIMARY KEY
    DEFAULT (fnv64a(unique_rowid()::string) & ((1<<53)-1)),
  foo int);

INSERT INTO js_friendly_id (foo) SELECT generate_series(1,100000);
SELECT * FROM js_friendly_id LIMIT 10;
       id      |  foo
---------------+--------
   87724929937 | 32274
   93015663888 | 98787
  199741459955 | 74592
  238588845328 | 90859
  379450939404 | 27335
  499271598639 | 24311
  576712899457 | 52622
  803671687983 | 31679
  818499152170 |  7085
  847724521443 | 58947
(10 rows)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
O-external Origin: Issue comes from external users. P-2 Normal priority; secondary task T-missing-info
Projects
None yet
Development

No branches or pull requests

6 participants