The HASH-JOIN dataset API could look similar to the code below:

    func HashJoin(left, right Dataset, joinColumns, ...other options) Dataset {
        // ...
    }
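The HASH-JOIN should contain two phases:

1. Build: create a hash table from one dataset, with key = `hash(values of join_columns)` and value = rows.
2. Probe: for each row of the other dataset, compute the same hash and compare the row against the rows stored under that key.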
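One way the key `hash(values of join_columns)` might be computed for one or more join columns is sketched below. The `Row` type and the `joinKey` helper are hypothetical names used only for illustration, and FNV-1a from the standard library is just one possible hash function, not something this proposal prescribes.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// Row is a hypothetical row representation: column name -> value.
type Row map[string]string

// joinKey hashes the values of the given join columns into a single key.
// Only the values are hashed, so foo.x = "5" and bar.y = "5" get the same key.
func joinKey(r Row, joinColumns []string) uint64 {
	h := fnv.New64a()
	for _, c := range joinColumns {
		h.Write([]byte(r[c]))
		h.Write([]byte{0}) // separator between column values
	}
	return h.Sum64()
}

func main() {
	a := Row{"id": "a", "x": "5"}
	j := Row{"id": "j", "y": "5"}
	// Rows with equal join values land under the same key.
	fmt.Println(joinKey(a, []string{"x"}) == joinKey(j, []string{"y"})) // true
}
```

Because different values can share a hash, a matching key only narrows the candidates; the actual join values still have to be compared.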
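A tiny example: we have two datasets, and we want to execute SQL like `select foo.id, bar.id from foo join bar on foo.x = bar.y`. Rows are written as id-value, so a-5 is the row of foo with id a and x = 5.

- Dataset foo: build a hash map with the hash method `x -> x%2`; we get a map like `{ 0: [b-6], 1: [a-5, c-7] }`.
- Dataset bar: j-5 will check chunk[key=1] and k-8 will check chunk[key=0]. a-5 and j-5 match, bingo!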
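The foo/bar example can be sketched in Go roughly as follows. This is a minimal in-memory illustration, not the proposed `HashJoin` API: the `Row` struct and `hashJoin` function are hypothetical, the foo rows are taken from the map shown in the example, and bar holds only the two rows the example mentions.

```go
package main

import "fmt"

// Row is a hypothetical two-field row: an id plus the join column value.
type Row struct {
	ID  string
	Val int
}

// hashJoin builds buckets from the left rows, then probes them with the right rows.
// It returns (left id, right id) pairs whose join values are equal.
func hashJoin(left, right []Row, hash func(int) int) [][2]string {
	// Build phase: key = hash(join value), value = rows.
	buckets := make(map[int][]Row)
	for _, l := range left {
		k := hash(l.Val)
		buckets[k] = append(buckets[k], l)
	}

	// Probe phase: each right row only checks the bucket with its own key.
	var out [][2]string
	for _, r := range right {
		for _, l := range buckets[hash(r.Val)] {
			if l.Val == r.Val { // same bucket does not imply equal values
				out = append(out, [2]string{l.ID, r.ID})
			}
		}
	}
	return out
}

func main() {
	foo := []Row{{"a", 5}, {"b", 6}, {"c", 7}}
	bar := []Row{{"j", 5}, {"k", 8}}
	mod2 := func(x int) int { return x % 2 }

	fmt.Println(hashJoin(foo, bar, mod2)) // [[a j]]
}
```

Here k-8 lands in bucket 0 next to b-6 but is rejected by the equality check, which is why the probe compares the real join values and not only the hash.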
pls assign to me
huangwenkan9