Issue 8980 #11638

Closed
wants to merge 88 commits into from
Conversation

cryptoe

@cryptoe cryptoe commented Oct 4, 2018

No description provided.

Ramesh B and others added 30 commits August 24, 2018 20:13
Ranger integration with presto. Please change local file path

See merge request !1
Making information schema skippable

Making information schema skippable

See merge request !2
reverse merge

See merge request !3
minor improvements

See merge request !4
…ec not found in ranger 4. Fixing presto test cases
Any temporary files with the same size as a previous temporary file
were discarded due to the TreeSet comparator.
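
The commit message above describes a general `TreeSet` pitfall: elements that compare as equal are treated as duplicates and silently dropped. Below is a minimal sketch of that behavior, not the actual Presto code. The `TempFile` class, its fields, and the tie-breaking comparator are hypothetical names chosen for illustration.

```java
import java.util.Comparator;
import java.util.TreeSet;

class TreeSetComparatorSketch {
    // Hypothetical stand-in for a temporary file: a name and a size in bytes.
    static final class TempFile {
        final String name;
        final long size;

        TempFile(String name, long size) {
            this.name = name;
            this.size = size;
        }
    }

    // Comparing by size alone: two files of equal size compare as 0,
    // so the TreeSet keeps only the first one.
    static int countWithSizeOnlyComparator() {
        TreeSet<TempFile> files =
                new TreeSet<>(Comparator.comparingLong((TempFile f) -> f.size));
        files.add(new TempFile("a.tmp", 100));
        files.add(new TempFile("b.tmp", 100)); // same size as a.tmp: dropped
        files.add(new TempFile("c.tmp", 200));
        return files.size();
    }

    // Adding a tie-breaker (here, the name) makes distinct files compare as distinct.
    static int countWithTieBreaker() {
        Comparator<TempFile> bySizeThenName =
                Comparator.comparingLong((TempFile f) -> f.size)
                        .thenComparing((TempFile f) -> f.name);
        TreeSet<TempFile> files = new TreeSet<>(bySizeThenName);
        files.add(new TempFile("a.tmp", 100));
        files.add(new TempFile("b.tmp", 100)); // kept: names differ
        files.add(new TempFile("c.tmp", 200));
        return files.size();
    }

    public static void main(String[] args) {
        System.out.println(countWithSizeOnlyComparator()); // 2
        System.out.println(countWithTieBreaker());         // 3
    }
}
```

Whether the real fix added a tie-breaker or switched to a different collection is not stated here; the sketch only shows why a size-only comparator loses entries.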
wenleix and others added 24 commits October 4, 2018 16:32
When using sockets created using SocketChannel, OkHttp handles thread
interruption differently and does not restore the interrupt status.
Use a different table name for testCreateEmptyBucketedPartition
and testInsertPartitionedBucketedTable to avoid test failures
when running in parallel.
This allows us to calculate stats for fragmented plans by using the
equivalent plan with exchanges.
Print estimated stats and costs for all fragments in a distributed
explain plan.

Sample output:
presto:sf1> explain (type distributed) select * from nation n join region r on n.regionkey = r.regionkey;
                                                                                                        Query Plan
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Fragment 0 [SINGLE]
     Output layout: [nationkey, name, regionkey, comment, regionkey, name_1, comment_2]
     Output partitioning: SINGLE []
     Execution Flow: UNGROUPED_EXECUTION
     - Output[nationkey, name, regionkey, comment, regionkey, name, comment] => [nationkey:bigint, name:varchar(25), regionkey:bigint, comment:varchar(152), regionkey:bigint, name_1:varchar(25), comment_2:varchar(152)]
             Cost: {rows: 25 (4.91kB), cpu: 15427.00, memory: 504.00, network: 0.00}
             name := name_1
             comment := comment_2
         - RemoteSource[1] => [nationkey:bigint, name:varchar(25), regionkey:bigint, comment:varchar(152), name_1:varchar(25), comment_2:varchar(152)]
                 Cost: {rows: 25 (4.69kB), cpu: 15427.00, memory: 504.00, network: 0.00}

 Fragment 1 [HASH]
     Output layout: [nationkey, name, regionkey, comment, name_1, comment_2]
     Output partitioning: SINGLE []
     Execution Flow: UNGROUPED_EXECUTION
     - InnerJoin[("regionkey" = "regionkey_0")][$hashvalue, $hashvalue_34] => [nationkey:bigint, name:varchar(25), regionkey:bigint, comment:varchar(152), name_1:varchar(25), comment_2:varchar(152)]
             Distribution: PARTITIONED
             Cost: {rows: 25 (4.69kB), cpu: 15427.00, memory: 504.00, network: 0.00}
         - RemoteSource[2] => [nationkey:bigint, name:varchar(25), regionkey:bigint, comment:varchar(152), $hashvalue:bigint]
                 Cost: {rows: 25 (2.89kB), cpu: 5693.00, memory: 0.00, network: 0.00}
         - LocalExchange[HASH][$hashvalue_34] ("regionkey_0") => regionkey_0:bigint, name_1:varchar(25), comment_2:varchar(152), $hashvalue_34:bigint
                 Cost: {rows: 5 (504B), cpu: 1467.00, memory: 0.00, network: 0.00}
             - RemoteSource[3] => [regionkey_0:bigint, name_1:varchar(25), comment_2:varchar(152), $hashvalue_35:bigint]
                     Cost: {rows: 5 (504B), cpu: 963.00, memory: 0.00, network: 0.00}

 Fragment 2 [SOURCE]
     Output layout: [nationkey, name, regionkey, comment, $hashvalue_33]
     Output partitioning: HASH [regionkey][$hashvalue_33]
     Execution Flow: UNGROUPED_EXECUTION
     - ScanProject[table = tpch:tpch:nation:sf1.0, originalConstraint = true] => [nationkey:bigint, name:varchar(25), regionkey:bigint, comment:varchar(152), $hashvalue_33:bigint]
             Cost: {rows: 25 (2.67kB), cpu: 2734.00, memory: 0.00, network: 0.00}/{rows: 25 (2.89kB), cpu: 5693.00, memory: 0.00, network: 0.00}
             $hashvalue_33 := "combine_hash"(bigint '0', COALESCE("$operator$hash_code"("regionkey"), 0))
             nationkey := tpch:nationkey
             regionkey := tpch:regionkey
             name := tpch:name
             comment := tpch:comment

 Fragment 3 [SOURCE]
     Output layout: [regionkey_0, name_1, comment_2, $hashvalue_36]
     Output partitioning: HASH [regionkey_0][$hashvalue_36]
     Execution Flow: UNGROUPED_EXECUTION
     - ScanProject[table = tpch:tpch:region:sf1.0, originalConstraint = true] => [regionkey_0:bigint, name_1:varchar(25), comment_2:varchar(152), $hashvalue_36:bigint]
             Cost: {rows: 5 (459B), cpu: 459.00, memory: 0.00, network: 0.00}/{rows: 5 (504B), cpu: 963.00, memory: 0.00, network: 0.00}
             $hashvalue_36 := "combine_hash"(bigint '0', COALESCE("$operator$hash_code"("regionkey_0"), 0))
             comment_2 := tpch:comment
             name_1 := tpch:name
             regionkey_0 := tpch:regionkey

(1 row)
1.Col masking 2. Handling all query cases 3. Handling when catalog spec not found in ranger 4. Fixing presto test cases

 testcases and db_name. hack

 adding more test cases, removing dead code

checkstyle things

 adding column permission check + test cases

1. Documentation 2. Cleaning authorizer flow 3. Adding auditing 4. Adding the plugin jar in presto xml

Removing ola specific things

Typos
Ranger pom changes

some unwanted files

pom changes

Ranger base plugin

Ranger base plugin
# This is the 1st commit message:

 WIP: Ranger integration with presto. Please change local file path

# This is the commit message prestodb#2:

Making information schema skippable

# This is the commit message prestodb#3:

minor improvements

# This is the commit message prestodb#4:

Fix truncated print for ProjectionNode in PlanPrinter

# This is the commit message prestodb#2:

Cleanup all temporary files when writing sorted Hive tables

# This is the commit message prestodb#3:

Prevent nulls fraction to be negative in subtractColumnStats

# This is the commit message prestodb#2:

Better estimate NDVs and range in subtractColumnStats

When there are too few rows per distinct value to
be subtracted, both the original range and the NDVs
should be preserved.

# This is the commit message prestodb#2:

Mark subtractColumnStats as @deprecated as its semantics are undefined

# This is the commit message prestodb#3:

Dispose resources in WorkProcessorUtils when they are no longer needed

# This is the commit message prestodb#4:

Add WorkProcessor#transformProcessor method

This method allows writing more streamlined
transformations of the processor itself, e.g.:

processor.transformProcessor(WorkProcessorUtils::flatten)

# This is the commit message prestodb#2:

Fix error message in QuerySessionSupplier

# This is the commit message prestodb#2:

Rename singleExpression to standaloneExpression

This better reflects the intent of the parsing rule.

# This is the commit message prestodb#2:

Improve parsing error message

The name appears as the prefix of some errors. E.g.,

    xxxxx is too large (stack overflow while parsing)

"path specification" reads better than "pathSpec"

# This is the commit message prestodb#3:

Log raised during error handling

It was logging the parsing exception, not the exception that could
occur while handling the parsing error due to a bug in the implementation.

# This is the commit message prestodb#2:

Tighten assertion for parsing failure in TestSqlEnvironmentConfig

# This is the commit message prestodb#3:

Implement recursion-free ATN simulator

The new implementation tracks all the possible states the ATN can be
in in a single queue instead of recursing when processing sub-rules.

This vastly simplifies the code, makes it easier to reason about and
debug. It also fixes a latent bug where some contexts were not being
visited, which missed some candidate tokens.

# This is the commit message prestodb#4:

Raise required Java version to 8u151

Some test cases failed due to a JDK MethodHandle bug in 8u92.
These tests passed in 8u151.

# This is the commit message prestodb#5:

Track origin column for fields

This is needed to ensure we only check column access privileges for
origin columns.

# This is the commit message prestodb#6:

Don't check column access for aliases

Fixes queries that have column aliases defined in the query like
`SELECT col1 AS my_alias FROM table`. Previously, we checked if a
user had permission to access "col1" and "my_alias". Now we only
check "col1".

# This is the commit message prestodb#2:

Add wrapped Boolean benchmarks

Benchmark                                         Mode  Cnt     Score    Error   Units
BenchmarkBoxedBoolean.booleanEquals              thrpt   30  4596.863 ± 26.817  ops/ms
BenchmarkBoxedBoolean.booleanEqualsNotNull       thrpt   30  4484.421 ± 43.358  ops/ms
BenchmarkBoxedBoolean.identity                   thrpt   30  4610.811 ± 85.104  ops/ms
BenchmarkBoxedBoolean.object                     thrpt   30  4558.072 ± 68.325  ops/ms
BenchmarkBoxedBoolean.primitive                  thrpt   30  4450.140 ± 52.258  ops/ms
BenchmarkBoxedBoolean.unboxing                   thrpt   30  4506.205 ± 26.116  ops/ms

# This is the commit message prestodb#3:

Implement equality comparisons projection benchmark

The benchmark creates an expression with 100 comparisons concatenated with
OR and executes them on a single 1MB page with 10 BIGINT channels.

Before change:

Benchmark                                  Mode  Cnt     Score    Error  Units
BenchmarkEqualsOperator.processPage        avgt   15  5164.793 ± 29.110  us/op

After change:

Benchmark                                  Mode  Cnt     Score    Error  Units
BenchmarkEqualsOperator.processPage        avgt   15  5142.010 ± 59.681  us/op

# This is the commit message prestodb#2:

Implement nullable EQUAL and NOT_EQUAL for ARRAY type

# This is the commit message prestodb#3:

Change equals semantics for null in MAP

Make it compatible with the current ARRAY equals semantics

If the key sets are different - return false.
If the key sets are the same - compare the values as if they
were arrays sorted by the keys.
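
The rule above can be sketched roughly as follows. This is an illustration under stated assumptions, not the Presto implementation: the `mapEquals` helper is hypothetical, and the handling of null values (a definite mismatch yields false, otherwise any null value makes the result unknown, represented as a Java `null`) is an assumed reading of "ARRAY equals semantics" based on SQL three-valued logic.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MapEqualsSketch {
    // Returns TRUE, FALSE, or null (unknown). Hypothetical helper for illustration.
    static <K extends Comparable<K>, V> Boolean mapEquals(Map<K, V> left, Map<K, V> right) {
        // Different key sets: definitely not equal.
        if (!left.keySet().equals(right.keySet())) {
            return Boolean.FALSE;
        }
        // Same key sets: walk the values in key order, as if comparing two arrays.
        List<K> keys = new ArrayList<>(left.keySet());
        Collections.sort(keys);
        boolean unknown = false;
        for (K key : keys) {
            V l = left.get(key);
            V r = right.get(key);
            if (l == null || r == null) {
                unknown = true; // assumed: a null value makes this comparison unknown
                continue;
            }
            if (!l.equals(r)) {
                return Boolean.FALSE; // a definite mismatch wins over unknown
            }
        }
        return unknown ? null : Boolean.TRUE;
    }

    public static void main(String[] args) {
        Map<Integer, String> a = new HashMap<>();
        a.put(1, "x");
        a.put(2, null);
        Map<Integer, String> b = new HashMap<>();
        b.put(1, "x");
        b.put(2, "y");
        System.out.println(mapEquals(a, b)); // unknown under the assumed null handling
    }
}
```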

# This is the commit message prestodb#4:

Implement nullable EQUAL and NOT_EQUAL for ROW type

# This is the commit message prestodb#5:

Support IN predicate for complex type values with nulls

# This is the commit message prestodb#6:

Allow alternate implementations of HiveMetadata

Extracting an interface allows implementations that use delegation
rather than subclasses.

# This is the commit message prestodb#2:

Fix handling of thread interruption in JDBC driver

When using sockets created using SocketChannel, OkHttp handles thread
interruption differently and does not restore the interrupt status.
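
For context, the usual Java idiom for this kind of problem is to re-set the thread's interrupt flag when an I/O call surfaces an interruption as an exception. The sketch below shows that idiom in isolation; it is not the actual JDBC driver change, and the `IoCall` interface and `callPreservingInterrupt` helper are hypothetical names.

```java
import java.io.IOException;
import java.io.InterruptedIOException;
import java.net.SocketTimeoutException;

class InterruptStatusSketch {
    // Hypothetical functional interface standing in for a blocking I/O call.
    interface IoCall {
        void run() throws IOException;
    }

    // Runs the call and restores the interrupt flag if the call reports an interruption.
    static void callPreservingInterrupt(IoCall call) throws IOException {
        try {
            call.run();
        } catch (SocketTimeoutException e) {
            // A timeout extends InterruptedIOException but is not a thread interruption.
            throw e;
        } catch (InterruptedIOException e) {
            // Re-set the flag so callers further up the stack can still observe it.
            Thread.currentThread().interrupt();
            throw e;
        }
    }

    // Simulates an interrupted call and reports whether the flag was restored.
    static boolean demo() {
        try {
            callPreservingInterrupt(() -> {
                throw new InterruptedIOException("simulated interruption");
            });
        } catch (IOException expected) {
            // Ignored: only the interrupt flag matters for this demo.
        }
        // Thread.interrupted() returns the flag and clears it, leaving the thread clean.
        return Thread.interrupted();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```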

# This is the commit message prestodb#3:

Add per table column_ranges system table in Raptor

# This is the commit message prestodb#4:

Fix Hive smoke test when running in parallel

Use a different table name for testCreateEmptyBucketedPartition
and testInsertPartitionedBucketedTable to avoid test failures
when running in parallel.

# This is the commit message prestodb#5:

Separate EXCEEDED_MEMORY_LIMIT error into local and global

# This is the commit message prestodb#6:

Fix test failures due to ExceededMemoryLimitException message

# This is the commit message prestodb#7:

Fix more test failures due to ExceededMemoryLimitException message

A few test failures were not fixed by
47355c8

# This is the commit message prestodb#8:

Replace usage of deprecated TreeTraverser

# This is the commit message prestodb#2:

Add missing column to verifier SQL documentation
1.Col masking 2. Handling all query cases 3. Handling when catalog spec not found in ranger 4. Fixing presto test cases

 testcases and db_name. hack

 adding more test cases, removing dead code

checkstyle things

 adding column permission check + test cases

1. Documentation 2. Cleaning authorizer flow 3. Adding auditing 4. Adding the plugin jar in presto xml

Removing ola specific things

Typos

Le bump

checkstyle things

 adding column permission check + test cases

1. Documentation 2. Cleaning authorizer flow 3. Adding auditing 4. Adding the plugin jar in presto xml

Typos
@cryptoe
Author

cryptoe commented Oct 4, 2018

#8980

@atris

atris commented Oct 4, 2018

  1. Please remove the unwanted commits
  2. Please squash your own commits
  3. Please write a description of your changes

@cryptoe
Author

cryptoe commented Oct 4, 2018

Squashing commits in #11639

@cryptoe cryptoe closed this Oct 4, 2018
@cryptoe cryptoe deleted the ISSUE_8980 branch October 4, 2018 14:03