Collect column statistics on write [v2] #11054

arhimondr · 2018-07-16T02:43:38Z

Important changes since version 1:

Commits from Return HiveColumnStatistics from the HiveMetastore interface to Move createPartitionValues method to a utility class were extracted into a separate PR and merged: Preliminary column statistics refactorings #10972
Added Implement AutoCloseableCloser commit.
SPI bits extracted to Collect column statistics on table write: SPI
The getInsertStatisticsMetadata and getNewTableStatisticsMetadata methods were merged into a single getStatisticsCollectionMetadata method
AggregationOperator integration with the TableWriterOperator has changed
ExtendedHiveMetastore#isColumnStatistitcsSupported() replaced with ExtendedHiveMetastore#getSupportedColumnStatistics. That eliminates the need for CollectibleStatisticsProvider. Commit: Replace supportsColumnStatistics with getSupportedColumnStatistics
ENABLED_FOR_MARKED_TABLES option has been removed as well as a related table property.
Migrate column statistics on drop and rename column commit has been dropped
Added Collect column statistics on table write: Smoke Tests commit

arhimondr · 2018-07-16T03:00:31Z

@electrum I'm still working on the integration test

arhimondr · 2018-07-17T20:07:10Z

Please ignore the Fix memory leaks in tests from the presto-tests module commit, it is part of the #11062 PR

findepi · 2018-07-18T14:13:59Z

presto-main/src/main/java/com/facebook/presto/util/AutoCloseableCloser.java

+        return new AutoCloseableCloser();
+    }
+
+    public void register(AutoCloseable closeable)


public <T extends AutoCloseable> T register(T closeable)

findepi · 2018-07-18T14:16:56Z

presto-main/src/main/java/com/facebook/presto/util/AutoCloseableCloser.java

+            throws Exception
+    {
+        Throwable rootCause = null;
+        for (AutoCloseable closeable : closeables) {


Guava Closer (which this class mimics) closes in reverse order.
Also, it doesn't close the underlying resource twice, even if close is called twice (eg concurrently)

I checked that. First i didn't get it, but now i think i got it.

Given the example

stream = closer.register(new OutputStream)
writer = closer.register(new OutputWriter(stream))

it makes total sense

even if close is called twice (eg concurrently)

The Closer is not thread safe. It is hard to reason what is going on when using it concurrently.
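To make the points in this thread concrete, here is a minimal sketch of a closer whose register returns its argument and whose close releases resources in reverse registration order. It is not the PR's AutoCloseableCloser: the class name SketchCloser is hypothetical, only checked and unchecked Exceptions are collected, and it is not thread safe.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for the class under review; not the PR's implementation.
final class SketchCloser
        implements AutoCloseable
{
    // Used as a stack: the last registered resource is closed first
    private final Deque<AutoCloseable> closeables = new ArrayDeque<>();

    // Returning the argument lets callers register and assign in one expression
    public <T extends AutoCloseable> T register(T closeable)
    {
        if (closeable == null) {
            throw new NullPointerException("closeable is null");
        }
        closeables.push(closeable);
        return closeable;
    }

    @Override
    public void close()
            throws Exception
    {
        Exception rootCause = null;
        // The stack is drained here, so a second sequential close() closes nothing
        while (!closeables.isEmpty()) {
            try {
                closeables.pop().close();
            }
            catch (Exception e) {
                if (rootCause == null) {
                    rootCause = e;
                }
                else if (rootCause != e) {
                    rootCause.addSuppressed(e);
                }
            }
        }
        if (rootCause != null) {
            throw rootCause;
        }
    }
}
```

With the stream/writer example above, the writer is closed before the stream it wraps, which is why reverse order matters.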

findepi · 2018-07-18T14:18:41Z

presto-main/src/main/java/com/facebook/presto/util/AutoCloseableCloser.java

+        if (rootCause != null) {
+            throwIfUnchecked(rootCause);
+            throwIfInstanceOf(rootCause, Exception.class);
+            throw new Exception(rootCause);


if we reach here, throw new Error or AssertionError would be more appropriate.. since this is unreachable

findepi · 2018-07-18T14:19:39Z

presto-main/src/main/java/com/facebook/presto/util/AutoCloseableCloser.java

+        }
+        if (rootCause != null) {
+            throwIfUnchecked(rootCause);
+            throwIfInstanceOf(rootCause, Exception.class);


replace those 2 with propagateIfPossible(rootCause, Exception.class)

We use throwIfInstanceOf across the code

Almost all usages of throwIfInstanceOf are alone. The few places that it is combined with throwIfUnchecked should be converted to propagateIfPossible since that is more succinct.
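For readers without Guava at hand, the tail of close() discussed in the last two threads can be written out in plain Java. This is a sketch of the branching only, with a hypothetical Rethrow helper; it is not Guava's code. It also shows when the final line is reached: only for a Throwable that is neither an Exception nor an Error.

```java
// Hypothetical helper spelling out the rethrow branches in plain Java.
final class Rethrow
{
    private Rethrow() {}

    static void rethrow(Throwable rootCause)
            throws Exception
    {
        // Unchecked throwables are rethrown as-is
        if (rootCause instanceof RuntimeException) {
            throw (RuntimeException) rootCause;
        }
        if (rootCause instanceof Error) {
            throw (Error) rootCause;
        }
        // Checked exceptions are rethrown as-is too
        if (rootCause instanceof Exception) {
            throw (Exception) rootCause;
        }
        // Only a direct Throwable subclass outside Exception and Error gets here
        throw new AssertionError(rootCause);
    }
}
```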

findepi · 2018-07-18T14:21:48Z

presto-main/src/main/java/com/facebook/presto/metadata/Metadata.java


    Optional<NewTableLayout> getInsertLayout(Session session, TableHandle target);

+    /**
+     * Describes statistics that must be collected for a new table


only new tables? I think also updated ones

findepi · 2018-07-18T14:22:37Z

presto-spi/src/main/java/com/facebook/presto/spi/connector/ConnectorMetadata.java

@@ -264,6 +266,14 @@ default void dropColumn(ConnectorSession session, ConnectorTableHandle tableHand
        return Optional.of(new ConnectorNewTableLayout(partitioningHandle, partitionColumns));
    }

+    /**
+     * Describes statistics that must be collected for a new table.


again -- "new"?

findepi · 2018-07-18T14:23:42Z

presto-spi/src/main/java/com/facebook/presto/spi/statistics/ComputedStatistics.java

+        return groupingColumns;
+    }
+
+    public List<Block> getGropingValues()


typo: groping
(field & ctor & builder too)

findepi · 2018-07-18T14:25:47Z

presto-spi/src/main/java/com/facebook/presto/spi/statistics/ComputedStatistics.java

+        public ComputedStatistics build()
+        {
+            return new ComputedStatistics(
+                    unmodifiableList(groupingColumns),


Did you mean new ArrayList<>(groupingColumns)?
Otherwise this is redundant (ctor does that) and allows creating mutable ComputedStatistics objects

Removed the unmodifiable* wrappers here entirely, since those are applied in the constructor.

findepi · 2018-07-18T14:26:46Z

presto-spi/src/main/java/com/facebook/presto/spi/statistics/ComputedStatistics.java

+        public Builder(List<String> groupingColumns, List<Block> gropingValues)
+        {
+            this.groupingColumns = requireNonNull(groupingColumns, "groupingColumns is null");
+            this.gropingValues = requireNonNull(gropingValues, "gropingValues is null");


convert ctor to builder methods withGroupingColumns (be sure to make defensive copy there)

These fields are required. Why do you want to make them builder methods? Also, a defensive copy is made in the constructor. Why make it twice?

findepi · 2018-07-18T14:27:17Z

presto-spi/src/main/java/com/facebook/presto/spi/statistics/ComputedStatistics.java

+            Map<TableStatisticType, Block> tableStatistics,
+            Map<ColumnStatisticMetadata, Block> columnStatistics)
+    {
+        this.groupingColumns = unmodifiableList(requireNonNull(groupingColumns, "groupingColumns is null"));


unmodifiableList( new ArrayList<>( requireNonNull( ....

I messed up during the rebase. That was applied in the commit later. Moving it here.
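The difference the reviewer is pointing at can be shown with the JDK alone. In this sketch (the class name GroupingColumnsHolder is hypothetical), unmodifiableList by itself produces a read-only view that still tracks the caller's list, while wrapping a fresh ArrayList produces an independent snapshot.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical holder contrasting a read-only view with a defensive copy.
final class GroupingColumnsHolder
{
    private final List<String> viewOnly;
    private final List<String> copied;

    GroupingColumnsHolder(List<String> groupingColumns)
    {
        // Read-only view: later changes to the caller's list show through
        this.viewOnly = Collections.unmodifiableList(groupingColumns);
        // Read-only snapshot: insulated from the caller's list
        this.copied = Collections.unmodifiableList(new ArrayList<>(groupingColumns));
    }

    List<String> getViewOnly()
    {
        return viewOnly;
    }

    List<String> getCopied()
    {
        return copied;
    }
}
```

If the caller adds an element after construction, getViewOnly() reflects it and getCopied() does not.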

findepi

"Collect column statistics on table write: Planner" -- skimmed only. I refuse to understand SymbolMapper for now

findepi · 2018-07-18T14:29:53Z

presto-main/src/main/java/com/facebook/presto/sql/planner/LogicalPlanner.java

+        List<Symbol> commitOutputs = ImmutableList.of(symbolAllocator.newSymbol("rows", BIGINT));
+
+        if (!statisticsMetadata.isEmpty()) {
+            verify(columnNames.size() == symbols.size(), "columnNames.size() != symbols.size(): %s != %s", columnNames.size(), symbols.size());


nit: in case this fails some day, including full collections in the message might be helpful (perhaps even instead of the sizes):

verify(columnNames.size() == symbols.size(), "columnNames.size() != symbols.size(): %s and %s", columnNames, symbols);

findepi · 2018-07-18T14:32:16Z

presto-main/src/main/java/com/facebook/presto/sql/planner/StatisticsAggregationPlanner.java

+        }
+
+        FunctionRegistry functionRegistry = metadata.getFunctionRegistry();
+        if (!statisticsMetadata.getTableStatistics().isEmpty()) {


You should merge this if into the for loop above:

for (TableStatisticType type : statisticsMetadata.getTableStatistics()) {
    if (type != ROW_COUNT) {
        ...
    }
    // plan the row count agg
}

Or:

for (TableStatisticType type : statisticsMetadata.getTableStatistics()) {
    if (type == ROW_COUNT) {
        // plan the row count agg.
    }
    else {
        // fail
    }
}

findepi · 2018-07-18T14:33:43Z

presto-main/src/main/java/com/facebook/presto/sql/planner/StatisticsAggregationPlanner.java

+            ColumnStatisticType statisticType = columnStatisticMetadata.getStatisticType();
+            Symbol inputSymbol = columnToSymbolMap.get(columnName);
+            verify(inputSymbol != null, "inputSymbol is null");
+            Type inputType = requireNonNull(symbolAllocator.getTypes().get(inputSymbol), "inputType is null");


line above uses verify, this one uses requireNonNull for checking nullity, but i see no reason for it

Replaced with a verify. Just in case the type for the symbol is not there.

findepi · 2018-07-18T14:35:32Z

presto-main/src/main/java/com/facebook/presto/sql/planner/StatisticsAggregationPlanner.java

+    {
+        switch (statisticType) {
+            case MIN: {
+                checkArgument(inputType.isOrderable(), "Input type is not orderable: %s", inputType);


what would happen if you removed this check (and the type wasn't orderable)?
would we get some useful enough exception during createAggregation in next line?

I thought that the explicit message might be clearer (in case someone ever wanted to compute MIN/MAX statistics for a type that is not orderable).

findepi · 2018-07-18T14:36:32Z

presto-main/src/main/java/com/facebook/presto/sql/planner/StatisticsAggregationPlanner.java

+            case MAX: {
+                checkArgument(inputType.isOrderable(), "Input type is not orderable: %s", inputType);
+                return createAggregation(QualifiedName.of("max"), input.toSymbolReference(), inputType, inputType);
+            }


nit: these {, } are redundant in most of the cases and affect readability. IMHO it's better without them, but no strong opinion here

I usually go without { } if the switch case blocks are one-liners. I like them for multiliners though.

findepi

"Collect column statistics on table write: Execution"

findepi · 2018-07-18T14:51:31Z

presto-main/src/main/java/com/facebook/presto/operator/TableFinishOperator.java

+            return false;
+        }
+        // AggregationOperator doesn't return false unless it is finished.
+        // HashAggregationOperator doesn't return false unless it is full, that is not the option here


unless it is full

or there is some unfinishedWork

findepi · 2018-07-18T14:53:54Z

presto-main/src/main/java/com/facebook/presto/operator/TableFinishOperator.java

+        }
+        // AggregationOperator doesn't return false unless it is finished.
+        // HashAggregationOperator doesn't return false unless it is full, that is not the option here
+        // The assumption is that the spill is always disabled for the statistics aggregation, ans


findepi · 2018-07-18T14:57:54Z

presto-main/src/main/java/com/facebook/presto/operator/TableFinishOperator.java

+
+        Block[] blocks = new Block[page.getChannelCount()];
+        for (int channel = 0; channel < page.getChannelCount(); channel++) {
+            blocks[channel] = page.getBlock(channel).copyPositions(selectedPositions, 0, statisticsPositionCount);


com.facebook.presto.spi.Page#getPositions ?
else, leave a comment why not using this

Copy just seems to be safer, because it is a copy. There will not be much data to copy, and in most cases it will return the page unaltered. But it can be getPositions as well. I don't have a strong opinion here.

Well -- we sometimes return the whole page (few lines earlier), so there cannot be assumption that you do a copy.

(no strong opinion either, just opportunity to simplify the code)

findepi · 2018-07-18T14:59:56Z

presto-main/src/main/java/com/facebook/presto/operator/TableFinishOperator.java

    }

    @Override
    public Page getOutput()
    {
+        if (!isBlocked().isDone()) {


nit: move after if (state != State.FINISHING) { .. }

In most of the implementations isDone is a simple flag check. But i don't mind changing.

findepi · 2018-07-18T15:02:07Z

presto-main/src/main/java/com/facebook/presto/operator/TableFinishOperator.java

+    {
+        ImmutableList.Builder<ComputedStatistics> statistics = ImmutableList.builder();
+        while (!statisticsAggregation.isFinished()) {
+            Page page = statisticsAggregation.getOutput();


verify(statisticsAggregation.isBlocked().isDone()); to fail fast rather than voidly looping in case something goes wrong

findepi · 2018-07-18T15:06:49Z

presto-main/src/main/java/com/facebook/presto/operator/TableWriterOperator.java

        }
+        // Please read the comment in the TableFinishOperator#needsInput method


They may go out of sync. Why not make a defensive copy?

findepi · 2018-07-18T15:08:31Z

presto-main/src/main/java/com/facebook/presto/operator/TableWriterOperator.java

+            blocked = NOT_BLOCKED;
+        }
+        else {
+            blocked = allAsList(blockedOnAggregation, blockedOnWrite);


remove if, else's code is generic enough.

if you don't want to go through com.google.common.util.concurrent.Futures#allAsList(com.google.common.util.concurrent.ListenableFuture<? extends V>...) in all-done case, create a helper method that does the if
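One possible shape for the helper suggested here, sketched with the JDK's CompletableFuture standing in for Guava's ListenableFuture (an assumption made so the example runs without Guava). The class and method names are hypothetical.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical helper; CompletableFuture is a stand-in for ListenableFuture.
final class BlockedFutures
{
    static final CompletableFuture<Void> NOT_BLOCKED = CompletableFuture.completedFuture(null);

    private BlockedFutures() {}

    static CompletableFuture<Void> whenBothDone(CompletableFuture<?> blockedOnAggregation, CompletableFuture<?> blockedOnWrite)
    {
        // Fast path: skip allocating a combined future when nothing is pending
        if (blockedOnAggregation.isDone() && blockedOnWrite.isDone()) {
            return NOT_BLOCKED;
        }
        // Generic path: completes once both inputs have completed
        return CompletableFuture.allOf(blockedOnAggregation, blockedOnWrite);
    }
}
```

The operator code then has a single assignment, and the all-done shortcut lives in one place.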

findepi · 2018-07-18T15:08:45Z

presto-main/src/main/java/com/facebook/presto/operator/TableWriterOperator.java

@@ -210,18 +240,53 @@ public void addInput(Page page)
    @Override
    public Page getOutput()
    {
-        if (state != State.FINISHING || !blocked.isDone()) {
+        if (!blocked.isDone() || state != State.FINISHING) {


check state first (cheaper first)

findepi · 2018-07-18T15:11:00Z

presto-main/src/main/java/com/facebook/presto/operator/TableFinishOperator.java

+        while (!statisticsAggregation.isFinished()) {
+            Page page = statisticsAggregation.getOutput();
+            if (page == null) {
+                continue;


you should yield here (return). Operator shouldn't do a ton of work within single call, otherwise a query might be "unkillable"

IRL we cannot get more than a single page of data here. We group the statistics on a per-partition basis, and by default we never insert more than 100 partitions at once. Even if we inserted 100_000, there still would not be enough data for more than a few pages, and at that point you would definitely have more severe problems. So for the sake of code simplicity in this class (which is already complex), i would go with what we have now.

fine, but please add a comment that we deliberately do not yield here

findepi · 2018-07-18T15:11:44Z

presto-main/src/main/java/com/facebook/presto/operator/TableWriterOperator.java

    {
+        AutoCloseableCloser closer = AutoCloseableCloser.create();


wrap in try-with-r

It is unnecessary. We don't expect the register() methods to fail.

yea... it's also customary. just a matter of taste

rschlussel · 2018-07-17T17:33:29Z

presto-spi/src/main/java/com/facebook/presto/spi/statistics/ColumnStatisticType.java

+ */
+package com.facebook.presto.spi.statistics;
+
+public enum ColumnStatisticType


should these match what's in the ColumnStatistics spi? I think it would make sense for the set of stats we support reading and writing to be the same. If so, there's no max/average_value_size_in_bytes, but there is a total_size_in_bytes.

not necessarily. SPI is a common denominator, something we ingest. Here we have a superset of all possible stats external systems can store.
SPI says what we want for CBO.
Here we say what Hive needs (or other programs using the metastore may need, including ourselves)

@rschlussel There is no exact match. NUMBER_OF_TRUE_VALUES is something boolean specific, NUMBER_OF_NON_NULL_VALUES is chosen over NUMBER_OF_NULL_VALUES just because it is easier to compute (no extra projection), and so on.

rschlussel · 2018-07-17T21:29:11Z

presto-main/src/main/java/com/facebook/presto/operator/TableFinishOperator.java

+        }
+        // AggregationOperator doesn't return false unless it is finished.
+        // HashAggregationOperator doesn't return false unless it is full, that is not the option here
+        // The assumption is that the spill is always disabled for the statistics aggregation, ans


rschlussel · 2018-07-17T23:40:44Z

presto-main/src/main/java/com/facebook/presto/operator/TableFinishOperator.java

+        return Optional.of(new Page(statisticsPositionCount, blocks));
+    }
+
+    private static boolean isStatisticsPosition(Page page, int position)


can you add a comment somewhere explaining the multiplexing that you explained to me in person?

rschlussel · 2018-07-17T23:56:02Z

presto-main/src/main/java/com/facebook/presto/operator/TableWriterOperator.java

-    public static final List<Type> TYPES = ImmutableList.of(BIGINT, VARBINARY);
+    public static final int ROW_COUNT_CHANNEL = 0;
+    public static final int FRAGMENT_CHANNEL = 1;
+    private static final int WRITER_CHANNELS = 2;


These are the stats channels, right? Could you call them STATISTICS_CHANNELS

This is a total count of WRITER_CHANNELS.

rschlussel · 2018-07-18T00:05:22Z

...-hive/src/main/java/com/facebook/presto/hive/statistics/MetastoreHiveStatisticsProvider.java

    {
        this.typeManager = requireNonNull(typeManager, "typeManager is null");
        this.metastore = requireNonNull(metastore, "metastore is null");
-        this.timeZone = timeZone;
+        this.timeZone = requireNonNull(timeZone, "timeZone is null");


you didn't actually touch this file, right? just reformatted/added this null check. You could extract and merge this separately.

No idea why this change is here, or why i would even touch this class here. I extracted it to a separate commit.

rschlussel · 2018-07-18T18:01:39Z

presto-hive/src/main/java/com/facebook/presto/hive/util/Statistics.java

+        HiveColumnStatistics.Builder result = HiveColumnStatistics.builder();
+
+        // MIN MAX
+        if (computedStatistics.containsKey(MIN) && computedStatistics.containsKey(MAX)) {


does this need to be both or neither? Sort of makes sense that they'd both be required fields, but it's not totally useless to have only one. E.g. if min for x was 10 and you had a predicate where x < 8, you'd know that nothing matches.

We always compute MIN/MAX in pairs. Checking this in a single if for simplicity. Will add an assertion though.

rschlussel · 2018-07-18T18:06:29Z

presto-hive/src/test/java/com/facebook/presto/hive/TestHiveIntegrationSmokeTest.java

+    @Test
+    public void testCollectColumnStatisticsOnWriteSwitches()
+    {
+        assertCollectColumnStatisticsOnWrite(false);


do we normally have tests that our session properties are actually gating the things we think they do? The test is fine, but curious why for this?

Before that there was more complicated logic involving both, table and session property. Since now the logic is trivial i will just remove this test.

kokosing · 2018-07-19T06:49:12Z

presto-main/src/main/java/com/facebook/presto/type/CharOperators.java

+    @LiteralParameters("x")
+    @ScalarOperator(XX_HASH_64)
+    @SqlType(StandardTypes.BIGINT)
+    public static long xxHash64(@SqlType("char(x)") Slice slice)


This was needed for the approx_distinct.

Actually the equals semantics for CHAR in Presto are not correct. Because of the binary comparison, the EQUALS operator would return false for abc⎵⎵ and abc. That will result in storing wrong statistics for the CHAR type. For now i'm not going to collect the NDV statistic for the CHAR type, until the CHAR semantics are fixed.

UPD: It is not possible to save CHAR statistics without setting NDV. Going to go with the return XxHash64.hash(slice); implementation. But potentially it may return a higher number of distinct values.

Actually the equals semantics for CHAR in Presto are not correct. Because of the binary comparison, the EQUALS operator would return false for abc⎵⎵ and abc.

as noted in #11101 (comment), internal repr is normalized, with trailing spaces removed (whenever there is a trailing space, this is a bug)
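The concern about an inflated distinct count can be illustrated without Presto. In this sketch an exact HashSet count stands in for approx_distinct and String stands in for Slice; both are assumptions for illustration, and the class name is hypothetical. If values reached the hash with trailing spaces intact, abc⎵⎵ and abc would count as two values; once trailing spaces are stripped they count as one.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration: exact counting stands in for approx_distinct.
final class CharDistinctValues
{
    private CharDistinctValues() {}

    // Counts values byte-for-byte, so padding differences create extra values
    static int distinctRaw(List<String> values)
    {
        return new HashSet<>(values).size();
    }

    // Counts values after removing trailing spaces
    static int distinctNormalized(List<String> values)
    {
        Set<String> distinct = new HashSet<>();
        for (String value : values) {
            distinct.add(stripTrailingSpaces(value));
        }
        return distinct.size();
    }

    static String stripTrailingSpaces(String value)
    {
        int end = value.length();
        while (end > 0 && value.charAt(end - 1) == ' ') {
            end--;
        }
        return value.substring(0, end);
    }
}
```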

kokosing · 2018-07-19T06:55:26Z

presto-spi/src/main/java/com/facebook/presto/spi/statistics/ComputedStatistics.java

+    private final Map<TableStatisticType, Block> tableStatistics;
+    private final Map<ColumnStatisticMetadata, Block> columnStatistics;
+
+    public ComputedStatistics(


shouldn't this be private if you provide a builder?

kokosing · 2018-07-19T06:55:55Z

presto-spi/src/main/java/com/facebook/presto/spi/statistics/ComputedStatistics.java

+        private final Map<TableStatisticType, Block> tableStatistics = new HashMap<>();
+        private final Map<ColumnStatisticMetadata, Block> columnStatistics = new HashMap<>();
+
+        public Builder(List<String> groupingColumns, List<Block> gropingValues)


kokosing · 2018-07-19T07:15:04Z

presto-main/src/main/java/com/facebook/presto/sql/planner/LogicalPlanner.java

+            ImmutableList.Builder<Symbol> writerOutputSymbols = ImmutableList.builder();
+            writerOutputSymbols.addAll(writerOutputs);
+            writerOutputSymbols.addAll(partialAggregation.getGroupingSymbols());
+            writerOutputSymbols.addAll(partialAggregation.getAggregations().keySet());


please chain this:

List<..> outputs = builder().add().add().build()

kokosing · 2018-07-19T07:15:11Z

presto-main/src/main/java/com/facebook/presto/sql/planner/LogicalPlanner.java

+            // by the partial aggregation from all of the writer nodes
+            StatisticAggregations partialAggregation = aggregations.getPartialAggregation();
+            ImmutableList.Builder<Symbol> writerOutputSymbols = ImmutableList.builder();
+            writerOutputSymbols.addAll(writerOutputs);


inline writerOutputs

It is used in multiple places (:405)

kokosing · 2018-07-19T07:36:28Z

presto-main/src/main/java/com/facebook/presto/sql/planner/StatisticsAggregationPlanner.java

+        }
+
+        FunctionRegistry functionRegistry = metadata.getFunctionRegistry();
+        if (!statisticsMetadata.getTableStatistics().isEmpty()) {


Or:

for (TableStatisticType type : statisticsMetadata.getTableStatistics()) {
    if (type == ROW_COUNT) {
        // plan the row count agg.
    }
    else {
        // fail
    }
}

kokosing · 2018-07-19T07:57:07Z

presto-main/src/main/java/com/facebook/presto/sql/planner/optimizations/SymbolMapper.java

+                node.getStatisticsAggregationDescriptor().map(descriptor -> descriptor.map(this::map)));
+    }
+
+    private PartitioningScheme canonicalizePartitionFunctionBinding(PartitioningScheme scheme, PlanNode source)


canonicalizePartitioningScheme? Or maybe even just canonicalize?

kokosing · 2018-07-19T07:58:42Z

presto-main/src/main/java/com/facebook/presto/sql/planner/optimizations/SymbolMapper.java

+    private PartitioningScheme canonicalizePartitionFunctionBinding(PartitioningScheme scheme, PlanNode source)
+    {
+        Set<Symbol> addedOutputs = new HashSet<>();
+        ImmutableList.Builder<Symbol> outputs = ImmutableList.builder();


shouldn't you use mapAndDistinct here?

kokosing · 2018-07-19T08:02:06Z

presto-main/src/main/java/com/facebook/presto/operator/TableFinishOperator.java

@@ -79,17 +94,24 @@ public OperatorFactory duplicate()

    private final OperatorContext operatorContext;
    private final TableFinisher tableFinisher;
+    private final Operator statisticsAggregation;


statisticsAggregationOperator?

Didn't want to make it overly verbose, but it looks like it indeed decreases readability.

Renamed statisticsAggregationOperatorFactory as well

kokosing · 2018-07-19T08:04:20Z

...-hive/src/test/java/com/facebook/presto/hive/metastore/glue/TestHiveClientGlueMetastore.java

+    @Override
+    public void testUpdateTableColumnStatistics()
+    {
+        // column statistics are not supported by Glue


can you throw SkipException here and elsewhere?

it isn't our convention to do so

It is not, but I think it should be. Telling people that test is passing while it is not possible for test to pass might be a bit misleading. I hope raising SkipException might have a better developer experience.

This is provided that someone reads the list of passing tests. I don't.
I do read lists of failing tests only. And Jenkins's testng plugin lists all the skipped tests, which isn't entirely interesting when the test is skipped because the functionality simply does not exist

Usually we throw SkipException if there is no easy way of disabling a test in any other way. (throwing it from somewhere deep inside the mock objects)

findepi

"Collect column statistics on table write: Hive Connector"

findepi · 2018-07-20T12:15:43Z

presto-hive/src/main/java/com/facebook/presto/hive/metastore/thrift/ThriftMetastoreUtil.java

+            return ImmutableSet.of(NUMBER_OF_NON_NULL_VALUES, MIN, MAX, NUMBER_OF_DISTINCT_VALUES);
+        }
+        if (isVarcharType(type)) {
+            return ImmutableSet.of(NUMBER_OF_NON_NULL_VALUES, NUMBER_OF_DISTINCT_VALUES, MAX_VALUE_SIZE_IN_BYTES, AVERAGE_VALUE_SIZE_IN_BYTES);


Hive stores MIN,MAX as well -- we don't use this currently, but we may in the future. Also, other tools may make use of this.
Consider // TODO ...

findepi · 2018-07-20T12:18:39Z

presto-hive/src/main/java/com/facebook/presto/hive/metastore/thrift/ThriftMetastoreUtil.java

+            return ImmutableSet.of(NUMBER_OF_NON_NULL_VALUES, NUMBER_OF_TRUE_VALUES);
+        }
+        if (isNumericType(type) || type.equals(DATE) || type.equals(TIMESTAMP)) {
+            return ImmutableSet.of(NUMBER_OF_NON_NULL_VALUES, MIN, MAX, NUMBER_OF_DISTINCT_VALUES);


add // TODO #7122 support non-legacy TIMESTAMP

findepi · 2018-07-20T12:19:40Z

presto-hive/src/main/java/com/facebook/presto/hive/metastore/thrift/ThriftMetastoreUtil.java

+        if (type.equals(VARBINARY)) {
+            return ImmutableSet.of(NUMBER_OF_NON_NULL_VALUES, MAX_VALUE_SIZE_IN_BYTES, AVERAGE_VALUE_SIZE_IN_BYTES);
+        }
+        return ImmutableSet.of();


There are types that are not here because they are not supported by Hive connector. eg. TIME_WITH_TIME_ZONE, TIMESTAMP_WITH_TIME_ZONE.
I think it would be better to end this method with

// Throwing here to make sure this method is updated when a new type is added in Hive connector
throw new IllegalArgumentException("Unsupported type: " + type);

There are many types that are not here. For example ARRAY/MAP/ROW. If i throw exception here i would need to check if the type is supported above.

i still would prefer a whitelist approach. Are there types to add other than array/map/row?

No. Yeah, maybe you are right, maybe we should return ImmutableList.empty() for those types, and throw an exception otherwise.
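A sketch of where this thread ends up: known types with nothing to collect return an empty set, and anything unrecognized fails loudly. The statistic sets mirror the diff hunks quoted above; keying on a type name string and the class and enum names are simplifications invented for this example, not the PR's code.

```java
import java.util.Collections;
import java.util.EnumSet;
import java.util.Set;

// Hypothetical whitelist keyed on a base type name instead of a Presto Type.
final class SupportedStatisticsSketch
{
    enum Statistic
    {
        NUMBER_OF_NON_NULL_VALUES,
        NUMBER_OF_TRUE_VALUES,
        MIN,
        MAX,
        NUMBER_OF_DISTINCT_VALUES,
        MAX_VALUE_SIZE_IN_BYTES,
        AVERAGE_VALUE_SIZE_IN_BYTES
    }

    private SupportedStatisticsSketch() {}

    static Set<Statistic> getSupportedColumnStatistics(String typeName)
    {
        switch (typeName) {
            case "boolean":
                return EnumSet.of(Statistic.NUMBER_OF_NON_NULL_VALUES, Statistic.NUMBER_OF_TRUE_VALUES);
            case "bigint":
            case "double":
            case "date":
            case "timestamp":
                return EnumSet.of(Statistic.NUMBER_OF_NON_NULL_VALUES, Statistic.MIN, Statistic.MAX, Statistic.NUMBER_OF_DISTINCT_VALUES);
            case "varchar":
                return EnumSet.of(Statistic.NUMBER_OF_NON_NULL_VALUES, Statistic.NUMBER_OF_DISTINCT_VALUES, Statistic.MAX_VALUE_SIZE_IN_BYTES, Statistic.AVERAGE_VALUE_SIZE_IN_BYTES);
            case "char":
                return EnumSet.of(Statistic.NUMBER_OF_NON_NULL_VALUES, Statistic.NUMBER_OF_DISTINCT_VALUES);
            case "varbinary":
                return EnumSet.of(Statistic.NUMBER_OF_NON_NULL_VALUES, Statistic.MAX_VALUE_SIZE_IN_BYTES, Statistic.AVERAGE_VALUE_SIZE_IN_BYTES);
            case "array":
            case "map":
            case "row":
                // Known types with no collectible statistics
                return Collections.emptySet();
            default:
                // Forces an update here when a new type is added to the connector
                throw new IllegalArgumentException("Unsupported type: " + typeName);
        }
    }
}
```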

findepi · 2018-07-20T12:21:43Z

presto-hive/src/main/java/com/facebook/presto/hive/metastore/thrift/ThriftMetastoreUtil.java

+            return ImmutableSet.of(NUMBER_OF_NON_NULL_VALUES, NUMBER_OF_DISTINCT_VALUES, MAX_VALUE_SIZE_IN_BYTES, AVERAGE_VALUE_SIZE_IN_BYTES);
+        }
+        if (isCharType(type)) {
+            return ImmutableSet.of(NUMBER_OF_NON_NULL_VALUES, NUMBER_OF_DISTINCT_VALUES);


MIN,MAX as well?

Isn't AVERAGE_VALUE_SIZE_IN_BYTES interesting? (with trailing spaces trimmed). Add a TODO

Isn't AVERAGE_VALUE_SIZE_IN_BYTES interesting?

It is not obvious how to compute it for CHAR. Probably some custom function will be needed. I decided to skip it for now. In the optimizer we can use the length from the type itself. Usually the deviation is not major (otherwise there is less sense in using CHAR)

a TODO note?

Don't forget to add this TODO about *_VALUE_SIZE_IN_BYTES

Don't forget to add this TODO about *_VALUE_SIZE_IN_BYTES

I'm going to introduce it in the very next PR

findepi · 2018-07-20T12:36:18Z

presto-hive/src/main/java/com/facebook/presto/hive/util/Statistics.java

+                reduce(first.getMaxColumnLength(), second.getMaxColumnLength(), SELECT_MAX, true),
+                mergeAverage(first.getAverageColumnLength(), firstRowCount, second.getAverageColumnLength(), secondRowCount),
+                reduce(first.getNullsCount(), second.getNullsCount(), ADD, false),
+                reduce(first.getDistinctValuesCount(), second.getDistinctValuesCount(), SELECT_MAX, false));


why false? (generally MAX seems suitable and is used with returnFirstNonEmpty=true)

null can indicate 2 things.

statistic is missing (was never computed)

In case of min/max - that the table is empty

If the table is empty NDV is not gonna be null, but 0.
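The role of the boolean flag can be sketched as follows. The semantics are an assumption inferred from the parameter name returnFirstNonEmpty used in this thread, not copied from the PR: when only one side is present, true keeps that side and false yields an unknown result. The class name is hypothetical and the SUBTRACT operator is omitted.

```java
import java.util.OptionalLong;

// Hypothetical sketch; flag semantics are assumed from the parameter name.
final class ReduceSketch
{
    enum Operator
    {
        ADD,
        SELECT_MIN,
        SELECT_MAX
    }

    private ReduceSketch() {}

    static OptionalLong reduce(OptionalLong first, OptionalLong second, Operator operator, boolean returnFirstNonEmpty)
    {
        if (first.isPresent() && second.isPresent()) {
            return OptionalLong.of(apply(first.getAsLong(), second.getAsLong(), operator));
        }
        if (returnFirstNonEmpty) {
            // A missing side may just mean "empty table", so keep what is known
            return first.isPresent() ? first : second;
        }
        // A missing side means the statistic was never computed
        return OptionalLong.empty();
    }

    private static long apply(long first, long second, Operator operator)
    {
        switch (operator) {
            case ADD:
                return first + second;
            case SELECT_MIN:
                return Math.min(first, second);
            case SELECT_MAX:
                return Math.max(first, second);
            default:
                throw new IllegalArgumentException("Unexpected operator: " + operator);
        }
    }
}
```

Under that reading, min/max pass true because an absent value can come from an empty table, while NDV passes false because an empty table would report 0 rather than nothing.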

findepi · 2018-07-20T12:42:06Z

presto-hive/src/main/java/com/facebook/presto/hive/util/Statistics.java

+                    reduce(first.get().getMin(), second.get().getMin(), SELECT_MIN, true),
+                    reduce(first.get().getMax(), second.get().getMax(), SELECT_MAX, true)));
+        }
+        return Optional.empty();


if second is not present, you return empty
if second is present but has all fields absent, you return first

Why? Leave some explanation in the code.

(here & a few times below)

If only one of these is present, possibly a schema migration took place, for example if the column was a VARCHAR but became an INTEGER. Later we may want to do a proper schema migration here. I haven't decided what it would look like, though.

So would something like this be proper?

// normally, either both or none is present

findepi · 2018-07-20T12:45:17Z

presto-hive/src/main/java/com/facebook/presto/hive/util/Statistics.java

+        if (first.isPresent() && second.isPresent()) {
+            return Optional.of(new BooleanStatistics(
+                    reduce(first.get().getTrueCount(), second.get().getTrueCount(), ADD, false),
+                    reduce(first.get().getFalseCount(), second.get().getFalseCount(), ADD, false)));


i am lost. we have ColumnStatisticType.NUMBER_OF_TRUE_VALUES but we don't have ColumnStatisticType.NUMBER_OF_FALSE_VALUES. Please explain

NUMBER_OF_FALSE_VALUES = NUMBER_OF_NON_NULL - NUMBER_OF_TRUE_VALUES

What we ask the engine to compute does not necessarily match the statistics we want to store.
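As a small worked example of that derivation: the engine computes the row count, the non-null count and the true count, and the stored false and null counts follow by subtraction. The class and method names here are hypothetical.

```java
import java.util.OptionalLong;

// Hypothetical helpers deriving stored statistics from computed ones.
final class DerivedBooleanStatistics
{
    private DerivedBooleanStatistics() {}

    // NUMBER_OF_FALSE_VALUES = NUMBER_OF_NON_NULL_VALUES - NUMBER_OF_TRUE_VALUES
    static OptionalLong falseCount(OptionalLong nonNullCount, OptionalLong trueCount)
    {
        if (!nonNullCount.isPresent() || !trueCount.isPresent()) {
            return OptionalLong.empty();
        }
        return OptionalLong.of(nonNullCount.getAsLong() - trueCount.getAsLong());
    }

    // Nulls count = row count - NUMBER_OF_NON_NULL_VALUES
    static OptionalLong nullsCount(OptionalLong rowCount, OptionalLong nonNullCount)
    {
        if (!rowCount.isPresent() || !nonNullCount.isPresent()) {
            return OptionalLong.empty();
        }
        return OptionalLong.of(rowCount.getAsLong() - nonNullCount.getAsLong());
    }
}
```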

findepi · 2018-07-20T12:48:05Z

presto-hive/src/main/java/com/facebook/presto/hive/util/Statistics.java

+    {
+        if (first.isPresent() && second.isPresent()) {
+            if (!(firstRowCount.isPresent() && secondRowCount.isPresent())) {
+                return OptionalDouble.empty();


if first is present, but second is not, we return first
if first is present, but second lacks row count, we return empty

Why? Leave some explanation in the code.

Changed to if (!firstRowCount.isPresent() || !secondRowCount.isPresent()) {. Hopefully it makes more sense.

findepi · 2018-07-20T12:51:00Z

presto-hive/src/main/java/com/facebook/presto/hive/util/Statistics.java

+            }
+            long totalRowCount = firstRowCount.getAsLong() + secondRowCount.getAsLong();
+            if (totalRowCount == 0) {
+                return OptionalDouble.empty();


OptionalDouble.of(0) ? In case we have ended up inserting zero rows into existing partition

If there are 0 rows, the average length is unknown: there are no rows. If we inserted values into an empty partition, it will go down the first.isPresent() ? first : second; path.
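Putting the branches from the last two threads together, a merge of two averages might look like the sketch below. The guards follow the diff hunks and replies quoted above; the row-count-weighted formula itself is an assumption, since the thread does not show that line, and the class name is hypothetical.

```java
import java.util.OptionalDouble;
import java.util.OptionalLong;

// Hypothetical sketch; the weighting formula is assumed, not quoted from the PR.
final class MergeAverageSketch
{
    private MergeAverageSketch() {}

    static OptionalDouble mergeAverage(OptionalDouble first, OptionalLong firstRowCount, OptionalDouble second, OptionalLong secondRowCount)
    {
        if (first.isPresent() && second.isPresent()) {
            // Without both row counts the averages cannot be weighted
            if (!firstRowCount.isPresent() || !secondRowCount.isPresent()) {
                return OptionalDouble.empty();
            }
            long totalRowCount = firstRowCount.getAsLong() + secondRowCount.getAsLong();
            // With no rows at all, the average is unknown rather than zero
            if (totalRowCount == 0) {
                return OptionalDouble.empty();
            }
            double weightedSum = first.getAsDouble() * firstRowCount.getAsLong()
                    + second.getAsDouble() * secondRowCount.getAsLong();
            return OptionalDouble.of(weightedSum / totalRowCount);
        }
        return first.isPresent() ? first : second;
    }
}
```

For example, an average of 2.0 over 10 rows merged with an average of 4.0 over 30 rows gives (20 + 120) / 40 = 3.5.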

findepi · 2018-07-20T13:09:53Z

presto-hive/src/main/java/com/facebook/presto/hive/util/Statistics.java

+
+    private static OptionalLong getIntegerValue(ConnectorSession session, Type type, Block block)
+    {
+        return block.isNull(0) ? OptionalLong.empty() : OptionalLong.of(((Number) type.getObjectValue(session, block, 0)).longValue());


I think I know why ((Number) type.getObjectValue(session, block, 0)).longValue() instead of type.getLong(block, 0) (because the second would work also for eg short decimal, skipping internal-representation-to-external conversion). While this method is called only for integral types, it's better as-is.
Consider leaving a comment hinting at the choice made
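A comment along those lines could be backed by an illustration like this one. It assumes a short decimal is stored as an unscaled long with a separate scale, and uses BigDecimal as a stand-in for the externally visible value; neither Presto's Type API nor its value classes appear here, and the class name is hypothetical.

```java
import java.math.BigDecimal;

// Hypothetical illustration of raw storage versus the logical value.
final class IntegerValueSketch
{
    private IntegerValueSketch() {}

    // What a raw long read would return: the unscaled digits
    static long rawLong(long unscaledValue)
    {
        return unscaledValue;
    }

    // What converting to the external value first would return
    static long viaObjectValue(long unscaledValue, int scale)
    {
        Number value = BigDecimal.valueOf(unscaledValue, scale);
        return value.longValue();
    }
}
```

For the decimal 12.34 stored as 1234 with scale 2, the raw read yields 1234 while the converted value yields 12, so going through the object value avoids silently treating internal digits as the number.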

findepi

"Add properties for column statistics collect"

findepi · 2018-07-20T13:12:03Z

presto-hive/src/main/java/com/facebook/presto/hive/HiveClientConfig.java

@@ -1044,4 +1046,18 @@ public boolean isTableStatisticsEnabled()
    {
        return tableStatisticsEnabled;
    }
+
+    @NotNull


remove, boolean cannot be null

findepi · 2018-07-20T13:12:41Z

presto-hive/src/main/java/com/facebook/presto/hive/HiveClientConfig.java

@@ -134,6 +134,8 @@

    private boolean tableStatisticsEnabled = true;

+    private boolean collectColumnStatisticsOnWrite;


i wouldn't keep the empty line before this one

findepi

"Collect column statistics on table write: Documentation"

findepi · 2018-07-20T13:18:14Z

presto-docs/src/main/sphinx/connector/hive.rst

+``VARBINARY`` ``NUMBER_OF_NULLS``, ``MAX_VALUE_SIZE_IN_BYTES``, ``AVERAGE_VALUE_SIZE_IN_BYTES``
+``DATE``      ``NUMBER_OF_NULLS``, ``MIN``, ``MAX``, ``NUMBER_OF_DISTINCT_VALUES``
+``TIMESTAMP`` ``NUMBER_OF_NULLS``, ``MIN``, ``MAX``, ``NUMBER_OF_DISTINCT_VALUES``
+``DECIMAL``   ``NUMBER_OF_NULLS``, ``MIN``, ``MAX``, ``NUMBER_OF_DISTINCT_VALUES``


nit: move after REAL

findepi · 2018-07-20T13:18:31Z

presto-docs/src/main/sphinx/connector/hive.rst

+``DOUBLE``    ``NUMBER_OF_NULLS``, ``MIN``, ``MAX``, ``NUMBER_OF_DISTINCT_VALUES``
+``REAL``      ``NUMBER_OF_NULLS``, ``MIN``, ``MAX``, ``NUMBER_OF_DISTINCT_VALUES``
+``BOOLEAN``   ``NUMBER_OF_NULLS``, ``NUMBER_OF_FALSE``, ``NUMBER_OF_TRUE``
+``VARCHAR``   ``NUMBER_OF_NULLS``, ``NUMBER_OF_DISTINCT_VALUES``, ``MAX_VALUE_SIZE_IN_BYTES``, ``AVERAGE_VALUE_SIZE_IN_BYTES``


MIN,MAX (if you support them in the code)

No, we do not collect MIN and MAX for Varchar

findepi · 2018-07-20T13:19:58Z

presto-docs/src/main/sphinx/connector/hive.rst

+============= ================================================================================================================
+
+Automatic column level statistics collection on write can be enabled using
+the ``hive.collect-column-statistics-on-write`` property.


I am worried we may forget to update this line when changing the default of this property.
What about different wording, relying on the fact that the property's default is already in the table above?

Automatic column level statistics collection on write is controlled by ``hive.collect-column-statistics-on-write`` property.

findepi · 2018-07-20T18:47:16Z

presto-hive/src/main/java/com/facebook/presto/hive/util/Statistics.java

@@ -148,6 +470,8 @@ private static long convertLocalToUtc(DateTimeZone timeZone, long value)
    {
        ADD,
        SUBTRACT,
+        SELECT_MIN,
+        SELECT_MAX,


nit: why not just "MIN", "MAX"

So i can static-import it. MIN and MAX are used by the values from the ColumnStatisticType

Going to rename it back to MIN and MAX after i rename values in ColumnStatisticType

arhimondr · 2018-07-20T19:55:27Z

@rschlussel2 @findepi @kokosing Comments addressed

arhimondr · 2018-07-23T18:16:22Z

@findepi @kokosing @rschlussel @electrum

Heads up. I'm going to remove support for the MAX_VALUE_SIZE_IN_BYTES and AVERAGE_VALUE_SIZE_IN_BYTES statistics collection.

The reasons are as follows:

Collecting these statistics is inefficient. It requires an additional projection.
MAX_VALUE_SIZE_IN_BYTES is not used by the optimizer.
AVERAGE_VALUE_SIZE_IN_BYTES: our version of the Metastore stores this statistic as IN_MEMORY_DATA_SIZE. Computing the average doesn't make much sense, as it is easier to compute the total size and then divide it by the number of rows.
It is not possible to compute AVERAGE_VALUE_SIZE_IN_BYTES for a CHAR column with the existing aggregation functions.

I'm going to create a separate PR that:

Introduces in_memory_data_size aggregation function for VARCHAR, VARBINARY, CHAR and complex types (MAP, ARRAY, ROW)
Uses this function to compute IN_MEMORY_DATA_SIZE statistic
Removes maxColumnLength from the HiveColumnStatistics
Replaces averageColumnLength with inMemoryDataSizeInBytes
Does the inMemoryDataSizeInBytes to averageColumnLength conversion in the ThriftMetastoreUtil
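The last step of the plan, converting a total size back into an average, could look roughly like the sketch below. The class and method names are hypothetical, and which row count is the right divisor (all rows or only non-null rows) is an assumption left open here; this is not the code in ThriftMetastoreUtil.

```java
import java.util.OptionalDouble;
import java.util.OptionalLong;

final class DataSizeConversion
{
    private DataSizeConversion() {}

    // Hypothetical conversion: average column length = total in-memory size / row count
    static OptionalDouble toAverageColumnLength(OptionalLong inMemoryDataSizeInBytes, OptionalLong rowCount)
    {
        // The average is unknown if either input is missing or there are no rows to divide by
        if (!inMemoryDataSizeInBytes.isPresent() || !rowCount.isPresent() || rowCount.getAsLong() <= 0) {
            return OptionalDouble.empty();
        }
        return OptionalDouble.of((double) inMemoryDataSizeInBytes.getAsLong() / rowCount.getAsLong());
    }
}
```

Keeping the total as the stored value means partial results can be merged by simple addition, and the division happens only once at the metastore boundary.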

findepi · 2018-07-24T09:10:16Z

@arhimondr how does this play with #11107?
i understand, after all these PRs (current, #11107 and the one you're planning), we need to ensure interoperability:

we should be able to read statistics computed by others (eg by Hive)
we should be able to compute stats for ourselves
we should be able to compute stats readable by other tools as well
all this for a bunch of Hive Metastore versions

electrum · 2018-07-26T06:59:23Z

presto-main/src/main/java/com/facebook/presto/util/AutoCloseableCloser.java

+        }
+        if (rootCause != null) {
+            throwIfUnchecked(rootCause);
+            throwIfInstanceOf(rootCause, Exception.class);


Almost all usages of throwIfInstanceOf are alone. The few places that it is combined with throwIfUnchecked should be converted to propagateIfPossible since that is more succinct.
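For readers without Guava at hand, the rethrow pattern under discussion can be sketched with the JDK alone. `CloserSketch` and its local `propagateIfPossible` are illustrative stand-ins written for this example (the helper imitates the behavior of Guava's `Throwables.propagateIfPossible(Throwable, Class)`); this is not the AutoCloseableCloser from the PR.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class CloserSketch
        implements AutoCloseable
{
    private final Deque<AutoCloseable> closeables = new ArrayDeque<>();

    <T extends AutoCloseable> T register(T closeable)
    {
        // Push to the front so that resources are closed in reverse registration order
        closeables.addFirst(closeable);
        return closeable;
    }

    @Override
    public void close()
            throws Exception
    {
        Throwable rootCause = null;
        // Removing each entry as it is closed means a second close() call closes nothing twice
        while (!closeables.isEmpty()) {
            AutoCloseable closeable = closeables.removeFirst();
            try {
                closeable.close();
            }
            catch (Throwable t) {
                if (rootCause == null) {
                    rootCause = t;
                }
                else if (rootCause != t) {
                    rootCause.addSuppressed(t);
                }
            }
        }
        if (rootCause != null) {
            // One call replaces the throwIfUnchecked + throwIfInstanceOf pair
            propagateIfPossible(rootCause, Exception.class);
            // Reached only for a Throwable that is neither an Exception nor an Error
            throw new AssertionError(rootCause);
        }
    }

    // JDK-only stand-in: rethrows the declared type, RuntimeException, or Error; otherwise returns
    static <X extends Throwable> void propagateIfPossible(Throwable throwable, Class<X> declaredType)
            throws X
    {
        if (declaredType.isInstance(throwable)) {
            throw declaredType.cast(throwable);
        }
        if (throwable instanceof RuntimeException) {
            throw (RuntimeException) throwable;
        }
        if (throwable instanceof Error) {
            throw (Error) throwable;
        }
    }
}
```

The first failure becomes the root cause and later failures are attached as suppressed exceptions, so nothing thrown during close is lost.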

electrum · 2018-07-26T07:04:14Z

presto-main/src/test/java/com/facebook/presto/util/TestAutoCloseableCloser.java

+        {
+            closed = true;
+            if (failure != null) {
+                throwIfUnchecked(failure);


propagateIfPossible(failure, Exception.class);

electrum · 2018-07-26T07:08:51Z

presto-spi/src/main/java/com/facebook/presto/spi/connector/ConnectorMetadata.java

@@ -275,6 +277,14 @@ default void dropColumn(ConnectorSession session, ConnectorTableHandle tableHand
        return Optional.of(new ConnectorNewTableLayout(partitioningHandle, partitionColumns));
    }

+    /**
+     * Describes statistics that must be collected


Nit: end Javadoc sentences in a period

Let's make this say "during a write" for now. We can update to "during a write or analyze" later when that is implemented.

electrum · 2018-07-26T07:15:45Z

presto-spi/src/main/java/com/facebook/presto/spi/statistics/ComputedStatistics.java

+{
+    private final List<String> groupingColumns;
+    private final List<Block> groupingValues;
+    private final Map<TableStatisticType, Block> tableStatistics;


Are there any concerns about memory size for Block? We have to store all of these in memory at once in the coordinator. (I'm not suggesting this is a problem, but rather asking if you've considered it or done any back-of-the-napkin calculations)

electrum · 2018-07-26T07:20:28Z

presto-hive/src/main/java/com/facebook/presto/hive/metastore/thrift/ThriftMetastoreUtil.java

+        }
+        if (isNumericType(type) || type.equals(DATE) || type.equals(TIMESTAMP)) {
+            // TODO #7122 support non-legacy TIMESTAMP
+            return ImmutableSet.of(NUMBER_OF_NON_NULL_VALUES, MIN, MAX, NUMBER_OF_DISTINCT_VALUES);


Nit: put min/max at the start or end, so that the NUMBER_* stats are together

I'm wondering if MAX should be MAX_VALUE. That might be more consistent with MAX_VALUE_SIZE_IN_BYTES.

Yeah. That way there is less chance that it clashes on static import. Let me change that.

electrum · 2018-07-26T07:24:22Z

presto-hive/src/main/java/com/facebook/presto/hive/metastore/thrift/ThriftMetastoreUtil.java

+
+    private static boolean isNumericType(Type type)
+    {
+        return type.equals(BIGINT) || type.equals(INTEGER) || type.equals(SMALLINT) || type.equals(TINYINT) || type.equals(DOUBLE) || type.equals(REAL) || type instanceof DecimalType;


Maybe wrap this to split up the different kinds of types

return type.equals(BIGINT) || type.equals(INTEGER) || type.equals(SMALLINT) || type.equals(TINYINT) ||
        type.equals(DOUBLE) || type.equals(REAL) ||
        type instanceof DecimalType;

electrum · 2018-07-26T07:30:44Z

presto-docs/src/main/sphinx/connector/hive.rst

@@ -111,9 +111,9 @@ security options in the Hive connector.
 Hive Configuration Properties
 -----------------------------

-================================================== ============================================================ ==========
+================================================== ============================================================ =============================


We don't need to widen the table now that the default is "false"

Right. I'm still going to widen it by 2 characters to cover RCBINARY.

electrum · 2018-07-26T07:35:03Z

presto-docs/src/main/sphinx/connector/hive.rst

+============= ================================================================================================================
+Column Type                                            Collectible Statistics
+============= ================================================================================================================
+``TINYINT``   ``NUMBER_OF_NULLS``, ``MIN``, ``MAX``, ``NUMBER_OF_DISTINCT_VALUES``


Should we use words here? This is for humans, and AFAIK, these constants like NUMBER_OF_NULLS are an internal detail of Presto.

Also, let's put min/max last, rather than in the middle of the "number of" stats.

``TINYINT``   number of nulls, number of distinct values, min/max values
``BOOLEAN``   number of nulls, number of true/false values

electrum · 2018-07-26T07:36:52Z

presto-docs/src/main/sphinx/connector/hive.rst

+============= ================================================================================================================
+
+Automatic column level statistics collection on write is controlled by
+``hive.collect-column-statistics-on-write`` property.


Automatic column level statistics collection on write is controlled by the ``collect-column-statistics-on-write`` catalog session property.

electrum · 2018-07-26T07:38:32Z

presto-docs/src/main/sphinx/connector/hive.rst

+The Hive connector can also collect column level statistics:
+
+============= ================================================================================================================
+Column Type                                            Collectible Statistics


Let's be consistent with other tables and put the "Collectible Statistics" at the start of the cell.

============= ====================================
Column Type   Collectible Statistics

findepi · 2018-07-27T07:08:50Z

Travis failure looks related:

2018-07-27 02:42:50 INFO: FAILURE     /    com.facebook.presto.tests.hive.TestTablePartitioningInsertInto.selectFromPartitionedNation (Groups: hive_connector, smoke) took 11.0 seconds
2018-07-27 02:42:50 SEVERE: Failure cause:
java.lang.AssertionError: 
Expecting:
 <0L>
to be equal to:
 <10L>
but was not.
	at com.facebook.presto.tests.hive.TestTablePartitioningInsertInto.testQuerySplitsNumber(TestTablePartitioningInsertInto.java:84)
	at com.facebook.presto.tests.hive.TestTablePartitioningInsertInto.selectFromPartitionedNation(TestTablePartitioningInsertInto.java:66)

arhimondr · 2018-07-27T15:14:59Z

Travis failure looks related:

It seems to be intermittent. I see no reason why this patch should've affected this test in any way.

findepi · 2018-07-27T17:26:01Z

hm...did you observe it failing intermittently on master as well?

arhimondr · 2018-07-27T17:44:11Z

hm...did you observe it failing intermittently on master as well?

At least once. I didn't look for more.

https://api.travis-ci.org/v3/job/400752461/log.txt (https://travis-ci.org/prestodb/presto/jobs/400752461, https://travis-ci.org/prestodb/presto/builds/400752451)

wenleix · 2018-08-07T19:57:56Z

presto-spi/src/main/java/com/facebook/presto/spi/statistics/ComputedStatistics.java

+            Map<TableStatisticType, Block> tableStatistics,
+            Map<ColumnStatisticMetadata, Block> columnStatistics)
+    {
+        this.groupingColumns = unmodifiableList(new ArrayList<>(requireNonNull(groupingColumns, "groupingColumns is null")));


Is there any reason not to use ImmutableList.copyOf()?

ImmutableList is a Guava class. We don't have Guava in the classpath of the presto-spi module.
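The Guava-free idiom from the constructor above can be shown in isolation. `GroupingColumns` below is a hypothetical holder class written for this example; only the `unmodifiableList(new ArrayList<>(requireNonNull(...)))` expression mirrors the PR.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

import static java.util.Objects.requireNonNull;

final class GroupingColumns
{
    private final List<String> groupingColumns;

    GroupingColumns(List<String> groupingColumns)
    {
        // Copy first so later changes to the caller's list are not visible here,
        // then wrap so callers of the getter cannot modify the copy
        this.groupingColumns = Collections.unmodifiableList(
                new ArrayList<>(requireNonNull(groupingColumns, "groupingColumns is null")));
    }

    List<String> getGroupingColumns()
    {
        return groupingColumns;
    }
}
```

Without the inner copy, `unmodifiableList` alone would only be a read-only view over a list the caller can still mutate.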

arhimondr requested review from electrum, findepi, kokosing and rschlussel July 16, 2018 02:43

facebook-github-bot added the CLA Signed label Jul 16, 2018

arhimondr mentioned this pull request Jul 16, 2018

Collect column statistics on write #10617

Closed

arhimondr force-pushed the column-hive-stats-v2 branch from 9fb8caa to 6912593 Compare July 16, 2018 02:49

arhimondr force-pushed the column-hive-stats-v2 branch 5 times, most recently from a88391a to c417eb6 Compare July 17, 2018 05:39

arhimondr changed the title ~~[WIP] Collect column statistics on write [v2]~~ Collect column statistics on write [v2] Jul 17, 2018

arhimondr force-pushed the column-hive-stats-v2 branch from c417eb6 to 996fe84 Compare July 17, 2018 19:32

arhimondr force-pushed the column-hive-stats-v2 branch from 996fe84 to bca8699 Compare July 17, 2018 21:18

findepi reviewed Jul 18, 2018

rschlussel reviewed Jul 18, 2018

kokosing reviewed Jul 19, 2018

findepi reviewed Jul 20, 2018

prestodb deleted a comment from findepi Jul 20, 2018

arhimondr force-pushed the column-hive-stats-v2 branch 2 times, most recently from 02e45bd to 67fc97a Compare July 20, 2018 19:53

rschlussel approved these changes Jul 23, 2018

arhimondr force-pushed the column-hive-stats-v2 branch from 76c4c54 to 2ef0a98 Compare July 23, 2018 18:51

arhimondr force-pushed the column-hive-stats-v2 branch 3 times, most recently from 754ae7f to ede8d78 Compare July 25, 2018 19:29

electrum approved these changes Jul 26, 2018

arhimondr force-pushed the column-hive-stats-v2 branch from ede8d78 to f02ae29 Compare July 26, 2018 20:27

arhimondr force-pushed the column-hive-stats-v2 branch from f02ae29 to 81e4b3c Compare July 27, 2018 14:10

arhimondr mentioned this pull request Jul 27, 2018

Add internal aggregations over estimated in-memory data size for stats #11150

Merged

arhimondr and others added 11 commits August 2, 2018 08:54

Implement AutoCloseableCloser

05fbe72

Collect column statistics on table write: SPI

6df8ff2

Collect column statistics on table write: Planner

561d60c

Collect column statistics on table write: Execution

bfe113c

Replace supportsColumnStatistics with getSupportedColumnStatistics

123f735

Different metastores may support slightly different column statistics

Collect column statistics on table write: Hive Connector

03e2eb1

Add properties for column statistics collect

eb6a32c

connector/session/table properties

Collect column statistics on table write: Product Tests

3494048

Add information about collected statistcs to explain output

c267561

Sample output: https://gist.github.com/arhimondr/f32fca68e84ff098f67c84480f148d72

Collect column statistics on table write: Documentation

9dcf44d

Collect column statistics on table write: Smoke Tests

b6158fb

arhimondr force-pushed the column-hive-stats-v2 branch from 81e4b3c to b6158fb Compare August 2, 2018 12:57

arhimondr merged commit b6158fb into prestodb:master Aug 2, 2018

arhimondr deleted the column-hive-stats-v2 branch August 2, 2018 12:59

wenleix reviewed Aug 7, 2018
