Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Collect column statistics on write [v2] #11054

Merged
merged 11 commits into from
Aug 2, 2018

Conversation

arhimondr
Copy link
Member

@arhimondr arhimondr commented Jul 16, 2018

Important changes since the version 1:

  • Commits from Return HiveColumnStatistics from the HiveMetastore interface to Move createPartitionValues method to a utility class were extracted into a separate PR and merged: Preliminary column statistics refactorings #10972
  • Added Implement AutoCloseableCloser commit.
  • SPI bits extracted to Collect column statistics on table write: SPI
  • getInsertStatisticsMetadata and getNewTableStatisticsMetadata method were merged into the single getStatisticsCollectionMetadata method
  • AggregationOperator integration with the TableWriterOperator has changed
  • ExtendedHiveMetastore#isColumnStatistitcsSupported() replaced with the ExtendedHiveMetastore#getSupportedColumnStatistics. That eliminates a need of CollectibleStatisticsProvider. Commit: Replace supportsColumnStatistics with getSupportedColumnStatistics
  • ENABLED_FOR_MARKED_TABLES option has been removed as well as a related table property.
  • Migrate column statistics on drop and rename column commit has been dropped
  • Added Collect column statistics on table write: Smoke Tests commit

@arhimondr
Copy link
Member Author

@electrum I'm still working on the integration test

@arhimondr arhimondr force-pushed the column-hive-stats-v2 branch 5 times, most recently from a88391a to c417eb6 Compare July 17, 2018 05:39
@arhimondr arhimondr changed the title [WIP] Collect column statistics on write [v2] Collect column statistics on write [v2] Jul 17, 2018
@arhimondr arhimondr force-pushed the column-hive-stats-v2 branch from c417eb6 to 996fe84 Compare July 17, 2018 19:32
@arhimondr
Copy link
Member Author

Please ignore the Fix memory leaks in tests from the presto-tests module commit, it is part of the #11062 PR

@arhimondr arhimondr force-pushed the column-hive-stats-v2 branch from 996fe84 to bca8699 Compare July 17, 2018 21:18
return new AutoCloseableCloser();
}

public void register(AutoCloseable closeable)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

public <T extends AutoCloseable> T register(T closeable)

throws Exception
{
Throwable rootCause = null;
for (AutoCloseable closeable : closeables) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Guava Closer (which this class mimics) closes in reverse order.
Also, it doesn't close the underlying resource twice, even if close is called twice (eg concurrently)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked that. First i didn't get it, but now i think i got it.

Given the example

stream = closer.register(new OutputStream)
writer = closer.register(new OutputWriter(stream))

it makes total sense

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

even if close is called twice (eg concurrently)

The Closer is not thread safe. It is hard to reason what is going on when using it concurrently.

if (rootCause != null) {
throwIfUnchecked(rootCause);
throwIfInstanceOf(rootCause, Exception.class);
throw new Exception(rootCause);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we reach here, throw new Error or AssertionError would be more appropriate.. since this is unreachable

}
if (rootCause != null) {
throwIfUnchecked(rootCause);
throwIfInstanceOf(rootCause, Exception.class);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replace those 2 with propagateIfPossible(rootCause, Exception.class)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use throwIfInstanceOf across the code

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost all usages of throwIfInstanceOf are alone. The few places that it is combined with throwIfUnchecked should be converted to propagateIfPossible since that is more succinct.


Optional<NewTableLayout> getInsertLayout(Session session, TableHandle target);

/**
* Describes statistics that must be collected for a new table
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only new tables? I think also updated ones

@@ -264,6 +266,14 @@ default void dropColumn(ConnectorSession session, ConnectorTableHandle tableHand
return Optional.of(new ConnectorNewTableLayout(partitioningHandle, partitionColumns));
}

/**
* Describes statistics that must be collected for a new table.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again -- "new"?

return groupingColumns;
}

public List<Block> getGropingValues()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: groping
(field & ctor & builder too)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto

public ComputedStatistics build()
{
return new ComputedStatistics(
unmodifiableList(groupingColumns),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you mean new ArrayList<>(groupingColumns)?
Otherwise this is redundant (ctor does that) and allows creating mutable ComputedStatistics objects

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the unmodifiable* at all here. Since those are in the constructor.

public Builder(List<String> groupingColumns, List<Block> gropingValues)
{
this.groupingColumns = requireNonNull(groupingColumns, "groupingColumns is null");
this.gropingValues = requireNonNull(gropingValues, "gropingValues is null");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

convert ctor to builder methods withGroupingColumns (be sure to make defensive copy there)

Copy link
Member Author

@arhimondr arhimondr Jul 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This field are required. Why do you want to make them builder methods? Also defensive copy is made in the constructor. Why make it twice?

Map<TableStatisticType, Block> tableStatistics,
Map<ColumnStatisticMetadata, Block> columnStatistics)
{
this.groupingColumns = unmodifiableList(requireNonNull(groupingColumns, "groupingColumns is null"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unmodifiableList( new ArrayList<>( requireNonNull( ....

Copy link
Member Author

@arhimondr arhimondr Jul 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I messed up during the rebase. That was applied in the commit later. Moving it here.

Copy link
Contributor

@findepi findepi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Collect column statistics on table write: Planner" -- skimmed only. I refuse to understand SymbolMapper for now

List<Symbol> commitOutputs = ImmutableList.of(symbolAllocator.newSymbol("rows", BIGINT));

if (!statisticsMetadata.isEmpty()) {
verify(columnNames.size() == symbols.size(), "columnNames.size() != symbols.size(): %s != %s", columnNames.size(), symbols.size());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: in case this fails some day, including full collections in the message might be helpful (perhaps even instead of the sizes):

verify(columnNames.size() == symbols.size(), "columnNames.size() != symbols.size(): %s and %s", columnNames, symbols);

}

FunctionRegistry functionRegistry = metadata.getFunctionRegistry();
if (!statisticsMetadata.getTableStatistics().isEmpty()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should merge this if into the for loop above:

for (TableStatisticType type : statisticsMetadata.getTableStatistics()) {
    if (type != ROW_COUNT) {
       ...
    }

    // plan the row count agg
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or:

for (TableStatisticType type : statisticsMetadata.getTableStatistics()) {
    if (type == ROW_COUNT) {
       // plan the row count agg.
    }
   else {
       // fail
   }
}

ColumnStatisticType statisticType = columnStatisticMetadata.getStatisticType();
Symbol inputSymbol = columnToSymbolMap.get(columnName);
verify(inputSymbol != null, "inputSymbol is null");
Type inputType = requireNonNull(symbolAllocator.getTypes().get(inputSymbol), "inputType is null");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

line above uses verify, this one uses requireNonNull for checking nullity, but i see no reason for it

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replaced with a verify. Just in case the type for the symbol is not there.

{
switch (statisticType) {
case MIN: {
checkArgument(inputType.isOrderable(), "Input type is not orderable: %s", inputType);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what would happen if you removed this check (and the type wasn't orderable)?
would we get some useful enough exception during createAggregation in next line?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I though that the explicit message might be more clear (in case someone ever wanted to compute MIN/MAX statistics for the type that is not orderable).

case MAX: {
checkArgument(inputType.isOrderable(), "Input type is not orderable: %s", inputType);
return createAggregation(QualifiedName.of("max"), input.toSymbolReference(), inputType, inputType);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: these {, } are redundant in most of the cases and affect readability. IMHO it's better without them, but no strong opinion here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I usually go without { } if the switch case blocks are one-liners. I like them for multiliners though.

Copy link
Contributor

@findepi findepi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Collect column statistics on table write: Execution"

return false;
}
// AggregationOperator doesn't return false unless it is finished.
// HashAggregationOperator doesn't return false unless it is full, that is not the option here
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unless it is full

or there is some unfinishedWork

}
// AggregationOperator doesn't return false unless it is finished.
// HashAggregationOperator doesn't return false unless it is full, that is not the option here
// The assumption is that the spill is always disabled for the statistics aggregation, ans
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: ans


Block[] blocks = new Block[page.getChannelCount()];
for (int channel = 0; channel < page.getChannelCount(); channel++) {
blocks[channel] = page.getBlock(channel).copyPositions(selectedPositions, 0, statisticsPositionCount);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

com.facebook.presto.spi.Page#getPositions ?
else, leave a comment why not using this

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy just seems to be safer. Because it is copy. There will be not that much data to copy. And in most of the cases it will return the page unaltered. But it can be get as well. I don't have a strong opinion here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well -- we sometimes return the whole page (few lines earlier), so there cannot be assumption that you do a copy.

(no strong opinion either, just opportunity to simplify the code)

}

@Override
public Page getOutput()
{
if (!isBlocked().isDone()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: move after if (state != State.FINISHING) { .. }

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In most of the implementation isDone is a simple flag check. But i don't mind changing.

{
ImmutableList.Builder<ComputedStatistics> statistics = ImmutableList.builder();
while (!statisticsAggregation.isFinished()) {
Page page = statisticsAggregation.getOutput();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

verify(statisticsAggregation.isBlocked().isDone()); to fail fast rather than voidly looping in case something goes wrong

}
// Please read the comment in the TableFinishOperator#needsInput method
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They may go out of sync. Why not make a defensive copy?

blocked = NOT_BLOCKED;
}
else {
blocked = allAsList(blockedOnAggregation, blockedOnWrite);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove if, else's code is generic enough.

if you don't want to go through com.google.common.util.concurrent.Futures#allAsList(com.google.common.util.concurrent.ListenableFuture<? extends V>...) in all-done case, create a helper method that does the if

@@ -210,18 +240,53 @@ public void addInput(Page page)
@Override
public Page getOutput()
{
if (state != State.FINISHING || !blocked.isDone()) {
if (!blocked.isDone() || state != State.FINISHING) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

check state first (cheaper first)

while (!statisticsAggregation.isFinished()) {
Page page = statisticsAggregation.getOutput();
if (page == null) {
continue;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you should yield here (return). Operator shouldn't do a ton of work within single call, otherwise a query might be "unkillable"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IRL we cannot get here more data than a single page. We group the statistics on per-partition basis. And we never insert more than 100 partitions at once by default. But even if we inserted 100_000 - there still will be not enough data for more than a several pages. But than you would definitely have more severe problems. So for the sake of code simplicity in this class (which is already complex), i would go with what we have now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fine, but please add a comment that we deliberately not yield here

{
AutoCloseableCloser closer = AutoCloseableCloser.create();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wrap in try-with-r

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is unnecessary. We don't expect the register() methods to fail.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea... it's also customary. just a matter a taste

*/
package com.facebook.presto.spi.statistics;

public enum ColumnStatisticType
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should these match what's in the ColumnStatistics spi? I think it would make sense for the set of stats we support reading and writing to be the same. If so, there's no max/average_value_size_in_bytes, but there is a total_size_in_bytes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not necessarily. SPI is a common denominator, something we ingest. Here we have superset of all possible stats external systems can store.
SPI says want we want for CBO.
Here we say what hive need (or other programs using metastore may need, including ourselves)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rschlussel There is no exact match. NUMBER_OF_TRUE_VALUES is something boolean specific, NUMBER_OF_NON_NULL_VALUES is chosen over NUMBER_OF_NULL_VALUES just because it is easier to compute (no extra projection), and so on.

}
// AggregationOperator doesn't return false unless it is finished.
// HashAggregationOperator doesn't return false unless it is full, that is not the option here
// The assumption is that the spill is always disabled for the statistics aggregation, ans
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/ans/and

return Optional.of(new Page(statisticsPositionCount, blocks));
}

private static boolean isStatisticsPosition(Page page, int position)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a comment somewhere explaining the multiplexing that you explained to me in person?

public static final List<Type> TYPES = ImmutableList.of(BIGINT, VARBINARY);
public static final int ROW_COUNT_CHANNEL = 0;
public static final int FRAGMENT_CHANNEL = 1;
private static final int WRITER_CHANNELS = 2;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are the stats channels, right? Could you call them STATISTICS_CHANNELS

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a total count of WRITER_CHANNELS.

{
this.typeManager = requireNonNull(typeManager, "typeManager is null");
this.metastore = requireNonNull(metastore, "metastore is null");
this.timeZone = timeZone;
this.timeZone = requireNonNull(timeZone, "timeZone is null");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you didn't actually touch this file, right? just reformatted/added this null check. You could extract and merge this separately.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No idea why is this change here, and why would i even tough this class here. I extracted it to a separate commit.

HiveColumnStatistics.Builder result = HiveColumnStatistics.builder();

// MIN MAX
if (computedStatistics.containsKey(MIN) && computedStatistics.containsKey(MAX)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this need to be both or neither? Sort of makes sense that they'd both be required fields, but it's not totally useless to have only one. E.g. if min for x was 10 and you had a predicate where x < 8, you'd know that nothing matches.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We always compute MIN/MAX in pairs. Checking this in a single if for simplicity. Will add a assertion though.

@Test
public void testCollectColumnStatisticsOnWriteSwitches()
{
assertCollectColumnStatisticsOnWrite(false);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we normally have tests that our session properties are actually gating the things we think they do? The test is fine, but curious why for this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before that there was more complicated logic involving both, table and session property. Since now the logic is trivial i will just remove this test.

@LiteralParameters("x")
@ScalarOperator(XX_HASH_64)
@SqlType(StandardTypes.BIGINT)
public static long xxHash64(@SqlType("char(x)") Slice slice)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any tests?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was needed for the approx_distinct.

Actually the equals semantics for CHAR in Presto are not correct. Because of the binary comparison, the EQUALS operator would return false for abc and abc. That will result into storing wrong statistics for the CHAR type. For now i'm not going to collect the NDV statistic for the CHAR type, untils the CHAR semantics are fixed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UPD: It is not possible to save CHAR statistics without setting NDV. Going to go with the return XxHash64.hash(slice); implementation. But potentially it may return higher number of distinct values.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually the equals semantics for CHAR in Presto are not correct. Because of the binary comparison, the EQUALS operator would return false for abc⎵⎵ and abc.

as noted #11101 (comment), internal repr is normalized, with trailing spaces removed (wheever there is a trailing space, this is a bug)

private final Map<TableStatisticType, Block> tableStatistics;
private final Map<ColumnStatisticMetadata, Block> columnStatistics;

public ComputedStatistics(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should not be this private if you provide a builder

private final Map<TableStatisticType, Block> tableStatistics = new HashMap<>();
private final Map<ColumnStatisticMetadata, Block> columnStatistics = new HashMap<>();

public Builder(List<String> groupingColumns, List<Block> gropingValues)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

private?

ImmutableList.Builder<Symbol> writerOutputSymbols = ImmutableList.builder();
writerOutputSymbols.addAll(writerOutputs);
writerOutputSymbols.addAll(partialAggregation.getGroupingSymbols());
writerOutputSymbols.addAll(partialAggregation.getAggregations().keySet());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please chain this:

List<..> outputs = builder().add().add().build()

// by the partial aggregation from all of the writer nodes
StatisticAggregations partialAggregation = aggregations.getPartialAggregation();
ImmutableList.Builder<Symbol> writerOutputSymbols = ImmutableList.builder();
writerOutputSymbols.addAll(writerOutputs);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

inline writerOutputs

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is used in multiple places (:405)

}

FunctionRegistry functionRegistry = metadata.getFunctionRegistry();
if (!statisticsMetadata.getTableStatistics().isEmpty()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or:

for (TableStatisticType type : statisticsMetadata.getTableStatistics()) {
    if (type == ROW_COUNT) {
       // plan the row count agg.
    }
   else {
       // fail
   }
}

node.getStatisticsAggregationDescriptor().map(descriptor -> descriptor.map(this::map)));
}

private PartitioningScheme canonicalizePartitionFunctionBinding(PartitioningScheme scheme, PlanNode source)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

canonicalizePartitioningScheme? Or maybe even just canonicalize?

private PartitioningScheme canonicalizePartitionFunctionBinding(PartitioningScheme scheme, PlanNode source)
{
Set<Symbol> addedOutputs = new HashSet<>();
ImmutableList.Builder<Symbol> outputs = ImmutableList.builder();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should not you use mapAndDistinct here?

@@ -79,17 +94,24 @@ public OperatorFactory duplicate()

private final OperatorContext operatorContext;
private final TableFinisher tableFinisher;
private final Operator statisticsAggregation;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

statisticsAggregationOperator?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't want to make it oververbose, but it looks like it indeed decreases readablity.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed statisticsAggregationOperatorFactory as well

@Override
public void testUpdateTableColumnStatistics()
{
// column statistics are not supported by Glue
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you throw SkipException here and elsewhere?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it isn't our convention to do so

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not, but I think it should be. Telling people that test is passing while it is not possible for test to pass might be a bit misleading. I hope raising SkipException might have a better developer experience.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is provided that someone reads the list of passing tests. I don't.
I do read lists of failing tests only. And Jenkins's testng plugin lists all the skipped tests, which isn't entirely interesting when the test is skipped because the functionality simply does not exist

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usually we throw SkipException if there is no easy way of disabling a test in any other way. (throwing it from somewhere deep inside the mock objects)

Copy link
Contributor

@findepi findepi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Collect column statistics on table write: Hive Connector"

return ImmutableSet.of(NUMBER_OF_NON_NULL_VALUES, MIN, MAX, NUMBER_OF_DISTINCT_VALUES);
}
if (isVarcharType(type)) {
return ImmutableSet.of(NUMBER_OF_NON_NULL_VALUES, NUMBER_OF_DISTINCT_VALUES, MAX_VALUE_SIZE_IN_BYTES, AVERAGE_VALUE_SIZE_IN_BYTES);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hive stores MIN,MAX as well -- we don't use this currently, but we may in the future. Also, other tools may make use of this.
Consider // TODO ...

return ImmutableSet.of(NUMBER_OF_NON_NULL_VALUES, NUMBER_OF_TRUE_VALUES);
}
if (isNumericType(type) || type.equals(DATE) || type.equals(TIMESTAMP)) {
return ImmutableSet.of(NUMBER_OF_NON_NULL_VALUES, MIN, MAX, NUMBER_OF_DISTINCT_VALUES);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add // TODO #7122 support non-legacy TIMESTAMP

if (type.equals(VARBINARY)) {
return ImmutableSet.of(NUMBER_OF_NON_NULL_VALUES, MAX_VALUE_SIZE_IN_BYTES, AVERAGE_VALUE_SIZE_IN_BYTES);
}
return ImmutableSet.of();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are types that are not here because they are not supported by Hive connector. eg. TIME_WITH_TIME_ZONE, TIMESTAMP_WITH_TIME_ZONE.
I think it would be better to end this method with

// Throwing here to make sure this method is updated when a new type is added in Hive connector
throw new IllegalArgumentException("Unsupported type: " + type);

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are many types that are not here. For example ARRAY/MAP/ROW. If i throw exception here i would need to check if the type is supported above.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i still would prefer a whitelist approach. Are there types to add other than array/map/row?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. Yeah, maybe you are right, maybe we should return ImmutableList.empty() for that types, and throw an exception otherwise.

return ImmutableSet.of(NUMBER_OF_NON_NULL_VALUES, NUMBER_OF_DISTINCT_VALUES, MAX_VALUE_SIZE_IN_BYTES, AVERAGE_VALUE_SIZE_IN_BYTES);
}
if (isCharType(type)) {
return ImmutableSet.of(NUMBER_OF_NON_NULL_VALUES, NUMBER_OF_DISTINCT_VALUES);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MIN,MAX as well?

Isn't AVERAGE_VALUE_SIZE_IN_BYTES interesting? (with trailing spaces trimmed). Add a TODO

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't AVERAGE_VALUE_SIZE_IN_BYTES interesting?

It is not obvious how to compute it for CHAR. Probably some custom function will be needed. I decided to skip it for now. In the optimizer we can use the length from the type itself. Usually the deviation is not major (or otherwise there is less sense of using CHAR)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a TODO note?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't forget to add this TODO about *_VALUE_SIZE_IN_BYTES

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't forget to add this TODO about *_VALUE_SIZE_IN_BYTES

I'm going to introduce it in a very next PR

reduce(first.getMaxColumnLength(), second.getMaxColumnLength(), SELECT_MAX, true),
mergeAverage(first.getAverageColumnLength(), firstRowCount, second.getAverageColumnLength(), secondRowCount),
reduce(first.getNullsCount(), second.getNullsCount(), ADD, false),
reduce(first.getDistinctValuesCount(), second.getDistinctValuesCount(), SELECT_MAX, false));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why false? (generally MAX seems suitable and is used with returnFirstNonEmpty=true)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

null can indicate 2 thins.

  • statistic is missing (was never computed)
  • In case of min/max - that the table is empty

If the table is empty NDV is not gonna be null, but 0.

reduce(first.get().getMin(), second.get().getMin(), SELECT_MIN, true),
reduce(first.get().getMax(), second.get().getMax(), SELECT_MAX, true)));
}
return Optional.empty();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if second is not present, you return empty
if second is present but has all fields absent, you return first

Why? Leave some explanation in the code.

(here & a few times below)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If only one of these is present - possibly the schema migration took place. If the column as a VARCHAR but become an INTEGER for example. Further we may want to do a proper schema migration here. I haven't decided how would it look like though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So would something like this be proper?

// normally, either both or none is present

if (first.isPresent() && second.isPresent()) {
return Optional.of(new BooleanStatistics(
reduce(first.get().getTrueCount(), second.get().getTrueCount(), ADD, false),
reduce(first.get().getFalseCount(), second.get().getFalseCount(), ADD, false)));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i am lost. we have ColumnStatisticType.NUMBER_OF_TRUE_VALUES but we don't have ColumnStatisticType.NUMBER_OF_FALSE_VALUES. Please explain

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NUMBER_OF_FALSE_VALUES = NUMBER_OF_NON_NULL - NUMBER_OF_TRUE_VALUES

What we ask engine to compute does not necessary match the statistics we want to store.

{
if (first.isPresent() && second.isPresent()) {
if (!(firstRowCount.isPresent() && secondRowCount.isPresent())) {
return OptionalDouble.empty();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if first is present, but second is not, we return first
if first is present, but second lacks row count, we return empty

Why? Leave some explanation in the code.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to if (!firstRowCount.isPresent() || !secondRowCount.isPresent()) {. Hopefully it makes more sense.

}
long totalRowCount = firstRowCount.getAsLong() + secondRowCount.getAsLong();
if (totalRowCount == 0) {
return OptionalDouble.empty();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OptionalDouble.of(0) ? In case we have ended up inserting zero rows into existing partition

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there 0 rows the average length is unknow. There is no rows. If we inserted values into empty partition it will go the first.isPresent() ? first : second; path.


private static OptionalLong getIntegerValue(ConnectorSession session, Type type, Block block)
{
return block.isNull(0) ? OptionalLong.empty() : OptionalLong.of(((Number) type.getObjectValue(session, block, 0)).longValue());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I know why ((Number) type.getObjectValue(session, block, 0)).longValue() instead of type.getLong(block, 0) (because the second would work also for eg short decimal, skipping internal-representation-to-external conversion). While this method is called only for integral types, it's better as-is.
Consider leaving a comment hinting at the choice made

Copy link
Contributor

@findepi findepi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Add properties for column statistics collect"

@@ -1044,4 +1046,18 @@ public boolean isTableStatisticsEnabled()
{
return tableStatisticsEnabled;
}

@NotNull
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove, boolean cannot be null

@@ -134,6 +134,8 @@

private boolean tableStatisticsEnabled = true;

private boolean collectColumnStatisticsOnWrite;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i wouldn't keep the empty line before this one

Copy link
Contributor

@findepi findepi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Collect column statistics on table write: Documentation"

``VARBINARY`` ``NUMBER_OF_NULLS``, ``MAX_VALUE_SIZE_IN_BYTES``, ``AVERAGE_VALUE_SIZE_IN_BYTES``
``DATE`` ``NUMBER_OF_NULLS``, ``MIN``, ``MAX``, ``NUMBER_OF_DISTINCT_VALUES``
``TIMESTAMP`` ``NUMBER_OF_NULLS``, ``MIN``, ``MAX``, ``NUMBER_OF_DISTINCT_VALUES``
``DECIMAL`` ``NUMBER_OF_NULLS``, ``MIN``, ``MAX``, ``NUMBER_OF_DISTINCT_VALUES``
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: move after REAL

``DOUBLE`` ``NUMBER_OF_NULLS``, ``MIN``, ``MAX``, ``NUMBER_OF_DISTINCT_VALUES``
``REAL`` ``NUMBER_OF_NULLS``, ``MIN``, ``MAX``, ``NUMBER_OF_DISTINCT_VALUES``
``BOOLEAN`` ``NUMBER_OF_NULLS``, ``NUMBER_OF_FALSE``, ``NUMBER_OF_TRUE``
``VARCHAR`` ``NUMBER_OF_NULLS``, ``NUMBER_OF_DISTINCT_VALUES``, ``MAX_VALUE_SIZE_IN_BYTES``, ``AVERAGE_VALUE_SIZE_IN_BYTES``
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MIN,MAX (if you support them in the code)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, we do not collect MIN and MAX for Varchar

============= ================================================================================================================

Automatic column level statistics collection on write can be enabled using
the ``hive.collect-column-statistics-on-write`` property.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i am worried we may miss to update this line when changing the default of this property.
What about different wording, leveraging the property's default is in the table above

Automatic column level statistics collection on write is controlled by ``hive.collect-column-statistics-on-write`` property.

@@ -148,6 +470,8 @@ private static long convertLocalToUtc(DateTimeZone timeZone, long value)
{
ADD,
SUBTRACT,
SELECT_MIN,
SELECT_MAX,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: why not just "MIN", "MAX"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So i can static-import it. MIN and MAX are used by the values from the ColumnStatisticType

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Going to rename it back to MIN and MAX after i rename values in ColumnStatisticType

@prestodb prestodb deleted a comment from findepi Jul 20, 2018
@arhimondr arhimondr force-pushed the column-hive-stats-v2 branch 2 times, most recently from 02e45bd to 67fc97a Compare July 20, 2018 19:53
@arhimondr
Copy link
Member Author

@rschlussel2 @findepi @kokosing Comments addressed

@arhimondr
Copy link
Member Author

arhimondr commented Jul 23, 2018

@findepi @kokosing @rschlussel @electrum

Heads up. I'm going to remove support for the MAX_VALUE_SIZE_IN_BYTES and AVERAGE_VALUE_SIZE_IN_BYTES statistics collection.

The reasons are next:

  • Collecting of these statistics is inefficient. It requires additional projection.
  • MAX_VALUE_SIZE_IN_BYTES is not used by the optimizer
  • AVERAGE_VALUE_SIZE_IN_BYTES - our version of the Metastore stores this statistics as the IN_MEMORY_DATA_SIZE. Computing AVERAGE doesn't make much sense, as it is easier to compute total size, and than divide it by the number of rows.
  • It is not possible to compute AVERAGE_VALUE_SIZE_IN_BYTES for the CHAR column with the existing aggregation functions.

I'm going to create a separate PR that:

  • Introduces in_memory_data_size aggregation function for VARCHAR, VARBINARY, CHAR and complex types (MAP, ARRAY, ROW)
  • Uses this function to compute IN_MEMORY_DATA_SIZE statistic
  • Removes maxColumnLength from the HiveColumnStatistics
  • Replaces averageColumnLength with inMemoryDataSizeInBytes
  • Does the inMemoryDataSizeInBytes to averageColumnLength conversion in the ThriftMetastoreUtil

@arhimondr arhimondr force-pushed the column-hive-stats-v2 branch from 76c4c54 to 2ef0a98 Compare July 23, 2018 18:51
@findepi
Copy link
Contributor

findepi commented Jul 24, 2018

@arhimondr how does this play with #11107?
i understand, after all these PRs (current, #11107 and the one you're planning), we need to ensure interoperability:

  • we should be able to read statistics computed by others (eg by Hive)
  • we should be able to compute stats for ourselves
  • we should be able to compute stats readable by other tools as well
  • all this for a bunch of Hive Metastore versions

@arhimondr arhimondr force-pushed the column-hive-stats-v2 branch 3 times, most recently from 754ae7f to ede8d78 Compare July 25, 2018 19:29
}
if (rootCause != null) {
throwIfUnchecked(rootCause);
throwIfInstanceOf(rootCause, Exception.class);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost all usages of throwIfInstanceOf are alone. The few places that it is combined with throwIfUnchecked should be converted to propagateIfPossible since that is more succinct.

{
closed = true;
if (failure != null) {
throwIfUnchecked(failure);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

propagateIfPossible(failure, Exception.class);

@@ -275,6 +277,14 @@ default void dropColumn(ConnectorSession session, ConnectorTableHandle tableHand
return Optional.of(new ConnectorNewTableLayout(partitioningHandle, partitionColumns));
}

/**
* Describes statistics that must be collected
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: end Javadoc sentences in a period

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's make this say "during a write" for now. We can update to "during a write or analyze" later when that is implemented.

{
private final List<String> groupingColumns;
private final List<Block> groupingValues;
private final Map<TableStatisticType, Block> tableStatistics;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there any concerns about memory size for Block? We have to store all of these in memory at once in the coordinator. (I'm not suggesting this is a problem, but rather asking if you've considered it or done any back-of-the-napkin calculations)

}
if (isNumericType(type) || type.equals(DATE) || type.equals(TIMESTAMP)) {
// TODO #7122 support non-legacy TIMESTAMP
return ImmutableSet.of(NUMBER_OF_NON_NULL_VALUES, MIN, MAX, NUMBER_OF_DISTINCT_VALUES);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: put min/max at the start or end, so that the NUMBER_* stats are together

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if MAX should be MAX_VALUE. That might be more consistent with MAX_VALUE_SIZE_IN_BYTES.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if MAX should be MAX_VALUE. That might be more consistent with MAX_VALUE_SIZE_IN_BYTES.

Yeah. That it will be less chance that it clashes on static import. Let me change that.


private static boolean isNumericType(Type type)
{
return type.equals(BIGINT) || type.equals(INTEGER) || type.equals(SMALLINT) || type.equals(TINYINT) || type.equals(DOUBLE) || type.equals(REAL) || type instanceof DecimalType;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe wrap this to split up the different kinds of types

return type.equals(BIGINT) || type.equals(INTEGER) || type.equals(SMALLINT) || type.equals(TINYINT) ||
        type.equals(DOUBLE) || type.equals(REAL) ||
        type instanceof DecimalType;

@@ -111,9 +111,9 @@ security options in the Hive connector.
Hive Configuration Properties
-----------------------------

================================================== ============================================================ ==========
================================================== ============================================================ =============================
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need to widen the table now that the default is "false"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. I'm staill going to widen it for 2 symbols to cover the RCBINARY

============= ================================================================================================================
Column Type Collectible Statistics
============= ================================================================================================================
``TINYINT`` ``NUMBER_OF_NULLS``, ``MIN``, ``MAX``, ``NUMBER_OF_DISTINCT_VALUES``
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we use words here? This is for humans, and AFAIK, these constants like NUMBER_OF_NULLS are an internal detail of Presto.

Also, let's put min/max last, rather than in the middle of the "number of" stats.

``TINYINT``   number of nulls, number of distinct values, min/max values
``BOOLEAN``   number of nulls, number of true/false values

============= ================================================================================================================

Automatic column level statistics collection on write is controlled by
``hive.collect-column-statistics-on-write`` property.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automatic column level statistics collection on write is controlled by
the ``collect-column-statistics-on-write`` catalog session property.

The Hive connector can also collect column level statistics:

============= ================================================================================================================
Column Type Collectible Statistics
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's be consistent with other tables and put the "Collectible Statistics" at the start of the cell.

============= ====================================
Column Type   Collectible Statistics

@arhimondr arhimondr force-pushed the column-hive-stats-v2 branch from ede8d78 to f02ae29 Compare July 26, 2018 20:27
@findepi
Copy link
Contributor

findepi commented Jul 27, 2018

Travis failure looks related:

2018-07-27 02:42:50 INFO: FAILURE     /    com.facebook.presto.tests.hive.TestTablePartitioningInsertInto.selectFromPartitionedNation (Groups: hive_connector, smoke) took 11.0 seconds
2018-07-27 02:42:50 SEVERE: Failure cause:
java.lang.AssertionError: 
Expecting:
 <0L>
to be equal to:
 <10L>
but was not.
	at com.facebook.presto.tests.hive.TestTablePartitioningInsertInto.testQuerySplitsNumber(TestTablePartitioningInsertInto.java:84)
	at com.facebook.presto.tests.hive.TestTablePartitioningInsertInto.selectFromPartitionedNation(TestTablePartitioningInsertInto.java:66)

@arhimondr arhimondr force-pushed the column-hive-stats-v2 branch from f02ae29 to 81e4b3c Compare July 27, 2018 14:10
@arhimondr
Copy link
Member Author

Travis failure looks related:

It seems to be intermittent. I see no reason why this patch should've affected this test in any way.

@findepi
Copy link
Contributor

findepi commented Jul 27, 2018

hm...did you observe it failing intermittently on master as well?

@arhimondr
Copy link
Member Author

hm...did you observe it failing intermittently on master as well?

At least once. I didn't look for more.

https://api.travis-ci.org/v3/job/400752461/log.txt (https://travis-ci.org/prestodb/presto/jobs/400752461, https://travis-ci.org/prestodb/presto/builds/400752451)

@arhimondr arhimondr force-pushed the column-hive-stats-v2 branch from 81e4b3c to b6158fb Compare August 2, 2018 12:57
@arhimondr arhimondr merged commit b6158fb into prestodb:master Aug 2, 2018
@arhimondr arhimondr deleted the column-hive-stats-v2 branch August 2, 2018 12:59
Map<TableStatisticType, Block> tableStatistics,
Map<ColumnStatisticMetadata, Block> columnStatistics)
{
this.groupingColumns = unmodifiableList(new ArrayList<>(requireNonNull(groupingColumns, "groupingColumns is null")));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reason not using ImmutableList.copyOf() ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ImmutableList is a Guava class. We don't have Guava in the classpath of the presto-spi module.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants