-
Notifications
You must be signed in to change notification settings - Fork 28.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-23280][SQL] add map type support to ColumnVector #20450
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -215,10 +215,18 @@ public final ColumnarRow getStruct(int rowId) { | |
|
||
/** | ||
* Returns the map type value for rowId. | ||
* | ||
* In Spark, map type value is basically a key data array and a value data array. A key from the | ||
* key array with a index and a value from the value array with the same index contribute to | ||
* an entry of this map type value. | ||
* | ||
* To support map type, implementations must construct an {@link ColumnarMap} and return it in | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. construct an -> construct a. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is very minor, may not worth to wait for another QA round. Maybe we can fix it in your "return null" PR? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sure. LGTM. |
||
* this method. {@link ColumnarMap} requires a {@link ColumnVector} that stores the data of all | ||
* the keys of all the maps in this vector, and another {@link ColumnVector} that stores the data | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. nit: There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
|
||
* of all the values of all the maps in this vector, and a pair of offset and length which | ||
* specify the range of the key/value array that belongs to the map type value at rowId. | ||
*/ | ||
public MapData getMap(int ordinal) { | ||
throw new UnsupportedOperationException(); | ||
} | ||
public abstract ColumnarMap getMap(int ordinal); | ||
|
||
/** | ||
* Returns the decimal type value for rowId. | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,53 @@ | ||
/* | ||
* Licensed to the Apache Software Foundation (ASF) under one or more | ||
* contributor license agreements. See the NOTICE file distributed with | ||
* this work for additional information regarding copyright ownership. | ||
* The ASF licenses this file to You under the Apache License, Version 2.0 | ||
* (the "License"); you may not use this file except in compliance with | ||
* the License. You may obtain a copy of the License at | ||
* | ||
* http://www.apache.org/licenses/LICENSE-2.0 | ||
* | ||
* Unless required by applicable law or agreed to in writing, software | ||
* distributed under the License is distributed on an "AS IS" BASIS, | ||
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
* See the License for the specific language governing permissions and | ||
* limitations under the License. | ||
*/ | ||
|
||
package org.apache.spark.sql.vectorized; | ||
|
||
import org.apache.spark.sql.catalyst.util.MapData; | ||
|
||
/** | ||
* Map abstraction in {@link ColumnVector}. | ||
*/ | ||
public final class ColumnarMap extends MapData { | ||
private final ColumnarArray keys; | ||
private final ColumnarArray values; | ||
private final int length; | ||
|
||
public ColumnarMap(ColumnVector keys, ColumnVector values, int offset, int length) { | ||
this.length = length; | ||
this.keys = new ColumnarArray(keys, offset, length); | ||
this.values = new ColumnarArray(values, offset, length); | ||
} | ||
|
||
@Override | ||
public int numElements() { return length; } | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is a API from parent, we can't change it. |
||
|
||
@Override | ||
public ColumnarArray keyArray() { | ||
return keys; | ||
} | ||
|
||
@Override | ||
public ColumnarArray valueArray() { | ||
return values; | ||
} | ||
|
||
@Override | ||
public ColumnarMap copy() { | ||
throw new UnsupportedOperationException(); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: we may also have a method
isMap()
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might be an overkill,
isArray
needs to take care of many types, butisMap
we only accept one type: MapType.