Add batch query support for drop step [tp-tests]
- Reuse multi-query `drop` optimization for TinkerPop's `drop` step
- Change restriction on eligible multi-query traversals and allow multi-query optimizations to be used for queries with `drop` steps
- Add release template for JanusGraph 1.1.0

Signed-off-by: Oleksandr Porunov <[email protected]>
porunov committed May 14, 2024
1 parent 90b9694 commit eee1996
Showing 17 changed files with 458 additions and 27 deletions.
63 changes: 63 additions & 0 deletions docs/changelog.md
@@ -28,6 +28,7 @@ All currently supported versions of JanusGraph are listed below.
| JanusGraph | Storage Version | Cassandra | HBase | Bigtable | ScyllaDB | Elasticsearch | Solr | TinkerPop | Spark | Scala |
| ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- |
| 1.0.z | 2 | 3.11.z, 4.0.z | 2.5.z | 1.3.0, 1.4.0, 1.5.z, 1.6.z, 1.7.z, 1.8.z, 1.9.z, 1.10.z, 1.11.z, 1.14.z | 5.y | 6.y, 7.y, 8.y | 8.y | 3.7.z | 3.2.z | 2.12.z |
| 1.1.z | 2 | 3.11.z, 4.0.z | 2.5.z | 1.3.0, 1.4.0, 1.5.z, 1.6.z, 1.7.z, 1.8.z, 1.9.z, 1.10.z, 1.11.z, 1.14.z | 5.y | 6.y, 7.y, 8.y | 8.y | 3.7.z | 3.2.z | 2.12.z |

!!! info
Even though ScyllaDB is marked as `N/A` prior to version 1.0.0, it was actually supported using the `cql` storage option.
@@ -49,6 +50,68 @@ The versions of JanusGraph listed below are outdated and will no longer receive

## Release Notes

### Version 1.1.0 (Release Date: ???)

/// tab | Maven
```xml
<dependency>
<groupId>org.janusgraph</groupId>
<artifactId>janusgraph-core</artifactId>
<version>1.1.0</version>
</dependency>
```
///

/// tab | Gradle
```groovy
compile "org.janusgraph:janusgraph-core:1.1.0"
```
///

**Tested Compatibility:**

* Apache Cassandra 3.11.10, 4.0.6
* Apache HBase 2.5.0
* Oracle BerkeleyJE 7.5.11
* ScyllaDB 5.1.4
* Elasticsearch 6.0.1, 6.6.0, 7.17.8, 8.10.4
* Apache Lucene 8.11.1
* Apache Solr 8.11.1
* Apache TinkerPop 3.7.2
* Java 8, 11

**Installed versions in the Pre-Packaged Distribution:**

* Cassandra 4.0.6
* Elasticsearch 7.14.0

#### Changes

For more information on features and bug fixes in 1.1.0, see the GitHub milestone:

- <https://github.com/JanusGraph/janusgraph/milestone/27?closed=1>

#### Assets

* [JavaDoc](https://javadoc.io/doc/org.janusgraph/janusgraph-core/1.1.0)
* [GitHub Release](https://github.com/JanusGraph/janusgraph/releases/tag/v1.1.0)
* [JanusGraph zip](https://github.com/JanusGraph/janusgraph/releases/download/v1.1.0/janusgraph-1.1.0.zip)
* [JanusGraph zip with embedded Cassandra and ElasticSearch](https://github.com/JanusGraph/janusgraph/releases/download/v1.1.0/janusgraph-full-1.1.0.zip)

#### Upgrade Instructions

##### Traversals containing a `drop()` step are now eligible for batch query optimizations

In JanusGraph 1.1.0, a new batch optimization for vertex removal is introduced in the `drop()` step. This optimization is
enabled by default.
Previously, batch optimizations were skipped for any query containing at least one `drop()` step. Starting with
version 1.1.0, this restriction is removed, and such queries are eligible for batch query optimization
(multi-query).
Note that `LazyBarrierStrategy` (a TinkerPop strategy) is disabled for any query that contains at least one
`drop()` step.
To disable the `drop()` optimization and preserve the previous behaviour, set the following configuration option:
`query.batch.drop-step-mode=none`
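
As a minimal sketch of the setting above, the following Java snippet builds a `java.util.Properties` object that keeps batching enabled but switches the `drop()` optimization off. The class `DropStepModeConfigSketch` and its method are hypothetical names used for illustration only. Only the two option keys come from the configuration reference, and how the properties are handed to JanusGraph is deliberately left out.

```java
import java.util.Properties;

// Hypothetical helper, not part of JanusGraph: it only prepares configuration
// entries and does not open a graph.
final class DropStepModeConfigSketch {

    // Option keys as listed in the JanusGraph configuration reference.
    static final String BATCH_ENABLED = "query.batch.enabled";
    static final String DROP_STEP_MODE = "query.batch.drop-step-mode";

    // Returns a copy of the given settings with drop batching turned off.
    // The input object is left unchanged.
    static Properties withLegacyDropBehaviour(Properties base) {
        Properties copy = new Properties();
        copy.putAll(base);
        copy.setProperty(DROP_STEP_MODE, "none");
        return copy;
    }

    public static void main(String[] args) {
        Properties base = new Properties();
        base.setProperty(BATCH_ENABLED, "true");
        Properties cfg = withLegacyDropBehaviour(base);
        System.out.println(DROP_STEP_MODE + "=" + cfg.getProperty(DROP_STEP_MODE));
    }
}
```

Copying the settings rather than mutating them keeps the default (`all`) configuration available for comparison, for example when benchmarking both modes.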

### Version 1.0.0 (Release Date: October 21, 2023)

/// tab | Maven
1 change: 1 addition & 0 deletions docs/configs/janusgraph-cfg.md
@@ -365,6 +365,7 @@ Configuration options to configure batch queries optimization behavior

| Name | Description | Datatype | Default Value | Mutability |
| ---- | ---- | ---- | ---- | ---- |
| query.batch.drop-step-mode | Batching mode for `drop()` step. Used only when `query.batch.enabled` is `true`.<br>Supported modes:<br>- `all` - Drops all vertices in a batch.<br>- `none` - Skips drop batching optimization.<br> | String | all | MASKABLE |
| query.batch.enabled | Whether traversal queries should be batched when executed against the storage backend. This can lead to significant performance improvement if there is a non-trivial latency to the backend. If `false` then all other configuration options under `query.batch` namespace are ignored. | Boolean | true | MASKABLE |
| query.batch.has-step-mode | Properties pre-fetching mode for `has` step. Used only when `query.batch.enabled` is `true`.<br>Supported modes:<br>- `all_properties` - Pre-fetch all vertex properties on any property access (fetches all vertex properties in a single slice query)<br>- `required_properties_only` - Pre-fetch necessary vertex properties for the whole chain of foldable `has` steps (uses a separate slice query per each required property)<br>- `required_and_next_properties` - Prefetch the same properties as with `required_properties_only` mode, but also prefetch<br>properties which may be needed in the next properties access step like `values`, `properties,` `valueMap`, `elementMap`, or `propertyMap`.<br>In case the next step is not one of those properties access steps then this mode behaves same as `required_properties_only`.<br>In case the next step is one of the properties access steps with limited scope of properties, those properties will be<br>pre-fetched together in the same multi-query.<br>In case the next step is one of the properties access steps with unspecified scope of property keys then this mode<br>behaves same as `all_properties`.<br>- `required_and_next_properties_or_all` - Prefetch the same properties as with `required_and_next_properties`, but in case the next step is not<br>`values`, `properties,` `valueMap`, `elementMap`, or `propertyMap` then acts like `all_properties`.<br>- `none` - Skips `has` step batch properties pre-fetch optimization.<br> | String | required_and_next_properties | MASKABLE |
| query.batch.label-step-mode | Labels pre-fetching mode for `label()` step. Used only when `query.batch.enabled` is `true`.<br>Supported modes:<br>- `all` - Pre-fetch labels for all vertices in a batch.<br>- `none` - Skips vertex labels pre-fetching optimization.<br> | String | all | MASKABLE |
3 changes: 2 additions & 1 deletion docs/operations/batch-processing.md
@@ -159,7 +159,7 @@ Batched query processing takes into account two types of steps:

1. Batch compatible step. This is the step which will execute batch requests. Currently, the list of such steps
is the next: `out()`, `in()`, `both()`, `inE()`, `outE()`, `bothE()`, `has()`, `values()`, `properties()`, `valueMap()`,
`propertyMap()`, `elementMap()`, `label()`.
`propertyMap()`, `elementMap()`, `label()`, `drop()`.
2. Parent step. This is a parent step which has local traversals with the same start. Such parent steps also implement the
interface `TraversalParent`. There are many such steps, but as for an example those could be: `and(...)`, `or(...)`,
`not(...)`, `order().by(...)`, `project("valueA", "valueB", "valueC").by(...).by(...).by(...)`, `union(..., ..., ...)`,
@@ -331,3 +331,4 @@ See configuration option `query.batch.has-step-mode` to control properties pre-f
See configuration option `query.batch.properties-mode` to control properties pre-fetching behaviour for `values`,
`properties`, `valueMap`, `propertyMap`, and `elementMap` steps.
See configuration option `query.batch.label-step-mode` to control labels pre-fetching behaviour for `label` step.
See configuration option `query.batch.drop-step-mode` to control drop batching behaviour for `drop` step.
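
To make the effect of batching concrete, here is a toy model (not JanusGraph code) that counts backend round trips when vertices are dropped one at a time versus in a single batch. `CountingBackend`, `dropOneByOne`, and `dropBatched` are hypothetical names. The model assumes that dropping a vertex requires one read of its relations and that a whole batch is served by a single request. The real optimization may issue more than one request per batch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Toy model only: it counts simulated backend requests and uses no JanusGraph classes.
final class DropBatchingSketch {

    // Hypothetical stand-in for a storage backend that counts requests.
    static final class CountingBackend {
        int roundTrips = 0;

        // One request returns a relations placeholder for every vertex id passed in.
        List<String> fetchRelations(List<Integer> vertexIds) {
            roundTrips++;
            List<String> relations = new ArrayList<>();
            for (int id : vertexIds) {
                relations.add("relations-of-" + id);
            }
            return relations;
        }
    }

    // Unbatched behaviour: one request per vertex.
    static int dropOneByOne(CountingBackend backend, List<Integer> vertexIds) {
        int dropped = 0;
        for (Integer id : vertexIds) {
            dropped += backend.fetchRelations(Collections.singletonList(id)).size();
        }
        return dropped;
    }

    // Batched behaviour: all vertices are sent in one request.
    static int dropBatched(CountingBackend backend, List<Integer> vertexIds) {
        return backend.fetchRelations(vertexIds).size();
    }
}
```

In this model both methods drop the same number of vertices, but the request count grows with the number of vertices only in the unbatched case, which is why the benefit depends on backend latency.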

@@ -30,6 +30,7 @@
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversal;
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.__;
import org.apache.tinkerpop.gremlin.process.traversal.step.filter.DropStep;
import org.apache.tinkerpop.gremlin.process.traversal.step.filter.HasStep;
import org.apache.tinkerpop.gremlin.process.traversal.step.util.WithOptions;
import org.apache.tinkerpop.gremlin.process.traversal.strategy.decoration.SubgraphStrategy;
@@ -135,10 +136,12 @@
import org.janusgraph.graphdb.relations.StandardVertexProperty;
import org.janusgraph.graphdb.serializer.SpecialInt;
import org.janusgraph.graphdb.serializer.SpecialIntSerializer;
import org.janusgraph.graphdb.tinkerpop.optimize.step.JanusGraphDropStep;
import org.janusgraph.graphdb.tinkerpop.optimize.step.JanusGraphElementMapStep;
import org.janusgraph.graphdb.tinkerpop.optimize.step.JanusGraphHasStep;
import org.janusgraph.graphdb.tinkerpop.optimize.step.JanusGraphPropertiesStep;
import org.janusgraph.graphdb.tinkerpop.optimize.step.JanusGraphPropertyMapStep;
import org.janusgraph.graphdb.tinkerpop.optimize.strategy.MultiQueryDropStepStrategyMode;
import org.janusgraph.graphdb.tinkerpop.optimize.strategy.MultiQueryHasStepStrategyMode;
import org.janusgraph.graphdb.tinkerpop.optimize.strategy.MultiQueryLabelStepStrategyMode;
import org.janusgraph.graphdb.tinkerpop.optimize.strategy.MultiQueryPropertiesStrategyMode;
@@ -214,6 +217,7 @@
import static org.janusgraph.graphdb.configuration.GraphDatabaseConfiguration.DB_CACHE;
import static org.janusgraph.graphdb.configuration.GraphDatabaseConfiguration.DB_CACHE_CLEAN_WAIT;
import static org.janusgraph.graphdb.configuration.GraphDatabaseConfiguration.DB_CACHE_TIME;
import static org.janusgraph.graphdb.configuration.GraphDatabaseConfiguration.DROP_STEP_BATCH_MODE;
import static org.janusgraph.graphdb.configuration.GraphDatabaseConfiguration.FORCE_INDEX_USAGE;
import static org.janusgraph.graphdb.configuration.GraphDatabaseConfiguration.HARD_MAX_LIMIT;
import static org.janusgraph.graphdb.configuration.GraphDatabaseConfiguration.HAS_STEP_BATCH_MODE;
@@ -10023,11 +10027,7 @@ public void testMultiQueryDropsVertices() {

int verticesAmount = 42;

for (int i = 0; i < verticesAmount; i++) {
Vertex vertex = tx.addVertex("id", i);
vertex.property("name", "name_test");
vertex.property("details", "details_" + i);
}
addVerticesForDropTest(verticesAmount);

clopen();

@@ -10039,20 +10039,90 @@ public void testMultiQueryDropsVertices() {
.map(v -> (JanusGraphVertex) v)
.collect(Collectors.toList());

int actualCount = tx.multiQuery(vertices).drop();
int actualCount = tx.multiQuery(vertices).drop().size();
clopen();

assertEquals(verticesAmount, actualCount);

int afterDropCount = tx.traversal()
.V()
.has("name", "name_test")
.toList()
.size();
long afterDropCount = getVerticesForDropTestCount();

assertEquals(0, afterDropCount);
}

@Test
public void testMultiQueryDropsStrategyModes() {

mgmt.makePropertyKey("id").dataType(Integer.class).cardinality(Cardinality.SINGLE).make();
PropertyKey nameProp = mgmt.makePropertyKey("name").dataType(String.class).cardinality(Cardinality.SINGLE).make();
mgmt.makePropertyKey("details").dataType(String.class).cardinality(Cardinality.SINGLE).make();
mgmt.buildIndex("nameIndex", Vertex.class).addKey(nameProp).buildCompositeIndex();

finishSchema();

long verticesAmount = 42;

// Mode: NONE

addVerticesForDropTest(verticesAmount);
graph.tx().commit();
clopen(option(DROP_STEP_BATCH_MODE), MultiQueryDropStepStrategyMode.NONE.getConfigName());
assertEquals(verticesAmount, getVerticesForDropTestCount());
TraversalMetrics profileT = graph.traversal().V().drop().profile().next();
assertTrue(profileT.getMetrics().stream().anyMatch(metrics -> metrics.getName().equals(DropStep.class.getSimpleName())));
graph.tx().commit();
assertEquals(0, getVerticesForDropTestCount());

// Mode: ALL

addVerticesForDropTest(verticesAmount);
graph.tx().commit();
clopen(option(DROP_STEP_BATCH_MODE), MultiQueryDropStepStrategyMode.ALL.getConfigName());
assertEquals(verticesAmount, getVerticesForDropTestCount());
profileT = graph.traversal().V().drop().profile().next();
assertEquals("true", profileT.getMetrics().stream().filter(metrics -> metrics.getName().equals(JanusGraphDropStep.class.getSimpleName())).findAny().get().getAnnotation("multi"));
graph.tx().commit();
assertEquals(0, getVerticesForDropTestCount());

// `limit` with `drop` step.

addVerticesForDropTest(verticesAmount);
graph.tx().commit();
clopen(option(DROP_STEP_BATCH_MODE), MultiQueryDropStepStrategyMode.NONE.getConfigName());
assertEquals(verticesAmount, getVerticesForDropTestCount());
int limitSize = 2;
profileT = graph.traversal().V().limit(limitSize).drop().profile().next();
assertTrue(profileT.getMetrics().stream().anyMatch(metrics -> metrics.getName().equals(DropStep.class.getSimpleName())));
graph.tx().commit();
long afterDropCount = getVerticesForDropTestCount();
assertEquals(verticesAmount-limitSize, afterDropCount);
graph.traversal().V().drop().iterate();
graph.tx().commit();
addVerticesForDropTest(verticesAmount);
graph.tx().commit();
clopen(option(DROP_STEP_BATCH_MODE), MultiQueryDropStepStrategyMode.ALL.getConfigName());
assertEquals(verticesAmount, getVerticesForDropTestCount());
profileT = graph.traversal().V().limit(limitSize).drop().profile().next();
assertEquals("true", profileT.getMetrics().stream().filter(metrics -> metrics.getName().equals(JanusGraphDropStep.class.getSimpleName())).findAny().get().getAnnotation("multi"));
graph.tx().commit();
afterDropCount = getVerticesForDropTestCount();
assertEquals(verticesAmount-limitSize, afterDropCount);
}

private void addVerticesForDropTest(long verticesAmount){
for (int i = 0; i < verticesAmount; i++) {
Vertex vertex = graph.addVertex("id", i);
vertex.property("name", "name_test");
vertex.property("details", "details_" + i);
}
}

private long getVerticesForDropTestCount(){
return graph.traversal()
.V()
.has("name", "name_test")
.count().next();
}

@ParameterizedTest
@ValueSource(booleans = {true, false})
public void testParallelBackendOps(boolean parallelBackendOpsEnabled) {

@@ -27,6 +27,7 @@
import org.janusgraph.diskstorage.configuration.WriteConfiguration;
import org.janusgraph.diskstorage.cql.CQLConfigOptions;
import org.janusgraph.graphdb.configuration.GraphDatabaseConfiguration;
import org.janusgraph.graphdb.tinkerpop.optimize.strategy.MultiQueryDropStepStrategyMode;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
@@ -65,6 +66,7 @@ public WriteConfiguration getConfiguration() {
config.set(GraphDatabaseConfiguration.STORAGE_BACKEND,"cql");
config.set(CQLConfigOptions.LOCAL_DATACENTER, "dc1");
config.set(GraphDatabaseConfiguration.USE_MULTIQUERY, true);
config.set(GraphDatabaseConfiguration.DROP_STEP_BATCH_MODE, MultiQueryDropStepStrategyMode.NONE.getConfigName());
return config.getConfiguration();
}

@@ -103,7 +105,7 @@ public Integer dropVertices() {
.map(v -> (JanusGraphVertex) v)
.collect(Collectors.toList());

dropCount = tx.multiQuery(vertices).drop();
dropCount = tx.multiQuery(vertices).drop().size();
} else {
dropCount = tx.traversal()
.V()
@@ -117,6 +119,27 @@ public Integer dropVertices() {
return dropCount;
}

@Benchmark
public Integer dropVerticesGremlinQuery() {

JanusGraphTransaction tx;
if (isMultiDrop) {
tx = graph.buildTransaction().setDropStepStrategyMode(MultiQueryDropStepStrategyMode.ALL).start();
} else {
tx = graph.buildTransaction().setDropStepStrategyMode(MultiQueryDropStepStrategyMode.NONE).start();
}

Integer dropCount = tx.traversal()
.V()
.has("name", "name_test")
.drop()
.toList()
.size();

tx.rollback();
return dropCount;
}

private void addVertices() {
for (int i = 0; i < verticesAmount; i++) {
Vertex vertex = graph.addVertex("id", i);

@@ -60,7 +60,6 @@ public interface JanusGraphMultiVertexQuery<Q extends JanusGraphMultiVertexQuery
*/
JanusGraphMultiVertexQuery addAllVertices(Collection<? extends Vertex> vertices);


@Override
Q adjacent(Vertex vertex);

@@ -156,7 +155,8 @@ public interface JanusGraphMultiVertexQuery<Q extends JanusGraphMultiVertexQuery
/**
* Drops all vertices that match this query
*
* @return Count of dropped vertices
* @return Map of vertices and their relations which were dropped
*/
Integer drop();
Map<JanusGraphVertex, Iterable<JanusGraphRelation>> drop();

}

@@ -15,6 +15,7 @@
package org.janusgraph.core;

import org.janusgraph.graphdb.configuration.GraphDatabaseConfiguration;
import org.janusgraph.graphdb.tinkerpop.optimize.strategy.MultiQueryDropStepStrategyMode;
import org.janusgraph.graphdb.tinkerpop.optimize.strategy.MultiQueryHasStepStrategyMode;
import org.janusgraph.graphdb.tinkerpop.optimize.strategy.MultiQueryLabelStepStrategyMode;
import org.janusgraph.graphdb.tinkerpop.optimize.strategy.MultiQueryPropertiesStrategyMode;
@@ -186,6 +187,15 @@ public interface TransactionBuilder {
*/
TransactionBuilder setLabelsStepStrategyMode(MultiQueryLabelStepStrategyMode labelStepStrategyMode);

/**
* Sets `drop` step strategy mode.
* <p>
* Doesn't have any effect if multi-query was disabled via config `query.batch.enabled = false`.
*
* @return Object with the set drop strategy mode settings
*/
TransactionBuilder setDropStepStrategyMode(MultiQueryDropStepStrategyMode dropStepStrategyMode);

/**
* Sets the group name for this transaction which provides a way for gathering
* reporting on multiple transactions into one group.