add update write mode #796

Merged
merged 5 commits on Aug 12, 2021
docs-2.0/20.appendix/6.eco-tool-version.md (16 changes: 8 additions & 8 deletions)
@@ -34,63 +34,63 @@ Nebula Graph Studio (Studio for short) is a web-accessible graph data…

|Nebula Graph version|Studio version|
|:---|:---|
- | {{ nebula.release }} | 2.2.1 |
+ | {{ nebula.release }} | {{studio.base220}} |

## Nebula Exchange

Nebula Exchange (Exchange for short) is an Apache Spark™ application for migrating cluster data into Nebula Graph in bulk in a distributed environment. It supports migrating batch and streaming data in a variety of formats. For details, see [What is Nebula Exchange](../nebula-exchange/about-exchange/ex-ug-what-is-exchange.md).

|Nebula Graph version|[Exchange](https://github.com/vesoft-inc/nebula-spark-utils/tree/v2.0.0/nebula-exchange) version (commit id)|
|:---|:---|
- | {{ nebula.release }} | 2.0.1 (TODO:coding), 2.0.0 (TODO:coding) |
+ | {{ nebula.release }} | {{exchange.release}} (TODO:coding) |

## Nebula Importer

Nebula Importer (Importer for short) is a CSV file import tool for Nebula Graph. Importer reads local CSV files and imports the data into the Nebula Graph database. For details, see [What is Nebula Importer](../nebula-importer/use-importer.md).

|Nebula Graph version|[Importer](https://github.com/vesoft-inc/nebula-importer/tree/release-v2.0.0-ga) version (commit id)|
|:---|:---|
- | {{ nebula.release }} | 2.0.0 (TODO:coding) |
+ | {{ nebula.release }} | {{importer.release}} (TODO:coding) |

## Nebula Spark Connector

Nebula Spark Connector is a Spark connector that provides the ability to read and write Nebula Graph data in the standard Spark form. Nebula Spark Connector consists of two parts: Reader and Writer. For details, see [What is Nebula Spark Connector](../nebula-spark-connector.md).

|Nebula Graph version|[Spark Connector](https://github.com/vesoft-inc/nebula-spark-utils/tree/v2.0.0/nebula-spark-connector) version (commit id)|
|:---|:---|
- | {{ nebula.release }} | 2.0.1 (TODO:coding), 2.0.0 (TODO:coding) |
+ | {{ nebula.release }} | {{sparkconnector.release}} (TODO:coding) |

## Nebula Flink Connector

Nebula Flink Connector is a connector that helps Flink users quickly access Nebula Graph. It supports reading data from the Nebula Graph database, and writing data read from other external data sources into the Nebula Graph database. For details, see [What is Nebula Flink Connector](../nebula-flink-connector.md).

|Nebula Graph version|[Flink Connector](https://github.com/vesoft-inc/nebula-flink-connector) version (commit id)|
|:---|:---|
- | {{ nebula.release }} | 2.0.0 (TODO:coding) |
+ | {{ nebula.release }} | {{flinkconnector.release}} (TODO:coding) |

## Nebula Algorithm

Nebula Algorithm (Algorithm for short) is a Spark application based on [GraphX](https://spark.apache.org/graphx/). It uses a complete set of algorithm tools to run graph computation on data in the Nebula Graph database by submitting Spark jobs, and the algorithms under its lib package can also be invoked programmatically to run graph computation on DataFrames. For details, see [What is Nebula Algorithm](../nebula-algorithm.md).

|Nebula Graph version|[Algorithm](https://github.com/vesoft-inc/nebula-spark-utils/tree/master/nebula-algorithm) version (commit id)|
|:---|:---|
- | {{ nebula.release }} | 2.0.0 (TODO:coding) |
+ | {{ nebula.release }} | {{algorithm.release}} (TODO:coding) |

## Nebula Console

Nebula Console is the native CLI client of Nebula Graph. For usage instructions, see [Connect to Nebula Graph](../2.quick-start/3.connect-to-nebula-graph.md).

|Nebula Graph version|[Console](https://github.com/vesoft-inc/nebula-console/tree/v2.0.0-ga) version (commit id)|
|:---|:---|
- | {{ nebula.release }} | 2.0.0 (TODO:coding) |
+ | {{ nebula.release }} | {{console.release}} (TODO:coding) |

## Nebula Docker Compose

Docker Compose can quickly deploy a Nebula Graph cluster. For usage instructions, see [Deploy Nebula Graph with Docker Compose](../4.deployment-and-installation/2.compile-and-install-nebula-graph/3.deploy-nebula-graph-with-docker-compose.md).

|Nebula Graph version|[Docker Compose](https://github.com/vesoft-inc/nebula-docker-compose/tree/v2.0.0) version (commit id)|
|:---|:---|
- | {{ nebula.release }} | 2.0.0 (TODO:coding) |
+ | {{ nebula.release }} | {{dockercompose.release}} (TODO:coding) |

## API & SDK

docs-2.0/nebula-spark-connector.md (34 changes: 30 additions & 4 deletions)
@@ -10,7 +10,7 @@ Nebula Spark Connector is a Spark connector that provides the ability to read…

It provides a Spark SQL interface that lets users programmatically write data in DataFrame format into Nebula Graph, either row by row or in batches.

- For more usage information, see [Nebula Spark Connector](https://github.com/vesoft-inc/nebula-spark-utils/blob/v2.0.0/nebula-spark-connector/README_CN.md).
+ For more usage information, see [Nebula Spark Connector](https://github.com/vesoft-inc/nebula-spark-utils/blob/{{sparkconnector.branch}}/nebula-spark-connector/README_CN.md).

## Use cases

@@ -24,7 +24,9 @@ Nebula Spark Connector is suitable for the following scenarios:

- Performing graph computation with [Nebula Algorithm](nebula-algorithm.md).

- ## Advantages
+ ## Features
+
+ Nebula Spark Connector {{sparkconnector.release}} has the following features:

- Provides multiple connection configuration options, such as timeout, number of connection retries, and number of execution retries.

@@ -36,6 +38,8 @@

- Nebula Spark Connector 2.0 unifies the extended data sources of SparkSQL, uniformly using DataSourceV2 for the Nebula Graph data extension.

+ - Supports two write modes: `insert` and `update`. The `insert` mode inserts (overwrites) data, while the `update` mode only updates data that already exists.

## Get Nebula Spark Connector

### Compile and package
@@ -58,7 +62,7 @@
$ mvn clean package -Dmaven.test.skip=true -Dgpg.skip -Dmaven.javadoc.skip=true
```

- After compilation, a file similar to `nebula-spark-connector-2.0.0-SNAPSHOT.jar` is generated in the `nebula-spark-connector/target` directory.
+ After compilation, a file similar to `nebula-spark-connector-{{sparkconnector.release}}-SNAPSHOT.jar` is generated in the `nebula-spark-connector/target` directory.

### Download from the Maven remote repository

@@ -132,7 +136,7 @@ val edge = spark.read.nebula(config, nebulaReadEdgeConfig).loadEdgesToDF()
|`withNoColumn` |No| Whether to skip reading properties. The default value is `false`, meaning properties are read. If set to `true`, properties are not read and the `withReturnCols` configuration is ignored. |
|`withReturnCols` |No| The set of vertex or edge properties to read. The format is `List(property1,property2,...)`. The default value is `List()`, meaning all properties are read. |
|`withLimit` |No| The number of rows that the Nebula Java Storage Client reads from the server at a time. The default value is 1000. |
- |`withPartitionNum` |No| The number of Spark partitions used when reading Nebula Graph data. The default value is 100. Preferably, this value should not exceed the number of partitions (partition_num) of the graph space. |
+ |`withPartitionNum` |No| The number of Spark partitions used when reading Nebula Graph data. The default value is 100. Preferably, this value should not exceed the number of partitions (partition_num) of the graph space.|
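
A minimal sketch of a vertex read that combines these options, assuming the builder API documented above; the names `test`, `person`, and `birthday` are illustrative:

```scala
// Assumes an existing SparkSession `spark` and the connector's implicit
// DataFrame reader, as in the loadEdgesToDF example above.
val config = NebulaConnectionConfig
  .builder()
  .withMetaAddress("127.0.0.1:9559") // metad address to read from
  .build()
val nebulaReadVertexConfig: ReadNebulaConfig = ReadNebulaConfig
  .builder()
  .withSpace("test")                 // graph space to read
  .withLabel("person")               // tag whose vertices are read
  .withNoColumn(false)               // also read properties
  .withReturnCols(List("birthday"))  // properties to return; List() = all
  .withLimit(1000)                   // rows per storage-client fetch
  .withPartitionNum(100)             // Spark partitions; <= partition_num
  .build()
val vertex = spark.read.nebula(config, nebulaReadVertexConfig).loadVerticesToDF()
```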

### Write data into Nebula Graph

@@ -176,6 +180,26 @@ val nebulaWriteEdgeConfig: WriteNebulaEdgeConfig = WriteNebulaEdgeConfig
df.write.nebula(config, nebulaWriteEdgeConfig).writeEdges()
```

+ The default write mode is `insert`. You can change it to `update` through the `withWriteMode` option:
+
+ ```scala
+ val config = NebulaConnectionConfig
+   .builder()
+   .withMetaAddress("127.0.0.1:9559")
+   .withGraphAddress("127.0.0.1:9669")
+   .build()
+ val nebulaWriteVertexConfig = WriteNebulaVertexConfig
+   .builder()
+   .withSpace("test")
+   .withTag("person")
+   .withVidField("id")
+   .withVidAsProp(true)
+   .withBatch(1000)
+   .withWriteMode(WriteMode.UPDATE)
+   .build()
+ df.write.nebula(config, nebulaWriteVertexConfig).writeVertices()
+ ```

- `NebulaConnectionConfig` is the configuration for connecting to Nebula Graph. The options are described below.

|Parameter|Required|Description|
@@ -196,6 +220,7 @@
|`withUser` |No| The Nebula Graph user name. If [authentication](7.data-security/1.authentication/1.authentication.md) is not enabled, there is no need to configure the user name and password. |
|`withPasswd` |No| The password of the Nebula Graph user. |
|`withBatch` |Yes| The number of rows written at a time. The default value is `1000`. |
+ |`withWriteMode`|No| The write mode. Optional values are `insert` and `update`. The default value is `insert`.|

- `WriteNebulaEdgeConfig` is the configuration for writing edges. The options are described below.

@@ -214,3 +239,4 @@
|`withUser` |No| The Nebula Graph user name. If [authentication](7.data-security/1.authentication/1.authentication.md) is not enabled, there is no need to configure the user name and password. |
|`withPasswd` |No| The password of the Nebula Graph user. |
|`withBatch` |Yes| The number of rows written at a time. The default value is `1000`. |
+ |`withWriteMode`|No| The write mode. Optional values are `insert` and `update`. The default value is `insert`.|
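
The `withWriteMode` option applies to edge writes as well. Below is a minimal sketch of writing edges in `update` mode, assuming the `WriteNebulaEdgeConfig` builder shown in this document; the names `test`, `friend`, `src`, `dst`, and `degree` are illustrative:

```scala
// Assumes an existing DataFrame `df` and the connector's implicit
// DataFrame writer, as in the examples above.
val config = NebulaConnectionConfig
  .builder()
  .withMetaAddress("127.0.0.1:9559")
  .withGraphAddress("127.0.0.1:9669")
  .build()
val nebulaWriteEdgeConfig: WriteNebulaEdgeConfig = WriteNebulaEdgeConfig
  .builder()
  .withSpace("test")               // graph space to write to
  .withEdge("friend")              // edge type
  .withSrcIdField("src")           // column holding source vertex IDs
  .withDstIdField("dst")           // column holding destination vertex IDs
  .withRankField("degree")         // column holding the edge rank
  .withBatch(1000)                 // rows per write request
  .withWriteMode(WriteMode.UPDATE) // only update edges that already exist
  .build()
df.write.nebula(config, nebulaWriteEdgeConfig).writeEdges()
```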
mkdocs.yml (7 changes: 5 additions & 2 deletions)
@@ -46,7 +46,7 @@ extra:
    link: 'https://github.com/vesoft-inc/nebula-docs-cn'
  studio:
    base111b: 1.1.1-beta
-   base220: 2.2.0
+   base220: 2.2.1
  explorer:
    base100: 1.0.0
  exchange:
@@ -58,14 +58,17 @@
    release: master
  sparkconnector:
    release: master
+   branch: master
  flinkconnector:
    release: master
  dockercompose:
    release: master
  common:
    release: master
  dashboard:
-   release: nebula-graph-dashboard-beta
+   release: master
+ console:
+   release: master
  cpp:
    release: 2.5.0
  java: