Update search tutorials for pipelines and README #3092

Merged
merged 3 commits into from
Aug 19, 2022
4 changes: 2 additions & 2 deletions pipelines/README.md
@@ -88,8 +88,8 @@ Downloading the GPU image takes about 3 minutes; once the container has started successfully, via the …
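Existing engineering-specification lookup solutions on the market still rely on traditional keyword matching. They depend on users to sort and filter the results themselves, and sometimes to re-read the specification documents by hand before confirming that a clause is the one they were looking for. A traditional specification lookup system needs at least 3 to 5 queries to find the clause the user wants. The 寻规 system, by contrast, is a semantic retrieval system built on a powerful pretrained model: for 80% of specification lookups, just **1 query** is enough to match the query intent precisely and return the relevant clause.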

## :mortar_board: Tutorials
-- Tutorial 1 - Semantic Search Pipeline: [AIStudio notebook]() | [Python](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/pipelines/examples/semantic-search/semantic_search_example.py)
-- Tutorial 2 - Question Answering Pipeline: [AIStudio notebook]() | [Python](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/pipelines/examples/question-answering/dense_qa_example.py)
+- Tutorial 1 - Semantic Search Pipeline: [AIStudio notebook](https://aistudio.baidu.com/aistudio/projectdetail/4442670) | [Python](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/pipelines/examples/semantic-search/semantic_search_example.py)
+- Tutorial 2 - Question Answering Pipeline: [AIStudio notebook](https://aistudio.baidu.com/aistudio/projectdetail/4442857) | [Python](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/pipelines/examples/question-answering/dense_qa_example.py)
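## :vulcan_salute: Community
Scan the QR code with WeChat and fill in the questionnaire to join the discussion group and learn alongside peers from all walks of life.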
<div align="center">
1 change: 1 addition & 0 deletions pipelines/examples/question-answering/Install_windows.md
@@ -14,6 +14,7 @@ pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
cd ${HOME}/PaddleNLP/applications/experimental/pipelines/
python setup.py install
```
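+[Note] All of the following steps only need to be run from the `pipelines` root directory; there is no need to change directories.
### 1.2 Data description
The QA knowledge base consists of Baidu Baike encyclopedia articles on major Chinese cities, which we crawled. We extracted the unstructured text from all of the documents and split it by paragraph to form the knowledge base of the QA system: 365 city articles in total, yielding 1318 paragraphs after splitting.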

5 changes: 3 additions & 2 deletions pipelines/examples/question-answering/README.md
@@ -52,6 +52,7 @@ pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
cd ${HOME}/PaddleNLP/applications/experimental/pipelines/
python setup.py install
```
+[Note] All of the following steps only need to be run from the `pipelines` root directory; there is no need to change directories.
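
The note above can be checked before running any of the later commands. The following is a minimal sketch and not part of this PR: `missing_markers` and `looks_like_pipelines_root` are hypothetical helpers, and the marker paths are simply the ones that the commands in this README reference (`rest_api/application.py`, `ui/webapp_question_answering.py`, `examples/question-answering/run_qa_server.sh`). Adjust them if your checkout differs.

```python
from pathlib import Path

# Paths that the commands in this README reference relative to the
# `pipelines` root. Treating them as markers is an assumption of this sketch.
MARKER_PATHS = (
    "rest_api/application.py",
    "ui/webapp_question_answering.py",
    "examples/question-answering/run_qa_server.sh",
)


def missing_markers(root="."):
    """Return the marker paths that do not exist as files under `root`."""
    base = Path(root)
    return [p for p in MARKER_PATHS if not (base / p).is_file()]


def looks_like_pipelines_root(root="."):
    """True if every marker path exists under `root` (hypothetical helper)."""
    return not missing_markers(root)


if __name__ == "__main__":
    missing = missing_markers()
    if missing:
        print("Not in the pipelines root; missing:", ", ".join(missing))
    else:
        print("Current directory looks like the pipelines root.")
```

Running the script from the wrong directory lists the files it could not find, which is usually enough to tell where you are relative to `pipelines`.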
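### 3.2 Data description
The QA knowledge base consists of Baidu Baike encyclopedia articles on major Chinese cities, which we crawled. We extracted the unstructured text from all of the documents and split it by paragraph to form the knowledge base of the QA system: 365 city articles in total, yielding 1318 paragraphs after splitting.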
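
Splitting documents by paragraph, as described above, can be sketched as follows. This is a minimal illustration and not the project's preprocessing code: it assumes paragraphs are separated by line breaks, and `split_into_paragraphs` and `build_knowledge_base` are hypothetical names.

```python
def split_into_paragraphs(text):
    """Split raw text on line breaks and drop empty or whitespace-only pieces."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def build_knowledge_base(documents):
    """Turn a {title: text} mapping into a flat list of paragraph records."""
    records = []
    for title, text in documents.items():
        for paragraph in split_into_paragraphs(text):
            records.append({"title": title, "content": paragraph})
    return records


if __name__ == "__main__":
    # Toy input; the real knowledge base has 365 city documents.
    docs = {
        "City A": "First paragraph.\n\nSecond paragraph.\n",
        "City B": "Only paragraph.",
    }
    kb = build_knowledge_base(docs)
    print(len(docs), "documents ->", len(kb), "paragraphs")
```

Keeping the document title on each record is a design choice of the sketch: it lets a retrieved paragraph be traced back to the city article it came from.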

@@ -132,7 +133,7 @@ python rest_api/application.py 8891
Linux users are advised to start the service with a shell script:

```bash
-sh scripts/run_qa_server.sh
+sh examples/question-answering/run_qa_server.sh
```
After startup, you can use the curl command to verify that the service is running:

@@ -150,7 +151,7 @@ python -m streamlit run ui/webapp_question_answering.py --server.port 8502
Linux users are advised to start the service with a shell script:

```bash
-sh scripts/run_qa_web.sh
+sh examples/question-answering/run_qa_web.sh
```

At this point you can open http://127.0.0.1:8502 in a browser to try out the city encyclopedia QA service.
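
Before opening the browser, it can help to confirm that both services are accepting connections. The sketch below only tests whether a TCP port is open; it does not call any REST endpoint. Ports 8891 and 8502 are the ones used in the commands above. `is_port_open` is a hypothetical helper, and the host `127.0.0.1` assumes everything runs on the local machine.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 8891: rest_api/application.py; 8502: the streamlit web UI.
    for name, port in (("REST API", 8891), ("Web UI", 8502)):
        state = "up" if is_port_open("127.0.0.1", port) else "not reachable"
        print(f"{name} on port {port}: {state}")
```

An open port shows only that something is listening there, so this complements the curl check rather than replacing it.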
1 change: 1 addition & 0 deletions pipelines/examples/semantic-search/Install_windows.md
@@ -13,6 +13,7 @@ pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
cd ${HOME}/PaddleNLP/applications/experimental/pipelines/
python setup.py install
```
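+[Note] All of the following steps only need to be run from the `pipelines` root directory; there is no need to change directories.
### 1.2 Data description
The data for the semantic search database comes from the [DuReader-Robust dataset](https://github.com/baidu/DuReader/tree/master/DuReader-Robust), which contains 46972 paragraphs in total.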

6 changes: 4 additions & 2 deletions pipelines/examples/semantic-search/README.md
@@ -57,6 +57,8 @@ pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
cd ${HOME}/PaddleNLP/applications/experimental/pipelines/
python setup.py install
```
+[Note] All of the following steps only need to be run from the `pipelines` root directory; there is no need to change directories.

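### 3.2 Data description
The data for the semantic search database comes from the [DuReader-Robust dataset](https://github.com/baidu/DuReader/tree/master/DuReader-Robust), which contains 46972 paragraphs in total; the 1417 paragraphs of its validation set were selected to build the semantic search system.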

@@ -183,7 +185,7 @@ python rest_api/application.py 8891
Linux users are advised to start the service with a shell script:

```bash
-sh scripts/run_search_server.sh
+sh examples/semantic-search/run_search_server.sh
```
After startup, you can use the curl command to verify that the service is running:

@@ -201,7 +203,7 @@ python -m streamlit run ui/webapp_semantic_search.py --server.port 8502
Linux users are advised to start the service with a shell script:

```bash
-sh scripts/run_search_web.sh
+sh examples/semantic-search/run_search_web.sh
```

At this point you can open http://127.0.0.1:8502 in a browser to try out the semantic search service.