Commit

1.0
425776024 committed Mar 7, 2020
1 parent 088ef23 commit 537c6a4
Showing 24 changed files with 24,320 additions and 19 deletions.
4 changes: 4 additions & 0 deletions .ignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
.idea
.ignore
test
update.sh
1 change: 1 addition & 0 deletions MANIFEST.in
@@ -0,0 +1 @@
include nlpcda/data/*.txt  # include all .txt files under nlpcda/data
14 changes: 13 additions & 1 deletion README.md
@@ -1 +1,13 @@
# A (Chinese) NLP Text Summarization
# NLP Chinese Data Augmentation: a one-click Chinese data augmentation tool

---

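A one-click Chinese data augmentation tool. Supported operations:
- Random entity replacement
- Synonym replacement
- Replacement with near-synonym or near-homophone characters
- Random character deletion

Generates a specified number of training-corpus texts without modifying the original text.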

Email:[email protected]
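
The random character deletion listed in the README can be sketched in a few lines. This is a minimal illustration only, not the code in `nlpcda/tools/randomdeletechar.py` (which this diff does not show); the function name `random_delete_chars`, its parameters, and the default deletion rate are all hypothetical.

```python
import random


def random_delete_chars(text, n_variants=3, delete_rate=0.1, seed=None):
    """Return up to n_variants distinct copies of text with some characters dropped.

    Hypothetical helper: each character is dropped independently with
    probability delete_rate. The input string itself is never modified.
    """
    rng = random.Random(seed)
    variants = []
    seen = {text}  # never return the unchanged original or a duplicate
    attempts = 0
    max_attempts = n_variants * 20  # bound the loop for very short inputs
    while len(variants) < n_variants and attempts < max_attempts:
        attempts += 1
        candidate = ''.join(ch for ch in text if rng.random() >= delete_rate)
        if candidate and candidate not in seen:
            seen.add(candidate)
            variants.append(candidate)
    return variants


original = '这是一家科技公司今天发布了新的产品和服务'
augmented = random_delete_chars(original, n_variants=3, delete_rate=0.3, seed=0)
for variant in augmented:
    print(variant)
```

Passing a `seed` makes the output reproducible, which helps when the augmented corpus has to be regenerated. Returning "up to" `n_variants` results rather than exactly that many avoids an endless loop on inputs too short to yield enough distinct variants.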
11 changes: 11 additions & 0 deletions nlpcda.egg-info/PKG-INFO
@@ -0,0 +1,11 @@
Metadata-Version: 1.0
Name: nlpcda
Version: 1.0
Summary: NLP Chinese Data Augmentation, a one-click Chinese data augmentation tool
Home-page: https://github.com/425776024/nlpcda
Author: Jiang.XinFa
Author-email: [email protected]
License: MIT Licence
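Description: A one-click Chinese data augmentation tool supporting random entity replacement, synonym replacement, near-synonym and near-homophone character replacement, and random character deletion. Generates a specified number of training-corpus texts without modifying the original text.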
Keywords: pip,nlptool,nlpcda,nlp
Platform: any
21 changes: 21 additions & 0 deletions nlpcda.egg-info/SOURCES.txt
@@ -0,0 +1,21 @@
MANIFEST.in
README.md
setup.py
nlpcda/__init__.py
nlpcda/config.py
nlpcda/example.py
nlpcda.egg-info/PKG-INFO
nlpcda.egg-info/SOURCES.txt
nlpcda.egg-info/dependency_links.txt
nlpcda.egg-info/requires.txt
nlpcda.egg-info/top_level.txt
nlpcda/data/company.txt
nlpcda/data/反意字.txt
nlpcda/data/同义词.txt
nlpcda/data/同音意字.txt
nlpcda/tools/Basetool.py
nlpcda/tools/__init__.py
nlpcda/tools/homophone.py
nlpcda/tools/randomdeletechar.py
nlpcda/tools/randomword.py
nlpcda/tools/similarword.py
1 change: 1 addition & 0 deletions nlpcda.egg-info/dependency_links.txt
@@ -0,0 +1 @@

1 change: 1 addition & 0 deletions nlpcda.egg-info/requires.txt
@@ -0,0 +1 @@
jieba
1 change: 1 addition & 0 deletions nlpcda.egg-info/top_level.txt
@@ -0,0 +1 @@
nlpcda
4 changes: 4 additions & 0 deletions nlpcda/__init__.py
@@ -0,0 +1,4 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-

__author__ = 'Jiang.XinFa'
12 changes: 12 additions & 0 deletions nlpcda/config.py
@@ -0,0 +1,12 @@
#!/usr/bin/python
# -*- coding: utf-8 -*-

import os

root_path = os.path.abspath(os.path.dirname(__file__))

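homophone_path = os.path.join(root_path, 'data', '同音意字.txt')  # homophone / similar-meaning characters
similarword_path = os.path.join(root_path, 'data', '同义词.txt')  # synonyms

random_path = os.path.join(root_path, 'data', 'company.txt')  # presumably the entity list for random replacement
company_path = os.path.join(root_path, 'data', 'company.txt')  # same file as random_path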
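
As a rough illustration of how path constants like those in `config.py` might be consumed, the sketch below reads a word-group file line by line. The file layout (one whitespace-separated group of interchangeable words per line), the function name `load_word_groups`, and the sample data are assumptions made for this example; the actual format of `同义词.txt` is not shown in this diff.

```python
import os
import tempfile


def load_word_groups(path):
    """Read a UTF-8 text file and return one list of words per usable line.

    Assumed layout (hypothetical): each line holds whitespace-separated words
    that may stand in for one another.
    """
    groups = []
    with open(path, encoding='utf-8') as f:
        for line in f:
            words = line.split()
            if len(words) > 1:  # a lone word has nothing to swap with
                groups.append(words)
    return groups


# Demo on a throwaway file so the example does not depend on the package data.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, 'sample_groups.txt')
    with open(sample_path, 'w', encoding='utf-8') as f:
        f.write('开心 高兴 愉快\n孤立\n\n美丽 漂亮\n')
    sample_groups = load_word_groups(sample_path)

print(sample_groups)
```

Opening the file with an explicit `encoding='utf-8'` matters here: the data files contain Chinese text, and the platform default encoding is not UTF-8 everywhere.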