Skip to content

Commit

Permalink
Initial
Browse files Browse the repository at this point in the history
  • Loading branch information
XinzeZhang committed Mar 6, 2022
0 parents commit 7994588
Show file tree
Hide file tree
Showing 55 changed files with 10,380 additions and 0 deletions.
218 changes: 218 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,218 @@
# File created using '.gitignore Generator' for Visual Studio Code: https://bit.ly/vscode-gig

# Created by https://www.toptal.com/developers/gitignore/api/visualstudiocode,macos,python
# Edit at https://www.toptal.com/developers/gitignore?templates=visualstudiocode,macos,python

aux_files/
corpus/
dumped/
models/Transformer/en-de-en

### macOS ###
# General
.DS_Store
.AppleDouble
.LSOverride

# Icon must end with two \r
Icon


# Thumbnails
._*

# Files that might appear in the root of a volume
.DocumentRevisions-V100
.fseventsd
.Spotlight-V100
.TemporaryItems
.Trashes
.VolumeIcon.icns
.com.apple.timemachine.donotpresent

# Directories potentially created on remote AFP share
.AppleDB
.AppleDesktop
Network Trash Folder
Temporary Items
.apdisk

### Python ###
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
.pybuilder/
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock

# PEP 582; used by e.g. github.com/David-OConnor/pyflow
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# Cython debug symbols
cython_debug/

# PyCharm
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/

### VisualStudioCode ###
.vscode/*
!.vscode/settings.json
!.vscode/tasks.json
!.vscode/launch.json
!.vscode/extensions.json
!.vscode/*.code-snippets

# Local History for Visual Studio Code
.history/

# Built Visual Studio Code Extensions
*.vsix

### VisualStudioCode Patch ###
# Ignore all local history of files
.history
.ionide

# Support for Project snippet scope

# End of https://www.toptal.com/developers/gitignore/api/visualstudiocode,macos,python

# Custom rules (everything added below won't be overriden by 'Generate .gitignore File' if you use 'Update' option)

16 changes: 16 additions & 0 deletions .vscode/launch.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
{
// Use IntelliSense to learn about possible attributes.
// Hover to view descriptions of existing attributes.
// For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
"version": "0.2.0",
"configurations": [
{
"name": "Python: Current File",
"type": "python",
"request": "launch",
"program": "${file}",
"console": "integratedTerminal",
"justMyCode": true
}
]
}
3 changes: 3 additions & 0 deletions .vscode/settings.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
{
"cSpell.enabled": false
}
79 changes: 79 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,79 @@
# Crafting Adversarial Examples for Neural Machine Translation

We provide code for the Paper: *Crafting Adversarial Examples for Neural Machine Translation* \
Xinze Zhang, Junzhe Zhang, Zhenhua Chen, Kun He

## Introduction

Effective adversary generation for neural machine translation (NMT) is a crucial prerequisite for building robust machine translation systems. In this work, we investigate veritable evaluations of NMT adversarial attacks, and propose a novel method to craft NMT adversarial examples. We propose to leverage the round-trip translation technique to build valid metrics for evaluating NMT adversarial attacks. Our intuition is that an effective NMT adversarial example, which imposes minor shifting on the source and degrades the translation dramatically, would naturally lead to a semantic-destroyed round-trip translation result. We then propose a promising black-box attack method called Word Saliency speedup Local Search (WSLS) that could effectively attack the mainstream NMT architectures.

Our main contributions are as follows:

1) We introduce an appropriate definition of NMT adversary and the deriving evaluation metrics, which are capable of estimating the adversaries only using source information, and tackle well the challenge of missing ground-truth reference after the perturbation.

2) We propose a novel black-box word level NMT attack method that could effectively attack the mainstream NMT models, and exhibit high transferability when attacking popular online translators.


## Server Environment
- Pos_OS 20.04.LTS
- Python 3.6
- CUDA >= 10.2
- use `cat /usr/lib/cuda/version.txt` to check the cuda version
- NCCL for fairseq
- The official and tested builds of NCCL can be downloaded from: https://developer.nvidia.com/nccl
- apex for faster training (require RTX-based GPUs)
```
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" \
--global-option="--deprecated_fused_adam" --global-option="--xentropy" \
--global-option="--fast_multihead_attn" ./
```

## Requirements
```
pickle
sklearn
torch
nltk
pkuseg
transformers
pyltp
zhon
spacy
spacy(en-pipcore-web-sm)
spacy_cld
sacremoses
subword-nmt
fairseq
fvcore
```
### Installation
- pickle
- default package in the protogenetic conda python env
- sklearn, nltk, torch
```
conda install scikit-learn
conda install nltk
conda install pytorch cudatoolkit=10.2 -c pytorch
```
- pkuseg, zhon, spacy, spacy([en-pipcore-web-sm](https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.2.0/en_core_web_sm-2.2.0.tar.gz#egg=en_core_web_sm==2.2.0)), spacy_cld, sacremoses, subword, pyltp (require python = 3.6),transformers (require pytorch), fairseq (require Pytorch >= 1.4.0, Python >= 3.5, Nvidia GPU , and NCCL)
```
pip install pkuseg
pip install zhon
pip install sacremoses
pip install subword-nmt
pip install pyltp
pip install transformers
pip install fvcore
pip install spacy-cld
pip install -U spacy
pip install ./aux_files/en_core_web_sm-2.2.0.tar.gz
cd ..
git clone https://github.com/pytorch/fairseq
cd fairseq
pip install --editable ./
```
where `en_core_web_sm-2.2.0.tar.gz` can be downloaded in the [link](https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.2.0/en_core_web_sm-2.2.0.tar.gz#egg=en_core_web_sm==2.2.0)

Loading

0 comments on commit 7994588

Please sign in to comment.