PPDAI Magic Mirror Data Application Contest

Introduction

This repository contains our entry for the PPDAI Magic Mirror Data Application Contest: natural language processing (NLP) models that detect duplicate questions in Chinese.

Data

The data was provided by PPDAI and consists of pairs of questions, each labeled 0 or 1 to indicate whether the two questions are similar. Each question is represented by two sequences of integers, which are the indices of the corresponding embedding vectors (one sequence at the word level and one at the character level).
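
As a rough illustration of this representation, the sketch below pads two index sequences to a common length and looks up their embedding vectors with NumPy. The padding index, the sequence length, the toy embedding matrix, and the helper names `pad_sequence` and `embed` are all assumptions made for the example; they are not taken from pre_process.py.

```python
import numpy as np

PAD_ID = 0  # assumption: index 0 is reserved for padding


def pad_sequence(ids, max_len, pad_id=PAD_ID):
    """Truncate or right-pad a list of integer indices to exactly max_len."""
    ids = list(ids)[:max_len]
    return ids + [pad_id] * (max_len - len(ids))


def embed(ids, embedding_matrix):
    """Look up one embedding vector per index; result has shape (len(ids), dim)."""
    return embedding_matrix[np.asarray(ids, dtype=np.int64)]


# Hypothetical toy vocabulary: 10 entries with 4-dimensional vectors.
rng = np.random.default_rng(0)
word_embeddings = rng.normal(size=(10, 4))
word_embeddings[PAD_ID] = 0.0  # padding maps to the zero vector

# Two made-up questions given as word-index sequences of different lengths.
q1_words = pad_sequence([3, 7, 2], max_len=5)
q2_words = pad_sequence([3, 7, 2, 9, 1, 4], max_len=5)

q1_vectors = embed(q1_words, word_embeddings)  # shape (5, 4)
q2_vectors = embed(q2_words, word_embeddings)  # shape (5, 4)
```

The character-level sequences would be handled the same way with their own embedding matrix, so each question ends up with one word-level and one character-level array.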

Model

We propose three models: an RNN-based model, a CNN-based model, and an RCNN-based model. These models have the following characteristics:

Bi-directional GRU in the RNN-based models for semantic learning.
1-D convolution in the CNN- and RCNN-based models for local feature extraction.
Co-attention to learn the semantic correlations between the two sequences.
Self-attention to enhance the feature representation.
Word embeddings and character embeddings used simultaneously.
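
To make the two attention mechanisms above more concrete, here is a minimal NumPy sketch. It assumes plain dot-product scoring for co-attention and the two-layer scoring of the structured self-attentive embedding for self-attention; the exact formulation used in this repository's models may differ, and the function names, weight shapes, and random inputs are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def co_attention(a, b):
    """Dot-product co-attention between sequences a (m, d) and b (n, d).

    Returns a_ctx (m, d), where each position of a is a weighted sum of b,
    and b_ctx (n, d), where each position of b is a weighted sum of a.
    """
    scores = a @ b.T                       # (m, n) pairwise similarities
    a_ctx = softmax(scores, axis=1) @ b    # each row of a attends over b
    b_ctx = softmax(scores, axis=0).T @ a  # each row of b attends over a
    return a_ctx, b_ctx


def self_attentive_pooling(h, w1, w2):
    """Pool a sequence h (n, d) into r attention-weighted summaries (r, d).

    w1 has shape (d_a, d) and w2 has shape (r, d_a).
    """
    scores = w2 @ np.tanh(w1 @ h.T)    # (r, n) one score row per summary
    weights = softmax(scores, axis=1)  # normalize over sequence positions
    return weights @ h                 # (r, d)


# Hypothetical encoder outputs for two questions of lengths 3 and 5.
rng = np.random.default_rng(0)
q1 = rng.normal(size=(3, 8))
q2 = rng.normal(size=(5, 8))

q1_ctx, q2_ctx = co_attention(q1, q2)

w1 = rng.normal(size=(6, 8))  # assumed hidden size d_a = 6
w2 = rng.normal(size=(2, 6))  # assumed r = 2 summary vectors
q1_pooled = self_attentive_pooling(q1, w1, w2)
```

In a full model the inputs `q1` and `q2` would be the outputs of the Bi-GRU or convolution layers rather than random arrays, and `w1` and `w2` would be learned.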

Performance:

Our ensemble model achieved a loss of 0.203930 in the semi-final, placing in the top 15% of the ranking.
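
The loss function and the ensembling method are not spelled out here. The sketch below assumes the loss is the mean binary log loss and that the ensemble averages the predicted probabilities of the individual models; both are assumptions, and the labels and predictions are made-up numbers for illustration only.

```python
import math


def log_loss(labels, probs, eps=1e-15):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)


def average_ensemble(model_probs):
    """Average per-example probabilities across models (one list per model)."""
    n_models = len(model_probs)
    return [sum(ps) / n_models for ps in zip(*model_probs)]


# Made-up labels and per-model predictions, for illustration only.
labels = [1, 0, 1]
rnn_probs = [0.9, 0.2, 0.6]
cnn_probs = [0.7, 0.1, 0.8]

ensemble_probs = average_ensemble([rnn_probs, cnn_probs])
print(log_loss(labels, ensemble_probs))
```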

Reference

“QANet: Combining Local Convolution with Global Self-Attention for Reading Comprehension”. In: ICLR (2018).

Zhouhan Lin et al. “A Structured Self-attentive Sentence Embedding”. In: CoRR abs/1703.03130 (2017). arXiv:1703.03130.

Pranav Rajpurkar et al. “SQuAD: 100,000+ Questions for Machine Comprehension of Text”. In: CoRR abs/1606.05250 (2016). arXiv:1606.05250.

Wenhui Wang et al. “Gated Self-Matching Networks for Reading Comprehension and Question Answering”.
