Character Embedding + ESIM + Focal Loss for Chinese Answer Sentence Selection

This is a course project for Web Data Mining. The task is to decide whether a candidate sentence contains the answer to a given question. We use ESIM (Enhanced LSTM for Natural Language Inference) as our main model. Pretrained Chinese character embeddings are adopted to facilitate character-level matching between questions and answers. We employ focal loss to address the label imbalance. The attached slides (presentation.pdf) explain our method in more detail.
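To make the role of focal loss more concrete, here is a minimal, framework-free sketch of the binary focal loss formula, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t). It is not the code in train.py: the function name `binary_focal_loss` and the default values `gamma=2.0` and `alpha=0.25` are assumptions taken from the common formulation of focal loss, since this README does not state the settings used in the project.

```python
import math


def binary_focal_loss(probs, labels, gamma=2.0, alpha=0.25):
    """Mean binary focal loss over a batch (illustrative sketch).

    probs  -- predicted probability of the positive class, one per example
    labels -- gold labels, 1 (sentence contains the answer) or 0
    gamma  -- focusing parameter; 0 reduces to weighted cross-entropy (assumed default)
    alpha  -- weight of the positive class (assumed default)
    """
    eps = 1e-12  # avoid log(0)
    total = 0.0
    for p, y in zip(probs, labels):
        # p_t is the probability the model assigns to the true class.
        p_t = p if y == 1 else 1.0 - p
        alpha_t = alpha if y == 1 else 1.0 - alpha
        p_t = min(max(p_t, eps), 1.0)
        # (1 - p_t)^gamma shrinks the loss of well-classified examples.
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)


if __name__ == "__main__":
    easy = binary_focal_loss([0.9], [1])
    hard = binary_focal_loss([0.3], [1])
    print(f"easy positive: {easy:.6f}, hard positive: {hard:.6f}")
```

Because most candidate sentences are negatives that the model classifies easily, the (1 - p_t)^gamma factor reduces their contribution so that training focuses on the rare positives and the hard negatives.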

Requirements

Python (>= 3.6)
PyTorch (>= 1.0)
torchtext

Dataset

The dataset for this project is NLPCC DBQA 2016.

Results

Model	MAP (%)	MRR (%)
All-0	25.30	25.81
BERT	93.73	93.83
Ours	90.33	90.48
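
For readers unfamiliar with the two metrics, the following sketch shows how MAP (mean average precision) and MRR (mean reciprocal rank) can be computed for answer sentence selection. It is an illustration, not the official NLPCC evaluation script: the function names, the `(score, label)` input layout, and the handling of ties and of questions without any positive sentence are assumptions.

```python
def average_precision(ranked_labels):
    """Average precision for one question, given labels in ranked order."""
    hits = 0
    precisions = []
    for rank, label in enumerate(ranked_labels, start=1):
        if label == 1:
            hits += 1
            precisions.append(hits / rank)  # precision at this positive
    # Assumption: a question with no positive sentence scores 0.
    return sum(precisions) / len(precisions) if precisions else 0.0


def reciprocal_rank(ranked_labels):
    """1 / rank of the first positive sentence, or 0 if there is none."""
    for rank, label in enumerate(ranked_labels, start=1):
        if label == 1:
            return 1.0 / rank
    return 0.0


def evaluate(groups):
    """Return (MAP, MRR) over questions.

    groups -- one list per question, each holding (score, label) pairs
              for that question's candidate sentences.
    """
    aps, rrs = [], []
    for candidates in groups:
        # Rank candidates by model score, highest first.
        ranked = sorted(candidates, key=lambda pair: pair[0], reverse=True)
        labels = [label for _, label in ranked]
        aps.append(average_precision(labels))
        rrs.append(reciprocal_rank(labels))
    return sum(aps) / len(groups), sum(rrs) / len(groups)


if __name__ == "__main__":
    toy = [
        [(0.9, 0), (0.8, 1), (0.1, 1)],  # first positive at rank 2
        [(0.7, 1), (0.2, 0)],            # positive at rank 1
    ]
    map_score, mrr_score = evaluate(toy)
    print(f"MAP = {map_score:.4f}, MRR = {mrr_score:.4f}")
```

The table above reports these scores multiplied by 100.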
