DeadCardassian/POSCorefImpact

JJ2NN in Coref

Overview

This project investigates how part-of-speech (POS) changes affect the performance of a deterministic coreference resolution system (dcoref). Using the Stanford NLP pipeline and the CoNLL-2011 dataset, we modify POS tags to explore how nominal adjectives affect coreference resolution (coref) performance. This repository is independent, but it is also part of this paper.

Table of Contents

  1. Background
    1.1 The basic idea
    1.2 Coref tagger
    1.3 Dataset
  2. Setups
  3. Methodology
  4. Usage
  5. Experimental Results
  6. Conclusion
  7. Dependencies and Installation
  8. Acknowledgments

Background

1.1 The basic idea

This project grew out of an interest in the nominal adjective phenomenon. Nominal adjectives are adjectives that function as nouns in certain contexts. For instance, in the sentence "Education reform supports the gifted," the word "gifted" acts as a noun. However, most rule-based POS taggers tend to mark such words as JJ. This convention probably began with the Penn Treebank tagging guidelines (1990), for convenience:

"Generic adjectives should be tagged as adjectives (JJ) and not as plural common nouns (NNS), even when they trigger subject-verb agreement, if they can be modified by adverbs."

Therefore, this project investigates how relabeling these JJ tokens as NN affects a POS-based coref tagger. Other downstream tasks will be studied as appropriate.

1.2 Coref tagger

This project uses the Stanford Deterministic Coreference Resolution System as the coref tagger. It is a rule-based system that implements the multi-pass sieve coreference method described in Lee et al. (CoNLL Shared Task 2011) and Raghunathan et al. (EMNLP 2010). It requires inputs such as POS tags and parse trees.

1.3 Dataset

This project uses the CoNLL-2011 Shared Task dataset, which includes annotated coreference information. The development set is used to score the model's performance. The CoNLL-2011 dataset is based on OntoNotes 5.0; details here.

Setups

This research is based on the Stanford NLP and CoNLL 2011 pipelines.

Setup steps:

  1. Download OntoNotes 5.0 (LDC account required).
  2. Follow the tutorial to download the CoNLL 2011 development dataset and use the corresponding script to convert .skel files to conll-format files.
  3. Download v8.01 scorer as instructed.
  4. Follow the dcoref tutorial to reproduce their CoNLL 2011 results. This project uses the following command: java -cp "*" -Xmx8g edu.stanford.nlp.dcoref.SieveCoreferenceSystem -props coref.properties. The coref.properties file is in this repository; fill in the scorer and dataset paths first (the comments in the file give examples).

If you can reproduce the results in step 4, your setup is consistent with this project's.

Methodology

Preliminary tests on dcoref showed that its performance depends heavily on the parse (syntax) tree, especially the NP phrases within it. To isolate the relationship between POS tags and coreference, I removed the parse tree's influence on dcoref by replacing every phrase label except TOP and S with X (unclassifiable). This significantly reduced dcoref's accuracy. Replacing the phrase labels with FRAG instead gave similar results.
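The label replacement described above can be sketched as follows. This is a minimal sketch, not the contents of conll_parse_modify.py: the function name flatten_parse_bit and the regular expression are my own, and it assumes the parse information is stored as CoNLL-style parse bits such as "(TOP(S(NP*", which in the CoNLL-2011 format would be the column right after the POS tag.

```python
import re

# Labels that are left untouched, per the description above.
KEEP_LABELS = {"TOP", "S"}

# A phrase label is the run of characters right after an opening parenthesis.
_LABEL = re.compile(r"\(([^()*\s]+)")


def flatten_parse_bit(parse_bit, replacement="X"):
    """Replace every phrase label in one parse bit except TOP and S.

    Hypothetical helper: the repository's script may work differently.
    """
    def swap(match):
        label = match.group(1)
        return "(" + (label if label in KEEP_LABELS else replacement)

    return _LABEL.sub(swap, parse_bit)


print(flatten_parse_bit("(TOP(S(NP*"))          # NP becomes X, TOP and S stay
print(flatten_parse_bit("(VP*", "FRAG"))        # the FRAG variant
```

Note that only the exact label S is kept; related labels such as SBAR are replaced like any other phrase label under this reading.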

After deactivating the parse tree, I used the rule-based JJ2NN algorithm, developed by Qi Lemeng and me, to update the POS information in the dataset.

JJ2NN algorithm details: The script (conll_pos_modify.py) changes the POS tag JJ (adjective) to NN (noun) on lines that meet certain conditions. The process works as follows:

  1. Identifying Lines with POS Tag 'JJ':

    • The script splits each line into whitespace-separated fields and checks whether the fifth field (index 4) is one of 'JJ', 'JJS', or 'JJR', which are all adjective tags.
  2. Conditions for Modifying 'JJ':

    • The script checks the POS tags of the surrounding lines (previous and next) for specific patterns. These conditions involve:
      • The previous line has more than 4 fields, and its fifth field is not 'DT' (determiner).
      • The next line's fifth field is not a noun tag (does not start with 'N') and is not in a predefined list of other POS tags (like 'JJ', 'CD', etc.).
      • Further checks on the POS tag of the line after the next one (i + 2) ensure that certain combinations are met, such as whether that line carries certain adjective or verb tags.

Summary:

  • If the script detects that a word's POS tag is 'JJ' and the surrounding lines match certain patterns, it changes the 'JJ' to 'NN' (noun). This modification happens only when the complex set of conditions involving the POS tags of adjacent lines is satisfied.
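The neighbor-inspection mechanics described above can be sketched as follows. This is a minimal sketch, not the contents of conll_pos_modify.py: pos_of, retag_adjectives, and the should_retag callback are hypothetical names, and the decision rule is left as a parameter because the full condition set is not spelled out here. The sketch also assumes that JJS and JJR are rewritten to NN in the same way as JJ, and that neighbor tags are read from the unmodified input.

```python
import re

ADJ_TAGS = {"JJ", "JJS", "JJR"}

# Captures: (first four fields + spacing)(fifth field)(rest of line).
_FIFTH_FIELD = re.compile(r"^(\s*(?:\S+\s+){4})(\S+)(.*)$", re.S)


def pos_of(line):
    """Return the fifth whitespace-separated field (index 4), or None."""
    fields = line.split()
    return fields[4] if len(fields) > 4 else None


def retag_adjectives(lines, should_retag):
    """Return a copy of lines with selected adjective tags changed to NN.

    should_retag(prev_tag, next_tag, next2_tag) is a caller-supplied rule
    (hypothetical interface); each argument is a POS tag or None.
    """
    def tag_at(j):
        return pos_of(lines[j]) if 0 <= j < len(lines) else None

    out = list(lines)
    for i, line in enumerate(lines):
        if pos_of(line) not in ADJ_TAGS:
            continue
        if should_retag(tag_at(i - 1), tag_at(i + 1), tag_at(i + 2)):
            # Swap only the tag, preserving the original column spacing.
            out[i] = _FIFTH_FIELD.sub(
                lambda m: m.group(1) + "NN" + m.group(3), line, count=1
            )
    return out


def demo_rule(prev_tag, next_tag, next2_tag):
    """Illustrative rule only: covers just the first two listed conditions."""
    prev_ok = prev_tag is not None and prev_tag != "DT"
    next_ok = next_tag is None or not next_tag.startswith("N")
    return prev_ok and next_ok
```

Passing the rule in as a function keeps the file-handling part separate from the linguistic conditions, so the real condition set can be substituted without touching the line rewriting.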

Usage

Complete the setup steps above before continuing.

The dataset has the following structure: .../conll-2011-dev.v2/conll-2011/v2/data/dev/data/english/annotations/...

directory refers to: .../conll-2011/v2/data/dev

file_suffix should be: .v2_auto_conll
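To check which files the two arguments select before changing anything, something like the following can help. It is a minimal sketch under the assumption that the scripts walk the directory recursively and match files by name ending; find_conll_files is a hypothetical helper, not part of this repository.

```python
from pathlib import Path


def find_conll_files(directory, file_suffix):
    """List every file under directory whose name ends with file_suffix.

    Hypothetical helper; assumes a recursive search, which may differ
    from what the repository's scripts actually do.
    """
    return sorted(
        path
        for path in Path(directory).rglob("*")
        if path.is_file() and path.name.endswith(file_suffix)
    )


if __name__ == "__main__":
    import sys

    # Example: python list_files.py .../conll-2011/v2/data/dev .v2_auto_conll
    if len(sys.argv) == 3:
        for path in find_conll_files(sys.argv[1], sys.argv[2]):
            print(path)
```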

The following steps modify the dataset in place, so back it up first.

Run python conll_parse_modify.py directory file_suffix to deactivate the parse tree.

Run python conll_pos_modify.py directory file_suffix to apply the JJ2NN algorithm.

Run java -cp "*" -Xmx8g edu.stanford.nlp.dcoref.SieveCoreferenceSystem -props coref.properties to see the result.
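The three commands above, plus a backup, can be chained from one script. This is a minimal sketch: build_commands and run_pipeline are hypothetical names, and it assumes the current working directory contains the two Python scripts, coref.properties, and the Stanford jars.

```python
import shutil
import subprocess
from pathlib import Path

# The dcoref command from this README. "*" is passed to Java literally,
# which expands it as a classpath wildcard (no shell globbing involved).
DCOREF_CMD = [
    "java", "-cp", "*", "-Xmx8g",
    "edu.stanford.nlp.dcoref.SieveCoreferenceSystem",
    "-props", "coref.properties",
]


def build_commands(directory, file_suffix, python="python"):
    """Return the three commands in the order given in this section."""
    return [
        [python, "conll_parse_modify.py", directory, file_suffix],
        [python, "conll_pos_modify.py", directory, file_suffix],
        list(DCOREF_CMD),
    ]


def run_pipeline(directory, file_suffix, backup_dir):
    """Back up the dataset once, then run each step, stopping on failure."""
    backup = Path(backup_dir)
    if not backup.exists():
        # Copy the untouched data before the scripts modify it in place.
        shutil.copytree(directory, backup)
    for command in build_commands(directory, file_suffix):
        subprocess.run(command, check=True)
```

Skipping the copy when the backup directory already exists avoids overwriting a clean backup with already-modified data on a second run.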

Experimental Results

As shown in the table, the first row is the default result of dcoref on CoNLL 2011 dev set, the second row is the result after deactivating the parse tree, and the third row is the result of deactivating the parse tree and using the JJ2NN algorithm. The format of this table is consistent with the table on the dcoref home page. The result of deactivating the parse tree dropped significantly compared to the default result, since dcoref relies on the information in the parse tree to predict the coreference.

Taking result 2 as the baseline, modifying the POS tags with the JJ2NN algorithm improves precision while recall stays unchanged or decreases slightly, which leads to a slight increase in F1 of at most about 0.1% in relative terms. Since we modified only 206 of 136,863 words (0.15%), because nominal adjectives are rare, the small improvement in the model score illustrates the effectiveness of the JJ2NN algorithm and suggests that it is not linguistically reasonable to label adjectives with noun characteristics as JJ.

| System | MUC P | MUC R | MUC F1 | B cubed P | B cubed R | B cubed F1 | CEAF (M) P | CEAF (M) R | CEAF (M) F1 | CEAF (E) P | CEAF (E) R | CEAF (E) F1 | Avg F1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| conllst2011 dev (1) | 62.06 | 59.31 | 60.65 | 56.20 | 48.55 | 52.10 | 58.00 | 57.52 | 57.76 | 48.89 | 53.47 | 51.08 | 54.61 |
| conll 2011 w/o parse tree (2) (baseline) | 58.31 | 41.14 | 48.24 | 50.13 | 31.22 | 38.48 | 57.42 | 38.66 | 46.21 | 49.50 | 28.87 | 36.47 | 41.06 |
| conll 2011 J2N w/o parse tree (3) | 58.42 (+0.19%↑) | 41.14 | 48.28 (+0.08%↑) | 50.21 (+0.16%↑) | 31.22 | 38.50 (+0.05%↑) | 57.52 (+0.17%↑) | 38.66 | 46.24 (+0.06%↑) | 49.58 (+0.16%↑) | 28.86 (-0.03%↓) | 36.48 (+0.03%↑) | 41.09 (+0.07%↑) |
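The arithmetic behind the table can be checked with a few lines. This is a minimal sketch with hypothetical function names. The reported Avg F1 values are consistent with the unweighted mean of the MUC, B cubed, and CEAF (E) F1 scores, and the percentages in row 3 are consistent with relative changes against the baseline in row 2.

```python
def conll_avg_f1(muc_f1, b_cubed_f1, ceaf_e_f1):
    """Unweighted mean of the three F1 scores (CEAF (M) is not included)."""
    return (muc_f1 + b_cubed_f1 + ceaf_e_f1) / 3


def relative_change_pct(baseline, new):
    """Change from baseline to new, as a percentage of the baseline."""
    return (new - baseline) / baseline * 100


# Avg F1 for rows 2 and 3 of the table.
print(round(conll_avg_f1(48.24, 38.48, 36.47), 2))
print(round(conll_avg_f1(48.28, 38.50, 36.48), 2))

# Relative change in MUC precision between rows 2 and 3.
print(round(relative_change_pct(58.31, 58.42), 2))

# Share of words whose tag was modified.
print(round(206 / 136863 * 100, 2))
```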

Conclusion

This project demonstrates that retagging nominal adjectives from JJ to NN leads to a slight increase in precision and F1 score for coreference resolution, with minimal impact on recall. The results suggest that such adjustments can improve the linguistic accuracy and effectiveness of coreference systems.

Dependencies and Installation

  • Python version: 3.x
  • Stanford NLP version: 4.5.6
  • CoNLL 2011 scorer version: v8.01

Acknowledgments

Thanks to Sameer Pradhan for his help with data availability on the CoNLL 2011 website.
