Model seems to perform tasks other than paraphrasing too. #7

Open · coderpotter opened this issue Dec 27, 2020 · 0 comments
@ramsrigouthamg I trained the model using the scripts in this repo, but it seems to be performing some other task (sentiment analysis, etc.) instead of paraphrasing. The model's predictions are shown below.


Truth:
It is located at 142 South Rexford Drive in Beverly Hills . It is opposite the First Church of Christ , scientist , Beverly Hills , California .

Prediction:
True
False
________________________________________________________________________________
Daniel Armstrong is an Australian film director who is also known for his work as a writer , producer and editor .

Truth:
Daniel Armstrong is an Australian film director . Armstrong is also known for his work as a writer , producer and editor .

Prediction:
True
False
________________________________________________________________________________
Magnus turned around and reformed the British invasion of Williams by attacking the Eric Young and Orlando Jordan team .

Truth:
Magnus turned heel and reformed the British Invasion with Williams by attacking the team of Eric Young and Orlando Jordan .

Prediction:
W. Magnus in. Magnus turned around and reformed the British invasion of Williams by attacking the Eric Young and Orlando Jordan team.
"Magnus turned around and reformed the invasion of Williams by attacking the Eric Young and Orlando Jordan team ".
- In his own words., Magnus turned around and reformed the British invasion of Williams by attacking the Eric Young and Orlando Jordan team.
Magnus took over the British invasion of Williams by attacking the Eric Young and Orlando Jordan team.
Great: Magnus changed course and reversed the British invasion of Williams by attacking the Eric Young and Orlando Jordan team.

I think (and hope) something is going wrong in the way the model is called and not in the training process itself. This is how I call the model:

import logging
from datetime import datetime

import pandas as pd
import torch
from tqdm import tqdm
from transformers import T5ForConditionalGeneration, T5Tokenizer

# Fix random seeds so the sampled generations are reproducible
def set_seed(seed):
  torch.manual_seed(seed)
  if torch.cuda.is_available():
    torch.cuda.manual_seed_all(seed)

set_seed(42)

logging.basicConfig(level=logging.ERROR)
device = "cuda:1"

# NOTE: these settings are never passed to model.generate() below; the
# hard-coded arguments in the generation loop are what actually apply.
model_args = {
    "overwrite_output_dir": True,
    "max_seq_length": 256,
    "eval_batch_size": 32,
    "num_train_epochs": 1,
    "use_multiprocessing": True,
    "num_beams": None,
    "do_sample": True,
    "max_length": 50,
    "top_k": 120,
    "top_p": 0.95,
    "num_return_sequences": 5,
}

prefix = "paraphrasing"  # NOTE: unused; the comprehension below shadows it with df["prefix"] ("paraphrase")

# Load the trained model
model = T5ForConditionalGeneration.from_pretrained('t5_paraphrase/epoch2/')
model = model.to(device)
tokenizer = T5Tokenizer.from_pretrained('t5-base')

# Load the evaluation data
df = pd.read_csv("paraphrase_data/val.tsv", sep="\t")
df.columns = ["input_text", "target_text"]
df.insert(0, "prefix", ['paraphrase']*len(df), True)

# Prepare the data for testing
to_predict = [
    prefix + ": " + str(input_text)
    for prefix, input_text in zip(df["prefix"].tolist(), df["input_text"].tolist())
]
truth = df["target_text"].tolist()
print(to_predict[:5])

preds = []
# Get the model predictions
for sentence in tqdm(to_predict):
    encoding = tokenizer.encode_plus(sentence, pad_to_max_length=True, return_tensors="pt")
    input_ids, attention_masks = encoding["input_ids"].to(device), encoding["attention_mask"].to(device)
    # sample with top_k = 50, top_p = 0.95, and num_return_sequences = 5
    beam_outputs = model.generate(
        input_ids=input_ids,
        attention_mask=attention_masks,
        do_sample=True,
        max_length=256,
        top_k=50,
        top_p=0.95,
        early_stopping=True,  # no effect here: early_stopping only applies to beam search
        num_return_sequences=5
    )
    final_outputs =[]
    for beam_output in beam_outputs:
        sent = tokenizer.decode(beam_output, skip_special_tokens=True, clean_up_tokenization_spaces=True)
        if sent.lower() != sentence.lower() and sent not in final_outputs:
            final_outputs.append(sent)
    preds.append(final_outputs)

# Saving the predictions if needed
with open(f"predictions/predictions_{datetime.now()}.txt", "w") as f:
    for i, text in enumerate(df["input_text"].tolist()):
        f.write(str(text) + "\n\n")

        f.write("Truth:\n")
        f.write(truth[i] + "\n\n")

        f.write("Prediction:\n")
        for pred in preds[i]:
            f.write(str(pred) + "\n")
        f.write(
            "________________________________________________________________________________\n"
        )
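
If it helps with debugging, here is a minimal, deterministic sanity check I would append to the script above. The "paraphrase: " prefix is my assumption about what the training script used; everything else reuses the objects already defined. If the prefixed input still comes back as "True"/"False" under greedy decoding, the checkpoint itself has learned a classification-style mapping and the problem is in the training data or process rather than in how the model is called.

# Hypothetical sanity check: greedy-decode one sentence with and without the
# assumed training prefix to see whether the prefix steers the task at all.
sentence = str(df["input_text"].iloc[0])
for text in (f"paraphrase: {sentence}", sentence):
    ids = tokenizer(text, return_tensors="pt").input_ids.to(device)
    # num_beams=1 with do_sample left at its default (False) is plain greedy
    # decoding, so the two outputs differ only because of the prefix
    out = model.generate(input_ids=ids, num_beams=1, max_length=256)
    print(text[:60], "->", tokenizer.decode(out[0], skip_special_tokens=True))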