@ramsrigouthamg I trained the model using the scripts in this repo, but the trained model appears to be performing some other task, such as sentiment analysis, instead of paraphrasing. Its predictions are shown below.
Truth:
It is located at 142 South Rexford Drive in Beverly Hills . It is opposite the First Church of Christ , scientist , Beverly Hills , California .
Prediction:
True
False
________________________________________________________________________________
Daniel Armstrong is an Australian film director who is also known for his work as a writer , producer and editor .
Truth:
Daniel Armstrong is an Australian film director . Armstrong is also known for his work as a writer , producer and editor .
Prediction:
True
False
________________________________________________________________________________
Magnus turned around and reformed the British invasion of Williams by attacking the Eric Young and Orlando Jordan team .
Truth:
Magnus turned heel and reformed the British Invasion with Williams by attacking the team of Eric Young and Orlando Jordan .
Prediction:
W. Magnus in. Magnus turned around and reformed the British invasion of Williams by attacking the Eric Young and Orlando Jordan team.
"Magnus turned around and reformed the invasion of Williams by attacking the Eric Young and Orlando Jordan team ".
- In his own words., Magnus turned around and reformed the British invasion of Williams by attacking the Eric Young and Orlando Jordan team.
Magnus took over the British invasion of Williams by attacking the Eric Young and Orlando Jordan team.
Great: Magnus changed course and reversed the British invasion of Williams by attacking the Eric Young and Orlando Jordan team.
I think (and hope) the problem lies in how the model is called rather than in the training process itself. This is how I call the model:
import json
import logging
import torch
from datetime import datetime
import numpy as np
import pandas as pd
from transformers import T5ForConditionalGeneration,T5Tokenizer
from tqdm import tqdm
def set_seed(seed):
    torch.manual_seed(seed)
    if torch.cuda.is_available():
        torch.cuda.manual_seed_all(seed)
set_seed(42)
logging.basicConfig(level=logging.ERROR)
device = "cuda:1"
model_args = {
    "overwrite_output_dir": True,
    "max_seq_length": 256,
    "eval_batch_size": 32,
    "num_train_epochs": 1,
    "use_multiprocessing": True,
    "num_beams": None,
    "do_sample": True,
    "max_length": 50,
    "top_k": 120,
    "top_p": 0.95,
    "num_return_sequences": 5,
}
prefix = "paraphrasing"
# Load the trained model
model = T5ForConditionalGeneration.from_pretrained('t5_paraphrase/epoch2/')
model = model.to(device)
tokenizer = T5Tokenizer.from_pretrained('t5-base')
# Load the evaluation data
df = pd.read_csv("paraphrase_data/val.tsv", sep="\t")
df.columns = ["input_text", "target_text"]
df.insert(0, "prefix", ['paraphrase']*len(df), True)
# Prepare the data for testing
to_predict = [
    prefix + ": " + str(input_text)
    for prefix, input_text in zip(df["prefix"].tolist(), df["input_text"].tolist())
]
truth = df["target_text"].tolist()
print(to_predict[:5])
preds = []
# Get the model predictions
for sentence in tqdm(to_predict):
    encoding = tokenizer.encode_plus(sentence, pad_to_max_length=True, return_tensors="pt")
    input_ids, attention_masks = encoding["input_ids"].to(device), encoding["attention_mask"].to(device)
    # Sample with top_k = 50 and top_p = 0.95, returning 5 sequences per input
    beam_outputs = model.generate(
        input_ids=input_ids,
        attention_mask=attention_masks,
        do_sample=True,
        max_length=256,
        top_k=50,
        top_p=0.95,
        early_stopping=True,
        num_return_sequences=5
    )
    final_outputs = []
    for beam_output in beam_outputs:
        sent = tokenizer.decode(beam_output, skip_special_tokens=True, clean_up_tokenization_spaces=True)
        if sent.lower() != sentence.lower() and sent not in final_outputs:
            final_outputs.append(sent)
    preds.append(final_outputs)
# Saving the predictions if needed
with open(f"predictions/predictions_{datetime.now()}.txt", "w") as f:
    for i, text in enumerate(df["input_text"].tolist()):
        f.write(str(text) + "\n\n")
        f.write("Truth:\n")
        f.write(truth[i] + "\n\n")
        f.write("Prediction:\n")
        for pred in preds[i]:
            f.write(str(pred) + "\n")
        f.write(
            "________________________________________________________________________________\n"
        )
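Two small details in the script above may be worth checking separately from the main problem. The copy check compares each decoded output against `sentence`, which still carries the `paraphrase: ` prefix, so an output that simply repeats the input would not be filtered. The output filename also embeds `datetime.now()` directly, which yields spaces and colons. Here is a minimal sketch of both fixes. The helper names `filter_candidates` and `predictions_path` are hypothetical, and the `PREFIX` value is an assumption that should match whatever prefix was used during training. It is not expected to explain the `True`/`False` predictions.

```python
from datetime import datetime

# Assumption: this must be the exact prefix the model saw during training.
PREFIX = "paraphrase: "


def filter_candidates(prefixed_input, candidates, prefix=PREFIX):
    """Drop candidates that copy the source sentence or repeat each other."""
    # Compare against the raw sentence, not the prefixed model input.
    if prefixed_input.startswith(prefix):
        source = prefixed_input[len(prefix):]
    else:
        source = prefixed_input
    source_key = source.strip().lower()

    kept, seen = [], set()
    for cand in candidates:
        key = cand.strip().lower()
        if key != source_key and key not in seen:
            seen.add(key)
            kept.append(cand)
    return kept


def predictions_path(now=None):
    """Build an output filename without spaces or colons."""
    now = now or datetime.now()
    return f"predictions/predictions_{now:%Y%m%d_%H%M%S}.txt"
```

In the loop, `filter_candidates(sentence, decoded_outputs)` would replace the inline `if` check. Note that this sketch deduplicates case-insensitively, whereas the original `sent not in final_outputs` test is case-sensitive.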