Submit Answers to Annotation Assist
A human annotator needs to judge whether the answers returned by the various systems are correct. The Annotation Assist tool provides a visual interface for this.
Annotation Assist takes a corpus file and a question/answer pairs file as input. These files must first be converted into Annotation Assist format.
themis judge corpus <corpus.csv> > <annotation-assist.corpus.json>
Here corpus.csv is the corpus file downloaded from XMGR. This command generates annotation-assist.corpus.json as its output.
themis judge pairs <answers.wea.csv> <answers.solr.csv> <answers.nlc.csv> > <annotation-assist.pairs.csv>
Here answers.wea.csv, answers.solr.csv and answers.nlc.csv are the answer files generated by querying the respective systems, and annotation-assist.pairs.csv is the output file.
Annotating every question in the data set is time consuming. If only a small, representative portion of the data needs to be annotated, the following commands are useful.
themis question sample <qa-pairs.csv> <size> > <sample.csv>
Here qa-pairs.csv contains the question-answer pairs extracted from the usage log by the 'question extract' command, size is the number of unique questions to sample, and sample.csv is the resulting sample file.
This command will sample questions without replacement according to a distribution determined by their frequency, so more frequently asked questions are more likely to be in the sample.
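The sampling behaviour described above can be illustrated with a short Python sketch. This is not the Themis implementation. It only shows one common way to draw unique questions without replacement with probability proportional to how often each was asked. The function name sample_questions and its seed parameter are hypothetical.

```python
import random
from collections import Counter


def sample_questions(questions, size, seed=None):
    """Pick `size` unique questions, favouring frequently asked ones.

    `questions` is a list with one entry per time a question was asked,
    so repeated entries carry the frequency information.
    (Hypothetical helper, not part of Themis.)
    """
    rng = random.Random(seed)
    remaining = dict(Counter(questions))  # question -> times asked
    if size > len(remaining):
        raise ValueError("size exceeds the number of unique questions")

    sample = []
    for _ in range(size):
        candidates = list(remaining)
        weights = [remaining[q] for q in candidates]
        # Draw one question with probability proportional to its frequency.
        chosen = rng.choices(candidates, weights=weights, k=1)[0]
        sample.append(chosen)
        # Remove it so it cannot be drawn again (without replacement).
        del remaining[chosen]
    return sample
```

Each draw removes the chosen question from the pool, so a frequent question is likely to appear early in the sample but can never appear twice.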
themis judge pairs --questions <sample.csv> <answers.wea.csv> <answers.solr.csv> <answers.nlc.csv> > <annotation-assist.pairs.csv>
Here sample.csv is the sample file generated by the previous command. answers.wea.csv, answers.solr.csv and answers.nlc.csv are the answer files generated by querying the respective systems, and annotation-assist.pairs.csv is the output file.
themis util truncate-answers <annotation-assist.pairs.csv> <length>
Here annotation-assist.pairs.csv is the Annotation Assist file and length is the maximum length to which each answer is shortened.
This command truncates answers to the given length so that they comply with the limits of the Annotation Assist system.
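As a rough illustration of what truncation means here, the following Python sketch shortens the answer text in a list of rows. It is an assumption-laden sketch, not the Themis code: the column name "Answer", the helper name truncate_answers, and the choice to measure length in characters are all hypothetical.

```python
def truncate_answers(rows, length, column="Answer"):
    """Return copies of `rows` with the text in `column` cut to `length` characters.

    `rows` is a list of dictionaries, for example as produced by csv.DictReader.
    The column name "Answer" is an assumption, not the documented file layout.
    """
    truncated = []
    for row in rows:
        shortened = dict(row)  # copy so the input rows are left unchanged
        shortened[column] = row[column][:length]
        truncated.append(shortened)
    return truncated
```

Answers already shorter than the limit pass through unchanged, because slicing a short string returns the whole string.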
After annotation, the results can be combined into a single CSV file. This CSV file needs to be converted into Themis format with the following command.
themis judge interpret <annotation-assist.judgments.csv> > <judgments.csv>
Here annotation-assist.judgments.csv is the CSV file from the Annotation Assist tool and judgments.csv is the Themis-formatted file.