qaeval-experiments

This directory contains the code to reproduce the expert QA results on the TAC 2008 dataset.

Required data:

Required environments:

Included data:

To reproduce the results, run

sh experiments/question-answering/tac2008/run.sh

After the run.sh script finishes, the metric correlations (Table 3) will be written to the following locations:

The QA metrics (Table 2) will be written to output/squad-metrics.json and output/answer-verification/log.txt. The first file contains the is-answerable F1 (is_answerable -> unweighted -> f1) and the EM/F1 scores on only the subset of the data that is answerable (is-answerable-only -> squad -> exact-match/f1). The second file contains the human-labeled answer accuracy given that the question is answerable ("Accuracy given ground-truth question is answerable 0.8429003021148036").
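The nested key paths above can be pulled out with a short script. A minimal sketch (the JSON layout is assumed from the description above, not verified against the repository's output files):

```python
import json


def load_qa_metrics(path="output/squad-metrics.json"):
    """Load the Table 2 QA metrics from the metrics JSON file.

    Key paths are assumed from the README description:
      is_answerable -> unweighted -> f1
      is-answerable-only -> squad -> exact-match / f1
    """
    with open(path) as f:
        metrics = json.load(f)
    return {
        "is_answerable_f1": metrics["is_answerable"]["unweighted"]["f1"],
        "answerable_em": metrics["is-answerable-only"]["squad"]["exact-match"],
        "answerable_f1": metrics["is-answerable-only"]["squad"]["f1"],
    }
```

The human-labeled accuracy in output/answer-verification/log.txt is plain text rather than JSON, so it is easiest to read that number directly from the log line quoted above.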