Codebase for Elementary Discourse Unit based Argument Parser
This is the implementation of the paper:
EDU-AP: Elementary Discourse Unit based Argument Parser
Sougata Saha, Souvik Das, Rohini Srihari
The 23rd Annual Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL 2022)
Neural approaches to end-to-end argument mining (AM) are often formulated as dependency parsing (DP), which relies on token-level sequence labeling and intricate post-processing for extracting argumentative structures from text. Although such methods yield reasonable results, operating solely with tokens increases the possibility of discontinuous and overly segmented structures due to minor inconsistencies in token-level predictions. In this paper, we propose EDU-AP, an end-to-end argument parser that alleviates such problems in dependency-based methods by exploiting the intrinsic relationship between elementary discourse units (EDUs) and argumentative discourse units (ADUs), and operates at both token and EDU level granularity. Further, by appropriately using contextual information and optimizing a novel objective function during training, EDU-AP achieves significant improvements across all four tasks of AM compared to existing dependency-based methods.
Persuasive Essays (PE) Corpus: https://tudatalib.ulb.tu-darmstadt.de/handle/tudatalib/2422
BiLSTM-CRF based discourse segmenter (NeuralEDUSeg) by Wang et al. (2018): https://github.com/PKU-TANGENT/NeuralEDUSeg
You can train and evaluate all experiments using the runner.sh script. For example,
nohup bash runner.sh 1 12 > log.txt 2>&1 &
runs experiments 1 to 12 sequentially. The configuration for each experiment can be found in the experiment_config.json file.
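As a quick way to orient yourself, the sketch below shows how one might inspect the configurations and run a single experiment. It assumes runner.sh interprets its two arguments as a start and end experiment number, as the example above suggests; this is an inference, not documented behavior.

# Pretty-print the experiment configurations to see what each experiment number maps to.
python -m json.tool experiment_config.json

# Assuming the two arguments are a start and end experiment number, repeating the same
# number should run just that one experiment (here, experiment 3) in the background.
nohup bash runner.sh 3 3 > log_exp3.txt 2>&1 &

# Follow the training log as it is written.
tail -f log_exp3.txt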
To experiment with different parameters, you can directly execute the run_training.py script. Sample command:
python -m torch.distributed.run --nnodes=1 --nproc_per_node=4 --master_port 9999 ./run_training.py --batch_size 16 --num_epochs 15 --learning_rate 0.00002 --base_transformer "roberta-base"
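To sweep a hyperparameter, one option is to wrap the same command in a small shell loop. The sketch below varies only the learning rate and uses a single process (--nproc_per_node=1); the flag names are taken from the sample command above, but running single-GPU is an assumption and may need adjusting for your hardware.

# Hypothetical sweep over learning rates, reusing the documented flags.
for lr in 0.00001 0.00002 0.00003; do
  python -m torch.distributed.run --nnodes=1 --nproc_per_node=1 --master_port 9999 \
    ./run_training.py --batch_size 16 --num_epochs 15 --learning_rate $lr \
    --base_transformer "roberta-base"
done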
Prior to training, please download the PE dataset into a folder named ./data/, and run NeuralEDUSeg on the data to generate the EDUs.
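A rough sketch of the data preparation steps is shown below. The PE corpus page requires a manual download, and the exact NeuralEDUSeg command depends on that repository's own README, so the segmentation step is left as a comment rather than a concrete invocation.

# Create the folder the training scripts expect.
mkdir -p ./data
# Manually download the PE corpus from
# https://tudatalib.ulb.tu-darmstadt.de/handle/tudatalib/2422 and extract it into ./data/.

# Clone the discourse segmenter and run it on the essays to produce EDUs;
# see that repository's README for the exact command and input/output format.
git clone https://github.com/PKU-TANGENT/NeuralEDUSeg.git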
If you use this library, please cite:
@inproceedings{saha-etal-2022-edu-ap,
title = "EDU-AP: Elementary Discourse Unit based Argument Parser",
author = "Saha, Sougata and
Das, Souvik and
Srihari, Rohini",
booktitle = "Proceedings of the 23rd Annual Meeting of the Special Interest Group on Discourse and Dialogue",
month = sep,
year = "2022",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2022.sigdial-1.19",
pages = "183--192"
}