Automated multi-document summarization using extractive-abstractive approaches

Maulin Nasari, Abba Suganda Girsang

Abstract


This study presents a multi-document text summarization system that uses a hybrid approach combining extractive and abstractive methods. The goal of document summarization is to produce a coherent and comprehensive summary that captures the essential information in the source documents. The difficulty in multi-document summarization lies in the length of the input and the potential for redundant information across documents. To address this, the system first applies the TextRank algorithm as an extractor to each document, shortening the input sequence. The extractor retrieves the most important sentences from each document, which are then aggregated and used as input to the abstractor. Bidirectional and auto-regressive transformers (BART) serve as the abstractor, condensing the key sentences from each document into a more cohesive summary. The system was evaluated using the ROUGE metric and achieves ROUGE-1 (R1) and ROUGE-2 (R2) scores of 41.95 and 14.81, respectively.
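The extract-then-abstract pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation: the sentence splitter, the word-overlap similarity, the damping factor, the number of sentences kept per document (`k`), and all function names are assumptions made for illustration. The abstractor is left out; in practice the aggregated text would be passed to a BART model. The ROUGE function computes the recall variant of ROUGE-N only, which may differ from the variant reported in the paper.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter (assumption): a sentence ends at '.', '!' or '?'.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(text):
    # Lowercased alphanumeric tokens; no stemming or stop-word removal.
    return re.findall(r"[a-z0-9]+", text.lower())


def similarity(a, b):
    # Word overlap normalized by log sentence lengths, as in TextRank.
    if not a or not b:
        return 0.0
    denom = math.log(len(a)) + math.log(len(b))
    if denom == 0:
        return 0.0
    return len(set(a) & set(b)) / denom


def textrank(sentences, damping=0.85, iterations=50):
    # Weighted PageRank over the sentence-similarity graph.
    tokens = [tokenize(s) for s in sentences]
    n = len(sentences)
    weights = [
        [similarity(tokens[i], tokens[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n) if out_sum[j] > 0
            )
            for i in range(n)
        ]
    return scores


def extract(document, k=2):
    # Keep the k highest-scoring sentences, restored to document order.
    sentences = split_sentences(document)
    scores = textrank(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


def build_abstractor_input(documents, k=2):
    # Aggregate the extracted sentences of every document into one sequence.
    return " ".join(s for doc in documents for s in extract(doc, k))


def rouge_n_recall(candidate, reference, n=1):
    # Fraction of reference n-grams that also appear in the candidate.
    def ngrams(text):
        t = tokenize(text)
        return Counter(tuple(t[i:i + n]) for i in range(len(t) - n + 1))

    cand, ref = ngrams(candidate), ngrams(reference)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    return sum((cand & ref).values()) / total
```

Calling `build_abstractor_input(documents, k)` on a list of document strings yields a shortened sequence for the abstractor, and `rouge_n_recall(summary, reference, n)` with `n=1` or `n=2` scores a generated summary against a reference summary.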

Keywords


BART; Extractive-abstractive; Multi-document summarization; TextRank


DOI: http://doi.org/10.11591/ijict.v13i3.pp400-409


Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

The International Journal of Informatics and Communication Technology (IJ-ICT)
p-ISSN 2252-8776, e-ISSN 2722-2616
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).
