TY - JOUR
T1 - Neural Attention Model for Abstractive Text Summarization Using Linguistic Feature Space
AU - Dilawari, Aniqa
AU - Khan, Muhammad Usman Ghani
AU - Saleem, Summra
AU - Zahoor-Ur-Rehman,
AU - Shaikh, Fatema Sabeen
N1 - Publisher Copyright:
© 2013 IEEE.
PY - 2023
Y1 - 2023
N2 - Summarization generates a brief and concise summary that conveys the main idea of the source text. There are two forms of summarization: abstractive and extractive. Extractive summarization selects important sentences from the text to form a summary, whereas abstractive summarization paraphrases the text in a more natural, human-like manner by adding novel words or phrases. For a human annotator, producing a summary of a document is time-consuming and expensive because it requires reading through the long document and composing a short summary. An automatic feature-rich model for text summarization is proposed that reduces this labor and produces a quick summary by combining extractive and abstractive approaches. A feature-rich extractor highlights the important sentences in the text, and linguistic characteristics are used to enhance the results. The extracted summary is then fed to an abstracter that further enriches it using features such as named entity tags, part-of-speech tags, and term weights. Furthermore, a loss function is introduced to normalize the inconsistency between word-level and sentence-level attentions. The proposed two-stage network achieved a ROUGE score of 37.76% on the benchmark CNN/DailyMail dataset, outperforming earlier work. Human evaluation is also conducted to measure the comprehensiveness, conciseness, and informativeness of the generated summaries.
AB - Summarization generates a brief and concise summary that conveys the main idea of the source text. There are two forms of summarization: abstractive and extractive. Extractive summarization selects important sentences from the text to form a summary, whereas abstractive summarization paraphrases the text in a more natural, human-like manner by adding novel words or phrases. For a human annotator, producing a summary of a document is time-consuming and expensive because it requires reading through the long document and composing a short summary. An automatic feature-rich model for text summarization is proposed that reduces this labor and produces a quick summary by combining extractive and abstractive approaches. A feature-rich extractor highlights the important sentences in the text, and linguistic characteristics are used to enhance the results. The extracted summary is then fed to an abstracter that further enriches it using features such as named entity tags, part-of-speech tags, and term weights. Furthermore, a loss function is introduced to normalize the inconsistency between word-level and sentence-level attentions. The proposed two-stage network achieved a ROUGE score of 37.76% on the benchmark CNN/DailyMail dataset, outperforming earlier work. Human evaluation is also conducted to measure the comprehensiveness, conciseness, and informativeness of the generated summaries.
KW - Abstractive summarization
KW - encoder-decoder
KW - extractive summarization
KW - feature-rich model
KW - linguistic features
KW - summarization evaluation
UR - https://www.scopus.com/pages/publications/85149370598
U2 - 10.1109/ACCESS.2023.3249783
DO - 10.1109/ACCESS.2023.3249783
M3 - Article
AN - SCOPUS:85149370598
SN - 2169-3536
VL - 11
SP - 23557
EP - 23564
JO - IEEE Access
JF - IEEE Access
ER -