HybridEval: An Improved Novel Hybrid Metric for Evaluation of Text Summarization

Raheem Sarwar
Bilal Ahmad
Pin Shen Teh
Suppawong Tuarob
Tipajin Thaipisutikul
Farooq Zaman
Naif R. Aljohani
Jia Zhu
Saeed-Ul Hassan
Raheel Nawaz
Ali R Ansari
Muhammad A B Fayyaz

Abstract

The present work re-evaluates how text summarization systems are assessed. Two state-of-the-art assessment measures, Recall-Oriented Understudy for Gisting Evaluation (ROUGE) and Bilingual Evaluation Understudy (BLEU), are discussed along with their limitations before a novel evaluation metric is presented. The scores these measures produce differ significantly with the length and vocabulary of the sentences, which suggests that their primary limitation is an inability to preserve the semantics and meaning of the sentences or to distribute weight consistently over the whole sentence. To address this, the present work organizes phrases into six distinct groups and proposes a new hybrid approach, HybridEval, for evaluating text summarization. Our approach combines a weighted sum of cosine scores obtained from InferSent's SentEval algorithms with the original scores, achieving high accuracy. HybridEval outperforms existing state-of-the-art models by 10-15% in evaluation scores.
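
As a rough illustration of the scoring idea described above, the sketch below blends a semantic cosine score with an original metric score such as ROUGE or BLEU. It assumes sentence embeddings (for example, from InferSent) have already been computed; the helper names, the alpha weighting, and the 4096-dimensional random vectors are illustrative assumptions, not the paper's exact formulation or its published phrase-grouping scheme.

import numpy as np

def cosine_similarity(u, v):
    # Cosine similarity between two embedding vectors.
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

def hybrid_eval_score(candidate_emb, reference_emb, base_metric_score, alpha=0.5):
    # Weighted sum of a semantic cosine score (e.g., from InferSent embeddings)
    # and an original metric score (e.g., ROUGE or BLEU).
    semantic = cosine_similarity(candidate_emb, reference_emb)
    return alpha * semantic + (1.0 - alpha) * base_metric_score

# Usage example with random vectors standing in for InferSent embeddings.
rng = np.random.default_rng(0)
candidate, reference = rng.normal(size=4096), rng.normal(size=4096)
print(hybrid_eval_score(candidate, reference, base_metric_score=0.42))

In practice, the weight alpha could be tuned separately for each of the six phrase groups; the exact weighting used by HybridEval is given in the full paper.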

Article Details

How to Cite
Sarwar, R., Ahmad, B., Teh, P. S., Tuarob, S., Thaipisutikul, T., Zaman, F., Aljohani, N. R., Zhu, J., Hassan, S.-U., Nawaz, R., Ansari, A. R., & Fayyaz, M. A. B. (2024). HybridEval: An Improved Novel Hybrid Metric for Evaluation of Text Summarization. Journal of Informatics and Web Engineering, 3(3), 233–255. https://doi.org/10.33093/jiwe.2024.3.3.15
Section
Thematic (Pervasive Computing)

