Predicting Travel Insurance Purchases in an Insurance Firm through Machine Learning Methods after COVID-19
Abstract
Travel insurance is an important financial safeguard, covering unforeseen expenses and losses incurred during travel. With the proliferation of insurance types and growing demand for COVID-19-related coverage, insurance companies need to predict accurately how likely each customer is to purchase insurance. Such predictions help providers focus on the most lucrative clients and boost sales. This study applies machine learning techniques to forecast which consumer segments are most inclined to buy travel insurance, so that targeted strategies can be developed. The analysis used a Kaggle dataset of prior clients of a travel insurance firm and compared six models: K-Nearest Neighbors (KNN), Decision Tree Classifier (DT), Support Vector Machines (SVM), Naïve Bayes (NB), Logistic Regression (LR), and Random Forest (RF). Extensive data cleaning was done before model building. Performance was then evaluated using accuracy, F1 score, and the area under the Receiver Operating Characteristic (ROC) curve (AUC). Unexpectedly, KNN outperformed the other models, achieving an accuracy of 0.81, precision of 0.82, recall of 0.82, F1 score of 0.80, and an AUC of 0.78. The findings offer a guide for deploying machine learning algorithms to predict travel insurance purchases, helping insurance companies target the most lucrative clientele and increase revenue.
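For readers unfamiliar with the evaluation measures named in the abstract, the following is a minimal sketch of how accuracy, precision, recall, F1 score, and AUC can be computed for a binary "purchased / did not purchase" label. The labels and scores are invented toy values, not the study's data, and the function names are hypothetical. The abstract does not state how the per-class figures were averaged, so this sketch simply treats "purchased" (1) as the positive class.

```python
from __future__ import annotations

from typing import Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> dict[str, float]:
    """Accuracy, precision, recall and F1, with label 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the share of (positive, negative) pairs in which the positive
    example receives the higher score; tied scores count as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (assumed values): 1 = customer bought travel insurance.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1, 0.8, 0.3]  # hypothetical model scores
y_pred = [1 if s >= 0.5 else 0 for s in scores]     # 0.5 threshold is an assumption

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Accuracy, precision, recall, and F1 depend on the chosen decision threshold, whereas AUC is computed from the raw scores and summarizes ranking quality across all thresholds, which is why a model can report different values for the two kinds of measure.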
Article Details
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
All articles published in JIWE are licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) License. Readers are allowed to share the material (copy and redistribute it in any medium or format) under the following conditions:
- Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use;
- NonCommercial — You may not use the material for commercial purposes;
- NoDerivatives — If you remix, transform, or build upon the material, you may not distribute the modified material.