Emotion Recognition by Facial Expression and Voice: Review and Analysis

Yixen Lim; Kok-Why Ng; Palanichamy Naveen; Su-Cheng Haw

doi:10.33093/jiwe.2022.1.2.4

Published: Sep 15, 2022

Keywords:

Emotion Recognition, Facial Expression, Voice, Stress, Image Processing

Yixen Lim

Multimedia University, Malaysia

Kok-Why Ng

Multimedia University, Malaysia

https://orcid.org/0000-0003-4516-4634

Palanichamy Naveen

Multimedia University, Malaysia

Su-Cheng Haw

Multimedia University, Malaysia

Abstract

Emotion has become a pressing topic in recent years because of the critical, often unseen stress incurred during and after the pandemic. The situation is worsening with recent inflation and the rising cost of living: many employees are seriously affected, which has led to many distressing family cases and a sharp drop in work performance. The increasing stress harms not only the individual but also the growth of the company and the country. Recognizing emotion through a single model is less accurate; however, using multiple models may introduce latency in data processing and possibly misleading results if the input data of the models are not properly filtered and segmented. This paper reviews, analyzes and theoretically compares 15 facial expression methods and 17 voice methods from emotion recognition research works. It outlines the pros and cons of each method and discusses the accuracy of some standalone and hybrid emotion recognition methods. Some methods (such as CNN, KNN and SVM) can be applied across multiple models but show different levels of strength in each. This is important to identify, so that one may replace or enhance the weaker one when applying the same method across multiple models. The paper also illustrates the relative popularity of the methods in each model for easy visual comparison. We hope it helps new researchers quickly identify the most suitable method for recognizing emotion through facial expression and/or voice.

How to Cite

Lim, Y., Ng, K.-W., Naveen, P., & Haw, S.-C. (2022). Emotion Recognition by Facial Expression and Voice: Review and Analysis. Journal of Informatics and Web Engineering, 1(2), 45–54. https://doi.org/10.33093/jiwe.2022.1.2.4

Issue

Vol. 1 No. 2 (2022): September 2022

Section

Regular issue

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

All articles published in JIWE are licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) License. Readers are allowed to

Share — copy and redistribute the material in any medium or format under the following conditions:
Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use;
NonCommercial — You may not use the material for commercial purposes;
NoDerivatives — If you remix, transform, or build upon the material, you may not distribute the modified material.
