Emotion Recognition by Facial Expression and Voice: Review and Analysis

Yixen Lim
Kok-Why Ng
Palanichamy Naveen
Su-Cheng Haw

Abstract

Emotion recognition has become a hot topic in recent years because of the severe, often unseen stress incurred during and after the pandemic. The situation is worsening with recent inflation and the rising cost of living: many employees are seriously affected, which has led to many distressing family cases and a sharp drop in work performance. The increasing stress harms not only the individual but also the growth of companies and the country. Recognizing emotion through a single model is less accurate; however, combining multiple models may introduce latency in data processing and possibly misleading results if the input data for each model are not properly filtered and segmented. This paper reviews, analyzes and theoretically compares 15 facial expression methods and 17 voice methods from emotion recognition research works. It outlines the pros and cons of each method and discusses the accuracy of some standalone and hybrid emotion recognition methods. Some methods (such as CNN, KNN and SVM) can be applied across multiple models but show different levels of strength in each. This is important to identify, so that one may replace or enhance the weaker one when applying the same method across multiple models. This paper also illustrates the relative popularity of the methods in each model for easy visual comparison. We hope it helps new researchers quickly identify the most suitable method for recognizing emotion through facial expression and/or voice.

Article Details

How to Cite
Lim, Y., Ng, K.-W., Naveen, P., & Haw, S.-C. (2022). Emotion Recognition by Facial Expression and Voice: Review and Analysis. Journal of Informatics and Web Engineering, 1(2), 45–54. https://doi.org/10.33093/jiwe.2022.1.2.4
Section
Regular issue
