Supervised Deep Learning Approaches For Anomaly Detection And Recognition In Crowd Scenes
Abstract
Awareness of public safety is growing, and CCTV cameras are now installed in almost all public places. However, automatic smart surveillance systems are generally not available. This manuscript focuses on detecting and classifying abnormal events in surveillance video, especially in crowded environments. Abnormal event detection is challenging because the definition of abnormality is subjective: an event that is normal in one situation can be abnormal in another. In surveillance video of a dense crowd, automatic anomaly detection becomes very difficult because of clutter and severe occlusion.
This manuscript presents CNN (Convolutional Neural Network) and CNN-LSTM (Convolutional Neural Network-Long Short-Term Memory) based approaches for the detection and classification of abnormal events. The CNN architecture is developed from scratch and operates in the spatial domain, while the LSTM architecture is developed for the temporal domain. Feature sequences are generated by the CNN model and given as input to the LSTM model. Experiments are carried out on five publicly available benchmark datasets. Performance is measured by accuracy and the area under the ROC (receiver operating characteristic) curve (AUC). The CNN-LSTM approach performs better than the CNN alone.
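The two evaluation measures named above, accuracy and AUC, can be illustrated with a minimal sketch. It assumes binary labels (1 = abnormal, 0 = normal) and one anomaly score per clip; the function names, the 0.5 threshold, and the sample values are hypothetical and are not taken from the manuscript's experiments.

```python
from typing import Sequence


def accuracy(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of clips whose thresholded score matches the true label."""
    if len(labels) != len(scores) or not labels:
        raise ValueError("labels and scores must be non-empty and equal in length")
    correct = sum(int(s >= threshold) == y for y, s in zip(labels, scores))
    return correct / len(labels)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen abnormal clip scores
    higher than a randomly chosen normal clip (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one abnormal and one normal sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical ground truth and model scores for six clips (illustration only).
labels = [0, 0, 1, 1, 0, 1]
scores = [0.10, 0.40, 0.35, 0.80, 0.20, 0.90]

print(f"accuracy = {accuracy(labels, scores):.3f}")
print(f"AUC      = {roc_auc(labels, scores):.3f}")
```

Accuracy depends on the chosen decision threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason to report both.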
Keywords
Abnormal Event, Classification, CNN, LSTM, Abnormal Event Detection
Copyright (c) 2025 Kinjal Joshi, Narendra Patel

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.