Shot classification for human behavioural analysis in video surveillance applications
Abstract
Human behaviour analysis plays a vital role in ensuring the security and safety of people in crowded public places, in contexts such as theft detection, violence prevention and explosion anticipation. Classifying videos into different shot types helps in extracting the behavioural cues appropriate to each type. A shot indicates the subject's size within the frame, and the basic camera shots are the close-up, the medium shot and the long shot. If a video is categorised as a close-up shot, investigating emotional displays helps in identifying criminal suspects by analysing signs of aggressiveness and nervousness, so that illegal acts can be prevented. A mid shot can be used for analysing nonverbal communication such as clothing, facial expressions, gestures and personal space. For a long shot, behavioural analysis relies on cues extracted from the gait and atomic actions displayed by the person. Here, the framework for shot scale analysis in video surveillance applications uses two methods: one based on the face pixel ratio and one based on deep learning. The face pixel ratio is the percentage of the frame occupied by the face region. It is compared against predefined threshold values to group frames into close-up, mid shot and long shot categories. Shot scale analysis based on transfer learning uses effective pre-trained models, including AlexNet, VGG Net, GoogLeNet and ResNet. In the experiments, GoogLeNet achieves the highest accuracy among the pre-trained models tested, at 94.61%.
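The face pixel ratio thresholding described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the (x, y, width, height) bounding-box format, and in particular the threshold values of 10% and 1% are hypothetical, since the abstract does not state the thresholds used. The face bounding box is assumed to come from some separate face detector.

```python
from typing import Tuple

# Hypothetical thresholds, in percent of frame area.
# The abstract does not give the values actually used.
CLOSE_UP_MIN_PCT = 10.0
MID_SHOT_MIN_PCT = 1.0


def face_pixel_ratio(face_box: Tuple[int, int, int, int],
                     frame_size: Tuple[int, int]) -> float:
    """Return the percentage of the frame covered by the face box.

    face_box is (x, y, width, height); frame_size is (width, height).
    """
    _, _, face_w, face_h = face_box
    frame_w, frame_h = frame_size
    if frame_w <= 0 or frame_h <= 0:
        raise ValueError("frame dimensions must be positive")
    return 100.0 * (face_w * face_h) / (frame_w * frame_h)


def classify_shot(ratio_pct: float,
                  close_up_min: float = CLOSE_UP_MIN_PCT,
                  mid_min: float = MID_SHOT_MIN_PCT) -> str:
    """Map a face pixel ratio to a shot category by simple thresholding."""
    if ratio_pct >= close_up_min:
        return "close-up"
    if ratio_pct >= mid_min:
        return "mid"
    return "long"


# Example: a 640x360 face box in a 1280x720 frame covers a quarter of it.
ratio = face_pixel_ratio((0, 0, 640, 360), (1280, 720))
label = classify_shot(ratio)
```

In practice the two thresholds would need to be tuned on labelled surveillance footage, and frames with no detected face would need a separate rule (for example, treating them as long shots), which this sketch leaves out.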
Copyright (c) 2023 Newlin Shebiah R, Arivazhagan S
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.