Video-based physical violence detection model for efficient public space surveillance

Erick Erick, Benfano Soewito

Abstract


This study develops a real-time model for detecting violence in public spaces that balances accuracy against computational cost. We evaluate several model architectures, with the main comparison between ConvLSTM2D and Conv3D, two models commonly used in video analysis to capture spatial and temporal features. The ConvLSTM2D model, combined with preprocessing layers such as change detection and motion blur, performed best, reaching 86% accuracy after Bayesian optimization. With only 25,137 parameters, it completes inference in 0.010 seconds, which makes it suitable for real-time applications that require efficient computation. The Conv3D model, also combined with preprocessing layers such as change detection and motion blur, has more than nine million parameters yet reaches a lower accuracy of 77.5% with a slower inference time of 0.025 seconds, making it unsuitable for real-time applications. These results indicate that the ConvLSTM2D model is promising for real-time violence detection systems in public spaces, where a fast and accurate response is essential to prevent further acts of violence.
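The abstract names change detection and motion blur as preprocessing steps but does not define them. The following is a minimal sketch of one plausible reading, not the authors' implementation: change detection is assumed to be the absolute difference between consecutive frames, and motion blur is assumed to be a moving average over time. The function names, the `window` parameter, and the `(T, H, W)` clip layout are all hypothetical.

```python
import numpy as np


def change_detection(clip: np.ndarray) -> np.ndarray:
    """Assumed definition: absolute difference between consecutive frames.

    clip has shape (T, H, W); the result has shape (T - 1, H, W).
    """
    return np.abs(clip[1:] - clip[:-1])


def motion_blur(clip: np.ndarray, window: int = 3) -> np.ndarray:
    """Assumed definition: mean of each frame and the next window - 1 frames.

    clip has shape (T, H, W); the result has shape (T - window + 1, H, W).
    """
    # Prefix sums along the time axis, with a leading zero frame, so that
    # every window sum is the difference of two prefix sums.
    csum = np.cumsum(clip, axis=0)
    csum = np.concatenate([np.zeros_like(clip[:1]), csum], axis=0)
    return (csum[window:] - csum[:-window]) / window


# Toy clip: 5 frames of 4x4 pixels, one pixel switches on at frame 2.
clip = np.zeros((5, 4, 4), dtype=np.float32)
clip[2:, 1, 1] = 1.0

diff = change_detection(clip)          # non-zero only where the pixel changed
blurred = motion_blur(diff, window=2)  # spreads that change over two frames
```

In a pipeline like the one described, the output of such steps would be fed to the ConvLSTM2D or Conv3D network; static background is suppressed, so the network sees mostly motion.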

Keywords


Change detection; Conv3D; ConvLSTM2D; Efficient violence detection; Motion blur


DOI: http://doi.org/10.11591/ijict.v15i1.pp161-170


Copyright (c) 2026 Erick Erick, Benfano Soewito

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

The International Journal of Informatics and Communication Technology (IJ-ICT)
p-ISSN 2252-8776, e-ISSN 2722-2616
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).
