DBGC: Dimension-Based Generic Convolution Block for Object Recognition

Published:2022-02-24 Issue:5 Volume:22 Page:1780
ISSN:1424-8220
Container-title:Sensors
language:en
Short-container-title:Sensors

Author:

Patel Chirag, Bhatt Dulari, Sharma Urvashi, Patel Radhika, Pandya Sharnil, Modi Kirit, Cholli Nagaraj, Patel Akash, Bhatt Urvi, Khan Muhammad Ahmed, Majumdar Shubhankar, Zuhair Mohd, Patel Khushi, Shah Syed Aziz, Ghayvat Hemant

Abstract

Object recognition is being widely used as a result of increasing CCTV surveillance and the need for automatic object or activity detection from images or video. The growing use of various sensor networks has also raised the need for lightweight process frameworks. Much research has been carried out in this area, but the research scope remains vast because it deals with open-ended problems, such as achieving high accuracy in little time using lightweight process frameworks. Convolutional Neural Networks (CNNs) and their variants are widely used in various computer vision activities, but most CNN architectures are application-specific. There is always a need for generic architectures with better performance. This paper introduces the Dimension-Based Generic Convolution Block (DBGC), which can be used with any CNN to make the architecture generic and provide a dimension-wise selection of various height, width, and depth kernels. This single unit, which uses the separable convolution concept, provides multiple combinations using various dimension-based kernels. It can be used for height-based, width-based, or depth-based dimensions; for pairs of dimensions (height and width, width and depth, or depth and height); and for combinations involving all three dimensions. The main novelty of DBGC lies in the dimension selector block included in the proposed architecture. The proposed unoptimized kernel dimensions reduce FLOPs by around one third and also reduce the accuracy by around one half; semi-optimized kernel dimensions yield almost the same or higher accuracy with half the FLOPs of the original architecture, while optimized kernel dimensions provide 5 to 6% higher accuracy with around a 10 M reduction in FLOPs.
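
The abstract does not give the kernel shapes or the internals of the dimension selector, so the following is only a rough sketch of why dimension-wise kernels can lower the operation count. It assumes, purely for illustration, that a "height-based" kernel is k×1, a "width-based" kernel is 1×k, and a "depth-based" kernel is a 1×1 pointwise convolution; the layer sizes and the helper `conv_macs` are hypothetical and are not taken from the paper.

```python
def conv_macs(h_out, w_out, c_in, c_out, kh, kw, groups=1):
    """Multiply-accumulate count for one 2-D convolution layer (bias ignored)."""
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    # Each output element needs (c_in / groups) * kh * kw multiply-accumulates.
    return h_out * w_out * c_out * (c_in // groups) * kh * kw


# Hypothetical layer: 32x32 output feature map, 64 -> 64 channels, k = 3.
H, W, C, K = 32, 32, 64, 3

full = conv_macs(H, W, C, C, K, K)          # standard KxK convolution
height_only = conv_macs(H, W, C, C, K, 1)   # assumed "height-based" Kx1 kernel
width_only = conv_macs(H, W, C, C, 1, K)    # assumed "width-based" 1xK kernel
depth_only = conv_macs(H, W, C, C, 1, 1)    # assumed "depth-based" 1x1 kernel
height_width = height_only + width_only     # Kx1 followed by 1xK

costs = {
    "full KxK": full,
    "height only": height_only,
    "width only": width_only,
    "depth only": depth_only,
    "height+width": height_width,
}

for name, macs in costs.items():
    print(f"{name:>12}: {macs:>12,} MACs ({macs / full:.2f}x of full)")
```

These ratios describe a single hypothetical layer only; they are not the FLOP or accuracy figures reported in the abstract, which depend on the full architecture and on which kernel dimensions are selected.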

Publisher

MDPI AG

Subject

Electrical and Electronic Engineering; Biochemistry; Instrumentation; Atomic and Molecular Physics, and Optics; Analytical Chemistry

Link

https://www.mdpi.com/1424-8220/22/5/1780/pdf

References: 41 articles.

1. CNN Variants for Computer Vision: History, Architecture, Application, Challenges and Future Scope

2. Survey On Various Intelligent Traffic Management Schemes For Emergency Vehicles;Bhatt;Int. J. Recent Innov.,2013

3. Object Detection and Segmentation using Local and Global Property;Patel;Int. J. Emerg. Technol. Sci. Eng.,2012

4. Comparative analysis of traditional methods for moving object detection in video sequence;Garg;Int. J. Comput. Sci. Commun.,2015

Cited by 37 articles.

1. DySARNet: a lightweight self-attention deep learning model for diagnosing dysarthria from speech recordings;Multimedia Tools and Applications;2024-08-31

2. Sparse convolutional model with semantic expression for waste electrical appliances recognition;Science China Technological Sciences;2024-08-20

3. How to Improve Video Analytics with Action Recognition: A Survey;ACM Computing Surveys;2024-08-08

4. Cellular nucleus image-based smarter microscope system for single cell analysis;Biosensors and Bioelectronics;2024-04

5. A review of convolutional neural networks in computer vision;Artificial Intelligence Review;2024-03-23