Lightweight Multi-Scale Framework for Human Pose and Action Classification
We propose a lightweight, modular, attention-based architecture for human pose and action classification built on a Swin Transformer backbone.
The model integrates Spatial Attention, Context-Aware Channel Attention,
and a Dual Weighted Cross Attention module for effective multi-scale feature fusion.
Evaluated on the Yoga-82 and Stanford 40 Actions datasets, it achieves high accuracy and outperforms state-of-the-art baselines,
with only 0.79 million parameters.
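A minimal PyTorch sketch of how such components can compose, assuming two same-width Swin feature maps brought to a common resolution; the internals shown (a CBAM-style spatial gate, a squeeze-and-excitation channel gate, and two cross-attention paths mixed by learnable scalars) are illustrative stand-ins, not the exact published modules:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SpatialAttention(nn.Module):
        # Gates each spatial location using pooled channel statistics.
        def __init__(self):
            super().__init__()
            self.conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)

        def forward(self, x):
            avg = x.mean(dim=1, keepdim=True)              # (B, 1, H, W)
            mx, _ = x.max(dim=1, keepdim=True)             # (B, 1, H, W)
            return x * torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))

    class ChannelAttention(nn.Module):
        # Squeeze-and-excitation style gating: globally pooled context
        # reweights each channel (a stand-in for "context-aware" gating).
        def __init__(self, ch, reduction=4):
            super().__init__()
            self.fc = nn.Sequential(
                nn.Linear(ch, ch // reduction), nn.ReLU(),
                nn.Linear(ch // reduction, ch), nn.Sigmoid())

        def forward(self, x):
            w = self.fc(x.mean(dim=(2, 3)))                # (B, C)
            return x * w[:, :, None, None]

    class DualWeightedCrossAttention(nn.Module):
        # Each scale attends to the other; two learnable scalars weight the
        # two cross-attention paths before they are summed.
        def __init__(self, ch, heads=4):
            super().__init__()
            self.attn_ab = nn.MultiheadAttention(ch, heads, batch_first=True)
            self.attn_ba = nn.MultiheadAttention(ch, heads, batch_first=True)
            self.w = nn.Parameter(torch.ones(2))

        def forward(self, a, b):                           # a, b: (B, C, H, W)
            B, C, H, W = a.shape
            a_seq = a.flatten(2).transpose(1, 2)           # (B, HW, C)
            b_seq = b.flatten(2).transpose(1, 2)
            ab, _ = self.attn_ab(a_seq, b_seq, b_seq)      # a queries b
            ba, _ = self.attn_ba(b_seq, a_seq, a_seq)      # b queries a
            fused = self.w[0] * ab + self.w[1] * ba
            return fused.transpose(1, 2).reshape(B, C, H, W)

    # Example: gate two stages, upsample the coarser one, fuse, classify.
    f1 = SpatialAttention()(torch.randn(2, 96, 28, 28))
    f2 = ChannelAttention(96)(F.interpolate(torch.randn(2, 96, 14, 14), size=28))
    fused = DualWeightedCrossAttention(96)(f1, f2)         # (2, 96, 28, 28)
    logits = nn.Linear(96, 82)(fused.mean(dim=(2, 3)))     # Yoga-82: 82 classes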
Efficient and Accurate Pneumonia Detection Using a Novel Multi-Scale Transformer Approach
We propose a novel multi-scale transformer framework for pneumonia detection that integrates precise lung segmentation with classification.
A lightweight, transformer-enhanced TransUNet segments the lungs; pre-trained ResNet backbones extract features,
which Residual Attention and modified transformer modules then refine.
Our method achieves high accuracy (93.75% on Kermany, 96.04% on Cohen) while remaining computationally efficient enough
for resource-constrained clinical environments. ...
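A minimal PyTorch sketch of the two-stage pipeline, with the TransUNet segmenter stubbed out; the 1x1-conv residual attention gate, ResNet-50 feature extractor, and pooled classification head are assumptions for illustration, not the paper's exact modules:

    import torch
    import torch.nn as nn
    import torchvision.models as models

    class ResidualAttention(nn.Module):
        # 1x1-conv attention map applied as a residual gate: x * (1 + A(x)).
        def __init__(self, ch):
            super().__init__()
            self.attn = nn.Sequential(nn.Conv2d(ch, ch, 1), nn.Sigmoid())

        def forward(self, x):
            return x * (1 + self.attn(x))

    class PneumoniaClassifier(nn.Module):
        def __init__(self, segmenter, num_classes=2):
            super().__init__()
            self.segmenter = segmenter                     # e.g. a TransUNet
            # Pass weights="IMAGENET1K_V2" for the pre-trained backbone.
            backbone = models.resnet50(weights=None)
            self.features = nn.Sequential(*list(backbone.children())[:-2])
            self.attn = ResidualAttention(2048)
            self.head = nn.Linear(2048, num_classes)

        def forward(self, x):
            with torch.no_grad():                          # segmenter is frozen
                mask = torch.sigmoid(self.segmenter(x))    # (B, 1, H, W)
            f = self.features(x * mask)                    # mask, then extract
            f = self.attn(f)                               # (B, 2048, H/32, W/32)
            return self.head(f.mean(dim=(2, 3)))           # GAP + linear head

    # Stand-in segmenter (a single 1x1 conv) just to make the sketch runnable.
    model = PneumoniaClassifier(segmenter=nn.Conv2d(3, 1, kernel_size=1))
    logits = model(torch.randn(1, 3, 224, 224))            # (1, 2)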
A Lightweight Multi-Scale Refinement Network for Gastrointestinal Disease Classification
We propose a lightweight deep learning architecture for gastrointestinal disease classification using endoscopic images.
The model employs a frozen ConvMixer backbone for multi-scale feature extraction, enhanced with attention mechanisms and a Transformer
for discriminative feature refinement. Explainable AI techniques are incorporated to improve reliability and interpretability.
Evaluated on four benchmark GI datasets, our method achieves high accuracy (93.67% on Kvasir-v1, 81.12% on GastroVision, 98.75% on Kvasir-Capsule,
93.33% on Kvasir-v2) with only 0.38M parameters, demonstrating that competitive accuracy and an extremely lightweight design can go together.
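A minimal PyTorch sketch of the refinement idea, assuming a small frozen ConvMixer-style backbone whose per-block pooled features are refined by a one-layer Transformer encoder; all sizes and layer choices are illustrative, and the per-depth tokens merely stand in for the multi-scale taps:

    import torch
    import torch.nn as nn

    class Residual(nn.Module):
        def __init__(self, fn):
            super().__init__()
            self.fn = fn
        def forward(self, x):
            return x + self.fn(x)

    def mixer_block(dim, k=9):
        # Depthwise conv for spatial mixing, pointwise conv for channel mixing.
        return nn.Sequential(
            Residual(nn.Sequential(
                nn.Conv2d(dim, dim, k, groups=dim, padding="same"),
                nn.GELU(), nn.BatchNorm2d(dim))),
            nn.Conv2d(dim, dim, 1), nn.GELU(), nn.BatchNorm2d(dim))

    class RefinementNet(nn.Module):
        def __init__(self, dim=64, depth=4, num_classes=8):
            super().__init__()
            self.stem = nn.Sequential(nn.Conv2d(3, dim, 7, stride=7),
                                      nn.GELU(), nn.BatchNorm2d(dim))
            self.blocks = nn.ModuleList([mixer_block(dim) for _ in range(depth)])
            for p in self.stem.parameters():
                p.requires_grad = False                    # backbone frozen
            for p in self.blocks.parameters():
                p.requires_grad = False
            layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4,
                                               batch_first=True)
            self.refiner = nn.TransformerEncoder(layer, num_layers=1)
            self.head = nn.Linear(dim, num_classes)

        def forward(self, x):
            x = self.stem(x)
            tokens = []
            for blk in self.blocks:                        # one token per block
                x = blk(x)
                tokens.append(x.mean(dim=(2, 3)))          # GAP -> (B, dim)
            seq = torch.stack(tokens, dim=1)               # (B, depth, dim)
            seq = self.refiner(seq)                        # attention refinement
            return self.head(seq.mean(dim=1))

    model = RefinementNet(num_classes=8)                   # Kvasir-v1: 8 classes
    logits = model(torch.randn(1, 3, 224, 224))            # (1, 8)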