Lightweight and Interpretable Hybrid Deep Network for Potato Leaf Disease Detection Using CBAM and Vision Transformer

Potatoes are a globally important crop, crucial to food security and nutrition, particularly in developing countries. However, leaf diseases such as early blight and late blight frequently cause severe yield losses. Although deep learning-based models have shown promise in identifying plant diseases, most rely on computationally intensive architectures and on datasets collected in laboratories, which restricts their use in practical agricultural fields. This paper proposes a lightweight and interpretable two-stage hybrid deep learning model that addresses these challenges. The first stage separates disease-relevant features from complex field backgrounds using Ghost convolutional layers, which significantly reduce redundant computation. Convolutional Block Attention Modules (CBAM) refine the features by selectively focusing on disease-relevant channels and spatial regions. The second stage employs Lite Vision Transformer (LiteViT) blocks to capture global dependencies at low computational cost, enabling robust disease classification. The proposed model was trained and tested on a field-acquired potato leaf dataset collected in Kannauj, India. Our hybrid model, called Ghost-CBAM-LiteViT, achieves high classification accuracy at low computational cost, reaching more than 99.67% on the test set. Unlike existing heavy transformer-based or CNN-only approaches, this is, to our knowledge, the first lightweight hybrid framework that jointly integrates Ghost convolutions, CBAM, and LiteViT, delivering both state-of-the-art accuracy and practical deployability under real agricultural conditions. The model offers a deployable and scalable solution for real-time monitoring of potato leaf disease in smart agriculture, particularly in resource-constrained settings.
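To make the claim about reduced redundant computation concrete, the following minimal sketch compares the weight count of a standard convolution with that of a Ghost module, following the general formulation in the original GhostNet work rather than the exact configuration of Ghost-CBAM-LiteViT. The layer sizes, the ratio `s`, the cheap-operation kernel size `d`, and both function names are illustrative assumptions and are not taken from this paper.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a plain k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def ghost_module_params(c_in, c_out, k, s=2, d=3):
    """Weights in a Ghost module (bias ignored).

    A primary k x k convolution produces c_out // s intrinsic feature maps.
    Each intrinsic map then yields (s - 1) additional "ghost" maps through a
    cheap d x d depthwise operation. This sketch assumes c_out is divisible
    by s.
    """
    if c_out % s != 0:
        raise ValueError("c_out must be divisible by s in this sketch")
    intrinsic = c_out // s
    primary = c_in * intrinsic * k * k      # ordinary convolution
    cheap = intrinsic * (s - 1) * d * d     # depthwise "ghost" generation
    return primary + cheap


# Hypothetical layer sizes, chosen only for illustration.
c_in, c_out, k = 64, 128, 3
std = standard_conv_params(c_in, c_out, k)
ghost = ghost_module_params(c_in, c_out, k, s=2, d=3)
print(f"standard: {std}, ghost: {ghost}, reduction: {std / ghost:.2f}x")
```

Under these assumed sizes the Ghost module needs roughly half the weights of the standard layer, and the reduction factor approaches `s` as the number of input channels grows. The actual savings in the proposed model depend on its layer configuration.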