A Modular Multi-Layered LSTM Framework for Real-Time Air Quality Index Forecasting: A Case Study of Urban Corridors in Rajasthan
Abstract
Urban centres in Rajasthan, including Jaipur, Jodhpur, and Bhiwadi, experience critical seasonal pollution spikes driven by transboundary dust transport and industrial activity. This study presents and evaluates a modular multi-layered Long Short-Term Memory (LSTM) framework for daily Air Quality Index (AQI) forecasting using Central Pollution Control Board (CPCB) data from January 2023 to December 2025. The architecture decomposes the forecasting pipeline into four specialised components: data integrity monitoring, meteorological feature engineering, LSTM-based temporal prediction with a weather-adaptive loss function, and post-hoc SHAP explainability analysis. On the held-out test set (November–December 2025, n = 162 city-days), the framework achieves a combined Mean Absolute Error (MAE) of 43.79 and outperforms the mean-predictor baseline (MAE: 64.47–72.67) in all three cities. Against the more demanding persistence baseline (MAE: 31.23–43.78), the current model does not yet reach parity. Negative R² values for Jaipur (−0.037) and Jodhpur (−0.042) further indicate that, on a squared-error basis, predictions for these cities were no better than a constant mean prediction during the high-variability winter test period, even though MAE favours the model; we address this limitation directly. Two-fold temporal cross-validation indicates year-to-year generalisation, with city-level CV MAE ranging from 26.52 to 39.32. SHAP analysis indicates that the model weights PM2.5 and PM10 as the primary AQI drivers, and that Jodhpur's higher wind-speed sensitivity is consistent with its proximity to the Thar Desert. We discuss the gaps between current performance and operational readiness and identify a path to closing them through episodic event features and extended training data. The modular design provides an interpretable, maintainable foundation for AQI forecasting in challenging semi-arid urban environments.
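The baseline comparison described in the abstract can be illustrated with a minimal sketch. It is not the study's evaluation code: the AQI values and model predictions are hypothetical, the function names are illustrative, and the mean predictor is assumed to use the training-period mean. The sketch shows how a forecast can beat the mean predictor on MAE, trail the persistence baseline, and still have a negative R², because R² is measured against the mean of the test period itself.

```python
from statistics import fmean


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return fmean(abs(a - p) for a, p in zip(actual, predicted))


def r_squared(actual, predicted):
    """Coefficient of determination, relative to the mean of `actual`."""
    mean_actual = fmean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def persistence_forecast(last_train_value, actual):
    """Predict each day's AQI as the previous day's observed AQI."""
    return [last_train_value] + list(actual[:-1])


def mean_forecast(train, horizon):
    """Predict the training-period mean for every test day (assumption)."""
    return [fmean(train)] * horizon


# Hypothetical daily AQI values, NOT taken from the study.
train_aqi = [110, 120, 105, 130, 125, 160]
test_aqi = [180, 210, 190, 240, 200]
model_pred = [150, 170, 165, 180, 175]  # hypothetical model output

mae_model = mae(test_aqi, model_pred)
mae_mean = mae(test_aqi, mean_forecast(train_aqi, len(test_aqi)))
mae_persist = mae(test_aqi, persistence_forecast(train_aqi[-1], test_aqi))
r2_model = r_squared(test_aqi, model_pred)

print(f"model MAE:       {mae_model:.2f}")
print(f"mean MAE:        {mae_mean:.2f}")
print(f"persistence MAE: {mae_persist:.2f}")
print(f"model R^2:       {r2_model:.3f}")
```

In this toy series the test period sits well above the training period, so the training-mean predictor is easy to beat on MAE, while a model that consistently under-predicts the winter peak still scores below zero on R².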
This work is licensed under a Creative Commons Attribution 4.0 International License.