Paper Publications
A multi-source data fusion model for air quality inference with hybrid supervised and self-supervised learning and adaptive feature importance estimation

Impact Factor:6.9
DOI number:10.1016/j.ipm.2025.104474
Journal:Information Processing & Management
Abstract:Air quality inference is constrained by label scarcity due to the sparse distribution of standardized monitoring stations, yet existing studies rarely focus on this limitation. Consequently, we introduce AirFusion, a novel multi-task framework that integrates a supervised task for core inference with a self-supervised task for learning representations from unlabeled data. To capture the complex nature of air pollution, AirFusion employs a multi-source data fusion module consisting of five feature extraction blocks covering air quality, meteorology, traffic, geography, and timestamps. A key innovation is the air quality block, which fuses data from both standardized stations (high quality but low quantity) and micro-stations (low quality but high quantity) to enhance complementarity, providing empirical support for ongoing micro-station deployment. To efficiently manage the vast multi-source features, we propose Adaptive Feature Selection Loss (AFSLoss), a novel loss function that prioritizes key features while filtering out irrelevant ones. Unlike previous methods limited to continuous features, AFSLoss effectively handles both continuous and categorical features. Extensive experiments on NO2, O3, and PM2.5 datasets (each containing 743,256 samples) demonstrate that AirFusion outperforms baselines.
Indexed by:Journal articles
Document Code:104474
Volume:63
Issue:2
Translation or Not:no
Date of Publication:2025-11-08
Included Journals:SCI
Links to published journals:https://www.sciencedirect.com/science/article/pii/S0306457325004157
