Methods

Satellite-based PM prediction models

The development of local satellite-based PM prediction models will rely on hybrid approaches (machine learning and geostatistics) that incorporate multiple input variables to achieve better prediction performance than simple linear regression or interpolation models. The baseline modelling approach to be tested first applies multilevel/multivariate spatial clustering and geographically weighted regression to quantify local spatio-temporal associations between the response variable (i.e. PM observations) and both static and time-variant spatial covariates. Specifically, covariates will include static territorial features such as land use fractions, population patterns, and the transport network, as well as time-variant factors such as vegetation indices and microclimatic variables. PM observations from fixed ground sensors and from on-site sampling will be used to train and validate the developed models. Different validation procedures will be compared, depending on the availability and quality of ground-truth data.
Expected outputs will include multitemporal high-resolution PM prediction maps at a regional scale and the source code used to implement the modelling pipeline, accompanied by technical/scientific documentation on procedures and results for the selected case studies. Computational work will rely exclusively on free and open-source statistical and machine-learning Python libraries such as SciPy, scikit-learn, and Keras, coupled with parallel computing data analysis libraries, e.g. Dask.
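As an illustration of the geographically weighted regression and validation steps described above, the following is a minimal sketch, not the project's implementation. It assumes a single covariate, a Gaussian distance kernel with a fixed bandwidth, and leave-one-station-out validation; the function names, the bandwidth value, and the station data are hypothetical and synthetic. It uses only the Python standard library to stay self-contained, whereas the actual pipeline would build on the libraries listed above.

```python
import math


def gaussian_weights(target, stations, bandwidth):
    """Gaussian kernel weights that decay with distance from the target point."""
    return [
        math.exp(-0.5 * (math.dist(target, s) / bandwidth) ** 2)
        for s in stations
    ]


def local_fit(target, stations, covariate, pm, bandwidth):
    """Weighted least squares of PM on one covariate, centred on `target`.

    Returns (intercept, slope) valid in the neighbourhood of `target`.
    """
    w = gaussian_weights(target, stations, bandwidth)
    w_sum = sum(w)
    x_mean = sum(wi * xi for wi, xi in zip(w, covariate)) / w_sum
    y_mean = sum(wi * yi for wi, yi in zip(w, pm)) / w_sum
    sxx = sum(wi * (xi - x_mean) ** 2 for wi, xi in zip(w, covariate))
    sxy = sum(
        wi * (xi - x_mean) * (yi - y_mean)
        for wi, xi, yi in zip(w, covariate, pm)
    )
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope


def leave_one_out_rmse(stations, covariate, pm, bandwidth):
    """Predict each station from all the others and return the RMSE."""
    sq_errors = []
    for i, target in enumerate(stations):
        keep = [j for j in range(len(stations)) if j != i]
        intercept, slope = local_fit(
            target,
            [stations[j] for j in keep],
            [covariate[j] for j in keep],
            [pm[j] for j in keep],
            bandwidth,
        )
        sq_errors.append((intercept + slope * covariate[i] - pm[i]) ** 2)
    return math.sqrt(sum(sq_errors) / len(sq_errors))


# Synthetic illustration only: coordinates in km, one covariate per station.
stations = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 2.0), (3.0, 1.0)]
covariate = [0.1, 0.3, 0.2, 0.5, 0.8, 0.6]
# PM built as 10 + 20 * covariate, i.e. a spatially constant relation.
pm = [12.0, 16.0, 14.0, 20.0, 26.0, 22.0]

intercept, slope = local_fit((1.0, 1.0), stations, covariate, pm, bandwidth=1.5)
rmse = leave_one_out_rmse(stations, covariate, pm, bandwidth=1.5)
```

Because the synthetic relation is the same everywhere, the local fit recovers it and the leave-one-out error is essentially zero. With real observations the intercept and slope would differ from one location to another, which is the local association the regression is meant to quantify, and the bandwidth would itself need to be selected, for instance by minimising the validation error.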

In line with the objectives, expected outcomes fall into two interconnected domains: the scientific domain and the societal/policy-making domain. To that end, the project will not only disseminate research outcomes to stakeholders and citizens but will also build on the interactions between researchers and civil society throughout the project lifetime, in order to broaden and extend its impact.

– Identify the most suitable satellite remote sensing missions delivering air pollutant observations as open data to support farming-related PM monitoring and analysis in the Lombardy Region.

Source apportionment and exposure models

All PM samples collected on site will be directly analyzed for a selected number of components, identified on the basis of toxicity and analytical requirements, so as to guarantee i) the absence of background contamination (from sampling substrates and equipment) and, most importantly, ii) data free from spectral interference, with satisfactory sensitivity. To assess the influence of anthropogenic sources on atmospheric concentration levels, Enrichment Factors (EFs) will be calculated from the concentrations of the investigated element and of a reference element, both determined in the aerosol samples. To overcome the relatively low representativeness of data from a limited number of samples at fixed locations, geostatistical methods that merge on-field PM sampling observations with “unconventional” data sources (i.e. satellite data) will be explored. Expected outputs will include highly detailed local air quality maps and derived long-term average exposure maps, accompanied by extensive technical/scientific documentation on procedures and results for the selected case studies.
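The Enrichment Factor calculation mentioned above can be sketched as follows. The text does not state how the ratio is normalised, so this sketch assumes the conventional crustal form, in which the element-to-reference ratio in the aerosol is divided by the same ratio in a reference material such as average crustal rock. The function name and the numerical values are hypothetical and serve only as an illustration.

```python
def enrichment_factor(c_element, c_reference, crust_element, crust_reference):
    """EF = (element / reference in aerosol) / (element / reference in crust).

    Each pair of concentrations must share the same units, so that both
    ratios, and therefore the EF, are dimensionless.
    """
    if min(c_reference, crust_element, crust_reference) <= 0:
        raise ValueError("reference and crustal concentrations must be positive")
    return (c_element / c_reference) / (crust_element / crust_reference)


# Hypothetical values for illustration only (not measured or tabulated data).
ef = enrichment_factor(
    c_element=0.05,        # investigated element in the aerosol sample
    c_reference=1.0,       # reference element in the same sample
    crust_element=0.001,   # investigated element in the reference material
    crust_reference=1.0,   # reference element in the reference material
)
```

Under this assumption, an EF close to 1 points to a predominantly crustal origin, while values well above 1 are commonly read as a sign of a non-crustal, possibly anthropogenic, contribution. The choice of reference element and of the crustal composition table would need to be fixed by the project.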