Using spatially constrained clustering in land cover mapping

Fernanda Néry 1,2, Luís de Sousa 1, Pedro Marrecas 2, Ricardo Sousa 1 and João Matos
1 Department of Civil Engineering and Architecture, IST, Technical University of Lisbon, PT Av. Rovisco Pais, 1049-001 Lisboa, Portugal
Tel.: + 001 555 832 1155; Fax: + 001 555 832 1156
2 Instituto Geográfico Português
Rua Artilharia Um, n.º 107, 1099-052 Lisboa, Portugal
Tel.: + 004 555 874 414; Fax: + 004 555 874 414

Traditional land cover mapping imposes a predefined taxonomy of mutually exclusive hard categories upon a surface which can be perceived as continuous. Boundary uncertainty and heterogeneity of the resulting regions are inherent to such an approach, but can nevertheless be reduced through the definition and application of explicit criteria. Following a design-based evaluation of the positional and attribute uncertainty in a photo-interpreted land cover map, categories were identified that do not attain the predefined accuracy levels. As expected, those were categories representing land use instead of land cover (e.g. sport and leisure facilities; green urban areas) and heterogeneous categories. This paper focuses on the latter case. Heterogeneous categories (e.g. “complex cultivation patterns” in the CORINE Land Cover nomenclature) are a result of limitations in the support of either the input information (e.g. spectral mixture due to pixel size in remote sensing applications) or the specified output information (e.g. minimum mapping unit [MMU] of vector polygon maps). Information regarding the degree of heterogeneity is generally not provided to the final user. Data accuracy can be improved using ancillary information and/or spatially constrained clustering algorithms. Spatial constraints are built using connectivity criteria – which objects are connected? – and distance criteria – how far apart are two connected objects? Connectivity can be defined geometrically or topologically. The resulting graph structure, or its equivalent binary incidence matrix, can be used directly in the clustering algorithm or be weighted using distance functions. This allows further flexibility and the integration of spatial and semantic constraints. Simple Euclidean distance can be used, or any empirical or heuristic measure of similarity between objects or the categories they originally belong to.
As an output of the clustering process, a set of heterogeneity measures is obtained for each object, which can be used to evaluate the results against the original visual interpretation.
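
The idea of spatially constrained clustering described above can be illustrated with a minimal sketch. It is not the algorithm used in the paper, which the abstract does not specify; it assumes a simple agglomerative scheme in which only clusters linked by a connectivity edge may merge, Euclidean distance between cluster centroids as the similarity measure, and mean squared distance to the centroid as one possible heterogeneity measure. All names (`constrained_clustering`, `heterogeneity`, the toy `features` and `edges`) are hypothetical.

```python
def centroid(features, members):
    """Mean attribute vector of a group of objects."""
    dims = len(features[members[0]])
    return tuple(
        sum(features[m][d] for m in members) / len(members) for d in range(dims)
    )


def euclidean(u, v):
    """Simple Euclidean distance between two attribute vectors."""
    return sum((x - y) ** 2 for x, y in zip(u, v)) ** 0.5


def constrained_clustering(features, edges, n_clusters):
    """Agglomerative clustering that merges only spatially connected clusters.

    features: dict mapping an integer object id to a tuple of attribute values
    edges: iterable of (id_a, id_b) pairs, i.e. the connectivity graph
    """
    clusters = {i: [i] for i in features}      # cluster id -> member objects
    neighbours = {i: set() for i in features}  # cluster id -> adjacent cluster ids
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    while len(clusters) > n_clusters:
        # Candidate merges are restricted to pairs joined by an edge.
        candidates = [
            (euclidean(centroid(features, clusters[a]),
                       centroid(features, clusters[b])), a, b)
            for a in clusters for b in neighbours[a] if a < b
        ]
        if not candidates:
            break  # graph is disconnected: no further merges are allowed
        _, a, b = min(candidates)
        clusters[a].extend(clusters.pop(b))
        # Cluster a inherits the neighbours of the absorbed cluster b.
        for n in neighbours.pop(b):
            neighbours[n].discard(b)
            if n != a:
                neighbours[n].add(a)
                neighbours[a].add(n)
    return list(clusters.values())


def heterogeneity(features, members):
    """Mean squared distance of the members to their centroid (assumed measure)."""
    c = centroid(features, members)
    return sum(euclidean(features[m], c) ** 2 for m in members) / len(members)


# Toy example: six polygons in a chain 0-1-2-3-4-5 with one attribute each.
features = {0: (1.0,), 1: (1.1,), 2: (5.0,), 3: (5.2,), 4: (1.05,), 5: (0.9,)}
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]

clusters = constrained_clustering(features, edges, n_clusters=3)
for members in clusters:
    print(members, heterogeneity(features, members))
```

In this toy case the objects at the two ends of the chain have similar attribute values, but they are kept in separate clusters because no edge connects them; an unconstrained clustering would be free to group them together. Replacing `euclidean` with another distance function is where semantic constraints, such as similarity between the original categories, could be introduced.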

Keywords: spatial constraints, clustering, classification, land cover

In: Caetano, M. and Painho, M. (eds). Proceedings of the 7th International Symposium on Spatial Accuracy Assessment in Natural Resources and Environmental Sciences, 5 – 7 July 2006, Lisboa, Instituto Geográfico Português
