Freeman & Ford (2001)

Effects of data quality on analysis of ecological pattern using the K(d) statistical function.

Elizabeth A. Freeman and E. David Ford

Ecology 83: 35-46, 2001

Abstract. The K(d) function is a summary statistic of all plant-plant distances in a mapped area. It offers the potential for detecting both different types and different scales of pattern in a single map. Two types of error occur in maps of individual plants. Data management errors, caused by transcription errors or other mishandling, are large errors that apply to small numbers of plants. Measurement errors, caused by the mapping techniques and equipment, are small errors that apply to all plants. Simulation of known spatial patterns combined with increasing levels of both types of error showed:

(1) Data management errors cause the spatial patterns identified by the statistical function K(d) to become less significant but do not cause a shift in the scale of the identified patterns.

(2) Measurement errors cause the spatial patterns identified by K(d) to become less significant and to shift to larger scales. The effects of measurement errors are proportional to the scale, or size, of the underlying spatial patterns. Detection of inhibition between points is more sensitive to measurement error than detection of clustered distributions. Detection of small clusters is more sensitive than detection of large clusters, and measurement error tends to cause an over-estimation of clump size. For patterns with inhibition, the minimum establishment distance is more sensitive to error than the maximum distance at which inhibition affects survival probability.
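The two error types can be illustrated with a minimal sketch. It is not the authors' simulation code. The estimator below is the common uncorrected form of Ripley's K (no edge correction), and both error models are assumptions made for illustration: measurement error is modeled as isotropic Gaussian noise added to every point, and data management error as relocation of a small fraction of points to uniformly random positions in a square plot. All function and parameter names are hypothetical.

```python
import numpy as np

def ripley_k(points, distances, area):
    """Naive K(d) estimate without edge correction.

    K(d) = area * (number of ordered pairs i != j with d_ij <= d) / (n * (n - 1))
    """
    points = np.asarray(points, dtype=float)
    n = len(points)
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    pair_dist = dist[~np.eye(n, dtype=bool)]  # drop self-distances
    counts = np.array([(pair_dist <= d).sum() for d in distances])
    return area * counts / (n * (n - 1))

def add_measurement_error(points, sigma, rng):
    """Assumed model: small Gaussian displacement applied to every point."""
    return points + rng.normal(0.0, sigma, size=points.shape)

def add_data_management_error(points, fraction, side, rng):
    """Assumed model: a fraction of points is moved to random plot locations."""
    out = points.copy()
    n_bad = int(round(fraction * len(points)))
    idx = rng.choice(len(points), size=n_bad, replace=False)
    out[idx] = rng.uniform(0.0, side, size=(n_bad, 2))
    return out
```

Comparing `ripley_k` on a simulated pattern before and after each kind of error, over a range of distances, is one way to see how the estimated curve changes. A serious analysis would also need edge correction and simulation envelopes for significance.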

Two examples of tree spatial distributions from the Wind River Canopy Crane Research Facility stem map data set were analyzed with K(d). Clusters of Thuja plicata were detected with cluster size much larger than the levels of mapping error identified in the data. Significant inhibition was detected between large (dbh 0.2m) trees of all species. The largest scale at which inhibition occurred was much larger than the level of mapping error, but the minimum distance of significant inhibition, i.e., the distance within which neighbors are never found, was on the order of the mapping error. Accurate identification of this distance may not be possible using K(d).
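Why a minimum inhibition distance close to the mapping error is hard to recover can be sketched with a toy simulation. This is an illustration under stated assumptions, not the analysis applied to the Wind River data: points are generated by simple sequential inhibition with a hard minimum spacing, and mapping error is assumed to be Gaussian. The plot size, spacing, and error level are arbitrary hypothetical values.

```python
import numpy as np

def simulate_hard_core(n, side, min_dist, rng, max_tries=100_000):
    """Simple sequential inhibition: reject candidates closer than min_dist."""
    pts = []
    tries = 0
    while len(pts) < n and tries < max_tries:
        tries += 1
        cand = rng.uniform(0.0, side, size=2)
        if all(np.hypot(*(cand - p)) >= min_dist for p in pts):
            pts.append(cand)
    return np.array(pts)

def min_pair_distance(points):
    """Smallest distance between any two distinct points."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # ignore self-distances
    return dist.min()

rng = np.random.default_rng(1)
trees = simulate_hard_core(100, side=100.0, min_dist=3.0, rng=rng)
# Assumed mapping error: standard deviation of 1.0 in each coordinate.
mapped = trees + rng.normal(0.0, 1.0, size=trees.shape)
print(min_pair_distance(trees), min_pair_distance(mapped))
```

In the error-free pattern the smallest spacing cannot fall below `min_dist`. Once the noise is of similar magnitude to that spacing, the observed minimum in the mapped coordinates no longer reliably reflects it.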

In conclusion, the necessary map accuracy is a function of the question being asked. A map of individual plant locations cannot be used to investigate processes occurring at scales that approach the accuracy of the measurements. If a map is only going to be used to investigate large-scale processes, such as clustering, then it does not need to be as accurate as a map that will be used to investigate small-scale processes such as inhibition. However, use of K(d) in ecology has frequently been concerned with detecting inhibition and defining its scale. If an individual plant location map will be used for a general investigation of all spatial processes occurring in the community, it should be kept in mind that any spatial processes approaching the scale of the mapping accuracy might not be revealed by K(d).