# Laws of Geography

## Summary and Keywords

The most prominent law in geography is Tobler’s first law (TFL) of geography, which states that “everything is related to everything else, but near things are more related than distant things.” No other law in geography has received more attention than TFL. It is important because many of the spatial statistical methods developed since its publication, especially since the advent of geographic information systems (GIS) and geospatial technology, are conceptually based on it. These methods include global and local indicators of spatial autocorrelation (SA), spatial and spatial-temporal hotspots and cold spots, and spatial interpolation. All of these are highly relevant to spatial crime analysis, modeling, and mapping and will be discussed in the main part of this text.

Keywords: Tobler’s first law (TFL) of geography, spatial autocorrelation (SA), Moran’s I, Geary’s C, local indicator of spatial association (LISA), hotspot, kernel density estimation (KDE), geospatial privacy, geographical masking

The Spatial Concept

As an academic discipline, geography combines social, natural, and environmental sciences that are connected by the spatial concept. The development of geographic information systems (GIS) and of geospatial technology, which combines GIS, remote sensing, and global navigation satellite systems (GNSS) (Shellito, 2016), has undoubtedly increased the visibility of geography as an academic discipline. It also has made geography an important contributor to the technology-driven workforce in both the private sector and the government.

Today, the geography of crime is a “niche” subdiscipline of geography that has been gaining momentum with the overall development of GIS and geospatial technology since the 1990s (LeBeau & Leitner, 2011). This trend is expected to continue due to the establishment of international associations [e.g., International Association of Crime Analysts (IACA) and the Space, Place, and Crime working group of the European Society of Criminology (ESC)], the organization of annual conferences (e.g., IACA Training Conference and EuroCrim), and related developments. Spatial crime analysis and GIS form an interdisciplinary effort that draws on many disciplines, the most important of these being geography, criminal justice, sociology, psychology, political science, and environmental sciences. The United States and the United Kingdom have been at the forefront of the research and application of spatial crime analysis and GIS. Today’s key research and application areas include:

• Spatial and spatial-temporal hotspot analysis

• Criminal geographic profiling

• Criminal predictive analytics

• Crime risk surfaces

• Social media and big data

• Exceptional events and crime

• Criminal network analysis

• Cybercrime

Most of these topics possess a spatial component, and some of them are based on Tobler’s first law (TFL) of geography, discussed in more detail in the next section. The geography of crime is primarily a social science discipline, with some aspects of it related to environmental sciences (Brantingham & Brantingham, 1981). There have been some recent discussions in geography about whether TFL should be referred to more accurately as an observed regularity or a principle, instead of a law. The main reason is that many social scientists believe that human behavior cannot be defined by a law (Sui, 2004; Tobler, 2004). However, the authors of this article use the originally published and popular terminology, “TFL of geography” or “TFL,” for short, throughout the remainder of this discussion.

Tobler’s First Law of Geography and Other Laws

The most prominent law is indisputably TFL, which was formulated by Waldo Tobler^{1} in a publication in *Economic Geography* in 1970. It states that “everything is related to everything else, but near things are more related than distant things” (Tobler, 1970), and it means that characteristics of phenomena at one location on Earth tend to be similar to those at nearby locations. The exact definition of the words *nearby* and *similar*, of course, depends on the particular phenomena (Goodchild, 2009). For example, the crime rate in one neighborhood may be similar to crime rates in spatially contiguous neighborhoods and dissimilar to crime rates in neighborhoods that are at a farther distance away.

The concept of TFL is also described by geographers as “spatial dependence,” which is one of the basic assumptions for spatial interpolation. TFL remained largely unnoticed in the 1970s and 1980s. However, starting in the 1990s, the development and growth of GIS and later, geographic information science (GISc), gradually increased the popularity of TFL both within and outside the field of geography (Sui, 2004).

In addition to TFL, the geographic literature offers rather little research that has resulted in so-called laws of any kind. The strongest candidate for a second law of geography is spatial heterogeneity, the observation that conditions change from place to place, a phenomenon that statisticians refer to as *nonstationarity*. This concept has received increased attention recently with the development of place-based (or local) statistics that measure the properties of places, with no expectation that results will generalize to other places (Goodchild, 2009). One popular method is Geographically Weighted Regression (GWR), which measures the nonstationarity in the relationship among variables across the study area (Fotheringham, Brunsdon, & Charlton, 2002). The concept of spatial heterogeneity is certainly relevant to the geography of crime. Due to the local uniqueness of individual places, results from crime research in one area may be neither reproducible in other study areas nor generalizable to a larger region.

By analogy to TFL, Montello, Fabrikant, Ruocco, and Middleton (2003) proposed a “first law of cognitive geography,” which states that people *believe* that closer things are more similar. The authors test this law with visualizations of nonspatial information, termed *spatialization*. For example, in the spatialization of large databases, distance is commonly used as a metaphor to depict semantic (nonspatial) similarities among data items. Results of this research largely support the first law of cognitive geography (Montello, Fabrikant, Ruocco, & Middleton, 2003).

More than a half-century ago, Toepfer and Pillewizer (1966) introduced the “Radical Law” of cartographic generalization, which offers various formulas to estimate the reduction in the number of map features when generalizing from a larger-scale to a smaller-scale map. It was the first quantitative approach for the cartographic generalization process commonly known as *selection*. The Radical Law is well known among cartographers, but it does not have any particular relevance to crime analysis and modeling.

None of the other laws discussed here has received as much recognition and interest as TFL of geography, so they will not be discussed in more detail in this article. TFL is important because it serves as the basic concept for many spatial statistical methods that are highly relevant for crime analysis and modeling. These include global and local indicators of spatial autocorrelation (SA) (e.g., Moran’s I, Getis-Ord Gi*), spatial and spatial-temporal hotspot and cold spot analysis, and spatial interpolation (e.g., kernel density estimation, or KDE).

In the main part of this article, these spatial and spatial-temporal statistical methods are discussed in more detail, and evidence is given of their relevance to crime analysis and modeling, including selected practical examples. At the end, an important topic associated with the potential violation of privacy is discussed—namely, the fact that personal information is increasingly collected and applied to spatial crime analysis and modeling. This concept is referred to as *geospatial privacy*.

SA

Classical statistics assumes that observations are independently distributed. When this assumption is applied to spatial statistics, it means that attribute values for locations or enumeration units are spatially unrelated to each other. Under this assumption, crime locations and their attribute values would be randomly distributed across space, but this has been disproven by empirical evidence many times (e.g., Eck, Chainey, Cameron, Leitner, & Wilson, 2005; Fotheringham et al., 2002; Levine, 2015). These studies have shown that crime, similar to many other social and physical phenomena, is not randomly distributed, but rather spatially clustered or spatially dependent. This dependency, the opposite of *complete spatial randomness (CSR)*, can be measured formally with SA statistics, which differentiate among positive SA, negative SA, and no SA (Fig. 1).

Positive SA is an arrangement in which crime locations with similar attribute values are spatially clustered. This type of SA can generate spatial hotspots, spatial cold spots, and spatial mean or spatial average spots. Spatial hotspots are a clustering of crimes with high attribute values. In contrast, spatial cold spots are crimes with low attribute values that are spatially close, and spatial mean spots are defined as the spatial clustering of crime locations with average values. However, this latter type, in contrast to hot and cold spots, is rarely of much interest to crime analysts. Finally, it is important to note that positive SA is also the formalization of the concept of TFL (Fig. 1a).

The opposite of positive SA is negative SA, which can be defined as a spatial arrangement in which nearby crime locations have dissimilar values that become more similar as distances between crime locations increase. Negative SA can create spatial outliers. Two types of spatial outliers exist. The first type occurs when a crime location with a low value is immediately surrounded by other crime locations with high values. The second type is the opposite arrangement of the first, in which a crime location with a high value has low-value crime locations immediately around it (Fig. 1b).

Both types of spatial outliers can be detected with Anselin’s local Moran statistic (Anselin, 1995). The local Moran is an example of a local indicator of spatial association (LISA) statistic and is discussed in more detail in the next section. Finally, CSR indicates a lack of SA, that is, independence between crime observations and their attribute values. This spatial pattern is observed when both crime locations and their attribute values are completely randomly distributed across space (Fig. 1c). Spatial hot, cold, or mean spots or spatial outliers may exist in a CSR pattern, but such local patterns have been generated purely by chance.

Global and Local Indicators of SA

TFL of geography can be formally measured with global and local indicators of SA. Global indicators of SA provide a single value that applies to the entire study area. In contrast, local indicators show how SA varies within and across the study area. The most prominent statistics for global SA are Moran’s I (Moran, 1950) and Geary’s C (Geary, 1954). An important measure for local SA is the local Moran, which is one example of LISA (Anselin, 1995).

## Moran’s I

Moran’s I is one of the oldest global indicators of SA. It can be applied to individual crime locations (e.g., burglaries of single-family homes) that have quantitative attribute values (e.g., amount of stolen goods or the damage done) associated with them. It also can be measured by the number of crimes aggregated to different enumeration units (e.g., counties) and related to their population, resulting in a crime rate (e.g., burglary rates). This rate represents the quantitative attribute value associated with each enumeration unit. Moran’s I compares the attribute value (stolen goods or burglary rate) at any one location (single-family home) or for any enumeration unit (county) with the attribute value at all other locations or enumeration units, as shown in Formula 1:

$$I = \frac{n \sum_{i=1}^{n} \sum_{j=1}^{n} W_{ij} (x_i - \overline{X})(x_j - \overline{X})}{\left( \sum_{i=1}^{n} \sum_{j=1}^{n} W_{ij} \right) \sum_{i=1}^{n} (x_i - \overline{X})^2} \tag{1}$$

where *n* is the number of observations; $x_i$ is the value of an attribute at a particular location or enumeration unit, *i*; $x_j$ is the value of the same attribute at another location or enumeration unit (where *i* ≠ *j*); $\overline{X}$ is the mean of the attribute; and $W_{ij}$ is the weight applied to the comparison between the location or enumeration unit *i* and the location or enumeration unit *j*.

$W_{ij}$ can take the form of a contiguity matrix or be formalized as a distance-based weight. The contiguity matrix can be applied only to enumeration units, and the weight is set to 1 if two enumeration units share a common boundary, 0 otherwise. In the case of the distance-based weight, $W_{ij}$ is often the inverse of the distance ($d_{ij}$) between locations or enumeration units *i* and *j* ($\frac{1}{d_{ij}}$) (Levine, 2015).

Moran’s I ranges from ‒1 to 1, with the expected value [*E(I)*] being negative but very close to 0 (Formula 2). Moran’s I values above *E(I)* indicate positive SA, while values below *E(I)* indicate negative SA. Moran’s I values similar to *E(I)* indicate attribute values to be randomly assigned to, or spatially independent of, locations or enumeration units. Tests of significance for Moran’s I have also been developed (Levine, 2015).

$$E(I) = -\frac{1}{n - 1} \tag{2}$$

where *n* is the number of observations (i.e., locations or enumeration units).
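The calculation described above can be illustrated with a minimal sketch in plain Python. It assumes the standard form of Moran’s I (weighted cross-products of deviations from the mean, scaled by the sum of weights and the sum of squared deviations) and the expected value of ‒1/(n ‒ 1). The function names, the four hypothetical burglary rates, and the contiguity matrix are illustrative assumptions and do not come from the cited sources.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and an n x n weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    # Numerator: weighted cross-products of deviations from the mean.
    cross = sum(weights[i][j] * dev[i] * dev[j] for i, j in pairs)
    w_sum = sum(weights[i][j] for i, j in pairs)
    sum_sq = sum(d * d for d in dev)
    return (n / w_sum) * (cross / sum_sq)


def expected_i(n):
    """Expected value of Moran's I under spatial independence."""
    return -1.0 / (n - 1)


# Hypothetical burglary rates for four units arranged in a row (A-B-C-D).
rates = [10.0, 12.0, 30.0, 32.0]
# Contiguity weights: 1 if two units share a boundary, 0 otherwise.
W = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]

print(morans_i(rates, W), expected_i(len(rates)))
```

In this toy arrangement, low rates sit next to low rates and high rates next to high rates, so the statistic lies above its expected value, which is the pattern the text describes as positive SA.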

## Geary’s C

An alternative statistic to Moran’s I used to measure global SA is Geary’s C. Similar to Moran’s I, it can be applied to individual crime locations and to crimes aggregated to enumeration units that have attribute values associated with them, as shown in Formula 3:

$$C = \frac{(n - 1) \sum_{i=1}^{n} \sum_{j=1}^{n} W_{ij} (x_i - x_j)^2}{2 \left( \sum_{i=1}^{n} \sum_{j=1}^{n} W_{ij} \right) \sum_{i=1}^{n} (x_i - \overline{X})^2} \tag{3}$$

where *n* is the number of observations; $x_i$ is the value of an attribute at a particular location or enumeration unit, *i*; $x_j$ is the value of the same attribute at another location or enumeration unit (where *i* ≠ *j*); $\overline{X}$ is the mean of the attribute; and $W_{ij}$ is a weight applied to the comparison between location or enumeration unit *i* and location or enumeration unit *j* (Levine, 2015).

The range of Geary’s C is between 0 and 2, with the expected value [*E(C)*] being 1, indicating a CSR or spatial independence of attribute values across crime locations. Geary’s C values ranging from 0 to < 1 typically indicate positive SA, while values between 1 and 2 indicate negative SA. The range of Geary’s C can be matched with that of Moran’s I by calculating an adjusted Geary’s C statistic, as in Formula 4:

$$C_{adj} = 1 - C \tag{4}$$

An adjusted Geary’s C value that is negative indicates negative SA, a value that is positive indicates positive SA, and a value of 0 indicates no SA, a CSR, or spatial independence.

The main difference between both statistics is how the dissimilarity of attribute values between crime locations is measured (i.e., the numerator of Formulas 1 and 3). Moran’s I calculates the sum of the cross-products of the deviations of individual attribute values from the mean value at two different crime locations at a time. In contrast, the difference in attribute values for any two crime locations defines the dissimilarity between attribute values in the case of Geary’s C (Levine, 2015).
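The difference between the two numerators can be made concrete with a short sketch of Geary’s C. It assumes the standard form of the statistic (weighted squared differences between pairs of values) and assumes that the adjusted statistic is obtained as 1 ‒ C. The function names and the four hypothetical burglary rates are illustrative only.

```python
def gearys_c(values, weights):
    """Global Geary's C for a list of values and an n x n weight matrix."""
    n = len(values)
    mean = sum(values) / n
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    # Numerator: weighted squared differences between pairs of values.
    sq_diff = sum(weights[i][j] * (values[i] - values[j]) ** 2 for i, j in pairs)
    w_sum = sum(weights[i][j] for i, j in pairs)
    sum_sq = sum((v - mean) ** 2 for v in values)
    return ((n - 1) * sq_diff) / (2 * w_sum * sum_sq)


def adjusted_c(c):
    """Assumed adjustment (1 - C) so that the sign reads like Moran's I."""
    return 1.0 - c


# Hypothetical burglary rates for four units arranged in a row (A-B-C-D).
rates = [10.0, 12.0, 30.0, 32.0]
# Contiguity weights: 1 if two units share a boundary, 0 otherwise.
W = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]

c = gearys_c(rates, W)
print(c, adjusted_c(c))
```

For these values, C falls below 1 and the adjusted value is positive, so both readings point to positive SA.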

LISA

The local Moran is an example of LISA statistics. It is an important measure of local SA (Anselin, 1995; Leitner & Brecht, 2007; Eck et al., 2005). The starting point to measure the local Moran comes from crime data aggregated to enumeration units and visualized in the form of a choropleth map (Fig. 2a). In this type of thematic map, crime data need to be normalized as either ratios, densities, or percentages. In the following example, crime data are expressed as burglary rates for the city of Boston based on census tracts. In Figure 2a, burglary rates are visualized as a so-called box plot map. Box plot maps are based on box plots, which summarize different measures of descriptive statistics. A box plot map usually consists of six classes, with four of the six classes containing about 25% of the ordered burglary rates. Lower and upper outliers, if they exist, make up the other two classes.

Statistical outliers are unusually high or low values that lie more than 1.5 or 3.0 interquartile ranges (IQRs) above the upper end or below the lower end of the box plot’s box. Burglary rates that are more than 1.5 or 3.0 IQRs above the upper end of the box are referred to as *mild* or *extreme upper outliers*, respectively. Conversely, burglary rates that are more than 1.5 or 3.0 IQRs below the lower end of the box are referred to as *mild* or *extreme lower outliers*, respectively. Figure 2a shows that burglary rates in Boston have no lower (mild or extreme) outliers (shown as the dark blue box in the map legend), but 12 mild upper outliers (shown as census tracts with a dark red filling). In addition, some of these upper outliers could be extreme outliers, and census tracts with these 12 very high burglary rates can be defined as spatial hotspots.
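The outlier rule can be sketched as follows. The quartile method (`method="inclusive"` in Python’s `statistics.quantiles`) is an assumption, because software packages differ in how they compute the ends of the box; the function name and the sample rates are hypothetical and are not the Boston data.

```python
import statistics


def classify_outliers(values):
    """Label each value as a mild/extreme lower/upper outlier, or 'none'."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    labels = []
    for v in values:
        if v > q3 + 3.0 * iqr:
            labels.append("extreme upper")
        elif v > q3 + 1.5 * iqr:
            labels.append("mild upper")
        elif v < q1 - 3.0 * iqr:
            labels.append("extreme lower")
        elif v < q1 - 1.5 * iqr:
            labels.append("mild lower")
        else:
            labels.append("none")
    return labels


# Hypothetical burglary rates for eleven enumeration units.
sample_rates = [1, 2, 3, 4, 5, 6, 7, 8, 9, 20, 40]
labels = classify_outliers(sample_rates)
print(list(zip(sample_rates, labels)))
```

With these values the box runs from 3.5 to 8.5, so a rate of 20 is a mild upper outlier and a rate of 40 is an extreme upper outlier.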

The calculation of the local Moran statistic requires the construction of a spatial weights matrix, which can either be contiguity or distance based. A value of 1 is assigned to the contiguity-based matrix if two enumeration units are adjacent to each other, 0 otherwise. In a distance-based matrix, neighbors are identified according to a chosen threshold distance between enumeration units. If two enumeration units are a shorter distance apart than the threshold distance, then both units are regarded as neighbors and a value of 1 is entered into the matrix. If the distance between the two enumeration units is longer than the threshold distance, then a value of 0 is assigned. A completed spatial weights matrix is required for the computation of a spatially lagged variable. This variable is calculated for each enumeration unit and represents the average of all the units’ neighboring attribute values, as defined by the spatial weights matrix.
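A minimal sketch of the distance-based weights matrix and the spatially lagged variable follows. The centroid coordinates, the threshold, and the function names are hypothetical. Units without any neighbor receive `nan` as their lag, which is a design choice of this sketch rather than a rule from the text.

```python
import math


def distance_weights(coords, threshold):
    """Binary weights: 1 if two units are closer than the threshold, else 0."""
    n = len(coords)
    W = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(coords[i], coords[j]) < threshold:
                W[i][j] = 1
    return W


def spatial_lag(values, W):
    """Average attribute value of each unit's neighbors."""
    lag = []
    for row in W:
        k = sum(row)  # number of neighbors of this unit
        if k == 0:
            lag.append(float("nan"))
        else:
            lag.append(sum(w * v for w, v in zip(row, values)) / k)
    return lag


# Hypothetical centroids of four enumeration units and their burglary rates.
centroids = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (5.0, 0.0)]
rates = [10.0, 20.0, 30.0, 40.0]

W = distance_weights(centroids, threshold=1.5)
lagged = spatial_lag(rates, W)
print(W)
print(lagged)
```

The fourth unit lies farther than the threshold from every other unit, which shows why the choice of threshold distance determines who counts as a neighbor.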

The Moran scatterplot in Figure 2b compares the burglary rate for each census tract on the *x*-axis (labeled “BURGRT”), with its spatially lagged burglary rate on the *y*-axis (labeled “lagged BURGRT”). The red line denotes the regression line between these two variables, with the slope of this line being identical to the global Moran’s I (0.1838). The scatterplot in Figure 2b is also partitioned into four quadrants by a vertical dashed line that is defined by the mean burglary rate, and by a horizontal dashed line that is defined by the mean spatially lagged burglary rate. Census tracts falling into the upper-right quadrant and their neighbors both possess above-average burglary rates.

In contrast, census tracts falling into the lower-left quadrant and their neighbors both possess below-average burglary rates. Because census tracts in these two quadrants follow the concept of TFL of geography, they can be defined as exhibiting local positive SA (i.e., spatial hot and cold spots). The other two quadrants have census tracts with below-average burglary rates surrounded by neighboring census tracts whose burglary rates are above average (upper-left quadrant), and census tracts with above-average burglary rates surrounded by neighboring census tracts whose burglary rates are below average (lower-right quadrant). Measured against TFL of geography, census tracts falling into these two quadrants exhibit local negative SA and are referred to as *spatial outliers*. Because the global Moran’s I statistic in Figure 2b is positive, the majority of census tracts fall into the lower-left and upper-right quadrants, indicating local positive SA.
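The partition of the Moran scatterplot can be expressed as a short classification routine. The sketch below follows the description above (a vertical line at the mean rate and a horizontal line at the mean lagged rate); the label strings and the five pairs of values are hypothetical.

```python
def moran_quadrants(values, lagged):
    """Assign each unit to a quadrant of the Moran scatterplot."""
    mean_x = sum(values) / len(values)   # vertical dashed line
    mean_y = sum(lagged) / len(lagged)   # horizontal dashed line
    labels = []
    for x, y in zip(values, lagged):
        if x >= mean_x and y >= mean_y:
            labels.append("high-high")   # upper right: hotspot candidate
        elif x < mean_x and y < mean_y:
            labels.append("low-low")     # lower left: cold spot candidate
        elif x < mean_x:
            labels.append("low-high")    # upper left: spatial outlier
        else:
            labels.append("high-low")    # lower right: spatial outlier
    return labels


# Hypothetical burglary rates and their spatially lagged rates.
rates = [10.0, 12.0, 30.0, 32.0, 8.0]
lagged_rates = [12.0, 20.0, 22.0, 30.0, 31.0]

quadrants = moran_quadrants(rates, lagged_rates)
print(quadrants)
```

The quadrant only describes the type of local association; whether it is statistically significant has to be tested separately.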

Measures of local SA can be visualized with local Moran cluster maps (Fig. 2c) and local Moran significance maps (Fig. 2d). In these maps, census tracts are highlighted with a color other than gray only if the local SA between their burglary rates and neighboring burglary rates differs in a statistically significant way from random distributions of burglary rates, as created by 99 random permutations of burglary rates.

The random assignment process of attribute values across enumeration units is referred to as *Monte Carlo simulation*. In the Moran cluster map (Fig. 2c), census tracts with statistically significant local positive SAs are filled with red (spatial hotspots) and blue (spatial cold spots). In contrast, census tracts that show statistically significant local negative SAs are filled with pink and light blue (spatial outliers). In the local Moran significance map, the two different shades of green define two different levels of significance, expressed as probabilities that the resulting hotspots, cold spots, and spatial outliers are created by random distributions of burglary rates. Census tracts filled with light green are significant at a p-value of 0.05, whereas census tracts filled with dark green are significant at a p-value of 0.01.
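The permutation approach can be sketched as follows. This is an illustrative implementation under stated assumptions: the local Moran is computed from deviations and a row-averaged lag, the value of the unit under test is held fixed while the remaining values are shuffled, and the pseudo p-value is (r + 1) / (permutations + 1), where r counts permuted statistics at least as extreme (in absolute value) as the observed one. The function names and the data are hypothetical, and a study area with only four units is far too small for meaningful inference.

```python
import random


def local_moran(values, W):
    """Local Moran statistic for each unit, using the average neighbor deviation."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    m2 = sum(d * d for d in z) / n
    result = []
    for i in range(n):
        k = sum(W[i])
        lag = sum(W[i][j] * z[j] for j in range(n)) / k if k else 0.0
        result.append(z[i] * lag / m2)
    return result


def pseudo_p_values(values, W, permutations=99, seed=0):
    """Monte Carlo pseudo p-values from conditional random permutations."""
    rng = random.Random(seed)
    observed = local_moran(values, W)
    p_values = []
    for i in range(len(values)):
        others = values[:i] + values[i + 1:]
        extreme = 0
        for _ in range(permutations):
            rng.shuffle(others)
            # Keep unit i's own value in place; reassign all other values.
            shuffled = others[:i] + [values[i]] + others[i:]
            if abs(local_moran(shuffled, W)[i]) >= abs(observed[i]):
                extreme += 1
        p_values.append((extreme + 1) / (permutations + 1))
    return p_values


rates = [10.0, 12.0, 30.0, 32.0]
W = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]

local_i = local_moran(rates, W)
p = pseudo_p_values(rates, W)
print(local_i)
print(p)
```

With 99 permutations the smallest attainable pseudo p-value is 0.01, which matches the stricter of the two significance levels shown in the significance map.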

Spatial and Spatial-Temporal Hotspots

Data analysts do not only want to know whether their data cluster *spatially*. Geographical analysis gives them an insight into *where* data cluster. A crime analyst, for example, creates a hotspot map of recent robberies to detect locations that may be used for preventive actions.

For the following discussion, a spatial and spatial-temporal *hotspot* is defined as a location or geographical area with a higher-than-average concentration of point (crime) locations. The opposite term, “cold spot,” is defined as a location or geographical area with a lower-than-average concentration of point (crime) locations. Hotspots may exist at different scales of interest, and their identification is the first step in analyzing why hotspots occur at specific locations or areas. Using GIS, the identification of hotspots is a rather simple task nowadays. However, cartographic visualization principles can make the creation and the interpretation of a hotspot map quite challenging. The reasons are that an analyst needs to consider the modifiable areal unit problem (MAUP)^{2} (Openshaw, 1983), the number of class ranges and their thresholds, and the appropriate use of colors and map design for spatial patterns. Hence, the analyst needs to reflect on how to present the results, whether as absolute values or concentrations. Sometimes it may be useful to normalize data against an underlying population for a meaningful map.

Dot-Based Hotspots

A heuristic, but widely used, approach to illustrating hotspots is in the form of a dot-based hotspot map. In this map type, each dot represents a single event (Fig. 3a). In these maps, hotspots are identified by analysts based on their experience and subjective judgment. Although dot-based hotspot maps offer some value for interpretation (e.g., when details about the location need to be preserved), they face a critical problem. Events on repeat locations cannot be identified because the dots overlap (i.e., they are stacked on top of each other). The analyst would consider these locations as a single event, which may result in a biased interpretation of hotspots. To overcome this problem, different size dots (i.e., graduated symbols) can be applied. The more events that occur at the same location, the larger the symbol size that these events are represented with. Dot-based hotspot maps may be a good approach for hotspot mapping when only a small number of events is displayed. The larger the data volume that needs to be shown, the more cluttered the map, and therefore, the more difficult it will be to purely visually identify hotspots. A solution to overcome this problem is to aggregate events to administrative boundaries or regular grids.
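The graduated-symbol idea can be sketched by counting events per location and deriving a symbol size from the count. Scaling the size with the square root of the count (so that symbol area grows with the count) is an assumption of this sketch, as are the function name, the base size, and the coordinates.

```python
import math
from collections import Counter


def graduated_symbols(events, base_size=4.0):
    """Map each distinct location to (event count, symbol size)."""
    counts = Counter(events)
    return {loc: (n, base_size * math.sqrt(n)) for loc, n in counts.items()}


# Hypothetical event coordinates; four events share the same location.
events = [(1.0, 2.0), (1.0, 2.0), (1.0, 2.0), (1.0, 2.0), (3.0, 4.0)]

symbols = graduated_symbols(events)
print(symbols)
```

On a plain dot map the four stacked events would look like one dot, whereas the graduated symbol makes the repeat location visible.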

Administrative Boundary Thematic Hotspots

Using GIS, events can be aggregated to administrative boundaries such as ZIP codes, police beats, or districts (Fig. 3b) which analysts, police, and the public are very familiar with. These aggregated events can then be visualized as quantitative thematic maps (also known as *choropleth maps*) that show the distribution of the total number of events throughout a study area (Eck et al., 2005). Darker-colored areas often represent hotspots, whereas brighter-colored areas represent cold spots.

Although this technique is very popular for mapping events, it is affected by the MAUP. The MAUP refers to the problem that resulting hotspots reflect the number and shapes of the administrative boundaries used. As a simple example, let us assume that all events in a police beat occur at a single location. The event count would then be attributed to the entire police beat, even though it represents events at that single location only. The selection of alternate administrative boundaries may show a different pattern of hotspots because of modified areas.

In addition to the MAUP, map readers may be visually attracted to larger administrative areas that fall into the class with the highest event count. Due to varying sizes of administrative areas, absolute count maps can mislead map readers in the identification of where hotspots exist. If events were evenly distributed across enumeration units of different sizes, the larger units would fall into a higher count class. For this reason, event counts should not be used for administrative boundary thematic mapping without normalizing counts. Data used for normalization may include the size of each administrative area, the population, or the number of households, houses, or apartments (e.g., the count of residential burglaries may be normalized by the number of residential houses).
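A small arithmetic sketch shows how normalization can change the picture. The two police beats, their burglary counts, and their house counts are hypothetical, and expressing the rate per 1,000 residential houses is an illustrative choice.

```python
def rate_per(count, denominator, per=1000):
    """Events per `per` units of the denominator (e.g., per 1,000 houses)."""
    return per * count / denominator


# Hypothetical beats: (residential burglaries, residential houses).
beats = {"Beat A": (50, 10000), "Beat B": (20, 1000)}

burglary_rates = {name: rate_per(c, d) for name, (c, d) in beats.items()}
print(burglary_rates)
```

Beat A has the higher absolute count, but Beat B has the higher burglary rate once the number of houses is taken into account.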

Grid Thematic Hotspots

An alternative solution to normalizing crime counts is to use a regular grid consisting of cells with the same size, where the absolute number of events can be aggregated. This mapping method is referred to as *grid thematic mapping*. It solves the problem with varying sizes and shapes of administrative boundaries. This technique aggregates events based on a regular grid (Fig. 3c). The absolute counts for each grid cell are comparable, as all cells have the same size. Although this technique can be seen as an enhancement of administrative boundary thematic mapping, grid thematic mapping has its own limitations. For example, identified hotspots are represented by predefined shapes that consist of grid cells. This may result in inaccurate interpretations due to the MAUP. In addition, the map looks blocky (Eck et al., 2005). To reduce the blocky appearance of the map, the grid cell size is often reduced. However, this results in a speckled map of tiny, shaded grid cells, with the effect of losing the visual pattern.
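Aggregating events to a regular grid can be sketched in a few lines. The cell size, the grid origin, the function name, and the event coordinates are hypothetical. Cells are identified by their (column, row) index.

```python
from collections import Counter


def grid_counts(points, cell_size, origin=(0.0, 0.0)):
    """Count events per square grid cell, keyed by (column, row)."""
    counts = Counter()
    for x, y in points:
        col = int((x - origin[0]) // cell_size)
        row = int((y - origin[1]) // cell_size)
        counts[(col, row)] += 1
    return counts


# Hypothetical event coordinates in map units.
events = [(0.5, 0.5), (0.7, 0.2), (1.5, 0.5), (2.5, 2.5), (0.5, 0.5)]

counts = grid_counts(events, cell_size=1.0)
print(dict(counts))
```

Rerunning the sketch with a different `cell_size` or `origin` changes which events fall together, which is the grid-based form of the MAUP mentioned above.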

KDE Hotspots

A popular method to create hotspots is continuous surface smoothing, which provides a smooth, continuous surface to visualize the distribution of events across the study area (Chainey & Ratcliffe, 2005). Common interpolation techniques such as kriging or inverse distance weighting need an intensity value that is interpolated over the entire surface. These approaches are useful for interpolating values of rainfall or temperature, which are referred to as *spatially continuous data* (i.e., they possess a value at every location on the Earth’s surface). Most data from the physical environment are spatially continuous. In contrast, data sources, such as crime data, do not exist at every location on Earth, but at specific locations. Such data are referred to as *spatially discrete data*, and an alternative set of methods (instead of kriging) needs to be applied for creating a continuous surface with such data.

A useful and popular method for visualizing spatially discrete data is KDE (Fig. 3d). This technique produces a smooth surface representing density values of point events across the study area. First, a regular grid is generated for the entire study area. Then, a three-dimensional kernel function, such as a quartic or normal function, is placed over each event and calculates weights based on the distance to a reference location. At each reference location, the weighted values are summed, resulting in a continuous surface.
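The summation of kernel weights at each reference location can be sketched as follows. The sketch assumes one common form of the quartic kernel, 3 / (πh²) · (1 ‒ d²/h²)² for distances d smaller than the bandwidth h and 0 otherwise. Software packages may scale the kernel differently, and the function name, event coordinates, and reference locations are hypothetical.

```python
import math


def quartic_kde(events, reference_points, bandwidth):
    """Kernel density at each reference point, using a quartic kernel."""
    h2 = bandwidth ** 2
    densities = []
    for gx, gy in reference_points:
        total = 0.0
        for ex, ey in events:
            d2 = (gx - ex) ** 2 + (gy - ey) ** 2
            if d2 < h2:  # events beyond the bandwidth contribute nothing
                total += 3.0 / (math.pi * h2) * (1.0 - d2 / h2) ** 2
        densities.append(total)
    return densities


# Hypothetical events (two at the same location) and three reference points.
events = [(0.0, 0.0), (0.0, 0.0)]
reference_points = [(0.0, 0.0), (0.5, 0.0), (2.0, 0.0)]

density = quartic_kde(events, reference_points, bandwidth=1.0)
print(density)
```

The density is highest at the event location, declines with distance, and drops to zero beyond the bandwidth, which is why the bandwidth controls how smooth the surface appears.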

A few parameters need to be set prior to running a KDE. First, the kernel function needs to be defined. CrimeStat, for example, offers a normal, quartic, uniform, triangular, and negative exponential kernel function (Levine, 2015). Each of these has a specific impact on calculated weights. Many software applications do not require the user to choose a kernel function. Most of these applications apply the quartic kernel function (e.g., ArcGIS). Second, the cell size of the regular grid needs to be defined. The cell size does not affect the KDE itself, only the visual appearance of the final density surface. Surfaces with larger cell sizes look blockier, whereas smaller cell sizes result in a visually appealing map.

The number of events should drive the selection of cell size, with smaller cell sizes being used for a smaller number of events and larger cell sizes for a larger number of events. The last parameter is the choice of the kernel bandwidth. For all but the normal kernel function, the bandwidth is the distance from the center of the kernel to its edge. For the normal kernel function, the bandwidth equals one standard deviation, a measure of variability. The bandwidth has the most significant impact on the density estimation. The larger the bandwidth, the smoother the resulting surface will be. Smaller bandwidths result in distinctive local hotspots. The choice of bandwidth will be influenced strongly by the purpose of the map: Smaller bandwidths could be used for focused police resource allocation, and larger bandwidths could be used for a more strategic view of crime hotspots (Eck et al., 2005).

Although the KDE provides hotspot maps that are intuitive and easily understood by the map reader, the resulting kernel density surface depends on the analyst’s selection of the aforementioned parameters. For example, the choice of bandwidth influences the extent of the smoothing of the surface and, thus, the size and intensity of hotspots. One of many approaches to choosing an appropriate bandwidth length for KDE has been suggested by Williamson, McLafferty, McGuire, Goldsmith, and Mollenkopf (1999). The authors suggest varying the bandwidth relative to different orders (K) of the mean nearest-neighbor distance. K = 1 refers to a first-order nearest-neighbor distance, K = 2 to a second-order nearest-neighbor distance, and so on.

A first-order nearest-neighbor distance is calculated as the mean (straight-line) distance between the location of each crime event and its closest other crime event. Similarly, a second-order nearest-neighbor distance is calculated as the mean (straight-line) distance between the location of each crime event and its second-closest other crime event. In general, nearest-neighbor distances increase with higher orders since neighboring crime locations are increasingly a greater distance from each other. When K = 5, the kernel bandwidth chosen by the analyst should be the same as the mean distance of the first-, second-, third-, fourth-, and fifth-order nearest-neighbor distances. This approach suggested by Williamson et al. (1999) may be useful to choose an appropriate bandwidth value, but it still depends on the analyst’s choice of the specific value of K.
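The nearest-neighbor distances described above can be computed with a brute-force sketch. The second function reads the text literally and averages the mean nearest-neighbor distances of orders 1 through K; this reading, the function names, and the coordinates are assumptions of the sketch.

```python
import math


def mean_kth_nn_distance(points, k):
    """Mean straight-line distance from each point to its k-th closest other point."""
    total = 0.0
    for i, p in enumerate(points):
        distances = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        total += distances[k - 1]
    return total / len(points)


def bandwidth_from_orders(points, max_k):
    """Average of the mean nearest-neighbor distances for orders 1..max_k."""
    return sum(mean_kth_nn_distance(points, k) for k in range(1, max_k + 1)) / max_k


# Hypothetical crime locations along a street, one map unit apart.
crimes = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]

first_order = mean_kth_nn_distance(crimes, 1)
second_order = mean_kth_nn_distance(crimes, 2)
bandwidth = bandwidth_from_orders(crimes, 2)
print(first_order, second_order, bandwidth)
```

The second-order distance exceeds the first-order distance, in line with the statement that nearest-neighbor distances increase with higher orders.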

Additional issues with the KDE method exist. First, KDE creates smoothed density values for areas where no crime events happened or no crime events could possibly happen (e.g., a house burglary in the middle of a lake). Second, analysts may not always select the most appropriate method to classify the density values into a small number of classes (e.g., 7 ± 2), with classes represented by a color scheme. In general, lighter color values are used for lower-density values and darker color values are applied to higher-density values.

The selection of the classification method is crucial for the visualization of kernel density values and, hence, visual interpretation of the distribution of hotspots (Dent, Torguson, & Hodler, 2009). In general, so-called optimal classification methods are preferred because they create classes with density values that are very similar to each other (this is also referred to as *internal homogeneity*) and result in classes that are very different from each other (this is also referred to as *external heterogeneity*). The Jenks optimization method is a good example of an optimal classification method (Jenks, 1967).

When it is important to identify statistically significant and objective hotspots, kernel density values should be classified by standard deviations, with each class being either one or one-half standard deviations wide (Chainey, Reid, & Stuart, 2002). The standard deviation is a common statistical measure of variability and measures how much density values deviate, on average, from the statistical mean. This usually results in an even number of classes. With the standard deviation classification, objective hotspots can be identified as the collection of all cells from the kernel density surface that possess density values that are higher than two standard deviations above the mean. In contrast, cold spots can be identified by cells with density values that are lower than two standard deviations below the mean. Alternative statistical methods to discover statistically significant hotspots do exist. Although some issues remain, the KDE is currently in vogue, as it is one of the most visually appealing and among the most statistically robust methods for identifying hotspots (Chainey & Ratcliffe, 2005).
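The standard deviation rule can be illustrated with a short sketch. It assumes the kernel density surface has already been computed and flattened into a list of cell values; the function name `classify_by_std` and the two-standard-deviation cutoffs passed as defaults are illustrative choices that mirror the thresholds named above.

```python
import statistics

def classify_by_std(density, hot=2.0, cold=-2.0):
    """Label each cell by its distance from the mean in standard deviations."""
    mean = statistics.fmean(density)
    sd = statistics.pstdev(density)
    labels = []
    for value in density:
        # z-score of the cell; a constant surface has no hot or cold spots.
        z = (value - mean) / sd if sd > 0 else 0.0
        if z > hot:
            labels.append("hotspot")
        elif z < cold:
            labels.append("cold spot")
        else:
            labels.append("neither")
    return labels

# Hypothetical density values: twenty low cells and one very high cell.
cells = [1.0] * 20 + [50.0]
print(classify_by_std(cells)[-1])
```

Because kernel densities cannot be negative, a strongly skewed surface may contain no cells that fall more than two standard deviations below the mean, in which case this rule yields hotspots but no cold spots.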

Dual-KDE Hotspots

None of the approaches discussed thus far consider “population at risk” (or “denominator population”) data in the calculation of hotspots to discover areas where crime rates are very high. Including such data is advantageous because the distribution of hotspots could depend on the underlying population at risk, insofar as a higher number of people could result in a higher number of crimes. The “population at risk” could be the residential (or better, the ambient) population for calculating hotspots of robbery rates or pickpocketing rates. Alternative denominator populations for robbery rates also could be pedestrian counts. Other examples of “population at risk” data could be the number of registered vehicles for determining vehicle crime rates (Eck et al., 2005) or the number of houses or apartments for residential burglary rates. Using crime rates instead of absolute numbers of crime allows a direct and unbiased comparison across different neighborhoods, or even across different study areas.

The dual KDE is an example of a hotspot technique that can include a population at risk. This technique applies two different distributions of event data (e.g., individual crimes and “population at risk”). The crime rate is then calculated as the ratio between crime events in the numerator and the population at risk in the denominator (Levine, 2015). All issues identified for the single KDE still need to be taken into consideration when using the dual KDE, especially the choice of bandwidth and methods for classification of crime rates (Chainey & Ratcliffe, 2005).
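As a rough sketch of the ratio idea, the following code estimates one kernel density surface for crime events and one for the population at risk, then divides them cell by cell. It is a simplified illustration, not the CrimeStat routine: it assumes a Gaussian kernel, a single shared bandwidth, and population data represented as points, and the names `kde` and `dual_kde_ratio` are hypothetical.

```python
import math

def kde(points, grid, bandwidth):
    """Gaussian kernel density of `points` evaluated at each grid location."""
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2)
    surface = []
    for gx, gy in grid:
        total = 0.0
        for px, py in points:
            d2 = (gx - px) ** 2 + (gy - py) ** 2
            total += norm * math.exp(-d2 / (2.0 * bandwidth ** 2))
        surface.append(total)
    return surface

def dual_kde_ratio(crimes, population, grid, bandwidth, eps=1e-12):
    """Cell-by-cell ratio of crime density to population-at-risk density."""
    numerator = kde(crimes, grid, bandwidth)
    denominator = kde(population, grid, bandwidth)
    # Cells with (almost) no population at risk have an undefined rate.
    return [
        n / d if d > eps else float("nan")
        for n, d in zip(numerator, denominator)
    ]

# Hypothetical inputs: two crimes, four population points, a 2-cell grid.
crimes = [(0.0, 0.0), (0.5, 0.0)]
population = [(0.0, 0.0), (0.5, 0.0), (4.0, 4.0), (4.5, 4.0)]
grid = [(0.0, 0.0), (4.0, 4.0)]
print(dual_kde_ratio(crimes, population, grid, bandwidth=1.0))
```

The ratio is sensitive to the bandwidth in both surfaces, which is why the bandwidth and classification issues raised for the single KDE carry over.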

Spatial-Temporal Hotspots

Each hotspot method discussed thus far in this article creates a hotspot map for a specific time period. An extension of purely spatial hotspots is the spatial-temporal hotspot, which analyzes crime patterns not only over space, but also over time. In order to fully understand crimes, it is necessary to analyze how their pattern changes over both space and time. The analysis of hotspot variations over time results in finding temporal patterns (e.g., cycles and rhythms of crime occurrences). Two approaches are briefly discussed next: temporal changes in the distribution of spatial crime clusters, and spatial-temporal statistical methods to identify space-time interactions and clustering of crimes.

The first approach is to explore temporal variations in the distribution of spatial hotspots. This is accomplished simply by comparing the spatial distribution of hotspots at time *t* with the distribution of hotspots at time *t* + 1 for the same study area. Temporal change can be assessed either by a visual comparison or, more objectively, by a statistical method, including dual KDE. As described previously, the dual KDE applies two different distributions of crime event data. Using crime data sets from two different time intervals, it is possible to map differences between time *t* and time *t* + 1 on a cell-by-cell basis. This approach is useful for locating areas where the two crime distributions remain stable or differ dramatically over time (Jacquez, 2008).

Besides comparing hotspot distributions at two different time periods, methods exist that account for space-time interaction and spatial-temporal clustering. Statistics derived from the Knox test (Knox, 1964) or the Mantel test (Mantel, 1967) are appropriate for space-time interaction. Both tests are global in nature and are therefore useful if the analyst wants to know whether space-time clustering is present anywhere in the study area for a specific time period. Being global, however, these tests are unable to find the specific locations inside the study area where spatial-temporal crime clusters exist. Scan statistics can be used to detect the locations and sizes of statistically robust space-time hotspots. The space-time permutation scan statistic gradually scans a window across space and time, noting the number of observed and expected observations inside the window at each location. The most likely cluster is the window with the maximum likelihood ratio. Each spatial-temporal hotspot found with the scan statistic includes both a spatial and a temporal dimension (Kulldorff, Heffernan, Hartman, Assunção, & Mostashari, 2005).
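To make the idea of a global space-time interaction test concrete, the sketch below counts pairs of events that are close in both space and time, which is the quantity at the heart of the Knox test. Significance is assessed here by a Monte Carlo permutation of the event times; that choice, the function names, the thresholds, and the sample events are assumptions made for illustration rather than details taken from Knox (1964).

```python
import math
import random

def knox_statistic(events, space_thresh, time_thresh):
    """Count event pairs that are close in both space and time."""
    count = 0
    for i in range(len(events)):
        xi, yi, ti = events[i]
        for j in range(i + 1, len(events)):
            xj, yj, tj = events[j]
            close_in_space = math.hypot(xi - xj, yi - yj) <= space_thresh
            close_in_time = abs(ti - tj) <= time_thresh
            if close_in_space and close_in_time:
                count += 1
    return count

def knox_test(events, space_thresh, time_thresh, n_perm=999, seed=0):
    """Return (observed count, pseudo p-value) from permuting event times."""
    rng = random.Random(seed)
    observed = knox_statistic(events, space_thresh, time_thresh)
    times = [t for _, _, t in events]
    at_least_as_large = 0
    for _ in range(n_perm):
        # Shuffling times breaks any link between where and when.
        rng.shuffle(times)
        permuted = [(x, y, t) for (x, y, _), t in zip(events, times)]
        if knox_statistic(permuted, space_thresh, time_thresh) >= observed:
            at_least_as_large += 1
    return observed, (at_least_as_large + 1) / (n_perm + 1)

# Hypothetical events as (x, y, day): two tight space-time clusters.
events = [(0.0, 0.0, 0), (0.1, 0.0, 1), (10.0, 10.0, 50), (10.1, 10.0, 51)]
print(knox_test(events, space_thresh=1.0, time_thresh=2, n_perm=99))
```

The output is a single count and a single p-value for the whole study area, which illustrates why a global test cannot say where the clusters are.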

A relatively new field of research and application related to hotspot mapping is forecasting future crime events and predicting where and when crime will happen next. Any hotspot technique potentially can be used to forecast crimes for mostly short-term periods (e.g., the next 24 hours, or the next week, or month, depending on the purpose of crime prevention actions). Applying hotspots to forecast crime is based on the concept that future crime will occur in areas where past crime has predominantly occurred. That is, past crime can forecast future crime. Hence this method is referred to as *retrospective forecasting*.

However, this forecasting method has been questioned, as the choice of how much past crime data to use in determining what will happen next is somewhat arbitrary (Chainey & Ratcliffe, 2005). Using the technique as described would also mean that all crime events carry equal weight in the calculation of future hotspots. Current research suggests giving more recent events more weight than older events. The risk of a further crime event (such as a burglary) is communicable in time as well as in space. This means that there is an increased risk of another crime event occurring shortly after, and close to, an initial crime event, a concept that is referred to as *near-repeat victimization*. Bowers, Johnson, and Pease (2004) suggest that their proposed methodology based on this concept is 30% more accurate than traditional hotspot techniques, such as KDE or thematic mapping of boundary areas. Predictive crime mapping, including crime forecasting, has gained much attention in recent years, with great potential for new research in this young field of crime analysis and modeling.
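The idea of weighting events by recency and proximity can be sketched as follows. This is not the formula of Bowers, Johnson, and Pease (2004); the exponential decay functions, the parameter names `space_scale` and `time_scale`, and the sample data are assumptions chosen only to show how older and more distant events contribute less to the forecast risk of a grid cell.

```python
import math

def weighted_risk(cell, events, now, space_scale, time_scale):
    """Risk score for one grid cell from past events as (x, y, time)."""
    cx, cy = cell
    risk = 0.0
    for x, y, t in events:
        age = now - t
        if age < 0:
            continue  # ignore events dated after the forecast time
        dist = math.hypot(cx - x, cy - y)
        # Contribution decays with distance from the cell and with age.
        risk += math.exp(-dist / space_scale) * math.exp(-age / time_scale)
    return risk

# Hypothetical burglaries: one recent and nearby, one old and distant.
burglaries = [(10.0, 10.0, 28), (400.0, 300.0, 2)]
print(weighted_risk((0.0, 0.0), burglaries, now=30,
                    space_scale=200.0, time_scale=14.0))
```

With equal weights, both burglaries would count the same; here the recent, nearby event dominates the score.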

Geospatial Privacy

Geospatial privacy (or geoprivacy, for short) addresses issues that are raised from the use, sharing, and publishing of spatial data and related deliverables that contain confidential, sensitive, or private information about individuals. With respect to crime data sets, spatial confidentiality disclosure occurs when actual locations of crime incidents are revealed, such as in a thematic point map of residential burglaries. A report published by the U.S. Department of Justice outlines five confidentiality issues when sharing crime data and maps: (a) victims may fear that criminals see them as easy targets; (b) victims will be unwilling to participate in investigations, given the possibility that offenders can find their addresses on a map; (c) victims also may not report a repeat victimization; (d) incident details on a map can be misused; and (e) ultimately, privacy is breached when private addresses are disclosed and identifiable (Wartell & McEwen, 2001). A survey of the public in London addressed some of these issues and other practical implications related to online crime maps in the United Kingdom (Kounadi, Bowers, & Leitner, 2014). Although participants of the study were not fully aware of the implications of publishing crime data, they were concerned overall with potential misuses associated with online crime maps.

The following two examples highlight the type of privacy violation of individuals and potential misuses that have actually happened. In December 2012, *The Journal News* published on an interactive map the names and addresses of handgun permit holders in two suburban counties in New York State. This was followed by an excessive amount of furious comments about privacy violation from readers and residents in the area, and the map was removed from the journal’s webpage soon after (CNN, 2013). Another release by a British newspaper, the *News of the World*, had more serious consequences. The newspaper published the names, pictures, and approximate residence addresses of sex offenders. As a result, innocent people were misidentified as being the offenders and suffered fire attacks on their homes (Marsh & Melville, 2011).

However, not every crime type and crime-related information is associated with privacy implications. Privacy violation risks are linked only with crime types and related information that mainly or partially occur on private property, such as domestic violence or violent crime. Furthermore, both types of crime have different risks of reidentification. The location of a domestic violence incident always pinpoints to a residential location. On the other hand, violent crimes may occur on private property, commercial property, public places, or on the street.

Another aspect that is important to consider is the type of information that is at risk of being disclosed (in other words, the risk of reidentification information). For residential burglaries, the risk of reidentification information is either residential addresses or, at a finer resolution, households, while for a general burglaries data set (in which commercial burglaries cannot be distinguished from residential burglaries), the reidentification information is all addresses in a study area. As general guidance on the geoprivacy implications of crime data, each data set should be examined individually to assess its disclosure risk, and the regulations or laws that currently apply in the respective country should be consulted.

Understanding the Disclosure Risk

Geoprocessing and data mining are necessary activities to disclose information from spatial data. Activities vary depending on the form in which crime data are released. Four release scenarios are discussed next; further scenarios that involve additional analytical activities may exist. The first scenario is the disclosure of a crime data set that includes the *x*- and *y*-coordinates for each incident through an open data platform such as a governmental or institutional website. A data broker could breach the confidentiality of this data set in three steps. First, the address at each location can be retrieved with an automated or manual reverse geocoding process (i.e., extracting textual information, such as a name or an address, from geographic coordinates). Second, addresses can be linked with other sources of information, such as white page directories, to infer identities (names) linked to locations of incidents. In the last step, inferred names can be matched to user accounts of location-based social network applications. The spatial, temporal, and textual information that is available from these applications allows the data broker to mine and extract additional personal information (e.g., hobbies or personal interests).

The second scenario is the disclosure of crime incidents in online mapping websites. The data broker would initially have to create the data set from the presented locations on the map and then follow the three steps discussed in the first scenario. Depending on the size of the data set, this can result in a considerably time-consuming task. However, online maps are interactive and are depicted in a resolution that could provide precise addresses and locations. Hence, the process of creating the data set will have an insignificant effect or no effect at all on the accuracy of the collected addresses.

The third scenario is the disclosure of static crime maps (digital or printed) which involves additional geoanalytical tasks. First, geographical coordinates should be retrieved from the map with a process that is known as *reengineering of locations*. Leitner, Mills, and Curtis (2007) examined the accuracy of this process with an approach that includes the following five steps:

1. Scanning the map, if it is in a printed form

2. Georectifying the digital map and zooming into the map, where the sensitive locations are found

3. Digitizing the outline of the symbols that, after zooming in, depict sensitive locations as small circles or ellipses

4. Computing the centroids of the circles or ellipses

5. Overlaying the centroids on a street map of the same area to identify the street address in the network that is closest to each centroid

The new locations are the reengineered locations, and the data broker then will have to follow the disclosure steps from the two previous scenarios.
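Steps 4 and 5 of this procedure can be sketched in code, mainly to show how little analytical effort they require. The sketch assumes the digitized symbol outline is available as a list of vertices in the same projected coordinate system as a table of address points; it uses straight-line distance to address points rather than a street network, and the function names and addresses are hypothetical.

```python
import math

def polygon_centroid(vertices):
    """Area-weighted centroid of a simple polygon (shoelace formula)."""
    area2 = 0.0  # twice the signed area
    cx = 0.0
    cy = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if area2 == 0:
        raise ValueError("degenerate polygon with zero area")
    return cx / (3.0 * area2), cy / (3.0 * area2)

def nearest_address(point, addresses):
    """Return the address whose location is closest to `point`."""
    px, py = point
    return min(
        addresses,
        key=lambda name: math.hypot(addresses[name][0] - px,
                                    addresses[name][1] - py),
    )

# Hypothetical digitized symbol outline and address points.
outline = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
addresses = {"12 Elm St": (1.1, 0.9), "40 Oak Ave": (5.0, 5.0)}
centroid = polygon_centroid(outline)
print(centroid, nearest_address(centroid, addresses))
```

The accuracy of the recovered address depends on the georectification and digitizing steps that precede these two, which the sketch does not model.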

The last scenario is the release of a protected crime data set. In this situation, a protection algorithm has been applied to the original incidents to preserve their anonymity. The success of the protection is not always guaranteed. It strongly depends on the protection algorithm and on whether information about it is known and disclosed. In particular, the disclosure risk increases when either metadata or multiple protected releases of the same original data set are disclosed, because they reveal to the data broker key elements of how to infer original locations from protected locations (Zimmerman & Pavlik, 2008; Cassa, Wieland, & Mandl, 2008).

Geographical Masking

*Geographical masking* is a group of protection algorithms for discrete location data. The main concept is to decrease the positional quality of an original point data set by lowering its precision or accuracy. Two common geographical masks are presented next, which reduce the positional accuracy and include some degree of randomization to reduce the disclosure risk. Further geographical masking approaches, as well as an assessment of their effectiveness to preserve confidentiality, are discussed in more detail by Zandbergen (2014).

The first geographical mask is “donut geomasking” by Hampton et al. (2010). It is depicted in the top three images in Figure 4. To demonstrate how the masks work, we assume that the data set to be protected contains locations of residential burglaries. In donut geomasking, the input data are the original locations (symbolized with black triangles) and administrative units that contain information on the risk of reidentification.

In this example, the risk of reidentification is the number of households within the administrative unit. The distribution of households within the unit is assumed to be homogeneous. Each burglary location is randomly displaced by at least a minimum distance within a circle (shown with purple tori). The tori or donuts are called *uncertainty areas* because original burglaries can be moved any distance and in any direction within their boundaries. The size of the circle is computed proportionally to the size of the administrative unit so that each circle contains at least a minimum number of households. Due to the homogeneity assumption, all circles for the same minimum number of households have the same size.
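A minimal sketch of the displacement step is given below. It is an illustration under stated assumptions rather than the implementation of Hampton et al. (2010): the function names are hypothetical, the displaced point is drawn uniformly over the area of the donut, and the outer radius is derived from a household count and unit area under the homogeneity assumption described above.

```python
import math
import random

def radius_for_households(k, unit_households, unit_area):
    """Radius of a circle expected to hold k households at uniform density."""
    density = unit_households / unit_area
    return math.sqrt(k / (density * math.pi))

def donut_mask(point, r_min, r_max, rng=random):
    """Displace `point` by a random distance between r_min and r_max."""
    if not 0 <= r_min < r_max:
        raise ValueError("require 0 <= r_min < r_max")
    x, y = point
    angle = rng.uniform(0.0, 2.0 * math.pi)
    # Sampling the squared radius uniformly spreads points evenly by area.
    radius = math.sqrt(rng.uniform(r_min ** 2, r_max ** 2))
    return x + radius * math.cos(angle), y + radius * math.sin(angle)

# Hypothetical unit: 1,000 households in 10 square kilometers.
r_max = radius_for_households(k=100, unit_households=1000, unit_area=10.0)
r_min = radius_for_households(k=10, unit_households=1000, unit_area=10.0)
masked = donut_mask((5.0, 5.0), r_min, r_max, random.Random(7))
print(r_min, r_max, masked)
```

The inner radius guarantees that a masked point never coincides with its original location, while the outer radius bounds the spatial error introduced by the mask.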

The bottom three images of Figure 4 show the components and output result of the “adaptive areal elimination” geographical mask (Kounadi & Leitner, 2016). The spatial resolution of the unit that contains the risk of reidentification information should be as fine as possible to minimize the spatial error of the protected/masked burglary locations compared to the original burglary locations. In the example shown in Figure 4, households are supposed to be known for each street block within the administrative unit. Each block is dissolved with a neighbor recursively until all final uncertainty areas have at least a minimum number of households (shown as polygons of varying sizes with an orange outline).

Similar to donut geomasking, each burglary is translated randomly within its uncertainty area. The maps on the right of Figure 4 (top and bottom) show the final masked locations that were produced from the two approaches. These burglary locations are symbolized with purple and orange triangles for the donut geomasking (top right) and the adaptive areal elimination (bottom right), respectively.

Conclusion

The discussion in this article focuses primarily on TFL of geography, which states that “everything is related to everything else, but near things are more related than distant things” (Tobler, 1970). TFL is probably the most important law in geography because its concept is the basis of many spatial statistical methods that have been developed since its publication, including global and local statistics for both spatial point and polygon patterns. The latter are created by aggregating points to enumeration units, which may then be related to a “population at risk” variable to calculate a rate. Spatial statistical methods discussed include SA, spatial interpolation, and different ways to identify spatial and spatial-temporal hot and cold spots. These methods are highly relevant to spatial crime analysis, modeling, and mapping and all practical examples relate to this topic. At the end, the geospatial privacy of individual information and methods for preventing its disclosure risk are discussed.

## References

Anselin, L. (1995). Local Indicators of Spatial Association—LISA. *Geographical Analysis*, *27*, 93–115.

Bowers, K. J., Johnson, S. D., & Pease, K. (2004). Prospective hot-spotting. *British Journal of Criminology, 44*, 641–658.

Brantingham, P. J., & Brantingham, P. L. (Eds.). (1981). *Environmental criminology*. Beverly Hills: SAGE.

Cassa, C. A., Wieland, S. C., & Mandl, K. D. (2008). Re-identification of home addresses from spatial locations anonymized by Gaussian skew. *International Journal of Health Geographics*, *7*, 45.

Chainey, S. P., & Ratcliffe, J. H. (2005). *GIS and crime mapping*. London: Wiley.

Chainey, S. P., Reid, S., & Stuart, N. (2002). When is a hotspot a hotspot? A procedure for creating statistically robust hotspot maps of crime. In G. Higgs (Ed.), *Innovations in GIS 9 socio-economic applications of geographic information science* (pp. 21–36). London: Taylor & Francis.

CNN. (2013). Newspaper removes controversial online database of gun permit holders. Retrieved from http://edition.cnn.com/2013/01/18/us/new-york-gun-permit-map/.

Dent, B. D., Torguson, J. S., & Hodler, T. W. (2009) *Cartography, thematic map design* (6th ed.). Dubuque, IA: WCB, McGraw-Hill.

Eck, J. E., Chainey, S., Cameron, J. G., Leitner, M., & Wilson, R. E. (2005) *Mapping crime: Understanding hot spots*. Washington, DC: U.S. Department of Justice.

Fotheringham, A. S., Brunsdon, C., & Charlton, M. (2002) *Geographically weighted regression: The analysis of spatially varying relationships*. Hoboken, NJ: Wiley.

Geary, R. (1954). The contiguity ratio and statistical mapping. *The Incorporated Statistician, 5*, 115–145.

Goodchild, M. F. (2009). First Law of Geography. In R. Kitchin & N. Thrift (Eds.), *International encyclopedia of human geography* (pp. 179–182). Amsterdam: Elsevier.

Hampton, K. H., Fitch, M. K., Allshouse, W. B., Doherty, I. A., Gesink, D. C., Leone, P. A., . . ., Miller, W. C. (2010). Mapping health data: improved privacy protection with donut method geomasking. *American Journal of Epidemiology*, *172*(9), 1062–1069.

Jacquez, G. M. (2008). Spatial cluster analysis. In A. S. Fotheringham & J. P. Wilson (Eds.), *Handbook of geographic information science* (pp. 395–416). Oxford: Blackwell.

Jenks, G. F. (1967). The data model concept in statistical mapping. *International Yearbook of Cartography*, *7*, 186–190.

Knox, G. (1964). The detection of space-time interactions. *Applied Statistics*, *13*, 25–29.

Kounadi, O., Bowers, K., & Leitner, M. (2014). Crime mapping on-line: Public perception of privacy issues. *European Journal on Criminal Policy and Research*, *21*(1), 167–190.

Kounadi, O., & Leitner, M. (2016). Adaptive areal elimination (AAE): A transparent way of disclosing protected spatial datasets. *Computers, Environment and Urban Systems*, *56*, 59–67.

Kulldorff, M., Heffernan, R., Hartman, J., Assunção, R., & Mostashari, F. (2005). A space-time permutation scan statistic for disease outbreak detection. *Public Library of Science Medicine*, *2*(3), e59.

LeBeau, J. L., & Leitner, M. (2011). Progress in research on the geography of crime. *The Professional Geographer*, *63*(2), 161–173.

Leitner, M., & Brecht, H. (2007). Crime analysis and mapping with GeoDa 0.9.5-i. *Social Science Computer Review*, *25*(2), 265–271.

Leitner, M., Mills, J. W., & Curtis, A. (2007). Can novices to geospatial technology compromise spatial confidentially? *Kartographische Nachrichten*, *57*(2), 78–84.

Levine, N. (2015). *CrimeStat*: *A spatial statistics program for the analysis of crime incident locations* (v 4.02). Houston and Washington, DC: Ned Levine & Associates and National Institute of Justice.

Mantel, N. (1967). The detection of disease clustering and a generalized regression approach. *Cancer Research*, *27*, 201–220.

Marsh, I., & Melville, G. (2011). Moral panics and the British media—a look at some contemporary “folk devils.” *Internet Journal of Criminology*. Retrieved from https://media.wix.com/ugd/b93dd4_53049e8825d3424db1f68c356315a297.pdf.

Montello, D. R., Fabrikant, S. I., Ruocco, M., & Middleton, R. S. (2003). Testing the First Law of Cognitive Geography on point-display spatializations. In W. Kuhn, M. F. Worboys, & S. Timpf (Eds.), COSIT 2003, *Lecture notes in computer science 2825* (pp. 316–331). Berlin and Heidelberg: Springer.

Moran, P. A. P. (1950). Notes on continuous stochastic phenomena. *Biometrika, 37*, 17–23.

Openshaw, S. (1983). *The modifiable areal unit problem*. Norwich, U.K.: Geo Books.

Shellito, B. A. (2016). *Introduction to geospatial technologies* (3rd ed.). New York: W. H. Freeman & Company.

Sui, D. Z. (2004). Tobler’s First Law of Geography: A big idea for a small world? *Annals of the Association of American Geographers*, *94*(2), 269–277.

Tobler, W. R. (1970). A computer movie simulating urban growth in the Detroit region. *Economic Geography, 46*, 234–240.

Tobler, W. R. (2002). Ma vie: Growing up in America and Europe. In W. Pitts & P. Gould (Eds.), *Geographical voices* (pp. 292–322). Syracuse, NY: University of Syracuse Press.

Tobler, W. R. (2004). On the First Law of Geography. A reply. *Annals of the Association of American Geographers*, *94*(2), 304–310.

Toepfer, F., & Pillewizer, W. (1966). The principles of selection. *Cartographic Journal*, *3*, 10–16.

Wartell, J., & McEwen, J. T. (2001). *Privacy in the information age: A guide for sharing crime maps and spatial data series*: *Research report*. Washington, DC: National Institute of Justice.

Williamson, D., McLafferty, S., McGuire, P., Goldsmith, V., & Mollenkopf, J. (1999). A better method to smooth crime incidence data. *ArcUser Magazine*, January–March, 1–5.

Zandbergen, P. A. (2014). Ensuring confidentiality of geocoded health data: Assessing geographic masking strategies for individual-level data. *Advances in Medicine*.

Zimmerman, D. L., & Pavlik, C. (2008). Quantifying the effects of mask metadata disclosure and multiple releases on the confidentiality of geographically masked health data. *Geographical Analysis*, *40*(1), 52–76.

## Notes:

(1.) Waldo Tobler is professor emeritus in the Department of Geography at the University of California, Santa Barbara (UCSB). He received his PhD in the Department of Geography at the University of Washington at Seattle and spent several years at the University of Michigan, before moving to UCSB. Until his retirement in 1994, he held positions of professor of geography and professor of statistics at UCSB. Broadly, his research and teaching interests fall into areas related to mathematical modeling and graphic interpretations. He has published 140 or so publications. He is a member of the U.S. National Academy of Sciences and, until his retirement, was a Fellow of the Royal Geographical Society in London. More detailed information about Waldo Tobler can be found in Tobler (2002) or at his Department of Geography at the UCSB website: http://www.geog.ucsb.edu/~tobler/.

(2.) The results of statistical analysis of spatial data that are aggregated to enumeration units are biased by the size and shape of such units. This bias is referred to as the *MAUP*.