Multilevel Modeling

15. Spatially Aggregated Data

While we have so far discussed the multilevel structure in terms of individuals at level-1 and places at level-2, we argue that a similar framework of people within places can be established using routinely available aggregate data (e.g., census and mortality data). As is well known, the analysis of aggregated data confounds the micro scale of people with the macro scale of places. Although regrettable, this confounding is usually tolerated because of the other obvious attractions of these data sets (e.g., their large size and extensive coverage of places at multiple levels). A multilevel approach offers a solution to this problem (Subramanian, Duncan et al., 2001).

Table 1

Hypothetical Counts of Death and Total Population by Social Class by Areas

Counts of Death out of total population

Area       Low Social Class    High Social Class
Area 1     9 out of 50         2 out of 50
Area 2     5 out of 90         5 out of 95
           10 out of 80        0 out of 50
Area 50    20 out of 90        0 out of 0

Table 1 provides hypothetical data on deaths for two social groups in a format that is typical of spatially aggregated data.

Thus, in Area 1, 9 out of 50 in the low social class category died in a particular year; in Area 2, 5 out of 95 in the high social class category died, and so on. In this table, individuals are grouped as ‘types’ (low and high social class) and are represented as ‘cells’ of a table that contain counts of death for each social group in every area. Importantly, by using the compact, aggregated form of Table 1, data agencies can preserve individual confidentiality.
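To make the cell format concrete, the following is a minimal sketch that stores Table 1 as one record per cell and computes the proportion of deaths in each. The `Cell` type, its field names, and the `death_proportion` helper are illustrative assumptions, not part of the source. The area labels "1", "2" and "50" follow the text; the label "unlabeled" is a placeholder for the row whose area is not identified.

```python
from typing import NamedTuple, Optional


class Cell(NamedTuple):
    """One table cell: a population 'type' within an area (hypothetical layout)."""
    area: str          # area identifier (the higher-level unit)
    social_class: str  # the 'type' that defines the cell
    deaths: int        # numerator: count of deaths
    total: int         # denominator: total population in the cell


# Counts taken from Table 1. "unlabeled" is a placeholder area label.
TABLE_1 = [
    Cell("1", "low", 9, 50), Cell("1", "high", 2, 50),
    Cell("2", "low", 5, 90), Cell("2", "high", 5, 95),
    Cell("unlabeled", "low", 10, 80), Cell("unlabeled", "high", 0, 50),
    Cell("50", "low", 20, 90), Cell("50", "high", 0, 0),
]


def death_proportion(cell: Cell) -> Optional[float]:
    """Deaths divided by population, or None when the cell holds nobody."""
    if cell.total == 0:
        return None
    return cell.deaths / cell.total


for cell in TABLE_1:
    print(cell.area, cell.social_class, death_proportion(cell))
```

Each record carries only counts for a group, never information about a named person, which is why this layout is compatible with preserving confidentiality.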

Figure 11

Figure depicting that individuals nest within areas producing a two-level hierarchical data structure as described in the text.

Five points need to be made about this table.

  1. It is vital to note that underlying Table 1 is simply a set of individual records that happens to be presented in a tabular format and can easily be converted into an individual record format.
  2. Just as individuals nest within areas, producing a two-level hierarchical data structure, so do the cells presented in Table 1, as shown in Figure 11.
  3. Although the data here is cross-tabulated by only one individual characteristic, exactly the same principles apply when there is a greater degree of cross-tabulation.
  4. If an area contains no people of a particular type (e.g., the missing high social class in Area 50 in Table 1), this poses no special problem, as multilevel data structures can be unbalanced.
  5. There are good reasons for invoking the notion of cells even when data is available in an individual record format, since the amount of information, and therefore the associated computing time, can be reduced substantially.
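The first, fourth, and fifth points can be illustrated with a short sketch that expands the cells of Table 1 into individual records and collapses them back again. The tuple layout and the function names `expand` and `collapse` are assumptions made for illustration, and "unlabeled" is a placeholder for the row whose area is not identified.

```python
from collections import Counter

# (area, social_class, deaths, total) for each cell of Table 1.
# "unlabeled" is a placeholder area label, not a value from the source.
cells = [
    ("1", "low", 9, 50), ("1", "high", 2, 50),
    ("2", "low", 5, 90), ("2", "high", 5, 95),
    ("unlabeled", "low", 10, 80), ("unlabeled", "high", 0, 50),
    ("50", "low", 20, 90), ("50", "high", 0, 0),
]


def expand(cells):
    """Return one (area, social_class, died) record per person.

    A cell with a total of zero contributes no records, so the
    resulting structure is simply unbalanced across areas.
    """
    records = []
    for area, social_class, deaths, total in cells:
        records += [(area, social_class, 1)] * deaths
        records += [(area, social_class, 0)] * (total - deaths)
    return records


def collapse(records):
    """Re-aggregate individual records into (area, class, deaths, total) cells."""
    deaths = Counter()
    totals = Counter()
    for area, social_class, died in records:
        deaths[(area, social_class)] += died
        totals[(area, social_class)] += 1
    return [(a, s, deaths[(a, s)], n) for (a, s), n in totals.items()]


individuals = expand(cells)
rebuilt = collapse(individuals)
print(len(individuals), "individual records vs", len(rebuilt), "populated cells")
```

With these counts the expansion yields 505 individual records, which collapse back into the 7 populated cells without loss of information. The empty high social class cell in Area 50 simply disappears.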

Consequently, routinely available aggregated data can readily be adapted to a multilevel data structure with table cells at level-1 (representing the population groups) nested within places at level-2. The counts within each cell give the number of people with the outcome of interest (e.g., number of deaths) together with the ‘denominator’ (the total population). The proportion so formed becomes the response variable and the cell characteristics, meanwhile, are the individual predictor variables. Such a structure now lends itself to all the analytical capabilities that were discussed earlier (Subramanian, Duncan et al., 2001).
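One common way to write such a model, offered here as a sketch rather than as the specification used in the cited work, is a two-level binomial logit model. All notation is assumed for illustration: $y_{ij}$ is the count of deaths in cell $i$ of area $j$, $n_{ij}$ is the denominator, $\pi_{ij}$ is the underlying probability of death, $x_{ij}$ is a cell characteristic (e.g., an indicator for high social class), and $u_{0j}$ is an area-level random effect.

```latex
\begin{align}
y_{ij} &\sim \text{Binomial}(n_{ij}, \pi_{ij}) \\
\operatorname{logit}(\pi_{ij}) &= \beta_0 + \beta_1 x_{ij} + u_{0j} \\
u_{0j} &\sim N(0, \sigma^2_{u0})
\end{align}
```

Under this formulation the observed proportion $y_{ij}/n_{ij}$ plays the role of the response, and a cell with $n_{ij} = 0$ contributes nothing to the likelihood.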

Subramanian, S. V., Duncan, C., et al. (2001) Multilevel perspectives on modeling census data. Environment and Planning A 33(3): 399-417.