Annals of Epidemiology

Volume 20, Issue 2, February 2010, Pages 99-107
Spatiotemporal Analysis and Mapping of Oral Cancer Risk in Changhua County (Taiwan): An Application of Generalized Bayesian Maximum Entropy Method

https://doi.org/10.1016/j.annepidem.2009.10.005

Purpose

The incidence rate of oral cancer in Changhua County was the highest among the 23 counties of Taiwan during 2001. However, in health data analysis, crude or adjusted incidence rates of a rare event (e.g., cancer) for small populations often exhibit high variances and are, thus, less reliable.

Methods

We proposed a generalized Bayesian Maximum Entropy (GBME) analysis for spatiotemporal disease mapping under conditions of considerable data uncertainty. GBME was used to study the population incidence of oral cancer in Changhua County (Taiwan). Methodologically, GBME is based on an epistematics principles framework and generates spatiotemporal estimates of oral cancer incidence rates. It accounts for the multi-sourced uncertainty of rates, including small population effects, and for the composite space-time dependence of rare events in terms of an extended Poisson-based semivariogram.

Results

The results showed that GBME analysis reduces the noise in oral cancer data caused by the population size effect. Compared with the raw incidence data, the maps of GBME-estimated results can identify high-risk oral cancer regions in Changhua County, where the prevalence of betel quid chewing and cigarette smoking is higher than in the other areas.

Conclusions

The GBME method is a valuable tool for spatiotemporal disease mapping under conditions of uncertainty.

Introduction

Knowledge of the spatiotemporal distribution of disease-specific incidence rates and their associated reliability is essential to understanding the underlying disease risk and the potential “environmental hazards-disease” causal associations (1, 2). The crude or adjusted incidence rates of a rare event (e.g., cancer) based on small populations may exhibit high variance and are, thus, less reliable (3, 4). This small population effect can yield unreliable and uncertain incidence rate data for spatiotemporal disease mapping purposes (5). To remove the small population effect in incidence rate estimation, large spatial units (e.g., counties and states) or small area aggregation are commonly used to stabilize the rates (6, 7). A similar concept is used to calculate the temporal average incidence (over several years), which can yield a more stable spatial rate distribution. However, aggregating rates over a larger geographical area or a longer time period may cause the loss of detailed information about geographical and temporal rate variations that is valuable in disease risk assessment (5).
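The small population effect can be illustrated with a minimal sketch. It assumes, as a simplification not taken from the study itself, that the annual case count in an area follows a Poisson distribution with mean equal to population times the underlying rate; under that assumption the variance of the crude rate is the rate divided by the population. The function names and the example rate are hypothetical and chosen only for illustration.

```python
import math


def crude_rate_se(rate: float, population: int) -> float:
    """Standard error of a crude rate, assuming a Poisson case count.

    If cases ~ Poisson(population * rate), then
    Var(cases / population) = rate / population.
    """
    return math.sqrt(rate / population)


def coefficient_of_variation(rate: float, population: int) -> float:
    """Relative uncertainty of the crude rate (standard error / rate)."""
    return crude_rate_se(rate, population) / rate


if __name__ == "__main__":
    illustrative_rate = 4e-4  # hypothetical cases per person per year
    for population in (5_000, 50_000, 500_000):
        cv = coefficient_of_variation(illustrative_rate, population)
        print(f"population={population:>7}: relative SE = {cv:.2f}")
```

Under this assumption the relative uncertainty shrinks with the square root of the population, which is why aggregation stabilizes rates at the cost of spatial or temporal detail.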

A variety of Poisson models have achieved considerable popularity in disease rate estimation (8, 9, 10, 11). Most of these models, however, do not provide an adequate representation of the spatial dependence structure of incidence rates (5, 12, 13, 14). In a parallel development, geostatistical techniques have recently been used with increasing frequency to analyze disease data distributed across space and time (15, 16, 17). However, most mainstream geostatistical techniques (such as the various forms of Kriging) are based on assumptions of rate homogeneity/stationarity and on linear disease estimators, which may not provide a meaningful representation of a disease pattern with heterogeneous space-time characteristics (18). In several other cases, nonlinear space-time disease estimators may provide more accurate disease assessment than linear or linearized Kriging estimators (19). Moreover, the small population effect can cause a significant increase in the incidence rate variance that clearly violates the homogeneity/stationarity assumption.

There have been certain attempts to account for the spatial heterogeneity of disease rate variances: (a) One group of techniques (say, Group A) focuses on the underlying characteristics of the rate counting process and integrates the Poisson model into geostatistical analysis. The Poisson model has been implemented as the estimator of either the mean trend (16) or the semivariogram (20) of spatial random processes. Among these techniques, Kriging with a Poisson-embedded semivariogram has been called Poisson Kriging (PK) (17, 20). PK has shown improved performance over previous techniques in rare event simulation studies (17). (b) Alternatively, instead of working directly with the counting process, another group of techniques (say, Group B) chooses to transform the original site-specific data into a space-time homogeneous/stationary dataset (21, 22, 23).
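To make the idea of a Poisson-embedded semivariogram concrete, the following is a minimal, purely spatial sketch of the population-weighted experimental semivariogram commonly given in the Poisson Kriging literature. It is an assumption that this form matches the cited works in detail, and it is not the extended space-time semivariogram used by GBME. Each pair of areas is weighted by n_i n_j / (n_i + n_j), and the population-weighted mean rate m* is subtracted to remove the Poisson noise contribution. The function name, argument names, and lag binning are hypothetical.

```python
import math
from itertools import combinations


def poisson_semivariogram(coords, rates, pops, lag_edges):
    """Population-weighted experimental semivariogram of the underlying risk.

    coords    : list of (x, y) area centroids
    rates     : observed rates (cases / population) per area
    pops      : population sizes per area
    lag_edges : increasing distances defining the lag bins
    Returns one value per bin, or None where a bin holds no pairs.
    """
    # Population-weighted mean rate m*, used as the Poisson noise correction.
    m_star = sum(r * n for r, n in zip(rates, pops)) / sum(pops)

    n_bins = len(lag_edges) - 1
    numerator = [0.0] * n_bins
    weight_sum = [0.0] * n_bins

    for i, j in combinations(range(len(rates)), 2):
        h = math.dist(coords[i], coords[j])
        for b in range(n_bins):
            if lag_edges[b] <= h < lag_edges[b + 1]:
                w = pops[i] * pops[j] / (pops[i] + pops[j])
                numerator[b] += w * (rates[i] - rates[j]) ** 2 - m_star
                weight_sum[b] += w
                break

    return [
        numerator[b] / (2.0 * weight_sum[b]) if weight_sum[b] > 0 else None
        for b in range(n_bins)
    ]
```

Pairs involving small populations receive small weights, so their noisy rate differences contribute less to the estimated spatial structure. With sparse data the corrected estimate can turn negative at some lags, which is usually handled at the model-fitting stage.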

In view of the above considerations, this study proposed to use the generalized Bayesian Maximum Entropy approach in the spatiotemporal mapping of rare disease incidence rates (24). GBME can, without assumptions of linearity and normality, synthesize various kinds of knowledge in a rigorous and general framework and provide a sound space-time characterization in terms of the complete predictive probability density function at every point. Insight will be gained by applying GBME to the spatiotemporal mapping of oral cancer incidence rates over the 26 townships of Changhua County (Taiwan) during the period 1993-2002.

Study area

Changhua County, located in the geographic middle of Taiwan, has 26 townships, a total area of about 1074 km² and a population of over 1.3 million; 38% of the residents are farmers (Fig. 1) (25). In recent years, oral cancer mortality in Taiwan has been rising rapidly. Su et al. (26) showed that the oral cancer incidence rate in Changhua, at the rate of 45.07 per 10,000 males per year, was among the highest in the world, and was also the highest among the 23 counties of Taiwan during 2001 (

Results and Discussion

The GBME method (in which the Poisson-based semivariogram model constituted part of its G-KB) was applied to the spatiotemporal estimation of annual oral cancer incidence rates over Changhua County during the period 1993-2002. Incidence rate uncertainty due to the small population effect was considered part of the S-KB and was accounted for in terms of Eq. (3), where a hyperbolic function expressed the “incidence rate variance-population size” relationship. Fig. 6 displays a summary of

References (38)

  • W.B. Riggan et al.

    Assessment of Spatial Variation of Risks in Small Populations

    Environ Health Perspect

    (1991)
  • R. Haining et al.

    Constructing Regions for Small-Area Analysis - Material Deprivation and Colorectal-Cancer

    J Pub Health Med

    (1994)
  • N.S.N. Lam et al.

    Use of space-filling curves in generating a national rural sampling frame for HIV/AIDS research

    Prof Geogr

    (1996)
  • L.P. Hanrahan et al.

    Smrfit - a Statistical-Analysis System (Sas) Program for Standardized Mortality Ratio Analyses and Poisson Regression-Model Fits in Community Disease Cluster Investigations

    Am J Epidemiol

    (1990)
  • R.J. Marshall

    Mapping Disease and Mortality-Rates Using Empirical Bayes Estimators

    Appl Stat-J Roy St C

    (1991)
  • M. Martuzzi et al.

    Empirical Bayes Estimation of small area prevalence of non-rare condition

    Stat Med

    (1996)
  • T. Nakaya et al.

    Geographically weighted Poisson regression for disease association mapping

    Stat Med

    (2005)
  • L. Held et al.

    Joint spatial analysis of gastrointestinal infectious diseases

    Stat Methods Med Res

    (2006)
  • M.A. Oliver et al.

    A Geostatistical Approach to the Analysis of Pattern in Rare Disease

    J Pub Health Med

    (1992)