Our knowledge of the spatial distribution of the physical properties of geologic formations is often uncertain because of ubiquitous heterogeneity and the sparsity of data. While many studies consider the effects of incorporating various types of data (including transmissivity, electrical resistivity, hydraulic heads and/or travel times) on predicting flow and transport processes in heterogeneous systems, the uncertainty associated with the delineation of lithofacies, and of the associated hydraulic conductivity and porosity, from limited geological and geophysical data has been only marginally analyzed. Such data, which include grain size distribution curves, are typically derived from core samples and are often poorly differentiated, thus further compounding predictive uncertainty.
Within statistical and stochastic frameworks, this uncertainty is quantified by treating a formation's properties (e.g., hydraulic conductivity) as random fields that are characterized by multivariate probability density functions or, equivalently, by their joint ensemble moments. Since in reality only a single realization of a geologic site exists, it is necessary to invoke the ergodicity assumption in order to substitute the sample spatial statistics, which can be calculated, for the ensemble statistics, which are actually required. This and other related assumptions are often impossible to validate.
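The substitution that the ergodicity assumption permits can be written out explicitly. The notation below is illustrative and not taken from the original text: $Y(\mathbf{x})$ denotes a formation property (e.g., log hydraulic conductivity) treated as a statistically stationary random field, $p_Y$ its one-point probability density function, and $\Omega$ the sampled domain.

```latex
% Ensemble mean (required, but not computable from a single realization):
\langle Y(\mathbf{x}) \rangle = \int_{-\infty}^{\infty} y \, p_Y(y; \mathbf{x}) \, \mathrm{d}y

% Spatial average over the one available realization (computable):
\overline{Y}_{\Omega} = \frac{1}{|\Omega|} \int_{\Omega} Y(\mathbf{x}) \, \mathrm{d}\mathbf{x}

% Ergodicity (in the mean), assuming stationarity:
\lim_{|\Omega| \to \infty} \overline{Y}_{\Omega} = \langle Y \rangle
```

Analogous statements are needed for the higher-order moments (e.g., the covariance), and a finite sampling domain satisfies them only approximately, which is one reason the assumption is difficult to validate.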
Recently we (Wohlberg et al., 2006) demonstrated that Support Vector Machine (SVM) techniques provide a viable alternative to geostatistical frameworks: they allow one to delineate lithofacies when the data are insufficient for reliable parameterization, without treating geologic parameters as random and, hence, without the need for the ergodicity assumption. That demonstration relied on well-differentiated data.
Here, we extend our approach to account for poorly differentiated grain size distribution data. The procedure starts with the inference of hydraulic conductivity from the grain size distribution curves, which relies upon empirical relationships (e.g., Beyer, 1964). The heterogeneous aquifer structure can then be assessed by analyzing the whole ensemble of grain size distribution curves with the aim of identifying distinct clusters. These clusters are then taken as representative of different types of materials and serve to distinguish the sedimentological data. The data can then be analyzed by means of geostatistical or machine learning-based methods in order to provide an estimate of the spatial arrangement of the identified lithofacies. The classification step presupposes that the available data can be differentiated well enough to identify distinct geologic facies clearly. This is a crucial point, since very often grain size distribution curves (and geologic data in general) are poorly differentiated, forcing the introduction of modeling approximations.
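The first step of this procedure, inferring hydraulic conductivity from a grain size distribution curve, can be illustrated with a minimal sketch. It assumes one commonly quoted form of the Beyer relation, K = (g/ν) · 6×10⁻⁴ · log10(500/U) · d10², with U = d60/d10; the text does not state which form or coefficients were actually used, so the formula, the function names, the default fluid properties, and the sample sieve data are all illustrative assumptions. This form is usually quoted for roughly 0.06 mm < d10 < 0.6 mm and 1 < U < 20.

```python
import math


def percentile_diameter(diameters_mm, percent_finer, target):
    """Grain diameter (mm) at which `target` percent of the sample is finer.

    Interpolates linearly in log10(diameter) between the two measured
    points that bracket `target`. Inputs must be sorted by increasing
    diameter, with non-decreasing percent finer.
    """
    points = list(zip(diameters_mm, percent_finer))
    for (d_lo, p_lo), (d_hi, p_hi) in zip(points, points[1:]):
        if p_lo <= target <= p_hi and p_hi > p_lo:
            frac = (target - p_lo) / (p_hi - p_lo)
            log_d = math.log10(d_lo) + frac * (math.log10(d_hi) - math.log10(d_lo))
            return 10.0 ** log_d
    raise ValueError("target percentile lies outside the measured curve")


def beyer_conductivity(d10_mm, d60_mm, g=9.81, nu=1.0e-6):
    """Hydraulic conductivity (m/s) from an assumed Beyer-type relation.

    K = (g / nu) * 6e-4 * log10(500 / U) * d10**2, with U = d60 / d10.
    g is gravitational acceleration (m/s^2) and nu the kinematic viscosity
    of water (m^2/s); both defaults are assumptions, not source values.
    """
    uniformity = d60_mm / d10_mm
    d10_m = d10_mm * 1.0e-3  # convert mm to m so that K comes out in m/s
    return (g / nu) * 6.0e-4 * math.log10(500.0 / uniformity) * d10_m ** 2


if __name__ == "__main__":
    # Hypothetical sieve analysis: sieve openings (mm) and cumulative % finer.
    sieves_mm = [0.063, 0.125, 0.25, 0.5, 1.0, 2.0]
    finer_pct = [2.0, 8.0, 30.0, 65.0, 90.0, 100.0]

    d10 = percentile_diameter(sieves_mm, finer_pct, 10.0)
    d60 = percentile_diameter(sieves_mm, finer_pct, 60.0)
    k = beyer_conductivity(d10, d60)
    print(f"d10 = {d10:.3f} mm, d60 = {d60:.3f} mm, K = {k:.2e} m/s")
```

Applying the same two functions to every core sample yields one conductivity estimate per grain size distribution curve. Note that samples with similar d10 and d60 produce similar estimates, which is one way in which poorly differentiated curves complicate the subsequent clustering and classification steps.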