- General observations
- Residuals as a prior
- Sampling problems
- Probability of hydrocarbon charge
- Porosity priors
- Top seal capacity
- Oil density/gravity prior
- Oil field size
- Condensate ratio (richness)
- Recovery Efficiency
- Initial well productivity
- Reservoir engineering parameters

A prior distribution ("prior") contains information about what to expect when your personal experience is limited. Priors form an important part of the calibrated prospect appraisal approach. Expert systems usually contain some background experience data, assembled from many sources or individuals, which can therefore be more effective than the judgment of a single individual. This "canned" experience may take the form of a simple histogram, or of a functional relationship between two or more geological variables. This page discusses a number of useful priors. The word "prior" in this context refers to a prior distribution of the mean value and variance of the data that form our experience, usually world-wide experience. Some of these distributions are used as default distributions in the Gaea50 prospect appraisal program.

In a volumetric estimate a number of factors are multiplied, for instance gross reservoir thickness times a NetToGross fraction times porosity, and so on. The numbers used for these factors are supposed to be the mean values for the reservoir rock. For this reason we are interested in the **mean value** of a given factor and, in the case of a Monte Carlo simulation, in the **variance of this mean**. We work here at a second level of uncertainty: we have a distribution of observations with a mean and a variance, but now we are going to worry about the uncertainty of these parameters. The parameters of the distribution describing the uncertainty of the mean, or of the variance, are the parameters of the prior distribution. The term "hyperparameter" is used to distinguish these parameters from the parameters of the underlying data distribution.

The basic idea in Bayesian estimation is to start with a prior distribution and use pertinent local data to **update** the prior to a posterior distribution. The local data are "analogs" of the prospect to be evaluated. Priors are usually better than a state of complete ignorance. A prior which is close to ignorance is called a weak prior. **"Ignorance"** about porosity would be expressed as "porosity can vary between 0 and 100%". Of course, we know better than that, as is described below. Even so, we have to be careful when limiting porosity to the range of 0 to 50%: in the Amposta field, offshore Spain, the drill bit unexpectedly fell some 20 feet into a cave, in effect 100% porosity.

Hence, the distribution we use is either the prior of the mean when no analogs are available, or the posterior of the mean when analogs are at hand. Note that the analog values of factors are themselves mean values of a reservoir. Therefore our prior or posterior mean is actually a "mean of means".

The prior distribution of, for example, the mean of a process is not easily estimated subjectively in practice. However, the collection of world-wide observations is a frequentist sampling process, which allows simple rules to be applied to get the mean and the variance of the prior distribution of the mean. At the same time we have the variance of the world-wide observations, which is the "process variance". These statistics are required in a Bayesian updating process. This update uses a sample of relevant "local" observations as new information which, together with the prior, gives the parameters of the posterior distribution. The latter is usually a more realistic input to a Monte Carlo simulation than an often too small sample of relevant local data. The drawback of using only the few local data at hand is that the true uncertainty is underestimated.
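The update described above can be sketched with the textbook normal-normal conjugate model. This is a minimal illustration, not the routine used in Gaeapas: it assumes normally distributed data and treats the process variance as known, and the function name and all numbers are hypothetical.

```python
def update_mean(prior_mean, prior_var, process_var, local_values):
    """Normal-normal update of a mean, treating the process variance as known.

    prior_mean, prior_var: mean and variance of the prior distribution of the mean
    process_var: variance of individual observations (the "process variance")
    local_values: local analog observations
    """
    n = len(local_values)
    if n == 0:
        # No analogs available: the prior is used as it stands.
        return prior_mean, prior_var
    local_mean = sum(local_values) / n
    prior_precision = 1.0 / prior_var
    data_precision = n / process_var
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (prior_precision * prior_mean + data_precision * local_mean)
    return post_mean, post_var


# Hypothetical numbers: a prior for mean porosity (fraction) and three analogs.
post_mean, post_var = update_mean(0.20, 0.05**2, 0.06**2, [0.24, 0.27, 0.25])
```

The posterior mean lies between the prior mean and the local mean, and the posterior variance is smaller than the prior variance. With few analogs the prior dominates; with many, the local data do.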

Many variables relevant to appraisal are functions of depth: temperature, pressure, oil gravity, porosity, etc. To predict such variables it is not sufficient to make a map of observed values and interpolate the value at an unknown location (see kriging). Although the location of analog data is important, the 2-D picture can be misleading if the depth information is not used. Therefore all available data are used to make a regression of the variable on depth. This provides prior distributions for any depth in the form of the residuals of the regression. For an unknown location, the value is then estimated with the regression equation by inserting the depth of the particular prospect. Most of the time it can be assumed that the residuals are normally distributed. Sometimes it is better to work with the log-transformed data.
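A minimal sketch of this idea follows, assuming a straight-line trend with depth and normally distributed residuals. The data are synthetic and purely illustrative, and the helper name `prior_at_depth` is hypothetical.

```python
import numpy as np

# Synthetic, purely illustrative data: depth (m) and temperature (deg C).
depth = np.array([500.0, 1000.0, 1500.0, 2000.0, 2500.0, 3000.0])
value = np.array([32.0, 44.0, 61.0, 72.0, 90.0, 101.0])

# Least-squares straight line; np.polyfit returns the highest power first.
slope, intercept = np.polyfit(depth, value, 1)

# Residuals about the trend; ddof=2 because two coefficients were fitted.
residuals = value - (intercept + slope * depth)
resid_sd = residuals.std(ddof=2)


def prior_at_depth(z):
    """Mean and standard deviation of the residual-based prior at depth z."""
    return intercept + slope * z, resid_sd


mean_1800, sd_1800 = prior_at_depth(1800.0)
```

For variables that are better handled after a log transform, the same steps apply to `np.log(value)`, with the result transformed back afterwards.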

A quantitative petroleum system model (PSA model), such as Gaeapas, contains a number of "calibration results". These are various constants obtained from multivariate analysis, in the form of regression coefficients and distribution parameters of the residuals. A prior distribution in such a case is the distribution of the mean estimate, given a number of X-variables, together with the variance of this mean estimate (the "noise" variance about the regression line, divided by the sample size). In a Monte Carlo simulation the regression with full uncertainty is reproduced in a prediction routine, which involves more than just the "standard error of estimate".
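For the simplest case of a single X-variable, the difference between the two kinds of uncertainty can be written out. The following is the textbook result for ordinary least squares with normally distributed residuals, given as an illustration and not as the routine used in Gaeapas. The symbols are introduced here: $s^2$ is the residual variance, $n$ the sample size, $x_0$ the X-value of the prospect and $\bar{x}$ the sample mean of X.

```latex
\begin{align}
\operatorname{Var}\left[\hat{y}(x_0)\right]
  &= s^2\left(\frac{1}{n} + \frac{(x_0-\bar{x})^2}{S_{xx}}\right), \\
\operatorname{Var}\left[y_{\text{new}} - \hat{y}(x_0)\right]
  &= s^2\left(1 + \frac{1}{n} + \frac{(x_0-\bar{x})^2}{S_{xx}}\right),
\qquad S_{xx} = \sum_{i=1}^{n}(x_i-\bar{x})^2 .
\end{align}
```

The first line is the variance of the mean estimate, which reduces to $s^2/n$ at $x_0=\bar{x}$. The second adds the residual variance $s^2$ for a single new observation, and grows as the prospect moves away from the center of the calibration data.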

Data gathering for a prior should not become biased by the availability of data. The data one usually encounters are clustered and not necessarily independent, so care has to be taken to spread the sampling. The variance will then be larger, but the distribution is more representative. An example of this is porosity, where data from developed fields greatly outnumber the data from wildcats in the area, with the danger that the oilfield data bias the prior distribution for the area.
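One simple way to spread the sampling is to reduce each cluster to a single value before computing statistics. The sketch below assumes that clusters are identified by a label (for instance a field name); the function name and the porosity values are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def decluster_by_group(records):
    """Collapse (group_id, value) records to one mean value per group."""
    groups = defaultdict(list)
    for group_id, value in records:
        groups[group_id].append(value)
    return {group_id: mean(values) for group_id, values in groups.items()}


# Hypothetical porosities: four wells in one developed field, two wildcats.
records = [
    ("field_A", 0.26), ("field_A", 0.28), ("field_A", 0.27), ("field_A", 0.27),
    ("wildcat_1", 0.12), ("wildcat_2", 0.15),
]
naive_mean = mean(value for _, value in records)
declustered_mean = mean(decluster_by_group(records).values())
```

Here the naive mean is pulled toward the developed field, while the declustered mean gives the field and each wildcat equal weight.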

The probability of hydrocarbon charge ("P[HC]") is of major importance in the Gaeapas material balance model. Charge is defined as the arrival of a significant amount of oil and gas at a trap. In the exercise I did, an amount of 100,000 barrels of oil in place was taken as the minimum accumulation. However, an oil seep may also be a good indication of generation/migration, although quantification of amounts is usually impossible, and a good HC show in a wildcat also counts. For this prior, data were assembled from some 300 explored sedimentary basins/provinces, some 33,000 exploration wells in all, plus data on seeps. The data were represented as "Yes/No": we assigned 1 to a positive HC charge case and 0 to the others. A kriging analysis produced a significant variogram with a range of about 50 km. So, on average, charge at location A cannot be predicted on the basis of a well B that is more than 50 km away. The conclusion is that an oil province has, on average, a radius of 50 km. Geologically speaking, this is of course a "Mickey Mouse" approach, but it has proven to be helpful.

It will be obvious that from an existing discovery only one well is counted.
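The variogram of such 0/1 data can be estimated directly from well locations. The sketch below computes an empirical semivariogram with NumPy. It is an illustration of the technique, not the software used for the study, and the function name, coordinates and lag bins are hypothetical.

```python
import numpy as np


def indicator_variogram(x_km, y_km, charged, lag_edges_km):
    """Empirical semivariogram of 0/1 charge indicators, one value per lag bin."""
    x = np.asarray(x_km, dtype=float)
    y = np.asarray(y_km, dtype=float)
    z = np.asarray(charged, dtype=float)
    i, j = np.triu_indices(len(z), k=1)          # all distinct well pairs
    dist = np.hypot(x[i] - x[j], y[i] - y[j])    # pair separation in km
    half_sq_diff = 0.5 * (z[i] - z[j]) ** 2
    gamma = []
    for lo, hi in zip(lag_edges_km[:-1], lag_edges_km[1:]):
        in_bin = (dist >= lo) & (dist < hi)
        gamma.append(half_sq_diff[in_bin].mean() if in_bin.any() else np.nan)
    return np.array(gamma)


# Four hypothetical wells on a line: two charged, two dry, about 100 km apart.
gamma = indicator_variogram(
    [0.0, 10.0, 100.0, 110.0], [0.0, 0.0, 0.0, 0.0], [1, 1, 0, 0],
    [0.0, 50.0, 150.0],
)
```

In this toy case nearby wells always agree (semivariance 0 in the first bin) and distant wells always disagree (0.5 in the second). With real data the range is read as the lag distance at which the curve levels off.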

Many studies are available in literature that analyze the relationship of porosity to geological variables, such as age, depth, lithology, pressure, overpressure, maturity, quartz content and so on.

I used the correlation of porosity with depth and lithology given by Ehrenberg & Nadeau (2005) as a prior in the Gaeapas program. These data are based on a world-wide sampling of 30,122 siliciclastic and 10,481 carbonate reservoirs. Both the mean and the variance change with depth. Any local porosity-depth pairs, for the correct lithology, can be used to update the prior at a given depth to, hopefully, a narrower posterior porosity distribution.

Seal capacity is usually estimated as the ability to hold a certain differential pressure. Because oil and gas in situ are less dense than the formation water, a pressure differential ("P_{d}") is created at the base of the seal. This variable was measured in some 160 well-documented cases of unfaulted top seals (Nederlof & Mohler, 1981). The analysis had to handle some censored data, as P_{d} was in certain cases a minimum or maximum observation. The geological factors for P_{d} were, amongst others, thickness, depth and lithology. Although capillary pressure is one of the most important factors, it was mainly represented by lithology as a proxy. Thickness should, at first sight, not have an influence, but it proved to be important. The conclusion was that thickness of the top seal is a proxy for the difficulty of leakage by small faults and fractures, which could not be observed in the "unfaulted caprocks" in our sample. Another process explaining thickness as a factor is diffusion. For gas this might cause considerable loss, as modeling by Montel et al. (1993) showed.

The results of this study are incorporated in Gaeapas as regression constants and the prior distributions of the residuals. In practice the thickness, lithology and the degree of faulting and fracturing are used as input variables to get an estimate of P_{d}. In addition, the reservoir engineering data for calculating in-situ densities of the HC and water are used.
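The in-situ densities come in because a pressure differential translates into a hydrocarbon column height through static buoyancy, P_{d} = (ρ_water − ρ_HC)·g·h. The sketch below applies that relation; the function name and the numbers are hypothetical, and it is not taken from Gaeapas.

```python
G = 9.80665  # standard gravity, m/s^2


def max_column_height_m(pd_pa, rho_water, rho_hc):
    """Hydrocarbon column height (m) that produces a differential pressure pd_pa (Pa).

    rho_water, rho_hc: in-situ densities of formation water and hydrocarbons, kg/m^3.
    """
    if rho_water <= rho_hc:
        raise ValueError("hydrocarbons must be less dense than the formation water")
    return pd_pa / ((rho_water - rho_hc) * G)


# Hypothetical case: 1 MPa seal capacity, brine at 1050 kg/m^3, oil at 750 kg/m^3.
height = max_column_height_m(1.0e6, 1050.0, 750.0)
```

For the same P_{d}, a gas column (larger density contrast) is shorter than an oil column.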

Any serious quantitative appraisal program will contain a number of formulas for calculating the PVT conditions in the reservoir and the Formation Volume Factor (FVF) of the oil, or the Expansion Factor (EF) for gas. The oil density occurs in many places in this process. API gravity data are fortunately widely published. Using world-wide data, I made a plot of API versus depth, in which the depths are sub-seafloor:

"Prediction" of API on the basis of depth would be an exaggerated term, but the idea is to use the trend as a prior.

Although not often applied, a prior distribution of field sizes could help in estimating sizes of discoveries. This is discussed elsewhere on this website. The shape of the prior is lognormal, or a related skew distribution.
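As a small illustration of what a lognormal field-size prior implies, the sketch below draws samples with NumPy. The median and the log-standard deviation are hypothetical values chosen only to show the skewness, not calibrated parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

median_mmbbl = 20.0   # hypothetical median field size, million barrels
sigma_ln = 1.5        # hypothetical standard deviation of ln(size)

# For a lognormal distribution the median equals exp(mean of ln(size)).
sizes = rng.lognormal(mean=np.log(median_mmbbl), sigma=sigma_ln, size=100_000)

sample_median = np.median(sizes)
sample_mean = sizes.mean()
```

The sample mean comes out well above the median, which reflects the few large fields in the long upper tail.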

Condensates are usually light oils that are dissolved in a free gas accumulation (Fan et al., 2006). They can have different origins, such as changes to the oil along the migration path, decomposition of normal black oil in an accumulation, etc. A prior distribution of the condensate richness ("CGR") could be based on a detailed knowledge of the geochemistry of the petroleum system, the history of the accumulation and the PVT conditions, but such research has not been done, or I have not seen it.

Instead, I have gathered some data from 73 known condensate accumulations for which the pressure, temperature and depth, as well as the condensate ratio, were available. A multivariate regression of CGR on pressure (P) and temperature (T) explains about 46% of the variance. The contribution of T is negligible, so with only P as a predictor we still obtain an R-squared of about 46%. Therefore the estimation procedure would be to (1) estimate the probability that there is condensate, given free gas, and (2) estimate the CGR based on P. The first step is important because the most common situation is one with no condensate at all, hence "dry gas", under any kind of PVT condition. In the condensate cases the regression can then be used to predict the CGR.
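The two-step procedure lends itself to a simple Monte Carlo sketch. Everything numerical below is hypothetical: the probability of condensate, the regression coefficients, the residual standard deviation and the units are placeholders, not the values obtained from the 73 accumulations.

```python
import numpy as np


def simulate_cgr(pressure, p_condensate, intercept, slope, resid_sd, n, seed=0):
    """Draw n CGR values: zero for dry gas, regression on pressure plus noise otherwise."""
    rng = np.random.default_rng(seed)
    # Step 1: is there condensate at all, given free gas?
    has_condensate = rng.random(n) < p_condensate
    # Step 2: CGR from the regression on pressure, with residual scatter.
    cgr = intercept + slope * pressure + rng.normal(0.0, resid_sd, n)
    cgr = np.clip(cgr, 0.0, None)  # a negative richness has no meaning
    return np.where(has_condensate, cgr, 0.0)


# Placeholder inputs, for illustration only.
draws = simulate_cgr(pressure=400.0, p_condensate=0.3,
                     intercept=10.0, slope=0.1, resid_sd=10.0, n=10_000)
```

The resulting distribution has a spike at zero for the dry-gas cases and a spread of positive values for the condensate cases.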

A world-wide survey of primary and secondary recovery efficiency should lead to all sorts of regression equations, using depth, API, drive mechanism, permeability, etc. as significant factors. However, general data available did not show a correlation that would hold generally.

A number of studies for local areas or plays are more useful. If those are not available, or thought not to be applicable, a simple histogram of cases observed world-wide gives a reasonable prior. Here is the prior for primary oil recovery efficiency used in the Gaeapas program:

In Gaeapas the user can enter a number of locally observed recovery efficiencies. The Bayesian update of the above prior is then performed.
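How a histogram prior can be updated is sketched below on a grid of candidate mean values. This is one possible approach under stated assumptions (normally distributed local observations with a known process standard deviation), not a description of the Gaeapas routine; the function name, bin values and weights are hypothetical.

```python
import numpy as np


def update_histogram_prior(bin_centers, prior_weights, local_values, process_sd):
    """Posterior weights over candidate mean values, given local observations."""
    centers = np.asarray(bin_centers, dtype=float)
    prior = np.asarray(prior_weights, dtype=float)
    prior = prior / prior.sum()
    obs = np.asarray(local_values, dtype=float)
    # Normal log-likelihood of the whole local sample for each candidate mean.
    loglik = -0.5 * (((obs[:, None] - centers[None, :]) / process_sd) ** 2).sum(axis=0)
    posterior = prior * np.exp(loglik - loglik.max())
    return posterior / posterior.sum()


# Hypothetical histogram of recovery efficiency (fraction) and three local values.
centers = np.array([0.1, 0.2, 0.3, 0.4, 0.5])
weights = np.array([2.0, 5.0, 6.0, 3.0, 1.0])
posterior = update_histogram_prior(centers, weights, [0.40, 0.45, 0.42], process_sd=0.1)

prior_mean = (centers * weights / weights.sum()).sum()
posterior_mean = (centers * posterior).sum()
```

With no local values the function returns the normalized prior unchanged; with local values around 0.4 the weight shifts toward that bin.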

Well productivity, in barrels per day or in terms of the productivity index, can be studied on the basis of a "proxy" model that simplifies the engineering formula. A colleague at Shell, Dr. Leine, made a world-wide study and formulated a simple model on the basis of the formula for a producing well:

where k = permeability and h = thickness of pay zone appear in the numerator, divided by μ = viscosity. The complicated part between brackets has to do with the diameter of the drainage zone around the borehole and the formation damage (skin factor).
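The terms listed above match the standard steady-state radial (Darcy) inflow equation. As an assumption about the form intended, written in consistent units and at reservoir conditions, it reads as follows. The symbols $q$ (flow rate), $p_e - p_{wf}$ (drawdown), $r_e$ (drainage radius), $r_w$ (borehole radius) and $S$ (skin factor) are introduced here.

```latex
q = \frac{2\pi\, k\, h\, (p_e - p_{wf})}{\mu \left[ \ln\!\left( \dfrac{r_e}{r_w} \right) + S \right]}
```

The product $k h$ and the drawdown sit in the numerator, while viscosity and the bracketed drainage and skin term sit in the denominator.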

In order to translate this formula in "geological terms" the reasoning was as follows:

- Drawdown will be very roughly represented by depth.
- Permeability is reasonably correlated with porosity.
- The thickness of pay can be guessed.
- Viscosity is related to the API gravity of oil. The lower the API (that is, the higher the density), the higher the viscosity. Hence higher API (lower viscosity) gives a higher production rate.
- Borehole, drainage and skin are not easily estimated and we have no proxy for them.

where:

- Z = depth
- φ = porosity
- h = net pay
- API = oil gravity

This approach worked quite well, especially when the data were separated into onshore/offshore and clastic/carbonate reservoirs. A remarkable effect was also found when plotting the residuals of the above regression against year: a typical logistic curve became apparent over the period 1940 to 1980. This effect reflects the improvement of technology over those years, such as the move to larger tubing diameters and better control of skin.

Therefore the older data are not a reliable guide to a prior, but new research should be undertaken to arrive at valid priors for today.

Many factors are involved in estimating amounts of oil and gas in place: gas/oil ratio, expansion factor, water saturation, residual water saturation, etc. The problem for a geologist appraising an undrilled structure is the lack of data. The pertinent factors have to be estimated from analogs. To get a grip on the uncertainty of the estimates, geological proxies have to be translated into the required variables.

Gaeapas uses approximations for the density of the formation water, subsurface pressure, GOR, gas gravity, Z-factor, Formation Volume Factor (FVF or b_{oi}), oil API. The uncertainty is generated by the uncertainty of the input variables. It would be better if the uncertainty of the published regressions could be used, but most publications do not specify the margin of error sufficiently. Fortunately, the impact on prospect appraisal results is minor, compared to other uncertainties, such as HC charge.