Jackknife and Bootstrap

  1. General considerations
  2. Jackknife
  3. Bootstrap

General considerations

Parameters of the petroleum system are estimated subjectively, with the help of analogues, or from databases. This involves investigating the type of distribution as well as estimating a number of distribution parameters. The uncertainty of a variable, such as porosity, is usually expressed as the standard deviation. For small samples the estimated standard deviation is itself uncertain. The usual approach to estimating the uncertainty of a distribution parameter is to use a known formula, such as stdev of stdev = stdev/sqrt(2*n), where n is the sample size; this formula holds approximately for normally distributed data. Here two methods are briefly discussed that are non-parametric resampling plans for obtaining the "standard error" of a parameter.

With a limited sample, a parameter such as the variance, used as a measure of uncertainty, is itself only an estimate. If we know the underlying distribution of the data, confidence ranges can be calculated to indicate how reliable the estimated variance, or any other statistical measure from the sample, is. The uncertainty of the variance can also be estimated by collecting more independent samples, making a variance estimate from each of them, and seeing how much the individual variance estimates differ. However, such luxury is hardly ever available.
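As a point of reference for the resampling methods, the parametric formula quoted above can be written as a few lines of Python. This is only a sketch: the function name parametric_se_of_std is made up for this illustration, and the result is only meaningful if the data are roughly normally distributed.

```python
import math
import statistics


def parametric_se_of_std(sample):
    """Return (s, se) where s is the sample standard deviation and
    se = s / sqrt(2 * n) is its approximate standard error.

    Assumption: the data are approximately normally distributed.
    """
    n = len(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)
    return s, s / math.sqrt(2 * n)


# Illustrative values only, not data from this article.
s, se = parametric_se_of_std([1, 2, 3, 4, 5])
print(f"stdev = {s:.4f}, parametric standard error = {se:.4f}")
```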

The jackknife (Quenouille, 1949) is a handy tool that can be used for different purposes. In statistics it is a resampling procedure (without replacement) for getting better estimates of the reliability of parameters in many contexts, such as distribution parameters or regression coefficients. The bootstrap (Efron et al., 1986) is a resampling procedure with replacement for the same purpose, but it uses a Monte Carlo procedure. So, in both methods the same sample is used many times, but in different ways.

Jackknife

If we are uncertain about the underlying distribution type, the jackknife offers a non-parametric approach: the original sample is resampled by taking sub-samples in which one of the observations is excluded in turn. Despite the "song and dance" we made elsewhere about iid violation in sampling, we treat the sub-samples as independent. The aim is to find a "standard error" of the variance, but it could equally be that of the standard deviation. The procedure for the standard deviation is as follows, illustrated with a sample of porosities:
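A minimal Python sketch of a leave-one-out jackknife for the standard deviation is given here for orientation. It uses the textbook jackknife formulas and is not necessarily what the program "jackBoot" does; the function name jackknife_std is hypothetical, and the porosity values are invented for the example and are not the sample used in this article.

```python
import math
import statistics


def jackknife_std(sample):
    """Leave-one-out jackknife for the sample standard deviation.

    Returns (full, corrected, se):
      full      - standard deviation of the complete sample
      corrected - jackknife bias-corrected estimate
      se        - jackknife standard error of the standard deviation
    """
    data = list(sample)
    n = len(data)
    if n < 3:
        raise ValueError("need at least 3 observations")
    full = statistics.stdev(data)
    # Standard deviation of each sub-sample with observation i left out
    loo = [statistics.stdev(data[:i] + data[i + 1:]) for i in range(n)]
    loo_mean = sum(loo) / n
    # Jackknife standard error: sqrt((n - 1) / n * sum of squared deviations)
    se = math.sqrt((n - 1) / n * sum((v - loo_mean) ** 2 for v in loo))
    # Bias-corrected estimate: n * full - (n - 1) * mean of leave-one-out values
    corrected = n * full - (n - 1) * loo_mean
    return full, corrected, se


# Hypothetical porosity fractions, for illustration only.
porosities = [0.12, 0.15, 0.18, 0.21, 0.17, 0.14, 0.25, 0.19]
full, corrected, se = jackknife_std(porosities)
print(f"stdev = {full:.4f}, jackknife estimate = {corrected:.4f}, se = {se:.4f}")
```

Each of the n sub-samples contains n - 1 observations, so no random numbers are involved and the result is fully reproducible.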

Bootstrap

The bootstrap (Efron, B., 1982) is, like the jackknife, a resampling method. Sub-samples are drawn from the total set of observations, with replacement. The estimate of the parameter of interest, here the standard deviation, is the mean over all the bootstrap sub-samples. The uncertainty of the estimated standard deviation is given by the spread of the standard deviations across the sub-samples. A useful introduction by Yen, L. (2019) is available on the internet.

The procedure can be summarized as follows:
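A minimal Python sketch of the bootstrap for the standard deviation, following the description above, could look like this. It is an illustration under stated assumptions, not the code of the program "jackBoot": the function name bootstrap_std, the default of 2000 resamples and the fixed seed are choices made for this example, and the porosity values are invented.

```python
import random
import statistics


def bootstrap_std(sample, n_resamples=2000, seed=0):
    """Bootstrap the sample standard deviation.

    Returns (mean_sd, se):
      mean_sd - mean of the standard deviations over all bootstrap sub-samples
      se      - spread (standard deviation) of those values, used as the
                standard error of the estimated standard deviation
    """
    data = list(sample)
    n = len(data)
    if n < 2:
        raise ValueError("need at least 2 observations")
    rng = random.Random(seed)  # fixed seed so the example is reproducible
    estimates = []
    for _ in range(n_resamples):
        # Draw n observations from the original sample, with replacement
        resample = rng.choices(data, k=n)
        estimates.append(statistics.stdev(resample))
    return statistics.mean(estimates), statistics.stdev(estimates)


# Hypothetical porosity fractions, for illustration only.
porosities = [0.12, 0.15, 0.18, 0.21, 0.17, 0.14, 0.25, 0.19]
mean_sd, se = bootstrap_std(porosities)
print(f"bootstrap stdev = {mean_sd:.4f}, standard error = {se:.4f}")
```

Because the sub-samples are drawn at random, the result changes slightly with the seed and with the number of resamples; more resamples give a more stable standard error.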

The results are shown below, together with the jackknife results in the upper table, as given by my program "jackBoot".

The upper part of the lower table refers to the jackknife results, the lower to the bootstrap.

The advantage of using the jackknife or the bootstrap appears to be that the resampled estimate of the parameter and its standard error can be quite different from the parametric equivalent, if one is available. Parameter estimates from a sample that assume a normal distribution could be unrealistic if the distribution is not well behaved or is unknown.