Critical values for 22 discordancy test variants for outliers in normal samples up to sizes 100, and applications in science and engineering

  • Surendra P. Verma Centro de Investigación en Energía, Universidad Nacional Autónoma de México, Priv. Xochicalco s/no., Col Centro, Apartado Postal 34, 62580 Temixco, Morelos, Mexico.
  • Alfredo Quiroz-Ruiz Centro de Investigación en Energía, Universidad Nacional Autónoma de México, Priv. Xochicalco s/no., Col Centro, Apartado Postal 34, 62580 Temixco, Morelos, Mexico.
Keywords: outlier methods, normal sample, two standard deviation method, 2s, reference materials, Monte Carlo simulations, critical value tables, Dixon Q-test, skewness, kurtosis, petroleum hydrocarbon.

Abstract

This paper reports modifications of the simulation procedure together with new, precise, and accurate critical values or percentage points (most with four decimal places; respective average standard error of the mean ~0.0001–0.0025) for nine discordancy tests with 22 test variants, each at seven significance levels α = 0.30, 0.20, 0.10, 0.05, 0.02, 0.01, and 0.005, for normal samples of sizes n up to 100. Prior to our work, only less precise critical values were available for most of these tests, viz., with one decimal place (for n < 20) and three decimal places (for greater n) for test N14; two decimal places for tests N2, N3–k=2,3,4, N6, and N15; and three decimal places for N1, N4–k=3,4, N5, and N8; all of them with unknown errors. Moreover, critical values were available for n only up to 20 for test N2, up to 30 for test N8, and up to 50 for N4–k=1,3,4, whereas for most other tests, although values were available for n up to 100 (or more), interpolation was required because tabulated values were not reported for all n in the range 3–100. The applicability of these discordancy tests is therefore now extended up to 100 observations of a particular parameter in a statistical sample, without any need for interpolation. The new, more precise and accurate critical values will allow a more reliable application of these discordancy tests than has so far been possible. We thus envision that these new critical values will lead to wider application of these tests in a variety of scientific and engineering fields such as agriculture, astronomy, biology, biomedicine, biotechnology, chemistry, electronics, environmental and pollution research, food science and technology, geochemistry, geochronology, isotope geology, meteorology, nuclear science, paleontology, petroleum research, quality assurance and assessment programs, soil science, structural geology, water research, and zoology.
The multiple-test method with the new critical values proposed in this work was shown to perform better than the box-and-whisker plot method used by some researchers. Finally, the so-called “two standard deviation” method frequently used for processing inter-laboratory databases was shown to be statistically erroneous and should therefore be abandoned. Instead, the multiple-test method with 15 tests and 33 test variants, all of which are now readily applicable to sample sizes up to 100, should be used. For processing inter-laboratory databases, our multiple-test approach is also shown to perform better than the “two standard deviation” method.
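
As an illustration of how critical values of a discordancy test can be obtained by Monte Carlo simulation, the following minimal sketch estimates the critical value of a Grubbs-type upper-outlier statistic, (x_max − mean)/s, for normal samples. It is not the simulation procedure of this paper: the choice of statistic, the function names, the number of replicates, the random number generator, and the seed are all assumptions made for illustration, and the resulting precision is far lower than that of the tabulated values reported here.

```python
import math
import random


def upper_outlier_statistic(sample):
    """Grubbs-type statistic: (largest value - mean) / sample standard deviation."""
    n = len(sample)
    mean = sum(sample) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
    return (max(sample) - mean) / s


def simulate_critical_value(n, alpha, replicates=20000, seed=0):
    """Estimate the (1 - alpha) quantile of the statistic for normal samples of size n.

    The replicate count and seed are illustrative assumptions only.
    """
    rng = random.Random(seed)
    simulated = sorted(
        upper_outlier_statistic([rng.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(replicates)
    )
    # Empirical quantile: roughly a fraction alpha of replicates exceed this value.
    index = min(replicates - 1, int((1.0 - alpha) * replicates))
    return simulated[index]


def is_discordant(sample, alpha=0.05, replicates=20000, seed=0):
    """Flag the largest observation if its statistic exceeds the simulated critical value."""
    critical = simulate_critical_value(len(sample), alpha, replicates, seed)
    return upper_outlier_statistic(sample) > critical
```

A call such as `simulate_critical_value(10, 0.05)` returns one estimate for n = 10 at α = 0.05. Repeating the simulation with independent seeds and averaging the results would additionally give a standard error of the mean for each estimate, which is the kind of uncertainty quoted in the abstract.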

Published
2018-04-18
Section
Regular Papers