Provided by: libstatistics-normality-perl_0.01-2_all 

NAME
Statistics::Normality - test whether an empirical distribution can be taken as being drawn from a
normally-distributed population
VERSION
Version 0.01
SYNOPSIS
use Statistics::Normality ':all';
use Statistics::Normality 'shapiro_wilk_test';
use Statistics::Normality 'dagostino_k_square_test';
DESCRIPTION
Various situations call for testing whether an empirical sample can be presumed to have been drawn from a
normally (Gaussian <http://en.wikipedia.org/wiki/Normal_distribution>) distributed population, especially
because many downstream significance tests depend upon the assumption of normality. This package
implements some of the more well-known tests <http://en.wikipedia.org/wiki/Normality_test> from the
mathematical statistics literature, though there are also others that are not included. The tests here
are all so-called omnibus tests that find departures from normality on the basis of skewness and/or
kurtosis [Dagostino71]. Note that, although the Kolmogorov-Smirnov test
<http://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test> can also be used in this capacity, it is a
distance test and therefore not advisable [Dagostino71]. This and other distance tests (e.g. Chi-square)
are not implemented here.
TESTS
The subtleties and esoterica of various statistical tests for normality require some familiarity with the
mathematical statistics literature. We give rules-of-thumb for specific tests, where they exist, but it
may be advisable to try several different tests to check the consistency of the conclusion. It is
probably also a good idea to check results graphically, either by direct plotting or by a Q-Q plot
<http://en.wikipedia.org/wiki/Q-Q_plot>. In general, small samples will often pass a normality test, which
may simply mean that they carry too little information to reveal a departure from normality, should one
exist.
Each of the methods here is a frequentist test, i.e. one that tests against the null-hypothesis
<http://en.wikipedia.org/wiki/Null_hypothesis> that the sample is normal. In other words, a low p-value
recommends rejecting the null.
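The decision rule above can be sketched in a few lines of Perl. This is only an illustration: the helper
names normality_verdict and verdicts_agree, and the 0.05 default significance level, are assumptions made
for this example and are not part of Statistics::Normality.

```perl
use strict;
use warnings;

# Hypothetical helper (not provided by the module): turn a p-value into a
# verdict at significance level $alpha. The 0.05 default is an assumption.
sub normality_verdict {
    my ($pval, $alpha) = @_;
    $alpha = 0.05 unless defined $alpha;
    return $pval < $alpha ? 'reject normality' : 'cannot reject normality';
}

# Hypothetical helper: true if p-values from several tests all lead to the
# same verdict at level $alpha, i.e. the conclusion is consistent.
sub verdicts_agree {
    my ($alpha, @pvals) = @_;
    my %seen;
    $seen{ normality_verdict($_, $alpha) } = 1 for @pvals;
    return scalar(keys %seen) <= 1;
}
```

For instance, the p-values returned by shapiro_wilk_test and dagostino_k_square_test for the same sample
could be passed to verdicts_agree to check whether the two tests point the same way. Note that "cannot
reject" is not the same as "is normal", particularly for small samples.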
EXPORT
The functions documented below, shapiro_wilk_test and dagostino_k_square_test, can be imported
individually by name or together with the ':all' tag, as shown in the SYNOPSIS.
Shapiro-Wilk Test
The Shapiro-Wilk W-Statistic test <http://en.wikipedia.org/wiki/Shapiro%E2%80%93Wilk_test> [Shapiro65] is
considered to be among the most objective tests of normality [Royston92] and also one of the most
powerful ones for detecting non-normality [Chen71]. Its statistic is essentially the ratio of the square of
the best linear unbiased estimator of the population standard deviation to the sample variance
[Dagostino71]. The test is
mathematically complex and most implementations use several conventional approximations (as we do here),
including Blom's formula for the expected value of the order statistics [Harter61] and transformation to
standard normal distribution for evaluation, especially for large samples [Royston92].
$pval = shapiro_wilk_test ([0.34, -0.2, 0.8, ...]);
($pval, $w_statistic) = shapiro_wilk_test ([0.34, -0.2, 0.8, ...]);
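For orientation, the W statistic and Blom's approximation mentioned above are commonly written as
follows. This is the textbook form, given here as an assumption about notation only; the exact constants
and weights used internally by this implementation are not stated in this document.

```latex
W = \frac{\left(\sum_{i=1}^{n} a_i\, x_{(i)}\right)^{2}}{\sum_{i=1}^{n} \left(x_i - \bar{x}\right)^{2}},
\qquad
m_i \approx \Phi^{-1}\!\left(\frac{i - 3/8}{n + 1/4}\right)
```

Here x_(i) is the i-th smallest sample value, the a_i are weights derived from the expected normal order
statistics, m_i is the expected value of the i-th order statistic in a standard normal sample of size n,
and \Phi^{-1} is the standard normal quantile function. W cannot exceed 1, and values well below 1
indicate departure from normality.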
This test may not be the best if there are many repeated values in the test distribution or when the
number of points in the test distribution is very large, e.g. more than 5000. The routine will carp
about the latter, but not the former. This particular implementation of the test also requires at least
6 data points in the sample distribution and will croak otherwise.
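Because the routine may carp or croak, a caller may want to trap both. The following is a minimal sketch
under stated assumptions: run_normality_test and count_repeated_values are hypothetical helpers written
for this example, not functions of Statistics::Normality. The test routine is passed in as a code
reference so that the sketch does not depend on how the module is loaded.

```perl
use strict;
use warnings;

# Hypothetical wrapper: call a test routine in list context, trapping a
# croak (die) with eval and collecting carp (warn) messages locally.
sub run_normality_test {
    my ($test, $data) = @_;
    my @warnings;
    local $SIG{__WARN__} = sub { push @warnings, $_[0] };
    my @result = eval { $test->($data) };
    my $error = $@;
    return { error => $error, warnings => \@warnings } if $error;
    return {
        pval      => $result[0],
        statistic => $result[1],
        warnings  => \@warnings,
    };
}

# Hypothetical helper: number of distinct values occurring more than once,
# since the routine itself does not warn about repeated values.
sub count_repeated_values {
    my ($data) = @_;
    my %count;
    $count{$_}++ for @$data;
    return scalar grep { $_ > 1 } values %count;
}
```

With the module loaded via use Statistics::Normality 'shapiro_wilk_test', a call might look like
run_normality_test(\&shapiro_wilk_test, \@data); the returned hash then holds either an error message or
the p-value, the statistic, and any warnings that were raised.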
D'Agostino K-Squared Test
The D'Agostino K-Squared test <http://en.wikipedia.org/wiki/D%27Agostino%27s_K-squared_test> is a good
test against non-normality arising from kurtosis <http://en.wikipedia.org/wiki/Kurtosis> and/or skewness
<http://en.wikipedia.org/wiki/Skewness> [Dagostino90].
$pval = dagostino_k_square_test ([0.34, -0.2, ...]);
($pval, $ksq_statistic) = dagostino_k_square_test ([0.34, -0.2, ...]);
The test statistic depends upon both the sample kurtosis and skewness, as well as the moments of these
parameters from a normal population, as quantified by Pearson's coefficients [Pearson31]. These are
transformed [Dagostino70,Anscombe83] to expressions that sum to the K-squared statistic, which is
essentially chi-square-distributed with 2 degrees of freedom [Dagostino90]. The kurtosis transform, and
thus the overall test, generally works best when the sample distribution has at least 20 data points
[Anscombe83] and the routine will carp otherwise.
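Two ingredients of this test are simple enough to sketch directly. The code below is an illustration
only, not this module's implementation: it computes the raw sample skewness and kurtosis from central
moments and the upper-tail probability of a chi-square variable with 2 degrees of freedom, which has the
closed form exp(-x/2). The transforms of [Dagostino70,Anscombe83] that turn skewness and kurtosis into
the K-squared statistic are deliberately not reproduced. Both function names are assumptions made for
this example.

```perl
use strict;
use warnings;

# Hypothetical helper: sample skewness sqrt(b1) = m3 / m2^(3/2) and
# kurtosis b2 = m4 / m2^2, where mk is the k-th central sample moment.
# Assumes the data are not all identical (otherwise m2 is zero).
sub sample_skewness_kurtosis {
    my ($data) = @_;
    my $n = @$data;
    my $sum = 0;
    $sum += $_ for @$data;
    my $mean = $sum / $n;
    my ($s2, $s3, $s4) = (0, 0, 0);
    for my $x (@$data) {
        my $d = $x - $mean;
        $s2 += $d**2;
        $s3 += $d**3;
        $s4 += $d**4;
    }
    my ($m2, $m3, $m4) = ($s2 / $n, $s3 / $n, $s4 / $n);
    return ($m3 / $m2**1.5, $m4 / $m2**2);
}

# Upper-tail probability P(X > x) for X chi-square with 2 degrees of
# freedom; for this special case the survival function is exp(-x/2).
sub chi_square_2df_pvalue {
    my ($x) = @_;
    return exp(-$x / 2);
}
```

For a normal population the skewness is 0 and the kurtosis is 3, so sample values far from these, once
transformed and combined, yield a large K-squared statistic and hence a small p-value.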
REFERENCES
• [Anscombe83] Anscombe, F. J. and Glynn, W. J. (1983) Distribution of the Kurtosis Statistic B2 for
Normal Samples, Biometrika 70(1), 227-234.
• [Chen71] Chen, E. H. (1971) The Power of the Shapiro-Wilk W Test for Normality in Samples from
Contaminated Normal Distributions, Journal of the American Statistical Association 66(336), 760-762.
• [Dagostino70] D'Agostino, R. B. (1970) Transformation to Normality of the Null Distribution of G1,
Biometrika 57(3), 679-681.
• [Dagostino71] D'Agostino, R. B. (1971) An Omnibus Test of Normality for Moderate and Large Size
Samples, Biometrika 58(2), 341-348.
• [Dagostino90] D'Agostino, R. B. et al. (1990) A Suggestion for Using Powerful and Informative Tests
of Normality, American Statistician 44(4), 316-321.
• [Harter61] Harter, H. L. (1961) Expected values of normal order statistics, Biometrika 48(1/2),
151-165.
• [Pearson31] Pearson, E. S. (1931) Notes on Tests for Normality, Biometrika 22(3/4), 423-424.
• [Royston92] Royston, J. P. (1992) Approximating the Shapiro-Wilk W-test for non-normality, Statistics
and Computing 2(3), 117-119.
• [Shapiro65] Shapiro, S. S. and Wilk, M. B. (1965) An analysis of variance test for normality -
complete samples, Biometrika 52(3/4), 591-611.
AUTHOR
Mike Wendl, "<mwendl at genome.wustl.edu>"
BUGS
Please report any bugs or feature requests to "bug-statistics-normality at rt.cpan.org", or through the
web interface at <http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Statistics-Normality>. I will be
notified, and then you'll automatically be notified of progress on your bug as I make changes.
SUPPORT
You can find documentation for this module with the perldoc command.
perldoc Statistics::Normality
You can also look for information at:
• RT: CPAN's request tracker
<http://rt.cpan.org/NoAuth/Bugs.html?Dist=Statistics-Normality>
• AnnoCPAN: Annotated CPAN documentation
<http://annocpan.org/dist/Statistics-Normality>
• CPAN Ratings
<http://cpanratings.perl.org/d/Statistics-Normality>
• Search CPAN
<http://search.cpan.org/dist/Statistics-Normality/>
COPYRIGHT & LICENSE
Copyright (C) 2011 Washington University
This program is free software; you can redistribute it and/or modify it under the terms of the GNU
General Public License as published by the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even
the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public
License for more details.
You should have received a copy of the GNU General Public License along with this program; if not, write
to the Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
perl v5.34.0 2022-06-17 Statistics::Normality(3pm)