Beliavsky / StatLib

Unofficial mirror of StatLib http://lib.stat.cmu.edu/apstat/

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Unofficial mirror of http://lib.stat.cmu.edu/apstat/

StatLib---Applied Statistics algorithms

The Royal Statistical Society has been publishing algorithms in its journal Applied Statistics since 1968. As of 1989, there are over 250 of them. Most are in Fortran, though a few were in Algol, and some recent ones have been in Pascal. The book - Applied Statistics Algorithms by Griffiths, P. and Hill, I.D., Ellis Horwood: Chichester (1985) contains translations of several algorithms from Algol to Fortran. A few of the other algorithms have been supplied in Fortran translations, though a few are in Algol or Pascal. The index which follows indicates which algorithms are not in Fortran.

The full source code is published in the journal. Those available here have been transcribed manually, mainly within CSIRO Division of Mathematics & Statistics, or have been supplied directly by the authors or by the RSS Algorithms Editor. In some cases, later corrections or improvements published in Applied Statistics, have been incorporated.

It is the policy of the editors of the algorithms section of Applied Statistics that algorithms do not use double precision. Many of the algorithms here have been converted to double precision, though users should be careful to check what precision is used. Many of the algorithms require the use of other algorithms, particularly functions for the gamma function and the normal distribution function. Where such functions or subroutines are required, an appropriate comment has been added to the algorithm.

In a few cases, alternative algorithms from other sources have been added. For instance, three algorithms are included in the file for as66 for calculating the area under the normal curve, and an alternative random number generator is provided in the file for as183. The Applied Statistics algorithm for Nelder-Mead simplex minimization (AS 47) does not include the fitting of a quadratic surface; the CSIRO/ Rothamsted implementation which does this is included with AS 47.

Users must consult the original journal articles for details of the calling arguments; they are not included with these algorithms.

*** Warning. In some cases, there are different arguments (usually more) in the Griffiths & Hill versions of these algorithms. Users should check this.

It has been assumed that the user will be using a Fortran-77 compiler, so functions such as alog and amin1 have been converted to their generic forms (log and min). This simplifies conversion of code between single and double precision. Also all constants in the code, such as 1.0, 0.d0, etc., have been replaced with one, zero, etc., and these are defined in either data or parameter statements. Many of the algorithms have been entered in lower case, which is not acceptable in Fortran, though most compilers accept it.

Some of the algorithms need machine-dependent constants. The user should check this. In most cases, these have been set for compilers which allow a range of floating-point numbers from about 10**(-37) to 10**(+37), though most modern compilers allow a much wider range in double precision.

No guarantee is given that the algorithms have been entered correctly, or that they perform as claimed in the journal.

To obtain an algorithm, send an E-mail request of the form:

send index from apstat send 207 from apstat to statlib@lib.stat.cmu.edu

The Royal Statistical Society holds the copyright to these routines, but has given its permission for their distribution provided that no fee is charged.

The full collection of Applied Statistics algorithms is very large. Please only request those algorithms which you need. Requesting large numbers of algorithms places a great strain on the StatLib system and the underlying mail networks.


As of the end of 1997 the Applied Statistics journal does NOT accept algorithms

Listing of available Applied Statistics algorithms

No. Brief description (volume number/year in brackets)
3 Student's t-distribution. (17/1968) See also AS 27.
5 The non-central t-distribution. (17/1968) See also AS 243.
6 Cholesky decomposition of a symmetric +ve definite matrix. (17/1968)
7 Inversion of a symmetric matrix stored in triangular form. (17/1968)
13 Minimum spanning tree. (18/1969) See also AS 40.
14 Printing the minimum spanning tree. (18/1969)
15 Single-linkage cluster analysis. (18/1969)
22 Calculate treatment effects in a complete factorial experiment for any numbers of levels of factors using the extended Yate's method. (19/1970)
27 Upper tail area under Student's t-distribution. (19/1970) See also AS 3.
30 Half-normal plotting. (19/1970)
32 The incomplete gamma integral. (19/1970) See also AS 239.
34 Update inverse of symmetric banded matrix. (19/1970)
38 Calculate R-squared for all possible regression subsets using the Gauss-Jordan method. (20/1971) See also AS 268.
40 Update a minimum spanning tree. (20/1971)
41 Updates corrected sums of squares and products matrices. (20/1971) See also AS 240.
45 Histogram plotting. (20/1971)
46 Gram-Schmidt orthogonalization. (20/1971)
47 Nelder & Mead simplex method of unconstrained minimization without requiring derivatives. Does not include the quadratic surface fitting part of the algorithm. A CSIRO/Rothamsted version of the algorithm, which does include fitting a quadratic surface, is also included. (20/1971) (Updated, 20/Dec/93)
51 Log-linear fit for contingency tables. (21/1972) See also AS 160.
52 Calculation of sums of powers (up to 4) of deviations from the mean. (21/1972)
53 Wishart variate generator. (21/1972)
57 Printing multi-dimensional tables. (22/1973)
58 Allocates observations to clusters to minimize within-cluster sum of squares. (22/1973) See also AS 136.
60 Eigenvalues/vectors of a symmetric matrix. (22/1973)
62 Distribution of the Mann-Whitney U statistic. (22/1973)
63 Incomplete beta function. (22/1973) See also TOMS algorithm 708. TOMS algorithms are available from netlib.
64 Inverse of the incomplete beta function ratio. (22/1973) The file here is actually the Griffiths & Hill version of AS 109.
65 Expands structure formula to a list of binary integers. This is actually remark R82 which replaces the original AS65. (39/1990)
66 The normal distribution function. Two other algorithms (not from Applied Statistics) have also been included. (22/1973)
75 Algorithms for least-squares calculation using square-root free planar rotations (Morven Gentleman's package). (23/1974)
76 An integral useful in calculating noncentral t and bivariate normal probabilities. (23/1974)
77 Calculate exact null distribution of the largest root of a beta matrix. (23/1974)
78 The mediancentre (i.e. the median in a multi-dimensional space). (23/1974) See also AS 143.
83 Complex discrete fast Fourier transform. (24/1975)
84 Measures of multivariate skewness and kurtosis. (24/1975)
88 Generate all nCr combinations by simulating nested Fortran DO-loops. (Jane Gentleman's routines). (24/1975) See also AS 172.
89 Tail probabilities for Spearman's rho. (24/1975)
91 Percentage points of the chi-squared distribution. (24/1975)
93 Calculates frequency distribution for the Ansari-Bradley test statistic. (25/1976) A routine has been added to return the distribution function.
95 Maximum likelihood estimation of scale and location parameters from grouped data. User's distribution function. (25/1976)
96 Finding `nice' scales for graphs. (25/1976)
97 Real discrete fast Fourier transform. Series length must be a power of 2. (25/1976) See also AS 117 and AS 176.
99 Fitting Johnson curves by moments. (25/1976)
100 Normal-Johnson and Johnson-Normal transformations. (25/1976)
103 Psi or digamma function. (25/1976)
107 Calculate operating characteristics and average sampling number for a general class of sequential sampling plans. (26/1977)
108 Multiple linear regression minimizing the sum of absolute errors. (26/1977) See also AS 238 (in Pascal).
109 Inverse of the incomplete beta function. (26/1977)
110 LP-Norm fit of straight line by extension of Schlossmacher. Updated 14.06.2013 (26/1977)
111 Percentage points of the normal distribution. (26/1977) See also AS 241.
114 Compute the numerator of certain ordinal measures of association (Kendall's tau, Somer's d, Goodman and Kruskal's gamma) when the data are ordered categories. (26/1977)
116 Calculate the tetrachoric correlation and its standard errors. (26/1977)
117 Fast Fourier transform for series of any length using the CHIRP algorithm. (26/1977)
121 Trigamma function. (27/1978)
123 Distribution function of mixtures of beta distributions. (27/1978)
125 Maximum likelihood estimation for censored exponential survival data with covariates. (27/1978)
126 Distribution function of the range for the normal distribution. (27/1978)
127 Generation of random orthogonal matrices. (27/1978)
128 Computes approximate covariance matrix for normal order statistics. (27/1978)
132 Simple regression minimizing the sum of absolute deviations. (27/1978)
133 Finding the global maximum or minimum of a function of 1 variable. (27/1978)
134 Generate random beta variates for alpha &lt 1 and beta &gt 1. (28/1979)
135 Min-Max (L-infinity) estimates for linear multiple regression. (28/1979)
136 A K-means clustering algorithm. (28/1979)
138 Maximum likelihood estimates of the mean and standard deviation of the normal distribution with censored or confined observations. (28/1979)
139 Maximum likelihood estimation in a linear model from confined and censored normal data. (28/1979)
140 Clustering the nodes of a directed graph. (28/1979)
141 Inversion of a symmetric matrix ignoring a specified row/column. (28/1979)
142 Exact tests of significance in binary regression. (28/1979)
143 Calculates the median centre. (28/1979)
145 Exact distribution of the largest multinomial frequency. (28/1979)
147 Incomplete gamma function. (29/1980) See also AS 239.
148 Removal of bias in the jackknife procedure. (This is actually ASR 62) (29/1980)
149 Amalgamation of means in the case of simple ordering ('Up-and-Down Blocks' algorithm for isotonic regression). (29/1980)
150 Computes estimate of spectrum of a point process using a centered moving average of the periodogram of the counting process. (29/1980)
151 Smoothed spectral estimate for bivariate point processes. (29/1980)
152 Cumulative hypergeometric probabilities. (This is actually AS R77) (29/1980, revised in 38/1989)
153 Distribution of weighted sum of squares of normal variables. (29/1980) Pan's procedure for the tail probabilities of the Durbin-Watson statistic.
154 Exact maximum likelihood estimation of autoregressive-moving average models by Kalman filtering. (29/1980) See also AS 182.
155 Distribution function of a linear combination of non-central chi- squared random variables. (29/1980) See also AS 204. This is a Fortran translation supplied by the author.
157 The runs-up and runs-down tests. (30/1981)
158 Calculation of probabilities for inferences under simple order restrictions. (30/1981) See also AS 198.
159 Generate random 2-way table with given marginal totals. (30/1981)
160 Partial and marginal association in multi-dimensional contingency tables. (30/1981)
161 Critical regions of an unconditional non-randomized test of homogeneity in 2 x 2 contingency tables. (30/1981)
162 Multivariate Conditional Logistic Analysis of Stratum-matched Case-control Studies. (30/1981) Includes a Fortran version of CACM algorithm 382 for generating all combinations of M out of N items.
163 A Givens Algorithm for Moving from one Linear Model to another without Going back to the Data. (30/1981)
164 Least squares subject to linear constraints. (30/1981)
165 Discriminant analysis of categorical data. (30/1981)
166 Calculates the entanglement matrix for a specified design. (30/1981)
167 Calculates efficiencies of estimation and their generalized inverse. (30/1981)
168 Calculates 'neat' values for plotting scales. (30/1981)
169 Produces scatter plots. (30/1981)
170 Computation of probability and non-centrality parameter of a non- central chi-square distribution. (30/1981)
171 Fisher's exact variance test for the Poisson distribution. (31/1982)
172 Generates indices for simulated nested DO-loops. Actually converts (either way) between a single index and a vector of subscripts. (31/1982)
173 Generates design matrix for balanced factorial experiments. (31/1982)
174 Multivariate rank sum test and median test. (31/1982)
175 Cramer-Wold factorization of self-reciprocal polynomials as the product of two polynomials. (31/1982)
176 Kernel density estimation using the fast Fourier transform. (31/1982) Also contains an alternative set of routines for density estimation.
177 Expected values of normal order statistics. (31/1982)
178 Gauss-Jordan sweep operator with multi-collinearity detection. (31/1982)
179 Enumeration of all permutations of multi-sets with fixed repetition numbers. (31/1982)
180 Linear rank estimate of the standard deviation after symmetric trimming. (31/1982)
181 Withdrawn. See R94 below
182 Finite-sample prediction from ARIMA processes. (31/1982) Uses AS 154.
183 The Wichmann & Hill random number generator. An alternative is also provided. (31/1982)
184 Non-central studentized maximum and related multiple-t probabilities. (31/1982)
185 Backward elimination procedure to find best-fitting log-linear models for contingency tables. (31/1982) Uses AS 51.
186 Discrete Fast Fourier Transform with data permutation. Series length must be a power of 2. (31/1982)
187 Derivatives of the incomplete gamma integral. (31/1982)
188 Estimation of the order of dependence in sequences. (32/1983)
189 Maximum likelihood estimation for the beta binomial distribution. (32/1983)
190 Distribution function & its inverse, for the studentized range. (32/1983)
191 Approximate likelihood calculation for ARMA and seasonal ARMA models. Includes a routine for the Banachiewicz (or modified Cholesky) factorization A = LDL'. (32/1983)
192 Calculate approximate percentage points using Pearson curves and the first three or four moments. (32/1983)
193 The Knuth spectral test for congruential random number generators. (32/1983)
194 Test of goodness of fit of ARMA models. (32/1983)
195 Multivariate normal probabilities for a rectangular region in N- dimensional space. A version which calls IMSL routines is also included. (33/1984) See AS 251 for a special case.
196 Conditional multivariate logistic analysis of stratified case-control studies. (33/1984). Edited for errors by Bill Bardsley (bill.bardsley@man.ac.uk) on 18/08/2002.
197 Likelihood function for an ARMA process. (33/1984)
198 Calculation of level probabilities for order-restricted inference. (33/1984)
199 Branch and bound algorithm to find the subset which maximizes a quadratic form. (33/1984)
200 Approximate the sum of squares of normal score. (33/1984)
201 Combine predictions about a statistic based on the orderings of a set of means with an F-test of differences between the means. (33/1984)
202 Data-based nonparametric hazard estimation. (33/1984)
203 Maximum likelihood estimation of mixtures of distributions (normal, exponential, Poisson and binomial). (33/1984) See also AS221. This is a translation from Algol into Fortran.
204 Distribution of a sum of non-central chi-squared variables. (33/1984) Translation from Algol into Pascal.
205 Enumeration of all R x C contingency tables with given row and column totals, and calculation of hypergeometric probability for each table. (33/1984)
206 Isotonic regression in two independent variables. (33/1984)
207 Fit a generalized log-linear model. (33/1984)
208 Fit multivariate logistic model by the method of moments. (34/1985)
209 The distribution function of skewness and kurtosis. (34/1985)
210 Fit 5-parameter Johnson SB curves by moments. (34/1985)
211 The F-G diagonalization algorithm. (34/1985) Modifications in ASR 71 & ASR 74 have been added to the file.
212 Fitting the exponential curve by least squares. (34/1985)
213 Generation of correlation matrices with specified eigenvalues. (34/1985)
214 Calculation of Monte Carlo confidence intervals. (34/1985)
215 Maximum likelihood estimation of the parameters of the generalized extreme-value distribution. (34/1985). Updated on [15/Nov/99]
216 Maximum likelihood fitting when model contains a linear part and auxiliary parameters. (34/1985)
217 Computation of the Dip statistic to test for unimodality. (34/1985). Bug fixed by Ferenc Mechler (fmechler@med.cornell.edu). See README file for details. File changed to incorporate this patch provided by Yong Lu [5/Sep/02] [4/Aug/05]
218 Elements of the Fisher information matrix for the smallest extreme- value distribution, also allows for censored data. (35/1986)
219 Height balanced trees. (35/1986)
220 Operating characteristics of James-Stein and Efron-Morris estimation. (35/1986)
221 Maximum likelihood estimation of a mixing distribution. (35/1986)
222 Resistant smoothing using the fast Fourier transform. (36/1987)
223 Optimum ridge parameter selection. (36/1987)
224 Combining component designs to form a design with several orthogonal blocking factors. (36/1987)
225 Minimizing linear inequality-constrained Mahalanobis distances. (36/1987)
226 Cumulative probabilities for the non-central beta distribution. (36/1987)
227 Generate all possible N-bit binary codes. (36/1987)
228 Finding I-projections (maximizing cross-entropy) subject to a finite set of linear inequality constraints. (36/1987)
229 Quantile regression. (36/1987)
230 Distribution of customers in M/G/m queues using an approximation due to Hokstad. (36/1987)
231 Distribution of a noncentral chi-squared variable with non-negative degrees of freedom. (36/1987) In Pascal.
232 Computation of population and sample correlation and partial correlation matrices in MARMA(p,q) time series. (37/1988)
233 Branch and bound algorithm for feature subset selection. (37/1988)
234 Approximating the percentage points of simple linear rank statistics with Cornish-Fisher expansions. (37/1988)
235 Tally frequencies of distinct real values. (37/1988)
236 Recursive enumeration of RxC tables for exact likelihood evaluation. (37/1988) In Pascal.
237 The corner method for identifying autoregressive moving average models. (37/1988)
238 Recursive procedure for the L1-norm fitting of a straight line. (37/1988) In Pascal.
239 Incomplete gamma function. (37/1988)
240 Updating the inverse of corrected sums of squares and products. (37/1988)
241 Inverse normal - more accurate than AS 111. (37/1988)
242 The exact likelihood of a vector autoregressive moving average model. (38/1989)
243 Cumulative distribution function of the non-central t-distribution. (38/1989)
244 Decomposability and collapsibility for log-linear models. (38/1989) In Pascal.
245 Log of the gamma function. (38/1989) The file includes a slower but more accurate algorithm which is not an AS algorithm.
246 Repeated measurements analysis of variance with unknown autoregressive parameter. (38/1989)
247 Updating the sufficient configurations for fitting ZPA (zero partial association) models to multi-dimensional contingency tables. (38/1989) In Pascal.
248 Empirical distribution function goodness-of-fit tests for the uniform, normal and exponential distributions. (38/1989)
249 Mean and covariance estimation for the truncated multivariate normal distribution. (38/1989)
250 Test the equality of dispersion matrices using methods due to Puri & Sen (distribution-free) and Anderson (normal distribution). (38/1989)
251 Multivariate normal probability integrals with product correlation structure. (38/1989)
252 Generating classes for log-linear models. (39/1990)
253 Maximum likelihood estimation of the RC(M) association model. (39/1990)
254 Fit mixture of normal distributions to grouped & truncated data. (39/1990)
255 Fitting two-way tables by means for rows, columns and cross-term. (39/1990) In Algol.
256 Distribution of a quadratic form in normal variables. (39/1990) In Pascal.
257 Isotonic regression for umbrella orderings. (39/1990)
258 Average run lengths for cumulative sum schemes. (39/1990)
259 Extending confidence intervals by the Robbins-Monro search process. (39/1990)
260 Distribution function of the square (R^2) of the sample multiple- correlation coefficient. (40/1991)
261 Quantiles of the distribution of square (R^2) of the sample multiple- correlation coefficient. (40/1991)
262 The Wei-Lachin two-sample test, the generalized Wilcoxon test and the log-rank test for incomplete multivariate observations with censoring. (40/1991)
263 Construction of irredundant test sets. (40/1991)
264 Printing of bit patterns. (40/1991)
265 Waiting time distribution for the G/G/1 queue using the Fast Fourier transform. (40/1991)
266 Maximum likelihood estimation of the parameters of the Dirichlet distribution. (40/1991)
267 Calculate lower bound for the probability of correct selection of a subset containing the best populations from a set of populations. (40/1991)
268 Calculates statistics for all possible subset regressions using an orthogonal decomposition. (40/1991)
269 Calculates the Cornish-Fisher adjustment to distribution functions (from the normal distribution function) using higher cumulants. (41/1992)
270 Maximum likelihood fitting of a 'key' density function, e.g. normal distribution, multiplied by a series in Hermite polynomials. (41/1992)
271 Optimal (smallest mis-classification probability) joint classification procedure. (41/1992) See also AS 276.
272 Produce character Box-plots using supplied quartiles. (41/1992)
273 Compare the least-squares fit of two subsets of regression variables using Spj0tvoll's method. (41/1992). Updated [04/01/03].
274 Least squares algorithms to supplement those of Gentleman in AS 75. (41/1992)
274-90 Algorithm 274, translated into Fortran 90.
275 Non-central chi-squared distribution function, number of degrees of freedom must be positive but not necessarily integer. (41/1992)
276 Optimal combinatoric classification for two normal classes. (41/1992) See also AS 271.
277 The Oja bivariate median. (41/1992)
278 The distribution of quadratic forms of multivariate generalized Student variables. (41/1992)
279 P-values for the generalized/alternative Durbin-Watson statistic with arbitrary lag using a Cholesky transformation method. (42/1993)
280 The power function for Fisher's exact test. (42/1993)
281 Scaling and rounding regression coefficients to force them to be integers. (42/1993)
282 High breakdown regression and multivariate estimation. (42/1993)
283 Rapid computation of the permutation paired and grouped t-tests. (42/1993) In ANSI Standard C.
284 Null distribution of a statistic for testing sphericity and additivity: a Jacobi polynomial expansion. (42/1993)
285 Calculate multivariate normal probabilities using Monte Carlo methods for a general region defined by a user-supplied function. (42/1993)
286 Estimation of parameters when there are errors in both the predictor variables and the dependent variable using least squares with a general model (includes implicit non-linear regression). (42/1993)
287 Adaptive rejection sampling (to generate random variables) from log-concave density functions. (42/1993)
288 P-value calculation for the generalized two-sample Smirnov tests. The tests are conditional on ties in the pooled sample. (43/1994)
289 Convolves hypergeometric distributions generated by several 2x2 tables (43/1994)
290 Generate a grid of variance ratios for contour plotting of confidence regions for a pair of parameters in non-linear regression. (43/1994)
291 Calculates overlap capability of subsequence SSEQ of length L. (43/1994)
292 Computes Fisher information matrix elements for time (type I) or failure (type II) censored units from the smallest extreme value (sev), largest extreme value (lev), normal or logistic distribution. (43/1994)
293 Convolves conditional distributions generated by several 2xK tables (43/1994)
294 Pascal program for ....?
295 Heuristic algorithm to pick N rows of X out of NCAND to maximize the determinant of X'X, using the Fedorov exchange algorithm. (43/1994)
R94 calculates Shapiro-Wilk normality test and P-value for sample sizes 3 <= n <= 5000 . Handles censored or uncensored data. Corrects AS 181, which was found to be inaccurate for n > 50. (Uses function POLY in AS 181) by Patrick Royston (44/1995)
R96 Interval for trapezoidal rule integration to limit error using moment generating function bound for weighted sum of chi-squared(1) random variables.
297 Nonparametric regression using ultraspherical polynomials.
298 Hybrid minimization routine using simulated annealing and any local minimizer supplied by the user.
300 Efficient algorithm for determining a simulated percentile point with a fixed degree of accuracy.
301 Computes the logarithms of F1, P(H) and P(H').
302 Subroutine to compute a multiple isotonic regression for umbrella ordering with known peak.
303 Generation of ordered multinomial frequencies. Given integers N (N >= 0) and K (K >= 1), generates all possible integer arrays NI of size K, whose elements are non-negative, non-decreasing and sum to N.
304 Fisher's non-parametric randomization test for two small independent random samples.
305 Compute the ARL of a CUSUM mean chart with linear drift.
306 Calculation of a BIB product operation on any two matrices.
307 Calculation of the simplicial depth and the halfspace depth
310 Computes the cumulative distribution function of a non-central beta random variable.
314 Inverts matrix with contents subject to modulo arithmetic.
317 Maximum Likelihood Estimation and Goodness-of-Fit Tests for Mixtures of Distributions. Submitted by Alan Miller (amiller@bigpond.net.au) [31/Oct/02]
319 A program to implement a quasi-newton method. Using new algorithm Varmet July 1994
319 Fortran 90 version of 319.

N.B. This library was maintained by Alan Miller, formerly of CSIRO Division of Mathematics and Statistics, Clayton, Victoria 3169.

Fortran 90 translations of some of the above algorithms can be found at https://jblevins.org/mirror/amiller/#apstat.

Recent algorithms are provided by "T.R.Hopkins" <T.R.Hopkins@ukc.ac.uk>. The StatLib collection of Applied Statistics Algorithms is maintained by Pantelis Vlachos, vlachos@stat.cmu.edu.

Credit where credit is due

If you use an algorithm, dataset, or other information from StatLib, please acknowledge both StatLib and the original contributor of the material.

Notes

See also

About

Unofficial mirror of StatLib http://lib.stat.cmu.edu/apstat/


Languages

Language:Fortran 92.9%Language:Pascal 6.0%Language:C 1.1%