| Neville Hunt's Homepage |
| Producing Statistical Tables Using Excel |
| by
Neville Hunt, Coventry University Acknowledgement A version of this paper was published in the Summer 1997 issue of Teaching Statistics. Introduction In a recent article Evans(1997) argues the case for using p-values in teaching tests of significance at the elementary level. One of the obstacles to using p-values in the school classroom is the comparative difficulty involved in their calculation. Tables of critical values for the Normal, t, Chi-squared and F distributions are widely available, both in self-contained books of tables (e.g. Murdoch and Barnes, 1986) or as appendices to textbooks. Statement of p-values requires tables of the cumulative distribution functions which, although commonly found for the Binomial, Poisson and Normal distributions, are rarely printed for other distributions. The distribution functions of t and chi-squared are tabulated by Lindley and Scott (1995). Excel functions One solution for teachers lies in the statistical functions of Microsoft’s Excel spreadsheet package. As I shall show, all the tables required for a school-level study of statistics can be generated very easily using Excel’s in-built statistical functions. Once spreadsheets of these tables have been constructed, copies may be printed for each student. Similarly, extracts can be inserted in handouts and test papers without copyright restrictions. Teachers can customise the tables to their preferred format, or that specified by the relevant Examinations Board. Moreover, tables can be produced for parameter values and ordinates not covered by the standard books. Cumulative Binomial Probabilities On a new worksheet set up the headings for the table as shown in Table 1. In C3 enter the formula =BINOMDIST($B3,$B$1,C$2,TRUE). Copy C3 across to G3 and down to Row 13. Format the table as you wish (alignments, fonts, borders, etc). This gives a table of probabilities P(X£ x). To obtain tables of P(X³ x), as given by Murdoch and Barnes, change the formula in C3 to =BINOMDIST($B$1-$B3,$B$1,1-C$2,TRUE). ). The table automatically recalculates when either n or one of the values of p is changed, although if n>10 the formula will need to be copied further down the spreadsheet.
Table 1: Cumulative Binomial probabilities Cumulative Poisson probabilities Take a copy of the Binomial spreadsheet and delete the first row. The values of p can now be considered values of the mean, m, and the crucial formula in C2 is =POISSON($B2,C$1,TRUE). The output is shown in Table 2 - the Greek letter m is obtained using the Symbol font for m. Again, this is Pr(X£ x) and to obtain Pr(X³ x), the formula in C2 should be =IF($B2>0,1-POISSON($B1,C$1,TRUE),1.0000).
Table 2: Cumulative Poisson probabilities Normal Distribution In Row 1 enter the label z and the values 0.00:0.09. In Column A enter the values 0.0:4.0 in steps of 0.1. In B2 enter the formula =NORMSDIST($A2+B$1) and copy across to K2 and down to Row 42 to obtain the cumulative Normal probabilities. Alternatively, to obtain the tail probabilities shown in Table 3, replace this formula by =1- NORMSDIST($A2+B$1).
Table 3: Tail probabilities of the Normal distribution To obtain particular percentage points, enter the tail probabilities, a, required in Row 1 and the corresponding z-values in Row 2 are obtained from the formula =NORMSINV(1-B$1) in B2 copied across to G2.
Table 4: Percentage points of the Normal distribution t-Distribution Percentage points of the t-distribution can be constructed as shown in Table 5. Here the formula =TINV(C$1,$B2) is entered in C2 and copied across and down as required. The Greek lettering is obtained using the Symbol font for "a" and "n".
Table 5: Percentage points of the t-distribution In order to be able to state approximate p-values, a table of the tail probability of the t-distribution is required as in Table 6. The key formula in C2 is =TDIST($B2,C$1,1), the final 1 indicating that a one-tail probability is required.
Table 6: Tail probabilities of the t-distribution Chi-squared distribution This can be handled in exactly the same way as the t-distribution. A table of percentage points similar to Table 5 can be generated using the formula =CHIINV(C$1,$B2) in C2, while a table of upper-tail probabilities similar to Table 6 is generated by the formula =CHIDIST($B2,C$1). A complication with this latter table is that the range of values of c2 increases with n. For n<10 a suitable range of values in Column B is 0 to 30 in steps of 0.5.F-distribution The upper percentage points of the F-distribution are best tabulated as a separate table for each tail probability a. Table 7 shows a possible layout, where the formula in C3 is =FINV($B$1,C$2,$B3). Changing the value in B1 causes the whole table to re-calculate automatically.
Table 7: Upper 5% percentage points of the F-distribution There is not really any convenient way of tabulating tail probabilities for the F-distribution. Table 8 shows a possible layout which (suitably extended) would suffice for most problems tackled at school level. The formula in C3 is =FDIST($B3,C$1,C$2) which is copied throughout the table.
Table 8: Tail probabilities of the F-distribution Random Numbers Despite the ready availability of random number functions on electronic calculators, tables of random integers still have a role to play in random sampling and simple simulation experiments. In Excel random integers are conveniently generated as text in blocks of four to a cell using the formula =RIGHT(FIXED(RAND( ),4),4) as shown in Table 9, or individually as numbers using the formula =INT(10*RAND( )).
Table 9: Random integers in blocks of four Random Normal deviates are generated by the formula =NORMSINV(RAND( )). Equally useful are tables of random exponential deviates generated by = -LN(1-RAND( )). When multiplied by a mean rate, 1/ l, these can be used to simulate inter-arrival times between events in a Poisson process with mean l.Accuracy Doubts have been expressed in some quarters regarding the accuracy of Excel’s statistical functions. Undoubtedly there were many errors in Excel 4, but most of these have been eliminated in the Excel 5 version used in this article. A former student of mine, Andrew Harrison, conducted an extensive test of the probability functions in Excel 5 as part of an undergraduate project. He compared tabulated values of the standard distributions from Excel and Minitab and concluded that for all practical purposes the results were no different. Problems with Excel’s functions do occur in the extreme margins of some of the tables. For example TINV(0.0000001, n) returns the arbitrary result 5,000,000 for any number of degrees of freedom, whereas Minitab (wisely) refuses to calculate a percentage point for such a small probability. Some disagreement was also found between Excel and Minitab in the percentage points of the F-distribution with n2=1 and tail probability 0.001, but as the results are of the order of 500,000 this is perhaps to be expected!Conclusion In days of limited resources, the homespun tables suggested here may be a practical option for teachers of statistics. However, they should be viewed as an adjunct to a standard set of published tables, not a replacement. Published tables generally include far more than the standard "school" distributions. Murdoch and Barnes, for example, include an extensive range of quality control tables and tables for non-parametric tests, while Lindley and Scott supply Bayesian confidence limits for Binomial and Poisson parameters. Any student who chooses to study Statistics at a higher level should still have a set of tables on their bookshelf. References Evans, I G (1997). A Note on p-values. Teaching Statistics 19(1), 22-23. Lindley, D.V. and Scott, W.F. (1995). New Cambridge Statistical Tables, Second Edition. Cambridge University Press. Murdoch, J and Barnes, J A (1986). Statistical Tables for Science, Engineering, Management and Business Studies, Third Edition. Macmillan. |
| All Contents © Neville Hunt 1999 |