Neville Hunt's Homepage


Producing Statistical Tables Using Excel
by Neville Hunt, Coventry University


Acknowledgement

A version of this paper was published in the Summer 1997 issue of  Teaching Statistics.

Introduction

In a recent article Evans(1997) argues the case for using p-values in teaching tests of significance at the elementary level. One of the obstacles to using p-values in the school classroom is the comparative difficulty involved in their calculation. Tables of critical values for the Normal, t, Chi-squared and F distributions are widely available, both in self-contained books of tables (e.g. Murdoch and Barnes, 1986) or as appendices to textbooks. Statement of p-values requires tables of the cumulative distribution functions which, although commonly found for the Binomial, Poisson and Normal distributions, are rarely printed for other distributions. The distribution functions of t and chi-squared are tabulated by Lindley and Scott (1995).

Excel functions

One solution for teachers lies in the statistical functions of Microsoft’s Excel spreadsheet package. As I shall show, all the tables required for a school-level study of statistics can be generated very easily using Excel’s in-built statistical functions. Once spreadsheets of these tables have been constructed, copies may be printed for each student. Similarly, extracts can be inserted in handouts and test papers without copyright restrictions. Teachers can customise the tables to their preferred format, or that specified by the relevant Examinations Board. Moreover, tables can be produced for parameter values and ordinates not covered by the standard books.

Cumulative Binomial Probabilities

On a new worksheet set up the headings for the table as shown in Table 1. In C3 enter the formula =BINOMDIST($B3,$B$1,C$2,TRUE). Copy C3 across to G3 and down to Row 13. Format the table as you wish (alignments, fonts, borders, etc). This gives a table of probabilities P(X£ x). To obtain tables of P(X³ x), as given by Murdoch and Barnes, change the formula in C3 to =BINOMDIST($B$1-$B3,$B$1,1-C$2,TRUE). ). The table automatically recalculates when either n or one of the values of p is changed, although if n>10 the formula will need to be copied further down the spreadsheet. 

 

A

B

C

D

E

F

G

1

n =

10

2

p =

0.1

0.2

0.3

0.4

0.5

3

x =

0

0.3487

0.1074

0.0282

0.0060

0.0010

4

1

0.7361

0.3758

0.1493

0.0464

0.0107

5

2

0.9298

0.6778

0.3828

0.1673

0.0547

6

3

0.9872

0.8791

0.6496

0.3823

0.1719

7

4

0.9984

0.9672

0.8497

0.6331

0.3770

8

5

0.9999

0.9936

0.9527

0.8338

0.6230

9

6

1.0000

0.9991

0.9894

0.9452

0.8281

10

7

1.0000

0.9999

0.9984

0.9877

0.9453

11

8

1.0000

1.0000

0.9999

0.9983

0.9893

12

9

1.0000

1.0000

1.0000

0.9999

0.9990

13

10

1.0000

1.0000

1.0000

1.0000

1.0000

Table 1: Cumulative Binomial probabilities

Cumulative Poisson probabilities

Take a copy of the Binomial spreadsheet and delete the first row. The values of p can now be considered values of the mean, m, and the crucial formula in C2 is =POISSON($B2,C$1,TRUE). The output is shown in Table 2 - the Greek letter m is obtained using the Symbol font for m. Again, this is Pr(X£ x) and to obtain Pr(X³ x), the formula in C2 should be =IF($B2>0,1-POISSON($B1,C$1,TRUE),1.0000).

 

A

B

C

D

E

F

G

1

m=

0.1

0.2

0.3

0.4

0.5

2

x =

0

0.9048

0.8187

0.7408

0.6703

0.6065

3

1

0.9953

0.9825

0.9631

0.9384

0.9098

4

2

0.9998

0.9989

0.9964

0.9921

0.9856

5

3

1.0000

0.9999

0.9997

0.9992

0.9982

6

4

1.0000

1.0000

1.0000

0.9999

0.9998

7

5

1.0000

1.0000

1.0000

1.0000

1.0000

Table 2: Cumulative Poisson probabilities

Normal Distribution

In Row 1 enter the label z and the values 0.00:0.09. In Column A enter the values 0.0:4.0 in steps of 0.1. In B2 enter the formula =NORMSDIST($A2+B$1) and copy across to K2 and down to Row 42 to obtain the cumulative Normal probabilities. Alternatively, to obtain the tail probabilities shown in Table 3, replace this formula by =1- NORMSDIST($A2+B$1).

 

A

B

C

D

E

F

G

H

I

J

K

1

z

.00

.01

.02

.03

.04

.05

.06

.07

.08

.09

2

0.0

.5000

.4960

.4920

.4880

.4840

.4801

.4761

.4721

.4681

.4641

3

0.1

.4602

.4562

.4522

.4483

.4443

.4404

.4364

.4325

.4286

.4247

4

0.2

.4207

.4168

.4129

.4090

.4052

.4013

.3974

.3936

.3897

.3859

5

0.3

.3821

.3783

.3745

.3707

.3669

.3632

.3594

.3557

.3520

.3483

Table 3: Tail probabilities of the Normal distribution

To obtain particular percentage points, enter the tail probabilities, a, required in Row 1 and the corresponding z-values in Row 2 are obtained from the formula =NORMSINV(1-B$1) in B2 copied across to G2.

 

A

B

C

D

E

F

G

1

a

0.2

0.1

0.05

0.025

0.01

0.005

2

z

0.8416

1.2816

1.6449

1.9600

2.3263

2.5758

Table 4: Percentage points of the Normal distribution

t-Distribution

Percentage points of the t-distribution can be constructed as shown in Table 5. Here the formula =TINV(C$1,$B2) is entered in C2 and copied across and down as required. The Greek lettering is obtained using the Symbol font for "a" and "n".

 

A

B

C

D

E

F

G

H

1

a=

0.1

0.05

0.025

0.01

0.005

0.001

2

n=

1

6.314

12.706

25.452

63.656

127.321

636.578

3

2

2.920

4.303

6.205

9.925

14.089

31.600

4

3

2.353

3.182

4.177

5.841

7.453

12.924

5

4

2.132

2.776

3.495

4.604

5.598

8.610

Table 5: Percentage points of the t-distribution

In order to be able to state approximate p-values, a table of the tail probability of the t-distribution is required as in Table 6. The key formula in C2 is =TDIST($B2,C$1,1), the final 1 indicating that a one-tail probability is required.

 

A

B

C

D

E

F

G

1

n=

1

2

3

4

5

2

t=

0.0

.5000

.5000

.5000

.5000

.5000

3

0.1

.4683

.4647

.4633

.4626

.4621

4

0.2

.4372

.4300

.4271

.4256

.4247

5

0.3

.4072

.3962

.3919

.3896

.3881

6

0.4

.3789

.3639

.3580

.3548

.3528

7

0.5

.3524

.3333

.3257

.3217

.3191

8

0.6

.3280

.3047

.2954

.2904

.2873

9

0.7

.3056

.2782

.2672

.2613

.2576

10

0.8

.2852

.2538

.2411

.2343

.2300

11

0.9

.2667

.2316

.2172

.2095

.2047

12

1.0

.2500

.2113

.1955

.1870

.1816

13

1.1

.2349

.1930

.1758

.1665

.1607

14

1.2

.2211

.1765

.1581

.1482

.1419

Table 6: Tail probabilities of the t-distribution

Chi-squared distribution

This can be handled in exactly the same way as the t-distribution. A table of percentage points similar to Table 5 can be generated using the formula =CHIINV(C$1,$B2) in C2, while a table of upper-tail probabilities similar to Table 6 is generated by the formula =CHIDIST($B2,C$1). A complication with this latter table is that the range of values of c2 increases with n. For n<10 a suitable range of values in Column B is 0 to 30 in steps of 0.5.

F-distribution

The upper percentage points of the F-distribution are best tabulated as a separate table for each tail probability a. Table 7 shows a possible layout, where the formula in C3 is =FINV($B$1,C$2,$B3). Changing the value in B1 causes the whole table to re-calculate automatically.

 

A

B

C

D

E

F

G

H

I

J

K

1

a=

0.05

2

n1=

1

2

3

4

5

6

7

8

9

3

n2=

1

161.45

199.50

215.71

224.58

230.16

233.99

236.77

238.88

240.54

4

2

18.51

19.00

19.16

19.25

19.30

19.33

19.35

19.37

19.38

5

3

10.13

9.55

9.28

9.12

9.01

8.94

8.89

8.85

8.81

6

4

7.71

6.94

6.59

6.39

6.26

6.16

6.09

6.04

6.00

7

5

6.61

5.79

5.41

5.19

5.05

4.95

4.88

4.82

4.77

Table 7: Upper 5% percentage points of the F-distribution

There is not really any convenient way of tabulating tail probabilities for the F-distribution. Table 8 shows a possible layout which (suitably extended) would suffice for most problems tackled at school level. The formula in C3 is =FDIST($B3,C$1,C$2) which is copied throughout the table.

 

A

B

C

D

E

F

G

H

I

J

1

n1=

1

1

1

1

1

2

2

2

2

n2=

1

2

3

4

5

1

2

3

3

F=

1.0

0.500

0.423

0.391

0.374

0.363

0.577

0.500

0.465

4

2.0

0.392

0.293

0.252

0.230

0.216

0.447

0.333

0.281

5

3.0

0.333

0.225

0.182

0.158

0.144

0.378

0.250

0.192

6

4.0

0.295

0.184

0.139

0.116

0.102

0.333

0.200

0.142

7

5.0

0.268

0.155

0.111

0.089

0.076

0.302

0.167

0.111

Table 8: Tail probabilities of the F-distribution

Random Numbers

Despite the ready availability of random number functions on electronic calculators, tables of random integers still have a role to play in random sampling and simple simulation experiments. In Excel random integers are conveniently generated as text in blocks of four to a cell using the formula =RIGHT(FIXED(RAND( ),4),4) as shown in Table 9, or individually as numbers using the formula =INT(10*RAND( )).

 

A

B

C

D

E

F

G

1

7161

6488

0158

3418

9685

4867

2198

2

6882

8997

9075

6825

5176

2010

0647

Table 9: Random integers in blocks of four

Random Normal deviates are generated by the formula =NORMSINV(RAND( )). Equally useful are tables of random exponential deviates generated by = -LN(1-RAND( )). When multiplied by a mean rate, 1/l, these can be used to simulate inter-arrival times between events in a Poisson process with mean l.

Accuracy

Doubts have been expressed in some quarters regarding the accuracy of Excel’s statistical functions. Undoubtedly there were many errors in Excel 4, but most of these have been eliminated in the Excel 5 version used in this article. A former student of mine, Andrew Harrison, conducted an extensive test of the probability functions in Excel 5 as part of an undergraduate project. He compared tabulated values of the standard distributions from Excel and Minitab and concluded that for all practical purposes the results were no different. Problems with Excel’s functions do occur in the extreme margins of some of the tables. For example TINV(0.0000001, n) returns the arbitrary result 5,000,000 for any number of degrees of freedom, whereas Minitab (wisely) refuses to calculate a percentage point for such a small probability. Some disagreement was also found between Excel and Minitab in the percentage points of the F-distribution with n2=1 and tail probability 0.001, but as the results are of the order of 500,000 this is perhaps to be expected!

Conclusion

In days of limited resources, the homespun tables suggested here may be a practical option for teachers of statistics. However, they should be viewed as an adjunct to a standard set of published tables, not a replacement. Published tables generally include far more than the standard "school" distributions. Murdoch and Barnes, for example, include an extensive range of quality control tables and tables for non-parametric tests, while Lindley and Scott supply Bayesian confidence limits for Binomial and Poisson parameters. Any student who chooses to study Statistics at a higher level should still have a set of tables on their bookshelf.

References

Evans, I G (1997). A Note on p-values. Teaching Statistics 19(1), 22-23.

Lindley, D.V. and Scott, W.F. (1995). New Cambridge Statistical Tables, Second Edition. Cambridge University Press.

Murdoch, J and Barnes, J A (1986). Statistical Tables for Science, Engineering, Management and Business Studies, Third Edition. Macmillan.



All Contents © Neville Hunt 1999