
Full text of "hp :: 9000 200 :: 98820-13111 StatisticalLibrary Jul82"



HP Computer Systems 



Statistical Library 

for the HP 9826 and 9836 Computers 







HEWLETT 
PACKARD 






Warranty Statement 

Hewlett-Packard makes no expressed or implied warranty of 
any kind, including, but not limited to, the implied warranties of 
merchantability and fitness for a particular purpose, with 
regard to the program material contained herein. 
Hewlett-Packard shall not be liable for incidental or 
consequential damages in connection with, or arising out of, 
the furnishing, performance or use of this program material. 

HP warrants that its software and firmware designated by HP 
for use with a CPU will execute its programming instructions 
when properly installed on that CPU. HP does not warrant that 
the operation of the CPU, software, or firmware will be 
uninterrupted or error free. 

Use of this manual and flexible disc(s) supplied for this pack is 
restricted to this product only. Additional copies of the 
programs can be made for security and back-up purposes 
only. Resale of the programs in their present form or with 
alterations, is expressly prohibited. 

Restricted Rights Legend 
Use, duplication, or disclosure by the Government is 
subject to restrictions as set forth in paragraph (b)(3)(B) of 
the Rights in Technical Data and Software clause in DAR 
7-104.9(a). 






Statistical Library 

for the HP 9826 and 9836 Computers 

Manual Part No. 98820-13111 
Disc Part Numbers 

Basic Statistics                              98820-13114 
General Statistics                            98820-13115 
Statistical Graphics I                        98820-13116 
Statistical Graphics II                       98820-13117 
Regression Analysis                           98820-13118 
Analysis of Variance I                        98820-13124 
Analysis of Variance II                       98820-13125 
Principal Components and Factor Analysis      98820-13126 
Monte Carlo Routines                          98820-13127 
Monte Carlo Tests                             98820-13128 



Important 

The flexible disc containing the programs is very reliable, but being a mechanical device, is 
subject to wear over a period of time. To avoid having to purchase a replacement medium, we 
recommend that you immediately duplicate the contents of the disc onto a permanent backup 
disc. You should also keep backup copies of your important programs and data on a separate 
medium to minimize the risk of permanent loss. 




Hewlett-Packard Desktop Computer Division 

3404 East Harmony Road, Fort Collins, Colorado 80525 
Copyright by Hewlett-Packard Company 1982 



Printing History 



New editions of this manual will incorporate all material updated since the previous edition. 
Update packages may be issued between editions and contain replacement and additional 
pages to be merged into the manual by the user. Each updated page will be indicated by a 
revision date at the bottom of the page. A vertical bar in the margin indicates the changes on 
each page. Note that pages which are rearranged due to changes on a previous page are not 
considered revised. 

The manual printing date and part number indicate its current edition. The printing date 
changes when a new edition is printed. (Minor corrections and updates which are incorporated 
at reprint do not cause the date to change.) The manual part number changes when extensive 
technical changes are incorporated. 

July 1982... First Edition 






Table of Contents 



Commentary vii 

Summary of available routines viii 

Basic Statistics and Data Manipulation 1 

General Information 1 

Start 6 

Edit 10 

Transform 12 

Missing Value 13 

Recode 15 

Sort 16 

Subfiles 18 

Change Names 18 

Store Data 18 

Join 19 

Printer Is 20 

Select and Scan 21 

Basic Statistics 22 

Missing Value 24 

Go To Advanced Stat 23 

Return to BSDM 24 

Backup 24 

Examples 25 

Regression Analysis 55 

General Information 55 

Multiple Linear Regression 58 

Stepwise Regression (Variable Selection Procedures) 60 

Polynomial Regression 64 

Nonlinear Regression 66 

Standard Nonlinear Regressions 71 

Residual Analysis 73 

Examples 75 






Statistical Graphics 127 

General Information 127 

Common Plotting Characteristics 129 

Time Plot 130 

Histogram 131 

Normal Probability Plot 134 

Weibull Probability Plot 135 

Scattergram 136 

Semi-Log Plot 136 

Log-Log Plot 136 

3D Plot 137 

Andrew's Plot 138 

Examples 139 

General Statistics 157 

General Information 157 

One Sample Tests 158 

Paired Sample Tests 164 

Two Independent Sample Tests 169 

Multiple-Sample (>= 3 Samples) Tests 175 

Statistical Distributions (see Table 1, next page) 181 

Examples 186 

Analysis of Variance 217 

General Information 217 

Discussion 219 

Data Structures 228 

Factorial Design 242 

Nested or Partially Nested Design 243 

Split Plot Designs 245 

One-Way Classification 246 

Two-Way Unbalanced Design 247 

One-Way Analysis of Covariance 248 

F-Prob 250 

Orthogonal Polynomials 251 

Contrasts 252 

Interaction Plots 254 

Multiple Comparisons 255 

Examples 257 

Principal Components and Factor Analysis 307 

General Information 307 

Principal Components 308 

Factor Analysis 309 

Discussion 311 

Methods and Formulae 313 

Examples 318 



Monte Carlo Simulations 355 

General Information 355 

9826/36 Uniform Random Number Generator 359 

Random Number Generators 360 

Beta 361 

Binomial 362 

Chi-Square 363 

Exponential 364 

F 365 

Gamma (Alpha) 366 

Gamma (A,B) 367 

Geometric 368 

Lognormal 369 

Negative Binomial 370 

Standard Normal 371 

Normal 372 

Bivariate Normal 373 

Pareto of the First Kind 374 

Pareto of the Second Kind 375 

Poisson 376 

Random Points on M-dimensional Unit Sphere 377 

Super Uniform 378 

t 379 

Type I Extreme Value 380 

Type II Extreme Value 381 

Uniform 382 

Weibull 383 

Tests for Randomness 384 

Chi-Square 384 

Kolmogorov-Smirnov 386 

Maximum-of-T 387 

Modified Poker 388 

Runs 389 

Serial 390 

Spectral 391 

Elementary Sampling Techniques 393 

Selection Sampling 393 

Shuffling 394 

Appendix 

Changes Necessary For Larger Data Sets 397 

Statistics Library Data Formats 398 

Statistical Tables 407 






Table 1 
Statistical Distributions 

Table Values and Right-Tail Probabilities 



Continuous 

1. Normal 

2. Two-parameter gamma 

3. Central F 

4. Beta 

5. Student's T 

6. Weibull 

7. Chi-square 

8. Laplace 

9. Logistic 



Discrete 

1. Binomial 

2. Negative Binomial 

3. Poisson 

4. Hypergeometric 

5. Gamma Function 

6. Beta Function 

7. Single Term Binomial 

8. Single Term Negative Binomial 

9. Single Term Poisson 

10. Single Term Hypergeometric 






Commentary 



The Stat Library, which we have developed for Hewlett-Packard, is an integrated package 
developed specifically for the HP desktop computers. We set as our objective in preparing this 
library to develop an integrated system which provides the user with a flexible collection of 
routines for data manipulation, exploration, and analyses. The package uses a common front 
end, which provides for considerable flexibility in data handling. The Basic Statistics and Data 
Manipulation (BSDM) front end has been updated and enhanced for inclusion with this library. 
The programs are interactive in operation using the CRT display to list a "menu" of options at 
appropriate times. The group of special function keys is used only with the BSDM routines to 
connect the user directly with a specific operation. The statistical analyses range from the very 
elementary summary statistics to complicated routines for principal components and factor 
analysis. 

The figure on the next page is a diagram showing the essential organizational structure of the 
Stat Library. Notice that there are six major segments in the Stat Library which operate on the 
data: Input Routines, Manipulation Routines, Data File Management Routines, Selection 
Routines, Data Exploration Routines, and Statistical Analysis Procedures. 

This library has evolved out of our ten years' experience in developing software for desktop 
computers. We are currently using these routines in our Statistical Laboratory. We hope you 
will find them useful. 

Thomas J. Boardman, Ph.D. 
Professor-In-Charge 
Statistical Laboratory 
Colorado State University 
Fort Collins, CO 80523 






HP Stat Library 
Integrated Statistical Routines 

[Figure: organizational diagram of the Stat Library. A central DATA block is linked to six segments: Input Routines, Manipulation Routines, Data File Management, Selection Routines, Data Exploration Routines, and Statistical Analysis Procedures.] 

Operation (Key Words)           Description                                     Subprogram Package 
                                                                                Containing Routine 

Input Routines                                                                  BSDM 
  Keyboard                      Direct numeric input by the user. 
  Mass Storage                  Of data previously stored on one of several 
                                mass storage devices. 
  Graphics Input                Using the Graphics Tablet. 
  Other                         User supplied routines. 

Manipulation Routines 
  Sort                          Sorting data on one or two variables. 
  Join                          Joining two data sets either by adding 
                                variables or observations to existing set. 
  Rename                        Change variable label, subfile name, or 
                                project title. 
  Subfile                       Several methods to specify or create subfiles 
                                (groups within your data set). 
  Recode                        Method to recode variable values into another 
                                variable. 
  Edit                          To correct, add, or delete observations or 
                                variables. 
  Transformation                By algebraic routines including user supplied 
                                function. To assign missing values. To create 
                                new variables by using ranks, subfile codes, 
                                sequence numbers, standardized scores, or 
                                lagged variables. 
  Data Recovery                 A backup data file may be accessed if 
                                necessary. 

Data File Management Routines                                                   BSDM 
  Store                         Save data set on user file. 
  Store Subfile(s)              Save particular subfile on a user file. 
  Store Variables               Save particular variables on a user file. 
  Direct                        Obtaining a catalog or directory of data 
                                file(s). 
  Purge                         Eliminate selected data files. 

Selection Routines                                                              BSDM 
  By Subfiles                   To choose a portion of the data for further 
                                analyses. 
  Exclude Missing Values        Always excluded from analyses and data 
                                exploration routines. 
  Select                        To choose a portion of the data set for 
                                further processing on the basis of values 
                                from one or two variables. The values 
                                selected are shown on the CRT and the data 
                                set is reduced down to the selected data set 
                                size. 

Data Exploration Routines 
  Selected Listing              Several ways are available to list all or a     BSDM 
                                portion of the data set. 
  Scan                          Same as Select (above) except that data set     BSDM 
                                is not reduced. 
  Summary Statistics            Many basic statistics such as mean, median,     BSDM 
                                standard deviation, etc., on all or a portion 
                                of the data set. 
  Graphics Displays             Eight common statistical graphics for studying  Stat Graphics 
                                data sets such as normal probability plots and 
                                semi-log plots. 
  Frequencies                   Under development for future addition to 
                                library. 
  Cross Tabulation              Under development for future addition to 
                                library. 

Statistical Analysis Procedures 
  General Parametric Methods    Common one, two-independent, and two-paired     General Statistics 
                                sample inferential procedures. Also one way 
                                analysis of variance. 
  General Nonparametric Method  Common one, two-independent, and two-paired     General Statistics 
                                sample nonparametric inferential procedures. 
                                Also the Kruskal Wallis test for 3 or more 
                                independent samples. 
  Regression Analyses                                                           Regression Analysis 
    Polynomial 
    Multiple Linear Regression 
    Stepwise                    Selection procedures including the stepwise, 
                                forward, backward, and manual routines. 
    Nonlinear                   From user supplied functions using the 
                                Marquardt Compromise algorithm. 
    Standard Nonlinear          Several common nonlinear models are available 
                                for use on your data set. 
  Analysis of Variance (AOV)                                                    Analysis of Variance 
    One Way                     One way AOV procedure. 
    One Way Covariance          One way analysis of covariance procedure. 
    Two Way Unbalanced          The AOV procedure for two way factorials 
                                which are unbalanced. 
    Factorial                   AOV procedure for up to 5 factors with 
                                balanced data. 
    Split Plot                  AOV methods for several types of split plot 
                                designs with up to 4 factors. 
    Nested                      AOV methods for completely or partially 
                                balanced nested designs. 
  Principal Components          Common multivariable dimension reduction        Principal Components 
  and Factor Analysis           procedures. Extensive use of graphics.          and Factor Analysis 
  (Others)                      In the future. 


Basic Statistics 
and Data Manipulation 



General Information 



Description 

This set of programs allows you to create a statistical data base which can be accessed by 
other Hewlett Packard statistical routines. It alleviates the need to key in data each time a new 
statistical procedure is used. 

The capabilities of this set of programs include data entry and several manipulative data 
operations. A wide variety of summary statistics may be obtained. In addition, the programs 
have many ease-of-use features; the human interface was a major concern in designing the 
programs. Specific capabilities follow. 



Data Entry:            Keyboard 
                       Magnetic media (flexible discs) 
                       Graphics tablet 
                       Other input devices (paper tape, etc.) 

Data Manipulation:     Edit incorrect/incomplete data sets 
                       Transform - both algebraic and non-algebraic 
                       Assign codes to intervals of data 
                       Sort 
                       Divide data set into subfiles 
                       Join two data sets 
                       Select portions of the data 

Summary Statistics:    Basic statistics (mean, standard deviation, etc.) 
                       Correlation matrix 
                       Order statistics (max, min, median, etc.) 

Other Features:        Error detection 
                       Easy error correction 
                       Variables can be named 
                       Data can be stored for future reference 
                       Data can be listed 
                       Data can be scanned for specified qualities 
                       A backup file of the data can be recalled 
                       Printer unit can be changed 
                       Missing data values can be assigned 



Typical Program Flow 



1. Get data into memory from keyboard or disc (RESTART) 

2. List the entered data (LIST) 

3. Correct mistakes (EDIT) 

4. Break the data into subfiles (SUBFILE) 

5. Transform the original data by normalizing, etc. (TRANSFORM) 

6. List the edited and transformed data set (LIST) 

7. Obtain basic statistics such as means, standard deviations, etc. (STATS) 

8. Go to an advanced statistical routine such as Regression, AOV, etc. (ADV. STAT) 



Special Considerations 

Data Matrix Configuration 

The data matrix incorporated in this program should be thought of as a p-by-n array whose 
columns correspond to observations and whose rows correspond to variables as shown 
below. 



                         OBSERVATIONS 

                    O1    O2    O3   ...   On 

               V1 
               V2 
VARIABLES      V3 
               ... 
               Vp 



Subfiles may be created, in which case the structure becomes only slightly more complex as 
shown below. 



                         OBSERVATIONS 

               SUBFILE 1          SUBFILE 2                 ...   SUBFILE s 
               O(1) ... O(n1)     O(n1+1) ... O(n1+n2)      ...   O(n1+...+n(s-1)+1) ... O(n1+...+ns) 

               V1 
VARIABLES      V2 
               ... 
               Vp 
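As a rough illustration of the layout described above (this is not code from the Stat Library), the following Python sketch stores variables as rows and observations as columns, and derives the observation range of each subfile from the subfile sizes. The names `data`, `subfile_sizes`, `subfile_ranges` and `observation` are hypothetical, and the numbers are made up.

```python
# Hypothetical data: 3 variables (rows) by 6 observations (columns).
data = [
    [10, 11, 9, 9, 12, 8],  # variable 1
    [2, 2, 3, 2, 4, 1],     # variable 2
    [5, 6, 7, 6, 8, 4],     # variable 3
]

# Two subfiles: the first holds n1 = 4 observations, the second n2 = 2.
subfile_sizes = [4, 2]


def subfile_ranges(sizes):
    """Return the (first, last) 1-based observation numbers of each subfile."""
    ranges = []
    start = 1
    for n in sizes:
        ranges.append((start, start + n - 1))
        start += n
    return ranges


def observation(matrix, k):
    """Return observation k (1-based): one value per variable."""
    return [row[k - 1] for row in matrix]
```

Here subfile 2 starts at observation n1 + 1, matching the diagram above.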



Scratch Data Sets 

There are two data files which are used by the statistical data base. They are "DATA" and 
"BACKUP". DATA is the file which contains the most current form of your data matrix. It is 
updated upon completion of any procedure which modifies the data matrix or any variable 
names. Thus, DATA contains the data that will be used for any statistical calculations. BACKUP, 
on the other hand, is not updated automatically. After the data has first been entered, a 
copy of the DATA file is automatically put into BACKUP. From then on BACKUP can only be 
modified manually via the BACKUP PROCEDURE. This procedure will also let you retrieve 
the BACKUP file and copy it to the DATA file. So, if you erroneously alter your data matrix, 
the original data set is still retrievable. 
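The relationship between the two scratch files can be pictured with a small Python model. This is only a sketch of the behaviour described above, not the package's implementation; the class name `ScratchFiles` and its method names are assumptions made for illustration.

```python
import copy


class ScratchFiles:
    """Toy model of the DATA and BACKUP scratch files."""

    def __init__(self, matrix):
        # When data is first entered, DATA is written and copied to BACKUP.
        self.data = copy.deepcopy(matrix)
        self.backup = copy.deepcopy(matrix)

    def modify(self, new_matrix):
        # Any procedure that changes the data matrix updates DATA only.
        self.data = copy.deepcopy(new_matrix)

    def save_backup(self):
        # Manual step (BACKUP procedure): copy DATA to BACKUP.
        self.backup = copy.deepcopy(self.data)

    def restore_backup(self):
        # Manual step (BACKUP procedure): copy BACKUP over DATA.
        self.data = copy.deepcopy(self.backup)
```

In this model, an erroneous `modify` can be undone with `restore_backup` as long as `save_backup` has not been called in between.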



Data File Configuration 

The scratch file on the program medium, "DATA", and any files created to hold stored data 
and related information are configured as follows. 

The data file is broken into logical records of 1280 bytes each (if you are unfamiliar with logical 
records, refer to your desktop's Programming Techniques Manual.) The first logical record is a 
"header file", which contains information pertinent to the data set which is stored in the 
remaining logical records. The header file contains the following information (variables): 

Header item                           Limitations 

data set title (T$)                   80 characters 
number of observations (No)           No*Nv <= 1500 
number of variables (Nv)              50 
variable names (Vn$(*))               10 characters each 
number of subfiles (Ns)               20 
subfile names (Sn$(*))                10 characters each 
subfile characterizations (Sc(*))     N/A 

The remaining logical records contain D(*,*), the data matrix. 

For a detailed explanation of the data file, see the appendix. 
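The header limits listed above can be expressed as a simple check. The Python sketch below is an illustration only: the `Header` class and `check_header` function are hypothetical, and only the numeric limits come from the table.

```python
from dataclasses import dataclass, field
from typing import List

MAX_TITLE = 80       # characters in the data set title
MAX_CELLS = 1500     # No * Nv
MAX_VARIABLES = 50   # Nv
MAX_SUBFILES = 20    # Ns
MAX_NAME = 10        # characters in a variable or subfile name


@dataclass
class Header:
    title: str
    n_obs: int
    n_vars: int
    var_names: List[str]
    subfile_names: List[str] = field(default_factory=list)


def check_header(h):
    """Return a list of messages, one for each limit the header exceeds."""
    problems = []
    if len(h.title) > MAX_TITLE:
        problems.append("title longer than 80 characters")
    if h.n_obs * h.n_vars > MAX_CELLS:
        problems.append("No*Nv exceeds 1500")
    if h.n_vars > MAX_VARIABLES:
        problems.append("more than 50 variables")
    if len(h.subfile_names) > MAX_SUBFILES:
        problems.append("more than 20 subfiles")
    for name in h.var_names + h.subfile_names:
        if len(name) > MAX_NAME:
            problems.append("name longer than 10 characters: " + name)
    return problems
```

For example, 100 observations on 20 variables give No*Nv = 2000, which this check reports as exceeding the 1500 limit.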

Parser 

BSDM is equipped with an elementary parser. This means that wherever an answer could 
require multiple responses the parser will separate your response into its individual parts. For 
example, when asked "What variables are desired?", you may respond in three ways: 

1. ALL: enter ALL if you want the entire set of variables to be used 

2. 1,2,3,...: enter the specific variables you want 

3. 4-7: enter a dash (-) if you want all variables from 4 to 7 

So, a sample response for the question might be: 
1,3,5-8,10,15,21-25 

The response would be interpreted to mean that you requested variables 
1,3,5,6,7,8,10,15,21,22,23,24 and 25. 

Thus, anywhere multiple values may be input, you may enter the responses in this manner. 

In several cases the words "NONE" or "NO" are also possible responses. When they are 
allowed, it is mentioned in the prompt. These words may be used interchangeably. 

Note 

Entering negative numbers is no different than entering positive 
ones. For example, the input: 

-10- -3,1-4 
would mean all numbers between -10 and -3 and all between 1 
and 4. 
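The response format described above can be mimicked in a few lines of Python. This is a sketch under stated assumptions, not the BSDM parser itself: the function name `parse_response` and the `n_items` argument (the total number of variables, needed to expand ALL) are hypothetical.

```python
import re

# One token is either a single integer or "first-last"; both may be negative.
_TOKEN = re.compile(r"^\s*(-?\d+)\s*(?:-\s*(-?\d+))?\s*$")


def parse_response(text, n_items=None):
    """Expand a response such as '1,3,5-8' into a list of integers."""
    if text.strip().upper() == "ALL":
        if n_items is None:
            raise ValueError("ALL needs the total number of items")
        return list(range(1, n_items + 1))
    values = []
    for token in text.split(","):
        match = _TOKEN.match(token)
        if match is None:
            raise ValueError("cannot interpret %r" % token)
        first = int(match.group(1))
        last = int(match.group(2)) if match.group(2) is not None else first
        if last < first:
            raise ValueError("range runs backwards: %r" % token)
        values.extend(range(first, last + 1))
    return values


print(parse_response("1,3,5-8,10,15,21-25"))
# [1, 3, 5, 6, 7, 8, 10, 15, 21, 22, 23, 24, 25]
```

The same function accepts the negative-number example from the note: `parse_response("-10- -3,1-4")` yields every integer from -10 to -3, followed by 1 to 4.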



Incorrect Responses 

If a response outside the range of plausible responses is input from the keyboard, an appropriate 
message is displayed on the CRT. Program execution is resumed by asking the question, or 
in some cases a previous question, again. 

If a plausible response is given, but it is not correct, a couple of possibilities exist. First, if an 
incorrect value has been entered for a data point, it may be corrected using the EDIT program. 
Second, in many cases, responses to several questions are printed on the CRT. Then a 
question such as "Is the above information correct?" is asked. This allows any of the printed 
information to be changed. 

Hardware Requirements 

9826 or 9836 computer with 240k bytes, available user memory — required. 

External printer — required. The CRT may be used as the printer but results will be difficult to 
read and understand. 

External plotter — optional. 

External mass storage — optional. 



Note 

Both the user-defined transformation option and non-linear regression 
require that you specify the form of the functions before you 
begin BSDM. See page 69 for an explanation. 

Getting Started 

1. If your 9826 or 9836 computer is ROM-based, go to Step 2. Otherwise, if your system is 
RAM-based, or if you do not wish to turn the computer OFF and the complete system is 
ready: 

a. Make sure that BASIC is ready and all peripherals are properly connected and turned 
on. (Make sure P1 and P2 are set properly if a hardcopy plotter is being used.) 

b. Insert the Basic Statistics disc into the internal flexible disc drive. 

c. Type: SCRATCH A (EXECUTE) 

d. Type: LOAD "AUTOST",1 (EXECUTE) 

e. Go to Step 5. 

2. If the 9872C (or any peripheral) is being used, make sure it is properly connected and 
turned on. Make sure P1 and P2 are set properly if a hardcopy plotter is being used. 

3. Insert the Basic Statistics disc into the internal disc drive. 

4. Turn the computer on. 

5. You will be asked a series of questions which should be self-explanatory. If you have any 
questions, turn to the Special Considerations section of the manual covering the procedure 
in question. You will find some general comments on how that section of the 
program works. 



Start 

Object of Program 

This program allows you to enter a data matrix into memory. The data may be entered from the 
keyboard, or from some other input device such as a graphics tablet, etc. Conversely, the data 
may have been entered previously and stored in the program scratch file ("DATA") or in a 
user-created file on a flexible disc or hard disc. In this case, the function of this program is to 
retrieve the previously stored data and place it into memory so that further operations can be 
performed. After the data is in memory, a listing option is available to obtain a complete or 
partial copy of the data. 

Typical Program Flow 



1. Specify printer 

2. Specify data type, e.g., raw data 

3. Specify data entry mode 

   Magnetic or disc:  Data retrieved, description of data set printed 

   Keyboard:          Enter data set title, # variables, # observations, variable names; 
                      then enter data manually 

4. Data stored on DATA & BACKUP files 



Special Considerations 

Terminology 

The displayed prompts concerning the scratch file ("DATA"), whether the data was stored by 
this program, and whether the data is in the proper configuration are explained here and in the 
Special Considerations section of General Information for BSDM. 

The prompts concerning the data medium and program medium may cause confusion. The 
word "medium" is used since the set of programs making up this software package may be on 
floppy disc. Thus, the "program medium" refers to the disc on which the programs making up 
this package are stored. Conversely, the "data medium" refers to the disc on which the file 
containing the data matrix resides. In some cases, the program medium and the data medium 
are the same. However, this is not determined by the program and hence, the prompts are 
displayed to make sure the correct medium is in the correct device. 

Data on Mass Storage 

If the data is on a mass storage device, it may have been stored in one of four ways. The 
following discussion explains the prompts that apply to each situation. 

1. If the data was entered using this statistics package (and was the last data set used on this 
package), it will be on the disc in the scratch file called "DATA". Thus, an affirmative 
answer to the prompt "Is data stored on the program medium's scratch file (DATA)?" will 
retrieve the data and related information. 

2. The data may have been entered using the Basic Statistics and Data Manipulation 
routines and then stored using the STORE routine of BSDM. After specifying the file 
name and the storage unit in which the data resides, you should answer Yes to the 
prompt "Was data stored by this program?". Then, the data and related information will 
be retrieved. 

3. The data may be stored as: all observations of variable one followed by all observations 
of variable two, etc. This is in the same configuration as data stored by the BSDM 
routines, i.e., variables = rows and observations = columns. To retrieve the data, a Yes 
response to the prompt "Is the data in proper configuration...?" should be given. 

4. The data may be stored as: all variables of observation one followed by all variables of 
observation two, etc. This is the transpose of what is expected by the BSDM routines, i.e., 
observations = rows, variables = columns. To retrieve this type of data a Yes response 
should be given to the prompt "Data stored as contiguous array with observations = 
rows...?". 

Notice that in cases 3 and 4, the data was stored by a program other than a statistics routine. 
Thus, no variable names or other auxiliary information will be stored along with the data. 






As an example, suppose you have run your own program where you have created a file by 
storing data acquired from three sensors as it came in from the devices. A picture of five 
readings (observations) from the sensors would look like this: 





                       Reading 

              1      2      3      4      5 

Sensor 1     7.2    7.4    7.1    7.2    7.3 
Sensor 2     8.0    7.9    8.1    7.8    8.0 
Sensor 3     7.8    7.5    7.5    7.6    7.9 


If the data were stored in this order: 7.2, 7.4, 7.1, 7.2, 7.3, 8.0,..., 7.5, 7.6, 7.9, then it is in 
what we call the proper configuration, and the situation is that described in note 3 above. 

Conversely, if the data were stored as: 7.2, 8.0, 7.8, 7.4, 7.9, 7.5, ... , 7.3, 8.0, 7.9, then it is 
the transpose of what is expected and the situation is that described in note 4 above. 
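The two storage orders in the sensor example can be reproduced with a short Python sketch. It is an illustration only, not code from the package; the function names `from_variable_order` and `from_observation_order` are hypothetical. Both functions return the matrix with variables as rows and observations as columns, which is the configuration BSDM expects.

```python
def from_variable_order(values, n_vars, n_obs):
    """Case 3: all observations of variable 1, then of variable 2, etc."""
    return [values[v * n_obs:(v + 1) * n_obs] for v in range(n_vars)]


def from_observation_order(values, n_vars, n_obs):
    """Case 4: all variables of observation 1, then of observation 2, etc."""
    return [[values[o * n_vars + v] for o in range(n_obs)]
            for v in range(n_vars)]


# The sensor readings from the example: 3 variables by 5 observations.
sensors = [
    [7.2, 7.4, 7.1, 7.2, 7.3],
    [8.0, 7.9, 8.1, 7.8, 8.0],
    [7.8, 7.5, 7.5, 7.6, 7.9],
]

# "Proper configuration": 7.2, 7.4, 7.1, 7.2, 7.3, 8.0, ...
proper = [x for row in sensors for x in row]

# Transposed storage: 7.2, 8.0, 7.8, 7.4, 7.9, 7.5, ...
transposed = [sensors[v][o] for o in range(5) for v in range(3)]
```

Reading `proper` with `from_variable_order` and `transposed` with `from_observation_order` both recover the same 3-by-5 matrix.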



Keyboard Entry 

When entering data from the keyboard, an option to enter data one case at a time is offered. 
The following example will serve to explain this feature. Suppose an investigator has collected 
four observations on each of three variables. He has the following data matrix: 

                        Variable 

                     1      2      3 

               1    10      2      5 
Observation    2    11      2      6 
               3     9      3      7 
               4     9      2      6 



He elects to enter the data one case at a time. Then, when the prompt "Observation #, all 
variables (separated by commas) = ?" is displayed, he enters 10, 2, 5 and presses CONT, 
etc. This allows for quick entry of the data. 

The other form of keyboard entry will prompt you at each observation for the required 
variable. 



Missing Values 

If you have missing values, use an unused number for a temporary code for a missing value. 
Subsequently you can change your values to the program's value of -9999999.99999 by 
using the TRANSFORM operation. 
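In Python terms, the recoding step amounts to the following. This is a sketch of the idea only, not of the TRANSFORM program; the function name `recode_missing` is hypothetical, and -1 is just an example of a temporary code that does not occur in the real data.

```python
MISSING = -9999999.99999  # the program's missing-value code, as stated above


def recode_missing(values, temp_code):
    """Replace every occurrence of the temporary code with MISSING."""
    return [MISSING if x == temp_code else x for x in values]


readings = [7.2, -1, 7.1, 7.2, -1]
print(recode_missing(readings, -1))
```

The temporary code must be a number that cannot appear as a genuine data value, otherwise real observations would be turned into missing values.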



Graphics Input 

Data may be input by digitizing from a graphics tablet. You may find this form of input very 
useful. The following diagram briefly describes the types of information requested by the 
program. 



1. Specify printer 

2. Specify raw data 

3. Choose graphics input mode of data entry 

4. Specify graphics input device 

5. Input select code & bus code of device 

6. Input project title 

7. Input form of input, e.g. (x,y) pairs 

8. Specify digitizing mode 

9. Specify sample size requirements 

10. Digitize chart limits 

11. Input numeric values of limits 

12. Digitize data 






"Other" Input 

Because of the wide variety of formats that could be used when entering data from "other" 
devices, no attempt was made to program in the necessary statements. It will be necessary for 
you to provide the statements before using the program. Refer to the Operating Manual of the 
appropriate device for detailed instructions. In general, though, 

1. Type: LOAD "FILE1" 

2. Press: (EXECUTE) 

3. Type: EDIT Other_input (EXECUTE) 

4. Change the 0 to a 1 in line 1731: Other_input: Implemented=0 

5. Press: ( ENTER) 

6. Press: (PAUSE) 



7. Type: EDIT Otherin (EXECUTE) 

8. Type in and enter the appropriate statements for "other" input, referring to the 
Operating Manual for the input device. 

Edit 

Object of Program 

This program is designed to allow you to perform a variety of editing procedures on your data 
set. The editing capabilities include: 

Correct a data value 

Correct an entire observation 

Delete a variable 

Delete an observation 

Add a variable 

Add an observation 

Insert an observation (in ordered data) 

Delete a subfile 

All of these operations may be performed repeatedly. For example, three variables may be 
added in succession. After the data matrix has been edited, you are given the option of listing 
the data. 

Special Considerations 

Order of Corrections 

As stated in the program note printed on the screen, the data is renumbered after deletions or 
insertions are performed. For this reason, if more than one deletion (insertion) is to be 
performed, it is recommended that the highest-numbered observation (or variable) be deleted first, 
then the next highest-numbered, etc. For example, if observations three and eight are to be 
deleted, then it is recommended to delete observation eight first, then observation three. 
Notice that if observation three were deleted first, the subsequent renumbering would move 
observation eight to position seven. The recommendation is meant to alleviate confusion 
which may occur due to the renumbering. If you delete several observations at once using the 
answering technique described in the Special Considerations section of BSDM General 
Information under "Parser", you do not need to worry about the renumbering problem. Your 
responses will be sorted from highest to lowest automatically. So to delete observations five 
through eight, just enter 5-8 and you will have no problems. 
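The renumbering effect is easy to see in a small Python sketch. This is an illustration, not the EDIT program; the function names `delete_one` and `delete_observations` are hypothetical, and the single variable simply holds the original observation numbers 1 to 10 so the result can be read off directly.

```python
def delete_one(matrix, k):
    """Delete observation k (1-based); later observations move down by one."""
    for row in matrix:
        del row[k - 1]


def delete_observations(matrix, numbers):
    """Delete several observations, highest-numbered first."""
    for k in sorted(set(numbers), reverse=True):
        delete_one(matrix, k)
    return matrix


# Deleting 8 first, then 3, removes the intended observations.
safe = [list(range(1, 11))]
delete_observations(safe, [3, 8])
print(safe)   # [[1, 2, 4, 5, 6, 7, 9, 10]]

# Deleting 3 first renumbers the rest, so "8" now refers to the old 9.
naive = [list(range(1, 11))]
delete_one(naive, 3)
delete_one(naive, 8)
print(naive)  # [[1, 2, 4, 5, 6, 7, 8, 10]]
```

Sorting the requested numbers from highest to lowest, as the program is described as doing, keeps every remaining request valid after each deletion.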






Subfiles 

Insertions or deletions of observations will affect the content of subfiles which exist at the time 
of editing. For example, if subfile one consists of the first 10 observations while subfile two 
consists of the last 20 and if observation five is deleted, then observation ten (formerly 
numbered 11) will have jumped from subfile two to subfile one. Thus, it may be necessary to 
change the subfile structure after editing. It is recommended that subfiles be created only after 
all editing has been performed. 

Correcting Data Value(s) 

When correcting a data value, you must specify the variable number and observation number 
of the value to be corrected. Then, the old value is displayed prior to your correction so you 
can be sure you are altering the correct value. 

Correcting Observation(s) 

When correcting an entire observation, you specify the observation to be corrected. The old 
values are then listed on the screen and you may then enter the new values one-at-a-time. 

Adding Observation(s) 

In adding observations you will be asked to enter the number of observations that are to be 
placed at the end of the data matrix. Observations should be entered one-at-a-time with the 
data values separated by commas. 

If an observation is to be inserted, the position of the insertion must be specified by entering 
the number of the existing observation which the insertion will precede. For example, if an 
observation is to be inserted between observations 8 and 9, you must enter 9 when the 
prompt "Insertion to precede observation #?" is displayed. You will then be asked to enter 
the number of observations that are to be inserted at this point. 

Deleting Observation(s) 

You will be asked to enter the numbers corresponding to the observations to be deleted. They 
will be sorted and the observations will be deleted from highest-numbered to 
lowest-numbered to avoid renumbering confusion. 

Deleting Subfile(s) 

This option works the same as deleting observations. All you need to specify is the subfile 
number and all observations within the subfile will be deleted. All observations after the ones 
deleted will be renumbered. 

Deleting Variable(s) 

You will be asked to enter the numbers corresponding to variables to be deleted. They will be 
sorted and the variables will be deleted from highest-numbered to lowest-numbered. 

Exceeding Program Limitations 

If the addition of an observation or of a variable will exceed program limitations, these options 
will not be executed. 

Methods and Formulae 

The data matrix is redimensioned into a row vector to facilitate the shuffling of elements 
necessitated by the editing operations. The vector contains all the observations of variable 
one, followed by the observations of variable two, etc. When an observation is inserted, for 
example, the elements of the data vector are shuffled one-at-a-time to make room for the 
incoming observation. Similarly, when an observation is deleted, the remaining observations 
are "packed" together so that the resultant data vector has no "holes" between observations. 
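A simplified picture of this vector layout, written in Python, is given below. It is a sketch under stated assumptions rather than the package's routine: the function names are hypothetical, and the sketch rebuilds the vector one variable block at a time instead of shuffling elements one-at-a-time in place as the program is described as doing. The resulting vector is the same.

```python
def insert_observation(vector, n_vars, n_obs, before, values):
    """Insert one observation so that it precedes observation `before` (1-based).

    `vector` holds all observations of variable 1, then variable 2, etc.
    `values` holds the new observation, one value per variable.
    """
    out = []
    for v in range(n_vars):
        block = vector[v * n_obs:(v + 1) * n_obs]
        out.extend(block[:before - 1] + [values[v]] + block[before - 1:])
    return out


def delete_observation(vector, n_vars, n_obs, k):
    """Delete observation k (1-based) and pack the remaining values together."""
    out = []
    for v in range(n_vars):
        block = vector[v * n_obs:(v + 1) * n_obs]
        out.extend(block[:k - 1] + block[k:])
    return out


# Two variables, three observations: variable 1 = 10, 11, 9; variable 2 = 2, 2, 3.
vec = [10, 11, 9, 2, 2, 3]
print(insert_observation(vec, 2, 3, 2, [99, 88]))  # [10, 99, 11, 9, 2, 88, 2, 3]
print(delete_observation(vec, 2, 3, 1))            # [11, 9, 2, 3]
```

Because each variable occupies a contiguous block, inserting or deleting one observation touches one position in every block, which is why the elements after that position have to move.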






Transform 

Object of Program 

This procedure is designed to allow you to transform your data. The transformations available 
fall into three categories. Algebraic transformations allow you to perform the standard algebraic 
operations on one or two variables in the data set. There is also the capability for you to define 
your own transformation. The second category of transformations is the assigning of missing 
values. With this section you may assign any value in the data set to correspond to missing data. 
The final section is new variables. Here, you may perform such operations as generating 
uniform random numbers, standardizing variables, lagging variables, creating rank variables, 
sequence variables, and variables corresponding to subfiles. 



In all the sections the transformed results will be placed in a variable you specify, either old or 
newly-created. Hence, transformations on more than two variables may be performed iter- 
atively or via a transformation defined by you. 

Special Considerations 

Missing Values (Algebraic Transformations) 

None of the pre-specified algebraic transformations are applied to missing values. Thus, mis- 
sing values are unaffected by these transformations. However, this is not necessarily the case 
with the user-defined transformation. If you define a transformation and there are missing 
values, you must make provisions to ensure that the transformation is not applied to the 
missing values (unless, of course, this is desired). This may be accomplished as explained 
below. 

User-Defined (Algebraic Transformations) 

Before you start to run the Basic Statistics and Data Manipulation program, you should prepare 
your own transformation function and store it on the data storage medium. Consider the 
following example. Suppose your data set consists of four variables. There are missing values. 
You desire to form variable five as the sum of the exponential of variables one and three. If 
there is a missing value in either of these variables, you wish to assign a missing value to the 
transformed variable. Recall that the data is of the form D(J,I) where J is the variable number 
and I is the observation number. In the transformation routine the variable Z is used to denote 
the variable where the transformed data is to be stored. Thus, to accomplish the above- 
described transformation, follow the instructions below: 

1. Insert a flexible disc into the internal disc drive. 



2. Type: SCRATCH A ( EXECUTE ) 

3. Press: EDIT ( EXECUTE ) 

4. Now you should be able to see line number "10" in the upper-left corner of the CRT.
Start to type in your function as a subroutine. Press (ENTER) after each line. For example:

   10 ! A comment to identify perhaps your file name.
   20 SUB Function(D(*),Z,I)

   (Note: This line must be exactly the same as above.)

   30 IF D(1,I)<>-9999999.99999 AND D(3,I)<>-9999999.99999 THEN 60
   40 D(Z,I)=-9999999.99999
   50 GOTO 80
   60 D(Z,I)=EXP(D(1,I))+EXP(D(3,I))
   70 ! Note: The value of Z will be asked by the program. You must specify the
        variable numbers for the right hand side of the equation (i.e., 1 and 3)
   80 SUBEND

   (Note: This line must be the last line of the subroutine)

5. Press: ( CLR SCR )

6. Type: STORE "your filename : mass storage identifier" ( EXECUTE ) 

Now you can proceed with data entry through BSDM. 

Declaring Missing Values 

This section allows you to assign missing values to any or all of the variables in the data set. It 
may be used successively so that you can assign different missing values to each variable or 
different sets of variables. The program asks you to enter the variables to which a missing 
value is to be assigned. You are then asked what numbers are to be considered missing values 
for that group of variables. Then, these variables are scanned and all missing values are 
transformed to -9999999.99999, which is the standard missing value code. 
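
As a rough illustration of this step, the Python sketch below replaces user-declared codes with the standard missing value code. The dictionary layout (variable number mapped to a list of observations) and the function name declare_missing are assumptions for the example, not part of the library.

```python
MISSING = -9999999.99999  # standard missing value code quoted in the text


def declare_missing(data, variables, codes):
    """Replace every value listed in `codes` by MISSING, for `variables` only."""
    for var in variables:
        data[var] = [MISSING if value in codes else value for value in data[var]]
    return data


data = {1: [5, 99, 3], 2: [99, 0, 7]}
declare_missing(data, variables=[1], codes={99})
print(data[1])  # [5, -9999999.99999, 3]
print(data[2])  # [99, 0, 7]  (variable 2 was not in the group)
```

Calling the function again with a different group of variables and different codes mirrors the successive use described above.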

Create Rank Variables 

This operation will take a variable, rank its values in ascending order, and place the resulting 
ranks in the variable specified by you. 

As an example, consider the following variable which has four observations. 

Variable 1 
23 
25 
29 
20 

You could create a second variable which contains the ranks corresponding to the observa- 
tions in the first variable. You would obtain the following: 

Variable 1 Variable 2 
23 2 

25 3 

29 4 

20 1 
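
The ranking in this example can be reproduced with a few lines of Python. This is a sketch only; the function name is hypothetical, and the treatment of tied values (consecutive ranks in order of appearance) is an assumption, since the text does not say how ties are ranked.

```python
def rank_variable(values):
    """Return the ascending rank (1 = smallest) of each value."""
    # Indices of the observations, ordered by their values.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return ranks


print(rank_variable([23, 25, 29, 20]))  # [2, 3, 4, 1]
```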






Creating Variables by Subfile 

This option may only be used when a subfile structure is present. If used, this option will 
assign the subfile number associated with each observation to the specified variable. 

For a simple example, suppose you have a data set with one variable containing five observa- 
tions. Subfile one consists of the first two observations, while subfile two has the last three 
observations. In this case, you could create a second variable whose observations correspond 
to the subfile numbers associated with the original variable. This variable would look like the 
following. 

Variable 2 
1 
1 
2 
2 
2 

Creating Variables by Sequence Number 

By selecting this option, you can place the observation numbers in a specified variable. For 
example, in a data set with five observations, you could create a second variable which would 
look like the following: 

Variable 2 
1 
2 
3 
4 
5 

Creating Standardized Score Variables 

In this option, a chosen variable is standardized by the following formula: 

New Variable = (Specified Variable - Mean of Specified Variable) / (Standard Deviation of Specified Variable)

The new variable can be placed in any variable you specify. Notice that standardized variables 
have a mean of zero and a standard deviation of one. 
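
A minimal Python sketch of the standardization follows. It assumes the standard deviation uses the N-1 denominator and that the variable contains no missing values; the function name is hypothetical.

```python
import statistics


def standardize(values):
    """Subtract the mean and divide by the standard deviation (N-1 form)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation, N-1 denominator
    return [(value - mean) / sd for value in values]


z = standardize([2, 4, 4, 4, 5, 5, 7, 9])
print(round(abs(statistics.mean(z)), 10))  # 0.0
print(round(statistics.stdev(z), 10))      # 1.0
```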



Creating Lag Variables 

The lag variable operation will take the value of a chosen variable n-lags before and use it as 
the current observation of the lagged variable being created. As an example, consider the 
following data set:

                  Var.1   Var.2
              1     2       3
              2     1       4
   Obs.#      3     4       6
              4     1       2
              5     2       4




We can create variable 3 by lagging variable two by one lag. We can also create variable four 
by lagging variable one by two lags. We would obtain the following:

                  Var.1   Var.2   Var.3   Var.4
              1     2       3      MV      MV
              2     1       4       3      MV
   Obs.#      3     4       6       4       2
              4     1       2       6       1
              5     2       4       2       4



Notice that missing values are placed in the first n observations of an n-lag variable since 
lagged values cannot be assigned. 
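
The lag operation in this example can be sketched in Python as follows. The name lag_variable is hypothetical, and None stands in for the missing value code (MV).

```python
MV = None  # stands in for the missing value code in this sketch


def lag_variable(values, n_lags):
    """Shift `values` down by `n_lags` observations, filling the top with MV."""
    n = len(values)
    shift = min(n_lags, n)  # a lag longer than the data leaves only MVs
    return [MV] * shift + list(values[:n - shift])


var1 = [2, 1, 4, 1, 2]
var2 = [3, 4, 6, 2, 4]
print(lag_variable(var2, 1))  # [None, 3, 4, 6, 2]
print(lag_variable(var1, 2))  # [None, None, 2, 1, 4]
```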

Creating Uniform Random Number Variables 

This option allows you to generate uniform random numbers between zero and one and have 
them placed in a variable of your choice. 

As an example of the use of this option, you could select a random sample of the observations 
in your data set to be used in a subsequent analysis. To do this, you could first use the 
uniform random number option to assign a uniform random number to each observation. 
Then, you could use the select procedure (described later in this manual) to choose a portion
of the data set based on the uniform random numbers. For example, if you selected observa-
tions that had a corresponding random number value between zero and one-half, you would expect
to have selected about one-half of your data set. 
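
The random sampling idea can be sketched in Python. The sketch uses Python's own generator, not the library's; the function names, the seed argument and the cutoff argument are assumptions for the illustration.

```python
import random


def uniform_variable(n_obs, seed=None):
    """One uniform random number in [0, 1) per observation."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n_obs)]


def select_below(observations, uniforms, cutoff=0.5):
    """Keep the observations whose random number is below `cutoff`."""
    return [obs for obs, u in zip(observations, uniforms) if u < cutoff]


observations = list(range(1, 1001))
uniforms = uniform_variable(len(observations), seed=1)
sample = select_below(observations, uniforms)
print(len(sample))  # about half of the 1000 observations
```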



Recode 

Object of Program 

This program allows you to assign codes to various categories or classes of data. The categor-
ies are intervals along the real number line, and up to 20 of these may be specified. The recoding is
done on one variable at a time. The same coding scheme may be used iteratively on succes- 
sive variables. A summary of the coding intervals, codes, and number of observations 
assigned to each code is printed as hard copy. 

Special Considerations 

Coding Schemes 

Four coding schemes are available for the sole purpose of eliminating unnecessary entries 
from the keyboard. If the coding intervals are all of the same length and are contiguous, that 
is, together they form a connected interval, then the interval construction can be accom- 
plished internally knowing only the interval length and lower limit for the first interval. Similar- 
ly, if the intervals are of equal length but noncontiguous, for example, 

[10,20), [25,35), [35,45), [50,60) 

then the lower limit of each interval needs to be specified but the upper limit may be com- 
puted internally. Hence, the coding schemes are meant only to minimize the amount of in- 
formation which needs to be entered from the keyboard. Clearly, the coding intervals could 
all be constructed by requiring you to enter the lower and upper limits for each and every 
interval (which is necessary, and what is done if the intervals are unequal and non- 
contiguous). 
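
How little needs to be entered for the equal-length schemes can be seen in a short Python sketch. The function names are hypothetical, and the count argument (the number of intervals) is an assumption; the text mentions only the interval length and the first lower limit.

```python
def contiguous_equal(first_lower, length, count):
    """Side-by-side intervals built from one lower limit and one length."""
    return [(first_lower + k * length, first_lower + (k + 1) * length)
            for k in range(count)]


def noncontiguous_equal(lowers, length):
    """Equal-length intervals: only the lower limits are entered."""
    return [(lower, lower + length) for lower in lowers]


print(contiguous_equal(10, 10, 3))
# [(10, 20), (20, 30), (30, 40)]
print(noncontiguous_equal([10, 25, 35, 50], 10))
# [(10, 20), (25, 35), (35, 45), (50, 60)]
```

Each pair is a (lower, upper) limit; the second call reproduces the intervals of the example above.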






Coding is carried out one observation at a time. If you wish to recode more than one variable
you must use the procedure successively, once for each variable to be recoded. Listed below 
are the available recoding options. 

1. Contiguous intervals of equal length 

2. Contiguous intervals of unequal length 

3. Non-contiguous intervals of equal length 

4. Non-contiguous intervals of unequal length 

Option 1 will recode a variable into equally spaced intervals that are side by side. The second 
option will recode based on intervals of unequal length that are side by side. Options 3 and 4 
will recode into intervals that need not be side by side. For equally spaced intervals, use option 
3 and for unequally spaced intervals use option 4. 

Brackets 

The brackets used to denote the coding intervals are meant to follow their usual mathematical 
interpretation, that is, the intervals are closed on the left and open on the right. Hence, if you 
want a value to fall into a certain interval, make sure it is strictly less than the upper limit for 
the interval. 

Observations Which Do Not Fall in an Interval 

If an observation does not fall into any of the coding intervals, a table will appear giving you 
three options on how to handle these values. You may either 1) leave them unrecoded, 2) 
assign them a special code, or 3) assign them the missing value code. 
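
The interval test and the three ways of handling unmatched values can be sketched in Python. This is an illustration under assumptions: the function name, the unmatched keyword and its three settings are hypothetical and simply stand for the three choices offered in the table.

```python
MISSING = -9999999.99999  # standard missing value code


def recode(values, intervals, codes, unmatched="leave", special_code=None):
    """Recode `values` using intervals closed on the left, open on the right."""
    if unmatched not in ("leave", "special", "missing"):
        raise ValueError("unmatched must be 'leave', 'special' or 'missing'")
    recoded = []
    for value in values:
        for (lower, upper), code in zip(intervals, codes):
            if lower <= value < upper:      # upper limit itself is excluded
                recoded.append(code)
                break
        else:
            # No interval contained the value.
            if unmatched == "leave":
                recoded.append(value)
            elif unmatched == "special":
                recoded.append(special_code)
            else:
                recoded.append(MISSING)
    return recoded


intervals = [(10, 20), (25, 35)]
print(recode([10, 19.9, 20, 35], intervals, codes=[1, 2]))
# [1, 1, 20, 35]
print(recode([10, 19.9, 20, 35], intervals, [1, 2], "special", 99))
# [1, 1, 99, 99]
```

Note that 20 and 35 are not recoded: each equals an upper limit, which is outside its interval.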



Sort 

Object of Program 

This program allows the data matrix, or individual subfiles of the data matrix, to be sorted 
according to the values of one variable. For example, suppose you have five observations of 
three variables, say height, weight and age and want to arrange the observations in ascending 
order according to age. This is accomplished by sorting the data matrix according to variable 
three. The data may be sorted in ascending or descending order. 

If you want to perform a hierarchical sort, the sort procedure must be used successively. For 
example, suppose you wish to sort a data set on weight and within weight by age. To do this, 
you should first sort on age and then use the sort procedure again and sort on weight. The 
sort procedure also sorts either in ascending or descending order. A sort in ascending order 
will place the observations in order from lowest to highest based on the variable sorted. A 
descending-order sort will put the observations in order from highest to lowest. 
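
The successive-sort technique can be sketched in Python. It relies on the second sort leaving observations with equal weight in their existing (age) order, that is, on a stable sort. Python's sorted is stable; whether the library's sort behaves the same way is an assumption of this sketch, and the data are invented for the illustration.

```python
# Each observation is (height, weight, age).
observations = [
    (70, 165, 30),
    (72, 170, 21),
    (69, 165, 25),
    (73, 160, 19),
    (71, 165, 22),
]

# Step 1: sort on the minor key (age).
by_age = sorted(observations, key=lambda obs: obs[2])
# Step 2: sort on the major key (weight); ties keep their age order.
by_weight_then_age = sorted(by_age, key=lambda obs: obs[1])

for obs in by_weight_then_age:
    print(obs)
# (73, 160, 19)
# (71, 165, 22)
# (69, 165, 25)
# (70, 165, 30)
# (72, 170, 21)
```

For a descending sort in this sketch, pass reverse=True to sorted.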






Special Considerations 

Subfile Structure Options 

If subfiles are ignored, the entire data set will be sorted and, in the process, the composition of 
the subfiles is subject to change. The option of sorting certain subfiles may be used to sort a 
single subfile or a set of successive subfiles according to one variable. The option of sorting all 
subfiles may be used to sort each and every subfile. The options of sorting certain subfiles and 
sorting all subfiles treat each subfile as if it were a separate data set. Thus, the sort is done 
with respect to one subfile at a time. 

What Happens 

It is important to note that entire observations are moved when the sort is carried out. Thus, 
referring to the example given in the Object of Program section above, a person's height and 
weight remain with the person's age as shown below. 

Original Data Set

                           Variable
                   Height   Weight   Age
               1     72      170      21
               2     70      165      25
 Observation   3     69      150      20
               4     70      165      25
               5     73      160      19



Data Set Sorted by Age

                           Variable
                   Height   Weight   Age
               1     73      160      19
               2     69      150      20
 Observation   3     72      170      21
               4     70      165      25
               5     70      165      25




Subfiles 

Object of Program 

This program allows you to specify subfiles or logical groupings of the observations. This may 
be accomplished by entering the number of observations in each subfile or by entering the 
observation number of the first observation in each subfile. A third option is to create subfiles 
for each level of a specified variable. Names for the subfiles are entered in all cases. A fourth 
option allows you to destroy the existing subfile structure. 
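
The first two ways of describing a subfile structure carry the same information, as the Python sketch below shows. The function names are hypothetical and the sketch is not the library's code.

```python
from itertools import accumulate


def starts_from_counts(counts):
    """First observation number (1-based) of each subfile, from its size."""
    if not counts:
        return []
    return [1] + [1 + total for total in accumulate(counts)][:-1]


def counts_from_starts(starts, n_obs):
    """Size of each subfile, from the first observation numbers."""
    ends = starts[1:] + [n_obs + 1]
    return [end - start for start, end in zip(starts, ends)]


def subfile_numbers(counts):
    """Subfile number of every observation, in order."""
    return [k for k, count in enumerate(counts, start=1) for _ in range(count)]


print(starts_from_counts([2, 3]))     # [1, 3]
print(counts_from_starts([1, 3], 5))  # [2, 3]
print(subfile_numbers([2, 3]))        # [1, 1, 2, 2, 2]
```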

Special Considerations 

Use of Subfiles 

Subfiles may be created in order to specify logical groupings of observations. A subfile struc- 
ture allows you to consider each subfile as a separate data set or to lump all the subfiles 
together and analyze the overall data set. For example, suppose you want to determine the 
output generated each day by each of three shifts. You would like to analyze the data separ- 
ately for each of the three shifts as well as for the work force as a whole. You could form three 
separate data sets and do the individual analyses, then later join the three sets together for the 
overall analysis. However, since the same variables were measured for each of the shifts, the 
situation is well handled by specifying a subfile for each shift. The subfile structure options 
make it possible to do the analysis by subfile as well as for the overall data set. 



Change Names 



Object of Program 

This program allows you to rename the data set, to rename variables and/or to rename sub- 
files. These names are then stored, along with the data, on the program medium's scratch file 
("DATA"). You may change a single variable or subfile name, or you may change a set of 
names. 



Store Data 

Object of Program 

This program allows you to store the entire data matrix and related information in a file so 
that it may be retrieved at a later date for further analysis. Alternatively, a subset of the data 
matrix may be stored by specifying which variables and/or subfiles are to be saved. 






Special Considerations 

Use of Program 

The store feature will be useful in two different situations. First, if an investigator has a data 
set which he may want to analyze further at a later date, he may store it and retrieve it later 
via the Basic Statistics and Data Manipulation Start routine. Secondly, if several people have
access to the data input programs, it becomes mandatory that each be able to store his data 
set in a unique place. Note that if only one person uses the routine on one data set it is 
unnecessary to use the store feature since the data and related information are kept in 
"DATA" - the scratch file on the program medium. 

Protecting Existing Data 

The existence of a file is checked in the program in an attempt to avoid the accidental loss of 
existing data. Thus, when a file is specified to receive the data, an attempt is made to ensure 
that you are not accidentally storing the new data in a file which you did not know existed. 



List 

Object of Program 

This program allows you to obtain a listing of the data matrix. The listing will appear on the 
device that has been specified for hard-copy in the Start routine or in the Output Unit routine. 
You can list all the data, or a specified subset of the data. You may also specify how you want 
the data listed, i.e., by observation, by variable, etc. 



Join 

Object of Program 

This program allows you to join or combine two data sets into a single unit. One data set must
be in memory and the other data set must have been previously stored by the Basic Statistics 
and Data Manipulation program. Two options are available. First, observations may be added 
together (if both sets have the same number of variables). Second, variables may be added 
together (if both sets have the same number of observations). A check is made in the program 
to make sure the two sets can be joined. Also, summary information on both data sets is 
printed before the joining operation is performed. Thus, the joining can be aborted if the 
resultant set will not be as expected. 






Special Considerations 

Adding Observations 

Suppose data on six variables was gathered in each of the 52 weeks in 1975, analyzed, and 
stored on an auxiliary data disc. Suppose the same variables were measured in 1976, analy- 
zed, and stored. If you are interested in lumping the two sets of data together for an overall 
analysis, you may use the Add Observations option of the joining routine. One set of data 
must be retrieved via the Start routine. Then, after entering the Join routine, the second set 
may be retrieved and the joining carried out. Notice that the variables must be in the same 
order in the two data sets. 

Adding Variables 

Suppose you measured five variables on each of 50 subjects in an experiment. These were 
analyzed and stored on disc. Later, you realize that three more variables are of interest. You 
measure these variables on the subjects in the same order as before and analyze them. All 
eight variables measured on each subject could be combined into a single data set via the 
joining routine. 
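
The two joining options and their compatibility checks can be sketched in Python. Each data set is assumed to be a non-empty list of observations, each observation a list of variable values; the function names are hypothetical.

```python
def add_observations(set1, set2):
    """Append the observations of set2 to set1 (same variables, same order)."""
    if len(set1[0]) != len(set2[0]):
        raise ValueError("data sets must have the same number of variables")
    return set1 + set2


def add_variables(set1, set2):
    """Append the variables of set2 to each observation of set1."""
    if len(set1) != len(set2):
        raise ValueError("data sets must have the same number of observations")
    return [obs1 + obs2 for obs1, obs2 in zip(set1, set2)]


year_1975 = [[1, 2], [3, 4]]
year_1976 = [[5, 6]]
print(add_observations(year_1975, year_1976))  # [[1, 2], [3, 4], [5, 6]]

first_five = [[1, 2], [3, 4]]
later_three = [[9], [8]]
print(add_variables(first_five, later_three))  # [[1, 2, 9], [3, 4, 8]]
```

The sketch cannot check that the variables (or subjects) are in the same order in both sets; as noted above, that remains your responsibility.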

Subfiles 

If variables are added, the subfile structure assigned to the resultant data set is the subfile 
structure of data set #1, that is, the data set that is in machine memory prior to the joining 
operation. If observations are added, the following procedures are employed: 1) If no subfiles 
exist in either data set, the resultant data set has no subfiles. 2) If data set #1 has no subfiles, 
but data set #2 does, then a subfile named "SET #1" is created which consists of data set #1 
and the subfiles of data set #2 remain unchanged. 3) If data set #1 contains subfiles, but data 
set #2 does not, then a subfile named "SET #2" is created which consists of data set #2 and 
the subfiles of data set #1 remain unchanged. 4) If both data sets contain subfiles, all of the 
subfiles of data set #1 are retained and as many subfiles of data set #2 are retained as 
possible - the upper limit of total subfiles for the resultant set being determined by the prog- 
ram limitations (see Special Considerations of Basic Statistics and Data Manipulation). 
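
The four rules for adding observations can be summarized in a Python sketch. Subfiles are represented as (name, number of observations) pairs, and max_subfiles is a hypothetical stand-in for the program limitation, whose actual value is not given in this section. The sketch covers only the resulting subfile list.

```python
def joined_subfiles(subfiles1, n_obs1, subfiles2, n_obs2, max_subfiles):
    """Subfile structure after adding the observations of set #2 to set #1."""
    if not subfiles1 and not subfiles2:
        return []                                        # rule 1
    if not subfiles1:
        return [("SET #1", n_obs1)] + subfiles2          # rule 2
    if not subfiles2:
        return subfiles1 + [("SET #2", n_obs2)]          # rule 3
    room = max(max_subfiles - len(subfiles1), 0)         # rule 4
    return subfiles1 + subfiles2[:room]


shifts = [("DAY", 10), ("NIGHT", 12)]
print(joined_subfiles([], 30, shifts, 22, max_subfiles=5))
# [('SET #1', 30), ('DAY', 10), ('NIGHT', 12)]
print(joined_subfiles(shifts, 22, [("A", 5), ("B", 6)], 11, max_subfiles=3))
# [('DAY', 10), ('NIGHT', 12), ('A', 5)]
```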



Printer Is 

Object of Program 

This program allows you to specify the device on which the hard-copy output will be printed, or 
conversely, to specify that no hard-copy is desired, i.e., that output be directed to the CRT. 



Special Considerations 

The hard copy option can be changed in two ways: 



1. Select "PRINTER" key when you are asked to "SELECT ANY KEY". 

2. This option can only be used when the program is not expecting an answer. For example,
when Notes are displayed on the CRT and you are asked to press ( CONTINUE ) when ready.
The printer may be changed as follows:

For Non-HP-IB Printer:

1. Type: Hc = (the select code of the desired printer) ( EXECUTE )

2. Type: Hcbus = 993 ( EXECUTE )

For HP-IB Printer:

1. Type: Hc = (the select code of the desired printer) ( EXECUTE )

2. Type: Hcbus = (the bus address of the HP-IB device) ( EXECUTE )



Select and Scan 

Object of Program 

This program allows you to look at a portion of your data set that satisfies a conditional 
statement. If you are scanning the data set, your output will include the observation numbers 
satisfying the scanning criterion and their distribution throughout the subfile structure. The 
data set which you are scanning will remain unaltered. When using the select option, your 
output will be the same as scanning, but the data set will be reduced to just those observations 
satisfying the selection criterion. Remember, the BACKUP file (explained in Special Consid- 
erations of Basic Statistics and Data Manipulation) will contain the original data set. The selec- 
tion and scanning procedure may be performed over all subfiles or over a user-specified 
subset of the data. 



Special Considerations

There are four different scanning or selection criteria offered in this routine. Explanations of 
each conditional statement follow. 

One Variable 

This option will allow you to "edit" your data set based on specified values for one chosen 
variable. For example, you may scan (or select from) your data set based on variable number 
two and have the routine report the observations where variable two has any of the following 
values: 1, 2.6, 4-8. 

Variable A OR Variable B 

This option will allow you to "edit" your data set based on specified values of two chosen 
variables. An OR operation links the two variables. For example, if two of your variables are 
temperature and humidity, you may want to select (or scan) all observations that have a 
temperature of 70-80 degrees, OR have a humidity level of 50-80. 

Variable A AND Variable B 

This option performs much like the OR option except it uses an AND operator. For example,
you may want to select (or scan) all observations that have a temperature of 72 degrees AND 
a humidity level of 50-80. 






Variable A = Variable B 

In this case the observations that would be selected (or scanned) are the observations where 
Variable A has the same value as Variable B. For example, you might want to know which 
observations have equal temperature and humidity level. 
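
The four criteria, and the difference between scanning and selecting, can be sketched in Python. The names below are hypothetical, the data are invented, and treating a range such as 70-80 as including both end points is an assumption.

```python
def in_ranges(value, ranges):
    """True if `value` lies in any (low, high) range, end points included."""
    return any(low <= value <= high for low, high in ranges)


def scan(observations, criterion):
    """Observation numbers satisfying the criterion; data are unaltered."""
    return [number for number, obs in enumerate(observations, start=1)
            if criterion(obs)]


def select(observations, criterion):
    """The reduced data set: only observations satisfying the criterion."""
    return [obs for obs in observations if criterion(obs)]


# Each observation is (temperature, humidity).
data = [(72, 55), (85, 60), (65, 40), (75, 75)]


def temp_ok(obs):
    return in_ranges(obs[0], [(70, 80)])


def humidity_ok(obs):
    return in_ranges(obs[1], [(50, 80)])


print(scan(data, temp_ok))                                      # [1, 4]
print(scan(data, lambda obs: temp_ok(obs) or humidity_ok(obs)))   # [1, 2, 4]
print(scan(data, lambda obs: temp_ok(obs) and humidity_ok(obs)))  # [1, 4]
print(scan(data, lambda obs: obs[0] == obs[1]))                 # [4]
print(select(data, lambda obs: obs[0] == obs[1]))               # [(75, 75)]
```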



Basic Statistics 

Object of Program 

This program computes a variety of summary statistics for data which was entered via the Start 
routine of Basic Statistics and Data Manipulation. The statistics may be computed by subfile or 
for the entire data set (ignoring subfiles). Basic statistics which are computed include: number 
of observations, number of missing values, sum, mean, variance, standard deviation, coeffi- 
cient of skewness, coefficient of kurtosis, coefficient of variation, standard error of the mean, 
and a confidence interval on the mean. An option is available to compute a correlation matrix 
for data sets having more than one variable. Order statistics computed include: the maximum, 
the minimum, range, and midrange. Additional order statistics which may be computed in- 
clude: the median, 25th percentile, 75th percentile, Tukey's middlemeans, and user-specified 
percentiles. These statistics are divided into three groups. You may specify any or all of the 
groups for output. 

Special Considerations 

Parser on Statistics Options 

Three options for statistics will be offered. They are 1) the common summary statistics, 2) the 
correlation matrix, and 3) the order statistics such as maximum, minimum, median, etc. You may
respond "ALL" to the prompt asking you for your choice of options. Or, you may choose a 
portion of the options by responding as documented in the General Information section of 
Basic Statistics and Data Manipulation e.g., 1-2. 

Data Type 

If the data input type is not "RAW DATA", the Basic Statistics may not be computed. For 
example, Basic Statistics cannot be computed if the covariance matrix was entered as data. 

Hard-Copy Output 

If a hard copy of the statistics is not being made, the program halts occasionally so that you may 
study the results on the CRT. In this case, simply press CONTINUE to continue program 
execution. 

Additional Order Statistics 

If the option to obtain additional order statistics (Tukey's middlemeans and percentiles) is 
exercised, the data matrix is sorted and the observations of each variable are arranged in 
ascending order. At the end of the program the original data matrix is re-loaded into memory. 
Thus, if the program is aborted, that is, if the program is stopped before the reloading can 
occur, the data matrix will be in the sorted state. So, if the portion of the program used to 
calculate additional order statistics is accessed, abortion of the program is discouraged. 



Methods and Formulae

Variance: The best unbiased estimator is calculated by these programs, i.e., the denominator
in the formula is N-1, where N is the number of observations used in the calculation.

Correlations: Suppose you have the following data matrix:

                       OBSERVATION
                    1    2    3    4    5
               1    5    M    3    4    5
   VARIABLE    2    6    7    M    6    4
               3    1    3    2    1    1




Here, an M denotes a missing value. When computing the correlation between variables 1 
and 2, we discard observations 2 and 3 since variable 1 is missing a data value for observation 
2 and variable 2 is missing the data value for observation 3. However, when computing the 
correlation between variables 1 and 3, we need only discard observation 2. Similarly, the 
correlation between variables 2 and 3 is computed by discarding observation 3. Hence, the 
correlations may be based on different numbers of observations. An observation is thrown out 
if a data value from that observation is missing from one of the two variables for which the 
correlation is being computed. 
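
The pairwise discarding of observations can be sketched in Python. The sketch assumes the usual product-moment correlation coefficient, uses None for a missing value, and the function name is hypothetical. It does not guard against a variable with zero variance.

```python
import math

M = None  # missing value marker for this sketch


def pairwise_correlation(x, y):
    """Correlation of x and y over observations where neither is missing.

    Returns (correlation, number of observations used).
    """
    pairs = [(a, b) for a, b in zip(x, y) if a is not M and b is not M]
    n = len(pairs)
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    return sxy / math.sqrt(sxx * syy), n


var1 = [5, M, 3, 4, 5]
var2 = [6, 7, M, 6, 4]
var3 = [1, 3, 2, 1, 1]

print(pairwise_correlation(var1, var2)[1])  # 3 (observations 2 and 3 discarded)
print(pairwise_correlation(var1, var3)[1])  # 4 (observation 2 discarded)
print(pairwise_correlation(var2, var3)[1])  # 4 (observation 3 discarded)
```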

Tukey's Middlemeans 

Midmean: The midmean is the sum of all observations between (and including, if applicable) 
the 25th and 75th percentiles divided by the number of observations between those two 
percentiles. That is, it is the mean of all observations between the 25th and 75th percentiles. 

Trimean: The trimean is a weighted average of the median and the 25th and 75th percentiles: 
(1/4) (25th percentile + 2(median) + 75th percentile). 

Midspread: The midspread is the difference between the 75th and 25th percentiles: 

75th percentile - 25th percentile. 
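
The three definitions can be sketched in Python. The percentile rule used here (the "inclusive" method of Python's statistics.quantiles) is an assumption; the manual does not state how the program interpolates percentiles, so results may differ slightly from the program's output.

```python
import statistics


def tukey_middlemeans(values):
    """Midmean, trimean and midspread of a list of at least two values."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    # Observations between (and including) the 25th and 75th percentiles.
    middle = [value for value in values if q1 <= value <= q3]
    return {
        "midmean": sum(middle) / len(middle),
        "trimean": (q1 + 2 * median + q3) / 4,
        "midspread": q3 - q1,
    }


print(tukey_middlemeans([1, 2, 3, 4, 5, 6, 7, 8, 9]))
# {'midmean': 5.0, 'trimean': 5.0, 'midspread': 4.0}
```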



Go To Advanced Stat 

Objective 

This procedure loads a file which prompts you to remove the BSDM program medium and 
insert the desired advanced statistics program medium into the mass storage device. You press 
CONTINUE after you have made this change. The new routines are then prepared to carry on 
further analyses on the data set in memory. 






Return To BSDM 

Objective 

This procedure operates in the reverse of "Go To Advanced Stat" and should be used when 
you wish to return to the BSDM routines from an advanced statistics routine. 



Backup 

Objective 

This routine allows you to transfer the original data which is stored in the file called "BACK- 
UP" to the program scratch file called "DATA". You might find this useful in a case where the 
data currently in the "DATA" file is not the data you wish to be analyzing. This could occur, 
for example, if you inadvertently stored a transformed variable in place of one of your original
variables. Note that no operations, including editing, are performed on the data stored on the 
"BACKUP" file. 

This routine also allows you to transfer the data set in the opposite direction. That is, you may 
transfer the data stored in "DATA" to the "BACKUP" file. You might choose to do this after 
you have edited the original data set but before you perform any other operations. Then, the 
"BACKUP" file would contain the corrected original data without any further manipulations 
or modifications. 






Examples 

Example 1 

This is a hypothetical set of data from a non-existent factory. The purpose of this example is to 
show the use, in part, of the LIST, EDIT, TRANSFORM, SORT, SUBFILE, and STATS routines.



BASIC STATISTICS AND DATA MANIPULATION 



[Answer all yes/no questions with Y/N]

Are you going to user defined transformation or do Non-linear regression ? (Y/N)
N
Are you using an HP-IB Printer?
YES
Enter select code, bus address (if 7,1 press CONT)       We input these values separated by a
                                                          comma or press CONTINUE if default
                                                          (7,1) is correct

******************************************************************************
*                             DATA MANIPULATION                              *
******************************************************************************

Enter DATA TYPE:
1                                                         Raw data
Mode number = ?                                           The data will be entered by typing it in on
                                                          the keyboard.
Project title for this data set (<= 80 characters)
HYPOTHETICAL FACTORY DATA                                 Title
Number of variables = ?                                   Nv
Number of observations/variable = ?
17                                                        No
Variable # 1 name (<= 10 characters) = ?
TEMP(C)                                                   Label for variable 1
Variable # 2 name (<= 10 characters) = ?
PRODUCTION                                                Label for variable 2
Variable # 3 name (<= 10 characters) = ?
DAYS                                                      Label for variable 3
Variable # 4 name (<= 10 characters) = ?
PAYROLL                                                   Label for variable 4
Variable # 5 name (<= 10 characters) = ?
WATER USE                                                 Label for variable 5
Is above information correct?
YES                                                       Approve information on CRT (shown
                                                          below).




HYPOTHETICAL FACTORY DATA

Data file name:

Data type is: Raw data

Number of observations: 17
Number of variables: 5

Variable names:
1. TEMP(C)
2. PRODUCTION
3. DAYS
4. PAYROLL
5. WATER USE

Do you want to enter data one case at a time, i.e., by observation?
                                                          All variables will be entered separated by
                                                          commas.
Observation # 1 , all variables (separated by commas) = ?
14.9,6396,21,134,3373
Observation # 2 , all variables (separated by commas) = ?
18.4,5736,22,146,3110
Observation # 3 , all variables (separated by commas) = ?
21.6,6116,22,158,3180
Observation # 4 , all variables (separated by commas) = ?
25.2,8287,20,171,3293
Observation # 5 , all variables (separated by commas) = ?
26.3,13313,25,198,3390
Observation # 6 , all variables (separated by commas) = ?
27.2,13108,23,194,4287
Observation # 7 , all variables (separated by commas) = ?
22.2,10768,20,180,3852
Observation # 8 , all variables (separated by commas) = ?
17.1,12173,23,191,3366
Observation # 9 , all variables (separated by commas) = ?
12.5,11390,20,195,3532
Observation # 10 , all variables (separated by commas) = ?
6.9,12707,20,192,3614
Observation # 11 , all variables (separated by commas) = ?
6.4,15022,22,200,3896
Observation # 12 , all variables (separated by commas) = ?
13.3,13114,19,211,3437
Observation # 13 , all variables (separated by commas) = ?
18.2,12257,22,203,3324
Observation # 14 , all variables (separated by commas) = ?
22.8,13118,22,197,3214
Observation # 15 , all variables (separated by commas) = ?
26.1,13100,21,196,4345
Observation # 16 , all variables (separated by commas) = ?
26.3,16716,21,205,4936
Observation # 17 , all variables (separated by commas) = ?
4.2,14056,22,205,3624

PROGRAM NOW STORING DATA ON SCRATCH DATA FILE AND BACKUP FILE.



SELECT ANY KEY                                            Select Special Function Key-LIST

Option number = ?                                         List all the data

LIST

Enter method for listing data:
3                                                         In tabular form

HYPOTHETICAL FACTORY DATA
Data type is: Raw data





         Variable # 1    Variable # 2    Variable # 3    Variable # 4    Variable # 5
         (TEMP(C)   )    (PRODUCTION)    (DAYS      )    (PAYROLL   )    (WATER USE )
OBS#
   1         14.90000      6396.00000        21.00000       134.00000      3373.00000
   2         18.40000      5736.00000        22.00000       146.00000      3110.00000
   3         21.60000      6116.00000        22.00000       158.00000      3180.00000
   4         25.20000      8287.00000        20.00000       171.00000      3293.00000
   5         26.30000     13313.00000        25.00000       198.00000      3390.00000
   6         27.20000     13108.00000        23.00000       194.00000      4287.00000
   7         22.20000     10768.00000        20.00000       180.00000      3852.00000
   8         17.10000     12173.00000        23.00000       191.00000      3366.00000
   9         12.50000     11390.00000        20.00000       195.00000      3532.00000
  10          6.90000     12707.00000        20.00000       192.00000      3614.00000
  11          6.40000     15022.00000        22.00000       200.00000      3896.00000
  12         13.30000     13114.00000        19.00000       211.00000      3437.00000
  13         18.20000     12257.00000        22.00000       203.00000      3324.00000
  14         22.80000     13118.00000        22.00000       197.00000      3214.00000
  15         26.10000     13100.00000        21.00000       196.00000      4345.00000
  16         26.30000     16716.00000        21.00000       205.00000      4936.00000
  17          4.20000     14056.00000        22.00000       205.00000      3624.00000



Option number = 



SELECT ANY KEY 



Select option desired 



1 



EDIT ROUTINES 



Observation number (enter 'NONE' when done) =



11 



Variable number = ? 



Old value = 15022 — Correct value = 
? 

15024 

OBS   VAR       OLD            NEW
 #     #       VALUE          VALUE

11     2    15022.00000    15024.00000

Observation number (enter 'NONE' when done) = 



Exit List routine 



Select Special Function Key-EDIT 



Choose to correct a data value. 



At observation #11 



For variable 2 



Should be 15024 






NONE 

Select option desired 



Which observations are to be deleted ?

10

Observation # 10 deleted.

16 observations remain.
Select option desired :

Delete an observation

Are observations ordered, i.e. should additions be inserted ?

NO

Add an observation

Add at the end

How many observations are to be added ?

1

Enter observation # 17 (variables separated by commas)

?

4.2,12707,20,192,3614

Observation # 17 Variable # 1 = 4.2

Observation # 17 Variable # 2 = 12707

Observation # 17 Variable # 3 = 20

Observation # 17 Variable # 4 = 192

Observation # 17 Variable # 5 = 3614

Total number of observations now = 17
Select option desired :

New observation #17

PROGRAM NOW UPDATING SCRATCH DATA FILE
SELECT ANY KEY 
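
The three edit operations shown above (correct a value, delete an observation, add an observation at the end) can be pictured with a small Python sketch. This is only an illustration of the bookkeeping, not the program's own code: the list `data` stands in for the scratch data file, and the function names are hypothetical.

```python
# Illustrative stand-in for the scratch data file: one row per observation,
# columns are TEMP(C), PRODUCTION, DAYS, PAYROLL, WATER USE.
data = [
    [14.9, 6396, 21, 134, 3373], [18.4, 5736, 22, 146, 3110],
    [21.6, 6116, 22, 158, 3180], [25.2, 8287, 20, 171, 3293],
    [26.3, 13313, 25, 198, 3390], [27.2, 13108, 23, 194, 4287],
    [22.2, 10768, 20, 180, 3852], [17.1, 12173, 23, 191, 3366],
    [12.5, 11390, 20, 195, 3532], [6.9, 12707, 20, 192, 3614],
    [6.4, 15022, 22, 200, 3896], [13.3, 13114, 19, 211, 3437],
    [18.2, 12257, 22, 203, 3324], [22.8, 13118, 22, 197, 3214],
    [26.1, 13100, 21, 196, 4345], [26.3, 16716, 21, 205, 4936],
    [4.2, 14056, 22, 205, 3624],
]


def correct_value(rows, obs, var, new_value):
    """Replace one value; obs and var are 1-based, as in the dialog."""
    old_value = rows[obs - 1][var - 1]
    rows[obs - 1][var - 1] = new_value
    return old_value


def delete_observation(rows, obs):
    """Remove one observation; later observations move up by one."""
    del rows[obs - 1]


def add_observation(rows, values):
    """Append a new observation at the end of the data set."""
    rows.append(list(values))


old = correct_value(data, 11, 2, 15024)          # observation 11, variable 2
delete_observation(data, 10)                     # old observation 11 becomes 10
add_observation(data, [4.2, 12707, 20, 192, 3614])
```

After these three calls the list again holds 17 observations, with the corrected value 15024 now sitting in observation 10.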



LIST 



Option number = ?

1 

Enter Method for listing data: 

3 



Exit Edit routines 

Select Special Function Key-LIST 
List all the data 
In tabular form 



HYPOTHETICAL FACTORY DATA

Data type is: Raw data

          Variable # 1   Variable # 2   Variable # 3   Variable # 4   Variable # 5
          (TEMP(C)   )   (PRODUCTION)   (DAYS      )   (PAYROLL   )   (WATER USE )
OBS#
   1          14.90000     6396.00000       21.00000      134.00000     3373.00000
   2          18.40000     5736.00000       22.00000      146.00000     3110.00000
   3          21.60000     6116.00000       22.00000      158.00000     3180.00000
   4          25.20000     8287.00000       20.00000      171.00000     3293.00000
   5          26.30000    13313.00000       25.00000      198.00000     3390.00000
   6          27.20000    13108.00000       23.00000      194.00000     4287.00000
   7          22.20000    10768.00000       20.00000      180.00000     3852.00000
   8          17.10000    12173.00000       23.00000      191.00000     3366.00000
   9          12.50000    11390.00000       20.00000      195.00000     3532.00000
  10           6.40000    15024.00000       22.00000      200.00000     3896.00000
  11          13.30000    13114.00000       19.00000      211.00000     3437.00000
  12          18.20000    12257.00000       22.00000      203.00000     3324.00000
  13          22.80000    13118.00000       22.00000      197.00000     3214.00000
  14          26.10000    13100.00000       21.00000      196.00000     4345.00000
  15          26.30000    16716.00000       21.00000      205.00000     4936.00000
  16           4.20000    14056.00000       22.00000      205.00000     3624.00000
  17           4.20000    12707.00000       20.00000      192.00000     3614.00000






Option number = ?



SELECT ANY KEY 



Enter option number desired 



Name of data file = ?

HYPO:INTERNAL

Is data medium placed in device INTERNAL

?

YES

PROGRAM NOW STORING DATA ON HYPO:INTERNAL

Is program medium replaced in device?

YES

Enter option number desired :



SELECT ANY KEY 



Exit List routine 



Select Special Function Key labeled-STORE 



Store all the data 



On this file on our floppy 



Exit Store routine 



Select option desired : 



Transformation number = ?



TRANSFORMATION ROUTINES

Select Special Function Key labeled-TRANSFORM
Algebraic transformations



Variable number corresponding to X = ?



a*(X^b) + c



Parameter a = ?

.2642
Parameter b = ?

1

Parameter c = ?



To convert liters to gallons
X6 = .2642 X5



Store transformed data in Variable # (<= 6 )
?



Variable name (<= 10 characters) = ?

GALLONS

Is above information correct?



X6 now called GALLONS.



YES

press 'CONTINUE' when ready

The following transformation was performed: a*(X^b)+c
     where a = .2642
           b = 1
           c = 0

X is Variable # 5
Transformed data is stored in Variable # 6 (GALLONS)
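
The algebraic transformation a*(X^b)+c used above can be checked by hand with a few lines of Python. This is a minimal sketch under the parameters shown in the dialog (a = .2642, b = 1, c = 0); the function name `algebraic_transform` is made up for the illustration and is not part of the package.

```python
def algebraic_transform(x, a, b, c):
    """Return a * (x ** b) + c for a single data value."""
    return a * (x ** b) + c


# Three WATER USE values (liters) taken from the listings in this example.
water_use = [3437, 3852, 3293]

# Liters to gallons: a = .2642, b = 1, c = 0.
gallons = [algebraic_transform(x, 0.2642, 1, 0) for x in water_use]
```

The results agree, to the five decimals printed, with the GALLONS column listed later in this example (908.05540, 1017.69840 and 870.01060).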



Select option desired : 



PROGRAM NOW UPDATING SCRATCH DATA FILE 

SELECT ANY KEY 



Exit transformation routine 






SORT ROUTINES 



ENTER OPTION NUMBER DESIRED 



Number of the Variable on which to sort



Select Special Function Key labeled-SORT 
Sort in ascending order 



3 On variable 3 (Days in month) 

Data set :

HYPOTHETICAL FACTORY DATA
has been arranged in ascending order according to Variable # 3
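
As a rough picture of what the sort does, the sketch below orders a few observations by variable 3 (DAYS) in Python. It is an illustration only: Python's `sorted` keeps tied rows in their original order, which is an assumption of this sketch and need not match the order in which the program lists observations with equal values.

```python
# A six-observation excerpt: TEMP(C), PRODUCTION, DAYS, PAYROLL, WATER USE.
rows = [
    [14.9, 6396, 21, 134, 3373],
    [25.2, 8287, 20, 171, 3293],
    [26.3, 13313, 25, 198, 3390],
    [22.2, 10768, 20, 180, 3852],
    [13.3, 13114, 19, 211, 3437],
    [18.2, 12257, 22, 203, 3324],
]

SORT_VARIABLE = 3  # 1-based variable number, as entered in the dialog

# Whole observations move together; only the key column decides the order.
by_days = sorted(rows, key=lambda row: row[SORT_VARIABLE - 1])
```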



ENTER OPTION NUMBER DESIRED 







PROGRAM NOW UPDATING SCRATCH DATA FILE 
SELECT ANY KEY 

Option number = ?

1

Enter method for listing data:

3



LIST 



Exit sort routine 

Select Special Function Key labeled-LIST 
List all the data 

In tabular form 



HYPOTHETICAL FACTORY DATA

Data type is: Raw data

          Variable # 1   Variable # 2   Variable # 3   Variable # 4   Variable # 5
          (TEMP(C)   )   (PRODUCTION)   (DAYS      )   (PAYROLL   )   (WATER USE )
OBS#
   1          13.30000    13114.00000       19.00000      211.00000     3437.00000
   2          22.20000    10768.00000       20.00000      180.00000     3852.00000
   3          25.20000     8287.00000       20.00000      171.00000     3293.00000
   4           4.20000    12707.00000       20.00000      192.00000     3614.00000
   5          12.50000    11390.00000       20.00000      195.00000     3532.00000
   6          26.30000    16716.00000       21.00000      205.00000     4936.00000
   7          26.10000    13100.00000       21.00000      196.00000     4345.00000
   8          14.90000     6396.00000       21.00000      134.00000     3373.00000
   9           6.40000    15024.00000       22.00000      200.00000     3896.00000
  10          21.60000     6116.00000       22.00000      158.00000     3180.00000
  11          18.20000    12257.00000       22.00000      203.00000     3324.00000
  12          22.80000    13118.00000       22.00000      197.00000     3214.00000
  13          18.40000     5736.00000       22.00000      146.00000     3110.00000
  14           4.20000    14056.00000       22.00000      205.00000     3624.00000
  15          27.20000    13108.00000       23.00000      194.00000     4287.00000
  16          17.10000    12173.00000       23.00000      191.00000     3366.00000
  17          26.30000    13313.00000       25.00000      198.00000     3390.00000

          Variable # 6
          (GALLONS   )
OBS#
   1         908.05540
   2        1017.69840
   3         870.01060
   4         954.81880
   5         933.15440
   6        1304.09120
   7        1147.94900
   8         891.14660
   9        1029.32320
  10         840.15600
  11         878.20080
  12         849.13880
  13         821.66200
  14         957.46080
  15        1132.62540
  16         889.29720
  17         895.63800



Option number = ?

SELECT ANY KEY



SUBFILE



Option number = ?



Exit list routine
Select Special Function Key labeled-SUBFILES



Number of subfiles (<=20) = ?



Select method of subfile specifications
which asks you to enter the first observation
in each subfile.



Name of Subfile # 1 (<= 10 characters) =
?

FY '76

Name of Subfile # 2 (<= 10 characters) =
?



Subfile # 2 : number of first observation
?

13

Is the above information correct ?

YES

Subfile name:   beginning observation -- number of observations

1. FY '76               1                     12

2. FY '77              13                      5
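
Specifying subfiles by first observation amounts to cutting the data set at those observation numbers. The Python sketch below shows one way to express that; the function `split_subfiles` and its arguments are hypothetical names for the illustration, and the observations are represented simply by their numbers 1 to 17.

```python
def split_subfiles(rows, names, first_obs):
    """Cut rows into named subfiles.

    first_obs holds the 1-based number of the first observation of each
    subfile, in ascending order; each subfile runs up to the observation
    just before the next subfile starts.
    """
    bounds = list(first_obs) + [len(rows) + 1]
    return {
        name: rows[bounds[i] - 1 : bounds[i + 1] - 1]
        for i, name in enumerate(names)
    }


observations = list(range(1, 18))  # stand-ins for the 17 observations
subfiles = split_subfiles(observations, ["FY '76", "FY '77"], [1, 13])
```

With first observations 1 and 13, the two subfiles hold 12 and 5 observations, as in the summary above.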



Summary 



Option number = ?



PROGRAM NOW STORING DATA 

SELECT ANY KEY 



Exit subfile routine 



BASIC STATISTICS ROUTINES 



What statistic options are desired ?



Select Special Function Key labeled-STATS 



1 



Mean, Ci, Variance, Standard Deviation, 
Skewness, Kurtosis 



VARIABLES 
? 

ALL Compute statistics for all variables 

Confidence coefficient for confidence interval on the mean (e.g. 90,95,99%) = ?

95 

Option number = ?



What subfiles are desired ?
1



Compute statistics for selected subfiles, 
For FY76 






********************************************************************************

*                             SUMMARY STATISTICS                               *

*                                ON DATA SET:                                  *

*                         HYPOTHETICAL FACTORY DATA                            *



Subfile: FY '76



                               BASIC STATISTICS

VARIABLE      # OF   # OF
NAME          OBS.   MISS           SUM          MEAN        VARIANCE     STD. DEV.
TEMP(C)         12      0      213.7000       17.8083         56.9572        7.5470
PRODUCTION      12      0   138993.0000    11582.7500   10478676.7500     3237.0784
DAYS            12      0      250.0000       20.8333          1.0606        1.0299
PAYROLL         12      0     2242.0000      186.8333        504.5152       22.4614
WATER USE       12      0    43996.0000     3666.3333     274270.7879      523.7087
GALLONS         12      0    11623.7432      968.6453      19144.5508      138.3638

VARIABLE      COEFFICIENT     STD. ERROR       95 % CONFIDENCE INTERVAL
NAME          OF VARIATION    OF MEAN         LOWER LIMIT    UPPER LIMIT
TEMP(C)           42.37903       2.17863         13.01195       22.60471
PRODUCTION        27.94741     934.46405       9525.47409    13640.02591
DAYS               4.94332        .29729         20.17882       21.48784
PAYROLL           12.02217       6.48405        172.55832      201.10834
WATER USE         14.28426     151.18168       3333.49825     3999.16841
GALLONS           14.28426      39.94220        880.71024     1056.58030

VARIABLE          SKEWNESS      KURTOSIS
TEMP(C)            -.53473       -.96332
PRODUCTION         -.42217       -.66250
DAYS               -.18352      -1.18041
PAYROLL           -1.22848        .55306
WATER USE          1.34739        .89749
GALLONS            1.34739        .89749



What statistic options are desired ?

1

VARIABLES=?

ALL



Mean, CI, Variance, Standard Deviation,
Skewness, Kurtosis



Compute statistics for all variables



Confidence coefficient for confidence interval on the mean (e.g. 90,95,99%) = ?

95

Option number = ?



What subfiles are desired ?



Compute statistics for selected subfiles. 
For subfile FY77 






********************************************************************************

*                             SUMMARY STATISTICS                               *

*                                ON DATA SET:                                  *

*                         HYPOTHETICAL FACTORY DATA                            *



Subfile: FY '77



                               BASIC STATISTICS

VARIABLE      # OF   # OF
NAME          OBS.   MISS           SUM          MEAN        VARIANCE     STD. DEV.
TEMP(C)          5      0       93.2000       18.6400         85.7230        9.2587
PRODUCTION       5      0    58386.0000    11677.2000   11481348.7000     3388.4139
DAYS             5      0      115.0000       23.0000          1.5000        1.2247
PAYROLL          5      0      934.0000      186.8000        547.7000       23.4030
WATER USE        5      0    17777.0000     3555.4000     200388.8000      447.6481
GALLONS          5      0     4696.6834      939.3367      13987.4669      118.2686

VARIABLE      COEFFICIENT     STD. ERROR       95 % CONFIDENCE INTERVAL
NAME          OF VARIATION    OF MEAN         LOWER LIMIT    UPPER LIMIT
TEMP(C)           49.67099       4.14060          7.14334       30.13666
PRODUCTION        29.01735    1515.34476       7469.74622    15884.65378
DAYS               5.32498        .54772         21.47921       24.52079
PAYROLL           12.52837      10.46614        157.74009      215.85991
WATER USE         12.59065     200.19431       2999.54742     4111.25258
GALLONS           12.59065      52.89134        792.48043     1086.19293

VARIABLE          SKEWNESS      KURTOSIS
TEMP(C)            -.68247       -.77608
PRODUCTION        -1.35662        .05662
DAYS                .91287       -.50000
PAYROLL           -1.30917        .02054
WATER USE           .91055       -.44827
GALLONS             .91055       -.44827
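
For readers who want to check a line of the BASIC STATISTICS output by hand, the following Python sketch recomputes the entries for one variable. It is not the program's code. It assumes the usual n-1 variance, a coefficient of variation expressed in percent, and moment-based skewness and kurtosis (kurtosis minus 3); these choices reproduce the printed DAYS figures for subfile FY '77, but whether the program uses exactly these formulas is an assumption. The t value 2.776 (95%, 4 degrees of freedom) is taken from a standard t table.

```python
import math


def basic_stats(x, t_crit):
    """Summary statistics for one variable; t_crit is the two-sided t value."""
    n = len(x)
    total = sum(x)
    mean = total / n
    ss = sum((v - mean) ** 2 for v in x)        # sum of squared deviations
    variance = ss / (n - 1)
    std_dev = math.sqrt(variance)
    std_err = std_dev / math.sqrt(n)
    m2 = ss / n                                  # central moments
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return {
        "n": n,
        "sum": total,
        "mean": mean,
        "variance": variance,
        "std_dev": std_dev,
        "cv": 100.0 * std_dev / mean,
        "std_err": std_err,
        "ci": (mean - t_crit * std_err, mean + t_crit * std_err),
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2 - 3.0,
    }


days_fy77 = [22, 22, 23, 23, 25]   # DAYS, the five observations of FY '77
stats = basic_stats(days_fy77, t_crit=2.776)
```

Because the t value is rounded to three decimals, the confidence limits from this sketch agree with the printed ones only to about three decimals.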



What statistic options are desired ?



VARIABLES=
?

ALL

Option number = ?



Correlation matrix



Compute statistics for all variables



What subfiles are desired ?
1,2



Compute statistics on selected subfiles. 






********************************************************************************

*                             SUMMARY STATISTICS                               *

*                                ON DATA SET:                                  *

*                         HYPOTHETICAL FACTORY DATA                            *

********************************************************************************



Subfile: FY '76



                              CORRELATION MATRIX

              PRODUCTION        DAYS     PAYROLL   WATER USE     GALLONS
TEMP(C)        -.1113482    .1627763   -.1007200    .2511888    .2511888
PRODUCTION                  .0081945    .8872541    .6589095    .6589095
DAYS                                   -.1113502   -.0368011   -.0368011
PAYROLL                                             .3820119    .3820119
WATER USE                                                      1.0000000



Subfile: FY '77



                              CORRELATION MATRIX

              PRODUCTION        DAYS     PAYROLL   WATER USE     GALLONS
TEMP(C)        -.0709995    .6614042   -.1292917    .2656162    .2656162
PRODUCTION                  .4116924    .9974909    .5754985    .5754985
DAYS                                    .3924963    .0209757    .0209757
PAYROLL                                             .5259584    .5259584
WATER USE                                                      1.0000000
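
Any entry of the correlation matrix can be verified with the ordinary product-moment (Pearson) correlation. The Python sketch below does this for three pairs of variables from subfile FY '77; it is a hand check written for this example, and the function name `pearson_r` is not something the package defines.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation between two equally long lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# The five observations of subfile FY '77.
days = [22, 22, 23, 23, 25]
payroll = [146, 205, 194, 191, 198]
water_use = [3110, 3624, 4287, 3366, 3390]
gallons = [0.2642 * w for w in water_use]

r_days_payroll = pearson_r(days, payroll)
r_days_water = pearson_r(days, water_use)
r_water_gallons = pearson_r(water_use, gallons)
```

GALLONS is a constant multiple of WATER USE, so the two are perfectly correlated and every other variable has the same correlation with both, which is why the last two columns of each matrix are identical.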



What statistic options are desired ? 



VARIABLES 

ALL 

Option number = ?

i~ 

What subfiles are desired ? 

1 ,2 



Median, Mode, Percentiles, Min, Max, 
Range. 

Compute statistics for all variables 
Compute statistics for selected subfiles. 
Both subfiles 






********************************************************************************

*                             SUMMARY STATISTICS                               *

*                                ON DATA SET:                                  *

*                         HYPOTHETICAL FACTORY DATA                            *



Subfile: FY '76



                               ORDER STATISTICS

VARIABLE          MAXIMUM        MINIMUM          RANGE       MIDRANGE
TEMP(C)          26.30000        4.20000       22.10000       15.25000
PRODUCTION    16716.00000     6116.00000    10600.00000    11416.00000
DAYS             22.00000       19.00000        3.00000       20.50000
PAYROLL         211.00000      134.00000       77.00000      172.50000
WATER USE      4936.00000     3180.00000     1756.00000     4058.00000
GALLONS        1304.09120      840.15600      463.93520     1072.12360

                                TUKEY'S HINGES
VARIABLE           MEDIAN    25-th %-ile    75-th %-ile
TEMP(C)          19.90000       12.90000       24.00000
PRODUCTION    12482.00000     9527.50000    13116.00000
DAYS             21.00000       20.00000       22.00000
PAYROLL         195.50000      175.50000      201.50000
WATER USE      3484.50000     3308.50000     3874.00000
GALLONS         920.60490      874.10570     1023.51080

                             TUKEY'S MIDDLEMEANS
VARIABLE          MIDMEAN        TRIMEAN      MIDSPREAD
TEMP(C)          18.83333       19.17500       11.10000
PRODUCTION    12222.66667    11901.87500     3588.50000
DAYS             20.83333       21.00000        2.00000
PAYROLL         193.33333      192.00000       26.00000
WATER USE      3522.00000     3537.87500      565.50000
GALLONS         930.51240      934.70658      149.40510

Other percentiles(Y/N)?
NO
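
The order statistics for a single variable can be reproduced by hand. The Python sketch below computes them for TEMP(C) in subfile FY '76, taking each hinge as the median of the lower or upper half of the sorted data, the trimean as (lower hinge + 2 x median + upper hinge)/4, and the midspread as the difference between the hinges. These definitions are assumptions of the sketch; they were checked only against the FY '76 TEMP(C) figures and are not taken from the program.

```python
def median(sorted_values):
    """Median of a list that is already sorted."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2


def order_stats(x):
    s = sorted(x)
    n = len(s)
    half = (n + 1) // 2          # each half includes the median when n is odd
    lower_hinge = median(s[:half])
    upper_hinge = median(s[n - half:])
    med = median(s)
    return {
        "maximum": s[-1],
        "minimum": s[0],
        "range": s[-1] - s[0],
        "midrange": (s[-1] + s[0]) / 2,
        "median": med,
        "lower_hinge": lower_hinge,
        "upper_hinge": upper_hinge,
        "trimean": (lower_hinge + 2 * med + upper_hinge) / 4,
        "midspread": upper_hinge - lower_hinge,
    }


# TEMP(C) for the twelve observations of subfile FY '76.
temp_fy76 = [13.3, 22.2, 25.2, 4.2, 12.5, 26.3,
             26.1, 14.9, 6.4, 21.6, 18.2, 22.8]
summary = order_stats(temp_fy76)
```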






Subfile: FY '77



                               ORDER STATISTICS

VARIABLE          MAXIMUM        MINIMUM          RANGE       MIDRANGE
TEMP(C)          27.20000        4.20000       23.00000       15.70000
PRODUCTION    14056.00000     5736.00000     8320.00000     9896.00000
DAYS             25.00000       22.00000        3.00000       23.50000
PAYROLL         205.00000      146.00000       59.00000      175.50000
WATER USE      4287.00000     3110.00000     1177.00000     3698.50000
GALLONS        1132.62540      821.66200      310.96340      977.14370

                                TUKEY'S HINGES
VARIABLE           MEDIAN    25-th %-ile    75-th %-ile
TEMP(C)          18.40000       17.10000       18.40000
PRODUCTION    13108.00000    12173.00000    13108.00000
DAYS             23.00000       22.00000       23.00000
PAYROLL         194.00000      191.00000      194.00000
WATER USE      3390.00000     3366.00000     3390.00000
GALLONS         895.63800      889.29720      895.63800

                             TUKEY'S MIDDLEMEANS
VARIABLE          MIDMEAN        TRIMEAN      MIDSPREAD
TEMP(C)          20.60000       18.07500        1.30000
PRODUCTION    12864.66667    12874.25000      935.00000
DAYS             22.66667       22.75000        1.00000
PAYROLL         194.33333      193.25000        3.00000
WATER USE      3460.00000     3384.00000       24.00000
GALLONS         914.13200      894.05280        6.34080

Other percentiles(Y/N)?
NO















What statistic options are desired ?



SELECT ANY KEY 



Exit basic statistics routine 



Note: All Basic Statistics for these subfiles
could have been obtained more efficiently
than we demonstrated in this
example by responding "ALL" to the
above question.






Example 2 

The data set is from the MINITAB STUDENT HANDBOOK by T. Ryan and B. Joiner,
published by Duxbury Press (1976). The data appear on page 279. The operations
performed on the two data sets, SAMPLE A and SAMPLE B, demonstrate the following
routines: JOIN, LIST, RECODE, SUBFILE (by variable), STORE, SELECT, and STATS.



BASIC STATISTICS AND DATA MANIPULATION 

[Answer all yes/no questions with Y/N]

Are you going to use user defined transformation or non-linear regression ? (Y/N)
NO

Are you using an HPIB Printer?

YES

Enter select code, bus address (if 7,1 press CONT) ?

********************************************************************************

*                              DATA MANIPULATION                               *

********************************************************************************



Enter DATA TYPE: 



1

Mode number = ?



Raw data 



Data is from mass storage 



Is data stored on the program's scratch file (DATA)?
NO



Data file name = ?



GRADEB:INTERNAL

Was data stored by the BS&DM system ?

YES

Is data medium placed in device INTERNAL
?



The data was stored under the name
GRADEB in a different place, so the program
must retrieve it.



YES

Is program medium placed in correct device ?

YES

PROGRAM NOW STORING DATA ON SCRATCH DATA FILE AND BACKUP FILE



SAMPLE B



Data file name : GRADEB:INTERNAL

Data type is: Raw data

Number of observations: 50
Number of variables: 3



This data is the second set of 50 student
grades (GPA) and scores on the ACT tests
(Verb and Math). The data is taken from the
Minitab Student Handbook, page 279.



Variable names:

1. VERB

2. MATH

3. GPA

Subfiles: NONE

SELECT ANY KEY

Option number = ?

1

Enter method for listing data:

3



Select Special Function Key labeled-LIST 
List all the data. 

In tabular form. 



SAMPLE B

Data type is: Raw data

          Variable # 1   Variable # 2   Variable # 3
          (VERB      )   (MATH      )   (GPA       )
OBS#
   1         500.00000      661.00000        2.30000
   2         460.00000      692.00000        1.40000
   3         717.00000      672.00000        2.80000
   4         592.00000      441.00000        2.40000
   5         752.00000      729.00000        3.40000
   6         695.00000      681.00000        2.50000
   7         610.00000      777.00000        3.60000
   8         620.00000      638.00000        2.60000
   9         682.00000      701.00000        3.60000
  10         524.00000      700.00000        2.90000
  11         552.00000      692.00000        2.60000
  12         703.00000      710.00000        3.80000
  13         584.00000      738.00000        3.00000
  14         550.00000      638.00000        2.50000
  15         659.00000      672.00000        3.50000
  16         585.00000      605.00000        2.00000
  17         578.00000      614.00000        3.00000
  18         533.00000      630.00000        2.00000
  19         532.00000      586.00000        1.80000
  20         708.00000      701.00000        2.30000
  21         537.00000      681.00000        2.10000
  22         635.00000      647.00000        3.00000
  23         591.00000      614.00000        3.30000
  24         552.00000      669.00000        3.00000
  25         557.00000      674.00000        3.20000
  26         599.00000      664.00000        2.30000
  27         540.00000      658.00000        3.30000
  28         752.00000      737.00000        3.30000
  29         726.00000      800.00000        3.90000
  30         630.00000      668.00000        2.10000
  31         558.00000      567.00000        2.60000
  32         646.00000      771.00000        2.40000
  33         643.00000      719.00000        3.30000
  34         606.00000      755.00000        3.10000
  35         682.00000      652.00000        3.60000
  36         565.00000      672.00000        2.90000
  37         578.00000      629.00000        2.40000
  38         488.00000      611.00000        1.80000
  39         361.00000      602.00000        2.40000
  40         560.00000      639.00000        2.90000
  41         630.00000      647.00000        3.50000
  42         666.00000      705.00000        3.40000
  43         719.00000      668.00000        2.30000
  44         669.00000      701.00000        2.90000
  45         571.00000      647.00000        1.80000
  46         520.00000      583.00000        2.80000
  47         571.00000      593.00000        2.30000
  48         539.00000      601.00000        2.50000
  49         580.00000      630.00000        2.40000
  50         629.00000      695.00000        2.90000






Option number = ?



SELECT ANY KEY

JOIN ROUTINE



Option number = ?
2

Do you wish to continue with the JOIN procedure ?
Title for combined data set (<= 80 characters) = ?



TOTAL ACT SCORE/GPA COMPARISON DATA
File name of data set #2 = ?

GRADEA:INTERNAL

Is data set #2 medium placed in device INTERNAL

?



YES

Press 'CONTINUE' when ready to continue
Press 'CONTINUE' when ready to continue
Is program medium placed in device ?

YES



Exit List routine. 

Select Special Function Key labeled-JOIN 
Choose to add observations. 

To continue you must have

1. Data Set #1 currently in memory.

2. Data Set #2 previously stored by this
program.

3. Total observations times variables <
1500.

4. Each data set must contain the same
number of variables arranged in the
same order.

This data set (the first set A in the Minitab
manual) was previously stored.



TOTAL ACT SCORE/GPA COMPARISON DATA 



Number of variables: 3
Number of observations: 100

Variable names:
1. VERB

2. MATH

3. GPA
Subfiles: NONE

PROGRAM NOW UPDATING SCRATCH DATA FILE
Option number = ?



SELECT ANY KEY 



LIST ROUTINE 



Option number = ?

1

Enter method for listing data:

3



The two data sets are combined. That is,
the second 50 observations are 'attached'
to the bottom of the original 50 observations.
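
Joining by adding observations, as described above, is simply an append of one data set to the other, subject to the conditions listed for the JOIN routine. The Python sketch below illustrates this with two short excerpts; the function `join_observations` is a hypothetical name, and the size check mirrors the stated limit (observations times variables < 1500) rather than anything known about how the program enforces it.

```python
def join_observations(set1, set2, max_cells=1500):
    """Append the observations of set2 to those of set1.

    Both sets must have the same number of variables, in the same order,
    and the combined size must stay under max_cells values.
    """
    n_vars = len(set1[0])
    if any(len(row) != n_vars for row in set1 + set2):
        raise ValueError("data sets must have the same number of variables")
    combined = set1 + set2
    if len(combined) * n_vars >= max_cells:
        raise ValueError("combined data set is too large")
    return combined


# Short excerpts (VERB, MATH, GPA) standing in for the two 50-observation sets.
sample_b = [[500, 661, 2.3], [460, 692, 1.4], [717, 672, 2.8]]
sample_a = [[454, 471, 2.3], [643, 700, 2.4]]

joined = join_observations(sample_b, sample_a)
```

The data set already in memory comes first, so its observations keep their numbers and the observations of the second set are numbered after them.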



Exit Join routine 



Select Special Function Key labeled-LIST 



List all the data 



In tabular form 






TOTAL ACT SCORE/GPA COMPARISON DATA



Data type is: Raw data





Var iab 1 


.e * i 


Var iab 1 


.e * 2 


Var x able * 3 




(VERB 


) 


(MATH 


) 


(GPA > 


OBS* 












1 


500 . 


00000 


661 , 


ooooo 


2,30000 


2 


460 , 


00000 


692, 


ooooo 


1 ,40 00 


3 


717. 


00000 


672, 


ooooo 


2,80000 


4 


592, 


00000 


441 , 


ooooo 


2,40000 


5 


752, 


00000 


729, 


ooooo 


3,40000 


6 


695, 


00000 


681, 


ooooo 


2,500 00 


7 


610 . 


00000 


777, 


ooooo 


3,600 


8 


620 , 


00000 


638. 


ooooo 


2,600 00 


9 


682, 


00000 


701 , 


ooooo 


3,60000 


10 


524, 


00000 


700, 


ooooo 


2,90 00 


il 


552, 


00000 


692, 


ooooo 


2,600 


12 


703, 


00000 


710, 


ooooo 


3,80000 


13 


584, 


00000 


738, 


ooooo 


3, 00000 


14 


550 . 


00000 


638 , 


ooooo 


2,50000 


IS 


659. 


00000 


672, 


ooooo 


3,50000 


16 


585, 


00000 


605, 


ooooo 


2,00000 


17 


578. 


00000 


614 


ooooo 


3,00000 


18 


533, 


00000 


630 , 


ooooo 


2.00000 


19 


532, 


00000 


586, 


ooooo 


1 .80000 


20 


708, 


00000 


701 , 


ooooo 


2,30 00 


21 


537, 


00000 


681 . 


ooooo 


2,10000 


22 


635, 


00000 


647, 


ooooo 


3,00000 


23 


591 , 


00000 


614 


, ooooo 


3.30000 


24 


552, 


00000 


669, 


ooooo 


3.00000 


25 


557, 


00000 


674 


, ooooo 


3.20000 


26 


599, 


00000 


664, 


ooooo 


2,30000 


27 


540 , 


00000 


658 


, ooooo 


3.30000 


28 


752. 


, 00000 


737, 


ooooo 


3,30000 


29 


726, 


,00000 


800 


.00000 


3,90000 


30 


630, 


00000 


668, 


ooooo 


2,10000 


31 


558, 


00000 


567 


, ooooo 


2,60000 


32 


646, 


00000 


771 , 


ooooo 


2,40000 


33 


643, 


00000 


719 


,00000 


3,30000 


34 


606. 


,00000 


755 


ooooo 


3,10000 


33 


682, 


00000 


652 


,00000 


3,60000 


36 


565 


,00000 


672 , 


ooooo 


2,90 00 


37 


578, 


, 00000 


629 


,00000 


2,40000 


38 


488, 


, 00000 


6.1.1. , 


ooooo 


1 .800 00 


39 


361. 


, 00000 


602 


,00000 


2,40000 


40 


560 


, 00000 


639 


, ooooo 


2,9000 


41 


630 


, 00000 


647 


,00000 


3.50000 


42 


666 


, 00000 


705 


, ooooo 


3.4000 


43 


719 


,00000 


668 


,00000 


2,30000 


44 


669 


,00000 


701 


, ooooo 


2,9000 


45 


571 


, 00000 


647 


, ooooo 


1 ,80000 


46 


520 


, ooooo 


583 


,00000 


2 , 8 


47 


571 


, ooooo 


593 


, ooooo 


2,300 


48 


539 


, ooooo 


601 


ooooo 


2,50 


4V 


580 


, ooooo 


63 


.00000 


2.40000 


50 


629 


, ooooo 


695 


, ooooo 


2 , 90000 


51 


623 


,00000 


5 9 


,00000 


2,60000 


52 


454 


, ooooo 


471 


, ooooo 


2,30 


53 


643 


,00000 


700 


,00000 


2,4000 


54 


585 


,00000 


71.9 


, ooooo 


3 .000 


5 5 


719 


, ooooo 


71.0 


ooooo 


3 .10 


56 


693 


, ooooo 


643 


,00000 


2 . 9 


57 


571 


,00000 


665 


ooooo 


3 ,10 






58 


646, 


00000 


71.9, 





3. 


3 


59 


613, 


00000 


693 


,00000 


p 


, 3 


6 


655, 


00000 


701 , 


00000 


3, 


3 


61 


662 , 





614. 





2 


6 


62 


585, 


00000 


557, 





3 


3 


63 


58 , 


00000 


6 1 1 . 





o 





64 


648, 


00000 


701 , 





3, 





65 


4 05, 





61 i . 





1 . 


90000 


66 


506, 


00000 


68.1. , 


00000 


o 


7 


6? 


669, 





653 . 


00000 


2 





68 


558 , 





50 , 





3 , 


3 


69 


577 , 





635 , 





7> 


00000 


7 


487 , 





584, 


00000 


p 


,30 00 


71 


682, 


00000 


629 


,00000 


3 


,30000 


72 


565, 


00000 


624 , 


00000 


2 


,80 000 


73 


552, 


00000 


665 


,00000 


1 


,70000 


74 


567 , 





724 


00000 


2 


,400 00 


75 


745, 


00000 


746 


,00000 


3 


,40000 


76 


610, 


00000 


653 , 





2 


,80000 


77 


493, 


00000 


605 


00000 


o 


,40000 


78 


571, 


00000 


566, 


00000 


1 


,90000 


79 


682, 


00000 


724 


, 00000 


':> 


,50000 


80 


600, 





677, 


00000 


p 


,30000 


81 


740 , 





729 


, 00000 


3 


,40000 


82 


593 


,00000 


611 


,00000 


p 


,80000 


83 


488, 





683 


,00000 


1 


,90000 


84 


526 


, 00000 


777, 


00000 


3 


,00000 


85 


630 


,00000 


605 


,00000 


3 


,70000 


86 


586 


, 00000 


653 


, 00000 


p 


,3000 


87 


610 


, 00000 


674 


.00000 


2 


,90000 


88 


695 


,00000 


634, 


, 00000 


3 


.30000 


89 


539 


,00000 


601 


,00000 


2 


,10000 


90 


490 


,00000 


701, 


,00000 


1 


,20 000 


91 


509, 





547 


, 00000 


3 


.30000 


92 


667 


,00000 


753 , 


00000 


p 


,00000 


93 


597, 


00000 


652 


,00000 


3 


,10 


94 


662, 


00000 


664, 


00000 


2 


,60000 


95 


566, 


00000 


664 


,00000 


2 


,40000 


96 


597, 


00000 


602, 


00000 


p 


,40000 


97 


604, 


00000 


557 


,00000 


p 


,30000 


98 


519, 





529, 


00000 


3 


, 00000 


99 


643, 


00000 


715 


,00000 


o 


,90000 


100 


606, 


00000 


593, 


00000 


3 


,40000 



Option number = ?





SELECT ANY KEY



Option number = ?



RECODE ROUTINE



Store recoded data in Variable # (<= 4 )
?



Exit List routine

Select Special Function Key labeled-RECODE

Recoding using contiguous unequal intervals
is chosen.



Variable name (<= 10 characters) = ?



Recoded data stored in variable 4.



RANKS

Number of the variable to be recoded = ?



Variable name or label.



Recode based on variable 3 (GPA).



Number of recoding intervals to be specified (<=20) = ?






4                                   Four intervals

Lower limit of first interval = ?

1                                   See table below for summary of recoded
                                    specifications.

Upper limit of interval # 1 =

?

2

For data falling in interval 1 = [ 1 , 2 ), code =

?

1

Upper limit of interval # 2 =

?

3

For data falling in interval 2 = [ 2 , 3 ), code =

?

2

Upper limit of interval # 3 =
?

3.5

For data falling in interval 3 = [ 3 , 3.5 ), code =

?

3

Upper limit of interval # 4 =

?

4

For data falling in interval 4 = [ 3.5 , 4 ), code =

?

4

Is above information correct?

YES

Variable # 3 is recoded into 4 categories, and the recoded
values are stored in Variable # 4 , where:



   CATEGORY BOUNDS        # OBS

   LOWER      UPPER       CODED       CODE

   1.000      2.000          9       1.000

   2.000      3.000         54       2.000

   3.000      3.500         29       3.000

   3.500      4.000          8       4.000

Option number = ?

Exit Recode routine. 

PROGRAM NOW UPDATING SCRATCH DATA FILE 
SELECT ANY KEY 

LIST ROUTINE 



Summary: Note that the upper limit of each
interval is open, not closed. That is, a value
of 3.5 would be recoded as a 4.
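
The half-open intervals used by the recode can be made concrete with a short Python sketch. It is an illustration under the interval limits and codes entered above; the function `recode` is a hypothetical name, and returning `None` for a value outside all intervals is an assumption of the sketch, since the example does not show how the program treats such values.

```python
from bisect import bisect_right


def recode(value, bounds, codes):
    """Return the code of the interval [lower, upper) that contains value.

    bounds lists the lower limit of the first interval followed by the upper
    limit of every interval, in ascending order; codes has one entry per
    interval.
    """
    if not bounds[0] <= value < bounds[-1]:
        return None  # outside every interval (handling assumed, see above)
    return codes[bisect_right(bounds, value) - 1]


bounds = [1, 2, 3, 3.5, 4]   # intervals [1,2), [2,3), [3,3.5), [3.5,4)
codes = [1, 2, 3, 4]

gpa = [2.3, 1.4, 2.8, 2.4, 3.4, 2.5, 3.6]   # first seven GPA values
ranks = [recode(g, bounds, codes) for g in gpa]
```

A GPA of exactly 3.5 falls in the fourth interval and is coded 4, while 3.0 falls in the third and is coded 3.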



Option number



Select Special Function Key labeled-LIST 



1 List all the data. 

Enter Method for listing data: 

3 In tabular form. 



TOTAL ACT SCORE/GPA COMPARISON DATA 



Data type is: Raw data 








Variable * 1 


Variable ♦ 2 


Variable ♦ 3 


Variable # 4 




(VERB ) 


(MATH ) 


(GPA ) 


(RANKS ) 


OBS# 










1 


500 ,00000 


661.00000 


2.30000 


2.00000 


2 


460. 00000 


692, 00 00 


1 .400 00 


1,00000 


3 


717.00000 


672,00000 


2.80000 


2.00000 


4 


592.00000 


441 .00000 


2,40000 


2,00000 


5 


752.00000 


729.000 00 


3,40000 


3, 00000 


6 


695.00000 


681 , 00000 


2.50000 


2,00000 


7 


610.00000 


777.00000 


3,60000 


4,00000 


8 


620 .00000 


638,00000 


2.60000 


2.00000 


9 


682. 00000 


701 , 000 00 


3.60000 


4.00000 


10 


524, 00000 


700, 00000 


2.90000 


2.00000 


li 


552. 00000 


692.00000 


2.60000 


2,00000 


12 


703,00000 


710.00000 


3.80000 


4,00000 


13 


584, 00000 


738,00000 


3,00000 


3,00000 


14 


550. 00000 


638, 000 


2,50000 


2 ,00000 


15 


659, 00000 


672,00000 


3,50000 


4.00000 


16 


585, 00000 


605. 00000 


2, 00 00 


2,00000 


17 


578. 000 00 


614.00000 


3, 00000 


3,00000 


18 


533.00000 


630 , 00000 


2,00000 


2.00000 


19 


532,000 00 


586.00000 


1 ,80000 


1 , 00000 


20 


708, 00000 


701.00000 


2,300 00 


2.00000 


21 


537, 00000 


681 .0 0000 


2,10000 


2.000 


22 


635,00000 


647. 00 00 


3. 00000 


3,0000 


23 


591 , 00000 


614,00000 


3,30000 


3,00000 


24 


552.00000 


669, 00000 


3, 00000 


3,00000 


25 


557.00000 


674.00000 


3.20000 


3,00000 


26 


599.00000 


664. 00000 


2,300 00 


2.00000 


27 


540,00000 


658,00000 


3,30000 


3.00000 


28 


752.00000 


737.0 000 


3.30000 


3,00000 


29 


726.00000 


800 .00000 


3,90000 


4,00000 


30 


630, 00000 


668, 00 000 


2.10000 


2,00000 


31 


558.00000 


567,00000 


2.60000 


2,00000 


32 


646,00000 


771 ,00000 


2,40000 


2.00000 


33 


643.00000 


719,00000 


3,30000 


3, 00000 


34 


606,00000 


755,00000 


3.10000 


3.00000 


35 


682.00 00 


652,00000 


3.60000 


4,00000 


36 


565, 00000 


672,00 000 


2.90 00 


2, 00000 


37 


578.00000 


629,00000 


2.40000 


2.00000 


38 


488. 00000 


611.00000 


1 .80000 


1,0000 


39 


361 .00000 


6 02,00000 


2.40000 


2,00000 


40 


560,00000 


639,00000 


2,90000 


2,0000 


41 


630 . 00000 


647,00000 


3,50000 


4, 00000 


42 


666.00000 


705.00000 


3,40000 


3.00000 


43 


719.00000 


668,00000 


2,30000 


2.00000 


44 


669.00000 


7 01,00000 


2,90000 


2.00000 


45 


571 .00000 


647.00000 


1,80000 


1,00000 


46 


520.00000 


583.00000 


2,80000 


2.00000 


47 


571.00000 


593.00000 


2.30000 


2,00000 


48 


539. 00000 


601, 000 00 


2.50000 


2,00000 


49 


580.00000 


630.00000 


2.40000 


2.00000 


50 


629.00000 


695,00000 


2.90000 


2.00000 


51 


623,00000 


509,00000 


2,60000 


2.00000 


52 


454, 00000 


471 , 00000 


2,30000 


2,00000 


53 


643.00000 


700,00000 


2.40000 


2,00000 


54 


585.00000 


719,00000 


3.00000 


3, 00000 


55 


719,00000 


710 ,00000 


3,10000 


3, 00000 


56 


693.00000 


643,00000 


2.90000 


2, 00000 


57 


571 .00000 


665,00000 


3. 10000 


3.00000 


58 


646.00000 


719.00000 


3.30000 


3,00000 



44 



59 


613.00000 


693, 


00000 


2,30000 


o 


00000 


60 


655.00000 


701 , 


00000 


3.30000 


3. 


00000 


61 


662,00000 


614. 


00000 


2.60000 


2, 


00000 


62 


585.00000 


557. 


00000 


3.30000 


3, 


00000 


63 


580,00000 


611 . 


00000 


2.00000 


° 


00000 


64 


648.00000 


701 . 


00000 


3,00000 


3. 


00000 


65 


405.00000 


611 . 


00000 


1,90000 


i , 


00000 


66 


506.00000 


681 . 


00000 


2.70000 


2. 


00000 


67 


669, 00000 


653, 


00000 


2,00000 


P 


00000 


68 


558.00000 


500, 


00000 


3.30000 


3 , 





69 


577, 00000 


635, 


00000 


2,00000 


? 


00000 


70 


487. 00000 


584, 


00000 


2.30000 


P 


00000 


71 


682,0 0000 


629, 


00000 


3.30000 


3, 


00000 


72 


565, 00000 


624, 


00000 


2.80000 


? 





73 


552. 0000 


665, 


00000 


1 ,70000 


i , 





74 


567,00000 


724, 


00000 


2,40000 


P 





75 


745,00000 


746 , 





3,40000 


3. 


00000 


76 


610.00000 


653 


00000 


2.80000 


P 





77 


493.00000 


605 


00000 


2,40000 


P 


, 00000 


78 


571.00000 


566, 


00000 


1 ,90000 


.1 , 





79 


682. 00000 


724 


00000 


2,50000 


2, 


, 00000 


80 


600 , 00000 


677 


00000 


2,30000 


P 


00000 


81 


740 , 00000 


7?9 


,00000 


3,40000 


3. 


, 00000 


82 


593, 00000 


611 


, 00000 


2,80 00 


P 





83 


488.00000 


683 


. 00000 


1,90000 


.1 


, 00000 


84 


526, 00000 


777 


,00000 


3. 00000 


3 


,00000 


85 


630, 00000 


605 


,00000 


3,70000 


4 


, 00000 


86 


586, 00000 


653 


, 00000 


2.30000 


■;:> 


, 


87 


610 . 00000 


674 


,00000 


2,90000 


o 


. 00000 


88 


695 ,00000 


634 


,00000 


3,30000 


3 


.00000 


89 


539, 0000 


601 


.00000 


2,10000 


p 


,00000 


90 


490 . 00000 


701. 


, 00000 


1 ,20 00 


1. 


,00000 


91 


509,00000 


547 


. noooo 


3,300 


3 


,00000 


92 


667. 00000 


753 


,00000 


2,0000 


T> 


,000 


93 


597, 00000 


652 


,00000 


3,10 


3 


,00000 


94 


662.000 00 


664 


, 00000 


2,60000 


P 


.00000 


95 


566,00000 


664 


, 00000 


2,40000 


p 


, 00000 


96 


597, 00000 


602 


, 00000 


2,40000 


P 


.00000 


97 


604, 00000 


557 


,00000 


2 ,30000 


n 


.00000 


98 


519, 00000 


529 


, 00000 


3,00000 


3 


,00000 


99 


643,00000 


715 


.00000 


2,90000 


2 


,00000 


100 


606.00000 


593 


, 00000 


3,40000 


3 


, 00000 



Option number = ?                                        Exit List routine
SELECT ANY KEY                                           Select Special Function Key labeled-SUBFILES

SUBFILE ROUTINES

Option number = ?
3                                                        Choose to create subfiles by values of a variable.
Which variable should be used to create the subfiles ?
4                                                        Enter variable no. to be used in creating subfiles.
Criterion value = 1   Enter name for subfile 1 (<=10 characters)
?
POOR
Criterion value = 2   Enter name for subfile 2 (<=10 characters)
?
AVERAGE
Criterion value = 3   Enter name for subfile 3 (<=10 characters)
?
GOOD
Criterion value = 4   Enter name for subfile 4 (<=10 characters)
?
EXCELLENT
Is the above information correct ?
YES

Subfile: name : beginning observation -- number of observations
1. POOR            1      9
2. AVERAGE        10     54
3. GOOD           64     29
4. EXCELLENT      93      8

Option number = ?                                        Exit Subfile routine
PROGRAM NOW STORING DATA
SELECT ANY KEY                                           Select Special Function Key labeled-LIST

LIST ROUTINE

Option number = ?
1                                                        List all the data
Enter Method for listing data
3                                                        In tabular form



TOTAL ACT SCORE/GPA COMPARISON DATA

Data type is: Raw data                                   Data is again listed b...
                                                         ranged on the basis ...

      Variable # 1  Variable # 2  Variable # 3  Variable # 4
OBS#        (VERB)        (MATH)         (GPA)       (RANKS)
   1     460.00000     692.00000       1.40000       1.00000
   2     532.00000     586.00000       1.80000       1.00000
   3     488.00000     611.00000       1.80000       1.00000
   4     571.00000     647.00000       1.80000       1.00000
   5     405.00000     611.00000       1.90000       1.00000
   6     552.00000     665.00000       1.70000       1.00000
   7     571.00000     566.00000       1.90000       1.00000
   8     488.00000     683.00000       1.90000       1.00000
   9     490.00000     701.00000       1.20000       1.00000
  10     500.00000     661.00000       2.30000       2.00000
  11     717.00000     672.00000       2.80000       2.00000
  12     592.00000     441.00000       2.40000       2.00000
  13     695.00000     681.00000       2.50000       2.00000
  14     620.00000     638.00000       2.60000       2.00000
  15     524.00000     700.00000       2.90000       2.00000
  16     552.00000     692.00000       2.60000       2.00000
  17     550.00000     638.00000       2.50000       2.00000
  18     585.00000     605.00000       2.00000       2.00000
  19     533.00000     630.00000       2.00000       2.00000
  20     708.00000     701.00000       2.30000       2.00000
  21     537.00000     681.00000       2.10000       2.00000
  22     599.00000     664.00000       2.30000       2.00000
  23     630.00000     668.00000       2.10000       2.00000
  24     558.00000     567.00000       2.60000       2.00000
  25     646.00000     771.00000       2.40000       2.00000
  26     565.00000     672.00000       2.90000       2.00000
  27     578.00000     629.00000       2.40000       2.00000
  28     361.00000     602.00000       2.40000       2.00000
  29     560.00000     639.00000       2.90000       2.00000
  30     719.00000     668.00000       2.30000       2.00000
  31     669.00000     701.00000       2.90000       2.00000
  32     520.00000     583.00000       2.80000       2.00000
  33     571.00000     593.00000       2.30000       2.00000
  34     539.00000     601.00000       2.50000       2.00000
  35     580.00000     630.00000       2.40000       2.00000
  36     629.00000     695.00000       2.90000       2.00000
  37     623.00000     509.00000       2.60000       2.00000
  38     454.00000     471.00000       2.30000       2.00000
  39     643.00000     700.00000       2.40000       2.00000
  40     693.00000     643.00000       2.90000       2.00000
  41     613.00000     693.00000       2.30000       2.00000
  42     662.00000     614.00000       2.60000       2.00000
  43     580.00000     611.00000       2.00000       2.00000
  44     506.00000     681.00000       2.70000       2.00000
  45     669.00000     653.00000       2.00000       2.00000
  46     577.00000     635.00000       2.00000       2.00000
  47     487.00000     584.00000       2.30000       2.00000
  48     565.00000     624.00000       2.80000       2.00000
  49     567.00000     724.00000       2.40000       2.00000
  50     610.00000     653.00000       2.80000       2.00000
  51     493.00000     605.00000       2.40000       2.00000
  52     682.00000     724.00000       2.50000       2.00000
  53     600.00000     677.00000       2.30000       2.00000
  54     593.00000     611.00000       2.80000       2.00000
  55     586.00000     653.00000       2.30000       2.00000
  56     610.00000     674.00000       2.90000       2.00000
  57     539.00000     601.00000       2.10000       2.00000
  58     667.00000     753.00000       2.00000       2.00000
  59     662.00000     664.00000       2.60000       2.00000
  60     566.00000     664.00000       2.40000       2.00000
  61     597.00000     602.00000       2.40000       2.00000
  62     604.00000     557.00000       2.30000       2.00000
  63     643.00000     715.00000       2.90000       2.00000
  64     752.00000     729.00000       3.40000       3.00000
  65     584.00000     738.00000       3.00000       3.00000
  66     578.00000     614.00000       3.00000       3.00000
  67     635.00000     647.00000       3.00000       3.00000
  68     591.00000     614.00000       3.30000       3.00000
  69     552.00000     669.00000       3.00000       3.00000
  70     557.00000     674.00000       3.20000       3.00000
  71     540.00000     658.00000       3.30000       3.00000
  72     752.00000     737.00000       3.30000       3.00000
  73     643.00000     719.00000       3.30000       3.00000
  74     606.00000     755.00000       3.10000       3.00000
  75     666.00000     705.00000       3.40000       3.00000
  76     585.00000     719.00000       3.00000       3.00000
  77     719.00000     710.00000       3.10000       3.00000
  78     571.00000     665.00000       3.10000       3.00000
  79     646.00000     719.00000       3.30000       3.00000
  80     655.00000     701.00000       3.30000       3.00000
  81     585.00000     557.00000       3.30000       3.00000
  82     648.00000     701.00000       3.00000       3.00000
  83     558.00000     500.00000       3.30000       3.00000
  84     682.00000     629.00000       3.30000       3.00000
  85     745.00000     746.00000       3.40000       3.00000
  86     740.00000     729.00000       3.40000       3.00000
  87     526.00000     777.00000       3.00000       3.00000
  88     695.00000     634.00000       3.30000       3.00000
  89     509.00000     547.00000       3.30000       3.00000
  90     597.00000     652.00000       3.10000       3.00000
  91     519.00000     529.00000       3.00000       3.00000
  92     606.00000     593.00000       3.40000       3.00000
  93     610.00000     777.00000       3.60000       4.00000
  94     682.00000     701.00000       3.60000       4.00000
  95     703.00000     710.00000       3.80000       4.00000
  96     659.00000     672.00000       3.50000       4.00000
  97     726.00000     800.00000       3.90000       4.00000
  98     682.00000     652.00000       3.60000       4.00000
  99     630.00000     647.00000       3.50000       4.00000
 100     630.00000     605.00000       3.70000       4.00000
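
The subfile bookkeeping in this example (subfile name, beginning observation, number of observations) can be reproduced with a short Python sketch. It assumes, as the listing suggests, that observations with the same coded value end up adjacent once subfiles are created; the function name `subfile_table` is hypothetical and is not part of the program.

```python
from itertools import groupby

def subfile_table(codes, names):
    """Return (name, beginning observation, number of observations) per subfile.

    codes : coded values, already arranged so equal codes are adjacent
    names : mapping from criterion value to subfile name
    """
    table, start = [], 1
    for code, run in groupby(codes):
        count = len(list(run))
        table.append((names[code], start, count))
        start += count
    return table

# Category counts taken from the recode summary: 9, 54, 29 and 8 observations.
codes = [1] * 9 + [2] * 54 + [3] * 29 + [4] * 8
names = {1: "POOR", 2: "AVERAGE", 3: "GOOD", 4: "EXCELLENT"}

for name, first, count in subfile_table(codes, names):
    print(f"{name:<10}{first:>4}{count:>4}")
```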






STORE ROUTINE

Option number = ?                                        Exit List routine
SELECT ANY KEY                                           Select Special Function Key labeled-STORE
Enter option number desired :
1                                                        Store the complete set of data.
Name of data file = ?
TGRADE:INTERNAL                                          On this file.
Is data medium placed in device ?
?
YES
PROGRAM NOW STORING DATA ON TGRADE:INTERNAL

* * * * * The data and related information are stored in TGRADE:INTERNAL * * * * *

Is program medium placed in device ?
YES
Enter option number desired



SELECTION ROUTINES

SELECT ANY KEY                                           Exit Store routine.
                                                         Choose Special Function Key labeled-SELECT
Choose option desired :                                  Select chosen instead of Scan.
Choose option desired                                    Choose to Select on basis of value of just
                                                         one variable.
SELECTION BASED ON ONE VARIABLE
Which variable should be used ?
Criterion variable = 1 (VERB)                            Variable 1 = Verb
What values can the criterion variable take ?
550-800                                                  Select those cases for which Verb is
                                                         between 550 and 800.
Allowable values : 550-800
Which subfiles do you want to be selected ?
ALL                                                      For all subfiles.
SUBFILES TO BE SELECTED = ALL

OBSERVATIONS SATISFYING SELECTION CRITERION =            These observations meet the criteria.

   3    5    9   10   12   14   15   16   17   19
  20   22   23   24   25   26   28   29   30   31
  32   33   35   36   37   38   40   42   43   44
  45   47   49   50   51   53   54   55   56   57
  58   59   61   62   63   64   65   67   68   69
  70   71   72   73   74   75   76   78   79   80
  81   82   84   85   86   87   88   89   90   91
  93   94   95   96   97   98   99  100

              BEFORE SELECTION    AFTER SELECTION
SUBFILE          NUM OF OBS          NUM OF OBS
POOR                  9                   3
AVERAGE              54                  42
GOOD                 29                  25
EXCELLENT             8                   8
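
As an illustration of the selection step, the Python sketch below filters the nine POOR observations from the data listing on the VERB criterion. The helper `select` is hypothetical, and treating both ends of the 550-800 range as inclusive is an assumption; the manual does not say how the program handles the end points.

```python
def select(rows, column, low, high):
    """Keep the rows whose value in `column` lies in [low, high].

    Inclusive end points are an assumption, not documented behaviour.
    """
    return [row for row in rows if low <= row[column] <= high]

# (VERB, MATH, GPA, RANKS) for the nine observations of subfile POOR
poor = [
    (460, 692, 1.4, 1), (532, 586, 1.8, 1), (488, 611, 1.8, 1),
    (571, 647, 1.8, 1), (405, 611, 1.9, 1), (552, 665, 1.7, 1),
    (571, 566, 1.9, 1), (488, 683, 1.9, 1), (490, 701, 1.2, 1),
]

kept = select(poor, column=0, low=550, high=800)
print(len(poor), "observations before selection,", len(kept), "after")
```

Three observations remain, which agrees with the BEFORE/AFTER counts printed for subfile POOR.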



PROGRAM NOW UPDATING SCRATCH DATA FILE                   The Selection routine saves only those
                                                         observations whose verbal score was
                                                         between 550 and 800. The rest of the
                                                         observations are discarded from the
                                                         program memory.
Choose option desired :                                  Exit Select routine.
SELECT ANY KEY                                           Select Special Function Key labeled-STATS

STATS ROUTINE

What statistic options are desired ?                     Mean, CI, Variance, Standard Deviation,
                                                         Skewness, Kurtosis.
VARIABLES = ?
ALL                                                      Statistics will be computed for all variables.
Confidence coefficient for confidence interval on the mean (e.g., 90, 95, 99) = ?
95                                                       With a 95% coefficient.
Option number = ?                                        Compute statistics for specified subfiles.
What subfiles are desired ?
1-4                                                      All subfiles



*               SUMMARY STATISTICS               *
*                  ON DATA SET:                  *
*      TOTAL ACT SCORE/GPA COMPARISON DATA       *

Subfile: POOR

BASIC STATISTICS

VARIABLE   # OF   # OF
NAME       OBS.   MISS          SUM         MEAN     VARIANCE    STD. DEV.
VERB          3      0    1694.0000     564.6667     120.3333      10.9697
MATH          3      0    1878.0000     626.0000    2781.0000      52.7352
GPA           3      0       5.4000       1.8000        .0100        .1000
RANKS         3      0       3.0000       1.0000       0.0000       0.0000

VARIABLE    COEFFICIENT    STD. ERROR     95 % CONFIDENCE INTERVAL
NAME       OF VARIATION       OF MEAN     LOWER LIMIT   UPPER LIMIT
VERB            1.94268       6.33333       537.60540     591.72793
MATH            8.42415      30.44667       495.90649     756.09351
GPA             5.55556        .05774         1.55331       2.04669
RANKS           0.00000       0.00000         1.00000       1.00000

VARIABLE       SKEWNESS      KURTOSIS
VERB            -.70711      -1.50000
MATH            -.61556      -1.50000
GPA             0.00000      -1.50000
RANKS
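
The basic statistics printed for subfile POOR can be checked by hand. The Python sketch below uses the three VERB values that survive the selection (571, 552, 571) and conventional textbook formulas; the manual does not list the program's formulas, so treat the choices here as assumptions that happen to match the printed figures. In particular, skewness and kurtosis are computed from population moments, with 3 subtracted from the kurtosis. The confidence limits are left out because the manual does not say how the program obtains its t multiplier.

```python
from math import sqrt

def summary(xs):
    """Summary statistics for one variable (formulas are assumptions)."""
    n = len(xs)
    mean = sum(xs) / n
    ss = sum((x - mean) ** 2 for x in xs)          # corrected sum of squares
    variance = ss / (n - 1)                        # sample variance
    std_dev = sqrt(variance)
    m2 = ss / n                                    # population moments
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return {
        "sum": sum(xs),
        "mean": mean,
        "variance": variance,
        "std_dev": std_dev,
        "cv_percent": 100 * std_dev / mean,        # coefficient of variation
        "std_error": std_dev / sqrt(n),            # standard error of the mean
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2 - 3,
    }

verb_poor = [571, 552, 571]
stats = summary(verb_poor)
for name, value in stats.items():
    print(f"{name:<12}{value:12.5f}")
```

Note that these moment-based measures are undefined when the variance is zero, which is consistent with the blank RANKS entries in the skewness and kurtosis table.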



Subfile: AVERAGE

BASIC STATISTICS

VARIABLE   # OF   # OF
NAME       OBS.   MISS          SUM         MEAN     VARIANCE    STD. DEV.
VERB         42      0   25935.0000     617.5000    2382.4024      48.8099
MATH         42      0   27318.0000     650.4286    3694.4460      60.7820
GPA          42      0     104.3000       2.4833        .0814        .2853
RANKS        42      0      84.0000       2.0000       0.0000       0.0000

VARIABLE    COEFFICIENT    STD. ERROR     95 % CONFIDENCE INTERVAL
NAME       OF VARIATION       OF MEAN     LOWER LIMIT   UPPER LIMIT
VERB            7.90443       7.53152       602.28627     632.71373
MATH            9.34491       9.37886       631.48322     669.37393
GPA            11.49047        .04403         2.39439       2.57227
RANKS           0.00000       0.00000         2.00000       2.00000

VARIABLE       SKEWNESS      KURTOSIS
VERB             .54518       -.82101
MATH           -1.03447       2.32038
GPA             -.03388       -.90383
RANKS



Subfile: GOOD

BASIC STATISTICS

VARIABLE   # OF   # OF
NAME       OBS.   MISS          SUM         MEAN     VARIANCE    STD. DEV.
VERB         25      0   15948.0000     637.9200    4324.1600      65.7583
MATH         25      0   16856.0000     674.2400    4096.6067      64.0047
GPA          25      0      80.3000       3.2120        .0236        .1536
RANKS        25      0      75.0000       3.0000       0.0000       0.0000

VARIABLE    COEFFICIENT    STD. ERROR     95 % CONFIDENCE INTERVAL
NAME       OF VARIATION       OF MEAN     LOWER LIMIT   UPPER LIMIT
VERB           10.30824      13.15167       610.76982     665.07018
MATH            9.49287      12.80095       647.81385     700.66615
GPA             4.78278        .03072         3.14857       3.27543
RANKS           0.00000       0.00000         3.00000       3.00000

VARIABLE       SKEWNESS      KURTOSIS
VERB             .48079      -1.04529
MATH            -.96523        .42114
GPA             -.27487      -1.47768
RANKS



Subfile: EXCELLENT

BASIC STATISTICS

VARIABLE   # OF   # OF
NAME       OBS.   MISS          SUM         MEAN     VARIANCE    STD. DEV.
VERB          8      0    5322.0000     665.2500    1607.6429      40.0954
MATH          8      0    5564.0000     695.5000    4398.5714      66.3217
GPA           8      0      29.2000       3.6500        .0200        .1414
RANKS         8      0      32.0000       4.0000       0.0000       0.0000

VARIABLE    COEFFICIENT    STD. ERROR     95 % CONFIDENCE INTERVAL
NAME       OF VARIATION       OF MEAN     LOWER LIMIT   UPPER LIMIT
VERB            6.02712      14.17587       631.72037     698.77963
MATH            9.53583      23.44827       640.03874     750.96126
GPA             3.87456        .05000         3.53174       3.76826
RANKS           0.00000       0.00000         4.00000       4.00000

VARIABLE       SKEWNESS      KURTOSIS
VERB             .07320      -1.21757
MATH             .38485       -.97545
GPA              .64794       -.77551
RANKS



What statistic options are desired ?
2                                                        Correlation matrix
VARIABLES = ?
ALL                                                      Statistics computed for all variables
Option number = ?
2                                                        Compute statistics for specified subfiles
What subfiles are desired ?
1-4                                                      All subfiles



*               SUMMARY STATISTICS               *
*                  ON DATA SET:                  *
*      TOTAL ACT SCORE/GPA COMPARISON DATA       *
**************************************************

Subfile: POOR

CORRELATION MATRIX
               MATH           GPA         RANKS
VERB      -.6404640      .8660254
MATH                    -.9386522
GPA

Subfile: AVERAGE

CORRELATION MATRIX
               MATH           GPA         RANKS
VERB       .3530502      .0440427
MATH                     .0482350
GPA

Subfile: GOOD

CORRELATION MATRIX
               MATH           GPA         RANKS
VERB       .4981619      .5173239
MATH                    -.0706494
GPA

Subfile: EXCELLENT

CORRELATION MATRIX
               MATH           GPA         RANKS
VERB       .3654701      .6651140
MATH                     .4934875
GPA
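
The entries of the POOR correlation matrix can be reproduced with the usual Pearson product-moment formula. The sketch below is illustrative only; the function name `pearson` is hypothetical, and returning `None` for a constant variable is an assumption made to mirror the blank RANKS column (within a subfile every observation has the same rank, so its variance is zero).

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equally long sequences.

    Returns None when either variable has zero variance (an assumption).
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)

# The three POOR observations that remain after the selection step
verb = [571, 552, 571]
math_score = [647, 665, 566]
gpa = [1.8, 1.7, 1.9]
ranks = [1, 1, 1]

print(pearson(verb, math_score))
print(pearson(verb, gpa))
print(pearson(math_score, gpa))
print(pearson(verb, ranks))
```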

What statistic options are desired ?
3                                                        Median, mode, percentiles, Min., Max., Range
VARIABLES = ?
ALL                                                      Statistics computed for all variables
Option number = ?                                        Compute statistics for specified subfiles
What subfiles are desired ?
1-4                                                      All subfiles



**************************************************
*               SUMMARY STATISTICS               *
*                  ON DATA SET:                  *
*      TOTAL ACT SCORE/GPA COMPARISON DATA       *
**************************************************

Subfile: POOR

ORDER STATISTICS

VARIABLE        MAXIMUM       MINIMUM         RANGE      MIDRANGE
VERB          571.00000     552.00000      19.00000     561.50000
MATH          665.00000     566.00000      99.00000     615.50000
GPA             1.90000       1.70000        .20000       1.80000
RANKS           1.00000       1.00000       0.00000       1.00000

TUKEY'S HINGES

VARIABLE         MEDIAN   25-th %-ile   75-th %-ile
VERB          571.00000     552.00000     571.00000
MATH          647.00000     566.00000     647.00000
GPA             1.80000       1.70000       1.80000
RANKS           1.00000       1.00000       1.00000

TUKEY'S MIDDLEMEANS

VARIABLE        MIDMEAN       TRIMEAN     MIDSPREAD
VERB          564.66667     566.25000      19.00000
MATH          626.00000     626.75000      81.00000
GPA             1.80000       1.77500        .10000
RANKS           1.00000       1.00000       0.00000

Other percentiles (Y/N)?
NO
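
Several of the order statistics above follow directly from the extremes, the median and the two hinges. The Python sketch below applies the standard definitions of range, midrange, trimean and midspread to the printed POOR values and recovers the printed results. The function name is hypothetical, and the sketch deliberately takes the hinges as given: the manual does not state how the program locates the 25th and 75th percentiles, and the midmean is left out for the same reason.

```python
def middle_summaries(minimum, lower_hinge, median, upper_hinge, maximum):
    """Derived order statistics from a five-number summary."""
    return {
        "range": maximum - minimum,
        "midrange": (maximum + minimum) / 2,
        "trimean": (lower_hinge + 2 * median + upper_hinge) / 4,
        "midspread": upper_hinge - lower_hinge,
    }

# Five-number summaries read from the POOR tables above
verb = middle_summaries(552, 552, 571, 571, 571)
math_score = middle_summaries(566, 566, 647, 647, 665)

print(verb)
print(math_score)
```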










Subfile: AVERAGE

ORDER STATISTICS

VARIABLE        MAXIMUM       MINIMUM         RANGE      MIDRANGE
VERB          719.00000     550.00000     169.00000     634.50000
MATH          771.00000     441.00000     330.00000     606.00000
GPA             2.90000       2.00000        .90000       2.45000
RANKS           2.00000       2.00000       0.00000       2.00000

TUKEY'S HINGES

VARIABLE         MEDIAN   25-th %-ile   75-th %-ile
VERB          607.00000     578.00000     646.00000
MATH          658.50000     624.00000     681.00000
GPA             2.40000       2.30000       2.60000
RANKS           2.00000       2.00000       2.00000

TUKEY'S MIDDLEMEANS

VARIABLE        MIDMEAN       TRIMEAN     MIDSPREAD
VERB          610.13636     609.50000      68.00000
MATH          655.95455     655.50000      57.00000
GPA             2.46818       2.42500        .30000
RANKS           2.00000       2.00000       0.00000

Other percentiles (Y/N)?
NO








Subfile: GOOD

ORDER STATISTICS

VARIABLE        MAXIMUM       MINIMUM         RANGE      MIDRANGE
VERB          752.00000     552.00000     200.00000     652.00000
MATH          755.00000     500.00000     255.00000     627.50000
GPA             3.40000       3.00000        .40000       3.20000
RANKS           3.00000       3.00000       0.00000       3.00000

TUKEY'S HINGES

VARIABLE         MEDIAN   25-th %-ile   75-th %-ile
VERB          635.00000     585.00000     666.00000
MATH          701.00000     634.00000     719.00000
GPA             3.30000       3.10000       3.30000
RANKS           3.00000       3.00000       3.00000

TUKEY'S MIDDLEMEANS

VARIABLE        MIDMEAN       TRIMEAN     MIDSPREAD
VERB          626.53846     630.25000      81.00000
MATH          685.76923     688.75000      85.00000
GPA             3.23077       3.25000        .20000
RANKS           3.00000       3.00000       0.00000

Other percentiles (Y/N)?
NO
















Subfile: EXCELLENT

ORDER STATISTICS

VARIABLE        MAXIMUM       MINIMUM         RANGE      MIDRANGE
VERB          726.00000     610.00000     116.00000     668.00000
MATH          800.00000     605.00000     195.00000     702.50000
GPA             3.90000       3.50000        .40000       3.70000
RANKS           4.00000       4.00000       0.00000       4.00000

TUKEY'S HINGES

VARIABLE         MEDIAN   25-th %-ile   75-th %-ile
VERB          670.50000     630.00000     692.50000
MATH          686.50000     649.50000     743.50000
GPA             3.60000       3.55000       3.75000
RANKS           4.00000       4.00000       4.00000

TUKEY'S MIDDLEMEANS

VARIABLE        MIDMEAN       TRIMEAN     MIDSPREAD
VERB          663.25000     665.87500      62.50000
MATH          683.75000     691.50000      94.00000
GPA             3.62500       3.62500        .20000
RANKS           4.00000       4.00000       0.00000

Other percentiles (Y/N)?
NO

What statistic options are desired ?                     Exit Basic Statistics routine

SELECT ANY KEY



Regression Analysis 



General Information 

Description 

The Regression Analysis software provides you with five routines to perform various types of 
linear and non-linear regressions. The regression routines include: 

• Multiple Linear Regression 

• Polynomial Regression 

• Variable Selection Procedures (Stepwise algorithm, etc.) 

• Non-linear Regression 

• Standard Non-linear Regression Models 

In addition, a residual analysis module is included which will be helpful in judging the quality
of the chosen regression model. Brief descriptions of each regression routine follow.

The multiple linear regression routine performs a least-squares regression on a set of
predetermined variables.

The variable selection procedures perform least-squares regressions iteratively on sets of
variables which are determined by one of four selection procedures: stepwise, forward selection,
backward elimination, or manual. These selection procedures are helpful in determining
which of the independent variables are "important" in predicting the behavior of the
dependent variable.

The polynomial regression routine is a special case of the multiple linear regression procedure
where the independent variables are actually powers of a single variable. In other words, the
form of the regression model is:

Y = B0 + B1*(X) + B2*(X^2) + ... + Bp*(X^p),

where Y is the dependent variable, X is the independent variable, and B0, B1, ..., Bp are the
regression coefficients. A routine is also provided so you can plot the X-Y data along with the
regression curve.

The non-linear regression routine allows you to determine the coefficients of virtually any
model you wish to specify. It is more difficult to use than the multiple linear regression routines;
however, its use is mandatory when the model is non-linear in the regression coefficients. An
example of this is the model:

Y = B1*Exp(B2*X1 + B3*X2),

where Exp is the exponential function. A plotting routine is provided so you can plot any
variable versus the dependent variable. If the model has only one independent variable, the
regression curve can also be plotted.






The routines referred to as "standard" non-linear regressions determine the regression
coefficients for the following four types of common non-linear regression models:

• Y = A*X^B + C

• Y = A*Exp(BX) + C

• Y = A*Exp(BX) + C*Exp(DX) + E

• Y = A*Sin(BX) + C*Cos(DX) + E

Also provided is a routine to plot the data along with the computed regression curve. 

All of the regression programs provide an analysis of variance table, correlations, and the 
regression coefficients, as well as their standard errors. 

The residual analysis routine provides a list of the residuals as well as a plot of the standar- 
dized residuals versus observation number or any variable. 

Typical Program Flow 



Enter data via BSDM

Select Advanced Statistics option

Choose type of regression routine

Specify model

Obtain regression output

Obtain table of residuals

Plot residuals



Special Considerations 

Terminology 

An independent variable is a variable that can either be set to a desired value (for example,
input temperature or catalyst feed rate in a chemical reaction) or be observed but not
controlled (for example, the outdoor humidity).

The dependent variable is the variable that is affected by changes in one or more independent
variables. For example, the purity of a chemical product may be affected by temperature and
the catalyst feed rate.

In a simple linear regression: Y = B0 + B1*X, Y is the dependent variable, and X is the
independent variable, while B0 and B1 are the regression coefficients.

Data Structure 

Data is input via the Basic Statistics and Data Manipulation routines. You need to tell the 
regression routine the number of the BSDM variable which you want to be your dependent 
variable. In general, you tell the routine how many independent variables are in your regression 
model. Then, you specify the BSDM variable numbers which you want to be your independent 
variables. For example, suppose you input 10 variables in the BSDM procedure. You might 
specify that variable #4 is your dependent variable and that you want to have five independent 
variables. You then might specify the independent variables as BSDM variables #2, #3, #5, 
#7, and #9. 

If you specify subfiles with the BSDM procedure, you may perform regressions on individual 
subfiles. 



Note 

Non-Linear Regression 

You will have to create a file which contains the function and 
partial derivatives before you get into the program. The steps in- 
volved are shown on page 69. 






Multiple Linear Regression 



Object of Program 

This routine is designed to calculate a least-squares multiple linear regression on a
predetermined set of variables. The general form of the regression model is:

Y = B0 + B1X1 + B2X2 + ... + BpXp + Error

where Y is the dependent variable, X1, X2, ..., Xp are the independent variables and B0, B1,
..., Bp are the regression coefficients.

Several basic statistics, as well as the correlation matrix, are output. An analysis of variance
table is printed. The regression coefficients and their standard errors are output and
confidence intervals are constructed about them. An associated t-value is output along with
each regression coefficient. This statistic is used to test whether the regression coefficient is
significantly different from zero, i.e., whether the term is useful in the model. In addition, the
regression equation may be used for predictions and a residual analysis may be performed.

Typical Program Flow 



Input data via BSDM

Edit, transform, and list data. Obtain basic statistics

Select Advanced Statistics option

Select MLR routine

Specify subfile and variables to analyze

Calculate correlation matrix, R-squared, and standard error of estimate

Obtain AOV table

Obtain confidence intervals on parameter estimates

Obtain residual analysis



Special Considerations 

Method of Computing Sums of Squares and Cross Products Matrix 

If a data value is missing for one or more variables, the entire observation is deleted, i.e., not
used in computing the sums of squares and cross products matrix (or correlations). Consider
the following matrix where missing values are denoted by an M.

                     Variable
                   1    2    3
              1    M    3    2
              2    1    3    4
Observation   3    2    2    3
              4    M    4    M
              5    1    3    3

Observation 1 is deleted since the data value is missing for variable 1, and observation 4 is
deleted since the data values are missing for variables 1 and 3. Hence, only observations 2, 3,
and 5 will be used to compute the sums of squares and cross products matrix, as well as the
correlations.
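
The deletion rule can be illustrated with the five-observation matrix above. The Python sketch below is not the program's code: it represents a missing value as `None`, keeps only complete observations, and then forms the raw (uncorrected) sums of squares and cross products. Whether the program works with raw or mean-corrected sums is not stated in the manual, so that choice is an assumption.

```python
M = None  # marker for a missing value

data = [
    [M, 3, 2],   # observation 1
    [1, 3, 4],   # observation 2
    [2, 2, 3],   # observation 3
    [M, 4, M],   # observation 4
    [1, 3, 3],   # observation 5
]

# Keep an observation only if every variable is present.
complete = [(obs, row) for obs, row in enumerate(data, start=1)
            if all(value is not M for value in row)]
kept_ids = [obs for obs, _ in complete]
rows = [row for _, row in complete]

def sscp(rows):
    """Raw sums of squares and cross products of the columns of rows."""
    p = len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) for j in range(p)]
            for i in range(p)]

print("observations used:", kept_ids)
for line in sscp(rows):
    print(line)
```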

Constant Term 

In the output of the regression coefficients, the term labeled "Constant" refers to the intercept
or initial value when all the independent variables are zero. This constant term corresponds to
the B0 term in the general form of the model shown in the Object of Program section.

Transforming Variables 

After you input your data via Basic Statistics and Data Manipulation, you can use the trans- 
formation routine to create new variables. The transformation routine has several predefined 
functions which will allow you to create transgenerated regression variables. Refer to the Basic 
Statistics and Data Manipulation section for further details on transforming variables. 

Additional Sum of Squares in AOV Table 

In the analysis of variance table, you will see that the degrees of freedom and the sum of
squares of regression are divided into several parts, each with one degree of freedom. For
example, suppose a regression problem has three independent variables, say X1, X2, and X3.
You will notice that these three variables are listed below the "regression" term in the AOV
table, and that each has one degree of freedom. See the sample problem on page 25.

The meaning for the X1 line is as follows. We first consider only X1 in the regression model
and from the sum of squares we can tell how much of the variation of the dependent variable
is explained by introducing X1 into the model. The meaning for the X2 line is as follows.
Given that X1 is in the model, if we introduce X2 into the model we can see how much
additional variation is explained by X2. Then, in the X3 line, we suppose X1 and X2 are
already in the model. The sum of squares shows how much additional variation is explained
by adding X3 to the regression model. The degrees of freedom of the independent
variables add up to the regression degrees of freedom. The sums of squares of the independent
variables also add up to the sum of squares for regression.
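
The additional (sequential) sums of squares described above can be illustrated with a short Python sketch. The data values and function names are made up for the illustration, and the normal equations are solved by plain Gaussian elimination rather than by the library's own method. Each line of the result is the drop in residual sum of squares when the next variable is added to the model.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def residual_ss(columns, y):
    """Residual sum of squares of y regressed on a constant plus columns."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(design[0])
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(p)]
           for a in range(p)]
    xty = [sum(row[a] * y[i] for i, row in enumerate(design))
           for a in range(p)]
    beta = solve(xtx, xty)
    return sum((y[i] - sum(bj * xj for bj, xj in zip(beta, row))) ** 2
               for i, row in enumerate(design))


def sequential_ss(columns, y):
    """Additional sum of squares explained as each column is added in order."""
    out = []
    previous = residual_ss([], y)          # total (corrected) sum of squares
    for k in range(1, len(columns) + 1):
        current = residual_ss(columns[:k], y)
        out.append(previous - current)
        previous = current
    return out


# Hypothetical data, for illustration only.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
x3 = [1, 0, 0, 1, 1, 0]
y = [1.1, 1.9, 3.2, 3.8, 5.1, 6.3]

seq = sequential_ss([x1, x2, x3], y)
total = residual_ss([], y)
full = residual_ss([x1, x2, x3], y)
print(seq, total - full)
```

The three entries of seq correspond to the X1, X2 and X3 lines of the AOV table, and their sum equals the regression sum of squares (total minus residual), as stated in the text.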






Methods and Formulae 

The Cholesky square-root method is used to factor the sum of squares and cross products 
matrix. It is felt that this method produces less round off error than other inversion techniques. 
This method, as well as all other methods and formulae used may be found in F.A. Graybill's 
Theory and Application of the Linear Model, Chapters 7 and 10. 
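
As an illustration of the square-root idea (not the library's actual routine), the following Python sketch factors a small symmetric positive definite matrix A as L L' and then solves A x = b by one forward and one backward substitution. The matrix, right-hand side and function names are hypothetical.

```python
import math


def cholesky(a):
    """Lower-triangular L with L L' = a (a symmetric positive definite)."""
    n = len(a)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(lower[i][k] * lower[j][k] for k in range(j))
            lower[i][j] = math.sqrt(s) if i == j else s / lower[j][j]
    return lower


def cholesky_solve(a, b):
    """Solve a x = b using the factor L: first L z = b, then L' x = z."""
    lower = cholesky(a)
    n = len(b)
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(lower[i][k] * z[k] for k in range(i))) / lower[i][i]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(lower[k][i] * x[k] for k in range(i + 1, n))
        x[i] = (z[i] - tail) / lower[i][i]
    return x


a = [[4.0, 2.0], [2.0, 3.0]]
b = [10.0, 8.0]
factor = cholesky(a)
solution = cholesky_solve(a, b)
print(factor)
print(solution)
```

In a regression setting A would be the sums of squares and cross products matrix and b the cross products with the dependent variable, so the solution holds the coefficient estimates.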



Stepwise Regression 
(Variable Selection Procedures) 

Object of Program 

This program allows a regression model to be built iteratively using one of four variable
selection procedures. The procedures are stepwise, forward, backward, and manual. A correlation
matrix is calculated and output. An analysis of variance table, as well as partial correlations, F
values for deletion and inclusion, and the regression coefficients are output at each step of the
regression. In addition, a residual analysis may be performed.

The four selection procedures operate as follows: 

Stepwise 

You specify an F-to-enter and an F-to-delete, and the program begins with no variables in the 
regression model. If any of the variables have an F value larger than the F-to-enter, then the 
variable with the largest F value is entered into the model. This process is repeated with the 
remaining variables. At this point, the F values of the variables in the model are compared 
with the F-to-delete. If a variable has a smaller F value than the F-to-delete, it is removed 
from the model. This process of adding and deleting variables continues until all the variables 
in the model have F values larger than the F-to-delete and all the variables not in the model 
have F values smaller than the F-to-enter, or until the tolerance value becomes too small. A 
small tolerance value signals that the matrix has become unstable. 

Forward Selection 

You input an F-to-enter. The program operates in the same manner as the stepwise selection 
procedure, except that variables are not deleted. The process continues until all variables not 
in the model have F values smaller than the F-to-enter, or until the tolerance value becomes 
too small. 
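
A minimal Python sketch of forward selection follows. It is an illustration under stated assumptions, not the program's algorithm: the F value of a candidate is taken to be the usual partial F statistic (reduction in residual sum of squares divided by the new residual mean square), the tolerance check is omitted, and the data and names are hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def residual_ss(columns, y):
    """Residual sum of squares of y regressed on a constant plus columns."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(design[0])
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(p)]
           for a in range(p)]
    xty = [sum(row[a] * y[i] for i, row in enumerate(design))
           for a in range(p)]
    beta = solve(xtx, xty)
    return sum((y[i] - sum(bj * xj for bj, xj in zip(beta, row))) ** 2
               for i, row in enumerate(design))


def forward_select(candidates, y, f_to_enter):
    """Enter the variable with the largest F until none exceeds f_to_enter."""
    n = len(y)
    chosen = []
    remaining = list(candidates)
    while remaining:
        rss_now = residual_ss([candidates[v] for v in chosen], y)
        best_name, best_f = None, 0.0
        for name in remaining:
            cols = [candidates[v] for v in chosen + [name]]
            rss_new = residual_ss(cols, y)
            df_resid = n - (len(cols) + 1)      # constant plus variables
            f_value = (rss_now - rss_new) / (rss_new / df_resid)
            if f_value > best_f:
                best_name, best_f = name, f_value
        if best_name is None or best_f <= f_to_enter:
            break
        chosen.append(best_name)
        remaining.remove(best_name)
    return chosen


# Hypothetical data: Y depends strongly on X1 and hardly at all on X2.
variables = {
    "X1": [1, 2, 3, 4, 5, 6, 7, 8],
    "X2": [1, -1, 1, -1, 1, -1, 1, -1],
}
y = [2.1, 3.8, 6.05, 8.15, 9.9, 12.2, 13.85, 15.95]
selected = forward_select(variables, y, f_to_enter=4.0)
print(selected)
```

With these data X1 enters on the first step, and the F value of X2 given X1 falls below the F-to-enter of 4, so the procedure stops with X1 alone in the model.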

Backward Elimination 

You input an F-to-delete and the program begins with all the variables in the model. If any 
variable has an F value smaller than the F-to-delete, then that variable with the smallest F 
value is deleted from the model. This process continues until all the variables in the model 
have F values larger than the F-to-delete or until the tolerance value becomes too small. 






Manual Selection 

As the name implies, variables are added or deleted manually until you are satisfied with the 
model. 

Typical Program Flow 



Input data via BSDM 






Select Advanced Statistics option 






Select Stepwise routine 






Specify variables to be in regression 






Choose selection method 




Specify control parameters 
such as F to enter 






Variable selection is performed 






Residual analysis 



Special Considerations 

F Values Insufficient for Further Computation 

If one of the stepwise, forward, or backward procedures is used in the selection of variables, 
the program will proceed automatically by entering and/or removing variables from the model 
until the F values are not exceeded or until the tolerance value is not met. At this point the 
program reverts to the manual mode. So, for example, this allows you to enter a variable 
whose F value is just slightly less than the specified F-to-enter. 






Methods of Computing Correlations 

Two methods of computing correlations are available. The first method will use an observa- 
tion only if data values are present for each variable. The second method uses all possible 
data values to compute each correlation. If no missing values are present, method two should 
be used to speed computation. 

A simple example will show the difference between the two methods. Suppose we have the 
following data set: 

                        Variable

                     1     2     3

                1    2     3     M
Observations    2    3     2     4
                3    1     3     5
                4    M     1     4



If method one is used to compute the correlations, only observations 2 and 3 will be used. 
Observation 1 will be deleted entirely since the data value is missing for variable 3. Similarly, 
observation 4 will be deleted entirely since the data value is missing for variable 1. 

Conversely, suppose method two is chosen. The correlation between variables 1 and 2 will be 
computed using the data values of observations 1, 2, and 3. The correlation between vari- 
ables 1 and 3 will use the data values associated with observations 2 and 3. Similarly, the 
correlation between variables 2 and 3 will use the data values associated with observations 2, 
3, and 4. Hence, data values from a given observation are used if the data points are present 
for the two variables under consideration. 

The observations used to compute the AOV table are the same as those used to get the
correlations.
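
The difference between the two methods can be sketched in Python using the data set above. The sketch only lists which observations each method would use; None stands for a missing value and the function names are hypothetical.

```python
# Rows are observations 1..4 and columns are variables 1..3.
data = [
    [2, 3, None],
    [3, 2, 4],
    [1, 3, 5],
    [None, 1, 4],
]


def method_one(rows):
    """Observations with a value present for every variable."""
    return [number for number, row in enumerate(rows, start=1)
            if all(value is not None for value in row)]


def method_two(rows, var_a, var_b):
    """Observations with values present for the two variables (1-based)."""
    return [number for number, row in enumerate(rows, start=1)
            if row[var_a - 1] is not None and row[var_b - 1] is not None]


complete = method_one(data)
pair_12 = method_two(data, 1, 2)
pair_13 = method_two(data, 1, 3)
pair_23 = method_two(data, 2, 3)
print(complete, pair_12, pair_13, pair_23)
```

Method one gives a single set of observations shared by every correlation, while method two gives a possibly different set for each pair of variables.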

F-to-enter, F-to-delete 

A variable must have an F value greater than the value of F-to-enter for entry into the
regression model via the stepwise or forward selection procedures. A typical value is 4. A
variable may be deleted from the regression via the stepwise or backward selection procedures
only if its F value is less than the value of F-to-delete. When using the stepwise procedure,
you must have F-to-enter >= F-to-delete. The F-to-enter should be selected from
tabled values for your desired significance level with 1 and n-v degrees of freedom, where n is
the number of observations and v is the number of variables in the regression. Since you
don't know a priori how many variables will be in the regression, you might guess the number
of variables which will end up in the regression for your initial analysis.






Tolerance Value 

You will be asked to enter a tolerance value. Your input must be between 0 and 1. The
tolerance value is a scaled function of the determinant of the X'X matrix, and is a measure of
the stability of the correlation matrix. If a variable not in the equation is linearly dependent on
one or more of the variables already in the model, then the correlation matrix will have a
determinant of zero. So, if the computed tolerance value gets too small, this might suggest a
singular matrix. A suggested value for the tolerance is .01.

Reading the Output 

In the algorithm, one variable will be entered or deleted per step. The variables currently 
included in the regression model are printed on the left side of the table. The variables which 
are not currently included in the model are printed on the right side of the table. 

Partial Correlation 

The partial correlations of the variables not currently in the regression equation are output. 
After a variable, say XI, has been entered into the regression model, the program calculates 
the partial correlation of the other independent variables with the dependent variable, given 
that XI is in the regression model. 

Adding One Variable to the Model 

If any of the variables has an F value larger than the F-to-enter, then the variable with the 
largest F will be entered into the model provided that its tolerance value is greater than the 
user specified tolerance value. 

Deleting One Variable from the Model 

If any variable currently in the regression equation has an F value smaller than F-to-delete, 
then the one with the smallest F value will be deleted from the model at that step. 

Manual Selection 

After you have completed a portion of the program, you will see the prompt "Input 'K', delete
'-K' ?". At this point the program is operating in a manual mode. That is, you may add a
variable to the regression equation by entering its number, or delete a variable from the
equation by entering its number preceded by a minus sign.

Methods and Formulae 

All methods and formulae used in this routine may be found in Statistical Methods for Digital
Computers by K. Enslein, et al.






Polynomial Regression 

Object of Program 

This program is designed to fit a polynomial regression model of the form: 

Y = B0 + B1(X) + B2(X^2) + B3(X^3) + ... + Bp(X^p)

where p <= 10. The regression coefficients, B0, B1, ..., Bp are computed by the method of
least squares.

The degree of the regression, p, is chosen by you with the aid of a preliminary analysis of 
variance table and, if desired, an X-Y scatter plot. The preliminary analysis of variance table 
shows the additional sum of squares explained by models of successive degrees as well as the 
associated F values and R-squared values. 

After the degree of the regression is selected, an analysis of variance table for the model is 
printed and confidence intervals are constructed about the coefficients. In addition, a residual 
analysis may be performed. 

Typical Program Flow 



Input data via BSDM 



Select Advanced Statistics option 



Select Polynomial Regression 



Specify variables 
and subfile for analysis 



Plot the X-Y pairs 



Input maximum degree 
of regression to consider 



Decide degree of regression 
based on preliminary AOV table 



Obtain final AOV, 

parameter estimates and 

confidence intervals 



Plot regression line 



Perform residual analysis 






Special Considerations 

Degree of Model 

The maximum degree of the model has been set (somewhat arbitrarily) at 10. Models of 
degree ten involve arithmetic operations using the X variable raised to the 20th power, where 
X is the independent variable. Hence, substantial round-off errors may occur with models of 
high degree. In general, a model of degree p will involve X values raised to the 2*p power. It 
is therefore suggested that you use extreme caution in choosing models of high degree. 
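
The reason a degree p model involves X raised to the 2*p power can be seen from the normal equations: the entry in row j and column k of the sums of squares and cross products matrix is the sum of X^(j+k). The Python sketch below builds that matrix for made-up X values; the function name is hypothetical and the program's internal layout may differ.

```python
def power_sums_matrix(x, degree):
    """Matrix whose (j, k) entry is the sum of x**(j + k), j, k = 0..degree."""
    return [[sum(v ** (j + k) for v in x) for k in range(degree + 1)]
            for j in range(degree + 1)]


# Quadratic model (p = 2): the last entry is the sum of X^4.
small = power_sums_matrix([1, 2, 3], 2)
print(small)

# Degree 10 with X = 1, 2, ..., 10: the last entry is the sum of X^20,
# while the first entry is just the number of observations.
big = power_sums_matrix(range(1, 11), 10)
print(big[0][0], big[10][10])
```

The entries of the degree 10 matrix span roughly twenty orders of magnitude for these X values, which is the source of the round-off problems mentioned above.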

Method of Computing Sums of Squares and Cross Products Matrix 

If a data value is missing for one of the two variables, the entire observation is deleted, i.e., 
not used in the computation of the sums of squares and cross products matrix. See Special 
Considerations of the Multiple Linear Regression section for an example. 

Preliminary AOV Table 

After plotting the X-Y data pairs, you will be asked to specify the maximum degree of the 
regression. A preliminary AOV table will be displayed which will show the additional sum of 
squares and R-squared for the linear, quadratic, cubic, ... regression models. This table can be 
used as an aid in determining the appropriate degree for your polynomial model. 

Plotting Considerations 

When plotting the data and regression, every tic mark on the axes will be labeled. So, you 
should specify no more than 10 tic marks to obtain an uncluttered plot. One tic mark will 
coincide with the point where the X-axis crosses the Y-axis. Another tic mark will coincide 
with the point where the Y-axis crosses the X-axis. 

Plotting the data is highly recommended since a plot may suggest the degree of the polyno- 
mial model. 

Methods and Formulae 

The Cholesky square-root method is used to factor the sum of squares and cross products
matrix. It is felt that this inversion method produces less round-off error than other
procedures. This method, as well as all other methods and formulae, may be found in F.A. Graybill's
Theory and Application of the Linear Model.






Nonlinear Regression 

Object of Program 

Given a model

$$Y = f(X_1, X_2, \ldots, X_m; \beta_1, \beta_2, \ldots, \beta_p) + \epsilon$$

where the model f contains m independent variables $X_i$ and p parameters $\beta_j$, and given n
observations

$$(Y_i, X_{i1}, X_{i2}, \ldots, X_{im}); \quad i = 1, 2, \ldots, n$$

this program computes the least squares estimates $\hat{\beta}_j$; that is, the program adjusts the $\beta_j$ to
minimize

$$Q = \sum_{i=1}^{n} \{ Y_i - f(X_{i1}, X_{i2}, \ldots, X_{im}; \beta_1, \beta_2, \ldots, \beta_p) \}^2$$

You supply the functional form of f. For example, one possible form would be

$$Y = \beta_1 \exp(\beta_2 X_1 + \beta_3 X_2) + \beta_4$$

The program also provides X-Y scatter plots (the non-linear regression curve can be added to 
the plot if the model contains only one independent variable). After each iteration the follow- 
ing information is output: the iteration number, estimated parameter values, and sum of 
squared residuals (Q). Confidence intervals (regions) on the parameters are also constructed. 
In addition, a residual analysis may be performed. 

Before beginning the program, you will need to create a file which contains the function and 
partial derivatives. The necessary steps are shown in the Special Considerations section. 



Typical Program Flow 






Input data via BSDM 



Select advanced statistics 



Insert program medium 



Choose Nonlinear Regression 



Specify variables and subfiles 



Plot X versus Y 



Load subroutine with 
function and partial derivatives. 





Enter initial values 
for every parameter 






Estimation of parameters 






Plot regression line 






Confidence intervals
on parameters






Residual analysis 






Special Considerations 

Limitations 

The maximum number of parameters in the model is 20. Also, the number of observations 
times the number of parameters must be less than or equal to 5000. 

Convergence Criteria 

From a user viewpoint there are three modes of program termination during the iterative
stage of estimation of the parameters. The first mode is the satisfactory completion of the
convergence criterion; that is, the iteration is terminated whenever

$$\frac{|\delta_j|}{0.001 + |\beta_j|} < \text{delta} \quad \text{for all } j$$

where delta is a small number that you input, and $\delta_j$ is the change in $\beta_j$ resulting from the last
iteration. This is the normal termination which should occur when a proper function has been
specified for f, the derivatives are specified correctly, and the initial estimates for the
parameters are reasonable.
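
The convergence test can be written as a one-line check. The Python sketch below is illustrative only; the function and argument names are hypothetical.

```python
def has_converged(beta, change, delta):
    """True when |change_j| / (0.001 + |beta_j|) < delta for every parameter."""
    return all(abs(d) / (0.001 + abs(b)) < delta
               for b, d in zip(beta, change))


# Small relative changes in both parameters: the criterion is met.
first = has_converged([2.0, 0.5], [1e-6, -2e-7], delta=1e-5)

# A parameter near zero: the 0.001 term keeps the ratio finite, but the
# change of 1e-4 is still too large relative to it.
second = has_converged([0.0], [1e-4], delta=1e-2)
print(first, second)
```

The constant 0.001 in the denominator prevents division by zero when a parameter estimate is at or near zero.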

A second mode of termination can occur when the program determines that the process is not 
converging in a satisfactory manner. (For the procedure used in determining whether the 
process is converging properly, see Reference 5.) If the program does terminate the iterative 
process, you are able to respecify the convergence coefficient (Delta), the function and/or 
derivatives, and the initial parameter estimates. 

The third method of termination of the iterative process is for you to "force off" the computa- 
tional process by pressing the "No" key. 

Quick Plot 

A quick plot is essentially a default plot with plotting parameters: 

1. X-min = actual X-min, X-max = actual X-max.

2. Y-min = actual Y-min, Y-max = actual Y-max.

3. Y-axis crosses X-axis at X-min.

4. X-axis crosses Y-axis at Y-min.

5. Distance between X-tics = (Xmax-Xmin)/5.

6. Distance between Y-tics = (Ymax-Ymin)/5.

7. Number of decimals for labeling X-axis and Y-axis = 2.

You may wish to have the quick plot drawn in order to "see" what the relationship 
between Y and the X you have chosen looks like. 
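
The quick plot defaults listed above amount to a small calculation on the data ranges. The Python sketch below restates them; the function name and dictionary keys are hypothetical and serve only to label the seven settings.

```python
def quick_plot_settings(x, y):
    """Default plotting parameters derived from the data ranges."""
    xmin, xmax = min(x), max(x)
    ymin, ymax = min(y), max(y)
    return {
        "x_range": (xmin, xmax),
        "y_range": (ymin, ymax),
        "y_axis_crosses_x_axis_at": xmin,
        "x_axis_crosses_y_axis_at": ymin,
        "x_tic_distance": (xmax - xmin) / 5,
        "y_tic_distance": (ymax - ymin) / 5,
        "label_decimals": 2,
    }


settings = quick_plot_settings([0, 4, 10], [1, 6, 3])
print(settings)
```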

The actual limits of the confidence intervals are very data dependent. Caution should be 
exercised in using these limits if many iterations were required to determine the regression 
coefficients. 






Before you Run Non-linear Regression 

To run non-linear regression, you must first create a file which contains the function and partial 
derivatives you wish to use. You can create as many of the files as you wish. The procedure to 
create these files is as follows: 

• Insert your floppy in the built-in disc drive 

• Type SCRATCH A; press EXECUTE 

• Press EDIT key; press EXECUTE 

You should now see the line number ten on the screen. 

• Now type in each line of the file, pressing ENTER after every line that has been entered. 
The file should resemble the one below. 



Note

Remember that partial derivatives should be taken with respect to
P(*).

10 SUB Function(P(*),X(*),F)
20 F=P(1)+P(2)*X(1)^P(3)
30 SUBEND
40 SUB Partial(P(*),X(*),Der(*))
50 Der(1)=1
60 Der(2)=X(1)^P(3)
70 Der(3)=P(2)*LOG(X(1))*X(1)^P(3)
80 SUBEND

• The two SUB statements in your file must be exactly the same as in the example. 

• When you have finished typing the two subroutines, press the CLR SCR KEY. Type STORE 
"name of file". You may name your file whatever you like as long as the name is not greater 
than ten characters long and has nothing but letters and numbers in it. 

• You may now begin running the Statistics Library by typing LOAD "AUTOST",1 with the
Basic Statistics and Data Manipulation disc in the internal disc drive.
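
Because the program relies on the partial derivatives you type in, it can be worth checking them before running the regression. The following Python sketch restates the example function F = P(1) + P(2)*X(1)^P(3) and compares the hand-derived partials with central finite differences. It is an independent check, not part of the library; the parameter values are arbitrary and LOG is assumed to be the natural logarithm.

```python
import math


def function(p, x):
    """F = P1 + P2 * X1 ** P3, mirroring the example subroutine."""
    return p[0] + p[1] * x[0] ** p[2]


def partial(p, x):
    """Hand-derived partial derivatives of F with respect to P1, P2, P3."""
    power = x[0] ** p[2]
    return [1.0, power, p[1] * math.log(x[0]) * power]


def numeric_partial(p, x, h=1e-6):
    """Central-difference approximation to the same derivatives."""
    out = []
    for j in range(len(p)):
        up = list(p)
        down = list(p)
        up[j] += h
        down[j] -= h
        out.append((function(up, x) - function(down, x)) / (2 * h))
    return out


p = [1.0, 2.0, 1.5]
x = [3.0]
analytic = partial(p, x)
numeric = numeric_partial(p, x)
largest_gap = max(abs(a - b) for a, b in zip(analytic, numeric))
print(analytic, numeric, largest_gap)
```

A large gap between the two sets of values would point to an error in one of the Der() lines.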






Methods and Formulae 

Marquardt's procedure (see Reference 5) is used to obtain the estimated parameters in
each iteration. Define the n by p matrix

$$Z = (Z_{ij}) = \left[ \frac{\partial f(X_{i1}, X_{i2}, \ldots, X_{im}; \beta_1, \ldots, \beta_p)}{\partial \beta_j} \right] = \left[ \frac{\partial f(X_i; \beta)}{\partial \beta_j} \right]$$

then each iteration can be written as

$$\beta^{(k+1)} = \beta^{(k)} + \delta^{(k)}$$

where $\delta^{(k)}$ is the solution of the set of linear equations

$$(A + \lambda I)\,\delta = Z'(Y - f(X, \beta)) = g$$

where $A = Z'Z$ and g are evaluated at $\beta^{(k)}$ (both A and g are normalized in the program), and
where $\lambda$ is an adjustable parameter which is used to control the iteration. The motivation of
Marquardt's method is to choose $\lambda$ so as to follow the Gauss-Newton method to as large an
extent as possible, while retaining a bias towards the steepest descent direction to prevent
divergence.
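
The iteration can be sketched in Python for a small hypothetical model, Y = b1*exp(b2*X). This is a simplified illustration rather than the program's algorithm: A and g are not normalized, the rule for adjusting the damping parameter (divide by 10 after a successful step, multiply by 10 after a failed one) is an assumption, and all names are made up.

```python
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def model(beta, xi):
    return beta[0] * math.exp(beta[1] * xi)


def gradient(beta, xi):
    """Partial derivatives of the model with respect to b1 and b2."""
    e = math.exp(beta[1] * xi)
    return [e, beta[0] * xi * e]


def marquardt(x, y, beta, lam=0.01, delta=1e-8, max_iter=100):
    def q_value(b):
        return sum((yi - model(b, xi)) ** 2 for xi, yi in zip(x, y))

    p = len(beta)
    q = q_value(beta)
    for _ in range(max_iter):
        z = [gradient(beta, xi) for xi in x]
        resid = [yi - model(beta, xi) for xi, yi in zip(x, y)]
        a = [[sum(row[i] * row[j] for row in z) for j in range(p)]
             for i in range(p)]
        g = [sum(row[i] * r for row, r in zip(z, resid)) for i in range(p)]
        while True:
            damped = [[a[i][j] + (lam if i == j else 0.0) for j in range(p)]
                      for i in range(p)]
            step = solve(damped, g)               # (A + lam*I) step = g
            trial = [b + s for b, s in zip(beta, step)]
            q_trial = q_value(trial)
            if q_trial <= q:                      # accept, lean to Gauss-Newton
                lam /= 10
                break
            lam *= 10                             # reject, lean to steepest descent
            if lam > 1e12:
                return beta
        done = all(abs(s) / (0.001 + abs(b)) < delta
                   for s, b in zip(step, trial))
        beta, q = trial, q_trial
        if done:
            break
    return beta


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * math.exp(0.5 * v) for v in xs]        # noise-free data
estimate = marquardt(xs, ys, [1.5, 0.4])
print(estimate)
```

With a small damping parameter the step is close to the Gauss-Newton step; each rejected step increases the damping and turns the direction towards steepest descent, as described above.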

The square root method is used to solve the system of linear equations in each iteration and
to obtain $C = (C_{ij}) = A^{-1}$.

For the confidence intervals (regions) on parameters, the $1-\alpha$ one-at-a-time confidence
interval on $\beta_j$ is

$$\hat{\beta}_j - t(\alpha/2: n-p)\,(S_e^2 C_{jj})^{1/2} \le \beta_j \le \hat{\beta}_j + t(\alpha/2: n-p)\,(S_e^2 C_{jj})^{1/2}$$

and the approximate $1-\alpha$ simultaneous confidence intervals on the $\beta_j$'s are

$$\hat{\beta}_j - (p\,F(\alpha: p, n-p)\,S_e^2 C_{jj})^{1/2} \le \beta_j \le \hat{\beta}_j + (p\,F(\alpha: p, n-p)\,S_e^2 C_{jj})^{1/2}$$

where p is the number of parameters in the model, n is the number of observations (excluding
the missing values), $t(\alpha/2: n-p)$ is the $\alpha/2$ upper point of the t-distribution with n-p degrees of
freedom, $F(\alpha: p, n-p)$ is the $\alpha$ upper point of the F-distribution with p and n-p degrees of
freedom, and $S_e$ is the standard error of the residuals.

References 

1. Draper, N., and Smith, H., (1980) Applied Regression Analysis, 2nd Edition, John Wiley 
and Sons, Inc., New York. 

2. Fletcher, R. (1971) "A Modified Marquardt Subroutine for Nonlinear Least Squares",
United Kingdom Atomic Energy Authority Research Group Report.

3. Graybill, F. (1976) Theory and Application of the Linear Model, Wadsworth Publishing 
Co., Inc., California. 

4. Kopitzke, R., and (Boardman, T.J., Editor). Unpublished Notes for 9830A Statistical 
Distribution Pac. Hewlett-Packard, September 1976. Part No. 09830-70854. 

5. Marquardt, D. (1963). "An Algorithm for Least Squares Estimation of Nonlinear
Parameters". J. Soc. Indust. and Appl. Math., 11, No. 2.






Standard Nonlinear Regressions 

Object of Program 

This program determines the regression coefficients for the following four types of standard 
non-linear regression models: 

1. Y = A(X^B) + C

2. Y = A*Exp(BX) + C

3. Y = A*Exp(BX) + C*Exp(DX) + E

4. Y = A*Sin(BX) + C*Cos(DX) + E

where the intercept term, C or E above, is optional. The intercept is determined by using an 
approximate minimum Y value in the observed data as the initial value. 

Typical Program Flow 



Input data via BSDM 






Select Advanced Statistics option 






Choose Standard Non-linear 
Regression routine 






Specify variables and 
subfile for the analysis 






Choose model and, 
if desired, intercept 






Plot the data 






Use initial parameter values 
provided or supply your own 






Non-linear regression performed 
to estimate parameters 






Plot regression curve 






Obtain confidence intervals 






Perform residual analysis 






Special Considerations 

Initial Parameter Estimates 

In models 1), 2), and 3), initial estimates for parameters are obtained by linearizing the model. 
This is accomplished by taking the logarithm of both sides of the equation for model 1, and by 
taking the logarithm of Y in models 2 and 3. In model 3, C is taken as .1*A and D = .5*B. In 
model 4: 

A = (Ymax - E) * Sin(a) * Cos(B * Xmax) 

B = 360 / (length in units of X of a typical cycle) 

C = (Ymax - E) * Cos(a) * Sin(B*Xmax) 

D = B 

E = sample mean of y 

where a = 90 - B * X1, for data in degrees, and X1 is the X value at Ymax.
For angular units in radians, the estimates of B and C will change accordingly.
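
The linearization used for the initial estimates can be illustrated for model 2 without an intercept, Y = A*Exp(BX): taking the logarithm of Y gives ln Y = ln A + B*X, a straight line in X. The Python sketch below fits that line by ordinary least squares. It is an illustration only; the function name and data are hypothetical and all Y values are assumed to be positive.

```python
import math


def initial_estimates_exponential(x, y):
    """Starting values for A and B in Y = A*exp(B*X) from a fit of ln Y on X."""
    logs = [math.log(v) for v in y]           # requires every Y > 0
    n = len(x)
    mean_x = sum(x) / n
    mean_log = sum(logs) / n
    sxy = sum((xi - mean_x) * (li - mean_log) for xi, li in zip(x, logs))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    a = math.exp(mean_log - b * mean_x)
    return a, b


xs = [0.0, 1.0, 2.0, 3.0]
ys = [3.0 * math.exp(0.7 * v) for v in xs]    # exact exponential data
a0, b0 = initial_estimates_exponential(xs, ys)
print(a0, b0)
```

For real data the values obtained this way are only starting points; the non-linear regression then refines them.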

Convergence Criteria 

There are three ways by which the program may terminate its iterative procedure of estimat- 
ing the model parameters. 

a. The iteration is terminated when

$$\frac{|\Delta_j|}{0.001 + |\beta_j|} < \text{Delta} \quad \text{for all regression coefficients } \beta_j,$$

where Delta is a small number that you input, and $\Delta_j$ is the change in $\beta_j$ resulting from
the last iteration. This is the normal termination which should occur when the proper
model has been selected for a given data set and the initial estimates are chosen
properly.

b. When the program determines that the process is not converging in a satisfactory
manner, it will terminate. For the procedure used in determining whether the process is
converging properly, see reference 5 in the Non-linear Regression section. If the
program does terminate the iterative process, you can re-specify the convergence coefficient
(Delta), and/or the initial estimates of the parameters and try the regression again.

c. You may force the iterative procedure to terminate by pressing the "Stop" key. 

Angular Units for Model 4 

When model 4, the trigonometric model, is chosen, you need to specify two additional items for 
the program. You must declare whether your X values are in degrees or radians. In addition, 
during the routine which supplies the initial estimates for the parameters, you need to specify 
the length of a typical cycle of data. 






Residual Analysis 

Object of Program 

This program allows you to analyze the residuals from a regression problem in order to check 
the adequacy of the regression model. It may be used upon completion of any of the regres- 
sion routines. The residuals may be printed and/or plotted. 

The residual printout includes the observed values, predicted values, residuals, and standar- 
dized residuals. A final column shows which residuals are significantly large. 

The residual plot allows you to plot the standardized residuals versus observation number or 
versus any of the variables in the model. 

Residuals may be generated for subfiles which were not used in determining the regression
equation. This may be useful as a method of confirming the adequacy of the derived
model.

Typical Program Flow 



Request a residual analysis 
upon completion of regression 



Printout residuals 



Plot residuals 



Special Considerations 

Range of Standardized Residuals 

The standardized residuals are plotted in a range from -5 to 5. If any standardized residuals 
are outside this range they will not be plotted, but a note showing the number of residuals off 
scale will be added to the plot. 

Significance of Residuals 

The last column in the residual table output shows which residuals are significantly large. In 
this column, two asterisks are printed for standardized residuals between two and three stan- 
dard deviations away from zero. Similarly, three asterisks are printed for standardized re- 
siduals between three and four standard deviations away from zero, and four asterisks are 
printed for standardized residuals four or more standard deviations away from zero. 
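
The flagging rule can be sketched as a small Python function. The treatment of values falling exactly on 2, 3 or 4 is an assumption (each boundary is assigned to the higher category here), and the function name is hypothetical.

```python
def significance_flag(standardized_residual):
    """Asterisks printed for a standardized residual, by distance from zero."""
    size = abs(standardized_residual)
    if size >= 4:
        return "****"
    if size >= 3:
        return "***"
    if size >= 2:
        return "**"
    return ""


flags = [significance_flag(v) for v in (1.5, -2.4, 3.2, -4.0)]
print(flags)
```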

Distance Between X Tic Marks When Plotting 

The first tic mark will coincide with the minimum X value. Every tic mark will be labeled. 
Hence, an uncluttered plot would contain no more than 10 tic marks. 






Methods and Formulae 

Suppose you wish to fit a regression model of the form: 

Y = B0 + B1X1 + B2X2

where B0, B1, and B2 are the regression coefficients. We will call the nth predicted value for
Y, y(n), the nth residual r(n), and the Jth observation of the Ith variable, D(I,J). We would
then calculate the following:

1. Predicted Y: y(n) = b0 + b1*D(X1,n) + b2*D(X2,n), where b0, b1, and b2 are the
estimated regression coefficients.

2. Residual: r(n) = D(Y,n) - y(n)

3. Standard error of residuals: Ser = (residual mean square)^.5, where the residual
mean square is calculated in the regression routine.

4. Standardized residual: SR(n) = r(n)/Ser 

The residuals for a nonlinear regression are derived in a similar manner except that the non- 
linear regression model is used to predict Y. 
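
The four quantities above can be computed as follows. This Python sketch is illustrative only: the observed values, predicted values and residual mean square are made-up inputs, and the function name is hypothetical.

```python
import math


def residual_table(observed, predicted, residual_mean_square):
    """Rows of (observed, predicted, residual, standardized residual)."""
    ser = math.sqrt(residual_mean_square)     # standard error of residuals
    rows = []
    for y_obs, y_hat in zip(observed, predicted):
        r = y_obs - y_hat
        rows.append((y_obs, y_hat, r, r / ser))
    return rows


table = residual_table([1.0, 2.0, 4.0], [1.5, 2.0, 3.0], 0.25)
for row in table:
    print(row)
```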






Example 1: Multiple Linear Regression 

The data below will illustrate Multiple Linear Regression. The data consist of three variables:
X1, X2, and the dependent variable Y.

Are you going to use user defined transformation
or do Non-linear regression? (Y/N)
NO

Are you using an HP-IB Printer?
YES

Enter select code, bus address (if 7,1 press CONT)?

**********************************************
*              DATA MANIPULATION             *
**********************************************



Enter DATA TYPE:
1                        Raw data

Mode number = ?
                         Stored on mass storage

Is data stored on the program's scratch file (DATA)?
YES                      Previously stored on scratch data file.



EXAMPLE OF MULTIPLE LINEAR REGRESSION

Data file name: DATA
Data type is: Raw data
Number of observations: 9
Number of variables: 6

Variable names:
1. X1
2. X2
3. Y
4. X1^2
5. X2^2
6. X1*X2

Subfiles: NONE



Note: X4, X5, and X6 are derived from X1 
and X2 by transformations. 



SELECT ANY KEY 



Option number = ?



Select special function key labeled-LIST 
List all the data 



Enter method for listing data:






In tabular form 



MULTIPLE LINEAR REGRESSION EXAMPLE

Data type is: Raw data

        Variable # 1   Variable # 2   Variable # 3   Variable # 4   Variable # 5
        (X1)           (X2)           (Y)            (X1^2)         (X2^2)
OBS#
  1        7.80000        4.00000        0.00000       60.84000       16.00000
  2        7.80000        8.00000         .03100       60.84000       64.00000
  3        7.80000       12.00000         .47500       60.84000      144.00000
  4       39.00000        4.00000         .01600     1521.00000       16.00000
  5       39.00000        8.00000   8.000000E-03     1521.00000       64.00000
  6       39.00000       12.00000         .19000     1521.00000      144.00000
  7       78.00000        4.00000        0.00000     6084.00000       16.00000
  8       78.00000        8.00000         .03900     6084.00000       64.00000
  9       78.00000       12.00000        0.00000     6084.00000      144.00000

        Variable # 6
        (X1*X2)
OBS#
  1       31.20000
  2       62.40000
  3       93.60000
  4      156.00000
  5      312.00000
  6      468.00000
  7      312.00000
  8      624.00000
  9      936.00000



For this data set only X1, X2 and Y need be
typed in. When this is done, select the
transformation key on the template. To get X1^2,
choose option 1 allowing a=1, b=2, and
c=0. This creates a new variable X1^2. The
same is done to obtain X2^2. To obtain
X1*X2, choose option 10 allowing a=1,
b=1, and c=1. Once you have all these
variables, store them by using the Store key
on the template.



Option number = 7



SELECT ANY KEY 



Exit from the List routine. 



What statistic options are desired ?



Select Special Function Key labeled-STATS 



Select just the mean, ci, variance, standard 
deviation, skewness, and kurtosis of all the 
data variables. 



1

VARIABLES = ?

ALL

Confidence coefficient for confidence interval on the mean (e.g. 90, 95, 99%) =

95                       95% ci for means requested.

***********************************************

* SUMMARY STATISTICS * 

* ON DATA SET; * 

* MULTIPLE LINEAR REGRESSION EXAMPLE * 
*********************************************** 






BASIC STATISTICS

VARIABLE   # OF   # OF
NAME       OBS.   MISS           SUM           MEAN         VARIANCE      STD. DEV.
X1           9      0      374.40000       41.60000        927.81000       30.45997
X2           9      0       72.00000        8.00000         12.00000        3.46410
Y            9      0         .75900         .08433           .02506         .15832
X1^2         9      0    22997.52000     2555.28000    7403936.57637     2721.01756
X2^2         9      0      672.00000       74.66667       3136.00000       56.00000
X1*X2        9      0     2995.20000      332.80000      90043.20000      300.07199

VARIABLE    COEFFICIENT     STD. ERROR       95 % CONFIDENCE INTERVAL
NAME        OF VARIATION    OF MEAN          LOWER LIMIT     UPPER LIMIT
X1             73.22109       10.15332         18.18009        65.01991
X2             43.30127        1.15470          5.33654        10.66346
Y             187.72946         .05277          -.03739          .20606
X1^2          106.48608      907.00585        463.15784      4647.40216
X2^2           75.00000       18.66667         31.60967       117.72366
X1*X2          90.16586      100.02400        102.08217       563.51783

VARIABLE       SKEWNESS       KURTOSIS
X1               .13506       -1.50000
X2              0.00000       -1.50000
Y               1.93769        2.29099
X1^2             .53922       -1.50000
X2^2             .29480       -1.50000
X1*X2            .88424         .26334



What statistic options are desired ?

2                        Request the correlation matrix of all the data variables.

VARIABLES = ?

ALL

* SUMMARY STATISTICS *

* ON DATA SET: *

* MULTIPLE LINEAR REGRESSION EXAMPLE *

CORRELATION MATRIX

                  X1           X2            Y         X1^2         X2^2

X2         0.0000000
Y          -.4209438     .5916875
X1^2        .9747877    0.0000000    -.3905355
X2^2       0.0000000     .9897433     .6250961    0.0000000
X1*X2       .8120711     .4802402    -.2314209     .7915969     .4753145



What statistic options are desired ?

                         Gives median, mode, percentiles, min, max,
                         and range of all the data.

VARIABLES = ?

ALL






* SUMMARY STATISTICS *

* ON DATA SET: *

* MULTIPLE LINEAR REGRESSION EXAMPLE *

ORDER STATISTICS

VARIABLE        MAXIMUM        MINIMUM          RANGE       MIDRANGE
X1             78.00000        7.80000       70.20000       42.90000
X2             12.00000        4.00000        8.00000        8.00000
Y                .47500        0.00000         .47500         .23750
X1^2         6084.00000       60.84000     6023.16000     3072.42000
X2^2          144.00000       16.00000      128.00000       80.00000
X1*X2         936.00000       31.20000      904.80000      483.60000

TUKEY'S HINGES

VARIABLE         MEDIAN    25-th %-ile    75-th %-ile
X1             39.00000        7.80000       39.00000
X2              8.00000        4.00000        8.00000
Y                .01600         .00000         .03100
X1^2         1521.00000       60.84000     1521.00000
X2^2           64.00000       16.00000       64.00000
X1*X2         312.00000       93.60000      312.00000

TUKEY'S MIDDLEMEANS

VARIABLE        MIDMEAN        TRIMEAN      MIDSPREAD
X1             40.56000       31.20000       31.20000
X2              8.00000        7.00000        4.00000
Y                .01880         .01575         .03100
X1^2         2141.56800     1155.96000     1460.16000
X2^2           70.40000       52.00000       48.00000
X1*X2         268.32000      257.40000      218.40000

Other percentiles?

NO













What statistic options are desired ?    Note: All three sets of statistics could have
                                        been selected originally by answering ALL to
                                        the option question.

                                        Exit Basic Statistics routine.

SELECT ANY KEY                          Select special function key labeled-ADV STATS.
                                        Remove BSDM medium.
                                        Insert regression medium.
Option number = ?

5                                       Multiple linear regression.

Number of the dependent variable = ?

3                                       Y = variable 'Y'

Which of the remaining variables should be included in the regression ?

ALL                                     X1, X2, X1^2, X2^2, and X1*X2

Is above information correct?

YES                                     Displayed on CRT



MULTIPLE LINEAR REGRESSION ON DATA SET:

MULTIPLE LINEAR REGRESSION EXAMPLE

--where:  Independent variable(s) = (1)X1
                                    (2)X2
                                    (4)X1^2
                                    (5)X2^2
                                    (6)X1*X2

                                                   STANDARD     COEFF. OF
VARIABLE    N          MEAN          VARIANCE     DEVIATION     VARIATION
X1          9      41.60000         927.81000      30.45997      73.22109
X2          9       8.00000          12.00000       3.46410      43.30127
X1^2        9    2555.28000     7403936.57637    2721.01756     106.48608
X2^2        9      74.66667        3136.00000      56.00000      75.00000
X1*X2       9     332.80000       90043.20000     300.07199      90.16586
Y           9        .08433            .02506        .15832     187.72946



CORRELATION MATRIX

                X1          X2        X1^2        X2^2       X1*X2
X2       0.0000000
X1^2      .9747877   0.0000000
X2^2     0.0000000    .9897433   0.0000000
X1*X2     .8120711    .4802402    .7915969    .4753145
Y        -.4209438    .5916875   -.3905355    .6250961   -.2314209



ANALYSIS OF VARIANCE TABLE

SOURCE         DF    SUM OF SQUARES    MEAN SQUARE    F-VALUE
TOTAL           8        .20052
REGRESSION      5        .17769           .03554         4.67
  X1            1        .03553           .03553         4.67
  X2            1        .07020           .07020         9.23
  X1^2          1        .00158           .00158          .21
  X2^2          1        .01531           .01531         2.01
  X1*X2         1        .05507           .05507         7.24
RESIDUAL        3        .02283           .00761

R-SQUARED = .88615
STANDARD ERROR OF ESTIMATE = .0872327012721



From the AOV table we see that the additional sum of squares for each
variable produces a 'reasonable' F except X4 and X5.
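Each F-value in the AOV table is the additional (sequential) sum of squares for a one-degree-of-freedom term divided by the residual mean square. The sketch below redoes that arithmetic from the printed, rounded table entries, so the last digit can differ slightly from the printout; the variable names are illustrative.

```python
# Sequential sums of squares (1 d.f. each) as printed in the AOV table.
sequential_ss = {
    "X1": 0.03553,
    "X2": 0.07020,
    "X1^2": 0.00158,
    "X2^2": 0.01531,
    "X1*X2": 0.05507,
}
residual_ss = 0.02283
residual_df = 3

residual_ms = residual_ss / residual_df          # about .00761
f_values = {name: ss / residual_ms for name, ss in sequential_ss.items()}

for name, f in f_values.items():
    print(f"{name:<6} F = {f:5.2f}")
```

Because the sums of squares are sequential, each F measures what a term adds to the terms listed above it, not its contribution in isolation.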



                REGRESSION COEFFICIENTS           STANDARD ERROR
VARIABLE      STD. FORMAT       E-FORMAT          REG. COEFFICIENT    T-VALUE
'CONSTANT'      -.00218     -.218154219795E-02         .25209           -.01
X1               .00247      .246964177292E-02         .00517            .48
X2              -.02576     -.257643442623E-01         .06364           -.40
X1^2             .00002      .231329291158E-04         .00005            .46
X2^2             .00547      .546875000000E-02         .00386           1.42
X1*X2           -.00083     -.833990121900E-03         .00031          -2.69



Confidence coefficient (e.g., 90,95,99) = ?
95

                                 95 % CONFIDENCE INTERVAL
VARIABLE      COEFFICIENT      LOWER LIMIT      UPPER LIMIT
'CONSTANT'      -.00218          -.72581           .72145
X1               .00247          -.01237           .01731
X2              -.02576          -.20845           .15692
X1^2             .00002          -.00012           .00017
X2^2             .00547          -.00560           .01653
X1*X2           -.00083          -.00172           .00006

                                    Note: All but the last T values are very small.
                                    Not a very good model.

Residual analysis and/or prediction ?

YES

Print out residuals ?

YES



TABLE OF RESIDUALS

                                                  STANDARDIZED
OBS#    OBSERVED Y    PREDICTED Y     RESIDUAL      RESIDUAL      SIGNIF
1          .00000       -.02309        .02309        .26468
2          .03100        .11033       -.07933       -.90944
3          .47500        .41876        .05624        .64476
4          .01600       -.01634        .03234        .37073
5          .00800        .01300       -.00500       -.05732
6          .19000        .21734       -.02734       -.31342
7          .00000        .05543       -.05543       -.63541
8          .03900       -.04533        .08433        .96676
9          .00000        .02890       -.02890       -.33135

Durbin-Watson Statistic: 2.8245975174



For test for autocorrelation of residuals. 
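The Durbin-Watson statistic is the sum of squared successive differences of the residuals divided by the sum of squared residuals. The sketch below applies that standard definition to the rounded residuals from the table above; because the inputs are rounded to five decimals, it agrees with the printed 2.8245975174 only to about three decimals.

```python
# Residuals for observations 1..9, as printed (rounded) in the table of residuals.
residuals = [0.02309, -0.07933, 0.05624, 0.03234, -0.00500,
             -0.02734, -0.05543, 0.08433, -0.02890]

# Numerator: squared differences between successive residuals.
numerator = sum((b - a) ** 2 for a, b in zip(residuals, residuals[1:]))
# Denominator: residual sum of squares.
denominator = sum(e * e for e in residuals)

durbin_watson = numerator / denominator
print(round(durbin_watson, 3))
```

Values near 2 suggest little first-order autocorrelation; values toward 0 or 4 suggest positive or negative autocorrelation respectively.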



Residual plots ?                    Residual Plots

YES

Would you like to plot on CRT ?

NO

Plotter identifier string (press CONT if 'HPGL') ?
                                    Press CONTINUE
Plotter select code, Bus # = (defaults are 7,5) ?
                                    Press CONTINUE
Residual plot option no. = ?

1                                   Plot residuals vs time sequence.

For plotting, X-min = ?

1

For plotting, X-max = ?

9

Distance between X-ticks = ?

1

# of decimals for labelling X-axis (<=7) = ?

Number of pen color to be used ?

1

Is above information correct ?

YES



[Plot: EXAMPLE OF MULTIPLE LINEAR REGRESSION. Standardized residuals (vertical axis, -5 to 5) plotted against SEQUENCE # (horizontal axis).]



Residual plots ?                    Exit from residual plots.

Option number = ?                   Return to BSDM.



Example 2: Stepwise Regression 

The data shown below is the same as used in Multiple Linear Regression. Following the data 
are the results from the stepwise and backward selection procedures. 

Are you going to use user defined transformation
or do Non-linear regression? (Y/N)

NO

Are you using an HP-IB Printer ?

YES

Printer select code, bus address = ?

Enter select code, bus address (if 7,1 press CONT) ?

* DATA MANIPULATION *

********************************************************************************

Enter DATA TYPE:

1                                   Raw data

Mode number = ?

2                                   Stored on mass storage

Is data stored on the program's scratch file (DATA)?

YES                                 Previously stored.
                                    Same as MLR example.



EXAMPLE OF STEPWISE LINEAR REGRESSION

Data file name: DATA

Data type is: Raw data

Number of observations: 9
Number of variables: 6

Variable names:
1. X1
2. X2
3. Y
4. X1^2
5. X2^2
6. X1*X2
Subfiles: NONE

SELECT ANY KEY                      Select special function key labeled-LIST

Option number = ?

1                                   List all the data.

Enter method for listing data:

3                                   In tabular form.



EXAMPLE OF STEPWISE LINEAR REGRESSION

Data type is: Raw data

        Variable # 1   Variable # 2   Variable # 3   Variable # 4   Variable # 5
        (X1)           (X2)           (Y)            (X1^2)         (X2^2)
OBS#
1           7.80000        4.00000        0.00000       60.84000       16.00000
2           7.80000        8.00000         .03100       60.84000       64.00000
3           7.80000       12.00000         .47500       60.84000      144.00000
4          39.00000        4.00000         .01600     1521.00000       16.00000
5          39.00000        8.00000    8.000000E-03    1521.00000       64.00000
6          39.00000       12.00000         .19000     1521.00000      144.00000
7          78.00000        4.00000         .00000     6084.00000       16.00000
8          78.00000        8.00000         .03900     6084.00000       64.00000
9          78.00000       12.00000        0.00000     6084.00000      144.00000

        Variable # 6
        (X1*X2)
OBS#
1          31.20000
2          62.40000
3          93.60000
4         156.00000
5         312.00000
6         468.00000
7         312.00000
8         624.00000
9         936.00000



This is the same data set that was used for
multiple linear regression. Refer to that example
for instructions on how to form X1^2,
X2^2, X1*X2.



Option number = ?                   Exit the List routine.

SELECT ANY KEY                      Select special function key labeled-ADV STATS.
                                    Remove BSDM disc.
                                    Insert Regression Medium.
Option number = ?

2                                   Stepwise regression

Procedure number = ?

1                                   Choose the stepwise algorithm.

Tolerance value (i.e. .01, .001) = ?

.01                                 Input tolerance value.

F-value for inclusion = ?           F-to-enter: an F-value with 1 and n-k degrees
                                    of freedom, where k = expected number of
                                    coefficients in the model.

F-value for deletion = ?            F-to-delete.

                                    Note: We used F-enter = F-delete, a common
                                    practice. Also, for n = 9 we probably should
                                    have used a much larger F. We definitely do
                                    not recommend small sample sizes except as
                                    examples.

Is above information correct?

YES

Number of dependent variable = ?    Variable 3 = Y

Which remaining variables desired in regression?

ALL                                 With all others used as X.

Is above information correct?

YES                                 Information on CRT



********************************************************************************
STEPWISE REGRESSION on DATA SET:

EXAMPLE OF STEPWISE LINEAR REGRESSION
********************************************************************************

Dependent variable: (3)Y
Independent variable(s): (1)X1
                         (2)X2
                         (4)X1^2
                         (5)X2^2
                         (6)X1*X2

Tolerance = .01
F-value for inclusion =
F-value for deletion =
Method number = ?



The stepwise algorithm can enter or delete 
variables at a step. This example does not 
show any variables which are deleted. 
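The enter/delete logic described in the note can be sketched as follows. This is not the library's code: it is a minimal illustration that assumes the usual partial-F definitions, uses F-to-enter = F-to-delete = 4 (the value the later forward and backward runs mention), skips the tolerance check, and works on centred, unit-length columns so the normal equations stay well conditioned. All function names are hypothetical.

```python
from math import sqrt

x1 = [7.8] * 3 + [39.0] * 3 + [78.0] * 3
x2 = [4.0, 8.0, 12.0] * 3
y = [0.0, 0.031, 0.475, 0.016, 0.008, 0.19, 0.0, 0.039, 0.0]
raw = {"X1": x1, "X2": x2, "X1^2": [v * v for v in x1],
       "X2^2": [v * v for v in x2], "X1*X2": [a * b for a, b in zip(x1, x2)]}

def standardize(col):
    """Centre a column and scale it to unit length."""
    mean = sum(col) / len(col)
    dev = [v - mean for v in col]
    norm = sqrt(sum(d * d for d in dev))
    return [d / norm for d in dev]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

z = {name: standardize(col) for name, col in raw.items()}
zy = standardize(y)

def unexplained(names):
    """1 - R^2 for a least-squares fit (with intercept) on `names`."""
    k = len(names)
    # Augmented normal equations in correlation form.
    a = [[dot(z[r], z[c]) for c in names] + [dot(z[r], zy)] for r in names]
    for i in range(k):  # Gaussian elimination with partial pivoting
        p = max(range(i, k), key=lambda r: abs(a[r][i]))
        a[i], a[p] = a[p], a[i]
        for r in range(i + 1, k):
            f = a[r][i] / a[i][i]
            a[r] = [u - f * v for u, v in zip(a[r], a[i])]
    beta = [0.0] * k
    for i in reversed(range(k)):  # back substitution
        beta[i] = (a[i][k] - sum(a[i][j] * beta[j] for j in range(i + 1, k))) / a[i][i]
    return 1.0 - sum(b * dot(z[n], zy) for b, n in zip(beta, names))

def partial_f(smaller, larger):
    """Partial F for the one extra term in `larger` relative to `smaller`."""
    df_resid = len(y) - len(larger) - 1
    u_large = unexplained(larger)
    return (unexplained(smaller) - u_large) / (u_large / df_resid)

def stepwise(f_enter=4.0, f_delete=4.0, max_steps=20):
    model = []
    for _ in range(max_steps):
        changed = False
        outside = [n for n in raw if n not in model]
        if outside:  # try to enter the candidate with the largest F
            f_in = {n: partial_f(model, model + [n]) for n in outside}
            best = max(f_in, key=f_in.get)
            if f_in[best] > f_enter:
                model.append(best)
                changed = True
        if model:  # try to delete the term with the smallest F
            f_out = {n: partial_f([m for m in model if m != n], model) for n in model}
            worst = min(f_out, key=f_out.get)
            if f_out[worst] < f_delete:
                model.remove(worst)
                changed = True
        if not changed:
            break
    return model

print(stepwise())
```

On this data set the sketch enters terms in the same order as the printed steps and never triggers a deletion, which matches the remark that no variables are deleted in this example.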



CORRELATION MATRIX

                X1          X2        X1^2        X2^2       X1*X2           Y
X1       1.0000000
X2       0.0000000   1.0000000
X1^2      .9747877   0.0000000   1.0000000
X2^2     0.0000000    .9897433   0.0000000   1.0000000
X1*X2     .8120711    .4802402    .7915969    .4753145   1.0000000
Y        -.4209438    .5916875   -.3905355    .6250961   -.2314209   1.0000000



********************************************************************************
STEP NUMBER 0

               F TO    PART            F TO      REGRESSION COEFFICIENTS      STD
#--VARIABLE    ENTER   CORR    TOL     DELETE    STD. FORMAT   E-FORMAT       ERROR
1.X1            1.51   .421   1.000
2.X2            3.77   .592   1.000
4.X1^2          1.26   .391   1.000
5.X2^2          4.49   .625   1.000
6.X1*X2          .40   .231   1.000



Var. 5 has largest F-value and correlation, so 
it is the variable to enter the model. 
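With no variables yet in the model, the F-to-enter for a candidate depends only on its simple correlation r with Y: F = r^2 (n - 2) / (1 - r^2). The sketch below applies this standard identity to the correlations in the Y row of the matrix above; it is an illustration of why the largest F and the largest absolute correlation pick the same variable, not the library's own routine.

```python
n = 9  # number of observations

# Correlations of each candidate with Y, from the correlation matrix.
corr_with_y = {
    "X1": -0.4209438,
    "X2": 0.5916875,
    "X1^2": -0.3905355,
    "X2^2": 0.6250961,
    "X1*X2": -0.2314209,
}

def f_to_enter(r, n_obs):
    """F with 1 and n_obs - 2 d.f. for a single predictor with correlation r."""
    return r * r * (n_obs - 2) / (1.0 - r * r)

f_values = {name: f_to_enter(r, n) for name, r in corr_with_y.items()}
first_in = max(f_values, key=f_values.get)

for name, f in f_values.items():
    print(f"{name:<6} F to enter = {f:5.2f}")
print("first variable to enter:", first_in)
```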



********************************************************************************
STEP NUMBER 1
VARIABLE 'X2^2' ADDED
R-SQUARED = .39075

Analysis of Variance Table

SOURCE         DF    SUM OF SQUARES    MEAN SQUARE    F-VALUE
TOTAL           8        .20052
REGRESSION      1        .07835           .07835         4.49
RESIDUAL        7        .12217           .01745

STANDARD ERROR = .132107402855

               F TO    PART            F TO      REGRESSION COEFFICIENTS      STD
#--VARIABLE    ENTER   CORR    TOL     DELETE    STD. FORMAT   E-FORMAT       ERROR
1.X1            2.46   .539   1.000
2.X2             .37   .242    .020
4.X1^2          2.00   .500   1.000
5.X2^2                                   4.49      .00177   .176721938776E-02
6.X1*X2         8.72   .770    .774

Constant = -.047619047619
                                    Var 6 has the largest F-value and correlation,
                                    so it is the variable to enter the model.
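The TOL column at step 1 (1.000, .020, 1.000, .774) is consistent with tolerance being 1 - R^2 of each candidate regressed on the variables already in the model. That definition is an assumption here; the manual does not spell it out at this point. With only X2^2 in the model it reduces to 1 - r^2, where r is the candidate's correlation with X2^2.

```python
# Correlations of each remaining candidate with X2^2 (from the correlation matrix).
corr_with_x2sq = {
    "X1": 0.0,
    "X2": 0.9897433,
    "X1^2": 0.0,
    "X1*X2": 0.4753145,
}

# Assumed definition: tolerance = 1 - R^2 against the variables in the model.
tolerance = {name: 1.0 - r * r for name, r in corr_with_x2sq.items()}

for name, tol in tolerance.items():
    print(f"{name:<6} TOL = {tol:.3f}")
```

X2 is almost a linear function of X2^2 over the three levels used, so its tolerance is only about .02; a candidate whose tolerance falls below the tolerance value entered at the prompt would be kept out of the model.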
********************************************************************************
STEP NUMBER 2
VARIABLE 'X1*X2' ADDED
R-SQUARED = .75163

Analysis of Variance Table

SOURCE         DF    SUM OF SQUARES    MEAN SQUARE    F-VALUE
TOTAL           8        .20052
REGRESSION      2        .15072           .07536         9.08
RESIDUAL        6        .04980           .00830

STANDARD ERROR = .0911067112552

               F TO    PART            F TO      REGRESSION COEFFICIENTS      STD
#--VARIABLE    ENTER   CORR    TOL     DELETE    STD. FORMAT   E-FORMAT       ERROR
1.X1            4.71   .696    .148
2.X2             .45   .286    .020
4.X1^2          4.53   .689    .190
5.X2^2                                  16.86      .00268   .268474330203E-02  .0007
6.X1*X2                                  8.72     -.00036  -.360245767615E-03  .0001

Constant = .0037622915776
                                    Var 1 has the largest F-value and correlation,
                                    so it is the variable to enter the model.

********************************************************************************
STEP NUMBER 3
VARIABLE 'X1' ADDED
R-SQUARED = .87206

Analysis of Variance Table

SOURCE         DF    SUM OF SQUARES    MEAN SQUARE    F-VALUE
TOTAL           8        .20052
REGRESSION      3        .17486           .05829        11.36
RESIDUAL        5        .02565           .00513

STANDARD ERROR = .0716294324428

               F TO    PART            F TO      REGRESSION COEFFICIENTS      STD
#--VARIABLE    ENTER   CORR    TOL     DELETE    STD. FORMAT   E-FORMAT       ERROR
1.X1                                     4.71      .00469   .468749152939E-02  .0022
2.X2             .20   .220    .020
4.X1^2           .26   .248    .050
5.X2^2                                  25.76      .00396   .395611766121E-02  .0008
6.X1*X2                                 11.89     -.00086  -.859423200316E-03  .0002

Constant = -.120040391928
                                    None of the remaining variables have an F-value
                                    greater than F-To-Enter and none of the variables
                                    in the model have an F-value less than
                                    F-To-Delete, so the model is complete with
                                    X1, X2^2, and X1*X2.

Tolerance value too small and/or F-values insufficient to proceed.

Input 'K', delete '-K', or, enter to end regression
                                    No other terms added or removed.

Procedure number = ?

2                                   Choose the forward (stepwise) algorithm.

Tolerance value (i.e. .01, .001) = ?

.01                                 Tolerance

F-value for inclusion = ?

4                                   F-To-Enter (perhaps too small)

Is above information correct?

YES                                 Note: No F to remove in FORWARD.

Number of dependent variable = ?

3                                   Y = X3

Which of the remaining variables should be included in the regression ?

ALL                                 All others potential.

Is above information correct ?

YES



********************************************************************************
FORWARD REGRESSION on DATA SET:

EXAMPLE OF STEPWISE LINEAR REGRESSION
********************************************************************************

Dependent variable: (3)Y
Independent variable(s): (1)X1
                         (2)X2
                         (4)X1^2
                         (5)X2^2
                         (6)X1*X2

Tolerance = .01
F-value for inclusion =
Method number = ?



The forward procedure will only add variables
to the model and will stop when no
variable has an F to enter larger than 4 (or
whatever value you specify).



CORRELATION MATRIX

                X1          X2        X1^2        X2^2       X1*X2           Y
X1       1.0000000
X2       0.0000000   1.0000000
X1^2      .9747877   0.0000000   1.0000000
X2^2     0.0000000    .9897433   0.0000000   1.0000000
X1*X2     .8120711    .4802402    .7915969    .4753145   1.0000000
Y        -.4209438    .5916875   -.3905355    .6250961   -.2314209   1.0000000



********************************************************************************
STEP NUMBER 0

               F TO    PART            F TO      REGRESSION COEFFICIENTS      STD
#--VARIABLE    ENTER   CORR    TOL     DELETE    STD. FORMAT   E-FORMAT       ERROR
1.X1            1.51   .421   1.000
2.X2            3.77   .592   1.000
4.X1^2          1.26   .391   1.000
5.X2^2          4.49   .625   1.000
6.X1*X2          .40   .231   1.000



The results for this portion of the example will 
be the same as the stepwise algorithm 
above. 



********************************************************************************
STEP NUMBER 1
VARIABLE 'X2^2' ADDED
R-SQUARED = .39075

Analysis of Variance Table

SOURCE         DF    SUM OF SQUARES    MEAN SQUARE    F-VALUE
TOTAL           8        .20052
REGRESSION      1        .07835           .07835         4.49
RESIDUAL        7        .12217           .01745

STANDARD ERROR = .132107402855

               F TO    PART            F TO      REGRESSION COEFFICIENTS      STD
#--VARIABLE    ENTER   CORR    TOL     DELETE    STD. FORMAT   E-FORMAT       ERROR
1.X1            2.46   .539   1.000
2.X2             .37   .242    .020
4.X1^2          2.00   .500   1.000
5.X2^2                                   4.49      .00177   .176721938776E-02
6.X1*X2         8.72   .770    .774

Constant = -.047619047619



STEP NUMBER 2
VARIABLE 'X1*X2' ADDED
R-SQUARED = .75163

Analysis of Variance Table

SOURCE         DF    SUM OF SQUARES    MEAN SQUARE    F-VALUE
TOTAL           8        .20052
REGRESSION      2        .15072           .07536         9.08
RESIDUAL        6        .04980           .00830

STANDARD ERROR = .0911067112552

               F TO    PART            F TO      REGRESSION COEFFICIENTS      STD
#--VARIABLE    ENTER   CORR    TOL     DELETE    STD. FORMAT   E-FORMAT       ERROR
1.X1            4.71   .696    .148
2.X2             .45   .286    .020
4.X1^2          4.53   .689    .190
5.X2^2                                  16.86      .00268   .268474330198E-02  .0007
6.X1*X2                                  8.72     -.00036  -.360245767805E-03  .0001

Constant = .0037622915776



********************************************************************************
STEP NUMBER 3
VARIABLE 'X1' ADDED
R-SQUARED = .87206

Analysis of Variance Table

SOURCE         DF    SUM OF SQUARES    MEAN SQUARE    F-VALUE
TOTAL           8        .20052
REGRESSION      3        .17486           .05829        11.36
RESIDUAL        5        .02565           .00513

STANDARD ERROR = .0716294324428

               F TO    PART            F TO      REGRESSION COEFFICIENTS      STD
#--VARIABLE    ENTER   CORR    TOL     DELETE    STD. FORMAT   E-FORMAT       ERROR
1.X1                                     4.71      .00469   .468749153034E-02  .0022
2.X2             .20   .220    .020
4.X1^2           .26   .248    .050
5.X2^2                                  25.76      .00396   .395611766121E-02  .0008
6.X1*X2                                 11.89     -.00086  -.859423200316E-03  .0002

Constant = -.120040391928
                                    The results are the same as in stepwise
                                    regression.

Tolerance value too small and/or F-values insufficient to proceed.



Input 'K', delete '-K', or, enter to end regression

Procedure number = ?                Backward (stepwise) algorithm.

Tolerance value (i.e. .01, .001) = ?

.01

F-value for deletion = ?

4                                   Only a F-To-Delete is required.
                                    (Perhaps it should be bigger than 4 with
                                    n = 9.)

Is above information correct?

YES

Number of dependent variable = ?

3

Which remaining variables desired in regression ?

ALL

Is above information correct?

YES



********************************************************************************
BACKWARD REGRESSION on DATA SET:

EXAMPLE OF STEPWISE LINEAR REGRESSION

Dependent variable: (3)Y
Independent variable(s): (1)X1
                         (2)X2
                         (4)X1^2
                         (5)X2^2
                         (6)X1*X2

                                    The backwards algorithm sets all the terms in
                                    the model and then deletes one at a time until
                                    no F to remove is less than the F we specify
                                    (Fdelete = 4).

Tolerance = .01
F-value for deletion =
Method number = ?



CORRELATION MATRIX

                X1          X2        X1^2        X2^2       X1*X2           Y
X1       1.0000000
X2       0.0000000   1.0000000
X1^2      .9747877   0.0000000   1.0000000
X2^2     0.0000000    .9897433   0.0000000   1.0000000
X1*X2     .8120711    .4802402    .7915969    .4753145   1.0000000
Y        -.4209438    .5916875   -.3905355    .6250961   -.2314209   1.0000000



********************************************************************************
STEP NUMBER 0
R-SQUARED = .88615

Analysis of Variance Table

SOURCE         DF    SUM OF SQUARES    MEAN SQUARE    F-VALUE
TOTAL           8        .20052
REGRESSION      5        .17769           .03554         4.67
RESIDUAL        3        .02283           .00761

STANDARD ERROR = .0872327012721

               F TO    PART            F TO      REGRESSION COEFFICIENTS      STD
#--VARIABLE    ENTER   CORR    TOL     DELETE    STD. FORMAT   E-FORMAT       ERROR
1.X1                                      .23      .00247   .246964177292E-02  .0052
2.X2                                      .16     -.02576  -.257643442623E-01  .0636
4.X1^2                                    .21      .00002   .231329291158E-04  .0001
5.X2^2                                   2.01      .00547   .546875000000E-02  .0039
6.X1*X2                                  7.24     -.00083  -.833990121900E-03  .0003

Constant = -.00218154219795
                                    Removes the variable with the smallest F to
                                    delete (X2).
********************************************************************************
STEP NUMBER 1
VARIABLE 'X2' DELETED
R-SQUARED = .87993

Analysis of Variance Table

SOURCE         DF    SUM OF SQUARES    MEAN SQUARE    F-VALUE
TOTAL           8        .20052
REGRESSION      4        .17644           .04411         7.33
RESIDUAL        4        .02408           .00602

STANDARD ERROR = .0775917889132

               F TO    PART            F TO      REGRESSION COEFFICIENTS      STD
#--VARIABLE    ENTER   CORR    TOL     DELETE    STD. FORMAT   E-FORMAT       ERROR
1.X1                                      .34      .00267   .267310640025E-02  .0046
2.X2             .16   .228    .020
4.X1^2                                    .26      .00002   .231329291158E-04 0.0000
5.X2^2                                  21.96      .00396   .395611766121E-02  .0008
6.X1*X2                                 10.13     -.00086  -.859423200316E-03  .0003

Constant = -.0953530816668
                                    Removes X4 = X1^2 next.



********************************************************************************
STEP NUMBER 2
VARIABLE 'X1^2' DELETED
R-SQUARED = .87206

Analysis of Variance Table

SOURCE         DF    SUM OF SQUARES    MEAN SQUARE    F-VALUE
TOTAL           8        .20052
REGRESSION      3        .17486           .05829        11.36
RESIDUAL        5        .02565           .00513

STANDARD ERROR = .0716294324428

               F TO    PART            F TO      REGRESSION COEFFICIENTS      STD
#--VARIABLE    ENTER   CORR    TOL     DELETE    STD. FORMAT   E-FORMAT       ERROR
1.X1                                     4.71      .00469   .468749152939E-02  .0022
2.X2             .20   .220    .020
4.X1^2           .26   .248    .050
5.X2^2                                  25.76      .00396   .395611766121E-02  .0008
6.X1*X2                                 11.89     -.00086  -.859423200316E-03  .0002

Constant = -.120040391928
                                    Results are the same as in stepwise regression.
                                    But this may not be the case for some data sets.

Tolerance value too small and/or F-values insufficient to proceed.

Input 'K', delete '-K', or, enter to end regression

Procedure number = ?                Exit Stepwise Regression.

Residual analysis and/or prediction?

NO

Option number = ?

7                                   Return to BSDM.



Example 3: Polynomial Regression 

Bus Passenger Service Time 

The time required to service boarding passengers at a bus stop was measured together with 
the actual number of passengers boarding. The service time was recorded from the moment 
that the bus stopped and the door opened until the last passenger boarded the bus. The 
objective is to determine a model for predicting passenger service time, given knowledge of
the number boarding at a particular stop. Let Variable 1 = number boarding and Variable
2 = passenger service time. The following data was gathered during the month of May 1968 at
twelve downtown locations in Louisville, Kentucky. 



Are you going to use user defined transformation
or do Non-linear regression ? (Y/N)

NO

Are you using an HP-IB Printer ?

YES

Enter select code, bus address (if 7,1 press CONT) ?

* DATA MANIPULATION *

********************************************************************************

Enter DATA TYPE:

1                                   Raw data

Mode number = ?

2                                   Mass storage

Is data stored on the program's scratch file 'DATA' ?

YES                                 Previously stored on Data File.



BUS PASSENGER SERVICE TIME (EXAMPLE OF POLYNOMIAL REGRESSION)

Data file name: DATA

Data type is: Raw data

Number of observations: 31
Number of variables: 2

Variable names:
1. NUMBER                           X1 = number of passengers boarding a bus.
2. TIME                             X2 = Y = passenger service time in seconds.

Subfiles: NONE

SELECT ANY KEY                      Select special function key labeled-LIST

Option number = ?

1                                   List all the data.

Enter method for listing data:

3                                   In tabular form.



BUS PASSENGER SERVICE TIME (EXAMPLE OF POLYNOMIAL REGRESSION)
Data type is: Raw data

        Variable # 1   Variable # 2
        (NUMBER)       (TIME)
OBS#
1           1.00000        1.40000
2           1.00000        2.80000
3           1.00000        3.00000
4           1.00000        1.80000
5           1.00000        2.00000
6           2.00000        4.70000
7           2.00000        8.00000
8           2.00000        3.00000
9           2.00000        2.50000
10          3.00000        5.20000
11          3.00000        6.20000
12          3.00000        9.40000
13          4.00000       11.70000
14          5.00000        7.50000
15          5.00000       11.90000
16          6.00000       13.60000
17          6.00000       12.40000
18          6.00000       11.60000
19          7.00000       14.70000
20          7.00000       13.50000
21          8.00000       12.00000
22          8.00000       14.10000
23          8.00000       26.00000
24          9.00000       19.00000
25         10.00000       21.20000
26         11.00000       22.90000
27         11.00000       22.60000
28         13.00000       25.20000
29         17.00000       33.50000
30         19.00000       33.70000
31         25.00000       54.20000



Option number = ?                   Exit List routine.

SELECT ANY KEY                      Select special function key labeled-STATS

What statistic options are desired ?

1                                   Gives the mean, C.I., variance, standard
                                    deviation, skewness, and kurtosis of all the
                                    data.
VARIABLES = ?

ALL

Confidence coefficient for confidence interval on the mean (e.g. 90,95,99%) = ?

95                                  95% C.I. on means will be developed.
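A confidence interval on a mean is mean ± t × (standard error of the mean), with the standard error equal to the sample standard deviation over the square root of n. The sketch below applies this to the NUMBER variable. The critical value 2.0423 (two-sided 95%, 30 degrees of freedom) is taken from a standard t table and is an assumption; the program may use its own approximation, so agreement with the printed limits (4.56260 and 8.79223) is to about three decimals only.

```python
from math import sqrt

# NUMBER column of the bus passenger data (31 observations).
number = [1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 4, 5, 5, 6, 6, 6, 7, 7,
          8, 8, 8, 9, 10, 11, 11, 13, 17, 19, 25]

T_CRIT_95_DF30 = 2.0423  # assumed table value, not taken from the manual

def mean_confidence_interval(values, t_crit):
    """Return (mean, std. error of mean, lower limit, upper limit)."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)  # sample variance
    std_error = sqrt(variance / n)
    return mean, std_error, mean - t_crit * std_error, mean + t_crit * std_error

mean, std_error, lower, upper = mean_confidence_interval(number, T_CRIT_95_DF30)
print(f"mean={mean:.5f} se={std_error:.5f} lower={lower:.5f} upper={upper:.5f}")
```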






********************************************************************************
* SUMMARY STATISTICS *
* ON DATA SET: *
* BUS PASSENGER SERVICE TIME (EXAMPLE OF POLYNOMIAL REGRESSION) *
********************************************************************************

BASIC STATISTICS

VARIABLE    # OF    # OF
NAME        OBS.    MISS         SUM          MEAN      VARIANCE     STD. DEV.
NUMBER       31       0     207.00000       6.67742      33.22581      5.76418
TIME         31       0     431.30000      13.91290     139.39983     11.80677

VARIABLE    COEFFICIENT     STD. ERROR       95 % CONFIDENCE INTERVAL
NAME        OF VARIATION    OF MEAN         LOWER LIMIT    UPPER LIMIT
NUMBER        86.32351        1.03528          4.56260        8.79223
TIME          84.86202        2.12056          9.58113       18.24468

VARIABLE      SKEWNESS      KURTOSIS
NUMBER         1.43125       1.90790
TIME           1.48977       2.55645



What statistic options are desired ?

2                                   Gives the correlation matrix of all the data.

VARIABLES = ?

ALL

********************************************************************************
* SUMMARY STATISTICS *
* ON DATA SET: *
* BUS PASSENGER SERVICE TIME (EXAMPLE OF POLYNOMIAL REGRESSION) *
********************************************************************************

CORRELATION MATRIX

             NUMBER
TIME        .9743533



Highly correlated in a linear fashion. 



What statistic options are desired ?
                                    Gives median, mode, percentiles, min, max,
                                    and range of all the data.
VARIABLES = ?

ALL

********************************************************************************
* SUMMARY STATISTICS *
* ON DATA SET: *
* BUS PASSENGER SERVICE TIME (EXAMPLE OF POLYNOMIAL REGRESSION) *
********************************************************************************

ORDER STATISTICS

VARIABLE       MAXIMUM       MINIMUM         RANGE      MIDRANGE
NUMBER        25.00000       1.00000      24.00000      13.00000
TIME          54.20000       1.40000      52.80000      27.80000

TUKEY'S HINGES

VARIABLE        MEDIAN   25-th %-ile   75-th %-ile
NUMBER         6.00000       2.00000       8.00000
TIME          11.90000       4.70000      19.00000

TUKEY'S MIDDLEMEANS

VARIABLE       MIDMEAN       TRIMEAN     MIDSPREAD
NUMBER         5.41176       5.50000       6.00000
TIME          11.57059      11.87500      14.30000

Other percentiles ?
NO

What statistic options are desired ?
                                    Exit Basic Statistics.

SELECT ANY KEY                      Select special function key labeled-ADV STATS
                                    Remove BSDM disc.
                                    Insert regression medium.
Option number = ?

3                                   Polynomial regression selected.

Number of the dependent variable = ?
2

Number of the independent variable = ?
1

POLYNOMIAL REGRESSION ON DATA SET:

BUS PASSENGER SERVICE TIME (EXAMPLE OF POLYNOMIAL REGRESSION)

--where: Dependent variable = (2)TIME
         Independent variable = (1)NUMBER

Is a plot of the regression desired ?

YES

Plot on CRT?

NO                                  Plot on an external plotter

Plotter identifier string (press CONT if 'HPGL') ?

Plotter select code, Bus # = (defaults are 7,5) ?

X-min = ?

0

X-max = ?

25

Y-min = ?

0

Y-max = ?

60                                  Plotting limits specified.

Y-axis crosses X-axis at X = ?

0

X-axis crosses Y-axis at Y = ?

0

Distance between X-ticks = ?

5

Distance between Y-ticks = ?

5

# of decimals for labelling X-axis (<=7) = ?

0

# of decimals for labelling Y-axis = ?

0

Number of pen color to be used ?

1

Is above information correct?

YES

Beep will sound when plot is done, then press CONTINUE



[Plot: BUS PASSENGER SERVICE TIME. TIME (vertical axis, 0 to 60) plotted against NUMBER (horizontal axis, 0 to 25).]



Maximum degree of regression (<=10) = ?

1                                   We specified maximum degree at 1 although
                                    we could have chosen a value slightly higher
                                    than desired level.

                                                STANDARD     COEFF. OF
VARIABLE    N        MEAN        VARIANCE      DEVIATION     VARIATION
NUMBER     31      6.67742       33.22581        5.76418      86.32351
TIME       31     13.91290      139.39983       11.80677      84.86202

CORRELATION = .97435



Degree of regression = ?

1                                   Specify the actual degree of interest.

SELECTED DEGREE OF REGRESSION = 1

R-SQUARED = .94936

STANDARD ERROR OF ESTIMATE = 2.70221890497

ANALYSIS OF VARIANCE TABLE

SOURCE         DF    SUM OF SQUARES     MEAN SQUARE    F-VALUE
TOTAL          30      4181.99484
REGRESSION      1      3970.23722        3970.23722     543.72
  X^1           1      3970.23722        3970.23722     543.72
RESIDUAL       29       211.75762           7.30199

                REGRESSION COEFFICIENTS           STANDARD ERROR
VARIABLE      STD. FORMAT       E-FORMAT          REG. COEFFICIENT    T-VALUE
'CONSTANT'       .58633      .586330096900E+00         .74979            .78
X^1             1.99577      .199576699031E+01         .08559          23.32

                                    y = .586 + 2.00X: about two seconds per
                                    passenger to board a bus.

Confidence coefficient (e.g., 90,95,99) = ?
95

                                 95 % CONFIDENCE INTERVAL
VARIABLE      COEFFICIENT      LOWER LIMIT      UPPER LIMIT
'CONSTANT'       .58633          -.94752          2.12018
X^1             1.99577          1.82063          2.17086
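The degree-1 fit reported above can be reproduced with the ordinary least-squares formulas for a straight line: slope = Sxy / Sxx and intercept = mean(y) - slope × mean(x). The sketch below is a plain illustration on the listed data, not the library's polynomial routine; `fit_line` is a hypothetical helper name.

```python
# Bus passenger data: number boarding and service time in seconds.
number = [1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 4, 5, 5, 6, 6, 6, 7, 7,
          8, 8, 8, 9, 10, 11, 11, 13, 17, 19, 25]
seconds = [1.4, 2.8, 3.0, 1.8, 2.0, 4.7, 8.0, 3.0, 2.5, 5.2, 6.2, 9.4,
           11.7, 7.5, 11.9, 13.6, 12.4, 11.6, 14.7, 13.5, 12.0, 14.1, 26.0,
           19.0, 21.2, 22.9, 22.6, 25.2, 33.5, 33.7, 54.2]

def fit_line(x, y):
    """Least-squares straight line; returns (intercept, slope, r_squared)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope, sxy * sxy / (sxx * syy)

intercept, slope, r_squared = fit_line(number, seconds)
print(f"TIME = {intercept:.5f} + {slope:.5f} * NUMBER   (R-squared {r_squared:.5f})")
```

The slope is the estimated boarding time per passenger, and the intercept's confidence interval spans zero, which is what motivates the later remark that an intercept term may not be needed.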



Plot regression curve on present graph ?

YES

Plot confidence interval of regression line also ?

YES

Confidence coefficient (e.g., 90, 95, 99) = ?

95

Same pen color ?

YES

Change degree of regression ?

NO

Residual analysis and/or prediction ?

YES

Print out residuals ?

YES
                                    May not need an intercept term.



                           TABLE OF RESIDUALS

                                                    STANDARDIZED
OBS#    OBSERVED Y    PREDICTED Y      RESIDUAL       RESIDUAL     SIGNIF.

  1       1.40000       2.58210        -1.18210        -.43745
  2       2.80000       2.58210          .21790         .08064
  3       3.00000       2.58210          .41790         .15465
  4       1.80000       2.58210         -.78210        -.28943
  5       2.00000       2.58210         -.58210        -.21541
  6       4.70000       4.57786          .12214         .04520
  7       8.00000       4.57786         3.42214        1.26642
  8       3.00000       4.57786        -1.57786        -.58391
  9       2.50000       4.57786        -2.07786        -.76895
 10       5.20000       6.57363        -1.37363        -.50833
 11       6.20000       6.57363         -.37363        -.13827
 12       9.40000       6.57363         2.82637        1.04594
 13      11.70000       8.56940         3.13060        1.15853
 14       7.50000      10.56517        -3.06517       -1.13431
 15      11.90000      10.56517         1.33483         .49398
 16      13.60000      12.56093         1.03907         .38452
 17      12.40000      12.56093         -.16093        -.05956
 18      11.60000      12.56093         -.96093        -.35561
 19      14.70000      14.55670          .14330         .05303
 20      13.50000      14.55670        -1.05670        -.39105
 21      12.00000      16.55247        -4.55247       -1.68471
 22      14.10000      16.55247        -2.45247        -.90757
 23      26.00000      16.55247         9.44753        3.49621       ***
 24      19.00000      18.54823          .45177         .16718
 25      21.20000      20.54400          .65600         .24276
 26      22.90000      22.53977          .36023         .13331
 27      22.60000      22.53977          .06023         .02229
 28      25.20000      26.53130        -1.33130        -.49267
 29      33.50000      34.51437        -1.01437        -.37538
 30      33.70000      38.50590        -4.80590       -1.77850
 31      54.20000      50.48050         3.71950        1.37646



Durbin-Watson Statistic: 2.09200089648



Residual plots ?

YES

Plot on CRT ?

NO

Plotter identifier string (press CONT if 'HPGL') ?
Plotter select code, Bus # (defaults are 7,5) ?



Note that one observation (#23) seems to 
have a very large standardized residual. 
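
A short sketch of how an observation such as #23 can be picked out from the table of residuals. The rows are copied from the table; the cutoffs of 2 and 3 for the `**` and `***` marks are an assumption inferred from the printed tables in this chapter, not a rule quoted from the manual, and `flag` is a hypothetical helper.

```python
# (obs, residual, standardized residual) for a few rows of the table.
rows = [
    (7, 3.42214, 1.26642),
    (21, -4.55247, -1.68471),
    (23, 9.44753, 3.49621),
    (30, -4.80590, -1.77850),
]

def flag(z):
    """Star code for a standardized residual (cutoffs 2 and 3 are assumed)."""
    if abs(z) >= 3.0:
        return "***"
    if abs(z) >= 2.0:
        return "**"
    return ""

flags = {obs: flag(z) for obs, _, z in rows}

# Residual divided by standardized residual recovers the scale
# (the standard error) the program used; it is the same for every row.
scales = [res / z for _, res, z in rows]

for (obs, res, z), s in zip(rows, scales):
    print(obs, res, z, flags[obs], round(s, 4))
```

Only observation 23 receives a mark, in line with the note above.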



Residual plots 

An external plotter is used. 



Residual plot option no. = ?

1

For plotting, X-min = ?



For plotting, X-max = ?

35

Distance between X-ticks = ?

5

# of decimals for labelling X-axis (<=7) = ?



Number of pen color to be used ?

1

Is above information correct ?

YES 



Plot residuals vs time sequence. 






[Plot: PASSENGER SERVICE TIME (EXAMPLE OF POLYNOMIAL REG.); standardized residuals vs. SEQUENCE #]



Residual plots ? 

YES 

Plotter identifier string (press CONT if 'HPGL') ?

Plotter select code, bus # (defaults are 7,5) ?

Residual plot option no. = ?

2

For plotting, X-min = ?



For plotting, X-max = ?

55

Distance between X-ticks = ?

5

# of decimals for labelling X-axis (<=7) = ?



Number of pen color to be used ?

1

Is above information correct ?

YES 



Plot residuals vs predicted Y values. 






[Plot: PASSENGER SERVICE TIME (EXAMPLE OF POLYNOMIAL REG.); standardized residuals vs. PREDICTED Y]



Residual plots 

NO 

Option number = ?



Return to BSDM 






Example 4: Nonlinear Regression 

Twenty-five samples of human urine were obtained to determine if a nonlinear model could 
be developed relating Y = blood concentration of urine (micrograms/1000 cc) to X = time in 
hours. 

The data were entered from the keyboard. 

A "three-exponential" model was tried: 

Yhat = B0*exp( - B1*X) + B2*exp( - B3*X) + B4*exp( - B5*X) 
and 0.00001 was used as the convergence coefficient. 

Notes: 

1. The initial estimates were chosen after some experimentation although the only effect 
that they have is in the speed of convergence. 

2. Every iteration was printed. It is not necessary to have this done. 

3. The residuals for the smallest time are larger than for T or X near 60 or above. Of 
course, the largest Y's are associated with the smallest X. 
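
The program asks for a file holding the model function and its derivatives. The sketch below is not that file (the library's routines are written for the HP 9826/9836); it only illustrates, in Python, what the "three-exponential" function and its six partial derivatives look like, with a finite-difference check. The function names are illustrative, and the parameter values are the final estimates printed later in this example.

```python
import math

def f(x, b):
    """Three-exponential model: b = [B0, B1, B2, B3, B4, B5]."""
    return (b[0] * math.exp(-b[1] * x)
            + b[2] * math.exp(-b[3] * x)
            + b[4] * math.exp(-b[5] * x))

def grad(x, b):
    """Analytic partial derivatives of f with respect to B0..B5."""
    g = []
    for k in (0, 2, 4):
        e = math.exp(-b[k + 1] * x)
        g.append(e)                  # derivative w.r.t. the amplitude
        g.append(-b[k] * x * e)      # derivative w.r.t. the rate
    return g

def numeric_grad(x, b, h=1e-6):
    """Central-difference approximation, used only to check grad()."""
    out = []
    for i in range(len(b)):
        up = list(b)
        dn = list(b)
        up[i] += h
        dn[i] -= h
        out.append((f(x, up) - f(x, dn)) / (2 * h))
    return out

# Final estimates reported after 13 iterations.
est = [1398.5719009, 0.1371535, 604.3657684, 0.1371525, 75.0763988, 0.0170560]

first_fit = f(4.25, est)          # fitted value at the first time point
analytic = grad(4.25, est)
numeric = numeric_grad(4.25, est)
print(round(first_fit, 2))
```

The fitted value at X = 4.25 agrees with the first PREDICTED Y entry in the table of residuals for this example.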

Are you going to use user defined transformation
or do Non-linear regression ? (Y/N)

NO        We have already prepared the file with the function and derivative.

Are you using an HPIB Printer ?

YES

Enter select code, bus address (if 7,1 press CONT) ?

* DATA MANIPULATION *

Enter DATA TYPE:

1         Raw data (data type required)

Mode number = ?

2         From mass storage

Is data stored on the program's scratch file (DATA) ?

YES       Data stored in program's storage medium
          from previous run.



EXAMPLE 1-URINE/BLOOD CONCENTRATION

Data file name: DATA

Data type is: Raw data

Number of observations: 25
Number of variables: 2






Variable names:
1. TIME(HR)
2. BLD.CONT

Subfiles: NONE



SELECT ANY KEY



Option number = ?



Enter method for listing data:



Select special function key labeled-LIST

List all the data.
In tabular form.


EXAMPLE 1-URINE/BLOOD CONCENTRATION
Data type is: Raw data

           Variable # 1     Variable # 2
           (TIME(HR) )      (BLD.CONT  )
OBS#

  1           4.25000        1165.70000
  2           7.50000         851.00000
  3          10.80000         523.00000
  4          12.00000         365.00000
  5          16.00000         294.00000
  6          23.80000         170.00000
  7          27.80000          60.00000
  8          35.30000          81.00000
  9          38.30000          20.00000
 10          45.30000          45.00000
 11          51.30000          27.00000
 12          54.20000          37.00000
 13          59.80000          31.00000
 14          64.25000          26.00000
 15          69.50000          36.00000
 16          78.20000          18.00000
 17          90.20000          10.00000
 18         100.00000           8.20000
 19         105.00000          13.40000
 20         108.00000          17.40000
 21         114.00000           8.00000
 22         120.00000           4.00000
 23         130.00000           6.70000
 24         142.00000           6.70000
 25         154.00000           5.80000

Option number = ?



SELECT ANY KEY 



Exit the List routine. 






Option number = ?



Number of the dependent variable = ?



Select special function key labeled-ADV STAT
Remove BSDM medium.
Insert the regression medium.

Select non-linear regression.
Specify blood content as Y.



How many independent variables will be in the model ?

One independent variable.



1

Independent variable numbers (separated by commas) = ?



Specify time in hours as X.



Is above information correct ?

YES

********************************************************************************

NON-LINEAR REGRESSION ON DATA SET:

URINE/BLOOD CONCENTRATION (EXAMPLE 1 OF NON-LINEAR REGRESSION)
********************************************************************************

--where: Dependent variable = (2)BLD.CONT

         Independent variable(s) = (1)TIME(HR)

# of parameters in the model (<=20) ?

6

Is a plot of the non-linear regression desired ?

YES       Request plot

Plot on CRT ?

NO        But not on CRT.

Plotter identifier string (press CONT if 'HPGL') ?
Plotter select code, Bus # (defaults are 7,5) ?

          On plotter with select code = 7 and bus code = 5.
Is a quick plot desired ?

NO        No quick plot. We will specify our limits.

X-min = ?

4

X-max = ?

160
Y-min = ?

3

Y-max = ?

1170

Y-axis crosses X-axis at X = ?

4

X-axis crosses Y-axis at Y = ?

3

Distance between X-ticks = ?

16

Distance between Y-tics = ?

120

          Xmin = 4
          Xmax = 160
          Ymin = 3
          Ymax = 1170

          Xtic interval = 16
          Ytic interval = 120
          With no decimal points for labelling.

# of decimals for labelling X-axis (<=7) = ?



# of decimals for labelling Y-axis = ?



Number of pen color to be used ?

1

Is above information correct ?



Beep will sound when plot done, then press CONTINUE

File name where subroutines are stored ?

FONDER: INTERNAL

Is function medium placed in device ?

YES

Is program medium placed in device ?

YES

Enter convergence coefficient (e.g., .005, .001) ?

.00001    Convergence criteria on changes in all coefficients.
          Note .00001 is pretty restrictive.

Initial estimate for parameter # 1

?         Initial estimates input at this point.

1202.336

Initial estimate for parameter # 2

?

.1083

Initial estimate for parameter # 3

?

40.3367

Initial estimate for parameter # 4

?

.1083

Initial estimate for parameter # 5

?

31.4619

Initial estimate for parameter # 6

?

.006716

Is the above information correct ?

YES
********************************************************************************

Delta (Convergence criteria) = .00001

THE INITIAL VALUES OF PARAMETERS ARE :

PARAMETER 1 = 1202.336

PARAMETER 2 = .1083

PARAMETER 3 = 40.3367

PARAMETER 4 = .1083

PARAMETER 5 = 31.4619

PARAMETER 6 = .006716



Would you like to print out every iteration on hard copy option printer ?

YES       Not a good idea if many iterations are expected.

Calcs. may be lengthy. A beep will sound when done. Press [illegible] key to ABORT!

Calculations may be quite time consuming. A beep will sound when completed.

ITERATION        ESTIMATED PARAMETER VALUES                   S.S. RESIDUALS

    0    1202.33600     .10830     40.33670     .10830
           31.46190     .00672                                 42560.6966977
    1    1379.0339      .12722    ?77.00409     .16513
           76.16355     .02198                                 19113.9722052
    2    1392.99446     .13353    600.25867     .13849
           71.83127     .01538                                 17230.290258?
    3    1395.63956     .14371    603.73447     .12102
           76.36979     .01725                                 17131.1484877
    4    1397.91748     .14022    603.92050     .13013
           76.09567     .01722                                 17001.3543193
    5    1398.50753     .13844    604.30809     .13435
           75.57048     .01714                                 16990.4904512
    6    1398.59945     .13768    604.39161     .13606
           75.28321     .01709                                 16989.3523974
    7    1398.59229     .13736    604.38569     .13672
           75.15983     .01707                                 16989.1984944
    8    1398.58144     .13724    604.37522     .13698
           75.10969     .01706                                 16989.1746607
    9    1398.57589     .13719    604.36975     .13708
           75.08959     .01706                                 16989.1708856
   10    1398.57350     .13717    604.36737     .13713
           75.08157     .01706                                 16989.1702861
   11    1398.57252     .13716    604.36639     .13714
           75.07838     .01706                                 16989.1701889
   12    1398.57212     .13716    604.36599     .13715
           75.07711     .01706                                 16989.1701720
   13    1398.57196     .13715    604.36583     .13715
           75.07660     .01706                                 16989.1701669

DONE ! ! !

Note: Estimated values for six coefficients followed by sum of squared residuals

********************************************************************************

THE ESTIMATED PARAMETER VALUES AFTER 13 ITERATIONS ARE :

PARAMETER 1=   1398.5719009   ( 1.3985719009E+03)
PARAMETER 2=       .1371535   ( 1.3715347965E-01)
PARAMETER 3=    604.3657684   ( 6.0436576836E+02)
PARAMETER 4=       .1371525   ( 1.3715246328E-01)
PARAMETER 5=     75.0763988   ( 7.5076398794E+01)
PARAMETER 6=       .0170560   ( 1.7055987670E-02)

THE INITIAL VALUE OF SUM OF SQUARED RESIDUALS = 42560.6966977
AFTER 13 ITERATIONS THE SUM OF SQUARED RESIDUALS= 16989.1701669
APPROXIMATE STANDARD ERROR FROM SQUARED RESIDUALS= 29.9026228091

Plot regression curve on present GRAPH ?

YES       Plot curve to see how good the fit is.

Same pen color ?

YES
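
The summary figures printed for this fit can be checked by hand. The sketch below recomputes the sum of squared residuals from the 25 listed observations and the final estimates, then forms an approximate standard error as sqrt(SSR / (n - p)). That formula is an assumption: the manual does not state it here, but it reproduces the printed 29.9026 when applied to the printed sum of squares with n = 25 and p = 6.

```python
import math

# Data as listed in this example: TIME(HR) and BLD.CONT.
time_hr = [4.25, 7.5, 10.8, 12.0, 16.0, 23.8, 27.8, 35.3, 38.3, 45.3,
           51.3, 54.2, 59.8, 64.25, 69.5, 78.2, 90.2, 100.0, 105.0,
           108.0, 114.0, 120.0, 130.0, 142.0, 154.0]
bld_cont = [1165.7, 851.0, 523.0, 365.0, 294.0, 170.0, 60.0, 81.0, 20.0,
            45.0, 27.0, 37.0, 31.0, 26.0, 36.0, 18.0, 10.0, 8.2, 13.4,
            17.4, 8.0, 4.0, 6.7, 6.7, 5.8]

# Final estimates printed after 13 iterations.
est = [1398.5719009, 0.1371535, 604.3657684, 0.1371525, 75.0763988, 0.0170560]

def model(x, b):
    """Three-exponential model from the example text."""
    return (b[0] * math.exp(-b[1] * x)
            + b[2] * math.exp(-b[3] * x)
            + b[4] * math.exp(-b[5] * x))

ssr = sum((y - model(x, est)) ** 2 for x, y in zip(time_hr, bld_cont))
n, p = len(time_hr), len(est)
std_err = math.sqrt(ssr / (n - p))      # assumed formula, see lead-in

# Same formula applied directly to the printed sum of squares.
printed_check = math.sqrt(16989.1701669 / (25 - 6))

print(round(ssr, 2), round(std_err, 4), round(printed_check, 4))
```

Because the estimates are typed in to seven decimals, the recomputed sum of squares matches the printout only approximately.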



[Plot: BLOOD CONCENTRATION vs. TIME(HR); data with fitted regression curve]



Like to change initial estimates and/or function ?

NO        We are satisfied.

Are confidence intervals on parameters desired ?

          Request confidence intervals.

Confidence coefficient for confidence interval on parameters = ?



********************************************************************************
APPROXIMATE 95 % CONFIDENCE INTERVALS ON PARAMETERS

PARAMETER        ONE-AT-A-TIME C.I.                 SIMULTANEOUS C.I.

             LOWER LIMIT    UPPER LIMIT        LOWER LIMIT    UPPER LIMIT

    1          790.3196      2006.8242           244.5233      2552.6205
    2             .0762          .1981              .0215          .2528
    3           -3.8858      1212.6173          -549.6815      1758.4130
    4            -.0039          .2782             -.1304          .4047
    5          -33.5338       183.6866          -130.9917       281.1445
    6            -.0073          .0414             -.0292          .0633
********************************************************************************

Residual analysis and/or prediction ?

YES

Print out residuals ?

YES 



Study size and form of residuals. 







                           TABLE OF RESIDUALS

                                                    STANDARDIZED
OBS#    OBSERVED Y    PREDICTED Y      RESIDUAL       RESIDUAL     SIGNIF.

  1     1165.70000    1188.01983      -22.31983        -.74642
  2      851.00000     782.09103       68.90897        2.30445       **
  3      523.00000     517.81851        5.18149         .17328
  4      365.00000     447.44910      -82.44910       -2.75725       **
  5      294.00000     280.31284       13.68716         .45772
  6      170.00000     126.59139       43.40861        1.45167
  7       60.00000      90.96313      -30.96313       -1.03547
  8       81.00000      56.93086       24.06914         .80492
  9       20.00000      49.54572      -29.54572        -.98806
 10       45.00000      38.68203        6.31797         .21128
 11       27.00000      33.05932       -6.05932        -.20264
 12       37.00000      30.97073        6.02927         .20163
 13       31.00000      27.62273        3.37727         .11294
 14       26.00000      25.39305         .60695         .02030
 15       36.00000      23.09053       12.90947         .43172
 16       18.00000      19.82514       -1.82514        -.06104
 17       10.00000      16.12840       -6.12840        -.20495
 18        8.20000      13.64085       -5.44085        -.18195
 19       13.40000      12.52486         .87514         .02927
 20       17.40000      11.89978        5.50022         .18394
 21        8.00000      10.74190       -2.74190        -.09169
 22        4.00000       9.69684       -5.69684        -.19051
 23        6.70000       8.17622       -1.47622        -.04937
 24        6.70000       6.66290         .03710         .00124
 25        5.80000       5.42969         .37031         .01238

These two (marked **) have fairly large residuals.

Durbin-Watson Statistic: 2.57626883803
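
The last two columns of the table and the Durbin-Watson figure can be reproduced from the RESIDUAL column alone. This is a sketch under two assumptions: that the standardized residual is the residual divided by the approximate standard error printed earlier (29.9026228091), and that the Durbin-Watson statistic is the usual ratio of squared successive differences to squared residuals. Both assumptions agree with the printed values.

```python
# Residuals (observed minus predicted) for the 25 observations.
residuals = [-22.31983, 68.90897, 5.18149, -82.44910, 13.68716,
             43.40861, -30.96313, 24.06914, -29.54572, 6.31797,
             -6.05932, 6.02927, 3.37727, 0.60695, 12.90947,
             -1.82514, -6.12840, -5.44085, 0.87514, 5.50022,
             -2.74190, -5.69684, -1.47622, 0.03710, 0.37031]

std_err = 29.9026228091   # approximate standard error from the printout

# Standardized residual = residual / standard error (assumed definition).
standardized = [r / std_err for r in residuals]

# Durbin-Watson: sum of squared successive differences / sum of squares.
num = sum((b - a) ** 2 for a, b in zip(residuals, residuals[1:]))
den = sum(r * r for r in residuals)
durbin_watson = num / den

large = [i + 1 for i, z in enumerate(standardized) if abs(z) >= 2.0]
print(round(durbin_watson, 4), large)
```

The observations with a standardized residual of at least 2 in absolute value are the two flagged in the table.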







Residual plots ?

YES       Residual plots yes

Plot on CRT ?

NO        On external plotter

Plotter identifier string (CONT if 'HPGL') ?

Plotter select code, Bus # (defaults are 7,5) ?

Residual plot option no. = ?



For plotting, X-min = ?



For plotting, X-max = ?



Distance between X-ticks = ?



# of decimals for labelling X-axis (<=7) = ?

0

Number of pen color to be used ?

1

Is above information correct ?

YES



Plot residuals vs time/sequence number.



[Plot: BLOOD CONCENTRATION (EXAMPLE 1 OF NON-LINEAR REG.); standardized residuals vs. SEQUENCE #]

Residual plots ?
NO

Option number = ?



Exit residual routine. 



Return to BSDM 






Example 5: Nonlinear Regression 

An experiment was conducted to determine the relationship between Y = elevation (in centi- 
meters) and X = distance from the summit of a hill. 

Thirty-four observations were entered from a mass storage device. 

After viewing the X-Y scatter plot, it appeared that it would be necessary to piece the model 
together. Hence, the following model suggested itself: 

Yhat = polynomial model of degree 2    if X<=65.
     = simple linear model             if 65<X<=125.
     = polynomial model of degree 2    if X>125.

i.e., the model can be written as

Yhat = A0 + A1*X + A2*X^2    if X<=65.
     = B0 + B1*X             if 65<X<=125.
     = C0 + C1*X + C2*X^2    if X>125.

or for the program's purpose:

F = (P(1) + P(2)*X(1) + P(3)*X(1)^2)*(X(1)<=65) +
    (P(4) + P(5)*X(1))*((X(1)>65) AND (X(1)<=125)) +
    (P(6) + P(7)*X(1) + P(8)*X(1)^2)*(X(1)>125)
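
The program's expression works because each comparison evaluates to 1 or 0, so only one of the three pieces contributes for a given X. The sketch below mirrors that expression in Python, where True and False also behave as 1 and 0 in arithmetic. The function name `elevation` is illustrative; the parameter values are the final estimates printed later in this example, used here only to show that the expression reproduces the PREDICTED Y column.

```python
def elevation(x, p):
    """Piecewise model; p[0]..p[7] stand for P(1)..P(8)."""
    top = (p[0] + p[1] * x + p[2] * x ** 2) * (x <= 65)
    middle = (p[3] + p[4] * x) * ((x > 65) and (x <= 125))
    bottom = (p[5] + p[6] * x + p[7] * x ** 2) * (x > 125)
    return top + middle + bottom

# Final estimates reported after 5 iterations.
est = [997.5110714, -1.0112610, -0.0280357,
       1184.0893939, -5.7627972,
       1826.8938829, -16.6126168, 0.0460476]

# One distance from each segment, plus the two end points of the data.
fitted = {x: elevation(x, est) for x in (0, 5, 70, 130, 165)}
for x, y in fitted.items():
    print(x, round(y, 5))
```

Each distance falls in exactly one segment, so the three products never overlap.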

Therefore, we have eight unknown parameters in the model to be estimated. 0.00001 was 
used as the convergence coefficient. 

The initial values were obtained by interpolating values on the scatter plot. The chosen values 
are: 

Initial Values: 

A0 = 1000     B0 = 1200     C0 = 1826
A1 = -1.0     B1 = -5.8     C1 = -16.0
A2 = -.2                    C2 = .046

After five iterations, the estimated coefficients give a Sum of Squares residual of about 295 
and a very good fit as we can observe from the plot of the data and the estimated equation. 
Also, the residual analysis seems to suggest that the fit is quite good. 

Are you going to use user defined transformation
or do Non-linear regression ? (Y/N)

NO        Other printer selected.

Are you using an HPIB Printer?

YES

Enter select code, bus address (if 7,1 press CONT) ?



********************************************************************************

* DATA MANIPULATION *

********************************************************************************



Enter DATA TYPE:

1         Raw data (data type required)

Mode number = ?

2         From mass storage

Is data stored on the program's scratch file (DATA) ?

NO        Data stored on a different medium so it must be retrieved.

Data file name = ?

LANDSCAPE: INTERNAL

Was data stored by the BS&DM system ?

YES

Is data medium placed in device INTERNAL

?

YES

Is program medium placed in device ?

YES

PROGRAM NOW STORING DATA ON SCRATCH DATA FILE AND BACKUP FILE



LANDSCAPE SEGMENTS DELINEATION

Data f i. le nawe : L.ND120 ■ E8 , 1 

Data type is: Raw data

Number of observations: 34
Number of variables: 2



Variable names:
1. DISTANCE
2. ELEVATION

Subfile name     beginning observation     number of observations

1. TOP                     1                         15

2. BOTTOM                 16                         19



SELECT ANY KEY



Option number = ?

1

Enter method for listing data:



Select special function key labeled-LIST

List all the data
In tabular form



LANDSCAPE SEGMENTS DELINEATION

Data type is: Raw data

           Variable # 1     Variable # 2
           (DISTANCE )      (ELEVATION )
OBS#

  1            .00000        1000.00000
  2           5.00000         992.40000
  3          10.00000         985.40000
  4          15.00000         973.30000
  5          20.00000         963.10000
  6          25.00000         952.90000
  7          30.00000         939.60000
  8          35.00000         929.40000
  9          40.00000         912.90000
 10          45.00000         894.50000
 11          50.00000         881.80000
 12          55.00000         864.00000
 13          60.00000         832.90000
 14          65.00000         808.80000
 15          70.00000         779.00000
 16          75.00000         757.40000
 17          80.00000         727.60000
 18          85.00000         691.40000
 19          90.00000         664.10000
 20          95.00000         633.00000
 21         100.00000         605.70000
 22         105.00000         577.10000
 23         110.00000         549.80000
 24         115.00000         518.00000
 25         120.00000         495.10000
 26         125.00000         468.40000
 27         130.00000         446.20000
 28         135.00000         421.40000
 29         140.00000         403.00000
 30         145.00000         390.90000
 31         150.00000         369.30000
 32         155.00000         356.60000
 33         160.00000         347.70000
 34         165.00000         340.10000

Option number = ?



SELECT ANY KEY 



Exit List routine. 

Remove Basic Statistics 

Go to Regression program medium. 



Option number = ?

4

Subfile (enter 0 to ignore subfiles) = ?



Non-linear regression



Number of the dependent variable = ?



How many independent variables will be in the model ?

1

Independent variable numbers (separated by commas) =

?

1

Is above information correct ?

YES

NON-LINEAR REGRESSION ON DATA SET:

LANDSCAPE SEGMENTS DELINEATION

********************************************************************************

--where: Dependent variable = (2)ELEVATION

         Independent variable(s) = (1)DISTANCE

# of parameters in the model (<=20) ?

8

Is a plot of the non-linear regression desired ?

YES

Plot on CRT ?

NO        Plot on EXTERNAL plotter

Plotter identifier string (CONT if 'HPGL') ?

Plotter select code, Bus # (defaults are 7,5) ?

Is a quick plot desired ?

NO        No quick plot. We specify our limits.

X-min = ?



X-max = ?

165
Y-min = ?

340
Y-max = ?

1040

Y-axis crosses X-axis at X = ?



X-axis crosses Y-axis at Y = ?

340

Distance between X-ticks = ?

Distance between Y-tics = ?

100
# of decimals for labelling X-axis (<=7) = ?

0
# of decimals for labelling Y-axis = ?



Number of pen color to be used ?

1

Is above information correct ?

YES       Plot shown below overlayed curve.



Beep will sound when plot done, then press CONTINUE
File name where subroutines are stored ?

LANDER: INTERNAL

Is data medium placed in device INTERNAL

?

YES

Is program medium placed in device ?

YES

Enter convergence coefficient (e.g. .005, .001)

.00001    Supply initial estimates.

Initial estimate for parameter # 1

?

1000

Initial estimate for parameter # 2

?

-1

Initial estimate for parameter # 3

?

-.2

Initial estimate for parameter # 4
?

1200

Initial estimate for parameter # 5

-5.8

Initial estimate for parameter # 6

?

1826

Initial estimate for parameter # 7

?

-16

Initial estimate for parameter # 8

?

.046

Is the above information correct ?

YES

Delta (Convergence criteria) = .00001

THE INITIAL VALUES OF PARAMETERS ARE :
PARAMETER 1 = 1000
PARAMETER 2 = -1
PARAMETER 3 = -.2
PARAMETER 4 = 1200
PARAMETER 5 = -5.8
PARAMETER 6 = 1826
PARAMETER 7 = -16
PARAMETER 8 = .046

Would you like to see every iteration ?
YES



Calcs. may be lengthy. A beep will sound when done. Press [illegible] key to ABORT!

Calculations may be quite time consuming. A beep will sound when completed.

ITERATION        ESTIMATED PARAMETER VALUES                      S.S. RESIDUALS

    0    1000.00000    -1.00000     -.20000   1200.00000
           -5.80000  1826.00000   -16.00000       .04600         1693553.53
    1     994.39986     -.69611     -.03291   1184.69313
           -5.76885  1796.78585   -16.20243       .04466             339.21
    2     997.48082    -1.00858     -.02807   1184.09329
           -5.76284  1798.11334   -16.22046       .0447              295.84
    3     997.51105    -1.01126     -.02804   1184.0894
           -5.76280  1807.15342   -16.34364       .04514             295.74
    4     997.51107    -1.01126     -.02804   1184.08939
           -5.76280  1823.35608   -16.56441       .04588             295.65
    5     997.51107    -1.01126     -.02804   1184.08939
           -5.76280  1826.81849   -16.6115?       .04604             295.65

DONE ! ! !

First eight values per line are the estimated coefficients. Last is sum of squared residuals.

********************************************************************************

THE ESTIMATED PARAMETER VALUES AFTER 5 ITERATIONS ARE :

PARAMETER 1=    997.5110714   ( 9.9751107143E+02)
PARAMETER 2=     -1.0112610   (-1.0112609889E+00)
PARAMETER 3=      -.0280357   (-2.8035714287E-02)
PARAMETER 4=   1184.0893939   ( 1.1840893939E+03)
PARAMETER 5=     -5.7627972   (-5.7627972028E+00)
PARAMETER 6=   1826.8938829   ( 1.8268938829E+03)
PARAMETER 7=    -16.6126168   (-1.6612616824E+01)
PARAMETER 8=       .0460476   ( 4.6047611520E-02)

********************************************************************************
THE INITIAL VALUE OF SUM OF SQUARED RESIDUALS = 1693553.53
AFTER 5 ITERATIONS THE SUM OF SQUARED RESIDUALS= 295.649036151
APPROXIMATE STANDARD ERROR FROM SQUARED RESIDUALS= 3.37210865409

Plot regression curve on present graph ?

YES       Plot curve on graph.

Same pen color ?

YES



[Plot: LANDSCAPE SEGMENTS DELINEATION; ELEVATION vs. DISTANCE, data with fitted curve]



Like to change initial estimates and/or function ?

NO

Are confidence intervals on parameters desired ?

NO

Residual analysis and/or prediction ?

YES

Print out residuals ?

YES



                           TABLE OF RESIDUALS

                                                    STANDARDIZED
OBS#    OBSERVED Y    PREDICTED Y      RESIDUAL       RESIDUAL     SIGNIF.

  1     1000.00000     997.51107        2.48893         .73809
  2      992.40000     991.75387         .64613         .19161
  3      985.40000     984.59489         .80511         .23876
  4      973.30000     976.03412       -2.73412        -.81080
  5      963.10000     966.07157       -2.97157        -.88122
  6      952.90000     954.70723       -1.80723        -.53593
  7      939.60000     941.94110       -2.34110        -.69425
  8      929.40000     927.77319        1.62681         .48243
  9      912.90000     912.20349         .69651         .20655
 10      894.50000     895.23201        -.73201        -.21708
 11      881.80000     876.85874        4.94126        1.46533
 12      864.00000     857.08368        6.91632        2.05104       **
 13      832.90000     835.90684       -3.00684        -.89168
 14      808.80000     813.32821       -4.52821       -1.34284
 15      779.00000     780.69359       -1.69359        -.50223
 16      757.40000     751.87960        5.52040        1.63708
 17      727.60000     723.06562        4.53438        1.34467
 18      691.40000     694.25163       -2.85163        -.84565
 19      664.10000     665.43765       -1.33765        -.39668
 20      633.00000     636.62366       -3.62366       -1.07460
 21      605.70000     607.80967       -2.10967        -.62562
 22      577.10000     578.99569       -1.89569        -.56217
 23      549.80000     550.18170        -.38170        -.11319
 24      518.00000     521.36772       -3.36772        -.99870
 25      495.10000     492.55373        2.54627         .75510
 26      468.40000     463.73974        4.66026        1.38200
 27      446.20000     445.45833         .74167         .21994
 28      421.40000     423.40833       -2.00833        -.59557
 29      403.00000     403.66071        -.66071        -.19594
 30      390.90000     386.21548        4.68452        1.38920
 31      369.30000     371.07262       -1.77262        -.52567
 32      356.60000     358.23214       -1.63214        -.48401
 33      347.70000     347.69405         .00595         .00177
 34      340.10000     339.45833         .64167         .19029

Durbin-Watson Statistic: 1.51322482175
          Test statistic for autocorrelation of residuals.
          Special tables are necessary.

Residual plots ?

YES       Residual plots

Plot on CRT ?

NO        Plot on external plotter

Plotter identifier string (CONT if 'HPGL') ?

Plotter select code, Bus # (defaults are 7,5) ?

Residual plot option no. = ?

1         Plot residuals vs time sequence

For plotting, X-min = ?



For plotting, X-max = ?

35

Distance between X-ticks = ?



# of decimals for labelling X-axis (<=7) = ?



Number of pen color to be used ?
1
Is above information correct ?

YES



[Plot: LANDSCAPE SEGMENTS DELINEATION; standardized residuals vs. SEQUENCE #]



Residual plots ?

YES

Would you like to plot on CRT ?

NO

Plotter identifier string (CONT if 'HPGL') ?

Plotter select code, bus # (defaults are 7,5) ?

Residual plot option no. = ?



For plotting, X-min = ?

300

For plotting, X-max = ?

1000

Distance between X-ticks = ?

100

# of decimals for labelling X-axis (<=7) = ?



Number of pen color to be used ?

1

Is above information correct ?



Plot residuals vs predicted Y



[Plot: LANDSCAPE SEGMENTS DELINEATION; standardized residuals vs. PREDICTED Y]

Residual plots ?

NO

Option number = ?



Exit residual routine. 



Return to BSDM 






Example 6: Standard Nonlinear Regression 

In this example, standard nonlinear models are fit to the data from Example 4. 

Are you going to use user defined transformation
or do Non-linear regression ? (Y/N)

NO

Are you using an HPIB Printer?

YES

Enter select code, bus address (if 7,1 press CONT) ?

********************************************************************************

* DATA MANIPULATION *



Enter DATA TYPE:

1         Raw data (data type required)

Mode number = ?

2         From mass storage

Is data stored on the program's scratch file (DATA)?
YES       Previously stored on program's scratch
          file called DATA.



URINE/BLOOD CONCENTRATION

Data file name: DATA

Data type is: Raw data

Number of observations: 25

Number of variables: 2

Variable names:
1. TIME(HR)
2. BLD.CONT

Subfiles: NONE

          Same data set which we used for nonlinear
          regression.



SELECT ANY KEY

Option number = ?

1         Select special function key labeled-LIST

Enter method for listing data:

3         List all the data
          In tabular form



URINE/BLOOD CONCENTRATION

Data type is: Raw data

           Variable # 1     Variable # 2
           (TIME(HR) )      (BLD.CONT  )
OBS#

  1           4.25000        1165.70000
  2           7.50000         851.00000
  3          10.80000         523.00000
  4          12.00000         365.00000
  5          16.00000         294.00000
  6          23.80000         170.00000
  7          27.80000          60.00000
  8          35.30000          81.00000
  9          38.30000          20.00000
 10          45.30000          45.00000
 11          51.30000          27.00000
 12          54.20000          37.00000
 13          59.80000          31.00000
 14          64.25000          26.00000
 15          69.50000          36.00000
 16          78.20000          18.00000
 17          90.20000          10.00000
 18         100.00000           8.20000
 19         105.00000          13.40000
 20         108.00000          17.40000
 21         114.00000           8.00000
 22         120.00000           4.00000
 23         130.00000           6.70000
 24         142.00000           6.70000
 25         154.00000           5.80000



Option nuMber = ? 



SELECT ANY KEY 



Exit List routine. 



Option number = ?



Number of the regression model



Should fitted model include intercept term ?

NO

Number of the dependent variable = ?



Number of the independent variable

1

Is above information correct?

YES 



Select special function key labeled-ADV STAT 
Remove BSDM medium. 
Insert regression medium. 



Select standard non-linear regression 
modes. 

Mixed exponential of form: 

Y = A*Exp(B*X) + C*Exp(D*X)

Note: In the non-linear regression exam- 
ple we specified 3 exponential terms. 

Y = blood count 



X = time in hours 



Displayed on CRT. It is correct. 








REGRESSION MODELING ON DATA SET:

URINE/BLOOD CONCENTRATION
********************************************************************************

--where: Dependent variable = (2)BLD.CONT

         Independent variable = (1)TIME(HR)

THE STANDARD NON-LINEAR REGRESSION MODEL SELECTED = Y=A*EXP(B*X)+C*EXP(D*X)

Is a plot of the regression desired?

YES       Like to see a plot

Plot on CRT ?

NO        But not on CRT.

Plotter identifier string (CONT if 'HPGL') ?

Plotter select code, Bus # (defaults are 7,5) ?     On an external plotter at 7,5

X-min = ?

          Specify plotting limits.

X-max = ?

160
Y-min = ?



Y-max = ?

1200

Y-axis crosses X-axis at X = ?



X-axis crosses Y-axis at Y = ?



Distance between X-ticks = ?

16

Distance between Y-tics = ?

120

# of decimals for labelling X-axis (<=7) = ?



# of decimals for labelling Y-axis = ?



Number of pen color to be used ?

1

Is above information correct ?

YES

Is plotter ready ?

YES

Are the values of the initial estimates proper?     As shown on CRT and printed out below.

YES



************************************************ 

DeltafConuergence criteria) ■= .05 

THE INITIAL VALUES OF PARAMETERS ARE : 
PARAMETER i = 334.489319026 
PARAMETER 2 =-3 , 26684362156E-02 
PARAMETER 3 = 33.4489319026 
PARAMETER 4 =-1 , 63342i8i078E-02 

Calcs. May be lengthy, A beep will sound when done, Press 'NO' key to INTERRUPT 

I 

CALCULATIONS STARTED ON 0/0 AT 0:0 



ITERATION 






A 





334 ,48932 


1 


767,19521 




1593,74645 


3 


1854.77884 


4 


1974,36951 


r~ 


2008,32686 


6 


2003.21510 



ESTIMATED PARAMETER VALUES 
B C 



S.S. RESIDUALS 



3267 


33 


, 44893 


09251 


2 1 1 


,31532 


18542 


335 


,76599 


13293 


214 


, 4524 


14275 


12 


,85302 


13849 


78 


,90472 


1369? 


73 


,85445 



D 

01633 
03542 
3753 
3737 
02902 
1868 
01677 



1137486, 
330889, 
82374, 
39473 , 
17809, 
17060, 
16989, 



3294 
6618 
840 
4982 
6312 
8039 
6350 



DONE' ! ! ! 

********************************************************************************

THE ESTIMATED PARAMETER VALUES AFTER 6 ITERATIONS ARE :

PARAMETER 1= 2002.9416350 ( 2.0029416350E+03)
PARAMETER 2= -.1371748 (-1.3717475809E-01)
PARAMETER 3= 75.2098521 ( 7.5209852103E+01)
PARAMETER 4= -.0170887 (-1.7088737873E-02)

********************************************************************************
THE INITIAL VALUE OF SUM OF SQUARED RESIDUALS = 1137486.32942
AFTER 6 ITERATIONS THE SUM OF SQUARED RESIDUALS= 16989.1765347
APPROXIMATE STANDARD ERROR FROM SQUARED RESIDUALS= 28.4430730832

******************************************************************************** 

Should regression line be plotted on same graph ?

YES

Same pen color ?

YES

Note: These results, in terms of the sum of squared residuals, are very close to
those of the non-linear regression example with two more parameters.



[Plot: URINE/BLOOD CONCENTRATION, BLD.CONT versus TIME(HR), with the fitted regression line]



New initial estimates and/or convergence criteria ?

NO Satisfied with results.

Are confidence intervals on parameters desired ?

YES Why not get confidence intervals

Confidence coefficient for confidence interval on parameters (e.g. 90,95,99) =



95 % CONFIDENCE INTERVALS ON PARAMETERS

                 ONE-AT-A-TIME C.I.              SIMULTANEOUS C.I.
PARAMETER    LOWER LIMIT    UPPER LIMIT      LOWER LIMIT    UPPER LIMIT
    1          1818.4543      2187.4289        1703.9352      2301.9481
    2             -.1593         -.1150           -.1731         -.1013
    3           -42.6712       193.0909        -115.8451       266.2648
    4             -.0433          .0092           -.0596          .0255



Residual analysis and/or prediction ? 



YES 



Print out residuals? 
YES 



Residual analysis 







TABLE OF RESIDUALS

                                                      STANDARDIZED
OBS#    OBSERVED Y    PREDICTED Y      RESIDUAL         RESIDUAL      SIGNIF
  1     1165.70000     1188.03381     -22.33381          -.78521
  2      851.00000      782.07774      68.92226          2.42317        **
  3      523.00000      517.80218       5.19782           .18274
  4      365.00000      447.43453     -82.43453         -2.89823        **
  5      294.00000      280.30783      13.69217           .48139
  6      170.00000      126.60208      43.39792          1.52578
  7       60.00000       90.97711     -30.97711         -1.08909
  8       81.00000       56.94431      24.05569           .84575
  9       20.00000       49.55743     -29.55743         -1.03918
 10       45.00000       38.68823       6.31177           .22191
 11       27.00000       33.06036      -6.06036          -.21307
 12       37.00000       30.96936       6.03064           .21202
 13       31.00000       27.61708       3.38292           .11894
 14       26.00000       25.38439        .61561           .02164
 15       36.00000       23.07884      12.92116           .45428
 16       18.00000       19.80955      -1.80955          -.06362
 17       10.00000       16.10942      -6.10942          -.21479
 18        8.20000       13.62043      -5.42043          -.19057
 19       13.40000       12.50406        .89594           .03150
 20       17.40000       11.87886       5.52114           .19411
 21        8.00000       10.72091      -2.72091          -.09566
 22        4.00000        9.6760       -5.6760           -.19956
 23        6.70000        8.15597      -1.45597          -.05119
 24        6.70000        6.64379        .05621           .00198
 25        5.80000        5.41199        .38801           .01364

Standardized residual = Residual / Sy.x

Note: Two large residuals.

Durbin-Watson Statistic: 2.57642711573 
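
The summary figures in this printout can be cross-checked by hand. The following is a minimal sketch in Python, not the library's own code. It assumes that the "approximate standard error" Sy.x is sqrt(S.S. residuals / (n - p)) for n observations and p fitted parameters (an assumption that reproduces the printed 28.4430730832 for n = 25 and p = 4), and it uses the usual definition of the Durbin-Watson statistic. The function names are illustrative only.

```python
import math

def approx_std_error(ss_residuals, n_obs, n_params):
    """Sy.x = sqrt(SS residuals / (n - p)); assumed definition."""
    return math.sqrt(ss_residuals / (n_obs - n_params))

def standardized(residuals, sy_x):
    """Divide each residual by Sy.x, as in the STANDARDIZED RESIDUAL column."""
    return [r / sy_x for r in residuals]

def durbin_watson(residuals):
    """Sum of squared successive differences over the sum of squared residuals."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(r * r for r in residuals)
    return num / den

# Four-parameter fit above: S.S. residuals after 6 iterations, 25 observations.
sy_x = approx_std_error(16989.1765347, 25, 4)
print(round(sy_x, 4))                                # 28.4431
print(round(standardized([-22.33381], sy_x)[0], 5))  # first row of the table
```

A Durbin-Watson value near 2 suggests little first-order correlation between successive residuals.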



Residual plots? 

YES 

Plot on CRT? 

NO 

Plotter identifier string (press CONT if 'HPGL') ?
Plotter select code, Bus # = (defaults are 7,5)

Residual plot option no. = ?

1

For plotting, X-min = ?

For plotting, X-max = ?

Specify limits for residual plot versus sequence #.



Distance between X-ticks = 



# of decimals for labelling X-axis (<=7) = ?



Number of pen color to be used 
1 



Is above information correct? 



YES 



[Plot: URINE/BLOOD CONCENTRATION, residuals versus SEQUENCE #]

Residual plots ?

NO

Option number = ?

Exit residual routine.



Number of the regression model = ?

Should fitted model include intercept term ?

YES

Number of the dependent variable = ?



Standard non-linear regression models. 
This time with intercept term. 

Mixed exponentials 



This time with an "intercept" term. 
Y = A*EXP(B*X) + C*EXP(D*X) + E 



Number of the independent variable = ? 



Is above information correct? 
YES 






********************************************************************************

REGRESSION MODELING ON DATA SET:

URINE/BLOOD CONCENTRATION
********************************************************************************

--where: Dependent variable = (2)BLD.CONT

Independent variable = (1)TIME(HR)

THE STANDARD NON-LINEAR REGRESSION MODEL SELECTED = Y=A*EXP(B*X)+C*EXP(D*X)+E

IN RADIANS
Is a plot of the regression desired ?

NO

Are the values of the initial estimates proper?



No plot this time. 



YES 
**************************************************** 

Delta (Convergence criteria) = .05

THE INITIAL VALUES OF PARAMETERS ARE :
PARAMETER 1 = 429.352234007
PARAMETER 2 =-.0428279809939
PARAMETER 3 = 42.9352234007
PARAMETER 4 =-.021413990497
PARAMETER 5 = 3.92

Would you like a hard copy of every iteration ?

YES

Calcs. may be lengthy. A beep will sound when done. Press 'NO' key to INTERRUPT
!
CALCULATIONS STARTED ON 0/0 AT 0:0



ITERATION 



ESTIMATED PARAMETER VALUES 



S.S. RESIDUALS 



429.35223 
3,92000 
885, 18106 
36.47985 

1286.23842 
20,86657 

1243,85321 
18,44537 



-.04283 



9863 



-.12132 



10655 



42 , 93522 



211 , 11731 



604,50758 



824,55159 



02141 



909776. 40446 



,0 9760 



278063.40271 



,17961 



50454.00143 



.18667 



18850.77414 



DONE MM 

******************************************************************************** 

THE ESTIMATED PARAMETER VALUES AFTER 3 ITERATIONS ARE :

PARAMETER 1= 1218.8442529 ( 1.2188442529E+03)
PARAMETER 2= -.1063836 (-1.0638358805E-01)
PARAMETER 3= 860.7805920 ( 8.6078059213E+02)
PARAMETER 4= -.1848468 (-1.8484679940E-01)
PARAMETER 5= 17.7594408 ( 1.7759440852E+01)

********************************************************************************

THE INITIAL VALUE OF SUM OF SQUARED RESIDUALS = 909776.404501
AFTER 3 ITERATIONS THE SUM OF SQUARED RESIDUALS= 18803.5777771
APPROXIMATE STANDARD ERROR FROM SQUARED RESIDUALS= 30.6623366503
********************************************************************************

Note: Not as good as before.

New initial estimates and/or convergence criteria ?

NO

Are confidence intervals on parameters desired ?

YES

Confidence coefficient for confidence interval on parameters (e.g. 90,95,99) =

95

95 % CONFIDENCE INTERVALS ON PARAMETERS

                 ONE-AT-A-TIME C.I.              SIMULTANEOUS C.I.
PARAMETER    LOWER LIMIT    UPPER LIMIT      LOWER LIMIT    UPPER LIMIT
    1           631.4045      1806.2840         182.0381      2255.6504
    2             -.1322         -.0806           -.1519         -.0609
    3           228.0449      1493.5163        -255.9710      1977.5322
    4             -.3155         -.0542           -.4154          .0457
    5             1.8285        33.6904         -10.3579        45.8768



********************************************************************************

********************************************************************************

Residual analysis and/or prediction ? 

YES 

Print out residuals? 

YES 

TABLE OF RESIDUALS

                                                      STANDARDIZED
OBS#    OBSERVED Y    PREDICTED Y      RESIDUAL         RESIDUAL      SIGNIF
  1     1165.70000     1185.65897     -19.95897          -.65093
  2      851.00000      781.76841      69.23159          2.25787        **
  3      523.00000      521.01910       1.98090           .06460
  4      365.00000      451.45738     -86.45738         -2.81966        **
  5      294.00000      284.66099       9.33901           .30458
  6      170.00000      125.23916      44.76084          1.45980
  7       60.00000       86.12756     -26.12756          -.85211
  8       81.00000       47.53330      33.46670          1.09146
  9       20.00000       39.20569     -19.20569          -.62636
 10       45.00000       27.79845      17.20155           .56100
 11       27.00000       23.02251       3.97749           .12972
 12       37.00000       21.61557      15.38443           .50174
 13       31.00000       19.87723      11.12277           .36275
 14       26.00000       19.07606       6.92394           .22581
 15       36.00000       18.51147      17.48853           .57036
 16       18.00000       18.05704       -.05704          -.00186
 17       10.00000       17.84239      -7.84239          -.25577
 18        8.20000       17.78867      -9.58867          -.31272
 19       13.40000       17.77661      -4.37661          -.14274
 20       17.40000       17.77192       -.37192          -.01213
 21        8.00000       17.76603      -9.76603          -.31850
 22        4.00000       17.76292     -13.76292          -.44885
 23        6.70000       17.76064     -11.06064          -.36072
 24        6.70000       17.75978     -11.05978          -.36070
 25        5.80000       17.75953     -11.95953          -.39004

Durbin-Watson Statistic: 2.36053763341



Residual plots? 

NO 

Option number = ?

7 



Return to BSDM 






Statistical Graphics 



General Information 

Object of Program 

This group of nine programs has been developed to let you quickly get a graphical
representation of your data, on the CRT screen or an HP-IB Peripheral Plotter, while
answering a minimum number of questions.

Because of the length of the programs, two discs are used to hold the Statistical Graphics 
Routines. 

The entry to every program requires that you specify only the variables to be used and how 
subfiles are to be treated if they exist. From here on, all plotting parameters are determined by 
the program, and a plot may be constructed immediately by selecting the plot option from the 
plotting characteristics menu. 

Once the data has been specified, you have the option of changing nearly all the plotting 
parameters in order to construct a more personalized plot. This is done by selecting the option 
from the plotting characteristics menu. 

Any time new variables are defined by selecting the "RESTART" option from the menu, all 
previous parameters that had been defined are reset to a default value for this particular data 
set. In order to save the plotting characteristics you have specified, select the store option 
available in the menu. This stores the plotting characteristics in another file of your choice.
Then, after you select the restart option, you can retrieve these characteristics by selecting the 
load option. 

Special Considerations 

1. Every time you select a graph type, the CRT is declared as the standard plotting device. 
This unit may be changed by selecting the "Select Plotter" option from the plotting 
characteristics menu. 

2. Every program begins its execution by reloading the data contained in the "DATA" file. 
In the case of the NORMAL PROBABILITY PLOT and WEIBULL PROBABILITY PLOT, 
the file "DATA" is reloaded every time the "RESTART" option is selected. 






3. The "RESTART" option always initializes the plotting parameters to default values as
follows:

• The axes labels default to the name of the variable being plotted. 

• The graph title contains the first 33 characters of the data set title. If a subfile is
declared, the graph title is preceded by the 10-character name given to the subfile.

• The plotting symbol is a plus sign. 

• Pen numbers are set to 1. 

• The axis parameter is wide enough to contain the data set and has 10 equally spaced 
tic marks with every second tic mark being labeled. 

• In the special case of the log axis, only complete cycles are plotted, so data on a log
scale may be scaled so that it fits in entire cycles.

• The graphics device used for plotting is reset to CRT. 



Note 

After selecting the "OVERLAY" option, new data may be plotted on 
the previously constructed graph. But, the default values will be in 
effect for pen number and symbol. These may be changed by select- 
ing the "SELECT PLOTTER" and "SELECT PEN NUMBER" op- 
tions from the Plotting Characteristics menu. 

4. Whenever the program identifies an incorrect response, the question is asked again, until 
the correct response is given. 

5. Most plotting symbols are centered on top of the point they are designating. For some 
special characters, like the period and comma, the symbols are plotted in a lower posi- 
tion. 

6. The graphics programs only allow up to six decimal places for labeling the axis tic marks. 
For data that would need more, it is suggested that it be transformed. 

7. Each program handles missing values in a different way. See the individual programs for 
details. 

8. When asking for labeling information, an error 18 will occur if the label is too long. To 
recover, shorten the label and re-enter it. 

9. Do not press the "RUN" or "SHIFT-PAUSE" (RESET) keys unless it is necessary. The
"RUN" key erases all variables, and RESET may erase memory. 

10. To prevent a graph segment from being plotted, assign a pen number of -1 for the CRT
or for an external plotter to that segment using the select pen numbers option. 



Note 

Statistical Graphics may be entered from any of the other Statistics 
packages by selecting the Advanced Statistics option. Once in the 
Statistical Graphics package, select the type of plot you wish to do 
from the menu provided. 






Common Plotting Characteristics 

The following options are available for all nine of the plots, so their description and operation 
are explained in this section. There are slight deviations in the way some of these options 
work for the different plots. These differences are explained in the sections that describe each 
plot. It is recommended that you read through the section for the particular plot you wish to 
do before using the program. Not all plotting characteristics can be changed in each program. 

RESTART 

When this option is selected, all plotting parameters for the data set are reset to the default 
values which the program has determined for the data set. At this time a new variable to be 
plotted may be selected. 

PLOT 

This option plots the variable(s) being considered. The plot will be done on the CRT if no 
other device has been specified. If you have not specified any plotting parameters, the ones 
determined by the program are used, otherwise, the plotting characteristics you specified are 
used. You may choose whether or not to connect the points on most of the graphs, and 
whether or not to put grid lines on the graph. 

X-AXIS 

This option allows you to designate the scale for the x-axis. You determine the minimum x
value, the maximum x value, the distance between the tic marks on the axis, and how many
places after the decimal point you want printed. Since complete cycles on the x-axis are 
required by the semi-log, log-log, normal and Weibull plots, this option may not be used in 
those routines. 

Y-AXIS 

This option allows you to designate the scale for the y-axis. You determine the minimum y 
value, the maximum y value, the distance between the tic marks on the axis, and how many 
places after the decimal point you want printed. Since full cycles are required on the y-axis by 
the log-log and Weibull plots, this option may not be used in those routines. 

LABELS 

This option allows you to change the labels of your graph. You have an opportunity to 
change the x-axis label, the y-axis label, and the title of the graph. 

SYMBOLS 

This option allows you to change the symbol used to designate the points on the graph. If you 
do not want any symbol use a blank which is designated by " ". 

Dump Graphics On CRT 

This option prints the most recent CRT graph on the printer. This option may be used only if 
your printer has graphics capabilities (e.g. 2671G, 2631G). 






SELECT PLOTTER 

This option allows you to select the plotting device on which you wish to have the plot drawn. 
You may have the plot done on the CRT or an external plotter. You will need to input the select 
and bus codes. You will also need to input a plotter identification string. 

SELECT PEN COLOR 

This option allows you to select the pen number you wish to use for plotting your graph. The 
pen number used may be changed for axes and numeric labels, grid lines, labels and points. 

OVERLAY 

This option, when available, allows you to add another plot of the same type with new vari- 
ables on the previously constructed plot. The plotting limits will remain as you have specified. 

STORE 

This option allows you to store the plotting characteristics that you have specified so that they 
may be retrieved at a later time. To do this you need to specify a file name and where you 
wish to store the information. 

LOAD 

This option allows you to retrieve the plotting characteristics that were stored previously for 
this type of plot. You need to specify the name of the file and where it was stored. The 
program will then list the stored plotting characteristics. 

RETURN 

This option returns the program to the main STATISTICAL GRAPHICS MENU. 



Time Plot 

Object of Program 

This program plots any variable in increasing units of time or sequence number. This plot is 
useful in determining the effect that time/sequence may have on a variable. The program 
allows the initial time to begin the plotting and the time period between points to be set by 
selecting the "X AXIS" option. If the plot option is selected first, the program defaults to a 
starting time of 1 and time increment period of 1. 

Special Considerations 

1. Missing values are not plotted. The value at this time period is left blank. 

2. When doing an overlay of the data, the initial time and time increments are 1 unless 
changed by selecting the x-axis option. Once the values have been changed, they retain 
the new values until they are changed again. 






Special Plotting Characteristics 

X-AXIS 

This option allows you to determine the scale for the time axis. You need to specify the
minimum and maximum time values, and the distance between tic marks. In addition, you
need to specify the initial time for the beginning of the series (the point in time at which
plotting begins) and the time increment between points (how much time passes between
each plotted point).
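
As a rough illustration of how the initial time and time increment place the points, the sketch below pairs each observation with its time value. It is written in Python and is not the library's code. It assumes that missing values are represented by None and that a missing value still occupies its time slot, which is how we read "the value at this time period is left blank". The function name is hypothetical.

```python
def time_plot_points(values, initial_time=1.0, increment=1.0):
    """Return (time, value) pairs; observation k is placed at initial_time + k*increment."""
    points = []
    for k, y in enumerate(values):
        if y is None:
            continue  # missing value: its time slot is left blank
        points.append((initial_time + k * increment, y))
    return points

# Defaults match the program's: starting time 1, time increment 1.
print(time_plot_points([5.0, None, 7.0]))  # [(1.0, 5.0), (3.0, 7.0)]
```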

OVERLAY 

This option allows you to plot another variable over an already constructed graph.

References 

1. EXPLORATORY DATA ANALYSIS, John W. Tukey; 1977; Addison Wesley. 

2. "A Review of Some Smoothing and Forecasting Techniques", T. J. Boardman and 
M.C. Bryson, Journal of Quality Technology, Volume 10, Number 1, January, 1978. 



Histogram 

Object of Program 

This program creates a histogram with up to forty cells. For every data set, the sample mean, 
the sample variance, the number of cases used to calculate them, and the cell statistics will be 
printed. 

Different histograms may be created by specifying the number of cells to be used, and the cell 
locations, or by specifying the number of cells, the location of the first cell, and the cell width. 
These specifications may be given by selecting the "CELL LIMIT" option from the Plotting 
Characteristics menu. 

A normal curve overlay and the corresponding Chi-squared goodness-of-fit statistic may be 
obtained by selecting the "NORMAL CURVE OVERLAY" option from the Plotting Charac- 
teristics menu. 

Special Considerations 

1. Missing values are not considered in any calculation, and are not considered in con- 
structing any cell. 

2. A maximum number of forty cells may be obtained. 

3. At least four cells are needed to perform a chi-squared goodness-of-fit test. 

Special Plotting Characteristics

CELL LIMITS 

This option allows you to specify the cell size for the histogram. There are two ways of doing 
this: 

1. Enter the number of cells (greater than 1 but not more than 40), and enter the minimum
cell value and the maximum cell value that should be used.

2. Enter the number of cells (greater than 1 but not more than 40), and enter the minimum
cell value and the width of the cell.






The program will then give you a list of the number of cells, their minimum and maximum 
bounds, and the number of observations in each cell. 
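
The two ways of specifying cells lead to the same set of cell bounds. The following Python sketch shows one way to derive the bounds and the counts. It is an illustration under stated assumptions, not the program's actual logic. The manual does not say which cell receives a value that falls exactly on a boundary, so the sketch assumes it goes into the cell on the right, except for the maximum bound, which goes into the last cell. Missing values are assumed to be None, and the function names are hypothetical.

```python
def cell_edges(n_cells, minimum, maximum=None, width=None):
    """Cell bounds from (count, min, max) or from (count, min, width)."""
    if not 1 < n_cells <= 40:
        raise ValueError("number of cells must be greater than 1 and at most 40")
    if (maximum is None) == (width is None):
        raise ValueError("give either the maximum cell value or the cell width")
    if width is None:
        width = (maximum - minimum) / n_cells
    return [minimum + i * width for i in range(n_cells + 1)]

def cell_counts(data, edges):
    """Number of non-missing observations in each cell (boundary rule is assumed)."""
    counts = [0] * (len(edges) - 1)
    last = len(counts) - 1
    for x in data:
        if x is None:
            continue  # missing values are not considered
        for i in range(len(counts)):
            if edges[i] <= x < edges[i + 1] or (i == last and x == edges[-1]):
                counts[i] += 1
                break
    return counts

edges = cell_edges(4, 0, maximum=8)
print(edges)                                    # [0.0, 2.0, 4.0, 6.0, 8.0]
print(cell_counts([1, 2, None, 8, 9], edges))   # [1, 1, 0, 1]
```

Values outside the specified bounds (such as 9 above) are simply not counted in this sketch.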

NORMAL CURVE OVERLAY 

This option does a chi-square goodness-of-fit test of the data. In order to do this at least four 
cells must be specified; if four cells have not been specified, an error will be printed. The 
descriptive statistics for each cell will be printed. The contributions to the chi-squared statistics 
are added together to get the final value. The cells on the tails are collapsed together until an 
expected frequency of at least three and less than seven is found, and then the contribution is 
calculated. If, after collapsing the end cells to get high enough frequencies, the number of 
terms in the contribution of the chi-squared value go below four then another error will be 
printed. 

Once this is done the normal curve for the desired plot is plotted over the histogram. 

Methods and Formulae 

$X_i$ = ith observation of the selected variable that is not a missing value
N = number of valid observations

$$\bar{X} = \frac{1}{N}\sum_{i=1}^{N} X_i$$

$$\text{Variance} = \frac{\sum_{i=1}^{N} X_i^2 - \left(\sum_{i=1}^{N} X_i\right)^2 / N}{N-1}$$

$$\text{Normal curve overlay} = \frac{100 \cdot (\text{Cell width}) \cdot \exp\left(-(X-\bar{X})^2 / (2 \cdot \text{Variance})\right)}{\sqrt{2\pi \cdot \text{Variance}}}$$

$$\chi^2 = \sum_{i=1}^{\#\,\text{cells}} \frac{(\text{Observed frequency in cell } i - \text{Expected frequency in cell } i)^2}{\text{Expected frequency in cell } i}$$

df = (# of cells) - 3, because 1 degree of freedom is lost for the number of cells, 1 for the
estimated mean, and 1 for the estimated variance.
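
The formulas above translate directly into code. The sketch below is a minimal Python rendering for checking results by hand; it is not the program's implementation. It assumes missing values are represented by None, and the function names are illustrative only.

```python
import math

def mean_and_variance(xs):
    """Sample mean and variance via the computational formula above."""
    xs = [x for x in xs if x is not None]   # drop missing values
    n = len(xs)
    s = sum(xs)
    s2 = sum(x * x for x in xs)
    return s / n, (s2 - s * s / n) / (n - 1), n

def normal_overlay(x, mean, variance, cell_width):
    """Height of the normal curve overlay at x, scaled by 100 * cell width."""
    return (100 * cell_width * math.exp(-(x - mean) ** 2 / (2 * variance))
            / math.sqrt(2 * math.pi * variance))

def chi_square(observed, expected):
    """Sum over cells of (observed - expected)^2 / expected."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

mean, variance, n = mean_and_variance([1, 2, None, 3, 4])
print(mean, n)   # 2.5 4
```

The degrees of freedom for the test would then be the number of cells (after any collapsing of tail cells) minus 3.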

The expected frequency of cell i is the area under the normal curve overlay that falls in
cell i. It is calculated by determining the left side of cell i (A) and the right side of
cell i (B) and finding

$$\Phi\left(\frac{B-\bar{X}}{\text{standard deviation}}\right) - \Phi\left(\frac{A-\bar{X}}{\text{standard deviation}}\right)$$

The following approximation is then used for the area between A and B in a standard normal:

$$\Phi(X) = 1 - Z(X)\,(b_1 t + b_2 t^2 + b_3 t^3 + b_4 t^4 + b_5 t^5) + E(X), \quad |E(X)| < 7.5 \times 10^{-8}$$

for $X > 0$, and $1 - \Phi(|X|)$ for $X \le 0$, where

$$t = (1 + .2316419\,X)^{-1}$$

$b_1 = .31938153$

$b_2 = -.35656378$

$b_3 = 1.781477939$

$b_4 = -1.821255978$

$b_5 = 1.330274429$

$$Z(X) = \exp(-X^2/2)/\sqrt{2\pi}$$
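
A short Python sketch of this approximation follows, for readers who want to check expected cell frequencies by hand. It is an illustration, not the library's code. The constants are taken from formula 26.2.17 of the Handbook of Mathematical Functions (Reference 5), which differ from the values printed above only in the last digits, and the function names are hypothetical.

```python
import math

P = 0.2316419
B = (0.319381530, -0.356563782, 1.781477937, -1.821255978, 1.330274429)

def z_density(x):
    """Standard normal density Z(X)."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def phi(x):
    """Approximate standard normal CDF; absolute error below 7.5e-8."""
    t = 1 / (1 + P * abs(x))
    poly = sum(b * t ** (k + 1) for k, b in enumerate(B))
    upper_tail = z_density(x) * poly          # 1 - Phi(|x|)
    return 1 - upper_tail if x >= 0 else upper_tail

def expected_frequency(a, b, mean, sd, n):
    """Expected count in the cell [a, b] for n observations from N(mean, sd^2)."""
    return n * (phi((b - mean) / sd) - phi((a - mean) / sd))

print(round(phi(0.0), 6))   # 0.5
```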

To calculate the right-tailed probability value associated with the chi-square value we use

$$P(\chi^2_v > \text{calculated value}) = 1 - \left\{\frac{(\chi^2/2)^{v/2}\,\exp(-\chi^2/2)}{\Gamma((v+2)/2)}\right\} \cdot C$$

$$C = 1 + \sum_{r=1}^{\infty} \frac{\chi^{2r}}{(v+2)(v+4)\cdots(v+2r)}$$

where $\chi^2$ is the calculated value,

$v$ is the degree(s) of freedom, and

$\Gamma(\cdot)$ is the standard gamma function, with $\Gamma(1.5) = .886226925$.

The sum is calculated until the percentage of change between two consecutive sums is less
than .000001 or r = 40.
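
The series can be evaluated term by term, since each term is the previous one multiplied by chi-square/(v + 2r). The Python sketch below follows the stopping rule described above (relative change below .000001, or 40 terms). It is a sketch under those assumptions rather than the program's own routine, and it uses Python's built-in math.gamma for the gamma function.

```python
import math

def chi_square_right_tail(x2, v, tol=1e-6, max_terms=40):
    """P(chi-square with v degrees of freedom exceeds x2), by the series above."""
    if x2 <= 0:
        return 1.0
    total = 1.0   # the series C, starting from its leading 1
    term = 1.0
    for r in range(1, max_terms + 1):
        term *= x2 / (v + 2 * r)          # x2^r / ((v+2)(v+4)...(v+2r))
        previous = total
        total += term
        if abs(total - previous) / abs(total) < tol:
            break
    lead = (x2 / 2) ** (v / 2) * math.exp(-x2 / 2) / math.gamma((v + 2) / 2)
    return 1 - lead * total

# For v = 2 the right tail is exactly exp(-x2/2), which gives a quick check.
print(abs(chi_square_right_tail(3.0, 2) - math.exp(-1.5)) < 1e-5)  # True
```

With a 40-term limit, very large chi-square values may not converge fully; the result should then be read as approximate.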

The number of cells being used defaults to the value given by the closest integer to
the function:

$$1 + 3.3\,\log_{10}(\text{Number of valid observations})$$
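
As a quick check of this default, here is a one-function Python sketch. The function name is illustrative, and rounding halves upward is an assumption, since the manual only says "closest integer".

```python
import math

def default_cell_count(n_valid):
    """Closest integer to 1 + 3.3 * log10(number of valid observations)."""
    return int(math.floor(1 + 3.3 * math.log10(n_valid) + 0.5))

print(default_cell_count(83))   # 7
```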

References 

1. An Introduction to Statistical Methods and Data Analysis, Lyman Ott; 1977; Wads- 
worth. 

2. Statistics for Modern Business Decisions, Second Edition, Lawrence Lapen; 1978; Har- 
court, Brace, Jovanovich. 

3. Statistical Analysis for Decision Making, Second Edition, Morris Hamburg; 1977; Har- 
court, Brace, Jovanovich. 

4. Fundamental Statistics for Business and Economics, Fourth Edition; Neter, Wasserman, 
and Whitmore; 1973; Allyn and Bacon. 

5. Handbook of Mathematical Functions, Abramowitz, Stegun; Fifth Printing; 1965 Dover 
Publications. 






Normal Probability Plot 

Object of Program 

This program creates normal probability paper, orders the data, and then plots the data on 
the paper. This plot may be used to indicate if the data set may have come from a normal 
distribution. If a straight line can be made to fit the plotted points, then the data may come 
from a normal distribution. 

Special Considerations 

1. Missing values are eliminated from the data, which effectively makes the data set one 
smaller for each missing value. 

2. When plotting more than a hundred points, it is suggested that the period be used as 
the plotting symbol. This allows for a more even line. Note that the period is plotted 
lower than the actual value of the point. 

3. A maximum of 999 points may be plotted on the graph with the empirical distribution 
used by the program. 

Special Plotting Characteristics 

LABELS 

This option allows you to change the labels for the y-axis and the title, but not the x-axis. 

OVERLAY 

This option allows you to plot the normal probability of another variable over the already 
existing graph. 

Methods and Formulae 

Empirical Distribution Function (EDF) 

$X_i$ is the ith sorted value in the data set, where i goes from 1 to N and N is the number
of non-missing values in the data set.

$$\text{EDF}(X_i) = i/(N+1)$$

The cumulative distribution function (CDF) value used for plotting and scaling the X axis is
found by determining $\text{EDF}(X_i)$ and then determining $X_p$:

$$X_p = t - \frac{c_0 + c_1 t + c_2 t^2}{1 + d_1 t + d_2 t^2 + d_3 t^3}$$

where $t = \sqrt{\log_e\left(1/\text{EDF}(X_i)^2\right)}$ and

$c_0 = 2.515517$
$c_1 = .802853$
$c_2 = .010328$
$d_1 = 1.432788$
$d_2 = .189269$
$d_3 = .001308$
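
The sketch below shows how plotting positions and normal scores might be computed from these formulas. It is written in Python and is not the program's code. One assumption is made explicit: the rational approximation is applied to a tail probability of at most 0.5 and extended to the other half by symmetry, as in the Handbook of Mathematical Functions (formula 26.2.23). The manual does not spell out how EDF values above 0.5 are handled. Missing values are assumed to be None, and the function names are hypothetical.

```python
import math

C = (2.515517, 0.802853, 0.010328)
D = (1.432788, 0.189269, 0.001308)

def upper_tail_point(p):
    """Point with upper-tail probability p (0 < p <= 0.5); error below 4.5e-4."""
    t = math.sqrt(math.log(1 / p ** 2))
    num = C[0] + C[1] * t + C[2] * t * t
    den = 1 + D[0] * t + D[1] * t * t + D[2] * t ** 3
    return t - num / den

def normal_score(edf):
    """Standard normal quantile for a cumulative probability (symmetry assumed)."""
    if edf <= 0.5:
        return -upper_tail_point(edf)
    return upper_tail_point(1 - edf)

def normal_plot_points(data):
    """Sorted non-missing values paired with the normal score of i/(N + 1)."""
    xs = sorted(x for x in data if x is not None)
    n = len(xs)
    return [(x, normal_score(i / (n + 1))) for i, x in enumerate(xs, start=1)]

print(round(normal_score(0.975), 2))   # 1.96
```

If the resulting points lie close to a straight line, the data may come from a normal distribution.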






References 

1. Probability Plots for Decision Making, James R. King; 1971; Industrial Press. 

2. "Weibull Probability Papers", Wayne Nelson and Vernon C. Thompson, Journal of 
Quality Technology; Volume 3, Number 2, April 1971.

Weibull Probability Plot 

Object of Program 

This program creates Weibull probability paper, orders and then plots the data. The number of 
cycles used to plot the data is determined by the data. 

If the plotted data appears to lie on a straight line, the data may come from a Weibull distribu- 
tion. No attempt is made in the program or on the paper to estimate the parameters of the 
Weibull distribution. 

Special Considerations 

1. Missing values are eliminated from the data, which effectively makes the data set 1 
smaller for each missing value. 

2. When more than a hundred points are plotted, it is suggested that the period be used as 
the plotting symbol. This allows for a more even, narrower line. Note that the period is 
plotted lower than the actual value of the point. 

3. A maximum of 999 points may be plotted on the graph with the empirical distribution 
used by the program. 

4. All data used by this program must be positive. The data is checked and a message is 
printed if any zero or negative data is found. 

Methods and Formulae 

Empirical Distribution Function (EDF) 

Xi is the ith sorted value in the data set. i can go from 1 to N where N is the number of 
non-missing values in the data set. 

$$\text{EDF}(X_i) = i/(N+1)$$

$$\text{Percent Failure} = \log_e\left(\log_e\left(\frac{1}{1-\text{EDF}(X_i)}\right)\right)$$
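
A minimal Python sketch of Weibull plotting positions follows. It assumes the usual Weibull-paper transform, log(log(1/(1 - EDF))) on the probability axis and the log of the data on the other axis, with natural logarithms. The formula in the manual is only partly legible, so this is an assumption rather than a confirmed detail of the program. Missing values are assumed to be None and the function name is hypothetical.

```python
import math

def weibull_plot_points(data):
    """Pairs (log x, log(log(1/(1 - EDF)))) for the sorted non-missing data."""
    xs = sorted(x for x in data if x is not None)
    if any(x <= 0 for x in xs):
        raise ValueError("all data must be positive")
    n = len(xs)
    points = []
    for i, x in enumerate(xs, start=1):
        edf = i / (n + 1)
        points.append((math.log(x), math.log(math.log(1 / (1 - edf)))))
    return points

print(len(weibull_plot_points([12.0, None, 3.5, 7.1])))   # 3
```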






Scattergram 

Object of Program 

This program plots points on a graph according to the two variables you specify. The plot is 
useful in determining if there is any relationship between two variables. 

Special Considerations 

For any point where either the X or Y coordinate is missing, the point is not plotted. 



Semi-Log Plot 

Object of Program 

This program plots points on a graph where each X value is plotted on a log scale, and each Y 
value is plotted on a normal scale. The number of cycles used on the X axis is determined by the 
program. 

This plot is useful in determining if any relationship between an untransformed Y variable and a 
log-transformed X variable exists. 

Special Considerations 

1. For any point where either the X or Y coordinate is missing, the point is not plotted. 

2. All data used for the X variable must be greater than zero. 



Log-Log Plot 

Object of Program 

This program plots points on a graph where both the X and Y axes take on log values. The 
number of cycles used by both axes are determined by the program. 

The plot of the points is useful in determining if any relationship exists between log- 
transformed X and Y variables. 

Special Considerations 

1. For any point where either the X or Y coordinate is missing, the point is not plotted. 

2. All data specified for this program must be positive. 

References 

1. Exploratory Data Analysis, John W. Tukey; 1977; Addison Wesley. 

2. The Statistical Analysis of Experimental Data, John Mandel; Interscience. 






3D Plot 

Object of Program 

This program constructs and draws points in a simulated three-dimensional graph. The axes 
may be rotated and tilted to see relationships between the data better. An effective XY scatter- 
plot may be obtained by tilting the axes 90 degrees. The program looks best when rotation 
and tilt are between 20 and 70 degrees. At more extreme angles, labeling problems may 
occur. You may correct some of these problems by adjusting the axis so that the number of tic 
marks labeled are fewer, and so that axes labels are shorter. 

Special Considerations 

1. For any point where either the X, Y or Z value is missing, the point is not plotted. 

2. For long axes titles and various rotation and tilt combinations, the axes titles may over- 
lap, or not be entirely plotted. 

Special Plotting Characteristics 
PLOT 

This option plots the three variables that were specified. You need to input the angle of
rotation about the z-axis (in degrees, between zero and ninety) and the angle of elevation
(also between zero and ninety degrees), which is the angle between the line drawn from the
origin of the axes and the XY plane.

Z-AXIS 

This option allows you to designate the scale for the z-axis. It works the same as the options 
for the X and Y-axis. 

Methods and Formulae 

Mapping from the third dimension to the two dimensions of the plotting device uses the 
following method. 

Given any point (X,Y,Z) we map to the point (A,B) by letting 

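$$A = \frac{X-\text{Xmin}}{\text{Xmax}-\text{Xmin}}\,\cos(\text{Rotation}) + \frac{Y-\text{Ymin}}{\text{Ymax}-\text{Ymin}}\,(-\sin(\text{Rotation}))$$

and

$$B = \frac{X-\text{Xmin}}{\text{Xmax}-\text{Xmin}}\left[(\cos^2(\text{Rotation})-1)\tan(\text{Tilt}/2)\right] + \frac{Y-\text{Ymin}}{\text{Ymax}-\text{Ymin}}\left[(\sin^2(\text{Rotation})-1)\tan(\text{Tilt}/2)\right] + \frac{Z-\text{Zmin}}{\text{Zmax}-\text{Zmin}}\,\cos(\text{Tilt})$$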

where Xmin, Xmax, Ymin, Ymax, Zmin, and Zmax are the minimum and maximum values of 
the axes. Rotation and Tilt are the angles specified for the tilt and rotation of the axes. 
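
The mapping can be tried out with a few lines of Python. The sketch below implements the two formulas as printed, taking the angles in degrees; it is an illustration, not the program's code, and the function name and argument layout are hypothetical.

```python
import math

def project(x, y, z, limits, rotation_deg, tilt_deg):
    """Map (X, Y, Z) to plotting coordinates (A, B) using the formulas above."""
    (xmin, xmax), (ymin, ymax), (zmin, zmax) = limits
    rot = math.radians(rotation_deg)
    tilt = math.radians(tilt_deg)
    u = (x - xmin) / (xmax - xmin)   # each coordinate scaled to 0..1
    v = (y - ymin) / (ymax - ymin)
    w = (z - zmin) / (zmax - zmin)
    a = u * math.cos(rot) + v * (-math.sin(rot))
    b = (u * (math.cos(rot) ** 2 - 1) * math.tan(tilt / 2)
         + v * (math.sin(rot) ** 2 - 1) * math.tan(tilt / 2)
         + w * math.cos(tilt))
    return a, b

limits = ((0, 10), (0, 10), (0, 10))
a, b = project(0, 0, 10, limits, rotation_deg=30, tilt_deg=60)
print(round(a, 6), round(b, 6))   # 0.0 0.5
```

The point at the minimum of all three axes always maps to the origin (0, 0) of the plotting device coordinates.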






Andrew's Plot 

Object of Program 

This plot takes multidimensional data and plots it on a two-dimensional plotting device in a
meaningful way. It does this by mapping the vector $X = (X_1, X_2, X_3, \ldots, X_k)$ into a
function of the form

$$F_X(t) = X_1/\sqrt{2} + X_2\sin(t) + X_3\cos(t) + X_4\sin(2t) + X_5\cos(2t) + \cdots$$

where t is between $-\pi$ and $+\pi$. For further information, see Reference 1.
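
To make the mapping concrete, here is a small Python sketch that evaluates the function for one observation and samples it at 101 points, giving 100 straight-line segments per curve. It is an illustration only, and the function names are hypothetical.

```python
import math

def andrews(x, t):
    """F_X(t) = x1/sqrt(2) + x2 sin t + x3 cos t + x4 sin 2t + x5 cos 2t + ..."""
    total = x[0] / math.sqrt(2)
    for j in range(1, len(x)):
        harmonic = (j + 1) // 2           # 1, 1, 2, 2, 3, 3, ...
        if j % 2 == 1:
            total += x[j] * math.sin(harmonic * t)
        else:
            total += x[j] * math.cos(harmonic * t)
    return total

def andrews_curve(x, t_min=-math.pi, t_max=math.pi, segments=100):
    """Points (t, F_X(t)) joined by straight lines to draw one observation."""
    step = (t_max - t_min) / segments
    return [(t_min + k * step, andrews(x, t_min + k * step))
            for k in range(segments + 1)]

curve = andrews_curve([1.0, 2.0, 3.0])
print(len(curve))   # 101
```

Because each variable is tied to a particular term, reordering the variables changes the shape of every curve.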

Special Considerations 

1. Up to twenty variables may be used for plotting. 

2. Each observation causes one line to be plotted. 

3. The order of the variables determines the outcome of the plot. 

4. Neither axis may be labeled. 

5. A rough guess is made by the program as to extremes of the functions being plotted, 
and may be modified by pressing the "YAXIS" special function key. 

6. The duration of the plot increases with the number of variables being used. 

7. Any time a missing value is encountered for any variable used, the entire observation is 
deleted. For labeling the lines, the observation number that would have been used to 
label the line is incremented and used for the next observation. 

8. Each line being plotted is broken up into 100 straight-line increments. 

Special Plotting Characteristics 

PLOT 

This option creates the Andrews Plot. You may choose whether or not you wish to have the
first twenty observations labeled. Because this plot constructs one curve for every
observation, it takes quite a while to complete the plot for large data sets.

X-AXIS 

In changing the parameters for the x-axis, the minimum value of x must be between (-PI)
and (+PI). The maximum value must be between the minimum value and (+PI).

Labels 

The only label that may be changed is the title. 

References 

1. D. F. Andrews, "Plots of High-Dimensional Data", Biometrics, 28, pp. 125-136, March, 
1972. 






Examples 

STATISTICAL GRAPHICS EXAMPLES 



************************************************
*              DATA MANIPULATION               *
************************************************

Enter DATA TYPE (Press CONTINUE for RAW DATA):

Mode number = ?

it 

Is data stored on program's scratch file (DATA)?
YES

Raw data to be input
from mass storage

EGG FUTURE CONTRACTS

Data file name: DATA
Data type is: Raw data
Number of observations: 83
Number of variables: 5

Variable names:
1. ALBUMEN
2. FROZ. ALBU
3. FROZ. EGGS
4. SHELLEGGS
5. EGG. FUTURE

Subfile name     beginning observation     number of observations
1. SUBFILE 1              1                         30
2. SUBFILE 2             31                         12
3. SUBFILE 3             43                         24
4. SUBFILE 4             67                         17

SELECT ANY KEY

Five variables and names or labels

Option number = ?
1
Enter method for listing data:
3

Press special function key labeled-LIST
All data listed

EGG FUTURE CONTRACTS

Data type is: Raw data

        Variable # 1  Variable # 2  Variable # 3  Variable # 4  Variable # 5
OBS#    (ALBUMEN   )  (FROZ. ALBU)  (FROZ. EGGS)  (SHELLEGGS )  (EGG. FUTURE)
   1         1.67000      21.20000    2103.00000        .20000      43.58000
   2         1.80000      19.60000    2025.00000        .20000      47.90000
   3         1.99000      24.80000    2834.00000        .30000      47.40000
   4         1.92000      36.60000    4697.00000        .50000      45.10000
   5         1.92000      49.80000    6842.00000       1.20000      43.00000
   6         2.12000      54.40000    7793.00000       2.10000      42.85000
   7         2.34000      53.60000    7920.00000       2.30000      42.15000
   8         2.38000      46.60000    6979.00000       2.20000      40.85000
   9         2.26000      37.30000    5740.00000       1.70000      41.75000
  10         2.08000      30.30000    4627.00000       1.10000      43.10000
  11         2.06000      23.30000    3392.00000        .80000      43.00000
  12         2.02000      17.40000    2429.00000        .30000      46.90000
  13         1.96000      10.70000    1912.00000        .10000      46.45000
  14         1.81000       9.50000    1681.00000        .30000      45.15000
  15         1.83000      15.50000    2179.00000        .30000      44.70000
  16         1.61000      25.10000    3425.00000        .30000      44.50000
  17         1.53000      38.80000    5294.00000        .60000      45.40000
  18         1.55000      50.30000    6464.00000       1.20000      42.80000
  19         1.42000      51.80000    6431.00000       1.50000      41.00000
  20         1.36000      49.60000    5955.00000       1.30000      37.00000
  21         1.25000      45.30000    5186.00000       1.00000      37.00000
  22         1.23000      39.80000    4478.00000        .70000      39.50000
  23         1.19000      33.80000    3734.00000        .60000      39.75000
  24         1.18000      27.90000    2930.00000        .50000      40.60000
  25         1.15000      26.40000    2599.00000        .30000      39.90000
  26         1.16000      23.90000    2527.00000        .30000      40.20000
  27         1.20000      24.60000    3304.00000        .50000      37.55000
  28         1.28000      33.10000    4388.00000        .90000      36.60000
  29         1.45000      42.80000    5907.00000       1.20000      36.50000
  30         1.55000      53.10000    6836.00000       1.70000      34.05000
  31         1.33000      56.50000    6769.00000       1.80000      35.70000
  32         1.20000      52.50000    6074.00000       1.50000      35.00000
  33         1.17000      46.50000    5148.00000       1.20000      34.58000
  34         1.22000      39.50000    4101.00000        .90000      41.25000
  35         1.16000      32.50000    3174.00000        .60000      43.30000
  36         1.05000      25.80000    2329.00000        .30000      43.10000
  37         1.03000      24.20000    1921.00000        .20000      41.65000
  38         1.00000      23.00000    1749.00000        .20000      41.70000
  39         1.06000      21.10000    1535.00000        .10000      42.50000
  40         1.07000      25.30000    2176.00000        .10000      43.10000
  41         1.10000      35.20000    3437.00000        .30000      41.05000
  42         1.09000      45.40000    4448.00000        .70000      39.95000
  43          .96000      47.50000    4459.00000        .90000      40.15000
  44          .91000      44.60000    4103.00000        .70000      37.65000
  45          .87000      39.70000    3423.00000        .50000      41.75000
  46          .80000      32.30000    2711.00000        .30000      37.80000
  47          .80000      26.70000    2112.00000        .20000      36.80000
  48          .84000      22.20000    1631.00000        .10000      36.00000
  49          .88000      19.20000    1249.00000        .10000      36.50000
  50          .84000      18.20000    1209.00000        .10000      36.70000
  51          .83000      19.70000    1500.00000        .10000      35.70000
  52          .83000      26.50000    2687.00000        .10000      32.70000
  53          .81000      33.20000    4024.00000        .50000      31.50000
  54          .81000      39.90000    4831.00000       1.00000      32.40000
  55          .81000      38.60000    4739.00000       1.10000      31.25000
  56          .81000      36.30000    4513.00000        .90000      28.30000
  57          .70000      33.20000    3966.00000        .70000      29.00000
  58          .74000      28.70000    3489.00000        .60000      35.35000
  59          .84000      24.40000    2732.00000        .50000      34.95000
  60          .75000      21.30000    2180.00000        .30000      36.60000
  61          .73000      22.70000    2210.00000        .20000      35.80000
  62          .67000      22.80000    2322.00000        .30000      34.10000
  63          .68000      24.60000    2243.00000        .30000      36.00000
  64          .85000      26.70000    2580.00000        .20000      37.85000
  65          .85000      38.30000    3836.00000        .30000      38.60000
  66          .88000      48.50000    5086.00000        .80000      35.70000
  67          .88000      51.00000    5241.00000       1.10000      34.95000
  68          .81000      48.10000    4748.00000       1.00000      34.65000
  69          .75000      42.90000    4022.00000        .70000      35.45000
  70          .69000      35.10000    3149.00000        .50000      38.50000
  71          .68000      28.10000    2307.00000        .30000      37.00000
  72          .71000      22.40000    1700.00000        .10000      36.35000
  73          .75000      19.10000    1456.00000        .10000      38.15000
  74          .76000      16.90000    1282.00000        .10000      38.70000
  75          .85000      17.40000    1417.00000        .20000      36.35000
  76          .84000      20.00000    1772.00000        .20000      37.00000
  77          .95000      25.80000    2578.00000        .10000      37.15000
  78          .98000      28.90000    3215.00000        .20000      37.75000
  79          .98000      29.10000    3165.00000        .40000      38.30000
  80         1.05000      27.30000    3025.00000        .30000      38.45000
  81         1.00000      22.60000    2746.00000        .30000      36.35000
  82          .90000      19.80000    2311.00000        .20000      35.00000
  83          .92000      15.60000    1853.00000        .10000      33.70000



Option number = ?

SELECT ANY KEY

Enter number of desired function:
1
Y axis variable number?
2
Enter subfile to be used (0 if subfiles ignored)

Enter number of desired function:
8
Enter option number of the graphics device?
2
Plotter identifier string (press CONT if 'HPGL')?
Enter select code, bus address (default is 7,5)?
Is the above information correct?
YES
Enter number of desired function:
1
Are the points to be connected?
YES
Are grid lines to be plotted?
NO
Beep will sound when plot is done then press CONT.
To interrupt plotting press STOP key.

Exit listing options
Press special function key labeled-ADV STATS
Remove BSDM media
Insert Statistical Graphics 1A media
Time Plot

Select plotter option
Choose external plotter
Press CONTINUE
Press CONTINUE

Plot



[Figure: time plot titled EGG FUTURE CONTRACTS; x-axis: TIME]

Enter number of desired function:
4          Change y-axis
Y plotting minimum?

Y plotting maximum?
60
Y tic ?
10
Label every Kth tic mark?
1
Number of decimal places to label the Y axis?

Enter number of desired function:
5          Change labels
Enter the Time axis title (33 characters or less)
TIME BY INCREMENTS OF 1
Enter the Y axis title (33 characters or less)
FROZEN ALBUMEN
Enter the Graph Title (33 characters or less)
FUTURE EGG CONTRACTS
Enter number of desired function:
1          Plot
Are the points to be connected?
YES
Are grid lines to be plotted?
NO



Beep will sound when plot is done then press CONT.
To interrupt plotting press STOP key.
Enter number of desired function:
10
Y axis variable number?
5
Enter subfile to be used (0 if subfiles ignored)

Enter number of desired function:
6
Put double quotes around the blank.
A
Enter number of desired function:
1
Are the points to be connected?
YES
Beep will sound when plot is done then press CONT.
To interrupt plotting press STOP key.

Overlay plot

Change plotting character

Plot

[Figure: overlaid time plot titled FUTURE EGG CONTRACTS; x-axis: TIME BY INCREMENTS OF 1]

Enter number of desired function:
11
Enter file name to store plot characteristics ?
CHARS:INTERNAL

Store plotting characteristics



Is data medium placed in device INTERNAL
?
YES
Is PROGRAM MEDIUM replaced in device
?
YES
Enter number of desired function:
13
Enter number of desired function:

Return to main graphics menu.

Select histogram example

HISTOGRAM

Variable number for creating histogram?

Variable 2 will be used
Enter subfile to be used (0 if subfiles ignored)

Number of valid cases = 83
The mean is calculated to be= 31.9313253012
The variance is calculated to be= 140.299006759









                                  OBSERVED
CELL     MINIMUM     MAXIMUM     FREQUENCY
  1        9.500      16.214          4
  2       16.214      22.929         18
  3       22.929      29.643         22
  4       29.643      36.357         10
  5       36.357      43.071         11
  6       43.071      49.786          9
  7       49.786      56.500          9



Enter number of desired function:
8          Select plotter
Enter option number of the graphics device?
Plotter identifier string (press CONT if 'HPGL')?
Enter select code, bus address (default is 7,5)
Is the above information correct?
YES
Enter number of desired function:
1          Plot
Are horizontal grid lines to be plotted?
NO
BEEP will sound when plot done then Press CONT.
To interrupt plotting, press STOP key.

Enter number of desired function
10

Overlay normal curve









                                  OBSERVED     EXPECTED    CONTRIBUTION
CELL     MINIMUM     MAXIMUM     FREQUENCY    FREQUENCY      CHI-SQUARE
  1    -Infinity      16.214          4          7.658          1.748
  2       16.214      22.929         18         10.901          4.623
  3       22.929      29.643         22         16.583          1.770
  4       29.643      36.357         10         18.448          3.869
  5       36.357      43.071         11         15.011          1.072
  6       43.071      49.786          9          8.932           .001
  7       49.786    Infinity          9          5.466          2.284

Press CONT to plot the normal curve overlay
BEEP will sound when plot done then PRESS CONT.



[Figure: histogram with normal curve overlay, titled EGG FUTURE CONTRACTS; x-axis: FROZ. ALBU]



Enter number of desired function:
13
Enter number of desired function:
3
Variable number?
2
Enter subfile to be used (0 if subfiles ignored)

SORTING THE DATA
Enter number of desired function:
3

Return to main graphics menu

Select normal probability plot

Change y-axis



Y plotting minimum?
5
Y plotting maximum?
60
Y tic ?
5
Label every Kth tic mark?
1
Number of decimal places for labeling the Y axis?

Enter number of desired function:
4
Enter the Y axis title (33 characters or less)
FROZEN ALBUMEN
Enter the Graph Title (33 characters or less)
EGG FUTURE CONTRACTS
Enter number of desired function:
7
Enter option number of the graphics device?
2
Plotter identifier string (press CONT if 'HPGL')?
Enter select code, bus address (default is 7,5)
Is the above information correct?
YES
Enter number of desired function:
5
Put double quotes around the blank?
*
Enter number of desired function:
1
Are grid lines to be plotted?
NO
Beep will sound when the plot done then press CONT
To interrupt plotting, press STOP key.

Specify y lower limit
Specify y upper limit

Label every tic mark
Change labels and titles

Select plotter

Change plotting symbol

Plot



[Figure: NORMAL PROBABILITY PLOT titled EGG FUTURE CONTRACTS; x-axis: PERCENT UNDER; y-axis: FROZEN ALBUMEN]



Enter number of desired function:
12
Enter number of desired function:
5
X axis variable number?
1
Y axis variable number?
5
Enter subfile to be used (0 if subfiles ignored)

Enter number of desired function:
8
Enter option number of the graphics device?
2
Plotter identifier string (press CONT if 'HPGL')?
Enter select code, bus address (default is 7,5)?
Is the above information correct?
YES
Enter number of desired function:
1
Are the points to be connected?
NO

Return to main graphics menu

Select scattergram

Select plotter option

Plot



Are grid lines to be plotted ?
NO
Beep will sound when plot done then press CONT
To interrupt plotting press 'STOP' key.

[Figure: scattergram titled EGG FUTURE CONTRACTS; x-axis: ALBUMEN]



Enter number of desired function
4
Y plotting minimum?
30
Y plotting maximum?
50
Y tic?

Change y-axis for another scattergram

Label every Kth tic mark?
1
Number of decimal places for labeling the Y axis?

Enter number of desired function:
3
X plotting minimum?
.6
X plotting maximum?
2.4
X tic?

Change x-axis

Label every Kth tic mark?
1
Number of decimal places for labeling the X axis?



Enter number of desired function:
6
Put double quotes around the blank?
1
Enter number of desired function:
5
Enter the X axis title (33 characters or less)
ALBUMEN
Enter the Y axis title (33 characters or less)
EGG FUTURE
Enter the Graph Title (33 characters or less)
FIRST EGG FUTURE CONTRACTS
Enter number of desired function:
1
Are the points to be connected?
NO
Are grid lines to be plotted ?
NO
Beep will sound when plot done then press CONT.
To interrupt plotting press 'STOP' key.

Change plotting symbol

Change labels

Plot



[Figure: scattergram titled FIRST EGG FUTURE CONTRACTS; x-axis: ALBUMEN; y-axis: EGG FUTURE, scaled from 30 to 50]



Enter number of desired function:
13
Enter number of desired function:

Return to main graphics menu

Select another ADV STAT pac
Remove Statistical Graphics 1A
Insert Statistical Graphics 1B

Enter number of desired function:
3
X axis variable number?
1
Y axis variable number?
3
Z axis variable number?
5
Enter subfile to be used (0 if subfiles ignored)
3
Enter number of desired function:
9
Enter option number of the graphics device
2
Plotter identifier string (press CONT if 'HPGL')?
Enter select code, bus address (defaults are 7,5)?
IS THE ABOVE INFORMATION CORRECT?
YES
Enter number of desired function:
1
Enter angle of rotation in degrees [ 0<=Angle<=90 ]
30
Enter angle of elevation in degrees [ 0<=Angle<=90 ]
30
Beep will sound when plot done then PRESS CONT.
To interrupt plotting press 'STOP' key.

Select 3-D plot

Plot only for data in subfile 3.
Select plotter

Plot

Rotate plot for easier viewing

Raise angle of elevation



[Figure: 3-D plot titled SUBFILE 3 EGG FUTURE CONTRACTS; x-axis: ALBUMEN]



Enter number of desired function:
14
Enter number of desired function:
4
Number of variables to be used?
5
Enter variable number 1
?
1
Enter variable number 2
?

Return to main graphics menu

Select Andrews Plot

Enter variable number 3
?
3
Enter variable number 4
?
4
Enter variable number 5
?
5
Is the above information correct?
YES
Enter subfile to be used (0 if subfiles ignored)
Enter number of desired function:
7
Enter option number of the graphics device?
2
Plotter identifier string (press CONT if 'HPGL')?
Enter select code, bus address (default is 7,5)?
Is the above information correct?
YES

Plot only data in subfile 2
Select plotter



Enter number of desired function:
1
Are up to the first twenty lines to be labelled?
YES
Beep will sound when plot done then PRESS CONT.
To interrupt the plot press the STOP key

Plot

[Figure: ANDREWS PLOT titled SUBFILE 2 EGG FUTURE CONTRACTS; x-axis from -PI to PI; y-axis from -5000 to 5000]



Enter number of desired function
12

Return to main graphics menu

Enter number of desired function:
1
X axis (LOG AXIS) variable number?

Select semi-log plot

Y axis variable number?
4
Enter subfile to be used (0 if subfiles ignored)

Enter number of desired function:
3          Change y-axis
Y plotting minimum?

Y plotting maximum?
2.4
Y tic?
.4
Label every Kth tic mark?
1
Number of decimal places for labeling the Y axis?
1
Enter number of desired function:
4          Change labels



Enter the X axis title (33 characters or less)
FROZEN ALBUMEN
Enter the Y axis title (33 characters or less)
SHELL EGGS
Enter the Graph Title (33 characters or less)
SEMI-LOG PLOT EGG FUTURE DATA
Enter number of desired function:
7
Enter option number of the graphics device?
2
Plotter identifier string (press CONT if 'HPGL')?
Enter select code, bus address (default is 7,5)?
Is the above information correct?
YES
Enter number of desired function:
1
Are grid lines to be plotted?
NO
Beep will sound when plot is done then press CONT
To interrupt plotting, press 'STOP' key

Select plotter

Plot

[Figure: semi-log plot titled SEMI-LOG PLOT EGG FUTURE DATA; x-axis (log scale): FROZEN ALBUMEN; y-axis: SHELL EGGS, scaled from 0.0 to 2.4]

Enter number of desired function
12

Return to main graphics menu

Enter number of desired function:
2
X axis variable number?

Select log-log plot

Y axis variable number?
4
Enter subfile to be used (0 if subfiles ignored)?

Enter number of desired function:
t,         Select plotter
Enter option number of the graphics device?
2
Plotter identifier string (press CONT if 'HPGL')?
Enter select code, bus address (default is 7,5)?
Is the above information correct?
YES
Enter number of desired function:
1          Plot
Are grid lines to be plotted?
NO
Beep will sound when plot done then press CONT.
To interrupt plotting, press 'STOP' key.

[Figure: log-log plot titled EGG FUTURE CONTRACTS; x-axis: FROZ. ALBU]



Enter number of desired function:
* *        Return to main graphics menu
Enter number of desired function:
6          Return to statistical graphics 1A

Enter number of desired function:
4          Select Weibull Plot
Variable number?
2
Enter subfile to be used (0 if subfiles ignored)

SORTING THE DATA
Enter number of desired function:
£>         Select plotter
Enter option number of the graphics device?
2
Plotter identifier string (press CONT if 'HPGL')?
Enter select code, bus address (default is 7,5)?
Is the above information correct?
YES
Enter number of desired function:
1          Plot
Are grid lines to be plotted?
NO
Beep will sound when plot done then press CONT.
To interrupt plotting, press 'STOP' key.



[Figure: Weibull plot titled EGG FUTURE CONTRACTS; x-axis: FROZ. ALBU; y-axis scaled from .1 to 99.9]



Enter number of desired function:
11
Enter number of desired function
6

Return to main graphics menu

Return to Basic Statistics and Data
Manipulation (BSDM)



General Statistics 



General Information 

Description 

The General Statistics module includes 5 major parts: 

1. One Sample Tests allow you to run a series of tests and plots on one-variable prob- 
lems. You can test whether the observations are mutually independent, whether the 
mean of the data is significantly different from a hypothesized mean, compare your data 
with normal, exponential, or uniform distributions, and test the randomness of your 
data. 

2. Paired-Sample Tests allow you to compare the means of two samples, test if the 
paired samples are similar, fit the data with a regression equation, test whether the two 
populations have the same median and test the independence of two random variables. 

3. Two-Independent-Sample Tests allow you to test whether the means of two samples 
are equal, whether the medians of two samples are equal, and whether the two popula- 
tions have the same distribution. 

4. Multiple-Sample (≥3 Samples) Tests allow you to test whether the means of several
populations are equal, and whether there are significant differences between pairs of
means.

5. Statistical Distributions allow you to study a series of continuous and discrete statistic- 
al distributions. Both tabled values and right-tailed probabilities are available for the 
continuous distributions. The discrete distributions calculate right-tail probabilities, sing- 
le term probabilities and an approximate value for a specified right-tailed probability. 
This program will also calculate n factorial, the complete gamma function, the complete 
beta function and binomial coefficients. 

Methods and Formulae, References, etc., for each of these five parts are found in each of the 
following sections. 

Special Considerations 

If you specify one type of test (for example, Paired-Sample Tests), you will not be able to 
perform a different type of test (say, Multiple-Sample Tests), without returning to the Start-up 
procedure for the new test. You must access the Start-up procedure to define the segment of 
the data matrix which is to be tested. 






One Sample Tests 

Object of Programs 

This section allows you to run a series of tests and plots on one variable (or one subfile of one 
variable) from the data matrix defined by the Basic Statistics and Data Manipulation program. 
Each test will automatically sort or restore the data to its original form as needed. You can 
perform several kinds of tests on your data: 

Serial Correlation — tests if the observations are mutually independent. 

t-Test — tests if the mean of the data is significantly different from a hypothesized mean 
which you specify. 

Kolmogorov-Smirnov Goodness-of-fit test or Chi-Square Goodness-of-fit — test if your 
data follow a normal, exponential or uniform distribution. 

Runs Test — tests the randomness of your data. 

Shapiro-Wilk Test — tests for normality. 

The above tests will be described in Methods and Formulae. 

Typical Program Flow 



Input data via BSDM 



Select Advanced Statistics option 



Insert program medium 



Select "One sample test" 



Specify variable and subfile 



Select certain test (7 options) 



Execute your choice 






Data Structure 

Since we have only one variable, the data is entered as in the following example, which 
shows a sample of size 12: 

Variable #1

  I    OBS(I)   OBS(I+1)   OBS(I+2)   OBS(I+3)   OBS(I+4)
  1      2         5          8          7          3
  6      6         4          5          9          7
 11      3         4



Alternatively, you may input a data set containing several variables, then specify a single 
variable for the analysis. Several variables may be analyzed in succession. 



Methods and Formulae 

Basic Statistics 

For the calculation of the sample mean, variance, standard deviation, standard error of the 
mean, coefficient of variation, skewness, kurtosis, and confidence intervals on the mean and 
variance, please refer to Snedecor and Cochran's Statistical Methods. 

Kolmogorov-Smirnov Goodness-of-Fit Test 

• Assumptions 

1. The sample is a random sample. 

2. If the hypothesized distribution function G(X), in H0 below, is continuous, the test is
exact. Otherwise, the test is conservative.

• Hypotheses

Let G(X) be a completely specified, hypothesized distribution function. F(X) is the distribution
function for the random variable X.

1. Two-Sided Test

H0: F(X) = G(X) for all X.

H1: F(X) ≠ G(X) for at least one value of X.

2. One-Sided Test

H0: F(X) ≥ G(X) for all X.

H1: F(X) < G(X) for at least one value of X.

3. One-Sided Test

H0: F(X) ≤ G(X) for all X.

H1: F(X) > G(X) for at least one value of X.






• Test Statistics 

Let S(X) be the empirical distribution function based on the random sample X1, X2, ... , Xn.

1. Two-Sided Test

Let the test statistic T be the greatest (denoted by "sup" for supremum) vertical distance
between S(X) and G(X).

T = sup |G(X) - S(X)|

2. One-Sided Test

T1 = sup [G(X) - S(X)]

3. One-Sided Test

T2 = sup [S(X) - G(X)]

• Decision Rule

Reject H0 at the level of significance α if the appropriate test statistic, T, T1, or T2, exceeds the
1 - α quantile W(1 - α) from the Table of Quantiles of the Kolmogorov Test Statistic.
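The three Kolmogorov-Smirnov statistics can be computed as in the sketch below. It is an illustration under stated assumptions, not the program's own routine: the names `ks_statistics` and `normal_cdf` are hypothetical, and the hypothesized distribution G is assumed to be continuous so that each supremum is found at, or just to the left of, a data point.

```python
import math

def ks_statistics(sample, cdf):
    """Return (T, T1, T2) for a sample and a continuous hypothesized CDF G."""
    xs = sorted(sample)
    n = len(xs)
    t1 = 0.0  # sup [G(X) - S(X)], approached just to the left of a data point
    t2 = 0.0  # sup [S(X) - G(X)], reached at a data point
    for i, x in enumerate(xs, start=1):
        g = cdf(x)
        t1 = max(t1, g - (i - 1) / n)   # S(X) equals (i-1)/n just below x
        t2 = max(t2, i / n - g)         # S(X) equals i/n at x
    return max(t1, t2), t1, t2

def normal_cdf(x, mean=0.0, sd=1.0):
    """Normal distribution function, usable as the hypothesized G(X)."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))
```

The returned values still have to be compared with the tabled quantile W(1 - α); the table itself is not reproduced here.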

Chi-square Goodness-of-Fit Test 

• Assumptions 

1. The sample is a random sample. 

2. The measurement scale is at least nominal. 

• Hypothesis 

Let F(X) be the true but unknown distribution function and let G(X) be a completely specified, 
hypothesized distribution function. 

H0: F(X) = G(X) for all X.

H1: F(X) ≠ G(X) for at least one X.

• Test Statistic

Suppose the data is divided into c classes, and the number of observations falling in each
class is denoted by Oj, for j = 1, 2, ... , c. Let Pj be the probability of a random observation
being in class j under the assumption that G(X) is the distribution function of X. Then define
Ej as Ej = Pj*n, where n is the sample size. The test statistic is then:

$$T = \sum_{j=1}^{c} \frac{(O_j - E_j)^2}{E_j}$$

• Decision Rule

The exact distribution of T is difficult to use, so the large sample approximation is used. The
approximate distribution of T is the Chi-square distribution with (c-1) degrees of freedom.
Therefore, the critical region of approximate size α corresponds to values of T greater than
χ²(1-α), the (1-α) quantile of a χ² random variable with (c-1) degrees of freedom. Reject
H0 if T exceeds χ²(1-α); otherwise, accept H0.
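As a worked illustration, the statistic can be computed from observed and expected cell frequencies. The sketch below is not the program's code; the function name `chi_square_statistic` is hypothetical, and the frequencies are the seven cells printed for the frozen-albumen histogram in the Statistical Graphics examples.

```python
def chi_square_statistic(observed, expected):
    """Return T = sum over classes of (Oj - Ej)^2 / Ej."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Cell frequencies from the histogram example (normal curve overlay).
observed = [4, 18, 22, 10, 11, 9, 9]
expected = [7.658, 10.901, 16.583, 18.448, 15.011, 8.932, 5.466]

t = chi_square_statistic(observed, expected)
degrees_of_freedom = len(observed) - 1   # (c - 1), as in the decision rule
```

Here T comes to about 15.37, the sum of the printed cell contributions. For comparison, the 0.95 quantile of a χ² variable with 6 degrees of freedom is about 12.59.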

t-Test 

Let X1, ... , Xn be a random sample from a population with mean μ, where M is the sample
mean and S is the sample standard deviation.

• Hypotheses

1. Two-Sided
H0: μ = a, the hypothesized value for the population mean.
H1: μ ≠ a

2. One-Sided
H0: μ = a
H1: μ < a

3. One-Sided
H0: μ = a
H1: μ > a

• Test Statistic

$$t = \sqrt{n}\,(M - a)/S$$

• Decision Rule

The statistic t has a t-distribution with (n - 1) degrees of freedom. T(1 - α, n - 1) is the (1 - α)
quantile of the t-distribution with (n - 1) degrees of freedom.

1. Two-Sided: if |t| ≤ T(1 - α/2, n - 1), accept H0; otherwise, reject H0.

2. One-Sided: if t ≥ T(α, n - 1), accept H0; otherwise, reject H0.

3. One-Sided: if t ≤ T(1 - α, n - 1), accept H0; otherwise, reject H0.

In this program the corresponding one- or two-tailed probability of the computed t-value will
be printed.
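The test statistic itself is simple to compute, as the following sketch shows. The function name `one_sample_t` and the sample values are hypothetical; the sample standard deviation is assumed to use the (n - 1) divisor, and the tail probability that the program prints is not reproduced.

```python
import math

def one_sample_t(sample, a):
    """Return (t, degrees of freedom) for H0: population mean = a."""
    n = len(sample)
    m = sum(sample) / n                                       # sample mean M
    s = math.sqrt(sum((x - m) ** 2 for x in sample) / (n - 1))  # sample S
    return math.sqrt(n) * (m - a) / s, n - 1

# Hypothetical data: five observations tested against a = 2.
t_value, dof = one_sample_t([1, 2, 3, 4, 5], 2)
```

The value of `t_value` is then compared with the appropriate quantile of the t-distribution with `dof` degrees of freedom.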

Runs Test 

Any sequence of like observations bounded by observations of a different type is called a run. 
The number of observations in the run is called the length of the run. 

Suppose a coin is tossed twenty times and the resulting heads (H) or tails (T) are recorded in 
the order in which they occur: 

T HHHHHH T H T H TT HHH T H T H 

Each segment is called a run. The total number of runs in the example is 12. 

The total number of runs may be used as a measure of the randomness of the sequence; too 
many runs may indicate that each observation tends to follow and be followed by an observa- 
tion of the other type, while too few runs might indicate a tendency for like observations to 
follow like observations. In either case the sequence would indicate that the process generat- 
ing the sequence was not random. 
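Counting runs is easy to sketch in Python. The function names below are hypothetical. Splitting numeric data about the median follows the description under Test Statistic, but dropping observations equal to the median is an assumption of this sketch, since the manual does not say how such values are handled.

```python
from itertools import groupby

def count_runs(sequence):
    """Count maximal blocks of consecutive equal items."""
    return sum(1 for _ in groupby(sequence))

def runs_about_median(values):
    """Count runs above and below the median of numeric data."""
    ordered = sorted(values)
    n = len(ordered)
    median = (ordered[(n - 1) // 2] + ordered[n // 2]) / 2
    # Assumption: observations equal to the median are left out.
    above = [v > median for v in values if v != median]
    return count_runs(above)

# The coin-toss sequence from the text: 20 tosses, 12 runs.
tosses = "THHHHHHTHTHTTHHHTHTH"
total_runs = count_runs(tosses)
```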

• Hypothesis 

H0: The process which generates the sequence is a random process.

H1: The random variables in the sequence are either dependent on other random variables
in the sequence or are distributed differently from one another.






• Test Statistic 

In this program we use the median as an indicator of two types of observations, i.e., a value
below the median is one kind, a value above the median is another kind. Count the runs
below and above the median, say D. Then

$$W_p = N + 1 + Z_p \left[\frac{N^2}{2N-1}\right]^{0.5}$$

where $Z_p$ is the pth quantile of a standard normal random variable.

• Decision Rule

Reject H0 at the level α if $D > W_{1-\alpha/2}$ or $D < W_{\alpha/2}$; otherwise accept H0.

Serial Correlation 

This routine checks for randomness in the sample. 

• Formula 

Serial correlation with lag k:

$$\frac{\sum (X_i - \bar{X})(X_{i+k} - \bar{X})}{\sum X_i^2 - N\bar{X}^2}$$

A small correlation suggests that the observations are mutually independent.
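The lag-k formula can be sketched as follows. The function name `serial_correlation` is hypothetical, and the summation limits are an assumption: the numerator is taken over the N - k pairs that exist, and the denominator over all N observations.

```python
def serial_correlation(x, k=1):
    """Lag-k serial correlation of the sequence x."""
    n = len(x)
    if not 0 < k < n:
        raise ValueError("lag k must satisfy 0 < k < len(x)")
    mean = sum(x) / n
    # Numerator: products of deviations k observations apart.
    numerator = sum((x[i] - mean) * (x[i + k] - mean) for i in range(n - k))
    # Denominator: sum of squares minus N times the squared mean.
    denominator = sum(v * v for v in x) - n * mean * mean
    return numerator / denominator
```

A steadily increasing series such as 1, 2, 3, 4, 5 gives a positive lag-1 value, while a strictly alternating series gives a negative one.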

Shapiro-Wilk Test 

This routine performs a test for normality for a sample of size 3 to 50, inclusive. 



Note 

A tie means two or more observations have the same value. Ties 
must be given a special treatment when we try to give every single 
observation a rank. 

If the sample size is less than 3 or greater than 50, a message will be printed stating that this 
program will not work and to try a chi-square goodness of fit test for N>50. Then you will 
have a chance to choose the test you want again. 

• Hypothesis 

The data comes from a normal distribution. 

• Test Statistic 

A test statistic W is printed, followed by the tabled values of Wα (% POINTS) for alpha = .01,
.02, .05, .1, and .5.

• Decision Rule 

The observed test statistic W indicates that the sample did not come from a normal distribu- 
tion at the corresponding alpha level of significance if the value of W is less than the corres- 
ponding percentage point. Hence, small values of W are significant. 






References 

1. Abramowitz, Milton and Stegun, Irene A (1970) Handbook of Mathematical Functions 
with Formulas, Graphs, and Mathematical Tables. U.S. Government Printing Office, 
Washington D.C., p. 949. 

2. Box, G.E.P. and Cox, D.R. (1964). "An Analysis of Transformations". Journal of the 
Royal Statistical Society 26:2, pp. 211-252. 

3. Conover, W.J. (1971). Practical Nonparametric Statistics. John Wiley & Sons, Inc., 
New York, p. 414. 

4. Conte, S.D. (1965). Elementary Numerical Analysis. McGraw-Hill Book Company, 
New York, p. 135. 

5. Dickinson Gibbons, Jean (1971). Nonparametric Statistical Inference. McGraw-Hill 
Book Company, New York, pp. 75-83. 

6. Hahn, G. and Shapiro, S.S., (1967). Statistical Models in Engineering, John Wiley & 
Sons, Inc., New York, pp. 330-332. 

7. Kopitzke, Robert W., Unpublished Notes, 1973. 

8. Mood, Graybill, Boes (1974). Introduction to the Theory of Statistics, 3rd Edition, 
McGraw-Hill Book Company, New York. Chapter 7. 

9. Shapiro, S.S. and Wilk, M.B. (1965). "An Analysis of Variance Test for Normality". 
Biometrika; 52, 3 and 4, p. 591. 

10. Snedecor, George W. and Cochran, William G. (1967). Statistical Methods. The Iowa 
State University Press, Ames, Iowa. 

11. Ullman, Neil R., (1972). Statistics: An Applied Approach, Xerox College Publishing, 
Lexington, Mass. pp. 354-357. 






Paired-Sample Tests 

Description 

This program allows you to perform the following paired-sample tests: 
Paired t-test — compare the means of two samples. 

Cross Correlation — test if the paired samples are similar. 

Family Regression — fit the data with one of several regression equations. 

Sign Test or Wilcoxon Signed Rank Test — test whether two populations have the same 
median. 

Spearman's Rho or Kendall's Tau — test the independence of two random variables. 

Typical Program Flow 



1. Input data via BSDM

2. Select Advanced Statistics option

3. Insert program medium

4. Select "paired sample test"

5. Specify variables and subfiles

6. Select a certain test (9 options)

7. Execute your choice



Data Structure 

For paired-sample tests, two variables or the same subfile of two variables must be used. 



The data are entered as in the following example: 



Obs. #    Variable #1    Variable #2

1         54             46
2         44             42
3         46             44






Methods and Formulae 

Paired t-Test 

This is a one-sample t-test performed on the differences between paired samples. See the 
Methods and Formulae section in the One-Sample Tests chapter for further details. 

Cross Correlation

Provides a correlation between paired samples with a lag between them. Large values show 
the paired samples are quite similar, i.e., no significant difference. The cross correlation with 
lag k between the two samples X1,X2,...,XN and Y1,Y2,...YN is: 



$$\frac{\sum_{i=1}^{N-k} (X_i - \bar{X})(Y_{i+k} - \bar{Y})}{\left[\sum_{i=1}^{N} (X_i - \bar{X})^2 \sum_{i=1}^{N} (Y_i - \bar{Y})^2\right]^{.5}}$$
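The cross correlation with lag k can be sketched as below. This is an illustration of the formula only, with a hypothetical function name, and is not the library's own routine.

```python
import math


def cross_correlation(x, y, k):
    """Cross correlation with lag k between equal-length samples x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Numerator: products of deviations for the N - k overlapping pairs.
    numerator = sum((x[i] - mean_x) * (y[i + k] - mean_y) for i in range(n - k))
    # Denominator: square root of the product of the two sums of squares.
    denominator = math.sqrt(
        sum((v - mean_x) ** 2 for v in x) * sum((v - mean_y) ** 2 for v in y)
    )
    return numerator / denominator


print(cross_correlation([1, 2, 3, 4], [2, 4, 6, 8], 0))
```

With lag 0 the value reduces to the ordinary product-moment correlation.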



Family Regression 

Provides four different regression models. All of the models are solved (except quadratic) by 
"linearizing" the model to the form: 

f(Y) = "b" + "a"g(X) 

and solving by ordinary linear least squares. The AOV table which is printed out for each
model is in units of the transformed Y's. R², the squared multiple correlation coefficient, is
expressed in units of the transformed Y's. The following models are provided:

Linear: Y = aX + b
Quadratic: Y = aX² + bX + c
Exponential: Y = a exp(bX)
Power: Y = aX^b
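The linearizing step can be illustrated as follows. The sketch assumes natural-log transforms (ln Y against X for the exponential model, ln Y against ln X for the power model); the manual does not state which transforms the program uses, and all function names here are hypothetical.

```python
import math


def linear_least_squares(g, f):
    """Fit f = b + a*g by ordinary least squares; return (a, b)."""
    n = len(g)
    mean_g = sum(g) / n
    mean_f = sum(f) / n
    sxy = sum((gi - mean_g) * (fi - mean_f) for gi, fi in zip(g, f))
    sxx = sum((gi - mean_g) ** 2 for gi in g)
    a = sxy / sxx
    return a, mean_f - a * mean_g


def fit_exponential(x, y):
    """Fit Y = a*exp(b*X) via ln Y = ln a + b*X (assumes every Y > 0)."""
    slope, intercept = linear_least_squares(x, [math.log(v) for v in y])
    return math.exp(intercept), slope


def fit_power(x, y):
    """Fit Y = a*X^b via ln Y = ln a + b*ln X (assumes every X, Y > 0)."""
    slope, intercept = linear_least_squares(
        [math.log(v) for v in x], [math.log(v) for v in y]
    )
    return math.exp(intercept), slope


xs = [1.0, 2.0, 3.0, 4.0]
print(fit_exponential(xs, [2.0 * math.exp(0.5 * v) for v in xs]))
print(fit_power(xs, [3.0 * v ** 2 for v in xs]))
```

Because the fit is done on the transformed values, residuals and R² computed this way are in transformed units, which matches the remark about the AOV table.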

Sign Test 

• Object 

The sign test is designed for testing whether two populations have the same medians. 

• Data 

The data consist of observations on a bivariate random sample (X1, Y1), ..., (Xn, Yn).
Within each pair, (Xi, Yi), a comparison is made and the pair is a "+" if Xi > Yi, and a "-" if
Xi < Yi. If Xi = Yi, the pair is excluded from the analysis.

• Hypotheses 

1. H0: P(Xi < Yi) = P(Xi > Yi) for all i

H1: Either P(Xi > Yi) < P(Xi < Yi) for all i or
P(Xi > Yi) > P(Xi < Yi) for all i

2. H0: P(Xi > Yi) ≤ P(Xi < Yi) for all i
H1: P(Xi > Yi) > P(Xi < Yi) for all i

3. H0: P(Xi > Yi) ≥ P(Xi < Yi) for all i
H1: P(Xi > Yi) < P(Xi < Yi) for all i






• Test Statistic 

T = total number of pluses ( + ). 

• Decision Rule 

In this program a standardized T value Zt is printed so you can compare it to the cumulative 
distribution for a standardized normal random variable, Z. 

1. Reject H0 if 1 - P[-|Zt| < Z < |Zt|] < α
Accept H0 if 1 - P[-|Zt| < Z < |Zt|] > α

2. Reject H0 if 1 - P[Z ≤ Zt] < α
Accept H0 if 1 - P[Z ≤ Zt] > α

3. Reject H0 if 1 - P[Z ≤ Zt] > 1 - α
Accept H0 if 1 - P[Z ≤ Zt] < 1 - α
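A sketch of the sign test computation is given below. The manual does not show how T is standardized; the code assumes the usual normal approximation to a binomial with n non-tied pairs and success probability 1/2 (mean n/2, variance n/4), with no continuity correction. The function name is hypothetical.

```python
import math


def sign_test(x, y):
    """Return (T, Zt, upper_tail) for paired samples x and y.

    T is the number of pluses; ties (Xi == Yi) are excluded.
    Assumption: Zt = (T - n/2) / sqrt(n/4), n = number of non-tied pairs.
    upper_tail is 1 - P[Z <= Zt] for a standard normal Z.
    """
    plus = sum(1 for a, b in zip(x, y) if a > b)
    minus = sum(1 for a, b in zip(x, y) if a < b)
    n = plus + minus
    zt = (plus - n / 2) / math.sqrt(n / 4)
    upper_tail = 0.5 * math.erfc(zt / math.sqrt(2))
    return plus, zt, upper_tail


x = [3, 5, 4, 6, 7, 8, 2, 9, 5, 6, 4]
y = [1, 4, 2, 5, 6, 9, 3, 8, 4, 5, 4]
print(sign_test(x, y))
```

In this made-up data set eight pairs are pluses, two are minuses, and one tie is dropped.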

Wilcoxon Signed Ranks Test 

• Object 

This test is designed to test whether a particular sample came from a population with a speci- 
fied median. It may also be used for paired samples to see if two samples have the same 
median. 

• Data 

The data consist of N observations (X1, Y1), (X2, Y2), ..., (XN, YN). The absolute differences
|Di| = | Xi - Yi |, for i = 1, ..., N are computed for each pair. Ranks from 1 to N are assigned 
to these N pairs according to the relative size of the absolute differences. Pairs for which Xi = 
Yi are excluded from the analysis. 

• Hypotheses 

1. H0: E(X) = E(Y)
H1: E(X) > E(Y)

2. H0: E(X) = E(Y)
H1: E(X) < E(Y)

3. H0: E(X) = E(Y)
H1: E(X) ≠ E(Y)

• Test Statistic 

Define Ri = 0 if Yi > Xi (Di is negative)
Ri = the rank assigned to (Xi, Yi) if Xi > Yi

Then the test statistic T = Σ Ri, for i = 1, ..., N.

• Decision Rule 

Look up the Quantiles, W(*) of the Wilcoxon signed ranks test statistic in the table included in 
this manual. 

1. Reject H0 if T > W(1 - α)
Accept H0 if T ≤ W(1 - α)

2. Reject H0 if T < W(α)
Accept H0 if T ≥ W(α)

3. Reject H0 if T > W(1 - α/2) or T < W(α/2)
Accept H0 if W(α/2) ≤ T ≤ W(1 - α/2)

Higher Power Signed Rank 

Ranks the N differences, Xi-Yi, from smallest to greatest. T, the test statistic, is given by the 
sum of the ranks of the positive differences raised to the specified power (2,3,4, or 5). Note 
that if the power specified were 1, this test is the Wilcoxon Signed Rank test, and if the power 
were 0, this test is the Sign test. 

Using higher powers of the ranks can lead to a more powerful test when it is desired to weight 
larger values more heavily. This would be true in highly skewed distributions. 
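The relationship between the sign test, the Wilcoxon signed ranks test, and the higher power variants can be seen in one sketch. It assumes that ranks are assigned to the absolute differences, that tied absolute differences share their average rank, and that each rank is raised to the power before summing; none of these details is spelled out above, and the function names are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for t in range(i, j + 1):
            ranks[order[t]] = average
        i = j + 1
    return ranks


def signed_rank_statistic(x, y, power=1):
    """Sum of rank**power over pairs with Xi > Yi; pairs with Xi == Yi dropped."""
    diffs = [a - b for a, b in zip(x, y) if a != b]
    ranks = average_ranks([abs(d) for d in diffs])
    return sum(r ** power for d, r in zip(diffs, ranks) if d > 0)


x = [10, 12, 9, 15]
y = [8, 13, 9, 10]
print(signed_rank_statistic(x, y, power=1))  # Wilcoxon signed ranks T
print(signed_rank_statistic(x, y, power=0))  # number of pluses (sign test)
print(signed_rank_statistic(x, y, power=2))  # squared-rank version
```

With power 0 every positive difference contributes 1, so the statistic is the count of pluses, as the text notes.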

Spearman's Rho 

• Object 

This routine will test the independence of two random variables. 

• Data 

The data consist of a bivariate random sample of size N, (X1, Y1), ..., (XN, YN). Let R(Xi) be
the rank of Xi as compared with the other X values, for i = 1, 2, ..., N. That is, R(Xi) = 1 if Xi
is the smallest of X1, X2, ..., XN; R(Xi) = 2 if Xi is the second smallest, etc. Similarly, let R(Yi)
equal 1, 2, ..., N depending on the relative magnitude of Yi.

• Measure of Correlation 

d = Σ [R(Xi) - R(Yi)]² for i = 1, 2, ..., N
R = 1 - [6d/N(N² - 1)]
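A short sketch of the measure follows. It assumes there are no tied values, since this section does not describe tie handling; the function names are illustrative.

```python
def simple_ranks(values):
    """Ranks 1..n by increasing value (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return ranks


def spearman_rho(x, y):
    """R = 1 - 6d / (N(N^2 - 1)), d = sum of squared rank differences."""
    n = len(x)
    d = sum((rx - ry) ** 2 for rx, ry in zip(simple_ranks(x), simple_ranks(y)))
    return 1 - 6 * d / (n * (n * n - 1))


print(spearman_rho([1, 2, 3, 4, 5], [2, 1, 4, 3, 5]))
```

Identical orderings give R = 1 and exactly reversed orderings give R = -1.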

• Hypothesis Testing 

The Spearman rank correlation coefficient is used as a test statistic to test for independence 
between two random variables. 

1. Two-Tailed Test 

H0: The Xi and Yi are mutually independent.
H1: Either

a) there is a tendency for the larger values of X to be paired with the larger values of
Y, or

b) there is a tendency for the smaller values of X to be paired with the larger values of
Y.

2. One-Tailed Test For Positive Correlation

H0: The Xi and Yi are mutually independent.

H1: There is a tendency for the larger values of X and Y to be paired together.

3. One-Tailed Test For Negative Correlation
H0: The Xi and Yi are mutually independent.

H1: There is a tendency for the smaller values of X to be paired with the larger values
of Y, and vice versa.






• Decision Rule 

From the table of quantiles of the Spearman test statistic in this manual, we can find the 
quantile value. 

1. Two-tailed test: Reject H0 if R exceeds the (1 - α/2) quantile or if R is less than the α/2
quantile.

2. One-tailed test for positive correlation: Reject H0 if R exceeds the 1 - α quantile.

3. One-tailed test for negative correlation: Reject H0 if R is less than the α quantile.

Kendall's Tau 

• Object 

This routine allows you to test the independence of two random variables. 

• Data 

The data consist of a bivariate random sample of size N, (Xi, Yi) for i = 1, 2, ..., N. Two
observations, for example (1.3, 2.2) and (1.6, 2.7), are called concordant if both members of one
observation are larger than the respective members of the other observation. Pc denotes the
number of concordant pairs of observations. A pair of observations like (1.3, 2.2) and (1.6,
1.1) is called discordant if the two numbers in one observation differ in opposite directions
(one negative and one positive) from the respective members in the other observation. Let Pd
denote the number of discordant pairs of observations. If Xi = Xj or Yi = Yj (i ≠ j), the pair
is disregarded.

• Measure of Correlation 
τ = (Pc - Pd)/[N(N - 1)/2]
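Counting concordant and discordant pairs is easy to sketch directly from the definitions. The function name is hypothetical, and the loop over all pairs is chosen for clarity rather than speed.

```python
from itertools import combinations


def kendall_tau(x, y):
    """tau = (Pc - Pd) / (N(N-1)/2); pairs tied in X or Y are disregarded."""
    n = len(x)
    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        product = (x[i] - x[j]) * (y[i] - y[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
        # product == 0 means Xi = Xj or Yi = Yj: counted in neither total
    return (concordant - discordant) / (n * (n - 1) / 2)


print(kendall_tau([1.3, 1.6], [2.2, 2.7]))   # one concordant pair
print(kendall_tau([1.3, 1.6], [2.2, 1.1]))   # one discordant pair
```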

• Hypotheses 

Same as in Spearman's Rho.

• Decision Rule 

From the table of quantiles of the Kendall rank correlation coefficient in this manual, we can 
find the quantile value. Q. 

1. Two-tailed test: Reject H0 if Q exceeds the (1 - α/2) quantile or if Q is less than the α/2
quantile.

2. One-tailed test for positive correlation: Reject H0 if Q exceeds the 1 - α quantile.

3. One-tailed test for negative correlation: Reject H0 if Q is less than the α quantile.






Two Independent Sample Tests 

Object of Program 

The following routines are provided: 

Two-sample t-test — tests whether the means of two samples are equal. 

Median test — tests whether the medians of two samples are equal. 

Mann-Whitney, Taha's Squared R, Cramer-von Mises, and Kolmogorov-Smirnov tests: all test
whether the two populations have the same distribution.

Typical Program Flow 



1. Input data via BSDM

2. Select Advanced Statistics option

3. Insert program medium

4. Choose "two independent tests"

5. Specify variables and subfiles

6. Choose the test you desire (7 options)

7. Execute the test you choose



Data Structure 

For all of the two-independent-sample tests, data must be entered into one variable in the data 
base created by Basic Statistics and Data Manipulation. Then, the Subfile routine of BSDM 
must be used to create two subfiles. Each subfile corresponds to one sample. For example, 
suppose you have one sample of size six and another sample of size eight. Suppose the data is: 



Sample 1: 2, 3, 4, 2, 3, 6 
Sample 2: 4, 5, 4, 2, 2, 6, 3, 7. 

The data should be entered via BSDM as one variable with 14 observations. Then, the Subfile
routine would be used to specify two subfiles, the first with six observations, and the second 
with eight observations. 






Methods and Formulae 

Two-Sample t Test 

• Object 

The two-sample t-test is used to test whether the means of two samples drawn from normal 
populations having the same variance are equal. 

• Data 

Let X1, ..., Xn be a random sample from the first population and Y1, ..., Ym be a random
sample from the second. Let M(X) and M(Y) be the respective sample means and let S(X) and
S(Y) be the sample variances.

• Hypotheses 

Let μ(X) and μ(Y) be the two population means.

1. Two-Sided Test
H0: μ(X) = μ(Y)
H1: μ(X) ≠ μ(Y)

2. One-Sided Test
H0: μ(X) = μ(Y)
H1: μ(X) < μ(Y)

3. One-Sided Test
H0: μ(X) = μ(Y)
H1: μ(X) > μ(Y)

• Test Statistic 

$$t = \frac{M(X) - M(Y)}{\left[\left(\frac{1}{n} + \frac{1}{m}\right)\left(\sum X_i^2 - nM(X)^2 + \sum Y_i^2 - mM(Y)^2\right) / (n + m - 2)\right]^{1/2}}$$

• Decision Rule 

1. Two-Sided Test

Reject H0 if P[-|t| < T < |t|] > 1 - α

2. One-Sided Test (H1: μ(X) < μ(Y))

Reject H0 if P[T < t] < α

3. One-Sided Test (H1: μ(X) > μ(Y))

Reject H0 if P[T < t] > 1 - α
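The pooled-variance t statistic can be sketched as below, using the two samples from the Data Structure example. The function name is hypothetical; looking up P[T < t] still requires a t table or a distribution routine.

```python
import math


def two_sample_t(x, y):
    """Pooled-variance two-sample t statistic with n + m - 2 degrees of freedom."""
    n, m = len(x), len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / m
    # Sum of the two within-sample corrected sums of squares.
    pooled_ss = (sum(v * v for v in x) - n * mean_x ** 2) + (
        sum(v * v for v in y) - m * mean_y ** 2
    )
    return (mean_x - mean_y) / math.sqrt((1 / n + 1 / m) * pooled_ss / (n + m - 2))


sample_1 = [2, 3, 4, 2, 3, 6]
sample_2 = [4, 5, 4, 2, 2, 6, 3, 7]
print(two_sample_t(sample_1, sample_2))
```

The statistic is negative here because the first sample mean is the smaller of the two.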

Median Test 

• Object 

The median test is designed to determine whether two samples came from populations having 
the same median. 






• Data 

From each of two populations a random sample of size Ni is obtained. Let N = N1 + N2. We
obtain the sample median of the combined samples which is called the grand median. Let O1i
be the number of observations in the ith sample that exceed the grand median, and let O2i be
the number of observations in the ith sample that are less than or equal to the grand median.
Arrange the frequency counts into a 2-by-2 contingency table as follows:

Sample         1      2      Totals

> median       O11    O12
≤ median       O21    O22

Totals         N1     N2     N

• Hypothesis

H0: The two populations have the same median.
H1: The medians of the two populations are different.

• Test Statistic 

In the first sample count the number of X's greater than the grand median, say O11, and the
number of X's smaller than the grand median, say O21; then let T = O11 - O21. Data
values equal to the grand median are omitted.

From the contingency table, a χ² value can be calculated by using:
χ² = Σ((O1i - O2i)²/Ni) for i = 1, 2.
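The contingency counts and the χ² value can be sketched as follows, using the two samples from the Data Structure example. The sketch follows the table definition (values equal to the grand median are counted in the lower row); the function name is hypothetical.

```python
import statistics


def median_test_chi_square(sample_1, sample_2):
    """Return (grand median, [(O1i, O2i) per sample], chi-square value)."""
    grand_median = statistics.median(list(sample_1) + list(sample_2))
    counts = []
    chi_square = 0.0
    for sample in (sample_1, sample_2):
        above = sum(1 for v in sample if v > grand_median)   # O1i
        at_or_below = len(sample) - above                    # O2i
        counts.append((above, at_or_below))
        chi_square += (above - at_or_below) ** 2 / len(sample)
    return grand_median, counts, chi_square


sample_1 = [2, 3, 4, 2, 3, 6]
sample_2 = [4, 5, 4, 2, 2, 6, 3, 7]
print(median_test_chi_square(sample_1, sample_2))
```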

• Decision Rule 

A standardized z-value is printed, so we can look in the cumulative normal frequency
distribution table to find the probability corresponding to the standardized z value, Zt, for Z = √χ².

Accept H0 if 1 - P[-Zt < Z < Zt] > α
Reject H0 if 1 - P[-Zt < Z < Zt] < α

If you wish to use the χ² value calculated from the contingency table, then look in the
chi-square table and find the W(1 - α) value with one degree of freedom, where α is
the significance level.

Accept H0 if calculated χ² < W(1 - α)
Reject H0 if calculated χ² > W(1 - α)

If N1 + N2 < 30, Fisher's exact probability, P, is given. If α/2 < P < 1 - α/2, accept H0;
otherwise, reject H0.

Mann-Whitney Test 

• Object 

The Mann-Whitney test is designed to test if two populations are identical. 






• Data 

The data consist of two random samples. Let X1, X2, ..., XN denote the random sample of
size N from population one, and let Y1, Y2, ..., YM denote the random sample of size M from
population two. Assign the ranks 1 through N + M to the combined samples. Let R(Xi) and
R(Yj) denote the ranks assigned to Xi and Yj respectively, for all i and j.

• Hypotheses 

Let F(X) and G(X) be the distribution functions corresponding to populations one and two
respectively (or of X and Y respectively).

1. Two-Sided Test

H0: F(X) = G(X) for all X

H1: F(X) ≠ G(X) for at least one X

2. One-Sided Test
H0: P(X < Y) ≤ .5
H1: P(X < Y) > .5

3. One-Sided Test
H0: P(X < Y) ≥ .5
H1: P(X < Y) < .5

• Test Statistic 

Let T = Σ R(Xi) for i = 1, ..., N.

In our output T is standardized to z by using:

z = (T - μ)/σ
where

μ = N(N + M + 1)/2
and

σ² = MN(M + N + 1)/12
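The rank sum and its standardization can be sketched as below. The code assumes tied observations receive their average rank (the rule stated for the Kruskal-Wallis test) and applies no tie correction to σ; function names are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for t in range(i, j + 1):
            ranks[order[t]] = average
        i = j + 1
    return ranks


def mann_whitney_z(x, y):
    """Return (T, z): rank sum of the X sample and its standardized value."""
    n, m = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    t = sum(ranks[:n])                       # ranks belonging to the X sample
    mu = n * (n + m + 1) / 2
    sigma = math.sqrt(m * n * (m + n + 1) / 12)
    return t, (t - mu) / sigma


sample_1 = [2, 3, 4, 2, 3, 6]
sample_2 = [4, 5, 4, 2, 2, 6, 3, 7]
print(mann_whitney_z(sample_1, sample_2))
```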

• Decision Rule 

Look in the normal probability function table to find the probability corresponding to the 
standardized z, Zt. 

1. Two-Sided Test

Accept H0 if P[-|Zt| ≤ Z ≤ |Zt|] < 1 - α
Reject H0 if P[-|Zt| ≤ Z ≤ |Zt|] > 1 - α

2. One-Sided Test

Accept H0 if P[Z ≤ Zt] > α
Reject H0 if P[Z ≤ Zt] < α

3. One-Sided Test

Accept H0 if P[Z ≤ Zt] < 1 - α
Reject H0 if P[Z ≤ Zt] > 1 - α






Taha's Squared R 

This test is similar to the Mann-Whitney test, because it ranks the pooled sample of X's and
Y's and defines T by T = Σ R(Xi)². Again, the null hypothesis is that the two populations
have the same distribution. T is standardized by z = (T - μ)/σ where

μ = N(N + M + 1)(2(N + M) + 1)/6
and σ is very complicated, but can be found in Mielke. (See References)

Cramer- Von Mises Test 

• Object 

The Cramer-Von Mises test is designed to test if two populations are identical. 

• Data

The data consist of two independent random samples, X1, ..., XN and Y1, ..., YM, with
unknown distribution functions F(*) and G(*) respectively.

• Hypothesis 

H0: F(X) = G(X) for all X

H1: F(X) ≠ G(X) for at least one X

• Test Statistic 

Let F1(Xi) and G1(Yj) be the empirical cumulative distribution functions. Then

T = Σ [F1(Xi) - G1(Yj)]²

where the sum is over consecutive i and j, that is, over the "pooled" cumulative distribution
function.

• Decision Rule 

In the program output, T and the .10, .05, and .01 significance levels are printed. Choose 
your desired significance level and: 

Reject H0 if T > corresponding critical point
Accept H0 if T < corresponding critical point

Kolmogorov-Smirnov Test 

• Object 

This test is designed to test whether two populations have the same distribution. 

• Data 

The data consist of two independent random samples X1, ..., XN and Y1, ..., YM. Let F(*)
and G(*) represent their respective, unknown, distribution functions.






• Hypotheses 

1. Two-Sided Test 

H0: F(X) = G(X) for all X

H1: F(X) ≠ G(X) for at least one value of X

2. One-Sided Test

H0: F(X) = G(X) for all X

H1: F(X) > G(X) for at least one value of X

3. One-Sided Test

H0: F(X) = G(X) for all X

H1: F(X) < G(X) for at least one value of X

• Test Statistic 

Let S1(X) be the empirical distribution function based on the random sample X1, ..., XN, and
let S2(X) be the empirical distribution function based on the other random sample Y1, ...,
YM.

Define the test statistic, T, as the greatest vertical distance between the two empirical
distribution functions, both evaluated at the same point X:

T = sup |S1(X) - S2(X)|
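The greatest vertical distance can be found by evaluating both empirical distribution functions at every observed value, since they only change at those points. A minimal sketch with a hypothetical function name:

```python
def ks_two_sample(x, y):
    """Greatest vertical distance between the two empirical distribution functions."""
    n, m = len(x), len(y)
    distance = 0.0
    # Both step functions are constant between observed values,
    # so checking each distinct pooled value is sufficient.
    for point in sorted(set(x) | set(y)):
        s1 = sum(1 for v in x if v <= point) / n
        s2 = sum(1 for v in y if v <= point) / m
        distance = max(distance, abs(s1 - s2))
    return distance


sample_1 = [2, 3, 4, 2, 3, 6]
sample_2 = [4, 5, 4, 2, 2, 6, 3, 7]
print(ks_two_sample(sample_1, sample_2))
```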



• Decision Rule 

The output consists of T and the .10, .05, and .01 significance levels. Choose your desired
significance level.

Reject H0 if T > corresponding critical point
Accept H0 otherwise






Multiple-Sample (≥ 3 Samples) Tests

Description 

The following routines are available: 

One-Way Analysis of Variance — tests whether the means of several populations are equal. 

Multiple Comparisons — test whether there are significant differences between pairs of 
means via Least Significant Differences, Duncan's test, Student-Newman-Keuls test, Tukey's
HSD, or Scheffe's test.

Kruskal-Wallis Test — tests if several populations have identical medians. 

Typical Program Flow 



1. Input data via BSDM

2. Select Advanced Statistics option

3. Insert program medium

4. Select "multiple sample tests"

5. Specify variables and subfiles

6. Choose the desired test (4 options)

7. Execute the chosen test



Data Structure 

For ≥ 3 sample tests, three or more different subfiles of the same variable must be used. The
data are entered as in the following example. Suppose you have three samples:

Sample 1: 2, 5, 8, 7, 6, 4
Sample 2: 3, 2, 9, 11
Sample 3: 7, 3, 5, 8, 6






You would enter the data via Basic Statistics and Data Manipulation as one variable with 15 
observations like this: 

Variable #1

I     OBS(I)   OBS(I+1)   OBS(I+2)   OBS(I+3)   OBS(I+4)

1     2        5          8          7          6
6     4        3          2          9          11
11    7        3          5          8          6



Then, the Subfile option would be used to specify three subfiles, the first with six observa- 
tions, the second with four observations, and the third with five observations. 

Methods and Formulae 

1. One-way Analysis of Variance is used to test the hypothesis that the means of several 
populations are equal. The assumption is that all the populations are normal and have 
equal variances, although the sample sizes may be unequal. 

Suppose k is the number of populations and n_i is the number of observations in the
sample from the ith population. The total variation of the data is

$$SST = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (X_{ij} - \bar{X})^2$$

where $\bar{X}$ is the overall mean. The variation due to error, or variation within samples, is

$$SSE = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (X_{ij} - \bar{X}_i)^2$$

where $\bar{X}_i$ is the mean of the ith sample. The variation between samples is

$$SSB = \sum_{i=1}^{k} n_i (\bar{X}_i - \bar{X})^2$$

The error mean square is defined as

$$MSE = SSE/(N - k), \quad \text{where } N = \sum_{i=1}^{k} n_i$$

and the between samples mean square is defined as MSB = SSB/(k - 1).



The F-ratio, MSB/MSE, has the F distribution with k - 1 and N - k degrees of freedom.
The null hypothesis that the population means are equal may be rejected if the F ratio is
greater than or equal to F(α, k - 1, N - k), where α is the significance level of the
experiment. This may be summarized in a table:






Source of          Degrees of    Sum of     Mean                  F
Variation          Freedom       Squares    Square

Between samples    k - 1         SSB        MSB = SSB/(k - 1)     MSB/MSE
Error              N - k         SSE        MSE = SSE/(N - k)
Total              N - 1         SST
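The entries of the table can be computed with a short sketch, shown here on the three samples from the Data Structure example. The function name and the dictionary keys are illustrative only.

```python
def one_way_anova(samples):
    """Return the sums of squares, mean squares and F ratio for k samples."""
    k = len(samples)
    n_total = sum(len(s) for s in samples)
    grand_mean = sum(sum(s) for s in samples) / n_total
    means = [sum(s) / len(s) for s in samples]
    sst = sum((v - grand_mean) ** 2 for s in samples for v in s)
    sse = sum((v - mu) ** 2 for s, mu in zip(samples, means) for v in s)
    ssb = sum(len(s) * (mu - grand_mean) ** 2 for s, mu in zip(samples, means))
    msb = ssb / (k - 1)
    mse = sse / (n_total - k)
    return {"SST": sst, "SSE": sse, "SSB": ssb, "MSB": msb, "MSE": mse, "F": msb / mse}


samples = [[2, 5, 8, 7, 6, 4], [3, 2, 9, 11], [7, 3, 5, 8, 6]]
table = one_way_anova(samples)
for name, value in table.items():
    print(name, round(value, 4))
```

The identity SST = SSB + SSE provides a convenient check on the arithmetic.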







Multiple Comparisons 

Multiple comparisons provide you with several tests to determine whether the various
samples have significantly different means. The procedures are used upon completion of an
analysis of variance. The notation used in these tests is defined below.

EMS = error mean square used in testing for significance in the analysis of variance

n = harmonic average of observations per mean

S(M) = √(EMS/n)

k = number of groups

a = degrees of freedom for EMS = N - k

Mi = mean of the ith sample, i = 1, ..., k 

Oi = ith ordered (from largest to smallest) group mean, i = 1, ..., k 

msd = minimum significant difference 

Group means are sorted and then all possible comparisons are made. Only one table value is 
necessary for Least Significant Differences, Tukey's HSD, or Scheffe's test. On the other 
hand, k - 1 table values are needed for Student-Newman-Keul's test and Duncan's multiple 
range test. 

The minimum significant difference is the smallest difference there can be between two means 
for them to be considered significantly different from one another. In all of the procedures, 
comparisons are made starting with the largest difference between means and progressing to 
the smallest difference. The process should be terminated when there is no significant differ- 
ence found at a given step. 

In all cases the hypothesis is: 

H0: μi = μj, where μi is the mean of the ith population, i ≠ j
H1: μi ≠ μj






Least Significant Differences (Multiple Comparisons) 

• Test Statistic 

msd = t(a,b) S(M) √2, where t(a,b) is the upper b point of the t-distribution with a
degrees of freedom

• Decision Rule

Accept H0 if |Mi - Mj| < msd
Reject H0 otherwise
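The quantities n, S(M), and the Least Significant Differences msd can be sketched as below. The t table value is supplied by the caller; 2.179 is the upper 2.5% point of the t-distribution with 12 degrees of freedom. The EMS value is the error mean square for the three samples in the Data Structure example, and the function names are hypothetical.

```python
import math


def harmonic_mean(sizes):
    """Harmonic average of the numbers of observations per mean."""
    return len(sizes) / sum(1 / n for n in sizes)


def lsd_msd(ems, sizes, t_value):
    """msd = t(a,b) * S(M) * sqrt(2), with S(M) = sqrt(EMS / n)."""
    s_m = math.sqrt(ems / harmonic_mean(sizes))
    return t_value * s_m * math.sqrt(2)


ems = 96.883333333 / 12          # SSE / (N - k) for the example samples
msd = lsd_msd(ems, [6, 4, 5], t_value=2.179)
print(round(msd, 4))
```

Two sample means would be declared different only if they are at least msd apart.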

Duncan's Multiple Range Test (Multiple Comparisons) 

• Test Statistic 

First, the sample means are ordered from largest to smallest: O1, O2, ..., Ok. Define p =
difference in ranks of the means being compared plus one. For example, if you are comparing
O2 and O5, then p = (5 - 2) + 1 = 4. Then:

msd = R(a,p,b)S(M), where R(a,p,b) is the upper b point from the new multiple range
table with a degrees of freedom and distance p.

• Decision Rule

Accept H0 if Oi - Oj < msd, where i < j
Reject H0 otherwise

Scheffe's Test (Multiple Comparisons) 

Scheffe's test is appropriate for contrasts chosen after the data have been collected and
examined, such as those that catch your eye during the analysis.

• Test Statistic

msd = √[(k - 1) F(b, k-1, a)] S(M), where F(b, k-1, a) is the upper b point of the F
distribution with k - 1 and a degrees of freedom.

• Decision Rule

Accept H0 if |Mi - Mj| < msd
Reject H0 otherwise

Tukey's HSD (Multiple Comparisons) 

• Test Statistic 

msd = R(k,a,b)S(M), where R(k,a,b) is the upper b point of the Studentized range table
with a degrees of freedom and k samples in total.

• Decision Rule

Accept H0 if |Mi - Mj| < msd
Reject H0 otherwise






Student-Newman-Keuls Test (Multiple Comparisons) 

First, the means of the samples are ordered from largest to smallest, O1, O2, ..., Ok. Then p is
defined the same as in Duncan's Test.

• Test Statistic

msd = R(p,a,b)S(M), where R(p,a,b) is the upper b point from the Studentized range
table with a degrees of freedom and distance p.

• Decision Rule

Accept H0 if msd > Oi - Oj, i < j
Reject H0 otherwise

Kruskal-Wallis Test 

• Object 

The Kruskal-Wallis test is designed to test whether k independent samples, k ≥ 2, have the
same mean. The test does not assume normality of the k populations.

• Data 

The data consist of k independent samples, each of size Ni, i = 1, ..., k. Let N = N1 + N2 +
... + Nk. Rank the combined samples. Then, for each sample compute the sum of the ranks 
of the observations in the sample. Call these sums Ri, for i = 1, ..., k. If more than one 
observation have the same value, assign the average rank to each of the tied observations. 

• Hypothesis 

H0: All of the k populations have equal means

H1: At least one of the populations has a different mean

• Test Statistic

T = [12/(N(N + 1))] [Σ(Ri²/Ni)] - 3(N + 1), for i = 1, ..., k
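The rank sums and T can be sketched as below on the three samples from the Data Structure example, with tied observations given their average rank as described under Data. No tie correction is applied to T, which is an assumption about the program; function names are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for t in range(i, j + 1):
            ranks[order[t]] = average
        i = j + 1
    return ranks


def kruskal_wallis(samples):
    """Return (rank sums Ri, T) for a list of k samples."""
    pooled = [v for s in samples for v in s]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    rank_sums, start = [], 0
    for s in samples:
        rank_sums.append(sum(ranks[start:start + len(s)]))
        start += len(s)
    total = sum(r * r / len(s) for r, s in zip(rank_sums, samples))
    return rank_sums, 12 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)


samples = [[2, 5, 8, 7, 6, 4], [3, 2, 9, 11], [7, 3, 5, 8, 6]]
print(kruskal_wallis(samples))
```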

• Decision Rule 

The output prints out a chi-square statistic along with the probability that a chi-square random
variable is greater than the statistic. If the probability printed is smaller than the significance
level you chose, reject H0. Otherwise, accept H0.






References 

1. Bancroft, T.A., Topics in Intermediate Statistical Methods, Volume 1. Iowa State Uni- 
versity Press; Ames, Iowa, 1968. 

2. Boardman, T.J., and Moffitt, D.R., "Graphical Monte Carlo Type I Error Rates for Mul- 
tiple Comparisons Procedures", Biometrics, 27: September 1971. 

3. Conover, W.J. (1971), Practical Nonparametric Statistics. John Wiley and Sons, Inc.
New York.

4. Conover, W.J. (1974), "Some Reasons For Not Using the Yates Continuity Correction
on 2x2 Contingency Tables". JASA, June 1974, 69:374.

5. Dixon, Wilfred and Massey, Frank, Introduction to Statistical Analysis, McGraw-Hill, 
New York, 1969, pp. 119-123. 

6. Draper, N.R. and Smith, H., Applied Regression Analysis, John Wiley & Sons, New 
York, 1966, pp. 7-20. 

7. Mielke, P.W. (1967), "Note on Some Squared Rank Tests with Existing Ties". Tech- 
nometrics, 9:312. 

8. Mielke, P.W. (1972), "Asymptotic Behavior of Two-Sample Tests Based on Powers of 
Ranks for Detecting Scales and Location Alternatives". 

9. Mosteller, F. and Robert E.K. Rourke (1973), Sturdy Statistics. Addison-Wesley Pub- 
lishing Co., Reading, Mass. 

10. Siegel, S. (1956), Nonparametric Statistics. McGraw-Hill, New York. 

11. Snedecor, George and Cochran, William, Statistical Methods, Iowa State University 
Press, Ames, Iowa; 1971, pp. 91-119. 






Statistical Distributions 

Object of Program 

This program allows you to run a series of continuous and discrete statistical distributions. 
Both tabled values and right-tailed probabilities are available for the continuous distributions.
The discrete distributions calculate right-tailed probabilities, single term probabilities and an 
approximate value for a specified right-tailed probability. 

Additionally, this program will calculate n factorial, the complete gamma function, the com- 
plete beta function and binomial coefficients. 

Methods and Formulae 

Continuous 

The continuous distributions included in this program are: 

1. Normal (Gaussian) 

2. Two-parameter gamma 

3. Central F 

4. Beta 

5. Student's T 

6. Weibull 

7. Chi-square 

8. Laplace (double exponential, bilateral exponential, extreme distribution, or Poisson's 
first law of error) 

9. Logistic (autocatalytic function, growth curve) 

For the central F, beta, T, chi-square and gamma distributions, the algorithms generally con- 
verge most rapidly for small or large right tail probabilities. For moderate tails, the time in- 
creases as the right tail approaches .5. For the beta distribution, both parameters should be 
greater than 10 3 . If the parameters are smaller than this, the time required for convergence 
is excessive. 

For the chi-square, it is recommended that the degrees of freedom be less than 500. 

For the logistic, Laplace and Weibull it is necessary that the right-tailed probabilities, p, satisfy
1 - 10^-95 > p > 10^-95

For the incomplete gamma, it is recommended that the ratio A/B be less than 250. 

Some special terms are: 

1. Right-tailed probability. Given that X is a random variable and "a" is an observable
value of X, then the right-tailed probability associated with "a" is PR(X>a). 

2. Tabled values. Given that X is a random variable and P is a right-tailed probability, 
then the tabled value associated with P is that value "a" such that PR(X>a) = P. 
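The two terms can be illustrated for the standard normal distribution. The sketch uses the complementary error function for the right tail and simple bisection for the tabled value; this is an illustration of the definitions, not the algorithm the program uses, and the function names are hypothetical.

```python
import math


def right_tail_normal(a):
    """PR(X > a) for a standard normal random variable X."""
    return 0.5 * math.erfc(a / math.sqrt(2))


def tabled_value_normal(p, low=-10.0, high=10.0):
    """Value a such that PR(X > a) = p, found by bisection on [low, high]."""
    for _ in range(200):
        middle = (low + high) / 2
        if right_tail_normal(middle) > p:
            low = middle      # tail still too large: move right
        else:
            high = middle
    return (low + high) / 2


print(right_tail_normal(1.96))
print(tabled_value_normal(0.025))
```

The two functions are inverses of one another, so feeding the output of one into the other returns the starting value to within rounding.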

To specify the distributions, the respective density functions that are evaluated are shown
below. Let f(x) be a density, and Γ(*) be the gamma function.






1. Normal (standard)

$$f(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}, \quad -\infty < x < \infty$$

2. Two parameter gamma, parameters A, B

$$f(x) = \frac{1}{\Gamma(A) B^A} x^{A-1} e^{-x/B}, \quad x > 0, \quad A > 0, \; B > 0$$

3. Central F with N degrees of freedom in the numerator and D in the denominator

$$f(x) = \frac{\Gamma((N+D)/2) (N/D)^{N/2} x^{N/2 - 1}}{\Gamma(N/2) \Gamma(D/2) \left(1 + \frac{Nx}{D}\right)^{(N+D)/2}}$$

N and D are positive integers

4. Beta with parameters A and B

$$f(x) = \frac{\Gamma(A+B)}{\Gamma(A) \Gamma(B)} (1 - x)^{B-1} x^{A-1}, \quad 0 \le x \le 1, \quad A, B > 0$$

5. Student's t with N degrees of freedom

$$f(x) = \frac{\Gamma((N+1)/2)}{\sqrt{N\pi} \, \Gamma(N/2)} \cdot \frac{1}{(1 + x^2/N)^{(N+1)/2}}, \quad -\infty < x < \infty$$

N positive integer

6. Weibull with parameters A, B

$$f(x) = B A^B x^{B-1} \exp[-(Ax)^B], \quad x > 0, \quad A, B > 0$$

7. Chi-square with N degrees of freedom

$$f(x) = \frac{1}{\Gamma(N/2) 2^{N/2}} x^{N/2 - 1} e^{-x/2}, \quad x > 0$$

N is a positive integer

8. Logistic with parameters A, B

$$f(x) = \frac{B \exp(-(A + Bx))}{[1 + \exp(-(A + Bx))]^2}, \quad B > 0 \text{ and } -\infty < x < \infty$$

9. Laplace with parameters A and B

$$f(x) = \frac{1}{2B} \exp\{-|x - A|/B\}, \quad B > 0 \text{ and } -\infty < x < \infty$$



Discrete 

The discrete distributions included in this program are: 

1. Binomial 

2. Negative Binomial 

3. Poisson 

4. Hypergeometric 

5. Gamma Function 

6. Beta Function 

7. Single Term Binomial 

8. Single Term Negative Binomial 

9. Single Term Poisson 

10. Single Term Hypergeometric 

Other routines of this program are N factorial and Binomial Coefficients. 

Some special terms used are: 

1. Tabled value. Let X be a binomial, hypergeometric or Poisson random variable. Given
all appropriate parameters and p, a desired right-tailed probability, then the tabled value
is defined to be x such that P(X>x) = p.

2. Single term probability. Given that X is one of the three distributions and x is in the
counter domain of X, then the single term probability is defined to be P(X = x).

All tabled values are normal approximations. It should be noted that if a right-tailed
probability p is desired, it is an unlikely coincidence that there will exist an element x in the counter
domain such that P(X>x) = p, where X is one of the distributions in (2) above. Thus, after
getting the normal approximation to the tabled value, values in the counter domain near the
approximation should be checked to see which value is best for the particular application.

The distributions are defined as follows: 

1. Hypergeometric

Let N = number of items in a lot
M = sample size
X = number of defective items in the sample
K = number of defective items in the lot

with M ≤ N, K ≤ N, X ≤ K, and X ≤ M. Then P(exactly x defectives are in the sample) is

$$P(X = x) = \frac{\binom{K}{x} \binom{N-K}{M-x}}{\binom{N}{M}}, \quad x = 0, 1, \ldots, M$$

and

$$P = P(X \ge x) = \sum_{i=x}^{\min(M,K)} P(X = i)$$



2. Binomial

Let N = number of trials
p = probability of success at each trial
X = number of successes

$$P(X = R) = \binom{N}{R} p^R (1 - p)^{N-R}, \quad R = 0, 1, \ldots, N, \quad 0 < p < 1$$

and

$$P = P(X \ge R) = \sum_{i=R}^{N} \binom{N}{i} p^i (1 - p)^{N-i}$$



3. Poisson

Let m = rate parameter or mean = lambda > 0
X = number of occurrences = 0, 1, 2, ...

$$P = P(X \ge N) = e^{-m} \sum_{i=N}^{\infty} \frac{m^i}{i!}$$



4. Negative Binomial

For a sequence of Bernoulli trials with probability p of success,
let R = number of failures before the Nth success; then

$$P(X = R) = \binom{N + R - 1}{R} p^N (1 - p)^R, \quad R = 0, 1, 2, \ldots, \quad 0 < p < 1$$

and if A = number of failures before the Nth success then

$$P(X \ge A) = \sum_{i=A}^{\infty} \binom{N + i - 1}{i} p^N (1 - p)^i, \quad A = 0, 1, 2, \ldots$$



5. N! and Γ(x) and complete beta function. N must be a non-negative integer.

An asymptotic Stirling's approximation is used to calculate N!, Γ(x), and the complete
beta function.
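
The tail probabilities defined in items 1 through 4 can be evaluated directly by summing single term probabilities. The sketch below is only an illustration of the definitions; the function names are hypothetical, and the Poisson tail is computed as a complement (one minus the finite lower sum) rather than by the method the program itself uses.

```python
from math import comb, exp, factorial


def hypergeom_tail(N, M, K, x):
    """P(X >= x): lot of N items with K defectives, sample of size M."""
    total = sum(comb(K, i) * comb(N - K, M - i) for i in range(x, min(M, K) + 1))
    return total / comb(N, M)


def binomial_tail(N, p, R):
    """P(X >= R) for N trials with success probability p."""
    return sum(comb(N, i) * p**i * (1 - p) ** (N - i) for i in range(R, N + 1))


def poisson_tail(m, N):
    """P(X >= N) for mean m, computed as 1 - P(X <= N - 1)."""
    return 1.0 - exp(-m) * sum(m**i / factorial(i) for i in range(N))


def negbin_cdf(N, p, A):
    """P(X <= A), where X counts failures before the Nth success."""
    return sum(comb(N + i - 1, i) * p**N * (1 - p) ** i for i in range(A + 1))


print(hypergeom_tail(10, 3, 4, 1))
print(binomial_tail(4, 0.5, 2))
print(poisson_tail(1.0, 1))
print(negbin_cdf(2, 0.5, 1))
```

Because the sums are exact, no continuity correction enters; that only becomes relevant when a normal approximation replaces the sum.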

Special Considerations 

Loading the Program Directly 

This program may be entered via Basic Statistics and Data Manipulation, any One Sample 
test, or any Multiple Sample test. You may also load the program directly by following these 
instructions: 

1. Insert the General Statistics program medium. 

2. Enter: LOAD "START_DIST",10, 

3. Press: EXECUTE 

Before you load the program directly, you must specify the mass storage device which contains 
the program medium using the MASS STORAGE IS command. 

Continuity Correction 

For right-tailed probabilities, the exact probabilities are calculated. Thus, there is no need to 
use a continuity correction. There is no restriction that the parameters be integers, so if for 
some reason a continuity correction is desired, one may be used. 

References 

1. Abramowitz, M. and Stegun, I. A., Handbook of Mathematical Functions, National
Bureau of Standards, 1964.

2. Abramowitz, M. and Stegun, I. (1964) N.B.S. Handbook Series 55, Government Printing
Office.

3. Erdelyi, A., editor (1953) Higher Transcendental Functions, Vol. 1, McGraw-Hill, New
York.

4. Johnson, N., and Kotz, S. (1970) Continuous Univariate Distributions, Vol. 1 and 2,
Houghton-Mifflin, New York.

5. Khovanskii, A.N., (1956) The Applications of Continued Fractions and Their Generalizations
to Problems in Approximation Theory, P. Noordhoff, Groningen.

6. Kopitzke, R., Ph.D. Dissertation, 1974.

7. Kopitzke, Robert W., Unpublished research notes.

8. Lieberman, G.J. and Owen, D.B., Tables of the Hypergeometric Probability Distribution,
Stanford University Press, 1961.

9. Wall, H.S., (1948) Analytic Theory of Continued Fractions, D. Van Nostrand, New
York.

10. Whittaker, E.T., and Watson, G.N., (1940) Modern Analysis, Cambridge University
Press.






Examples 

Examples On One Sample Data Sets 

One Hundred Failure-Time Data 

One hundred observations of the time until failure of an electronic circuit were obtained from 
a life testing experiment. The coded data values are shown below. The serial correlations with 
lag 1 and lag 2 were quite small indicating apparent "independence" of the observations. 
Also, a serial plot of the data shows no particular patterns. The runs test further confirms the 
randomness of the data. 

This type of data is assumed to come from an exponential random variable with mean = 1.
The histogram of the data indicates that this assumption might be valid. If the data really is
exponential with mean = 1, then the sample mean and standard deviation also should be
about 1. From the output we see that x̄ = 1.0856 and s = .9301, which do not differ from 1
by a great deal. This is confirmed by the one-sample t-test.
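
As a quick check of the t-test mentioned above, the statistic can be recomputed from the summary values quoted in the text. This is a sketch using the rounded mean and standard deviation, so the last digit may differ slightly from the program's printout; the variable names are illustrative.

```python
from math import sqrt

n = 100          # sample size
mean = 1.0856    # sample mean quoted in the text
sd = 0.9301      # sample standard deviation quoted in the text
mu0 = 1.0        # hypothesized mean

std_error = sd / sqrt(n)          # standard error of the mean
t = (mean - mu0) / std_error      # one-sample t statistic
df = n - 1

print(f"std error = {std_error:.4f}, t = {t:.4f}, df = {df}")
```

A t value below 1 with 99 degrees of freedom is far from any conventional critical value, which is why the hypothesis of mean = 1 is not rejected.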

Both the Chi-square goodness of fit test and the Kolmogorov-Smirnov goodness of fit test
indicate that we cannot reject the hypothesis that the data came from an exponentially distributed
population with mean = 1. The χ² test yields a test statistic of 9.248 with 8 degrees of
freedom, which is not significant even at the α = .10 level. The K-S test statistic, DN =
.09907, is not significant at the α = .20 level. However, both tests (χ² and K-S) indicate that the
data is not normally distributed.
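
The chi-square statistic can be reproduced from the cell counts printed later in this example (10 cells of width .4 starting at 0). The sketch below assumes the expected counts are computed from an exponential distribution whose mean is the sample mean 1.085611; that assumption is an inference from the printed expected counts and the 8 degrees of freedom, not something the manual states.

```python
from math import exp, sqrt

observed = [26, 20, 19, 8, 11, 4, 5, 2, 4, 1]   # observed counts per cell
n = sum(observed)
mean = 1.085611   # assumed fitted mean (the sample mean)
width = 0.4       # cell width entered in the example


def exp_cdf(x, mean):
    """Cumulative distribution function of an exponential with the given mean."""
    return 1.0 - exp(-x / mean)


expected = [
    n * (exp_cdf((k + 1) * width, mean) - exp_cdf(k * width, mean))
    for k in range(len(observed))
]
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1 - 1   # cells, minus 1, minus 1 estimated parameter

# Scaled Kolmogorov-Smirnov statistic, from the DN value quoted in the text
scaled_dn = sqrt(n) * 0.09907

print(f"chi-square = {chi_square:.3f} with {df} degrees of freedom")
print(f"SQR(N)*DN = {scaled_dn:.2f}")
```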

Since the sample size for this example was too large to perform a Shapiro-Wilk normality test,
the data set was split into two halves of 50 observations each to give you an idea of the output.

* DATA MANIPULATION *

Enter DATA TYPE (Press CONTINUE for RAW DATA):

1                                   Raw data

Mode number = ?

2                                   On mass storage

Is data stored on program's scratch file (DATA)?

NO

Data file name = ?

TIME:INTERNAL

Was data stored by the BS&DM system ?

YES

Is data medium placed in device INTERNAL

?

YES

Is program medium placed in correct device ?

YES



Data file name: TIME:INTERNAL

Data type is: Raw data

Number of observations: 100
Number of variables: 1



Variable names:
1. X1

Subfiles: NONE






SELECT ANY KEY

Option number = ?
1

                                    Press special function key labeled-LIST

Data type is: Raw data

                              VARIABLE # 1 (X1)

   I      OBS(I)   OBS(I+1)   OBS(I+2)   OBS(I+3)   OBS(I+4)
   1     2.00790    2.45450    2.55760     .50250    1.71430
   6     1.71430    2.52480     .84390    2.89900     .32220
  11      .18180    3.38780    1.71490     .16020     .10360
  16      .53530    1.18870     .01480     .03510     .21580
  21      .84770    1.85770    1.08500    3.25370    1.73570
  26     1.03880    1.72300    1.72300    1.85580     .89840
  31      .14220     .12790    1.49950     .11010    3.37350
  36      .60190    1.90800     .52140     .29580     .49730
  41     1.63010     .05740    1.08360     .57650    2.25210
  46     2.72780     .83400    1.14640     .02070     .23900
  51     3.84480    1.29530     .81290     .85020     .97390
  56      .43280     .83970    1.08490     .95980     .51170
  61      .89530    2.51070     .32380    1.06270    3.21960
  66     1.20550     .39400     .29730    1.27110     .98670
  71     2.31500     .48060    1.34410     .78670    2.28790
  76      .12190     .54020    3.11250     .17480     .06320
  81      .65310     .54450     .01050     .18050     .46430
  86      .55340     .99490     .28950    1.36600     .15090
  91     1.51270    1.53900     .77450     .14300     .44900
  96      .43340     .16540    1.76060     .40100     .43230

Option number = ?

SELECT ANY KEY






Exit LIST procedure 





Enter number of desired function 



Select special function key labeled ADV. STAT 

Remove BSDM media 

Insert General Statistics media 



Choose 1 sample tests 
***************************************** 



ONE SAMPLE TESTS

VARIABLE -- X1

********************************************************************************

Enter desired function:
1



Choose serial correlation 



SERIAL CORRELATION SAMPLE SIZE IS 100 



CORRELATION LAG = ? 
1 

SERIAL CORRELATION WITH LAG = 1 IS .01605 



Choose lag = 1 

Not very serially correlated 






ENTER ANOTHER LAG? 

YES 

CORRELATION LAG = ? 

2 

SERIAL CORRELATION WITH LAG = 2 IS .01235



Try lag = 2 

Not very correlated 



ENTER ANOTHER LAG? 
NO 

Enter desired function: 



2 












Obtain ranks 




RANKED DATA

            DISTINCT                   DISTINCT                   DISTINCT
(   RANK   DATA POINT)     (   RANK   DATA POINT)     (   RANK   DATA POINT)
(   1.00       .0105)      (   2.00       .0148)      (   3.00       .0207)
(   4.00       .0351)      (   5.00       .0574)      (   6.00       .0632)
(   7.00       .1036)      (   8.00       .1101)      (   9.00       .1219)
(  10.00       .1279)      (  11.00       .1422)      (  12.00       .1430)
(  13.00       .1509)      (  14.00       .1602)      (  15.00       .1654)
(  16.00       .1748)      (  17.00       .1805)      (  18.00       .1818)
(  19.00       .2158)      (  20.00       .2390)      (  21.00       .2895)
(  22.00       .2958)      (  23.00       .2973)      (  24.00       .3222)
(  25.00       .3238)      (  26.00       .3940)      (  27.00       .4010)
(  28.00       .4323)      (  29.00       .4328)      (  30.00       .4334)
(  31.00       .4490)      (  32.00       .4643)      (  33.00       .4806)
(  34.00       .4973)      (  35.00       .5025)      (  36.00       .5117)
(  37.00       .5214)      (  38.00       .5353)      (  39.00       .5402)
(  40.00       .5445)      (  41.00       .5534)      (  42.00       .5765)
(  43.00       .6019)      (  44.00       .6531)      (  45.00       .7745)
(  46.00       .7867)      (  47.00       .8129)      (  48.00       .8340)
(  49.00       .8397)      (  50.00       .8439)      (  51.00       .8477)
(  52.00       .8502)      (  53.00       .8953)      (  54.00       .8984)
(  55.00       .9598)      (  56.00       .9739)      (  57.00       .9867)
(  58.00       .9949)      (  59.00      1.0388)      (  60.00      1.0627)
(  61.00      1.0836)      (  62.00      1.0849)      (  63.00      1.0850)
(  64.00      1.1464)      (  65.00      1.1887)      (  66.00      1.2055)
(  67.00      1.2711)      (  68.00      1.2953)      (  69.00      1.3441)
(  70.00      1.3660)      (  71.00      1.4995)      (  72.00      1.5127)
(  73.00      1.5390)      (  74.00      1.6301)      (  75.50      1.7143)
(  77.00      1.7149)      (  78.50      1.7230)      (  80.00      1.7357)
(  81.00      1.7606)      (  82.00      1.8558)      (  83.00      1.8577)
(  84.00      1.9080)      (  85.00      2.0079)      (  86.00      2.2521)
(  87.00      2.2879)      (  88.00      2.3150)      (  89.00      2.4545)
(  90.00      2.5107)      (  91.00      2.5248)      (  92.00      2.5576)
(  93.00      2.7278)      (  94.00      2.8990)      (  95.00      3.1125)
(  96.00      3.2196)      (  97.00      3.2537)      (  98.00      3.3735)
(  99.00      3.3878)      ( 100.00      3.8448)








Enter desired function:
3                                   Choose t-test

ONE-SAMPLE t-TEST     SAMPLE SIZE IS 100

1 OR 2 TAIL TEST
2                                   2 tail test

2 TAIL TEST
HO: MU= 1.085611 OR =
?
1

HO: MU=                1.0000
N=                     100
MEAN=                  1.0856
STD DEV=               .9301
STD ERROR OF MEAN=     .0930
t=                     .9204
DF=                    99





Specify hypothesis mean 



Cannot reject hypothesis 



P< 



.9204 < t < 



.9204) 



.3596 



Enter desired function: 

4 Choose Kolmogorov-Smirnov G.O.F. test 

KOLMOGOROV-SMIRNOV GOODNESS-OF-FIT TEST SAMPLE SIZE IS 100 



Please enter G.O.F. code: 



Testing for EXPONENTIAL goodness of fit. 



MEAN= 1.085611 OR =
?

                                    Choose exponential form of the
                                    hypothesized distribution.

MEAN = 1

N= 100, KOLMOGOROV-SMIRNOV STATISTICS:   DN           .09907
                                         SQR(N)*DN    .99



ANOTHER G.O.F. CODE? 
NO 

Enter desired function: 

5 Choose Chi-square G.O.F. test 

CHI-SQUARE GOODNESS-OF-FIT TEST SAMPLE SIZE IS 100

Please enter G.O.F. code: 



Testing for EXPONENTIAL goodness of fit. 



Select exponential distribution again 



OFFSET = 


OFFSET = 
# OF CELLS (max is 50) = ?
10

# OF CELLS = 10

OPTIMUM CELL WIDTH = .3845 

CELL WIDTH = .3844838448 OR = 

? 

.4 



Minimum value for histogram 



10 intervals or windows 






YOUR CELL WIDTH   .4000

            LOWER       OBSERVED     EXPECTED
CELL #      LIMIT       # OF OBS.    # OF OBS.
   1         .0000         26          30.82
   2         .4000         20          21.32
   3         .8000         19          14.75
   4        1.2000          8          10.20
   5        1.6000         11           7.06
   6        2.0000          4           4.88
   7        2.4000          5           3.38
   8        2.8000          2           2.34
   9        3.2000          4           1.62
  10        3.6000          1           1.12

CHI-SQUARE GOODNESS-OF-FIT FOR EXPONENTIAL DISTRIBUTION

CHI-SQUARE VALUE = 9.248, DEGREES OF FREEDOM = 8       Not very big.

ANOTHER GOF CODE? 

NO See Chi-square table in appendix with 

8 degrees of freedom. 

Enter desired function: 

7 Choose runs test 



RUNS TEST 



SAMPLE SIZE IS 100 



Select a significance level by entering 1, 2 or 3:

3 

TEST FOR TOO FEW RUNS? 

YES 

# OF RUNS IS NOT SIGNIFICANT AT THE .05
SIGNIFICANCE LEVEL FOR TOO FEW RUNS

TEST FOR TOO MANY RUNS? 

NO 

Another significance level? 

NO 



Choose α = .05

See if data is too non-random 



Enter desired function 
9 



Exit one-sample tests 



Enter number of desired function:
6

SELECT ANY KEY

Option number = ?

1

Number of subfiles ( <=20 ) = ?

2

Name of Subfile # 1 ( <=10 characters ) =

?

FIRST HALF

Subfile # 1 : number of observations =

?

50

Name of Subfile # 2 ( <=10 characters ) =

?

SECONDHALF

Is the above information correct?

YES

Subfile name:      beginning observation     number of observations

1 FIRST HALF                 1                         50

2 SECONDHALF                51                         50



Return to BSDM to split data set in half for 
Shapiro-Wilk test. 

Select special function key labeled-SUBFILES 

Split data set by specifying number of 
observations in each subfile 






Option number = ?

                                    Exit subfiles procedure
PROGRAM NOW UPDATING SCRATCH DATA FILE

SELECT ANY KEY

                                    Return to General Statistics by pressing
                                    ADV. STAT key
Enter number of desired function:

1                                   Choose one-sample tests

SUBFILE NUMBER? (0=IGNORE SUBFILES)
1

********************************************************************************

ONE SAMPLE TESTS

VARIABLE -- X1

SUBFILE -- FIRST HALF

Enter desired function:

6                                   Select Shapiro-Wilk test for subfile 1

SHAPIRO-WILK NORMALITY TEST SAMPLE SIZE IS 50

W STATISTIC FOR NORMALITY = .904821834706

% POINTS FOR W (SMALL VALUE SIGNIFICANT)

                        .01    .02    .05    .1     .5
CORRESPONDING W VALUES: .93    .938   .947   .955   .974

Enter desired function:
9

SUBFILE NUMBER? (0=IGNORE SUBFILES) 
2 

ONE SAMPLE TESTS 

VARIABLE -- X1

SUBFILE — SECONDHALF 

Enter desired function: 

6 Select Shapiro-Wilk test for subfile 2 

SHAPIRO-WILK NORMALITY TEST SAMPLE SIZE IS 50 



W STATISTIC FOR NORMALITY = .831574211967 

% POINTS FOR W (SMALL VALUE SIGNIFICANT)

                        .01    .02    .05    .1     .5
CORRESPONDING W VALUES: .93    .938   .947   .955   .974

Enter desired function:
9                                   Return to main menu

Enter number of desired function:
6                                   Return to BSDM






SELECT ANY KEY 

Examples On Two Paired Samples Data Sets 

Pig Weight Changes 

176 pigs were paired on the basis of sex, age, and initial weight. They were fed daily one of 
two iron compounds to supplement that which they lacked due to confinement in pens. It was 
desired to determine if there was any difference in pig weight due to the two different com- 
pounds as applied over a one month period. From the paired-t test and the correlation coeffi- 
cient, we see the difference is not significant. 
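
The paired-t statistic for this example can be recomputed from the 88 pairs listed in the printout that follows. The sketch below assumes the standard paired-t formula (mean difference divided by its standard error); the list names are illustrative and the values are transcribed from the data listing.

```python
from math import sqrt
from statistics import mean, stdev

# Variable 1 (X) and Variable 2 (Y), transcribed from the listing below
x = [54, 44, 46, 54, 45, 46, 50, 43, 47, 40,
     40, 46, 52, 50, 54, 49, 30, 50, 48, 38,
     27, 50, 107, 77, 91, 88, 93, 89, 95, 105,
     107, 95, 114, 128, 110, 104, 94, 87, 66, 96,
     120, 90, 95, 86, 158, 125, 149, 175, 196, 121,
     181, 201, 175, 147, 209, 194, 203, 179, 170, 148,
     138, 232, 223, 151, 142, 167, 210, 240, 245, 263,
     263, 182, 261, 280, 264, 187, 280, 287, 230, 234,
     238, 202, 202, 317, 293, 215, 171, 242]
y = [46, 42, 44, 44, 45, 52, 51, 55, 60, 43,
     20, 48, 54, 55, 62, 41, 48, 45, 46, 31,
     35, 59, 135, 90, 98, 98, 96, 74, 98, 133,
     126, 91, 52, 98, 119, 105, 110, 81, 83, 112,
     104, 101, 88, 86, 221, 176, 150, 176, 209, 118,
     180, 238, 196, 138, 133, 159, 209, 205, 201, 149,
     159, 230, 198, 161, 147, 176, 320, 267, 221, 247,
     293, 211, 178, 320, 266, 178, 199, 230, 256, 272,
     245, 222, 245, 243, 264, 215, 172, 233]

diffs = [a - b for a, b in zip(x, y)]     # X(I) - Y(I) for each pair
n = len(diffs)
t = mean(diffs) / (stdev(diffs) / sqrt(n))   # paired-t statistic
df = n - 1

print(f"n = {n}, mean difference = {mean(diffs):.4f}, t = {t:.3f}, df = {df}")
```

The magnitude of t is well under the one-tailed critical value of 1.663 printed by the program, consistent with the conclusion that the difference is not significant.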



******************************************************************************** 

* DATA MANIPULATION * 

*************************************************** 

Enter DATA TYPE (Press CONTINUE for RAW DATA):

1                                   Raw data

Mode number = ?

2                                   On mass storage

Is data stored on program's scratch file (DATA)?

NO

Data file name = ?

PIGS:INTERNAL

Was data stored by the BS&DM system ?

YES

Is data medium placed in device INTERNAL

?

YES

Is program medium placed in correct device ?

YES



PIG WEIGHT CHANGES

Data file name: PIGS:INTERNAL

Data type is: Raw data

Number of observations: 88
Number of variables: 2



Variable names:
1. VARIABLE#1
2. VARIABLE#2

Subfiles: NONE



Clever names for variables 



SELECT ANY KEY 

Option number = ?

1

Enter method for listing data:

3 



List all the data 






PIG WEIGHT CHANGES
Data type is: Raw data

           Variable # 1     Variable # 2
           (VARIABLE#1)     (VARIABLE#2)
OBS#
   1         54.00000         46.00000
   2         44.00000         42.00000
   3         46.00000         44.00000
   4         54.00000         44.00000
   5         45.00000         45.00000
   6         46.00000         52.00000
   7         50.00000         51.00000
   8         43.00000         55.00000
   9         47.00000         60.00000
  10         40.00000         43.00000
  11         40.00000         20.00000
  12         46.00000         48.00000
  13         52.00000         54.00000
  14         50.00000         55.00000
  15         54.00000         62.00000
  16         49.00000         41.00000
  17         30.00000         48.00000
  18         50.00000         45.00000
  19         48.00000         46.00000
  20         38.00000         31.00000
  21         27.00000         35.00000
  22         50.00000         59.00000
  23        107.00000        135.00000
  24         77.00000         90.00000
  25         91.00000         98.00000
  26         88.00000         98.00000
  27         93.00000         96.00000
  28         89.00000         74.00000
  29         95.00000         98.00000
  30        105.00000        133.00000
  31        107.00000        126.00000
  32         95.00000         91.00000
  33        114.00000         52.00000
  34        128.00000         98.00000
  35        110.00000        119.00000
  36        104.00000        105.00000
  37         94.00000        110.00000
  38         87.00000         81.00000
  39         66.00000         83.00000
  40         96.00000        112.00000
  41        120.00000        104.00000
  42         90.00000        101.00000
  43         95.00000         88.00000
  44         86.00000         86.00000
  45        158.00000        221.00000
  46        125.00000        176.00000
  47        149.00000        150.00000
  48        175.00000        176.00000
  49        196.00000        209.00000
  50        121.00000        118.00000
  51        181.00000        180.00000
  52        201.00000        238.00000
  53        175.00000        196.00000
  54        147.00000        138.00000
  55        209.00000        133.00000
  56        194.00000        159.00000
  57        203.00000        209.00000
  58        179.00000        205.00000
  59        170.00000        201.00000
  60        148.00000        149.00000
  61        138.00000        159.00000
  62        232.00000        230.00000
  63        223.00000        198.00000
  64        151.00000        161.00000
  65        142.00000        147.00000
  66        167.00000        176.00000
  67        210.00000        320.00000
  68        240.00000        267.00000
  69        245.00000        221.00000
  70        263.00000        247.00000
  71        263.00000        293.00000
  72        182.00000        211.00000
  73        261.00000        178.00000
  74        280.00000        320.00000
  75        264.00000        266.00000
  76        187.00000        178.00000
  77        280.00000        199.00000
  78        287.00000        230.00000
  79        230.00000        256.00000
  80        234.00000        272.00000
  81        238.00000        245.00000
  82        202.00000        222.00000
  83        202.00000        245.00000
  84        317.00000        243.00000
  85        293.00000        264.00000
  86        215.00000        215.00000
  87        171.00000        172.00000
  88        242.00000        233.00000

Option number = ?

SELECT ANY KEY







                                    Exit list procedure
                                    Select special function key labeled-ADV. STAT
                                    Remove BSDM media
                                    Insert General Statistics
Enter number of desired function:

3                                   Choose two paired sample analyses

VARIABLE NUMBER FOR X =?

1

VARIABLE NUMBER FOR Y =?

2

********************************************************************************

PAIRED SAMPLE TESTS

VARIABLE FOR X -- VARIABLE#1
VARIABLE FOR Y -- VARIABLE#2

********************************************************************************

Enter desired function:
1                                   Choose paired t-test

PAIRED-t TEST SAMPLE SIZE IS 88

1 OR 2 TAILED?

1

HO : MU(X)-MU(Y) =
0                                   Specify zero difference

1 TAILED TEST
HO : MU(X)-MU(Y) = 0
H1 : MU(X)-MU(Y) < 0

LEVEL OF SIGNIFICANCE
.05                                 Specify α = .05

T VALUE = -.736

DF = 87

T( 0.9500, 87 ) = 1.663

DO NOT REJECT HO AT .05 LEVEL OF SIGNIFICANCE

ANOTHER PAIRED-t TEST ON THIS DATA?

NO

Enter desired function:
2                                   Choose cross correlation

CROSS CORRELATION SAMPLE SIZE IS 88






LAG ON X OR Y?
Y

LAG ON Y=
?
1                                   Lag of 1 on y

LAG ON Y = 1 COEFF. = .85126

ANOTHER CROSS CORRELATION?
YES

LAG ON X OR Y?
Y

LAG ON Y=
?
2                                   Try lag of 2

LAG ON Y = 2 COEFF. = .82534

ANOTHER CROSS CORRELATION?
YES

LAG ON X OR Y?
Y

LAG ON Y=
?
3                                   Try lag of 3

LAG ON Y = 3 COEFF. = .88230

ANOTHER CROSS CORRELATION?
YES

LAG ON X OR Y?
Y

LAG ON Y=
?
22                                  Try lag of 22

LAG ON Y = 22 COEFF. = .89051

ANOTHER CROSS CORRELATION?
NO

Enter desired function:
3                                   Choose family regression

FAMILY REGRESSION / AOV SAMPLE SIZE IS 88






REGRESSION CODE =?
1                                   Choose linear regression
                                    Y=A+BX+E

AOV OF LINEAR REGRESSION
Y = A + BX

SOURCE          SS            DF     MS            F RATIO
REG             481475.711     1     481475.711    581.18
RES              71246.789    86        828.451
TOTAL COR       552722.500    87

R SQUARED = .8711

YHAT = ( 10.129409002 ) + ( .943467866544 )X



EVALUATE Y AT X ? 

YES 

AT ALL X(I)'S ? 

YES 



Y EVALUATED AT X                    Table of predicted values and residuals

   I      X(I)        YHAT          Y(I)         RES(I)
   1     54.000      61.0767      46.00000      15.07667
   2     44.000      51.6420      42.00000       9.64200
   3     46.000      53.5289      44.00000       9.52893
   4     54.000      61.0767      44.00000      17.07667
   5     45.000      52.5855      45.00000       7.58546
   6     46.000      53.5289      52.00000       1.52893
   7     50.000      57.3028      51.00000       6.30280
   8     43.000      50.6985      55.00000       4.30147
   9     47.000      54.4724      60.00000       5.52760
  10     40.000      47.8681      43.00000       4.86812
  11     40.000      47.8681      20.00000      27.86812
  12     46.000      53.5289      48.00000       5.52893
  13     52.000      59.1897      54.00000       5.18974
  14     50.000      57.3028      55.00000       2.30280
  15     54.000      61.0767      62.00000        .92333
  16     49.000      56.3593      41.00000      15.35933
  17     30.000      38.4334      48.00000       9.56656
  18     50.000      57.3028      45.00000      12.30280
  19     48.000      55.4159      46.00000       9.41587
  20     38.000      45.9812      31.00000      14.98119
  21     27.000      35.6030      35.00000        .60304
  22     50.000      57.3028      59.00000       1.69720
  23    107.000     111.0805     135.00000      23.91953
  24     77.000      82.7764      90.00000       7.22357
  25     91.000      95.9850      98.00000       2.01502
  26     88.000      93.1546      98.00000       4.84542
  27     93.000      97.8719      96.00000       1.87192
  28     89.000      94.0980      74.00000      20.09805
  29     95.000      99.7589      98.00000       1.75886
  30    105.000     109.1935     133.00000      23.80647
  31    107.000     111.0805     126.00000      14.91953
  32     95.000      99.7589      91.00000       8.75886
  33    114.000     117.6847      52.00000      65.68475
  34    128.000     130.8933      98.00000      32.89330
  35    110.000     113.9109     119.00000       5.08913
  36    104.000     108.2501     105.00000       3.25007
  37     94.000      98.8154     110.00000      11.18461
  38     87.000      92.2111      81.00000      11.21111
  39     66.000      72.3983      83.00000      10.60171
  40     96.000     100.7023     112.00000      11.29768
  41    120.000     123.3456     104.00000      19.34555
  42     90.000      95.0415     101.00000       5.95848
  43     95.000      99.7589      88.00000      11.75886
  44     86.000      91.2676      86.00000       5.26765
  45    158.000     159.1973     221.00000      61.80267
  46    125.000     128.0629     176.00000      47.93711
  47    149.000     150.7061     150.00000        .70612
  48    175.000     175.2363     176.00000        .76371
  49    196.000     195.0491     209.00000      13.95089
  50    121.000     124.2890     118.00000       6.28902
  51    181.000     180.8971     180.00000        .89709
  52    201.000     199.7665     238.00000      38.23355
  53    175.000     175.2363     196.00000      20.76371
  54    147.000     148.8192     138.00000      10.81919
  55    209.000     207.3142     133.00000      74.31419
  56    194.000     193.1622     159.00000      34.16218
  57    203.000     201.6534     209.00000       7.34661
  58    179.000     179.0102     205.00000      25.98984
  59    170.000     170.5189     201.00000      30.48105
  60    148.000     149.7627     149.00000        .76265
  61    138.000     140.3280     159.00000      18.67203
  62    232.000     229.0140     230.00000        .98605
  63    223.000     220.5227     198.00000      22.52274
  64    151.000     152.5931     161.00000       8.40694
  65    142.000     144.1018     147.00000       2.89815
  66    167.000     167.6885     176.00000       8.31146
  67    210.000     208.2577     320.00000     111.74234
  68    240.000     236.5617     267.00000      30.43830
  69    245.000     241.2790     221.00000      20.27904
  70    263.000     258.2615     247.00000      11.26146
  71    263.000     258.2615     293.00000      34.73854
  72    182.000     181.8406     211.00000      29.15944
  73    261.000     256.3745     178.00000      78.37452
  74    280.000     274.3004     320.00000      45.69959
  75    264.000     259.2049     266.00000       6.79507
  76    187.000     186.5579     178.00000       8.55790
  77    280.000     274.3004     199.00000      75.30041
  78    287.000     280.9047     230.00000      50.90469
  79    230.000     227.1270     256.00000      28.87298
  80    234.000     230.9009     272.00000      41.09911
  81    238.000     234.6748     245.00000      10.32524
  82    202.000     200.7099     222.00000      21.29008
  83    202.000     200.7099     245.00000      44.29008
  84    317.000     309.2087     243.00000      66.20872
  85    293.000     286.5655     264.00000      22.56549
  86    215.000     212.9750     215.00000       2.02500
  87    171.000     171.4624     172.00000        .53759
  88    242.000     238.4486     233.00000       5.44863



REGRESSION CODE =? 







Enter desired function: 
10 

Enter number of desired function: 
6 



Exit family regression 



Exit two-paired sample test. 



Return to BSDM 






Bus Passenger Service Time 

The time required to service passengers boarding at a bus stop was measured together with
the actual number of passengers boarding. The service time was recorded from the moment
that the bus stopped and the door opened until the last passenger boarded the bus. The
objective is to determine a model for predicting passenger service time, given knowledge of
the number boarding at a particular stop. Let X = number boarding and Y = passenger
service time. The following data was gathered during the month of May [year illegible] at twelve
downtown locations in Louisville, Kentucky.
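
The linear model Y = A + BX for this example can be fitted by ordinary least squares. The sketch below uses the 31 observations from the listing that follows and the textbook sums-of-squares formulas; it is an illustration of the calculation, with hypothetical variable names, rather than the program's own code.

```python
# X = number boarding, Y = passenger service time, from the listing below
number = [1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 4, 5, 5, 6,
          6, 6, 7, 7, 8, 8, 8, 9, 10, 11, 11, 13, 17, 19, 25]
service_time = [1.4, 2.8, 3.0, 1.8, 2.0, 4.7, 8.0, 3.0, 2.5, 5.2, 6.2,
                9.4, 11.7, 7.5, 11.9, 13.6, 12.4, 11.6, 14.7, 13.5, 12.0,
                14.1, 26.0, 19.0, 21.2, 22.9, 22.6, 25.2, 33.5, 33.7, 54.2]

n = len(number)
mean_x = sum(number) / n
mean_y = sum(service_time) / n

# Corrected sums of squares and cross products
sxx = sum((x - mean_x) ** 2 for x in number)
syy = sum((y - mean_y) ** 2 for y in service_time)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(number, service_time))

slope = sxy / sxx                      # B
intercept = mean_y - slope * mean_x    # A

ss_reg = slope * sxy                   # regression sum of squares
ss_res = syy - ss_reg                  # residual sum of squares
r_squared = ss_reg / syy
f_ratio = ss_reg / (ss_res / (n - 2))

print(f"YHAT = {intercept:.6f} + {slope:.6f} X")
print(f"R squared = {r_squared:.4f}, F ratio = {f_ratio:.2f}")
```

The slope of roughly 2 suggests each additional passenger adds about two time units to the service time, with a small fixed overhead given by the intercept.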



* DATA MANIPULATION *

********************************************************************************

Enter DATA TYPE (Press CONTINUE for RAW DATA):

1                                   Raw data

Mode number = ?

2                                   From mass storage

Is data stored on program's scratch file (DATA)?

NO

Data file name = ?

BUSTIME:INTERNAL

Was data stored by the BS&DM system ?

YES

Is data medium placed in device INTERNAL

?

YES

Is program medium placed in correct device ?

YES



BUS PASSENGER SERVICE TIME

Data file name: BUSTIME:INTERNAL
Data type is: Raw data

Number of observations: 31
Number of variables: 2

Variable names:
1. NUMBER
2. TIME

Subfiles: NONE

SELECT ANY KEY                      Choose special function key labeled-LIST

Option number = ?

1

Enter method for listing data:

3                                   List all data



BUS PASSENGER SERVICE TIME
Data type is: Raw data

           Variable # 1     Variable # 2
           (NUMBER    )     (TIME      )
OBS#
   1          1.00000          1.40000
   2          1.00000          2.80000
   3          1.00000          3.00000
   4          1.00000          1.80000
   5          1.00000          2.00000
   6          2.00000          4.70000
   7          2.00000          8.00000
   8          2.00000          3.00000
   9          2.00000          2.50000
  10          3.00000          5.20000
  11          3.00000          6.20000
  12          3.00000          9.40000
  13          4.00000         11.70000
  14          5.00000          7.50000
  15          5.00000         11.90000
  16          6.00000         13.60000
  17          6.00000         12.40000
  18          6.00000         11.60000
  19          7.00000         14.70000
  20          7.00000         13.50000
  21          8.00000         12.00000
  22          8.00000         14.10000
  23          8.00000         26.00000
  24          9.00000         19.00000
  25         10.00000         21.20000
  26         11.00000         22.90000
  27         11.00000         22.60000
  28         13.00000         25.20000
  29         17.00000         33.50000
  30         19.00000         33.70000
  31         25.00000         54.20000



Option number = ?
                                    Exit list procedure
SELECT ANY KEY
                                    Choose special function key labeled-ADV. STAT
                                    Remove BSDM media
                                    Insert General Statistics media

Enter number of desired function:
3                                   Choose two paired sample test

VARIABLE NUMBER FOR X =?

1

VARIABLE NUMBER FOR Y =?

2

PAIRED SAMPLE TESTS

VARIABLE FOR X -- NUMBER
VARIABLE FOR Y -- TIME

********************************************************************************

Enter desired function:
3                                   Choose family regression






FAMILY REGRESSION / AOV SAMPLE SIZE IS 31

REGRESSION CODE =?
1                                   Linear regression
                                    Y=A+BX+E

AOV OF LINEAR REGRESSION
Y = A + BX

SOURCE          SS          DF     MS          F RATIO
REG             3970.237     1     3970.237    543.72
RES              211.758    29        7.302
TOTAL COR       4181.995    30

R SQUARED = .9494                   Not bad!

YHAT = ( .586330097087 ) + ( 1.99576699029 )X



EVALUATE Y AT X ? 

YES 

AT ALL X(I)'S ? 

YES 



Y EVALUATED AT X

   I      X(I)        YHAT          Y(I)         RES(I)
   1      1.000       2.5821       1.40000       1.18210
   2      1.000       2.5821       2.80000        .21790
   3      1.000       2.5821       3.00000        .41790
   4      1.000       2.5821       1.80000        .78210
   5      1.000       2.5821       2.00000        .58210
   6      2.000       4.5779       4.70000        .12214
   7      2.000       4.5779       8.00000       3.42214
   8      2.000       4.5779       3.00000       1.57786
   9      2.000       4.5779       2.50000       2.07786
  10      3.000       6.5736       5.20000       1.37363
  11      3.000       6.5736       6.20000        .37363
  12      3.000       6.5736       9.40000       2.82637
  13      4.000       8.5694      11.70000       3.13060
  14      5.000      10.5652       7.50000       3.06517
  15      5.000      10.5652      11.90000       1.33483
  16      6.000      12.5609      13.60000       1.03907
  17      6.000      12.5609      12.40000        .16093
  18      6.000      12.5609      11.60000        .96093
  19      7.000      14.5567      14.70000        .14330
  20      7.000      14.5567      13.50000       1.05670
  21      8.000      16.5525      12.00000       4.55247
  22      8.000      16.5525      14.10000       2.45247
  23      8.000      16.5525      26.00000       9.44753
  24      9.000      18.5482      19.00000        .45177
  25     10.000      20.5440      21.20000        .65600
  26     11.000      22.5398      22.90000        .36023
  27     11.000      22.5398      22.60000        .06023
  28     13.000      26.5313      25.20000       1.33130
  29     17.000      34.5144      33.50000       1.01437
  30     19.000      38.5059      33.70000       4.80590
  31     25.000      50.4805      54.20000       3.71950



REGRESSION CODE 







Exit family regression 






Enter desired function: 
10 

Enter number of desired function 
6 



Exit two paired sample tests 
Return to BSDM 



Example #3 

This example is included for your convenience as a sample problem so that you may check 
your operation of the routines involved. 
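
For readers who want to check the printed results independently, the sign test and Wilcoxon signed rank statistics can be recomputed from the twelve pairs listed below. This sketch assumes the usual large-sample normal approximations without continuity or tie corrections; the helper `average_ranks` is a hypothetical name introduced for the example.

```python
from math import sqrt

x = [86, 71, 77, 68, 91, 72, 77, 91, 70, 71, 88, 87]
y = [88, 77, 76, 64, 96, 72, 65, 90, 65, 80, 81, 72]

# Drop the pairs where X(I) = Y(I), as the program does
diffs = [a - b for a, b in zip(x, y) if a != b]
n = len(diffs)

# Sign test: number of positive differences against Binomial(n, 1/2)
positives = sum(d > 0 for d in diffs)
z_sign = (positives - n / 2) / sqrt(n / 4)


def average_ranks(values):
    """Rank values from 1, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


# Wilcoxon signed rank: sum of ranks of |d| over the positive differences
ranks = average_ranks([abs(d) for d in diffs])
w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
mean_w = n * (n + 1) / 4
sd_w = sqrt(n * (n + 1) * (2 * n + 1) / 24)
z_wilcoxon = (w_plus - mean_w) / sd_w

print(f"positive differences = {positives} of {n}, z = {z_sign:.5f}")
print(f"sum of positive ranks = {w_plus}, z = {z_wilcoxon:.5f}")
```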



* DATA MANIPULATION *

Enter DATA TYPE (Press CONTINUE for RAW DATA):

1                                   Raw data

Mode number = ?

2                                   On mass storage

Is data stored on program's scratch file (DATA)?

NO

Data file name = ?

TWONP:INTERNAL

Was data stored by the BS&DM system ?

YES

Is data medium placed in device INTERNAL

?

YES

Is program medium placed in correct device ?

YES



TWO SAMPLE NONPARAMETRIC STATISTICS

Data file name: TWONP:INTERNAL

Data type is: Raw data

Number of observations: 12
Number of variables: 2

Variable names:
1. X(I)
2. Y(I)

Subfiles: NONE

SELECT ANY KEY                      Select special function key labeled-LIST

Option number = ?

1

Enter method for listing data:

3                                   List all data






TWO SAMPLE NONPARAMETRIC STATISTICS
Data type is: Raw data

           Variable # 1     Variable # 2
           (X(I)      )     (Y(I)      )
OBS#
   1         86.00000         88.00000
   2         71.00000         77.00000
   3         77.00000         76.00000
   4         68.00000         64.00000
   5         91.00000         96.00000
   6         72.00000         72.00000
   7         77.00000         65.00000
   8         91.00000         90.00000
   9         70.00000         65.00000
  10         71.00000         80.00000
  11         88.00000         81.00000
  12         87.00000         72.00000

Option number = ?

SELECT ANY KEY





                                    Exit list procedure
                                    Select special function key labeled-ADV. STAT.
                                    Remove BSDM media
                                    Insert General Statistics media

Enter number of desired function:
3                                   Select two paired sample test

VARIABLE NUMBER FOR X =?

1

VARIABLE NUMBER FOR Y =?

2

********************************************************************************

PAIRED SAMPLE TESTS

VARIABLE FOR X -- X(I)
VARIABLE FOR Y -- Y(I)

********************************************************************************

Enter desired function:
4                                   Select sign test

SIGN TEST     SAMPLE SIZE IS 12

NUMBER OF POSITIVE DIFFERENCES = 7

(THE 1 POINTS WHERE X(I)=Y(I) ARE EXCLUDED FROM THE TEST)

NUMBER OF OBSERVATIONS USED = 11

YIELDS AN APPROX. STD. NOR. DEV. = .90453



No real differences 
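The sign test figures above can be checked by hand. The following is a minimal sketch, assuming the program uses the usual normal approximation to the binomial without a continuity correction (the manual does not state the formula); the function name `sign_test` is illustrative only.

```python
import math

# Paired data from the listing above: X(I) and Y(I).
x = [86, 71, 77, 68, 91, 72, 77, 91, 70, 71, 88, 87]
y = [88, 77, 76, 64, 96, 72, 65, 90, 65, 80, 81, 72]

def sign_test(x, y):
    """Return (positive differences, observations used, normal deviate)."""
    diffs = [a - b for a, b in zip(x, y) if a != b]  # pairs with X = Y are excluded
    n = len(diffs)
    positives = sum(1 for d in diffs if d > 0)
    # Under the null hypothesis the count of positives is Binomial(n, 1/2),
    # with mean n/2 and variance n/4.
    z = (positives - n / 2) / math.sqrt(n / 4)
    return positives, n, z

positives, n_used, z = sign_test(x, y)
print(positives, n_used, round(z, 5))
```

With the twelve pairs listed, this reproduces the 7 positive differences, the 11 observations used and the deviate of .90453 shown in the printout.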



Enter desired function: 

5 



Select Wilcoxon Signed Rank test 



WILCOXON SIGNED RANK 



SAMPLE SIZE IS 12

SUM OF POSITIVE RANKS = 41.5

(USING RANKS OF X(I)-Y(I) AND EXCLUDING THE 1
POINTS WHERE X(I)=Y(I))
NUMBER OF OBSERVATIONS USED = 11

YIELDS APPROXIMATE STANDARD NORMAL DEVIATES

1) WITHOUT CORRECTION FOR CONTINUITY :

A) NOT COMPENSATING FOR TIED DIFFERENCES : .75574

B) CONDITIONAL ON THE EXISTING TIED DIFFERENCES : .75649

2) WITH CORRECTION FOR CONTINUITY :

A) NOT COMPENSATING FOR TIED DIFFERENCES : .71129

B) CONDITIONAL ON THE EXISTING TIED DIFFERENCES : .71199

Confirms no differences 
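The four Wilcoxon deviates can be reproduced from the listed data. This is a sketch under stated assumptions: midranks for tied absolute differences, the textbook mean n(n+1)/4 and variance n(n+1)(2n+1)/24, a tie adjustment of sum(t^3 - t)/48, and a continuity correction of 0.5. The manual does not print its formulas, so the helper names and these choices are assumptions that happen to match the printout.

```python
import math
from collections import Counter

x = [86, 71, 77, 68, 91, 72, 77, 91, 70, 71, 88, 87]
y = [88, 77, 76, 64, 96, 72, 65, 90, 65, 80, 81, 72]

def midranks(values):
    """Rank from 1..n; tied values share the average of their positions."""
    ordered = sorted(values)
    counts = Counter(ordered)
    first_pos = {}
    for pos, v in enumerate(ordered, start=1):
        first_pos.setdefault(v, pos)
    return [first_pos[v] + (counts[v] - 1) / 2 for v in values]

def wilcoxon_signed_rank(x, y):
    diffs = [a - b for a, b in zip(x, y) if a != b]  # zero differences dropped
    n = len(diffs)
    abs_d = [abs(d) for d in diffs]
    ranks = midranks(abs_d)
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    var = n * (n + 1) * (2 * n + 1) / 24
    tie_adj = sum(t**3 - t for t in Counter(abs_d).values()) / 48
    dev = w_plus - mean
    dev_cc = math.copysign(max(abs(dev) - 0.5, 0.0), dev)  # continuity correction
    return {
        "w_plus": w_plus,
        "n": n,
        "z_plain": dev / math.sqrt(var),
        "z_ties": dev / math.sqrt(var - tie_adj),
        "z_cc_plain": dev_cc / math.sqrt(var),
        "z_cc_ties": dev_cc / math.sqrt(var - tie_adj),
    }

result = wilcoxon_signed_rank(x, y)
for key, value in result.items():
    print(key, round(value, 5))
```

The sum of positive ranks comes out as 41.5, and the four deviates agree with the printed .75574, .75649, .71129 and .71199 to five places.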



Enter desired function = 
6 



Select Taha's higher power signed rank test 



HIGHER POWERED SIGNED RANKS 



SAMPLE SIZE IS 



POWER OF THE RANK (MUST BE 2, 3, 4, OR 5) 
2 

POWER OF THE RANK IS 2 



SUM OF POSITIVE RANKS SQUARED = 335.75

(USING RANKS OF X(I)-Y(I) AND EXCLUDING THE 1
POINTS WHERE X(I)=Y(I))
NUMBER OF OBSERVATIONS USED = 11

YIELDS AN APPROX. STD. NOR. DEV. OF .8284
CONDITIONAL ON THE EXISTING TIES AND
WITHOUT A CORRECTION FOR CONTINUITY



Enter desired function: 

7 



Again no difference 



Select Spearman Rank Correlation 



SPEARMAN'S RHO 



SAMPLE SIZE IS 



12 



SUM OF SQUARED RANK DIFFERENCES = 75 
RHO = .73776 



Seems to indicate that X & Y are related 
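As a cross-check on the Spearman output, the sketch below ranks each variable separately (midranks for ties) and applies the simple formula rho = 1 - 6*sum(d^2)/(n(n^2-1)). Using the simple formula without a tie correction is an assumption; it is chosen because it reproduces the printed value.

```python
from collections import Counter

x = [86, 71, 77, 68, 91, 72, 77, 91, 70, 71, 88, 87]
y = [88, 77, 76, 64, 96, 72, 65, 90, 65, 80, 81, 72]

def midranks(values):
    """Rank from 1..n; tied values share the average of their positions."""
    ordered = sorted(values)
    counts = Counter(ordered)
    first_pos = {}
    for pos, v in enumerate(ordered, start=1):
        first_pos.setdefault(v, pos)
    return [first_pos[v] + (counts[v] - 1) / 2 for v in values]

rank_x = midranks(x)
rank_y = midranks(y)
n = len(x)

# Sum of squared differences between the two sets of ranks.
sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
rho = 1 - 6 * sum_d2 / (n * (n * n - 1))
print(sum_d2, round(rho, 5))
```

For the listed data the sum of squared rank differences is 75 and rho rounds to .73776, as printed.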



Enter desired function:
8 



Select Kendall's Tau test 



KENDALL'S TAU SAMPLE SIZE IS 12 



NUMBER OF CONCORDANT PAIRS = 49 
NUMBER OF DISCORDANT PAIRS = 12 

TAU = .56061 



Also indicates X & Y are related 
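The concordant and discordant counts can be verified by brute force over all pairs of observations. This sketch assumes that pairs tied on X or on Y are counted as neither, and that tau divides by the total number of pairs n(n-1)/2; both assumptions are consistent with the printed figures but are not stated in the manual.

```python
from itertools import combinations

x = [86, 71, 77, 68, 91, 72, 77, 91, 70, 71, 88, 87]
y = [88, 77, 76, 64, 96, 72, 65, 90, 65, 80, 81, 72]

concordant = 0
discordant = 0
for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
    sign = (x1 - x2) * (y1 - y2)
    if sign > 0:
        concordant += 1      # X and Y move in the same direction
    elif sign < 0:
        discordant += 1      # X and Y move in opposite directions
    # sign == 0: tied on X or Y, counted as neither

n = len(x)
tau = (concordant - discordant) / (n * (n - 1) / 2)
print(concordant, discordant, round(tau, 5))
```

Of the 66 possible pairs, 49 are concordant, 12 are discordant and 5 are tied, giving tau = 37/66, which rounds to .56061.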






Enter desired function: 
10 

Enter number of desired function:
6 



Exit two paired sample tests 



Return to BSDM 



Examples on Two Independent Samples 

Example 1 

The following is an example of a two-sample t-test. 



******************************************************************************** 

* DATA MANIPULATION * 

******************************************* 

Enter DATA TYPE (Press CONTINUE for RAW DATA): 

1

Mode number = ?



Is data stored on program's scratch file (DATA)?

NO

Data file name = ?

ANEXMP2:INTERNAL

Was data stored by the BS&DM system ?

YES

Is data medium placed in device INTERNAL

?

YES

Is program medium placed in correct device ?

YES 



Raw data 

On mass storage 



ANOTHER EXPAMLE 

Data file name: ANEXMP2:INTERNAL

Data type is: Raw data

Number of observations: 13
Number of variables: 1



Variable names:
1. MEANS

Subfile name     beginning observation     number of observations
1. FIRST PART             1                          6
2. SEC. PART              7                          7



SELECT ANY KEY 

Option number = ?
1



Select special function key labeled-LIST 
List data 



ANOTHER EXPAMLE 
Data type is: Raw data 



VARIABLE # 1 (MEANS)

   I     OBS(I)    OBS(I+1)   OBS(I+2)   OBS(I+3)   OBS(I+4)
   1    2.00000    3.00000    4.00000    2.00000    3.00000
   6    4.00000    5.00000    4.00000    2.00000    2.00000
  11    6.00000    3.00000    7.00000

Option number = ?

SELECT ANY KEY

Enter number of desired function:

Exit list procedure 

Select special function key labeled-ADV. STAT 

Remove BSDM media 

Insert General Statistics media 

Select two independent sample test 



VARIABLE NUMBER =?

1

********************************************************************************

TWO INDEPENDENT SAMPLE TESTS

VARIABLE — MEANS 

SUBFILE NUMBER FOR THE 'X' DATA? 

1

X SUBFILE — FIRST PART 

SUBFILE NUMBER FOR THE 'Y' DATA? 

2 

Y SUBFILE — SEC. PART 

************************************************** 

Enter desired function: 

* Select two sample t-test 

TWO SAMPLE t TEST 



SAMPLE 1

N = 6
MEAN = 3.000000
VARIANCE = .800000
COEFF. OF VARIANCE = 29.814240
STD. DEV. = .894427

SAMPLE 2

N = 7
MEAN = 4.142857
VARIANCE = 3.809524
COEFF. OF VARIANCE = 47.112417
STD. DEV. = 1.951800



t = 1.3147 WITH DF = 11
PROB (t > 1.3147) = .10769
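The summary statistics and the t value can be recomputed from the two subfiles. The sketch below assumes a pooled-variance (equal variance) t-test, which is consistent with the reported 11 degrees of freedom, and reports the absolute value of t as the printout does. The tail probability is not computed here because it needs the t distribution, which the Python standard library does not provide.

```python
import math
import statistics

# Subfile FIRST PART (observations 1-6) and SEC. PART (observations 7-13).
first_part = [2, 3, 4, 2, 3, 4]
sec_part = [5, 4, 2, 2, 6, 3, 7]

def describe(sample):
    """Mean, sample variance, standard deviation and coefficient of variation (%)."""
    mean = statistics.mean(sample)
    var = statistics.variance(sample)      # divisor n - 1
    sd = math.sqrt(var)
    return mean, var, sd, 100 * sd / mean

def pooled_t(a, b):
    """Two-sample t statistic with pooled variance; returns (|t|, df)."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * statistics.variance(a)
              + (nb - 1) * statistics.variance(b)) / df
    se = math.sqrt(pooled * (1 / na + 1 / nb))
    t = abs(statistics.mean(a) - statistics.mean(b)) / se
    return t, df

stats_1 = describe(first_part)
stats_2 = describe(sec_part)
t, df = pooled_t(first_part, sec_part)
print([round(v, 6) for v in stats_1])
print([round(v, 6) for v in stats_2])
print(round(t, 4), df)
```

The first sample gives a mean of 3, a variance of .8 and a coefficient of about 29.81 percent; the second gives a mean of about 4.142857 and a variance of about 3.809524; t is about 1.3147 on 11 degrees of freedom.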

Enter desired function: 
8 

Enter number of desired function: 
6 



Exit two sample tests 



Return to BSDM 






Example 2 

A cloud seeding experiment was performed using 10 seeded and 16 nonseeded days. The
amount of rainfall, in inches, was recorded for the seeded (X) and nonseeded (Y) cases.

Three tests to see if the median rainfall was identical were performed, none of which indicates 
that the two medians differ significantly. 

Taha's squared rank test was performed, since it was assumed that greater precipitation 
amounts are more important, and should therefore be weighted more heavily in this type of 
experiment. 



************************************************ 

* DATA MANIPULATION * 

******************************************************************************** 

Enter DATA TYPE (Press CONTINUE for RAW DATA): 

1 Raw data

Mode number = ?



On mass storage



Is data stored on program's scratch file (DATA)?

NO

Data file name = ?

CLOUD:INTERNAL

Was data stored by the BS&DM system ? 

YES 

Is data medium placed in device INTERNAL 

? 

YES 

Is program medium placed in correct device ? 

YES 



CLOUD 

Data file name: CLOUD:INTERNAL

Data type is: Raw data

Number of observations: 26
Number of variables: 1

Variable names:
1. DAYS

Subfile name     beginning observation     number of observations
1. SEEDED                 1                         10
2. NONSEEDED             11                         16

SELECT ANY KEY 

Select special function key labeled-LIST 

Option number = ? 

1 List all data

CLOUD



Data type is: Raw data



VARIABLE # 1 (DAYS)

   I     OBS(I)   OBS(I+1)   OBS(I+2)   OBS(I+3)   OBS(I+4)
   1     .05000     .72000     .69000     .09000     .04000
   6     .62000     .37000     .23000    1.18000     .26000
  11     .18000     .88000     .12000     .74000     .43000
  16     .10000     .65000     .06000     .09000     .41000
  21     .12000     .41000     .05000     .03000     .32000
  26     .05000

Option number = ?

SELECT ANY KEY



Enter number of desired function:



Select special function key labeled-ADV. STAT 

Remove BSDM media 
Insert General Statistics 

Select 2 independent sample test 



VARIABLE NUMBER =? 
1 

TWO INDEPENDENT SAMPLE TESTS 

VARIABLE — DAYS 

SUBFILE NUMBER FOR THE 'X' DATA? 

1 

X SUBFILE — SEEDED 

SUBFILE NUMBER FOR THE 'Y' DATA? 



Y SUBFILE — 



NONSEEDED 



Select median test 



Enter desired function 

•? 



MEDIAN TESTS 



DO YOU WANT THE COMBINED RANKS PRINTED? 
YES 

COMBINED RANKS
   I    FOR X(I)    FOR Y(I)
   1     4.0000     12.0000
   2    23.0000     25.0000
   3    22.0000     10.5000
   4     7.5000     24.0000
   5     2.0000     19.0000
   6    20.0000      9.0000
   7    16.0000     21.0000
   8    13.0000      6.0000
   9    26.0000      7.5000
  10    14.0000     17.5000
  11                10.5000
  12                17.5000
  13                 4.0000
  14                 1.0000
  15                15.0000
  16                 4.0000

Both data sets are combined and then ranked
from smallest to largest. Tied ranks are
assigned to identical data values.






I) TEST STATISTIC, T = 2
YIELDS A STD. NOR. DEV. OF .2894
CONDITIONAL ON THE 5 EXISTING TIES

Useful for large samples. Since the values are small, do not reject the hypothesis of no
differences between X and Y.



II) CONTINGENCY TABLE ANALYSIS

                       X       Y     TOTAL

# OF OBS. >            6       7       13
GRAND MEDIAN

# OF OBS. <=           4       9       13
GRAND MEDIAN

TOTAL                 10      16       26

1) YIELDS AN APPROXIMATE CHI-SQUARE VALUE WITH 1 DF OF

A) USING YATES' CORRECTION FOR CONTINUITY : .16250

B) WITHOUT CORRECTION FOR CONTINUITY : .65000



2) FISHER'S EXACT PROBABILITY OF THE
EXISTING CELL FREQUENCIES OR WORSE : .34408



All three values for the two by two table conclude no difference between
X's and Y's for middle value.
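The three figures from the two by two table can be recomputed directly from the cell counts. This is a sketch, not the program's code: it assumes the standard chi-square shortcut formula for a 2 x 2 table, Yates' correction of N/2, and that "or worse" means the one-sided upper tail of the hypergeometric distribution for the first cell. The function names are illustrative.

```python
import math

def chi_square_2x2(a, b, c, d, yates=False):
    """Chi-square (1 DF) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(diff - n / 2, 0)        # Yates' continuity correction
    return n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

def fisher_upper_tail(a, b, c, d):
    """P(first cell >= a) with all margins fixed (hypergeometric)."""
    row1, col1, n = a + b, a + c, a + b + c + d
    total = math.comb(n, col1)
    upper = min(row1, col1)
    return sum(math.comb(row1, k) * math.comb(n - row1, col1 - k)
               for k in range(a, upper + 1)) / total

# Counts above / at-or-below the grand median for X (seeded) and Y (nonseeded).
a, b, c, d = 6, 7, 4, 9
chi2_yates = chi_square_2x2(a, b, c, d, yates=True)
chi2_plain = chi_square_2x2(a, b, c, d)
p_fisher = fisher_upper_tail(a, b, c, d)
print(round(chi2_yates, 5), round(chi2_plain, 5), round(p_fisher, 5))
```

These come out as .1625, .65 and about .34408, in agreement with the printout.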



Enter desired function 
3 



Select Mann-Whitney test 



MANN-WHITNEY TEST 



DO YOU WANT THE COMBINED RANKS PRINTED? 
NO 

SUM OF THE RANKS OF X = 147.5

YIELDS AN APPROX. STD. NOR. DEV. OF : .6583
CONDITIONAL ON THE 5 EXISTING TIES



Designed to see if X's differ from Y's. Conclude they do not.
For large sample sizes.



Enter desired function:
4 



Select Taha's squared rank 



TAHA'S SQUARED RANK 



DO YOU WANT THE COMBINED RANKS PRINTED? 
NO 

SUM OF X RANKS SQUARED = 2786.25

YIELDS AN APPROX. STD. NOR. DEV. OF : .7605
CONDITIONAL ON THE 5 EXISTING TIES



Useful to see if X's differ from Y's in spread of
data sets. Conclude they do not.



Enter desired function: 
8 



Exit from two independent sample tests 






Enter number of desired function:

6 Return to BSDM 



Example 3 

An investigator is interested in whether there is a significant difference in the time required to 
pace himself for one mile between a near sea level location and a high altitude location. 

Forty five low altitude observations (Y) and forty high altitude observations (X) were col- 
lected. It was decided to test whether the two populations from which the investigator sam- 
pled have the same distribution. 

Both the Cramer-Von Mises and Kolmogorov-Smirnov tests were performed, neither of which 
indicates that there is a significant difference between low altitude and high altitude pacing. 



* DATA MANIPULATION * 

Enter DATA TYPE (Press CONTINUE for RAW DATA) : 

1 Raw data

Mode number = ?



On mass storage



Is data stored on program's scratch file (DATA)?

NO

Data file name = ?

ALTITUDE:INTERNAL

Was data stored by the BS&DM system ?

YES

Is data medium placed in device INTERNAL

? 

YES 

Is program medium placed in correct device ? 

YES 



ALTITUDE 

Data file name: ALTITUDE:INTERNAL

Data type is: Raw data

Number of observations: 85
Number of variables: 1

Variable names:
1. ALTITUDE

Subfile name     beginning observation     number of observations
1. HIGH                   1                         40
2. LOW                   41                         45

SELECT ANY KEY 

Select special function key labeled-LIST 
Option number = ?
1 List all data



ALTITUDE
Data type is: Raw data

VARIABLE # 1 (ALTITUDE)

   I      OBS(I)    OBS(I+1)    OBS(I+2)    OBS(I+3)    OBS(I+4)
   1   405.00000   387.00000   400.00000   392.00000   343.00000
   6   394.00000   366.00000   389.00000   356.00000   380.00000
  11   394.00000   379.00000   359.00000   357.00000   342.00000
  16   367.00000   380.00000   395.00000   442.00000   368.00000
  21   361.00000   361.00000   360.00000   353.00000   361.00000
  26   387.00000   352.00000   385.00000   349.00000   384.00000
  31   351.00000   367.00000   364.00000   363.00000   345.00000
  36   348.00000   360.00000   353.00000   355.00000   353.00000
  41   361.00000   362.00000   359.00000   382.00000   350.00000
  46   392.00000   371.00000   398.00000   400.00000   367.00000
  51   379.00000   370.00000   365.00000   362.00000   355.00000
  56   376.00000   371.00000   369.00000   375.00000   366.00000
  61   373.00000   360.00000   374.00000   412.00000   397.00000
  66   360.00000   364.00000   377.00000   360.00000   450.00000
  71   438.00000   408.00000   380.00000   414.00000   383.00000
  76   386.00000   362.00000   380.00000   377.00000   360.00000
  81   357.00000   393.00000   357.00000   369.00000   373.00000

Option number = ?

SELECT ANY KEY



Enter number of desired function:



Exit list procedure 

Select special function key labeled-ADV. STAT 

Remove BSDM media 

Insert General Statistics 

Select two independent sample test 



VARIABLE NUMBER =? 

1 

TWO INDEPENDENT SAMPLE TESTS 

VARIABLE -- ALTITUDE 

SUBFILE NUMBER FOR THE 'X' DATA? 

1

X SUBFILE — HIGH 

SUBFILE NUMBER FOR THE 'Y' DATA? 



Y SUBFILE 



LOW 



****************************************************** 



Enter desired function:
5 



Select Cramer-Von Mises 



CRAMER-VON MISES 



Hypothesis is that x distribution is the same as y 



SUM OF THE SQUARED DIFFERENCES = .9471
YIELDS A TEST STATISTIC, T = .2359



CRITICAL REGION OF SIZE 0.10 IS FOR T > 0.347
                        0.05 IS FOR T > 0.461
                        0.01 IS FOR T > 0.743



Accept hypothesis 



Enter desired function 
6 



Select Kolmogorov-Smirnov test 



KOLMOGOROV-SMIRNOV

Same hypothesis

MAXIMUM DIFFERENCE, T (IN ABS. VALUE) = .2556

LARGE SAMPLE CRITICAL REGION OF SIZE 0.10 IS FOR T > .2651
                                     0.05 IS FOR T > .2955
                                     0.01 IS FOR T > .3542

Same conclusion

Enter desired function:
8 Exit

Enter number of desired function:

6 Return to BSDM
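The large-sample Kolmogorov-Smirnov critical values in Example 3 can be reproduced with the usual asymptotic two-sample approximation, c * sqrt((m+n)/(m*n)). The coefficients 1.22, 1.36 and 1.63 used below are the standard textbook constants for the 0.10, 0.05 and 0.01 levels; that the program uses exactly these constants is an assumption, supported only by the fact that they reproduce the printed values.

```python
import math

def ks_large_sample_critical(m, n):
    """Approximate two-sample K-S critical values for sample sizes m and n."""
    scale = math.sqrt((m + n) / (m * n))
    coefficients = {0.10: 1.22, 0.05: 1.36, 0.01: 1.63}  # asymptotic constants
    return {alpha: c * scale for alpha, c in coefficients.items()}

# 40 high altitude observations (X) and 45 low altitude observations (Y).
critical = ks_large_sample_critical(40, 45)
observed = 0.2556  # maximum difference reported by the program

for alpha, value in critical.items():
    print(alpha, round(value, 4), "reject" if observed > value else "do not reject")
```

The observed maximum difference of .2556 falls below all three critical values, which is why the hypothesis of identical distributions is not rejected.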

Example On Multiple Sample Data Sets 

1. The following example was run to determine the effect of the addition of different 
sugars on length (in ocular units) of pea sections grown in tissue culture with auxin 
present. The first sample contains the control results, while the other samples contain: 

a. 2% glucose added 

b. 2% fructose added 

c. 1% glucose and 1% fructose added, and 

d. 2% sucrose added. 

After running the one way AOV, a large F value was calculated, indicating there was 
some difference. To determine which samples were different, two multiple comparison 
tests were run. In both the Least Significant Difference and in the Duncan's test, all 
samples differed significantly from the control sample. The Kruskal-Wallis test further 
supports this conclusion. 



* DATA MANIPULATION * 

Enter DATA TYPE (Press CONTINUE for RAW DATA): 

1 Raw data

Mode number = ?



Is data stored on program's scratch file (DATA)?

NO

Data file name = ?

TISSUE:INTERNAL

Was data stored by the BS&DM system ?

YES

Is data medium placed in device INTERNAL

?

YES

Is program medium placed in correct device ?

YES 



TISSUE CULTURE GROWTH 
Data file name: TISSUE:INTERNAL



On mass storage 






Data type is: Raw data

Number of observations: 50
Number of variables: 1

Variable names:
1. GROWTH

Subfile name     beginning observation     number of observations
1. CONTROL                1                         10
2. 2% GLUCOSE            11                         10
3. 2% FRUCT.             21                         10
4. 1%GLU+1FRU            31                         10
5. 2%SUCROSE             41                         10

SELECT ANY KEY

Option number = ?
1



Select special function key labeled-LIST 
List all data 



TISSUE CULTURE GROWTH
Data type is: Raw data

VARIABLE # 1 (GROWTH)

   I     OBS(I)   OBS(I+1)   OBS(I+2)   OBS(I+3)   OBS(I+4)
   1   75.00000   67.00000   70.00000   75.00000   65.00000
   6   71.00000   67.00000   67.00000   76.00000   68.00000
  11   57.00000   58.00000   60.00000   59.00000   62.00000
  16   60.00000   60.00000   57.00000   59.00000   61.00000
  21   58.00000   61.00000   56.00000   58.00000   57.00000
  26   56.00000   61.00000   60.00000   57.00000   58.00000
  31   58.00000   59.00000   58.00000   61.00000   57.00000
  36   56.00000   58.00000   57.00000   57.00000   59.00000
  41   62.00000   66.00000   65.00000   63.00000   64.00000
  46   62.00000   65.00000   65.00000   62.00000   67.00000

Option number = ?

SELECT ANY KEY



Exit list procedure 

Select special function key labeled-ADV. STAT 

Remove BSDM media

Insert General Statistics

Enter number of desired function: 

4 Select three or more samples 

NUMBER OF TREATMENTS =? 
5 

MULTIPLE SAMPLE TESTS 

VARIABLE — GROWTH 

SUBFILE NUMBER FOR TREATMENT # 1 =

?

1

TREATMENT # 1 SUBFILE -- CONTROL

SUBFILE NUMBER FOR TREATMENT # 2 =

?

2

TREATMENT # 2 SUBFILE -- 2% GLUCOSE

SUBFILE NUMBER FOR TREATMENT # 3 =

?

3

TREATMENT # 3 SUBFILE -- 2% FRUCT.

SUBFILE NUMBER FOR TREATMENT # 4 =

?

4

TREATMENT # 4 SUBFILE -- 1%GLU+1FRU

SUBFILE NUMBER FOR TREATMENT # 5 =

?

5

TREATMENT # 5 SUBFILE -- 2%SUCROSE



Specify treatments by subfiles



Enter desired function:
1



Select one-way AOV 



ONE WAY AOV 



TRT # 1

75.00000   67.00000   70.00000   75.00000
65.00000   71.00000   67.00000   67.00000
76.00000   68.00000

TRT # 2

57.00000   58.00000   60.00000   59.00000
62.00000   60.00000   60.00000   57.00000
59.00000   61.00000

TRT # 3

58.00000   61.00000   56.00000   58.00000
57.00000   56.00000   61.00000   60.00000
57.00000   58.00000

TRT # 4

58.00000   59.00000   58.00000   61.00000
57.00000   56.00000   58.00000   57.00000
57.00000   59.00000

TRT # 5

62.00000   66.00000   65.00000   63.00000
64.00000   62.00000   65.00000   65.00000
62.00000   67.00000



TRT #     N      MEAN      VARIANCE    STD DEV    STD ERRORS
  1      10    70.1000     15.8778     3.9847      1.2601
  2      10    59.3000      2.6778     1.6364       .5175
  3      10    58.2000      3.5111     1.8738       .5925
  4      10    58.0000      2.0000     1.4142       .4472
  5      10    64.1000      3.2111     1.7920       .5667



ANALYSIS OF VARIANCE

SOURCE     DF        SS            MS           F
TOTAL      49     1322.8200
TRTS        4     1077.3200     269.3300     49.3680
ERROR      45      245.5000       5.4556

PROB (F > 49.3680) = 0.0000

Treatments differ significantly.
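The one-way AOV table can be rebuilt from the fifty listed observations. The sketch below uses the standard between/within decomposition for a completely randomized design; the dictionary keys are just the subfile names, and the F tail probability is omitted because the standard library has no F distribution.

```python
groups = {
    "CONTROL":    [75, 67, 70, 75, 65, 71, 67, 67, 76, 68],
    "2% GLUCOSE": [57, 58, 60, 59, 62, 60, 60, 57, 59, 61],
    "2% FRUCT.":  [58, 61, 56, 58, 57, 56, 61, 60, 57, 58],
    "1%GLU+1FRU": [58, 59, 58, 61, 57, 56, 58, 57, 57, 59],
    "2%SUCROSE":  [62, 66, 65, 63, 64, 62, 65, 65, 62, 67],
}

def one_way_anova(samples):
    """Return the sums of squares, degrees of freedom, mean squares and F."""
    all_values = [v for s in samples for v in s]
    grand_mean = sum(all_values) / len(all_values)
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    # Between-treatment SS: group size times squared distance of each group mean.
    ss_trt = sum(len(s) * (sum(s) / len(s) - grand_mean) ** 2 for s in samples)
    ss_err = ss_total - ss_trt
    df_trt = len(samples) - 1
    df_err = len(all_values) - len(samples)
    ms_trt = ss_trt / df_trt
    ms_err = ss_err / df_err
    return {"ss_total": ss_total, "ss_trt": ss_trt, "ss_err": ss_err,
            "df_trt": df_trt, "df_err": df_err,
            "ms_trt": ms_trt, "ms_err": ms_err, "f": ms_trt / ms_err}

table = one_way_anova(list(groups.values()))
for name, value in table.items():
    print(name, round(value, 4))
```

The treatment and error sums of squares come out as 1077.32 and 245.5 on 4 and 45 degrees of freedom, and the F ratio is about 49.368, matching the printout.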



BARTLETT'S TEST
DF = 4 , CHI-SQUARE = 13.9386

PROB (CHI-SQUARE > 13.9386) = .0075

Variances within treatments also differ.
Probably just first treatment differs from the others.

Enter desired function:

Select multiple comparisons 



MULTIPLE COMPARISONS 



CHOOSE A NUMBER AND PRESS CONTINUE 

1 

WHAT CONFIDENCE LEVEL ? (.99, .95, etc.)

.95

TABLE VALUE FROM STUDENT'S t

2.02

DO YOU WISH TO PLOT ON THE CRT?

YES

Beep signify the end of plot, then press CONTINUE.

DO YOU WANT A HARD COPY (IF THIS IS FEASIBLE)?

NO 



LSD

ERROR MEAN SQUARE = 5.4556
DEGREES OF FREEDOM = 45
CONFIDENCE LEVEL = .95
TABLE VALUE FROM STUDENT'S t = 2.0200, LSD = 2.1100

LSD procedure at 95% confidence.

SAMPLES RANKED

      4   3   2   5   1
A     ---------
B                 -
C                     -

MEANS

1 -C
2 -A
3 -A
4 -A
5 -B
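The LSD value and the letter groupings follow from the error mean square and the treatment means. This sketch assumes the usual formula LSD = t * sqrt(2 * MSE / n) for equal sample sizes; the variable names are illustrative.

```python
import math
from itertools import combinations

def lsd(t_value, error_mean_square, n_per_treatment):
    """Least significant difference for equal sample sizes."""
    return t_value * math.sqrt(2 * error_mean_square / n_per_treatment)

# Treatment means from the one-way AOV output, keyed by treatment number.
means = {1: 70.1, 2: 59.3, 3: 58.2, 4: 58.0, 5: 64.1}
threshold = lsd(2.02, 5.4556, 10)

# A pair of treatments differs when their means are further apart than the LSD.
differs = {(i, j): abs(means[i] - means[j]) > threshold
           for i, j in combinations(sorted(means), 2)}

print(round(threshold, 4))
for pair, flag in differs.items():
    print(pair, "different" if flag else "not different")
```

The threshold is about 2.1100. Treatments 2, 3 and 4 are within it of one another (group A), while treatments 5 and 1 each stand apart (groups B and C).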




CHOOSE A NUMBER AND PRESS CONTINUE

1

WHAT CONFIDENCE LEVEL ? (.99, .95, etc.)

.95

TABLE VALUE FROM STUDENT'S t

2.02

DO YOU WISH TO PLOT ON THE CRT?

NO

Plotter identifier string (press CONT if 'HPGL')?



Treatments 2-4 are not different from one another. 
Treatment 1 differs from the others. 
Treatment 5 differs from the others. 



Plotter select code, bus # (defaults are 7,5)?

WHICH PEN COLOR SHOULD BE USED? 

1 

Beep signify the end of plot, then press CONTINUE. 



LSD 



ERROR MEAN SQUARE = 5.4556 

DEGREES OF FREEDOM = 45 

CONFIDENCE LEVEL = .95 

TABLE VALUE FROM STUDENT'S t = 2.0200, LSD = 2.1100

SAMPLES RANKED

      4   3   2   5   1
A     ---------
B                 -
C                     -

MEANS 

1 -C 

2 -A 

3 -A 

4 -A 

5 -B 



[Plot: LSD. Vertical axis scaled from 56.00 to 72.00 in steps of 1.60; horizontal axis: SAMPLE NUMBER.]



CHOOSE A NUMBER AND PRESS CONTINUE 

5

ERROR MEAN SQUARE =? 

5 

DEGREES OF FREEDOM =? 

2 

WHAT CONFIDENCE LEVEL ? (.99, .95, etc.)

.95 

TABLE VAL FROM NEW MULT RANGE TEST FOR 5 MEANS 

? 

3.17 

TABLE VAL FROM NEW MULT RANGE TEST FOR 4 MEANS 

? 

3.1

TABLE VAL FROM NEW MULT RANGE TEST FOR 3 MEANS 

? 

3.01 

TABLE VAL FROM NEW MULT RANGE TEST FOR 2 MEANS 

? 

2.86 



Choose Duncan's multiple comparison procedure 



Tables available in appendix 



DUNCAN'S TEST 



ERROR MEAN SQUARE = 5.0000 
DEGREES OF FREEDOM = 2 
LEVEL OF CONFIDENCE = .95 



NUMBER OF MEANS = 5, TABLE VALUE = 3.170, DIFFERENCE = 2.242

NUMBER OF MEANS = 4, TABLE VALUE = 3.100, DIFFERENCE = 2.192

NUMBER OF MEANS = 3, TABLE VALUE = 3.010, DIFFERENCE = 2.128

NUMBER OF MEANS = 2, TABLE VALUE = 2.860, DIFFERENCE = 2.022



SAMPLES RANKED

      4   3   2   5   1
A     ---------
B                 -
C                     -

MEANS

1 -C

2 -A

3 -A

4 -A

5 -B

CHOOSE A NUMBER AND PRESS CONTINUE 
6 



Same conclusion as in LSD 
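The DIFFERENCE column in the Duncan output is the table value multiplied by the standard error of a treatment mean, sqrt(MSE / n). The sketch below assumes that relationship (it reproduces the printed figures) and uses the error mean square of 5 and the table values exactly as they were typed in during the session; the function name is illustrative.

```python
import math

def duncan_differences(table_values, error_mean_square, n_per_treatment):
    """Shortest significant ranges: table value times sqrt(MSE / n)."""
    standard_error = math.sqrt(error_mean_square / n_per_treatment)
    return {p: q * standard_error for p, q in table_values.items()}

# Table values entered at the prompts, keyed by the number of means spanned.
entered = {5: 3.17, 4: 3.10, 3: 3.01, 2: 2.86}
differences = duncan_differences(entered, 5.0, 10)

for p in sorted(differences, reverse=True):
    print(p, round(differences[p], 3))
```

This gives 2.242, 2.192, 2.128 and 2.022 for spans of 5, 4, 3 and 2 means. Two means that are p steps apart in the ranked list are declared different when their gap exceeds the value for p.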



Exit multiple comparisons 



Enter desired function:
3 



Choose Kruskal-Wallis test 



KRUSKAL-WALLIS TEST 



CHI-SQUARE = 38.1101 DF = 4 
P(CHI-SQUARE > 38.1101) = 0.0000 



Conclude treatments differ. 



Enter desired function 

5 



Exit 3 or more samples 



Enter number of desired function:
6 



Return to BSDM 






Analysis of Variance 



General Information 

Description 

The Analysis of Variance package is made up of six analysis routines as well as a number of 
auxiliary routines that can be used after the analysis of variance (ANOVA or AOV) is 
completed. 

The following analyses are available for balanced data sets - 

• Factorial design - multiway classification with or without major blocks. 

• Nested design - includes completely nested, mixed nested and crossed classifications. 

• Split-plot design - several types in which one or more factors can be in the whole plot. 

These three analyses can be used for balanced or unbalanced designs - 

• One-way ANOVA - completely randomized one-way classification. 

• Two-way ANOVA (unbalanced) - one or more of the cells can be empty or be unequal 
in sample size. 

• One-way Analysis of Covariance - for the completely randomized one-way classifica- 
tion. 

For each of the designs in this package, the objective of the routine is to sort out the sources 
of variability and assign, if possible, responsibility for a portion of the total variability in the 
data to certain factors in the design. 



Input 

The first step is to input your data via the Basic Statistics and Data Manipulation routines. 
Because the data for the AOV programs must be in a very structured format, please read the 
Basic Statistics and Data Manipulation section of this manual and the portion of this section 
entitled Data Structures before entering your data. After entering your data, select one of
the six design types; the program then asks questions to determine the exact design you are
using.






Auxiliary Routines 

The following routines can be used to complement the analyses performed by the six design 
routines - 

• Orthogonal Polynomials - performs a decomposition of the specified sum of squares 
into linear, quadratic,..., portions. This routine should be used only for factors with 
quantitative levels. 

• Treatment Contrasts - performs a comparison on a specified factor. Output includes 
sum of squares and F ratio. 

• Multiple Comparison Procedures - can be used to perform one or more of five 
routines to determine which factor levels represent different population levels. For a 
more detailed description, please see the portion of this manual entitled Multiple Sam- 
ple Tests in the General Statistics section. 

• Interaction Plot - allows you to study the relationship between two or three factors. 
(Not available from One-way or Covariance routines.) 

• FPROB - generates right-tailed probability values for the F distribution. 

Special Routines 

New Response 

This allows you to specify a new response variable for the last design chosen. So, even after 
you have done multiple comparisons (or any other analysis) you may go back to the same 
design and specify a new response variable without having to answer all of the design 
questions. 

After this is done, a title and description of the last design will be displayed on the CRT. 

Special Considerations 

Limitations 

This program is capable of handling 50 variables with a total of 1500 data values. In 
addition, there are certain limitations imposed for each program as follows - 

• Factorial - the product of (levels of A)*(levels of B)*(levels of C)*(levels of D) = size
≤ 500. Also, (number of blocks)*size*(number of observations per cell) ≤ 1500.

• Nested - size (as described above) ≤ 500. No blocks are permitted.

• Split Plot - Blocks are necessary. Only factors A, B and C are permitted in addition to
blocks, and (levels of A)*(levels of B)*(levels of C)*(number of blocks) ≤ 500.

• One Way - There can be up to 50 treatments.

• Two Way (unbalanced) - At least one cell must have more than one observation. The
number of rows (A factor) ≤ 20. The number of columns (B factor) ≤ 20. (number of
rows)*(number of columns) ≤ 200.






• One-way Covariance - There can be up to 25 treatments. 

• Orthogonal Polynomial - The polynomial can be up to the tenth degree. 

• Treatment Contrast - There can be up to 20 levels of one-way means and up to 200 
levels of two-way means. 

• Multiple Comparison - same as for Treatment Contrast. 

• Interaction Plot - there can be no more than 20 levels of the factor plotted on the X 
axis, otherwise the plot becomes "messy". 
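Before entering data it can be useful to check a planned design against the limits listed above. The following is a rough sketch covering only the factorial and unbalanced two-way limits; the function names and the idea of returning a list of messages are hypothetical and are not part of the package.

```python
from math import prod

def factorial_limit_problems(levels, blocks=1, obs_per_cell=1):
    """Return the factorial limits a design would violate (empty list if none)."""
    size = prod(levels)                      # product of the factor levels
    problems = []
    if size > 500:
        problems.append(f"size {size} exceeds 500")
    if blocks * size * obs_per_cell > 1500:
        problems.append("blocks*size*observations per cell exceeds 1500")
    return problems

def two_way_limit_problems(rows, cols):
    """Return the unbalanced two-way limits a design would violate."""
    problems = []
    if rows > 20:
        problems.append("more than 20 rows")
    if cols > 20:
        problems.append("more than 20 columns")
    if rows * cols > 200:
        problems.append("rows*columns exceeds 200")
    return problems

# Four varieties by three row spacings in five blocks: within the limits.
print(factorial_limit_problems([4, 3], blocks=5))
# A 10 x 10 x 6 factorial has size 600, which is too large.
print(factorial_limit_problems([10, 10, 6]))
# 20 rows by 20 columns satisfies each bound separately but not the product.
print(two_way_limit_problems(20, 20))
```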

Balanced vs. Unbalanced Designs 

To convert from a balanced design to an unbalanced design, you need to use the data 
manipulation section of the package to create variable(s) with the factor levels for the two 
factors in the unbalanced design. 

On the other hand, if you have finished a factorial analysis and now want to use a one-way 
design on the same data set, the program allows you to do this by selecting the Advanced 
Statistics option on the menu. 



Discussion 

General 

The analysis of variance (AOV) technique can be used in many data analysis situations 
where it is desired to characterize the sources of variation in a "planned" experiment. The 
essential feature of AOV is that the total variation of the numbers (data) is uniquely decom- 
posed into separate parts. For example, suppose we have run an experiment in which we 
used four varieties of corn and three row spacings. We repeated this experimental set-up 
five times (on five fields). We can then break the total variation down into five components 
as indicated below: 











                                   AOV

Source                DF               SS       MS       F

Total                 5*4*3-1 = 59     SSt
Fields (or Blocks)    5-1 = 4          SSb      MSb      F1 = MSb/MSe
Varieties             4-1 = 3          SSv      MSv      F2 = MSv/MSe
Row Spacings          3-1 = 2          SSr      MSr      F3 = MSr/MSe
Var. x Row            3*2 = 6          SSvr     MSvr     F4 = MSvr/MSe
Error                 44               SSe      MSe







In order to more fully develop our understanding of the usefulness of AOV, let us discuss 
how one might use such a table. Starting with the first column, we see the decomposition of 
the total variation into its five components. The next column shows the allocation of the 
so-called degrees of freedom (see references). Notice that the degrees of freedom compo- 
nents add up to the degrees of freedom associated with the total sum of squares. For the 
total source of variation, the degrees of freedom will be the total number of observations in 
the experiment minus one. The SS(sum of squares) column shows the breakdown of the 
total sum of squares for the experiment into the various components. One could prove 
algebraically that SSt = SSb + SSv + SSr + SSvr + SSe and likewise for the degrees of 
freedom. The MS (mean square) column is obtained by taking SS/DF. This reflects an 
"average" variation due to each of the sources. 

The last column is the F-ratio or testing column. Generally, we are testing the hypothesis 
that there is "nothing" happening in the experiment versus the expected hypothesis that 
something "worthwhile" is occurring. If nothing is happening, then all mean sources of 
variation should be of the same magnitude as the error mean square. The F-ratio is a 
statistical test to see if the mean square for the source of variation in question is significantly 
bigger than the error mean square. If it is, we can conclude that there is a "real" effect. For 
example, suppose that F2 is quite large. We would then be able to conclude that the 
population variety means are not all the same. That is, at least one of the variety means 
differs significantly from the others. 
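The bookkeeping behind the table (degrees of freedom, MS = SS/DF, and F = MS/MSe) can be sketched as follows for the corn example. The degrees of freedom follow from the design described above; the sums of squares in the usage lines are made-up placeholders, included only to show the arithmetic, and the function names are hypothetical.

```python
def blocked_two_factor_df(blocks, a_levels, b_levels):
    """Degrees of freedom for a two-factor design repeated in blocks."""
    total = blocks * a_levels * b_levels - 1
    df = {
        "Blocks": blocks - 1,
        "A": a_levels - 1,
        "B": b_levels - 1,
        "AxB": (a_levels - 1) * (b_levels - 1),
    }
    df["Error"] = total - sum(df.values())   # whatever is left over
    df["Total"] = total
    return df

def f_ratios(ss, df):
    """Mean squares (SS/DF) and F ratios against the error mean square."""
    ms = {source: ss[source] / df[source] for source in ss}
    return {source: ms[source] / ms["Error"] for source in ss if source != "Error"}

# Five fields, four varieties (A), three row spacings (B).
df = blocked_two_factor_df(5, 4, 3)
print(df)

# Hypothetical sums of squares, for illustration only.
ss = {"Blocks": 40.0, "A": 90.0, "B": 10.0, "AxB": 12.0, "Error": 88.0}
print(f_ratios(ss, df))
```

For the corn design this gives 4, 3, 2, 6 and 44 degrees of freedom, which add up to the total of 59.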

How big do the F values have to be? That depends on the degrees of freedom associated 
with the numerator MS and the degrees of freedom associated with the denominator (error) 
MS. The computed F values may be compared with tabled values to find out if they are 
significant at the .10, .05, .01, or .005 level, or, with this program, you can actually compute 
the level of significance. The program will automatically calculate the Prob[F > F 
calculated] for a factorial AOV. For nested or partially nested AOV, the user may elect to use 
the F probability option to find the probability levels. 

Factorial Versus Nested Models 

Many researchers have difficulty differentiating between a factorial model and a nested 
model for AOV. A brief example may be of some help. In a three-way factorial model, for 
example, the levels of factor B are the same over all levels of factors A and C. Suppose 
factor A is three temperature settings, factor B is two pressure settings and factor C is four 
different laboratories. In a factorial model, we would assume that each of the six (three 
temperature * two pressure) combinations had been studied at each of the four laborator- 
ies. In a nested AOV with factor C nested in A and B, we might assume that the same six 
combinations were run; however, for each of the six combinations, four different laborator- 
ies (greenhouses, plants, fields, classrooms, etc.) were used. Hence, a total of 24 laborator- 
ies were used instead of just four. Assuming just one observation per laboratory and ex- 
perimental combination, the AOV table for the factorial would be: 






Factorial AOV Example

Source                DF               SS          MS

Total                 3*2*4-1 = 23     SSTotal
Temperature           3-1 = 2          SSt         MSt
Pressure              2-1 = 1          SSp         MSp
Temp x Pres           2*1 = 2          SStp        MStp
Laboratories          4-1 = 3          SSl         MSl
Temp x Lab            2*3 = 6          SStl        MStl
Pres x Lab            1*3 = 3          SSpl        MSpl
Temp x Pres x Lab     2*1*3 = 6        SStpl       MStpl



However, for the nested model described above, the AOV table would be: 

Nested AOV Example

Source                DF                 SS          MS

Total                 23                 SSTotal
Temperature           3-1 = 2            SSt         MSt
Pressure              2-1 = 1            SSp         MSp
Temp x Pres           2*1 = 2            SStp        MStp
Lab (temp x pres)     (4-1)*3*2 = 18     SSl(tp)     MSl(tp)



Notice that the AOV tables are somewhat different. Actually, SSl(tp) can be obtained
(and is in the program) from the first AOV table by noting that SSl(tp) = SSl + SStl +
SSpl + SStpl. Generally, in nested or partially nested AOV's, the nested factor is considered
to be a random effect.
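The pooling identity can be illustrated with a few lines of arithmetic. The degrees of freedom below come from the factorial table; the sums of squares are invented placeholders, used only to show that the nested term is the sum of the four factorial terms that involve laboratories.

```python
# Degrees of freedom for the laboratory terms of the factorial table.
factorial_df = {"Lab": 3, "Temp x Lab": 6, "Pres x Lab": 3, "Temp x Pres x Lab": 6}

# Hypothetical sums of squares for the same terms (illustration only).
factorial_ss = {"Lab": 12.0, "Temp x Lab": 30.0, "Pres x Lab": 9.0,
                "Temp x Pres x Lab": 21.0}

# Lab nested within temperature x pressure pools all four terms.
nested_df = sum(factorial_df.values())
nested_ss = sum(factorial_ss.values())
nested_ms = nested_ss / nested_df

print(nested_df, nested_ss, nested_ms)
```

The pooled degrees of freedom are 3 + 6 + 3 + 6 = 18, the same as (4-1)*3*2 in the nested table.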

Partially Nested vs. Nested Models 

Consider a laboratory experiment involving mice in which three levels of some drug (factor 
A) are to be investigated. Seven mice (factor B) are used for each drug level and the 
response variable is determined on four days (factor C). One model which might be used for 
the analysis would be three levels of factor A; seven levels of factor B nested on factor A; 
and four levels of factor C. The AOV table would be: 

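                         AOV

Source                DF      SS           MS

Total                 83      SSTotal
Drug                  2       SSd          MSd
Mice(Drug)            18      SSm(d)       MSm(d)
Days                  3       SSt          MSt
Drug x Days           6       SSdt         MSdt
Time x Mice(Drug)     54      SStm(d)      MStm(d)

[In the original table, arrows beside the MS column link mean squares to their error terms; the arrows are not reproduced here.]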






This type of design is sometimes called a repeated measurements design. It is also a partially 
nested design because factor C is crossed both with factor A and the nested factor B. As is 
indicated by the arrows in the AOV table, at least two different "error" terms are used for 
studying the significance in this model. It should be noted that it is necessary to have exactly 
the same number of subjects within each level of factor A in order to use the analysis in this 
package. 



Two-Factor AOV Structure 

The analysis of variance is a method of decomposing the sum of squared deviations of the
observations about the overall mean [$\sum (y_{ijk} - \bar{y}_{\cdot\cdot\cdot})^2$] into various sources. For a two-factor
design, we may show sources of variation due to the row effect (A), the column effect (B), 
the row-by-column interaction effect (AB) and the within error effect (ERROR). For exam- 
ple, consider an experiment in which we have four levels of temperature (100, 150, 175, 
200°C) and three levels of pressure (5, 10, 15 psi) with several determinations of the 
chemical yield (y) for each combination of temperature (ROWS) and pressure (COL- 
UMNS). One possible arrangement of the data might be as shown below: 









                                       Pressure
                          5                  10                 15
Temperature            Column 1           Column 2           Column 3

   100      Row 1   y111,...,y11n11    y121,...,y12n12    y131,...,y13n13
   150      Row 2        ...                ...                ...
   175      Row 3        ...                ...                ...
   200      Row 4   y411,...,y41n41    y421,...,y42n42    y431,...,y43n43



Each $y_{ijk}$ stands for the numerical value of the chemical yield in percent. The subscript i
refers to the row designator, the j to the column designator, and the k to the observation
number in the (i,j)th cell. Notice that the $n_{ij}$ are not necessarily all equal, nor is it necessary
that $n_{ij} \geq 1$. If the $n_{ij}$ are all equal, the analysis of variance involves the usual summing
and summing of squares, a task which could be performed by hand calculators. When the $n_{ij}$
are not all equal, the exact analysis is quite complicated.

Note that the table which we have described above does not show how the experiment was 
actually run. According to good statistical practice the order of running the experiment 
should be in a random fashion. That is, conceptually, all of the possible sequences should 
be equally likely and the experimenter should choose one sequence at random. 
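
One way to obtain such a random run order is sketched below in Python. The function name, the `seed` parameter, and the choice of two determinations per cell are assumptions made for illustration; they are not part of the package.

```python
import itertools
import random


def random_run_order(temperatures, pressures, reps, seed=None):
    """Return every (temperature, pressure, replicate) run in a random order."""
    runs = [
        (t, p, k)
        for t, p in itertools.product(temperatures, pressures)
        for k in range(1, reps + 1)
    ]
    rng = random.Random(seed)
    rng.shuffle(runs)  # every ordering of the runs is equally likely
    return runs


order = random_run_order([100, 150, 175, 200], [5, 10, 15], reps=2, seed=1)
for run_number, run in enumerate(order, start=1):
    print(run_number, run)
```

The data would still be entered in the systematic row and column order shown in the table; only the order in which the runs are physically performed is randomized.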






Reasons for Unbalanced Designs 

Unbalanced two-factor designs might arise in at least three ways. First, the design could 
have been planned as a balanced design (all $n_{ij}$ equal). However, several observations may
be lost due to death of a subject, etc. This often happens in research even though experimenters
use good experimental techniques. Second, because of the nature of the
variability of one response (or some other reason), the experimenter may have set up the 
design with an unequal number of observations in the cells. For example, suppose that one 
of the row levels is really a control or standard dose. It may be a common practice to use 
fewer observations on the control than the other drugs (other "levels" of the row factor). A 
third possibility is that certain combinations of the row and column levels might yield results 
which are impossible to monitor in an experiment. This might happen if in the experiment 
described above, the highest temperature level (200°C) and the highest pressure level (15 
psi) proved to be "too much" for the chemical process. In general, of course, it is not a good 
procedure to design two-factor experiments in which certain levels of the factors cannot be 
included in the experiment. 

Approximate Analyses for Two-Factor Experiments 

If each cell (row-column combination) has at least one observation and the number of 
observations in each cell is approximately the same, the method of unweighted means is 
sometimes used. Essentially, in this analysis, the cell means are subjected to the usual 
two-way AOV with one observation per cell, and the within error term is added to the table 
after adjustment. (See Bancroft, reference 1, p. 35.) This approximate analysis will probably
allow you to draw accurate conclusions for most sets of data.

One reason why we might use this type of analysis is because the "exact" analysis is quite 
complicated. The complexity of the analysis is related to the fact that the calculations which 
must be performed do not just involve the usual summing and summing of squared values. 
In short, the exact analysis is a "messy" problem. 



Unbalanced Two-Way AOV - "Exact" Solutions 

As described more completely in reference 1, Chapter 1, the solution involves rather messy 
notation. We shall avoid the notational problems by describing, in words, the procedures 
that you should use in interpreting the AOV tables, rather than describing the computing 
procedures which were used. 






Once again, the idea of the AOV is to separate out the various sources of variation from an 
observable set of data. In the balanced two-factor design, the analysis of variance table 
might be written as follows: 

                                   AOV

Source                 df              Sum of Squares     Mean Squares

Total                  N - 1           TSS
Rows                   R - 1           RSS                RSS / (R-1)
Columns                C - 1           CSS                CSS / (C-1)
R x C Interaction      (R-1)(C-1)      ISS                ISS / ((R-1)(C-1))
Residual               N - RC          ESS                ESS / (N-RC)



In this table, R equals the number of rows, C equals the number of columns, and N equals 
the number of observed y's. The computations which are involved in obtaining the Sum of 
Squares column will not be described. Suffice it to say that in each case the individual 
observations or the means are compared to the overall mean. 

As a brief review, let us examine the AOV procedure. According to the AOV procedure, we
are trying to determine if the source of variation for rows, columns, and/or the interaction is 
significantly bigger than the error source of variation. This is done by calculating certain 
ratios of mean squares—the so-called F-ratios. Under the assumption of no differences 
among the row population means (i.e., levels of temperature), the mean square (MS) for 
rows should be of the same magnitude as the MS for the error. In a similar fashion, the 
source of variation for columns and interaction can also be tested. 

For balanced sets of data, that is where the subclass frequencies are all the same, the 
decomposition of the sources of variation for a two-factor design is orthogonal. This means 
that every SS and MS in the table represents the source of variation as indicated in that row. 
When we have an unbalanced design, the table is not as easy to interpret. 
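
For a balanced layout, the decomposition can be written out directly. The Python sketch below is not the program's code; the function name and the small data set are hypothetical. It computes the sums of squares named in the table above (TSS, RSS, CSS, ISS, ESS) for equal cell sizes and shows that the pieces add up to the total.

```python
def two_way_balanced_aov(data):
    """data[i][j] is the list of observations for row i, column j (all the same length)."""
    R, C, n = len(data), len(data[0]), len(data[0][0])
    N = R * C * n
    grand = sum(y for row in data for cell in row for y in cell) / N
    row_mean = [sum(y for cell in row for y in cell) / (C * n) for row in data]
    col_mean = [sum(y for row in data for y in row[j]) / (R * n) for j in range(C)]
    cell_mean = [[sum(cell) / n for cell in row] for row in data]

    tss = sum((y - grand) ** 2 for row in data for cell in row for y in cell)
    rss = C * n * sum((m - grand) ** 2 for m in row_mean)
    css = R * n * sum((m - grand) ** 2 for m in col_mean)
    iss = n * sum(
        (cell_mean[i][j] - row_mean[i] - col_mean[j] + grand) ** 2
        for i in range(R) for j in range(C)
    )
    ess = sum(
        (y - cell_mean[i][j]) ** 2
        for i in range(R) for j in range(C) for y in data[i][j]
    )
    ss = {"Total": tss, "Rows": rss, "Columns": css, "Interaction": iss, "Residual": ess}
    df = {"Total": N - 1, "Rows": R - 1, "Columns": C - 1,
          "Interaction": (R - 1) * (C - 1), "Residual": N - R * C}
    return ss, df


# Hypothetical 2 x 2 layout with two observations per cell
example = [[[1, 3], [2, 4]], [[5, 7], [8, 10]]]
ss, df = two_way_balanced_aov(example)
print(ss)
print(df)
```

With unequal cell sizes these simple formulas no longer partition the total sum of squares, which is why the unbalanced case needs the methods described next.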

In order to understand the output provided by this program, we will use the hypothetical
experiment described earlier. Suppose that the table of $n_{ij}$, the frequency counts for the
twelve row-column cells, is as follows:

                         Pressure
Temperature         5       10       15

    100             5        4        5
    150             5        5        5          N = 54
    175             5        5        4
    200             4        3        4








Ordinarily we would ask the investigator to use equal $n_{ij}$; however, there might be perfectly
good reasons why this was not possible. 

Preliminary AOV Tables 

The next output from this program is the Preliminary AOV tables. The first table has the 
general form: 



                         Preliminary AOV

Source          DF                SS       MS       F-ratio

Total           N - 1 = 53        SSt
Subclass*       RC - 1 = 11       SSs      MSs      MSs/MSe
Error           N - RC = 42       SSe      MSe

* Rows + Columns + Interaction

The decomposition in this table looks as if we have twelve individual treatments rather than 
four temperature and three pressure combinations. If the F-ratio is large (and the F-Prob is 
small), say less than about .05, we can conclude that not all twelve population means are 
the same. The second table has a further decomposition of the subclass source into main 
effect differences and interaction differences. 
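
The degrees of freedom quoted in the preliminary tables follow directly from the table of cell counts. The Python sketch below is illustrative; the function name is hypothetical, and the Error line assumes that every cell contains at least one observation.

```python
def preliminary_df(counts):
    """Degrees of freedom for the preliminary AOV tables from a table of cell counts."""
    R, C = len(counts), len(counts[0])
    N = sum(sum(row) for row in counts)
    return {
        "Total": N - 1,
        "Subclass": R * C - 1,
        "Main effects": R + C - 2,
        "Interaction": (R - 1) * (C - 1),
        "Error": N - R * C,
    }


# Cell counts for temperatures 100, 150, 175, 200 (rows) by pressures 5, 10, 15 (columns)
counts = [[5, 4, 5], [5, 5, 5], [5, 5, 4], [4, 3, 4]]
df = preliminary_df(counts)
print(df)
```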



                     Interaction Preliminary AOV

Source            DF                    SS       MS       F-Ratio

Total             N - 1 = 53            SSt
Main Effects*     R + C - 2 = 5         SSm      MSm      MSm/MSe
Interaction**     (R-1)(C-1) = 6        SSi      MSi      MSi/MSe
Error             N - RC = 42           SSe      MSe

 * Row + Column
** R x C













This table helps us determine if there is interaction in our two-way design. This is important 
because it may help us decide which analysis to use next, that is, which of the FINAL AOV's 
we should choose (see Bancroft). 

If one or more cells are empty, the method of fitting constants must be used for the final
analysis. For the method of fitting constants, we assume no interaction is present in the
model. Hence, if at least one $n_{ij} = 0$ and/or interactions are assumed to be absent in the
population, we should use the METHOD OF FITTING CONSTANTS FINAL AOV. If interaction
between the row and column factors is expected to be present in the population
and all $n_{ij} \geq 1$, the METHOD OF SQUARED MEANS should be used.






If you are uncertain whether or not interactions are present, your interpretation of the 
output of the PRELIMINARY AOV table for interactions may help you decide. If the
F-PROB for the interaction F-ratio is small enough, we might conclude that interaction is
present. (Bancroft, reference 1, suggests that if F-PROB < .25, one should use the method 
of squared means. ) 
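
The selection rule described above can be summarized as a small decision function. This Python sketch is only a restatement of the guidance in the text; the function name, argument names, and returned strings are hypothetical, and the 0.25 cutoff is the value attributed to Bancroft.

```python
def choose_final_aov(counts, interaction_f_prob=None, assume_interaction=None, cutoff=0.25):
    """Suggest which FINAL AOV to request, following the guidance in the text."""
    # Any empty cell forces the method of fitting constants.
    if any(n == 0 for row in counts for n in row):
        return "fitting constants"
    # If the user has no prior view, fall back on the preliminary interaction test.
    if assume_interaction is None:
        if interaction_f_prob is None:
            raise ValueError("need either assume_interaction or interaction_f_prob")
        assume_interaction = interaction_f_prob < cutoff
    return "squared means" if assume_interaction else "fitting constants"


print(choose_final_aov([[5, 4, 5], [5, 5, 5], [5, 5, 4], [4, 3, 4]], interaction_f_prob=0.10))
```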

Interpreting the Method of Fitting Constants AOV 

Since this method assumes that the model is of the form Y = A + B * (ROW LEVELS) + C 
* (COLUMN LEVELS) + ERROR, what remains to be tested by this method is if the row 
levels (means) differ significantly from each other and if the column levels (means) differ 
significantly from each other. The calculations involve (see page 16, Bancroft) finding the 
solution to a set of least-squares equations. As we discussed above, when all $n_{ij}$ are equal,
the sum of squares due to rows is orthogonal to the sum of squares for columns. However,
when the $n_{ij}$ are not all equal, by using the method of fitting constants, the program will
construct the following table: 

Source                   DF                  SS        MS        F-Ratio

Total                    N - 1 = 53          SSt
Rows (unadjusted)        R - 1 = 3           SSr       MSr
Columns (adjusted)       C - 1 = 2           SSc-a     MSc-a     F1 = MSc-a/MSe
Columns (unadjusted)     C - 1 = 2           SSc       MSc
Rows (adjusted)          R - 1 = 3           SSr-a     MSr-a     F2 = MSr-a/MSe
Interaction              (R-1)(C-1) = 6      SSi       MSi       F3 = MSi/MSe
Error                    N - RC = 42         SSe       MSe


The first two F-ratios can be used to test the following hypotheses: 

Ho: The "B" terms in the model are not needed; Ho: The "C" terms in the model are not 
needed. The third F-ratio is the same test for the interaction obtained in the preliminary 
AOV table. Notice that the SS for columns is obtained after correction for rows. That is, 
SSc-a (columns adjusted for rows) = SSm (main effects in preliminary AOV table) - SSr
(rows ignoring the column effects). Hence, some of the calculations for the final AOV by the
method of fitting constants are derived from the preliminary AOV table.

In conclusion, the method of fitting constants allows us to make "good" tests for main 
effects if the interaction term is absent. Also, if one or more $n_{ij} = 0$, we must use this
method since the interpretation of a significant interaction is questionable anyway. After 
determining that the row and/or column means differ significantly, one might wish to do 
some type of multiple comparison procedure to determine where the significant differences 
lie. 






Interpreting the Method of Squared Means AOV 

When interaction is assumed present in our model or suspected to be present in the model 
after studying the preliminary AOV table, the method of squared means can be used to find
"good" estimates of the main effects if all $n_{ij} > 0$. This analysis operates on the cell means
weighted by $W_i = c^2 / \sum_j (1/n_{ij})$ for the ith row and $W_j = r^2 / \sum_i (1/n_{ij})$ for the jth column. The
model for this situation would be: 

Y = A + B * (ROW LEVEL) + C * (COLUMN LEVEL) + 

D * (ROW, COLUMN LEVELS) + ERROR

where A represents the average value and D represents the coefficient for the interaction 
term. The method, which is described on pages 24-29 of Bancroft, would yield an AOV 
table as follows: 

Source                   DF                  SS        MS        F-Ratio

Total                    N - 1 = 53
Rows (weighted)          R - 1 = 3           SSr-w     MSr-w     MSr-w/MSe
Columns (weighted)       C - 1 = 2           SSc-w     MSc-w     MSc-w/MSe
Interaction              (R-1)(C-1) = 6      SSi       MSi       MSi/MSe
Error                    N - RC = 42         SSe       MSe
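
The row and column weights used by the method of squared means can be computed from the table of cell counts. The Python sketch below is illustrative and is not the program's code; the function name is hypothetical, and it assumes that c and r in the weight formulas denote the number of columns and the number of rows.

```python
from fractions import Fraction


def squared_means_weights(counts):
    """Return (row weights, column weights) for the method of squared means."""
    r, c = len(counts), len(counts[0])
    if any(n <= 0 for row in counts for n in row):
        raise ValueError("every cell must contain at least one observation")
    row_w = [Fraction(c * c) / sum(Fraction(1, n) for n in row) for row in counts]
    col_w = [
        Fraction(r * r) / sum(Fraction(1, counts[i][j]) for i in range(r))
        for j in range(c)
    ]
    return row_w, col_w


# Cell counts for temperatures 100, 150, 175, 200 (rows) by pressures 5, 10, 15 (columns)
counts = [[5, 4, 5], [5, 5, 5], [5, 5, 4], [4, 3, 4]]
row_w, col_w = squared_means_weights(counts)
print([float(w) for w in row_w])
print([float(w) for w in col_w])
```

For a row in which every cell holds the same number of observations, the weight reduces to the number of observations in that row (15 for the 150°C row here), so the weighted analysis agrees with the usual one when the design is balanced.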





The F-ratios for rows and columns using the weighted cell means will indicate if the main 
effects are significant. Of course, if the interaction term is already determined to be significant,
the interpretation of the main effects must be given careful consideration. Quite
frequently experimenters find it useful to plot the subclass means in order to study the 
"pattern" for the interaction. 

Orthogonal Polynomial Breakdown 

If the levels of the row and/or column factors are quantitative, it might be of interest to 
decompose the sum of squares for these terms into single-degree-of-freedom terms for a 
polynomial model. For example, suppose that the row levels are quantitative such as the 
temperature levels which we described above (100, 150, 175, 200°C). Since there are four 
levels, it is possible to fit up to a third degree polynomial to the row levels. Hence, the SS for 
rows could be decomposed into orthogonal components for linear, quadratic and cubic 
terms, each with one degree of freedom. The program will perform the elaborate calculations
even if the row or column levels are unequally spaced. (For example, the column
levels were given as 5, 10, 15 psi. Instead, they could have been 5, 10, 20 psi with unequal 
spacings between the levels.) 
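
To illustrate how contrast coefficients for unequally spaced levels can be obtained, the Python sketch below orthogonalizes the powers of the level values (a Gram-Schmidt process). This is a generic technique and not necessarily the algorithm used by the program; the function name is hypothetical, and equal weighting of the levels (a balanced design) is assumed.

```python
from fractions import Fraction


def orthogonal_polynomial_contrasts(levels, max_degree=None):
    """Contrast vectors for linear, quadratic, ... trends at the given factor levels."""
    x = [Fraction(v) for v in levels]
    k = len(x)
    if max_degree is None:
        max_degree = k - 1
    basis = [[Fraction(1)] * k]  # degree 0: the constant term
    for degree in range(1, max_degree + 1):
        v = [xi ** degree for xi in x]
        # Remove the part of x**degree explained by every lower-degree vector.
        for b in basis:
            coef = sum(vi * bi for vi, bi in zip(v, b)) / sum(bi * bi for bi in b)
            v = [vi - coef * bi for vi, bi in zip(v, b)]
        basis.append(v)
    return basis[1:]  # drop the constant term


linear, quadratic, cubic = orthogonal_polynomial_contrasts([100, 150, 175, 200])
print([float(c) for c in linear])
print([float(c) for c in quadratic])
print([float(c) for c in cubic])
```

Each contrast sums to zero and is orthogonal to the others, so each one accounts for a single degree of freedom of the row (or column) sum of squares.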

For further information about these procedures, see references 1 and 2. 

References 

1. Bancroft, T.A. (1968). Topics in Intermediate Statistical Methods. The Iowa State 
University Press, Ames, Iowa. 

2. Searle, S.R. (1971). Linear Models, John Wiley and Sons. 






Data Structures 

In order to provide for the analysis of six different types of designs the arrangement of the 
data must be 'presumed' by the program. The material that follows describes the various 
arrangements within the Basic Statistics and Data Manipulation (BSDM) routines, which are 
possible for each design. Please read the section dealing with the design which you are 
considering before attempting to enter your data. 

Further information about the designs considered in this package can be found in the 
Discussion section and in the references. 

Factorial Designs 

All data to be analyzed with the Analysis of Variance package is entered into memory via the 
Basic Statistics and Data Manipulation routines. The order in which the data is entered is 
very important. In general, sampling replications are entered in order, then factors are 
varied, then blocks are varied. That is, assuming a four-factor design and no sampling 
replications, the levels of factor D must vary the most rapidly, followed by the levels of C, B, 
A, and finally the levels of the blocks. Consider an example in which there are two blocks 
(major replications), two levels of A and three levels of B. Assume for the moment that we 
do not have any sampling replication and only one response variable. The structure within 
the Basic Statistics and Data Manipulation (BSDM) program would use only one variable 
since it is not necessary to store the levels of the factors and blocks when using the (balanced)
Factorial program. The structure for this two-way factorial in two blocks would be:





          Response       Factor    Factor
OBS.#     Variable 1       B         A        Blocks

   1        Y111           B1        A1       Block 1
   2        Y112           B2
   3        Y113           B3
   4        Y121           B1        A2
   5        Y122           B2
   6        Y123           B3
   7        Y211           B1        A1       Block 2
   8        Y212           B2
   9        Y213           B3
  10        Y221           B1        A2
  11        Y222           B2
  12        Y223           B3







Note 

The levels of Factor B vary most rapidly while the blocks vary the 
slowest. The Ys represent numerical data which is the only information
stored in BSDM. The first subscript indicates the block,
the second indicates the level of factor A and the third designates 
the level of factor B. 
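
The ordering rule can be expressed as a formula for the observation number of a given block and factor-level combination. This Python sketch is illustrative only; the function name is hypothetical, and it assumes one observation per treatment combination as in the table above.

```python
def observation_number(block, a, b, levels_a, levels_b):
    """1-based observation number when B varies fastest, then A, then blocks."""
    return ((block - 1) * levels_a + (a - 1)) * levels_b + b


# Two blocks, two levels of A, three levels of B
print(observation_number(1, 2, 1, levels_a=2, levels_b=3))  # position of Y121
print(observation_number(2, 2, 3, levels_a=2, levels_b=3))  # position of Y223
```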






You should remember that it is absolutely essential that you arrange your data in this form 
prior to entering the BSDM program. Of course, if you are careful, there are ways around 
the apparent limitation suggested above. Consider the following data set which has already 
been entered via the BSDM program: 



OBS#      Variable 1     Factor V    Factor U    Blocks

   1        Y111           V1          U1        Block 1
   2        Y121           V2
   3        Y112           V1          U2
   4        Y122           V2
   5        Y113           V1          U3
   6        Y123           V2
   7        Y211           V1          U1        Block 2
   8        Y221           V2
   9        Y212           V1          U2
  10        Y222           V2
  11        Y213           V1          U3
  12        Y223           V2







First of all, note that blocks (major replications) must vary the slowest. We can use this data 
structure in the Factorial program by telling the program that factor A, the factor which 
varies slowly, is factor U and has three levels; while factor B is our factor V and has two 
levels. Hence, independent of the implied subscripts, levels and ordering, we have considerable
flexibility in specifying the factors. We must only make sure that Factor A is the
factor which varies most slowly while Factor B is the factor which varies most rapidly.



So far we have described how the data must be structured for the major replications and 
factors. We will now describe the two modes of data arrangement which are permissible for 
the minor replications (samples). If you have only one sample per treatment combination, 
there will be no difference between the two modes. 






The first mode assumes that the response variable resides in only one of the variables 
specified in BSDM. Hence any minor replications/samples will have to be entered as subsequent
observations in BSDM. For example, suppose we have a factorial with two blocks,
two levels of factor A, and three levels of factor B, with two replications (samples) per 
factorial combination. The data structure with three different response variables might 
appear as follows: 







                    Variables                                Factor
OBS#     1 = %Ca     2 = %Cu     3 = %Fe     Sample      B        A       Block

   1      X11         X21         X31          1         B1       A1      Block 1
   2      X12         X22         X32          2
   3      X13         X23         X33          1         B2
   4      X14         X24         X34          2
   5      X15         X25         X35          1         B3
   6      X16         X26         X36          2
   7      X17         X27         X37          1         B1       A2
   8      X18         X28         X38          2
   9      X19         X29         X39          1         B2
  10      X1,10       X2,10       X3,10        2
  11      X1,11       X2,11       X3,11        1         B3
  12      X1,12       X2,12       X3,12        2
   .
   .
  24      X1,24       X2,24       X3,24        2         B3       A2      Block 2





The first mode of replicate/sample storage conserves on the use of variables (see Special 
Considerations for program limitations); however, it does use more observations. 



If you have only one response variable in your experiment it may be more efficient to use 
the second mode for specifying the sampling replications. This mode assumes that each 
observation in the BSDM program contains all replication values stored one per variable. 
Hence, the same design described above would appear as follows (here, the subscripts 
indicate the levels of factor A and factor B, respectively): 





               Variables              Factor    Factor
OBS.      1 = Rep1     2 = Rep2         B         A        Block

   1        X11          X21            B1        A1       Block 1
   2        X12          X22            B2
   3        X13          X23            B3
   4        X14          X24            B1        A2
   5        X15          X25            B2
   6        X16          X26            B3










One other example is included without comment. Keep in mind that in our examples we 
have named the factors A, B, C, and D. As long as your data is arranged in some order with 
one factor varying the most rapidly within another factor, etc., you can call these factors A,
B, C, and D where your factor called A will vary the slowest, etc. 

Example (Factorial): two blocks, two levels of factor A, three levels of factor B, two sampling
replications:

DATA ENTRY OPTIONS

FORM 1

OBS.#                              Variable #1

   1      Blk1     A1     B1       Rep1
   2                               Rep2
   3                      B2       Rep1
   4                               Rep2
   5                      B3       Rep1
   6                               Rep2
   7               A2     B1       Rep1
   8                               Rep2
   9                      B2       Rep1
  10                               Rep2
  11                      B3       Rep1
  12                               Rep2
  13      Blk2     A1     B1       Rep1
   .
   .

FORM 2

OBS.#                              Variable #1     Variable #2

   1      Blk1     A1     B1       Rep1            Rep2
   2                      B2       Rep1            Rep2
   3                      B3       Rep1            Rep2
   4               A2     B1       Rep1            Rep2
   5                      B2       Rep1            Rep2
   6                      B3       Rep1            Rep2
   7      Blk2     A1     B1       Rep1            Rep2
   8                      B2       Rep1            Rep2
   9                      B3       Rep1            Rep2
  10               A2     B1       Rep1            Rep2
  11                      B2       Rep1            Rep2
  12                      B3       Rep1            Rep2



The order of the observations must be as shown above to get the correct results. In general, 
the levels of blocks will vary slower than levels of factor A, B, C, D and replicates within cells 
vary the fastest. 
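
The required entry order, and the relationship between the two forms, can be sketched in Python as follows. The function names are hypothetical and the sketch is not part of the package; it simply enumerates the combinations with blocks varying slowest and samples varying fastest.

```python
import itertools


def form1_order(blocks, factor_levels, samples):
    """List (block, A, B, ..., sample) tuples in the order required for Form 1."""
    ranges = [range(1, blocks + 1)]
    ranges += [range(1, n + 1) for n in factor_levels]
    ranges.append(range(1, samples + 1))
    # itertools.product varies the last range fastest and the first slowest.
    return list(itertools.product(*ranges))


def form1_to_form2(values, samples):
    """Regroup a Form 1 column into Form 2 rows with one sample per variable."""
    return [values[i:i + samples] for i in range(0, len(values), samples)]


order = form1_order(blocks=2, factor_levels=[2, 3], samples=2)
print(len(order), order[:4])

# Hypothetical responses numbered 1..24 in Form 1 order
rows = form1_to_form2(list(range(1, 25)), samples=2)
print(len(rows), rows[:3])
```

With two blocks, two levels of A, three levels of B and two samples, Form 1 needs 24 observations in one variable while Form 2 needs 12 observations in two variables.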

Nested Design 

The form of the data structure for the nested or mixed design is quite similar to that 
previously described for the Factorial Designs. As far as the program is concerned, the 
nested design is considered to be in a factorial arrangement. The program will calculate the 
sum of squares, etc., as if the design were a factorial design and then pool the appropriate 
terms to form the nested or mixed design which you specified. 

As you may have already noted, the design must be balanced. This means that if factor C is 
nested within factor A and is denoted as C(A), then there must be exactly the same number 
of levels of factor C within each level of factor A. You may wish to refer to the Discussion 
section to familiarize yourself with the design arrangements for a nested design as compared 
to a factorial design. 






Perhaps an example of a completely nested design structure would be helpful at this time. 
Suppose that within each of five sections of land we select two lakes at random. From each 
lake assume that three random positions in the lake are chosen at which we select two 
samples. Suppose further that the samples are each divided into two beakers and are 
analyzed separately. Assume that three responses are measured: Y1 = Var. 1 = ppm lead,
Y2 = Var. 2 = ppm zinc, and Y3 = Var. 3 = ppm copper.

In this experiment, we will designate the five land sections as the levels of factor A, the 
various lakes as levels of factor B, and the position as levels of factor C. Notice that factor B 
is nested in factor A, and that factor C is nested within factor B. These relationships are 
commonly denoted by B(A) and C(B) respectively. 

For the first form of data arrangement, the two samples per position in the lake will be 
shown as stored in subsequent observations (down) rather than in an additional variable 
(across). A dash ( — ) indicates a numerical value which would be entered in BSDM. 

Form 1 



Obs#    Var1 = Y1    Var2 = Y2    Var3 = Y3    Sample    Position      Lake           Section

   1       -            -            -           1       P1            L1             Sec 1
   2       -            -            -           2
   3       -            -            -           1       P2
   4       -            -            -           2
   5       -            -            -           1       P3
   6       -            -            -           2
   7       -            -            -           1       P1 = P4*      L2
   8       -            -            -           2
   9       -            -            -           1       P2 = P5
  10       -            -            -           2
  11       -            -            -           1       P3 = P6
  12       -            -            -           2
   .
   .
  60       -            -            -           2       P3 = P30      L2 = L10**     Sec 5

* Within each lake the "first" position P1 has no relationship with the "first" position in another lake; hence we have a total of thirty different lake positions.
** Since each section has two lakes selected from it, there are a total of ten lakes studied in this project. 
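
The running labels in the footnotes (P4, P5, ..., P30 and L10) follow from the nesting. The Python sketch below shows one way to compute them; the function name and its default arguments are hypothetical and simply mirror the example of two lakes per section and three positions per lake.

```python
def nested_labels(section, lake, position, lakes_per_section=2, positions_per_lake=3):
    """Return the overall lake number and overall position number for a nested cell."""
    lake_id = (section - 1) * lakes_per_section + lake
    position_id = (lake_id - 1) * positions_per_lake + position
    return lake_id, position_id


print(nested_labels(1, 2, 1))   # first position in the second lake of section 1
print(nested_labels(5, 2, 3))   # last position in the last lake of section 5

# Number of Form 1 observations: sections x lakes x positions x samples
total_observations = 5 * 2 * 3 * 2
print(total_observations)
```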






The other form of data entry for this nested design would use twice as many variables since 
each sample would be included as another variable rather than another observation. Hence 
the last row would look like: 





          Sample 1     Sample 2     Sample 1     Sample 2     Sample 1     Sample 2
Obs#      Var1 = Y1    Var2 = Y1    Var3 = Y2    Var4 = Y2    Var5 = Y3    Var6 = Y3

  30         -            -            -            -            -            -



With a little practice you will find that it is quite easy to structure your data so that the Nested 
Analysis will correctly recognize your data set. 

Mixed designs must be entered via the BSDM routines in a similar manner. Keep in mind 
that whichever factor you call D must have its levels varying more rapidly than factor C 
which in turn varies faster than factor B. The levels of factor A will change only after each
level of factor B has appeared once.



Note 

BLOCKS as described in the Factorial Design are not considered 
for the Nested Design. That is, you will not be asked any questions 
concerning blocks (major replications) of this design. 



Split-Plot Design 

In terms of the data structure in the BSDM routine, it is immaterial whether one is using a 
Split-Plot Design or a Factorial Design. Both designs are the same in terms of the data 
arrangement in BSDM. Examples representing the two modes of data arrangement for the 
minor replications (samples) will be shown below. Consider a split-plot experiment in which 
the pull-off force necessary to remove boxes from a tape is to be studied (see Hicks, pp. 219-222,
226). Two complete replications (blocks) of the following experiment were performed.
Three long strips of tape with boxes attached were chosen to represent three different
methods of attaching the boxes to the strips. A chamber was used to study the effects of
three humidity levels (50, 70, and 90%) on the pulling force of three boxes. The experimental
procedure called for randomly choosing one of the three humidity levels and
adjusting the chamber to maintain that level. Two portions of each of the three strips were 
placed in the chamber for a specified period of time. The pull-force was then measured for 
each of the six portions of strip. Subsequently, one of the two remaining levels of humidity 
was randomly chosen and the process was repeated. Finally, the last level of humidity was 
maintained in the chamber. Upon completion of the first three humidities times three strips 
times two samples = 18 measurements, the entire process was repeated again in a random 
fashion. 






The reason that this is a split-plot design and not a factorial is because of the ordering of the 
measurements of pull force. Since it was not deemed possible to randomly investigate the 
effects of humidity and strip type on the pull force response, we have a restricted randomization
of the split-plot type.

The two forms for specifying the sample replications are shown below. Note how the factor 
names A and B have been assigned to the factors in this experiment and how that corresponds
to the data arrangement as shown. Only one response variable is necessary for this
design. 

FORM 1 





          Y = pull force                    B             A
OBS#      Variable 1        Sample       Humidity       Strip      Block

   1          -               1            50%           S1         B1
   2          -               2
   3          -               1            70%
   4          -               2
   5          -               1            90%
   6          -               2
   7          -               1            50%           S2
   8          -               2
   9          -               1            70%
  10          -               2
  11          -               1            90%
  12          -               2
  13          -               1            50%           S3
  14          -               2
  15          -               1            70%
  16          -               2
  17          -               1            90%
  18          -               2
  19          -               1            50%           S1         B2
   .
   .
  36          -               2









In this experiment we would specify two blocks (major replications). Factor A (strips) has 
three levels, factor B (humidity) has three levels, and there are two samples for mode 1 (all 
samples within the same variable). Later, in the Split-Plot Design program, we would specify
that factor B (humidity) is the whole plot while factor A (strips) is the subplot. As the 
experiment is described above, the humidity factor (B) would be in the whole plot even 
though it does not vary as fast as the strip factor (A). We could have entered our data in a 
manner which would have had the levels of humidity varying the slowest. Then we would 
identify humidity as factor A. 






The second mode of sample specification for this example would require two variables, say 
variable one and variable two. 

FORM 2 





                  Y = pull force                      B             A
OBS#      Var1 = Sample 1    Var2 = Sample 2       Humidity       Strip      Block

   1            -                  -                 50%           S1         B1
   2            -                  -                 70%
   3            -                  -                 90%
   4            -                  -                 50%           S2
   5            -                  -                 70%
   6            -                  -                 90%
   7            -                  -                 50%           S3
   8            -                  -                 70%
   9            -                  -                 90%
  10            -                  -                 50%           S1         B2
  11            -                  -                 70%
  12            -                  -                 90%
  13            -                  -                 50%           S2
  14            -                  -                 70%
  15            -                  -                 90%
  16            -                  -                 50%           S3
  17            -                  -                 70%
  18            -                  -                 90%







The one-way design, or one-way classification as it is sometimes called, has three possible 
forms of data organization or structures in BSDM. These three forms are identical to the 
forms for the ONE-WAY ANALYSIS OF COVARIANCE except that the covariance analysis 
will expect both a response variable, Y, and a covariate, X, to be specified while the 
ONE-WAY DESIGN expects only the response variable Y. 

The first mode of data organization for the one-way classification uses t variables in BSDM 
to specify the t treatments in this design. Consider an experiment in which four types of
"mums" were investigated in a greenhouse experiment. Suppose two responses were measured:
diameter (Y1) and plant height (Y2). The data was collected in two separate years
(subfiles) with approximately five pots per variety. One possible organization of this data is 
as follows: 






Mode 1 Example 





Variable               1     2       3     4       5     6       7     8
Response               Y1    Y2      Y1    Y2      Y1    Y2      Y1    Y2
Treatment/Variety       Type 1        Type 2        Type 3        Type 4

OBS#                                                                        Subfile

   1                   -     -       -     -       -     -       -     -     1975
   2                   -     -       -     -       -     -       -     -
   3                   -     -       -     -       -     -       -     -
   4                   -     -       MV    MV      -     -       -     -
   5                   -     -       MV    MV      MV    -       MV    MV

   6                   -     -       -     -       -     -       -     -     1976
   7                   -     -       -     -       MV    -       -     -
   8                   MV    MV      -     -       -     -       -     -
   9                   MV    MV      -     -       -     MV      -     -
  10                   MV    MV      -     -       -     -       -     -
  11                   MV    MV      MV    MV      -     -       -     -



Here, a dash ( - ) indicates a numerical value is present, and MV indicates that a missing 
value is assigned to this position. 



Note 

The arrangement shown above has provisions for missing values 
to accommodate the various number of pots per treatment (variety).
The two subfiles do not have the same number of pots per
treatment. The MV operation must be used to 'square-off' the
sample sizes for each variable. 



You would tell the program that variables one, three, five, and seven represent the four 
treatments for the first response (diameter). You would then specify the subfile number. 
The program would then assume that the sample size is five if subfile one is specified and six 
if subfile two is specified. If subfiles are to be ignored, then a sample size of 11 would be 
assumed. Of course all calculations within the program would check for missing values (MV) 
and delete those values from the calculations. Subsequent to the analysis on the first 
response, Y1, you may remain within this subfile and specify another response, say Y2.
Finally, you may select another subfile and/or variables for further analysis. 
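
The way missing values are dropped before a one-way analysis can be sketched as follows. This Python example is not the program's code: the function name and the data are hypothetical, and `None` stands in for the MV marker used by BSDM.

```python
def one_way_aov(treatments):
    """One-way AOV for unequal sample sizes; None marks a missing value (MV)."""
    groups = [[y for y in t if y is not None] for t in treatments]
    n = [len(g) for g in groups]
    N = sum(n)
    grand = sum(sum(g) for g in groups) / N
    means = [sum(g) / len(g) for g in groups]
    ss_trt = sum(ni * (m - grand) ** 2 for ni, m in zip(n, means))
    ss_err = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    df_trt, df_err = len(groups) - 1, N - len(groups)
    result = {"n": n, "SS treatments": ss_trt, "SS error": ss_err,
              "DF treatments": df_trt, "DF error": df_err}
    if ss_err > 0:
        result["F"] = (ss_trt / df_trt) / (ss_err / df_err)
    return result


# Three hypothetical treatment columns, squared off with MV (None) entries
columns = [[1, 2, 3], [2, 4, None], [6, None, 8]]
aov = one_way_aov(columns)
print(aov)
```

The effective sample size of each treatment is the number of non-missing entries in its column, so the squared-off columns may have different sizes in the analysis.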

The second mode for possible data organization within the BSDM structure uses only one 
variable for each response. Within this response variable, the treatment observations are 
assumed to be contiguous. You specify the number of observations in each treatment 
including any missing values. The program assumes that the first observation in the first 
treatment is observation number one if the first subfile is chosen or subfiles are ignored, or 
the first observation within the specified subfile. Thereafter, the subfile is partitioned into t 
nonoverlapping but connected intervals - one corresponding to each treatment. Hence, for 
the example with four treatments and two response variables, one possible arrangement 
might be: 



Mode 2 Example

                        Variable
                        1      2      Treatments
              OBS#      Y1     Y2     (Variety #)

SUBFILE 1       1       -      -      1
  1975          2       -      -
                3       -      -
                4       -      -
                5       -      -
                6       -      -      2
                7       -      -
                8       -      -
                9       -      -      3
               10       -      -
               11       -      -
               12       -      -
               13       MV     -
               14       -      -      4
               15       -      -
               16       -      -
               17       -      -

SUBFILE 2      18 through 36, arranged in the same way
  1976         (of the individual entries, only two MV entries survive in this copy)






Note 

The sample sizes for the first subfile of each variable would be
five, three, four, and four, respectively. For subfile two, the
sample sizes would be two, five, five, and six. Of interest is the
comparison between the number of data storage positions needed for
the two modes of arrangement. For mode 1, the number of
positions required would be 11 observations times 8 variables = 88.
For the second mode, the number required is 36 observations
times 2 variables = 72. In many cases, if there are several missing
values you may conserve available memory locations by using the
second mode of arrangement.
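
As a rough illustration of the second mode, the Python sketch below splits a contiguous run of observations into treatments and repeats the storage comparison from the note. The function name is hypothetical, and the treatment lengths 5, 3, 5, 4 (which include the one missing value) are read off the Mode 2 example, so treat them as an assumption.

```python
def partition(values, sizes):
    """Split a contiguous run of observations into consecutive treatments.

    sizes gives the number of observations in each treatment,
    including any missing values.
    """
    if sum(sizes) != len(values):
        raise ValueError("treatment sizes must account for every observation")
    groups, start = [], 0
    for n in sizes:
        groups.append(values[start:start + n])
        start += n
    return groups

# Observation numbers 1..17 of subfile one.
obs = list(range(1, 18))
groups = partition(obs, [5, 3, 5, 4])
print(groups[1])           # observation numbers of the second treatment

# Storage positions needed by each mode, as computed in the note.
mode1_positions = 11 * 8   # 11 observations times 8 variables
mode2_positions = 36 * 2   # 36 observations times 2 variables
print(mode1_positions, mode2_positions)
```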

The third mode of data entry allows for treatments which are not necessarily connected 
within one variable. Each treatment is composed of a contiguous set of observations. Since 
this mode of data arrangement may choose treatment groups throughout the data set, it is 
not possible or necessary to specify subfiles. The arrangement of the data is similar to the 
arrangement described for Mode 2; however, it is possible to have "gaps" or "holes" in
the data set.

Consider the example described above. Suppose it is desired to compare 1975 variety #2
with the 1976 variety #2 for both responses (Y1 and Y2). Please refer to the Mode 2
Example and note that we would need to compare observations 6, 7, and 8 with
observations 20, 21, 22, 23, and 24. The first three specified observations are from variety #2 in
subfile one which is the 1975 data set and the other five values are from variety #2 in subfile
two which is the 1976 data set.

Note that although this mode of data arrangement is quite similar to Mode 2, it does provide 
for more freedom on the part of the data analyst in terms of which treatments are to be used. 
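
A minimal Python sketch of the third mode follows. The helper name and the placeholder data are assumptions; only the observation ranges 6 to 8 and 20 to 24 come from the example above.

```python
def select(values, first, last):
    """Return observations first..last (1-based, inclusive)."""
    return values[first - 1:last]

# Placeholder response variable with 36 observations (not real data).
y1 = [float(i) for i in range(1, 37)]

# Treatments may be taken from anywhere in the data set, with gaps between them.
variety2_1975 = select(y1, 6, 8)
variety2_1976 = select(y1, 20, 24)
print(len(variety2_1975), len(variety2_1976))
```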

Two-Way (Unbalanced) Design 

The unbalanced nature of this design makes it more complicated in terms of the data 
arrangement. It will not be possible to assume that the order of input is completely specified 
by factor names such as factor A and factor B. This is because it is possible to have not only 
different numbers of minor replication (samples) within each treatment combination (levels 
of factor A and factor B), but also to have one or more cells completely missing. Of course, 
the absence of certain cells is not a desirable characteristic of any factorial experiment; 
however, there are certain situations in which missing cells naturally occur. 

Therefore it is necessary for the BSDM data structure to provide for proper identification of 
the row and column levels (factors A and B) as well as the particular sample number within 
that cell. Two methods of specification are permitted for this type of design. The first "data 
storage type" assumes that you will use three BSDM variables to specify the response 
variable and factor levels. One variable will be used to store the particular response to be 
analyzed at this time. One variable will be used for each of the two factors A and B. It is not 
necessary to use a variable to specify the sample or observation number; however, you may 
wish to do so in order to completely identify each observation. 






Please note that the levels of factors A and B must be the integers 1, 2, ..., up to the number of
levels of each factor. Hence, if factor A has three levels 70, 80, and 120, you would store 
these three levels in a variable as 1, 2, and 3 rather than 70, 80, and 120. The purpose of 
this restriction is to conserve data storage allocation. Within the program you will be able to 
specify the actual levels of the variables when this is necessary for the computation. 
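
The coding of levels can be sketched in Python as follows. The function name is hypothetical; the levels 70, 80, and 120 are the ones used in the paragraph above.

```python
def code_levels(raw):
    """Replace actual factor levels by the integer codes 1, 2, ..."""
    levels = sorted(set(raw))
    code = {level: i + 1 for i, level in enumerate(levels)}
    return [code[x] for x in raw], levels

# Factor A as recorded, then as it would be stored in a BSDM variable.
codes, actual = code_levels([70, 120, 80, 70, 120])
print(codes)    # integer codes to store
print(actual)   # actual levels, to be supplied to the program when needed
```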

As an example of the first data storage type, suppose you have factors of time and
temperature involved in an experiment which is designed to study the effects of these two factors on
the yield (Y) of a chemical process. Suppose you had used three time settings of 4, 5, and
7.5 hours and three temperature settings of 110, 115, and 120 °F. Assume that, for one reason
or another, from two to five samples were run at each treatment combination (temperature
and time condition). Further, let us assume that at the highest temperature and time
condition, it was impossible to finish the experimental process. Thus, we can consider this "cell"
as missing. Assume two responses Y1 and Y2 were measured on almost all samples. One way
to enter this data set in the BSDM program is as follows:



Mode 1 Example

         BSDM Variable Number
         1      2      3     4
Obs                                       B Levels    A Levels
 #       Y1     Y2     A     B   Sample   Temp        Time

  1      MV     -      1     1     1      110°        4 hrs.
  2      -      -      1     1     2
  3      -      -      1     2     1      115°
  4      -      -      1     2     2
  5      -      -      1     2     3
  6      -      -      1     3     1      120°
  7      -      -      1     3     2
  8      -      -      1     3     3
  9      -      -      1     3     4
 10      -      -      2     1     1      110°        5 hrs.
 11      -      -      2     1     2
 12      -      -      2     1     3
 13      -      -      2     2     1      115°
 14      -      -      2     2     2
 15      -      -      2     3     1      120°
 16      -      -      2     3     2
 17      -      -      2     3     3
 18      -      -      2     3     4
 19      -      MV     2     3     5
 20      -      -      3     1     1      110°        7.5 hrs.
 21      -      -      3     1     2
 22      -      -      3     1     3
 23      -      -      3     2     1      115°
 24      -      -      3     2     2
 25      -      -      3     2     3
 26      MV     MV     3     3     1      120°








Notes: 

1. Observation number 26 is included to let the program know that the cell with 
temp= 120, time = 7.5 is missing in both responses. 

2. Both observation #1 and #19 have one and only one missing response. 

3. Although we have shown the 26 observations in a systematic arrangement, this is not 
necessary except for your own information. 

4. The specification of variable numbers in the analysis will identify which factor it 
should consider as rows (factor A) and which it should consider as columns (factor B). 

The second data storage mode allows you to conserve on variables by using only one
variable to identify both row and column levels. The levels are "packed" into four digits as
xxyy, where xx identifies the row level and yy identifies the column level. Consider the
example described above. Using the packed form of storage we will need to allocate at least
three variables in the BSDM routine. One variable is needed for each response and one for
the 'packed' row/column identification. You may wish to use another variable to identify the
sample numbers or you might wish to use the 'space' after the row/column specification. For
example, suppose for the third row and second column you wish to identify the observation
by the index 74. The packed version would be 0302.74. The program will use only the first
four digits 0302 to identify the row and column numbers. Up to 6 digits may be input after
the decimal point for identification purposes.
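
The packing scheme can be illustrated with a small Python sketch. The function names are hypothetical and this is not the program's own code; it only shows how the row and column are recovered from the digits before the decimal point.

```python
def unpack(packed):
    """Recover (row, column) from a packed value such as 0302.74."""
    whole = int(packed)            # 302.74 -> 302 (digits after the point are ignored)
    row, col = divmod(whole, 100)  # 302 -> (3, 2)
    return row, col

def pack(row, col, ident=0.0):
    """Build xxyy.id from a row, a column, and an optional fraction below 1."""
    return row * 100 + col + ident

print(unpack(302.74))
print(unpack(pack(1, 3)))
```

A leading zero, as in 0302, is not stored in a numeric variable; 302 carries the same row and column information.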

The example described above may be entered via the BSDM routine as follows (for the first 
ten and the last three observations): 



Mode 2 Example

         BSDM Variable Number
         1      2      3
Obs                    ID                 B           A
 #       Y1     Y2     xxyy     Obs#      Temp        Time

  1      MV     -      0101      1        110°        4 hrs.
  2      -      -      0101      2
  3      -      -      0102      1        115°
  4      -      -      0102      2
  5      -      -      0102      3
  6      -      -      0103      1        120°
  7      -      -      0103      2
  8      -      -      0103      3
  9      -      -      0103      4
 10      -      -      0201      1        110°        5 hrs.
  .
  .
  .
 24      -      -      0302      2
 25      -      -      0302      3
 26      MV     MV     0303      1        120°        7.5 hrs.






One-Way Covariance 

The three forms of data arrangement for the one-way analysis of covariance are the same as 
the one-way design except that both a response variable (Y) and a covariate (X) must be 
specified. Hence, for the example previously described for mode 1 of the one-way design 
you would need to specify 12 variables of the BSDM data set and specify a covariate for 
each treatment set. If different covariates are to be used with the two response variables, 
then you would need 16 variables. One possible ordering of these variables and treatments 
for the ith observation is as follows: 





              Type 1         Type 2         Type 3         Type 4

Variable #    1   2   3      4   5   6      7   8   9      10  11  12
              X   Y1  Y2     X   Y1  Y2     X   Y1  Y2     X   Y1  Y2



For both mode 2 and 3, you would need to specify one additional variable number as the 
covariate for each dependent variable. Of course the response variables may use the same 
covariate in the analysis. 






Factorial Design 

Object of Program 

This program will calculate the complete analysis of variance table for a two-, three-, or 
four-factor, completely balanced experiment. There may be multiple observations per cell 
and the entire experiment may be replicated in blocks. The program will automatically print 
out all main effect and two-way interaction means. If three- or four-way interactions exist, 
these interaction means may be printed. If there is more than one observation per cell, then 
tests for homogeneity of variance may be computed. If the experiment has not been
replicated, or only one observation per mean is present, there will be no F values computed. All
F tests assume that the factors are fixed. A label of up to ten characters may be assigned to 
each factor. 

Typical Program Flow 



Input data via BSDM



Select advanced statistics



Insert program medium



Choose factorial design



Specify variables and subfiles



Main effect means and two-way
interaction means are printed



AOV table



Test for homogeneity of variance



Perform auxiliary routines,
e.g., interaction plot



Special Considerations 

See the General Information portion of this AOV manual for program limitations. Also, 
carefully read the Data Structures section before entering your data through Basic Statistics 
and Data Manipulation. 

References 

1. Cochran, W.G. and Cox, G.M., Experimental Designs, John Wiley and Sons, 
Inc., 1957. 

2. Snedecor, G.W. and Cochran, W.G., Statistical Methods, Iowa State University Press, 
1967. 






Nested or Partially Nested Design 

Object of Program 

This program will calculate and print the AOV for any valid nested design. The program 
does this by computing a general factorial and then combining sums of squares to get the 
desired results. There can be up to five nested factors if samples are entered. This program 
does not allow the experiment to be replicated in blocks. The program will not compute any 
F ratios unless the design is a completely nested design. All non-nested main effects, main 
effect means, and two-way interactions will be printed. If there are any non-nested,
three-way interaction means, they may be printed.

Possible Designs 

All possible designs are displayed with arbitrary factors P, Q, R and S. In the program you 
will be asked to match your factors (A, B, etc.) with these arbitrary labels to obtain the 
design you desire. The notation, Q(P), means that factor Q is nested within factor P. The 
following options are available. 

Number of factors = 2

    P
    Q(P)

Number of factors = 3

    Design 1       Design 2       Design 3

    P              P              P
    Q(P)           Q              Q(P)
    R(Q(P))        PQ             R
                   R(PQ)          PR
                                  QR(P)

Number of factors = 4

    Design 1       Design 2       Design 3       Design 4

    P              P              P              P
    Q(P)           Q              Q              Q(P)
    R(Q(P))        R              PQ             R
    S(R(Q(P)))     PQ             R(PQ)          PR
                   PR             S              QR(P)
                   QR             PS             S
                   PQR            QS             PS
                   S(PQR)         PQS            QS(P)
                                  RS(PQ)         RS
                                                 PRS
                                                 QRS(P)






Typical Program Flow 



Input data via BSDM 



Choose Advanced Statistics option 



Insert program medium 



Choose nested and 
partially nested design 



Specify variables and subfiles 



Main effect means two (or 3) way 
interaction means are printed 



AOV table 



Perform auxiliary routines, 
e.g., interaction plot 



Special Considerations 

See the General Information portion of this AOV manual for program limitations. Also, 
carefully read the Data Structure section before entering your data through Basic Statistics 
and Data Manipulation. 

References 

1. C.R. Hicks "Fundamental Concepts in the Design of Experiments" 2nd edition. Holt, 
Rinehart and Winston, 1973. 

2. D.C. Montgomery "Design and Analysis of Experiments". Wiley, 1976. 






Split Plot Designs 

Object of Program 

This program will calculate a general factorial and then combine sums of squares to form 
specific error terms for the split plot or split-split plot design. 

Blocks must be present and at least two factors are necessary. Up to three factors may be 
specified and minor replications (samples) may also be declared. 

All main effects and interaction means will be printed. All computed F tests assume the 
factors are fixed. 

Typical Program Flow 



Input data via BSDM 



Select Advanced Statistics option 



Insert program medium 



Select split plot design 



Specify variables and subfiles 



Block and main effect means, 
two way interaction means are printed 



AOV table 



Perform auxiliary routines, 
e.g., interaction plot 



Special Considerations 

See the General Information portion of this AOV manual for program limitations. Also, 
carefully read the Data Structures section before entering your data through Basic Statistics 
and Data Manipulation. 

References 

1. C.R. Hicks "Fundamental Concepts in the Design of Experiments" 2nd edition. Holt, 
Rinehart, Winston, 1973. 

2. D.C. Montgomery "Design and Analysis of Experiments". Wiley, 1976. 






One-Way Classification 



Object of Program 

This program will perform a one-way analysis of variance for treatments of equal or unequal 
size. You may give a ten character name to each treatment. For each treatment the name, 
sample size, total, mean, and standard deviation will be printed. The analysis of variance 
table will include all sums of squares and mean squares as well as the calculated F and the 
probability associated with getting that F value or one larger. You also have control over 
how many decimal places are to be printed on the output. 
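
For readers who want to check the printed table by hand, the following Python sketch computes the usual one-way analysis of variance quantities for treatments of unequal size. It is a textbook calculation written for illustration, not the program's own code; the function name and the sample data are assumptions, and the probability of F is left out.

```python
def one_way_anova(groups):
    """Return sums of squares, mean squares, and F for a one-way layout."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between-treatment and within-treatment sums of squares.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "means": means,
        "ss_between": ss_between, "df_between": df_between, "ms_between": ms_between,
        "ss_within": ss_within, "df_within": df_within, "ms_within": ms_within,
        "F": ms_between / ms_within,
    }

# Hypothetical treatments of unequal size.
table = one_way_anova([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0, 3.0], [5.0, 6.0, 7.0]])
print(table["df_between"], table["df_within"], round(table["F"], 3))
```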



Typical Program Flow 



Input data via BSDM 



Select Advanced Statistics option 



Insert program medium 



Select one-way classification 



Specify variables and subfiles 



Summary statistics 



AOV table 



Perform auxiliary routines, 
e.g., multiple comparisons 



Special Considerations 

See the General Information portion of this AOV manual for program limitations. Also, 
carefully read the Data Structure section before entering your data through Basic Statistics 
and Data Manipulation. 

References 

1. W.J. Dixon, F.J. Massey "Introduction to Statistical Analysis" Third Edition. 
McGraw-Hill, 1969. 

2. G.W. Snedecor, W.G. Cochran "Statistical Methods" Sixth Edition. Iowa State
University Press, 1967.






Two-Way Unbalanced Design 

Object of Program 

The purpose of this program is to perform an analysis of variance on a two-way
classification with unequal subclass frequencies. The analysis may be performed in two ways.

If interactions are known to be present in the population, and all subclasses have at least 
one observation, then the method of weighted squares of means should be used to test the 
main effects. 

If interactions are known to be absent in the population, or if at least one subclass has no 
observations, then the method of fitting constants should be used. In any case, if at least one 
subclass has no observations, the method of fitting constants must be used. 

If it is not known whether or not interactions are present in the population, then a
preliminary analysis of variance should be studied in order to test for interaction. If this test is
significant, then the method of weighted squares of means should be used. A significance 
level of 0.25 may be used when testing for the presence of interaction. 

Typical Program Flow 



Input data via BSDM



Select Advanced Statistics



Insert program medium



Select two-way classification



Specify variables and subfiles



Summary statistics



AOV table



Perform auxiliary routines,
e.g., multiple comparisons



Special Considerations 

See the General Information portion of this AOV manual for program limitations. Also, 
carefully read the Data Structures section before entering your data through Basic Statistics 
and Data Manipulation. 

References 

1. Bancroft, T.A. (1968). Topics in Intermediate Statistical Methods. The Iowa State 
University Press, Ames, Iowa. 

2. Searle, S.R. (1971). Linear Models, John Wiley and Sons. 






One-Way Analysis of Covariance 

Object of Program 

This program will perform a one-way analysis of covariance for equal or unequal sample 
sizes. You may give a ten-character label to each treatment. For each treatment, a covariate 
(X) and a response variable (Y) must be specified. 

For each treatment, the number of observations in the treatment, the means and standard 
deviations for the covariate (X) and the response (Y), the correlation between the two, and 
the equation of the least squares line will be printed. For the overall data, the same things 
will be computed and printed. 

The corrected sums of squares tables will be printed and the analysis of covariance table 
with the calculated F and the probability associated with getting that F value or one larger 
will be printed. 

Tests of the one-way analysis of variances for both X and Y, tests for equal slopes within 
treatments, and significant pooled regression will be calculated and printed. 

The adjusted means and the standard errors of the adjusted means will be printed. These 
adjusted means will be saved for further analysis when doing multiple comparisons, or 
treatment contrasts. 

Any time an observation is found with either the covariate (X) or response (Y) missing, the 
point will be deleted from the calculations. 

You also have control over how many decimal places are to be printed on the output. 



Typical Program Flow 






Input data via BSDM 



Select Advanced Statistics option 



Insert program medium 



Select one way analysis of covariance 



Specify variables and subfiles 



Summary statistics are printed 



Within treatment 
regression is performed 



ANOVA table 
One-way analysis of X variable 
One-way analysis of Y variable 



Test of homogeneity
of regression coefficients



Test of homogeneity 
of pooled regression coefficients 



One way analysis of covariance table 



Perform auxiliary routines, 
e.g., multiple comparisons 



Special Considerations 

See the General Information portion of this AOV manual for program limitations. Also, 
carefully read the Data Structure section before entering your data through Basic Statistics 
and Data Manipulation. 

References 

1. W.J. Dixon, F.J. Massey "Introduction to Statistical Analysis", Third Edition. 
McGraw-Hill, 1969. 

2. G.W. Snedecor, W.G. Cochran, "Statistical Methods", Sixth Edition. Iowa State
University Press, 1967. 






F-Prob 

Object of Program 

Given the numerator degrees of freedom, the denominator degrees of freedom, and an
F value > 1, this program will calculate the probability that an F random variable has a value
greater than or equal to the given F value.
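
The manual does not describe how the probability is computed. As one possible illustration, the Python sketch below uses the standard relation between the F distribution and the regularized incomplete beta function, evaluated with a continued fraction. The function names are assumptions, and this should not be read as the algorithm used by the program.

```python
import math

def _beta_cf(a, b, x, max_iter=200, eps=1e-12):
    """Continued fraction for the incomplete beta function (modified Lentz method)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even step of the recurrence.
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        h *= d * c
        # Odd step of the recurrence.
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h

def reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log(1.0 - x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b

def f_upper_tail(f, df1, df2):
    """Probability that an F(df1, df2) variable is at least f."""
    if f <= 0.0:
        return 1.0
    x = df2 / (df2 + df1 * f)
    return reg_inc_beta(df2 / 2.0, df1 / 2.0, x)

print(f_upper_tail(4.10, 2, 10))
```

With two numerator degrees of freedom the result can be checked against the closed form (1 + 2F/df2) raised to the power -df2/2.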

References 

1. Boardman, T.J. (editor) 9830A Statistical Distribution Pac, Hewlett-Packard (PN 
09830-70854), September, 1974. 

2. Boardman, T.J. (editor) 9845A General Statistics Package. 

3. Boardman, T.J. and R.W. Kopitzke, "Probability and Table Values for Statistical 
Distributions", 1975, Proceedings of the Statistical Computing Section of The
American Statistical Association, pp 81-86.






Orthogonal Polynomials 



Object of Program 

This program generates orthogonal polynomials. This allows you to determine if
quantitative factor levels with equal or unequal spacings in the levels are linear, quadratic, etc., in
their relationship to the response variable. The output includes the sum of squares, the 
F-ratio and the P(F>comp F) for each degree polynomial. 
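
One way to obtain orthogonal polynomial coefficients for unequally spaced levels is to orthogonalize the powers of the levels, as in the Python sketch below. This is an illustration only: the function name is hypothetical, equal sample sizes at every level are assumed, and the program may use a different method.

```python
def orthogonal_polynomials(levels, max_degree):
    """Return coefficient vectors for degrees 1..max_degree.

    Gram-Schmidt orthogonalization of the powers of the levels.
    max_degree must be less than the number of levels.
    """
    if max_degree >= len(levels):
        raise ValueError("maximum degree must be less than the number of levels")
    basis = []
    for degree in range(max_degree + 1):
        vec = [float(x) ** degree for x in levels]
        for b in basis:
            # Remove the component of vec along each lower-degree polynomial.
            coef = sum(v * w for v, w in zip(vec, b)) / sum(w * w for w in b)
            vec = [v - coef * w for v, w in zip(vec, b)]
        basis.append(vec)
    return basis[1:]   # drop the constant (degree 0) term

# Equally spaced levels give multiples of the familiar tabled coefficients.
linear, quadratic = orthogonal_polynomials([1, 2, 3], 2)
print(linear, quadratic)

# Unequally spaced levels, such as times of 4, 5, and 7.5 hours.
lin_u, quad_u = orthogonal_polynomials([4, 5, 7.5], 2)
print(lin_u, quad_u)
```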



Typical Program Flow 



Perform some type of AOV 



Select further analysis 



Select orthogonal polynomial 



Define the maximum degree 
of orthogonal polynomial 



Orthogonal polynomial decomposition 
on rows and columns 



Special Considerations 

Maximum Degree of Orthogonal Polynomial 

For a one-way classification design, it must be less than the number of treatments. 

For a two-way (unbalanced) design, it must be less than the number of levels of factor A. 

For other designs, it must be less than the number of levels of the factor. 

Enter zero if that factor is not a quantitative variable or if it is not desired to do orthogonal 
polynomial comparisons on the factor. 

Level Associated with Treatment (row, factor) #"i" 

When this question is asked, you should enter the quantity corresponding to this treatment 
(for one-way design), or this row (two-way design), or the level "i" of factor k (for other 
design). 






Contrasts 

Object of Program 

This program performs treatment contrasts on main effect means or on two-way means with 
one of the factors held constant. This allows you to make any desired linear contrast of a set 
of treatment means by entering an appropriate set of coefficients. The output includes the 
user-entered coefficients, the contrasts, and the sum of squares, F-ratio and P(F>comp F) 
associated with the contrasts. 



Typical Program Flow 





Perform some type of AOV



Choose further analysis



Choose contrasts



Choose 1. Main effect?
       2. Two-way means?



1: Enter the contrast
   coefficients



2: Contrast on row: enter
   the level # of column
   to be held constant



























Special Considerations 



How to Make a "Contrast" 

If the coefficients for the contrasts you enter are denoted by c(i), then one condition for
choosing the c(i) is that they must satisfy

    Σ c(i) = 0

where i is summed over all levels of the factor of interest. Obviously, this implies that some
of the c(i) must be negative. Of course one or more of the c(i) may be equal to zero.

Let's look at an example which demonstrates the procedure. Suppose you have a one-way 
classification with four treatments. You find in the AOV table that you have a significant F 
value. So, you reject the hypothesis that all the treatment effects are equal, i.e., you reject 

H0: T1 = T2 = T3 = T4.






You still don't know exactly which treatments are significantly different from one another. 
This is where you use a contrast. Suppose you want to know if treatment one is significantly 
different from treatment three, i.e., you want to test the hypothesis 

H0: T1 = T3, or H0: T1 - T3 = 0

or, written in still another way

H0: 1*T1 + 0*T2 - 1*T3 + 0*T4 = 0

If the number of observations in each treatment is equal, then to specify the above
contrast all you need to do is to supply the coefficients of the treatments. That is, coefficient
one is 1, coefficient two is 0, three is -1 and four is 0. You must tell the program what the
coefficients (of the T's above) are.

Suppose the number of observations for the four respective treatments are 6, 8, 7, and 6.
Suppose further that you want to test if treatment two is significantly different from
treatment four. Write the hypothesis as:

H0: 0*T1 + 1*T2 + 0*T3 - 1*T4 = 0.

Then try the following procedure to determine your contrast coefficients, c(i). Form a table
using the number of observations for the ith treatment, n(i), as one column. Use the
coefficients of the T's in the above hypothesis as the last column. Call these coefficients c(i)n(i).

Remember, one condition for a valid contrast is that Σ c(i)n(i) = 0. So, check to make sure
that condition is satisfied. Then, make a column for your as yet unknown contrast
coefficients, c(i). You should have the following table.

n(i)      c(i)      n(i)c(i)

 6                     0
 8                     1
 7                     0
 6                    -1

Now, just fill in the c(i) column. To do that notice that c(i) = [n(i)c(i)] / n(i). So you obtain the
following.

n(i)      c(i)      n(i)c(i)

 6         0           0
 8         1/8         1
 7         0           0
 6        -1/6        -1




So, contrast coefficient one is 0, two is 1/8, etc. 

Notice that the contrast coefficients for a given contrast are not unique. For example, the
above contrast would be performed if contrast coefficients of 0, 1/4, 0, -1/3 were given.
Also, a similar contrast would be obtained using 0, -1/8, 0, 1/6 as the coefficients.
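
The table procedure above amounts to dividing each coefficient of the hypothesis by the corresponding sample size. A minimal Python sketch, with a hypothetical function name, is:

```python
from fractions import Fraction

def contrast_coefficients(sizes, hypothesis):
    """Divide the hypothesis coefficients n(i)c(i) by n(i) to obtain c(i)."""
    if sum(hypothesis) != 0:
        raise ValueError("the n(i)c(i) column must sum to zero")
    return [Fraction(h, n) for h, n in zip(hypothesis, sizes)]

sizes = [6, 8, 7, 6]                 # n(i) for the four treatments
c = contrast_coefficients(sizes, [0, 1, 0, -1])
print([str(ci) for ci in c])

# The condition for a valid contrast: the sum of c(i)n(i) is zero.
check = sum(n * ci for n, ci in zip(sizes, c))
print(check)
```

Exact fractions are used so that 1/8 and -1/6 appear as in the table rather than as rounded decimals.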






Interaction Plots 

Object of Program 

This program will plot two-way interaction, or three-way interaction means. The two-way 
interaction plot will be on one graph. You may decide which factor will be put on the X axis 
as well as the spacing of the levels, and then the other factor will be plotted. Each interaction 
line will be labeled indicating the level of the factor. 

For instance, the three levels of a factor B will be labeled B1, B2, B3.

The three-way interaction plot will be plotted on several graphs. That is, a two-way
interaction will be plotted for each level of the third factor. The program will give you a prompt
when it is necessary to do the next page of the plot. 

You may also have a legend drawn showing the length of the Least Significant Difference 
(LSD) and/or the length of Tukey's Honestly Significant Difference (HSD). To do these, it is 
necessary to enter the critical value, error mean square, and its corresponding degrees of 
freedom. 
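
The manual does not state how the legend lengths are computed from these inputs. The Python sketch below uses the usual textbook definitions for means based on equal sample sizes, LSD = t * sqrt(2 * MSE / n) and HSD = q * sqrt(MSE / n). The function names and the example numbers are assumptions.

```python
import math

def lsd(t_critical, error_ms, n):
    """Least Significant Difference for two means of n observations each."""
    return t_critical * math.sqrt(2.0 * error_ms / n)

def hsd(q_critical, error_ms, n):
    """Tukey's Honestly Significant Difference for means of n observations each."""
    return q_critical * math.sqrt(error_ms / n)

# Hypothetical inputs: tabled critical values, error mean square 8.0.
print(lsd(2.0, 8.0, 4))
print(hsd(3.0, 8.0, 2))
```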

Special Considerations 

Which interaction is to be plotted? 

When this question is asked, enter the two letters corresponding to the two factors. The 
input must be one of AB, AC, BC, AD, BD, or CD, and the one selected must be possible for 
your data set. 

What 3-way interaction is to be plotted? 

When this question is asked, enter the three letters corresponding to the three factors. The 
input must be one of ABC, ABD, ACD or BCD. 

The label of the X-axis for an interaction plot. 

The factor levels must be given in increasing order. Factors whose levels are not in
increasing order must be given arbitrary level codes if they are to be used on the X-axis of an
interaction plot. 

References 

1. C.R. Hicks, "Fundamental Concepts in the Design of Experiments"; Second Edition. 
Holt, Rinehart, and Winston, 1972. 

2. B.J. Winer, "Statistical Principles in Experimental Design"; Second Edition. McGraw- 
Hill, 1971. 






Multiple Comparisons 

Object of Program 

This program allows you to select any one of five multiple comparison procedures to use on 
either main effect means or two-way table means. You must input the appropriate tabled 
values for the procedure selected. In addition, for the separation procedures for the
two-way means, you will need to specify the appropriate standard deviation to be used.

A separation table will be printed which should help you determine which treatment or 
factor levels are significantly different from one another. For example, the following table 
shows output for a set of treatments: 





                 Factor A
                          Sample
Level        Mean         Size        Separation

  1          10.7          10            ab
  2           9.8           9            a
  3          11.7          10            b
  4          15.8           8            c



We would interpret this table as showing that factor level 4 is significantly different from the
other levels of A since no other level has a "c" listed beside it. Also we see that level 1
cannot be distinguished from level 2 and level 1 cannot be distinguished from level 3. And,
level 2 can be shown to significantly differ from level 3 since they have no letters in
common.
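
The rule for reading the separation column (two levels differ only when they share no letter) can be written as a short Python sketch. The names are hypothetical; the letters are those of the table above.

```python
def differ(letters_a, letters_b):
    """Two levels are significantly different if they share no letter."""
    return not (set(letters_a) & set(letters_b))

separation = {1: "ab", 2: "a", 3: "b", 4: "c"}

# List every pair of levels that the table separates.
pairs = [(i, j) for i in separation for j in separation
         if i < j and differ(separation[i], separation[j])]
print(pairs)
```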

Of course, the conclusion one draws from the separation procedure may depend on which 
procedure is used and the level of significance you choose. 

Typical Program Flow 



Perform some type of AOV 



Choose further analysis 



Choose multiple comparisons 



Main effect means, 
two-way table means are printed 



Choose the procedure 
to be used (5 options) 



Multiple comparison 
is printed and/or plotted 






Special Considerations 

Which factor/main effect should be used? 

When this question is asked, you should input A, B, C, or D as the response. 

What level of alpha are you going to use? 

The value you input in response to this question is used for printout purpose only and not 
for any calculations. 

What table value should you use? 

The following chart shows required inputs for tabled values: 

Procedure #   Table               Notation        Parameter                                  Reference

1             Student's t         tα(df)          df = error degrees of freedom              (1,4)

2             Studentized range   qα/2(p,df)      p = # of means                             (6,4)
                                                  df = error degrees of freedom

3             Duncan's            q*α(p,df)       p is as above but reduces by 1 to p = 2    (3)

4             Studentized range   qα(p,df)        p same as 3                                (1,3,4)

5             Snedecor's F        Fα(p-1,df)      p = # of means                             (4,5)

See references (1) and (2) for more information on all procedures.



Unequal sample sizes 

In this case, the harmonic mean sample size, n0, will be used, where
n0 = p/(1/n1 + 1/n2 + ... + 1/np).
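
As an illustration, the harmonic mean sample size can be computed as follows. The function name is hypothetical; the sample sizes 10, 9, 10, and 8 are those of the separation table in this section.

```python
def harmonic_mean_size(sizes):
    """n0 = p / (1/n1 + 1/n2 + ... + 1/np) for p sample sizes."""
    return len(sizes) / sum(1.0 / n for n in sizes)

n0 = harmonic_mean_size([10, 9, 10, 8])
print(round(n0, 3))
```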

For the methods used in Multiple Comparisons, please refer to the Multiple Sample Tests 
portion of the General Statistics section of this manual. 

References 

1. Boardman, T.J. and D.R. Moffitt (1971) "Graphical Monte Carlo Type I Error for 
Multiple Comparison Procedures". Biometrics 27:3, 738-744. 

2. Carmer, S.G., and M.R. Swanson (1973) "Evaluation of Ten Pairwise Multiple
Comparison Procedures by Monte Carlo Methods". Journal of the American Statistical
Association 68:341, pp 66-74.

3. Duncan, D.B. (1955). Multiple range and multiple F tests. Biometrics 11, 1-42. 

4. Pearson, E.S. and Hartley, H.O. (1958). Biometrika Tables for Statisticians, Vol. I. 
Cambridge University Press, London. 

5. Scheffe, H. (1953). A method for judging all contrasts in the analysis of variance.
Biometrika 40, 87-104.

6. Tukey, J.W. (1953). The problem of multiple comparisons. Unpublished notes,
Princeton University.






Factorial Design 

Example 

Twenty-four laboratory rats were deprived of food, except for one hour per day, for several 
weeks. At the end of that time, each rat was inoculated with one of four doses of a certain 
drug and, after one of three amounts of time, was fed. The weight (in grams) of the food 
ingested by each rat was measured. The purpose of the experiment is to determine the 
effect of the drug on the motivation of the rats. 



         A                               B
Time before feeding               Dosage (mg/kg)
      (hours)
                         .1        .3        .5        .7

         1              9.07      5.63      4.42      1.38
                        8.77      8.76      3.01      3.96

         5              9.16     11.57      5.22      5.72
                       11.82     11.53      9.21      4.69

         9             16.08     10.37      7.27      5.48
                       14.65     14.46      6.10      9.28



The design for this experiment is a two-way factorial with three levels of time and four 
dosage levels of the drug. Two rats (observations) per experimental combination were used. 
The data can be subjected to an analysis of variance in order to determine if there are 
significant differences between the three times before feeding or the four dosages of the 
drug. In addition, we can determine if there is a significant interaction between time and 
dosage. 

The F ratios indicate no significant interaction effect (F = .915), but significant differences among time levels (F = 14.819) and among dosage levels (F = 19.533). The orthogonal polynomial decomposition for the time factor (A) shows a significant linear effect. The decomposition for the dosage factor (B) shows a highly significant linear effect and a cubic effect.
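
As a cross-check on the F ratios quoted above, the sums of squares for a balanced two-way factorial can be computed directly from the data table. The following Python sketch is not the library's program; the function and variable names are hypothetical, and it assumes a fixed-effects model with equal cell sizes.

```python
# Food intake in grams; DATA[time][dose] holds the two rats in that cell.
# Rows are 1, 5 and 9 hours; columns are .1, .3, .5 and .7 mg/kg.
DATA = [
    [(9.07, 8.77), (5.63, 8.76), (4.42, 3.01), (1.38, 3.96)],
    [(9.16, 11.82), (11.57, 11.53), (5.22, 9.21), (5.72, 4.69)],
    [(16.08, 14.65), (10.37, 14.46), (7.27, 6.10), (5.48, 9.28)],
]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def two_way_anova(data):
    """Balanced two-way factorial ANOVA with all factors fixed."""
    a, b, n = len(data), len(data[0]), len(data[0][0])
    observations = [y for row in data for cell in row for y in cell]
    grand = mean(observations)
    cell_means = [[mean(cell) for cell in row] for row in data]
    a_means = [mean(row) for row in cell_means]
    b_means = [mean(cell_means[i][j] for i in range(a)) for j in range(b)]

    ss_total = sum((y - grand) ** 2 for y in observations)
    ss_cells = n * sum((m - grand) ** 2 for row in cell_means for m in row)
    ss = {
        "A": b * n * sum((m - grand) ** 2 for m in a_means),
        "B": a * n * sum((m - grand) ** 2 for m in b_means),
    }
    ss["AB"] = ss_cells - ss["A"] - ss["B"]      # interaction
    ss["Error"] = ss_total - ss_cells            # within-cell (sampling) error
    df = {"A": a - 1, "B": b - 1, "AB": (a - 1) * (b - 1),
          "Error": a * b * (n - 1)}
    ms = {name: ss[name] / df[name] for name in ss}
    f_ratio = {name: ms[name] / ms["Error"] for name in ("A", "B", "AB")}
    return ss, df, ms, f_ratio


ss, df, ms, f_ratio = two_way_anova(DATA)
for name in ("A", "B", "AB"):
    print(f"{name:5s} df={df[name]:2d} SS={ss[name]:9.4f} "
          f"MS={ms[name]:8.4f} F={f_ratio[name]:7.3f}")
print(f"Error df={df['Error']:2d} SS={ss['Error']:9.4f} MS={ms['Error']:8.4f}")
```

The error mean square produced here (about 3.2155) is the value that is later entered by hand when the LSD is requested for the interaction plot.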

Even though the AB interaction (time by dosage) is not significant, a plot of the two-way means is included to show the results of the INTERACTION PLOT routine. A reference LSD value is shown on the interaction plot.






* DATA MANIPULATION *

Enter DATA TYPE (Press CONTINUE for RAW DATA):
1                                                   Raw data
Mode number = ?
                                                    On mass storage
Is data stored on program's scratch file (DATA)?
NO
Data file name = ?
DEPOFRATS:INTERNAL
Was data stored by the BS&DM system ?
YES
Is data medium placed in device INTERNAL ?
YES
Is program medium placed in correct device ?
YES



FOOD DEPRIVATION OF RATS

Data file name: DEPOFRATS:INTERNAL
Data type is:   Raw data

Number of observations:   12
Number of variables:       2

Variable names:
   1. OBS 1 WT
   2. OBS 2 WT

Subfiles: NONE



SELECT ANY KEY                                      Select special function key labeled-LIST
Option number = ?
1                                                   List all data
Enter method for listing data:
3



FOOD DEPRIVATION OF RATS

Data type is: Raw data

            Variable # 1     Variable # 2
            (OBS 1 WT )      (OBS 2 WT )
OBS#
   1           9.07000          8.77000
   2           5.63000          8.76000
   3           4.42000          3.01000
   4           1.38000          3.96000
   5           9.16000         11.82000
   6          11.57000         11.53000
   7           5.22000          9.21000
   8           5.72000          4.69000
   9          16.08000         14.65000
  10          10.37000         14.46000
  11           7.27000          6.10000
  12           5.48000          9.28000



Option number = ?
                                                    Exit list procedure
SELECT ANY KEY                                      Select special function key labeled-ADV STAT
                                                    Remove BSDM media
                                                    Insert AOV2
Enter number of desired funtion:
1                                                   Select factorial design
Number of factors in design ? (2, 3, or 4)
2
Number of levels of factor A ?
3                                                   1, 5, and 9 hours
Number of levels of factor B ?
4                                                   .1, .3, .5, and .7 mg/kg
Number of blocks in this design ?
1                                                   Only 1 major replication
No. obs per trt combination in each block (sample) ?
2                                                   2 rats per experimental combination
Is the above information correct ?
YES
Do YOU want to assign names to the factors ?
YES
Enter the name for factor A (<11 characters) ?
TIME
Enter the name for factor B (<11 characters) ?
DOSAGE
Data entry option ?
2
Variable # for minor replication (sample) 1 ?
1
Variable # for minor replication (sample) 2 ?
2
No. of decimals for printing calc. values(<=7).
4

********************************************************************************
* FACTORIAL ANALYSIS OF VARIANCE *

FOOD DEPRIVATION OF RATS

Minor replications are stored in different variables

DESIGN

Number of factors = 2
No. of levels of factor A = 3
No. of levels of factor B = 4
No. of major replications (blocks) = 1
No. of minor replications (samples) = 2

Subfiles will be ignored
Response variable(s) are :
   Variable no. 1  OBS 1 WT
   Variable no. 2  OBS 2 WT

MEANS

* Overall mean = 8.2338

* Main Effect Means :

Factor A - TIME     Levels ( 1 - 3 ) :
     5.6250     8.6150    10.4613

Factor B - DOSAGE   Levels ( 1 - 4 ) :
    11.5917    10.3867     5.8717     5.0850

* Two Way Interaction Means :

Factor A - TIME down and Factor B - DOSAGE across
               1          2          3          4
   1        8.9200     7.1950     3.7150     2.6700
   2       10.4900    11.5500     7.2150     5.2050
   3       15.3650    12.4150     6.6850     7.3800

ANOVA TABLE

                       Factorial Analysis of Variance
Source (Name)        df    Sums of Squares    Mean Square    F Ratio    F-Prob

Total                23        339.9634         14.7810
A  TIME               2         95.3015         47.6507      14.819      .0006
B  DOSAGE             3        188.4283         62.8094      19.533      .0001
AB                    6         17.6478          2.9413        .915      .5168
Sampling Error       12         38.5858          3.2155

NOTE: F tests assume that all factors are fixed

From the AOV table it can be seen that the effects of Factor A and of Factor B are significant, but interaction between Factor A and Factor B is not significant.

Should tests for homogeneity of variance be made?
YES

 FACTOR LEVELS                     CELL STATISTICS
Blk    A    B         Mean      Std Dev     Variance    Coef Var %
  1    1    1        8.9200       .2121        .0450        2.38
  1    1    2        7.1950      2.2132       4.8984       30.76
  1    1    3        3.7150       .9970        .9941       26.84
  1    1    4        2.6700      1.8243       3.3282       68.33
  1    2    1       10.4900      1.8809       3.5378       17.93
  1    2    2       11.5500       .0283        .0008         .24
  1    2    3        7.2150      2.8214       7.9601       39.10
  1    2    4        5.2050       .7283        .5305       13.99
  1    3    1       15.3650      1.0112       1.0224        6.58
  1    3    2       12.4150      2.8921       8.3640       23.29
  1    3    3        6.6850       .8273        .6844       12.38
  1    3    4        7.3800      2.6870       7.2200       36.41

Bartlett's test :
   Chi squared = 11.0311 with 11 degrees of freedom
   Prob( Chi squared > 11.0311) = .4410
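
The Bartlett statistic in this printout can be reproduced from the twelve cells of two observations each. The sketch below assumes the usual corrected form of Bartlett's statistic (the pooled-variance log ratio divided by the correction factor C); the manual does not state the formula, so this is an assumption that happens to agree with the printed value. The names are hypothetical.

```python
import math

# The 12 treatment cells (time x dosage), two rats per cell.
CELLS = [
    (9.07, 8.77), (5.63, 8.76), (4.42, 3.01), (1.38, 3.96),
    (9.16, 11.82), (11.57, 11.53), (5.22, 9.21), (5.72, 4.69),
    (16.08, 14.65), (10.37, 14.46), (7.27, 6.10), (5.48, 9.28),
]


def sample_variance(values):
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


def bartlett_statistic(groups):
    """Corrected Bartlett chi-squared statistic and its degrees of freedom."""
    k = len(groups)
    dfs = [len(g) - 1 for g in groups]
    variances = [sample_variance(g) for g in groups]
    df_pooled = sum(dfs)
    pooled = sum(d * v for d, v in zip(dfs, variances)) / df_pooled
    m_stat = df_pooled * math.log(pooled) - sum(
        d * math.log(v) for d, v in zip(dfs, variances))
    correction = 1.0 + (sum(1.0 / d for d in dfs) - 1.0 / df_pooled) / (3.0 * (k - 1))
    return m_stat / correction, k - 1


chi_squared, chi_df = bartlett_statistic(CELLS)
print(f"Chi squared = {chi_squared:.4f} with {chi_df} degrees of freedom")
```

With only two observations per cell each variance has a single degree of freedom, so the test has little power here; the non-significant probability in the printout should be read with that in mind.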

Specify a new variable for this design ?
NO
Enter desired number:
4                                                   Request interaction plot



INTERACTION PLOT
********************************************************************************

Is this correct ?
YES                                                 Confirm design on CRT
Plot which factor on the X axis : A,B ?
B
Enter 4 levels of factor B(separate by commas) : ?
.1,.3,.5,.7
Name of the response ? (<11 characters)
WEIGHT
Enter Y minimum value. (Less than 2.67 ) ?

Enter Y maximum value. (Greater than 15.365 ) ?
16
Enter Y tic
1
# of decimal places for labelling Y axis(<= 6 )= ?
2
Should length of the LSD and/or HSD be plotted ?
YES
Error Mean Square to calculate the LSD and/or HSD.
3.21548                                             From AOV table
Error Mean Square to be used is 3.21548
t value for the LSD, or not to plot the LSD.
2.179                                               t-tabled value
Q value for the HSD, or not to plot the HSD.

t = 2.179     LSD = 3.90733040255
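
The LSD value printed above is consistent with the standard formula LSD = t * sqrt(2 * MSE / n), taking n = 2 observations per two-way mean. That choice of n is an inference from the printed number, not something the printout states. A minimal Python sketch, with a hypothetical function name:

```python
import math


def least_significant_difference(t_value, error_mean_square, n_per_mean):
    """LSD = t * sqrt(2 * MSE / n) for means based on n observations each."""
    return t_value * math.sqrt(2.0 * error_mean_square / n_per_mean)


# t = 2.179 is the tabled Student's t for 12 error degrees of freedom at
# alpha = .05; MSE = 3.21548 comes from the AOV table; each two-way mean
# is assumed to be based on 2 rats.
lsd = least_significant_difference(2.179, 3.21548, 2)
print(f"LSD = {lsd:.5f}")
```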



Plot on CRT ?
NO
Plotter identifier string (press CONT if 'HPGL')?
Enter the select code, bus # (defaults are 7,5)?
Which PEN color should be used?
1



Beep will sound when plot done, then press CONT .
To interrupt plotting press 'STOP' key
Press CONTINUE when the plotter is ready.

[Plot: FOOD DEPRIVATION OF RATS. AB Interaction; X axis: Factor B DOSAGE.]

Are there any more plots to be made ?
NO
Enter number of desired funtion:
9                                                   Return to BSDM



Nested or Partially Nested Design 

Example 

In order to compare two methods of display, a group of six new Thanksgiving greeting cards was selected. Eight stores were selected for the "promotional" display method and another eight stores were used for the "integrated" display method. For each of the two methods and each of the eight stores per method, the same six card styles were compared using a response (Y) which measured dollar sales adjusted for store size. The data for each type of display, store, and greeting card style are shown below:

Display Method 1 - "Promotional" (A)

                                        Stores (C)
Card Style (B)       1       2       3       4       5       6       7       8

      1            $1.21    1.49    1.76    1.52    0.65    1.96    1.21    1.57
      2             1.72    2.09    2.21    2.36    2.83    3.99    2.01    2.62
      3             1.72    1.44    1.84    0.91    1.30    7.61    2.01    3.27
      4             0.29    0.92    0.37    0.72    0.43    3.99    2.35    4.71
      5             1.44    2.09    1.84    2.36    1.96    3.26    2.01    1.70
      6             4.43    3.66    0.51    1.78    2.13    5.58    1.41    2.75

Display Method 2 - "Integrated" (A)

                                        Stores (C)
Card Style (B)       9      10      11      12      13      14      15      16

      1            $2.60    2.21    1.44    1.20    1.21    3.03    2.79    1.18
      2             1.67    1.16    1.73    1.92    4.84    2.88    4.10    1.48
      3             3.67    0.78    1.46    1.65    3.23    1.92    4.51    1.48
      4             1.33    0.39    1.33    1.37    2.02    1.68    4.51    2.34
      5             3.33    1.16    1.86    1.92    3.23    2.64    3.96    2.22
      6             4.67    1.90    2.61    3.27    2.26    2.36    2.30    1.55



The mixed nested AOV for this model, with factor A (display), factor C (stores) nested in factor A, and factor B (card style) crossed with A and C, is shown below. The proper MS for testing differences between the two methods of display is C(A). Notice that the F ratio, .42135/4.85529, is less than one, indicating no significant difference between the methods as well as a considerable amount of store-to-store variation in the adjusted sales value. There do, however, appear to be significant differences between the population means for card types, i.e. F = 2.57257/.92726 = 2.77, which is significant at the .024 level.
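
The two F ratios discussed above can be reproduced from the data tables. The following Python sketch computes the sums of squares for this partially nested layout (stores nested in display, card style crossed with both). It is an illustration under the model described in the text, not the library's code, and all names are hypothetical.

```python
# SALES[i][j][k]: display method i, card style j, store k within that method.
SALES = [
    [  # Display method 1 ("promotional"), stores 1-8
        [1.21, 1.49, 1.76, 1.52, 0.65, 1.96, 1.21, 1.57],
        [1.72, 2.09, 2.21, 2.36, 2.83, 3.99, 2.01, 2.62],
        [1.72, 1.44, 1.84, 0.91, 1.30, 7.61, 2.01, 3.27],
        [0.29, 0.92, 0.37, 0.72, 0.43, 3.99, 2.35, 4.71],
        [1.44, 2.09, 1.84, 2.36, 1.96, 3.26, 2.01, 1.70],
        [4.43, 3.66, 0.51, 1.78, 2.13, 5.58, 1.41, 2.75],
    ],
    [  # Display method 2 ("integrated"), stores 9-16
        [2.60, 2.21, 1.44, 1.20, 1.21, 3.03, 2.79, 1.18],
        [1.67, 1.16, 1.73, 1.92, 4.84, 2.88, 4.10, 1.48],
        [3.67, 0.78, 1.46, 1.65, 3.23, 1.92, 4.51, 1.48],
        [1.33, 0.39, 1.33, 1.37, 2.02, 1.68, 4.51, 2.34],
        [3.33, 1.16, 1.86, 1.92, 3.23, 2.64, 3.96, 2.22],
        [4.67, 1.90, 2.61, 3.27, 2.26, 2.36, 2.30, 1.55],
    ],
]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def nested_anova(y):
    """Sums of squares for A, C(A), B, AB and CB(A) with one obs per cell."""
    a, b, c = len(y), len(y[0]), len(y[0][0])
    idx = [(i, j, k) for i in range(a) for j in range(b) for k in range(c)]
    grand = mean(y[i][j][k] for i, j, k in idx)
    a_mean = [mean(y[i][j][k] for j in range(b) for k in range(c)) for i in range(a)]
    b_mean = [mean(y[i][j][k] for i in range(a) for k in range(c)) for j in range(b)]
    ab_mean = [[mean(y[i][j]) for j in range(b)] for i in range(a)]
    store_mean = [[mean(y[i][j][k] for j in range(b)) for k in range(c)]
                  for i in range(a)]
    ss = {"Total": sum((y[i][j][k] - grand) ** 2 for i, j, k in idx)}
    ss["A"] = b * c * sum((m - grand) ** 2 for m in a_mean)
    ss["C(A)"] = b * sum((store_mean[i][k] - a_mean[i]) ** 2
                         for i in range(a) for k in range(c))
    ss["B"] = a * c * sum((m - grand) ** 2 for m in b_mean)
    ss["AB"] = c * sum((ab_mean[i][j] - a_mean[i] - b_mean[j] + grand) ** 2
                       for i in range(a) for j in range(b))
    ss["CB(A)"] = ss["Total"] - ss["A"] - ss["C(A)"] - ss["B"] - ss["AB"]
    df = {"Total": a * b * c - 1, "A": a - 1, "C(A)": a * (c - 1), "B": b - 1,
          "AB": (a - 1) * (b - 1), "CB(A)": a * (b - 1) * (c - 1)}
    ms = {name: ss[name] / df[name] for name in ss}
    return ss, df, ms


ss, df, ms = nested_anova(SALES)
f_display = ms["A"] / ms["C(A)"]   # display tested against stores within display
f_style = ms["B"] / ms["CB(A)"]    # card style tested against CB(A)
print(f"F(display) = {f_display:.3f}, F(card style) = {f_style:.2f}")
```

The choice of denominators follows the text: display is tested against C(A) and card style against CB(A). Other effects would need their own error terms, which is why the program's nested AOV table leaves the F ratios to the user.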



A fairly standard procedure for the response variable Y considered here is to transform this response by Y* = ln(Y + 1) in order to achieve a more homogeneous and consistent response. The next analysis of variance is performed on this new response. The net result is that the F ratio for differences in card type means is even more highly significant (3.93 versus 2.77).
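
The session that follows builds Y* = ln(Y + 1) in two steps, first a*(X^b)+c and then a*ln(bX)+c. A short Python sketch of that composition is given here; the function names are hypothetical, and the value c = 0 in the second step is an assumption (the printed value is not legible), chosen because it reproduces the listed results.

```python
import math


def power_transform(x, a, b, c):
    """First step in the session: a*(X^b)+c."""
    return a * x ** b + c


def log_transform(x, a, b, c):
    """Second step in the session: a*ln(bX)+c."""
    return a * math.log(b * x) + c


# First five adjusted sales values (card style 1, stores 1-5, display method 1).
sales = [1.21, 1.49, 1.76, 1.52, 0.65]

shifted = [power_transform(y, 1, 1, 1) for y in sales]          # Y + 1
transformed = [log_transform(x, 1, 1, 0) for x in shifted]      # ln(Y + 1)

for y, y_star in zip(sales, transformed):
    print(f"{y:8.5f}  {y_star:8.5f}")
```

Adding 1 before taking logarithms keeps the transform defined for responses near zero, and it compresses the few very large sales values that dominate the untransformed error term.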






An LSD multiple comparison procedure was done on the six card styles. The results of this 
comparison show significant differences between style four and all others except style one 
with certain other differences existing as well. However, if one were looking for the highest 
adjusted daily sales, one should probably choose one of styles five, two, or six since they were 
not significantly different from one another but were different from the other styles (although 
three is questionably different). 

********************************************************************************
* DATA MANIPULATION *
********************************************************************************

Enter DATA TYPE (Press CONTINUE for RAW DATA):
1                                                   Raw data
Mode number = ?
2                                                   From mass storage
Is data stored on program's scratch file (DATA)?
NO
Data file name = ?
GRETINGCDS:INTERNAL
Was data stored by the BS&DM system ?
YES
Is data medium placed in device INTERNAL ?
YES
Is program medium placed in correct device ?
YES

THANKSGIVING GREETING CARD EVALUATION

Data file name: GRETINGCDS:INTERNAL
Data type is: Raw data

Number of observations:   96
Number of variables:       1

Variable names:
   1. DESIGN

Subfiles: NONE

SELECT ANY KEY                                      Select special function key labeled-LIST
Option number = ?
1                                                   List all the data



THANKSGIVING GREETING CARD EVALUATION
Data type is: Raw data

                           VARIABLE # 1 (DESIGN)
  I      OBS(I)    OBS(I+1)    OBS(I+2)    OBS(I+3)    OBS(I+4)
  1     1.21000     1.49000     1.76000     1.52000      .65000
  6     1.96000     1.21000     1.57000     1.72000     2.09000
 11     2.21000     2.36000     2.83000     3.99000     2.01000
 16     2.62000     1.72000     1.44000     1.84000      .91000
 21     1.30000     7.61000     2.01000     3.27000      .29000
 26      .92000      .37000      .72000      .43000     3.99000
 31     2.35000     4.71000     1.44000     2.09000     1.84000
 36     2.36000     1.96000     3.26000     2.01000     1.70000
 41     4.43000     3.66000      .51000     1.78000     2.13000
 46     5.58000     1.41000     2.75000     2.60000     2.21000
 51     1.44000     1.20000     1.21000     3.03000     2.79000
 56     1.18000     1.67000     1.16000     1.73000     1.92000
 61     4.84000     2.88000     4.10000     1.48000     3.67000
 66      .78000     1.46000     1.65000     3.23000     1.92000
 71     4.51000     1.48000     1.33000      .39000     1.33000
 76     1.37000     2.02000     1.68000     4.51000     2.34000
 81     3.33000     1.16000     1.86000     1.92000     3.23000
 86     2.64000     3.96000     2.22000     4.67000     1.90000
 91     2.61000     3.27000     2.26000     2.36000     2.30000
 96     1.55000

Option number = ?
                                                    Exit list procedure
SELECT ANY KEY                                      Select special function key labeled-ADV STAT
                                                    Remove BSDM media
                                                    Insert AOV2 media
Enter number of desired funtion:
2                                                   Choose nested design
Number of factors in design ? (2, 3, or 4)
3
Number of levels of factor A ?
2
Number of levels of factor B ?
6
Number of levels of factor C ?
8
Number of samples ?
1
Is the above information correct ?
YES
Which design (by number) is to be used ?
3                                                   Shown on CRT, specify design type.
Which factor is P: A,B,C ?
A
Which factor is Q: B,C ?
C
Do YOU want to assign names to the factors ?
YES
Enter the name for factor A (<11 characters) ?
DISPLAY
Enter the name for factor B (<11 characters) ?
CARD STYLE
Enter the name for factor C (<11 characters) ?
STORES
No. of decimal places to print calculated values.
4






NESTED ANALYSIS OF VARIANCE
THANKSGIVING GREETING CARD EVALUATION

DESIGN

Number of factors = 3
No. of levels of factor A = 2
No. of levels of factor B = 6
No. of levels of factor C = 8
No. of minor replications (samples) = 1

Response variable(s) are :
   Variable no. 1  DESIGN

MEANS

* Overall mean = 2.2327

* Main Effect Means :

Factor A - DISPLAY      Levels ( 1 - 2 ) :
    2.1665    2.2990
Factor B - CARD STYLE   Levels ( 1 - 6 ) :
    1.6894    2.4756    2.4250    1.7969    2.3112    2.6981
Factor C - STORES       Levels ( 1 - 8 ) :
    2.3400    1.6075    1.5800    1.7483    2.1742    3.4083    2.7642    2.2392

* Two Way Interaction Means :

Factor A - DISPLAY down and Factor B - CARD STYLE across
            1         2         3         4         5         6
   1     1.4213    2.4788    2.5125    1.7225    2.0825    2.7812
   2     1.9575    2.4725    2.3375    1.8712    2.5400    2.6150

Factor A - DISPLAY down and Factor C - STORES across
            1         2         3         4         5         6         7         8
   1     1.8017    1.9483    1.4217    1.6083    1.5500    4.3983    1.8333    2.7700
   2     2.8783    1.2667    1.7383    1.8883    2.7983    2.4183    3.6950    1.7083

Factor B - CARD STYLE down and Factor C - STORES across
            1         2         3         4         5         6         7         8
   1     1.9050    1.8500    1.6000    1.3600     .9300    2.4950    2.0000    1.3750
   2     1.6950    1.6250    1.9700    2.1400    3.8350    3.4350    3.0550    2.0500
   3     2.6950    1.1100    1.6500    1.2800    2.2650    4.7650    3.2600    2.3750
   4      .8100     .6550     .8500    1.0450    1.2250    2.8350    3.4300    3.5250
   5     2.3850    1.6250    1.8500    2.1400    2.5950    2.9500    2.9850    1.9600
   6     4.5500    2.7800    1.5600    2.5250    2.1950    3.9700    1.8550    2.1500

Should the 3-way means be printed ?
NO



ANOVA TABLE

                  Nested Analysis of Variance
Source (Name)        df    Sums of Squares    Mean Square

Total                95       148.0541           1.5585
A  DISPLAY            1          .4213            .4213
C(A)                 14        67.9740           4.8553
B  CARD STYLE         5        12.8628           2.5726      F = 2.77 significant at α = .02.
AB                    5         1.8879            .3776
CB(A)                70        64.9080            .9273


There is a significant difference between the population means for card types but not between the types of displays.

Enter desired number :
7                                                   Exit nested design
Enter number of desired funtion:
4                                                   Return to BSDM
SELECT ANY KEY
SELECT ANY KEY                                      Select Transform key
Select option desired :
1                                                   Algebraic transformation
Transformation number = ?
1
Variable number corresponding to X = ?
1
Parameter a = ?
1
Parameter b = ?
1
Parameter c = ?
1
Store transformed data in Variable # ( <= 2 ) ?
2
Variable name (<= 10 characters) = ?
LN(Y+1)
Is above information correct?
YES
Press 'CONTINUE' when ready.

The following transformation was performed: a*(X^b)+c
   where a = 1
         b = 1
         c = 1
   X is Variable # 1
Transformed data is stored in Variable # 2 (LN(Y+1))



Select option desired :
1                                                   Another algebraic transformation
Transformation number = ?
3
Variable number corresponding to X = ?
2
Parameter a = ?
1
Parameter b = ?
1
Parameter c = ?
0
Store transformed data in Variable # ( <= 3 ) ?
2
Is above information correct?
YES
Press 'CONTINUE' when ready.

The following transformation was performed: a*ln(bX)+c
   where a = 1
         b = 1
         c = 0
   X is Variable # 2
Transformed data is stored in Variable # 2 (LN(Y+1)).

Select option desired :
                                                    Exit transformation routine
PROGRAM NOW UPDATING SCRATCH DATA FILE
SELECT ANY KEY                                      Select LIST key
Option number = ?
1
Enter method for listing data:
3

THANKSGIVING GREETING CARD EVALUATION
Data type is: Raw data

            Variable # 1     Variable # 2
            (DESIGN    )     (LN(Y+1)   )
OBS#
   1           1.21000           .79299
   2           1.49000           .91228
   3           1.76000          1.01523
   4           1.52000           .92426
   5            .65000           .50078
   6           1.96000          1.08519
   7           1.21000           .79299
   8           1.57000           .94391
   9           1.72000          1.00063
  10           2.09000          1.12817
  11           2.21000          1.16627
  12           2.36000          1.21194
  13           2.83000          1.34286
  14           3.99000          1.60744
  15           2.01000          1.10194
  16           2.62000          1.28647
  17           1.72000          1.00063
  18           1.44000           .89200
  19           1.84000          1.04380
  20            .91000           .64710
  21           1.30000           .83291
  22           7.61000          2.15292
  23           2.01000          1.10194
  24           3.27000          1.45161
  25            .29000           .25464
  26            .92000           .65233
  27            .37000           .31481
  28            .72000           .54232
  29            .43000           .35767
  30           3.99000          1.60744
  31           2.35000          1.20896
  32           4.71000          1.74222
  33           1.44000           .89200
  34           2.09000          1.12817
  35           1.84000          1.04380
  36           2.36000          1.21194
  37           1.96000          1.08519
  38           3.26000          1.44927
  39           2.01000          1.10194
  40           1.70000           .99325
  41           4.43000          1.69194
  42           3.66000          1.53902
  43            .51000           .41211
  44           1.78000          1.02245
  45           2.13000          1.14103
  46           5.58000          1.88403
  47           1.41000           .87963
  48           2.75000          1.32176
  49           2.60000          1.28093
  50           2.21000          1.16627
  51           1.44000           .89200
  52           1.20000           .78846
  53           1.21000           .79299
  54           3.03000          1.39377
  55           2.79000          1.33237
  56           1.18000           .77932
  57           1.67000           .98208
  58           1.16000           .77011
  59           1.73000          1.00430
  60           1.92000          1.07158
  61           4.84000          1.76473
  62           2.88000          1.35584
  63           4.10000          1.62924
  64           1.48000           .90826
  65           3.67000          1.54116
  66            .78000           .57661
  67           1.46000           .90016
  68           1.65000           .97456
  69           3.23000          1.44220
  70           1.92000          1.07158
  71           4.51000          1.70656
  72           1.48000           .90826
  73           1.33000           .84587
  74            .39000           .32930
  75           1.33000           .84587
  76           1.37000           .86289
  77           2.02000          1.10526
  78           1.68000           .98582
  79           4.51000          1.70656
  80           2.34000          1.20597
  81           3.33000          1.46557
  82           1.16000           .77011
  83           1.86000          1.05082
  84           1.92000          1.07158
  85           3.23000          1.44220
  86           2.64000          1.29198
  87           3.96000          1.60141
  88           2.22000          1.16938
  89           4.67000          1.73519
  90           1.90000          1.06471
  91           2.61000          1.28371
  92           3.27000          1.45161
  93           2.26000          1.18173
  94           2.36000          1.21194
  95           2.30000          1.19392
  96           1.55000           .93609



Option number = ?
                                                    Exit list procedure
SELECT ANY KEY                                      Return to AOV2
Enter number of desired funtion:
2                                                   Select nested design
Number of factors in design ? (2, 3, or 4)
3
Number of levels of factor A ?
2
Number of levels of factor B ?
6
Number of levels of factor C ?
8
Number of samples ?
1
Is the above information correct ?
YES
Which design (by number) is to be used ?
3
Which factor is P: A,B,C ?
A
Which factor is Q: B,C ?
C
Do YOU want to assign names to the factors ?
YES
Enter the name for factor A (<11 characters) ?
DISPLAY
Enter the name for factor B (<11 characters) ?
CARD STYLE
Enter the name for factor C (<11 characters) ?
STORES
Which variable number contains the response ?
2
No. of decimal places to print calculated values.
4



NESTED ANALYSIS OF VARIANCE
THANKSGIVING GREETING CARD EVALUATION

DESIGN

Number of factors = 3
No. of levels of factor A = 2
No. of levels of factor B = 6
No. of levels of factor C = 8
No. of minor replications (samples) = 1

Response variable(s) are :
   Variable no. 2  LN(Y+1)

MEANS

* Overall mean = 1.1068

* Main Effect Means :

Factor A - DISPLAY      Levels ( 1 - 2 ) :
    1.0711    1.1426
Factor B - CARD STYLE   Levels ( 1 - 6 ) :
     .9621    1.2082    1.1403     .9105    1.1730    1.2469
Factor C - STORES       Levels ( 1 - 8 ) :
    1.1236     .9108     .9144     .9817    1.0825    1.4248    1.2798    1.1372

* Two Way Interaction Means :

Factor A - DISPLAY down and Factor B - CARD STYLE across
            1         2         3         4         5         6
   1      .8710    1.2307    1.1404     .8350    1.1132    1.2365
   2     1.0533    1.1858    1.1401     .9859    1.2329    1.2574

Factor A - DISPLAY down and Factor C - STORES across
            1         2         3         4         5         6         7         8
   1      .9388    1.0420     .8327     .9267     .8767    1.6310    1.0312    1.2899
   2     1.3085     .7795     .9961    1.0368    1.2882    1.2185    1.5283     .9845

Factor B - CARD STYLE down and Factor C - STORES across
            1         2         3         4         5         6         7         8
   1     1.0370    1.0393     .9536     .8564     .6469    1.2395    1.0627     .8616
   2      .9914     .9491    1.0853    1.1418    1.5538    1.4816    1.3656    1.0974
   3     1.2709     .7343     .9720     .8108    1.1376    1.6123    1.4043    1.1799
   4      .5503     .4908     .5803     .7026     .7315    1.2966    1.4578    1.4741
   5     1.1788     .9491    1.0473    1.1418    1.2637    1.3706    1.3517    1.0813
   6     1.7136    1.3019     .8479    1.2370    1.1614    1.5480    1.0368    1.1289

Should the 3-way means be printed ?
YES

* Three Way Interaction Means :

Factor A - DISPLAY, Level 1
Factor B - CARD STYLE down and Factor C - STORES across
            1         2         3         4         5         6         7         8
   1      .7930     .9123    1.0152     .9243     .5008    1.0852     .7930     .9439
   2     1.0006    1.1282    1.1663    1.2119    1.3429    1.6074    1.1019    1.2865
   3     1.0006     .8920    1.0438     .6471     .8329    2.1529    1.1019    1.4516
   4      .2546     .6523     .3148     .5423     .3577    1.6074    1.2090    1.7422
   5      .8920    1.1282    1.0438    1.2119    1.0852    1.4493    1.1019     .9933
   6     1.6919    1.5390     .4121    1.0225    1.1410    1.8840     .8796    1.3218

Factor A - DISPLAY, Level 2
Factor B - CARD STYLE down and Factor C - STORES across
            1         2         3         4         5         6         7         8
   1     1.2809    1.1663     .8920     .7885     .7930    1.3938    1.3324     .7793
   2      .9821     .7701    1.0043    1.0716    1.7647    1.3558    1.6292     .9083
   3     1.5412     .5766     .9002     .9746    1.4422    1.0716    1.7066     .9083
   4      .8459     .3293     .8459     .8629    1.1053     .9858    1.7066    1.2060
   5     1.4656     .7701    1.0508    1.0716    1.4422    1.2920    1.6014    1.1694
   6     1.7352    1.0647    1.2837    1.4516    1.1817    1.2119    1.1939     .9361



ANOVA TABLE

Note: The AOV table below does not show F ratios because the appropriate error mean square depends on the design.

                  Nested Analysis of Variance
Source (Name)        df    Sums of Squares    Mean Square

Total                95        12.5531            .1321
A  DISPLAY            1          .1225            .1225
C(A)                 14         5.3373            .3812
B  CARD STYLE         5         1.5185            .3037      F = 3.93
AB                    5          .1687            .0337
CB(A)                70         5.4062            .0772

This table shows the differences among card styles are even more significant.



Specify a new variable for this design ?
NO
Enter desired number:
3                                                   Multiple comparisons
Is the design displayed on the CRT the latest one?
YES

Multiple Comparisons
********************************************************************************

Enter 1 or 2 to specify type of means
1                                                   Least significant difference
Which Factor/Main Effect(A,B, or C) should be used?
B
Error Mean Square, associated Degrees of Freedom
.07723,70
Which procedure would you like to use ?
1
What level of Alpha are you going to use ?
.05
Enter table value from Student's t with d.f.= 70 ?
1.99
Is a plot of LSD desired ?
YES
Plot on CRT ?
NO
Plotter identifier string (press CONT if 'HPGL')?
Enter select code, bus # (defaults are 7,5)?
Which PEN color should be used?
1
Enter name for labelling Y axis (< 11 characters)
LN(Adj.*)
Beep will sound when plot done, then press CONT
To interrupt plotting press 'STOP' key.



[Plot: MULTIPLE COMPARISON PLOT : LSD, THANKSGIVING GREETING CARD EVALUATION. Y axis scaled from 0.0 to 2.0 in steps of .2; X axis: CARD STYLE LEVEL NUMBER.]

Least Significant Difference

Error mean square             =  .07723
Degrees of freedom            =  70
Harmonic average sample size  =  16.0000
Alpha level                   =  .05
Table value from Student's t  =  1.99
LSD value                     =  .1955

Multiple Comparisons on Factor CARD STYLE

Level       Mean      Sample Size    Separation
  4         .9105         16            a
  1         .9621         16            ab
  3        1.1403         16            bc
  5        1.1730         16            c
  2        1.2082         16            c
  6        1.2469         16            c

Note: Where the levels do not contain the same letters, the factor levels are significantly different using the LSD procedure.
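
The LSD value and the separation letters in this table can be reproduced with a short Python sketch. The grouping rule used here (sort the means, then give a common letter to each maximal run of means that lie within one LSD of the run's smallest mean) is an assumption about how the letters are formed; it matches this printout but is not taken from the program itself. Function names are hypothetical.

```python
import math


def lsd_value(t_value, error_mean_square, n_per_mean):
    """LSD = t * sqrt(2 * MSE / n)."""
    return t_value * math.sqrt(2.0 * error_mean_square / n_per_mean)


def separation_letters(means, lsd):
    """Return levels in increasing order of mean and their separation letters."""
    order = sorted(means, key=means.get)
    groups, last_end = [], -1
    for start in range(len(order)):
        end = start
        while (end + 1 < len(order)
               and means[order[end + 1]] - means[order[start]] < lsd):
            end += 1
        if end > last_end:          # skip runs contained in an earlier run
            groups.append((start, end))
            last_end = end
    letters = {level: "" for level in order}
    for g, (start, end) in enumerate(groups):
        for level in order[start:end + 1]:
            letters[level] += "abcdefghijklmnopqrstuvwxyz"[g]
    return order, letters


# Card style means of LN(Y+1) from the printout, 16 observations per mean.
CARD_STYLE_MEANS = {1: 0.9621, 2: 1.2082, 3: 1.1403,
                    4: 0.9105, 5: 1.1730, 6: 1.2469}

lsd = lsd_value(1.99, 0.07723, 16)
order, letters = separation_letters(CARD_STYLE_MEANS, lsd)
for level in order:
    print(f"{level:3d}  {CARD_STYLE_MEANS[level]:7.4f}  {letters[level]}")
```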






Another Separation Procedure on Factor 2 ?
NO
Another Factor to be used ?
NO
Multiple Comparison Procedures on Two-Way Means ?
NO
Enter number of desired funtion:
9                                                   Return to BSDM



Split Plot Example 

Example 

Hicks (1973, ex. 13.1) describes a split-plot experiment in which four oven temperatures and three baking times were investigated with regard to the life, Y, of an electrical component. The oven temperatures and the replications (blocks) are in the whole plot while the baking times are in the subplots. Only one electrical component was subjected to the stress conditions within each block-baking time-temperature combination.

The data table is shown below: 



                  Baking               Oven Temp. (B)
Replication      Time (A)       580     600     620     640

     1               5          217     158     229     223
                    10          233     138     186     227
                    15          175     152     155     156

     2               5          188     126     160     201
                    10          201     130     170     181
                    15          195     147     161     172

     3               5          162     122     167     182
                    10          170     185     181     201
                    15          213     180     182     199



Since this is a balanced design with three replications, we need only use one variable for data entry. The data is entered across each row in the table above. Hence, three groups of replications are available with factor A as baking time and factor B as oven temperature.

Within the split-plot program, we answer that there are two factors and three major replications. The design is specified with factor B in the whole plot and factor A in the subplot. The F ratios show that only the temperature effect (B) is significant. The HSD multiple comparison procedure suggests that oven temperature level two gives significantly lower life time readings than do the other three temperatures.
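
The split-plot sums of squares for this example can be sketched in Python as follows. The whole-plot error, Error (a), is taken as the block by oven temperature interaction and the subplot error, Error (b), as the remainder, which is the usual split-plot partition and agrees with the AOV table printed by the program. The names are hypothetical and the code is an illustration only.

```python
# LIFE[h][i][j]: replication h, baking time i (5, 10, 15), oven temp j (580..640).
LIFE = [
    [[217, 158, 229, 223], [233, 138, 186, 227], [175, 152, 155, 156]],
    [[188, 126, 160, 201], [201, 130, 170, 181], [195, 147, 161, 172]],
    [[162, 122, 167, 182], [170, 185, 181, 201], [213, 180, 182, 199]],
]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def split_plot_anova(y):
    """Factor B in the whole plots, factor A in the split plots, r blocks."""
    r, a, b = len(y), len(y[0]), len(y[0][0])
    idx = [(h, i, j) for h in range(r) for i in range(a) for j in range(b)]
    grand = mean(y[h][i][j] for h, i, j in idx)
    block = [mean(y[h][i][j] for i in range(a) for j in range(b)) for h in range(r)]
    a_mean = [mean(y[h][i][j] for h in range(r) for j in range(b)) for i in range(a)]
    b_mean = [mean(y[h][i][j] for h in range(r) for i in range(a)) for j in range(b)]
    whole = [[mean(y[h][i][j] for i in range(a)) for j in range(b)] for h in range(r)]
    ab = [[mean(y[h][i][j] for h in range(r)) for j in range(b)] for i in range(a)]

    ss = {"Total": sum((y[h][i][j] - grand) ** 2 for h, i, j in idx)}
    ss["Blocks"] = a * b * sum((m - grand) ** 2 for m in block)
    ss["B"] = r * a * sum((m - grand) ** 2 for m in b_mean)
    ss["Error(a)"] = a * sum((whole[h][j] - block[h] - b_mean[j] + grand) ** 2
                             for h in range(r) for j in range(b))
    ss["A"] = r * b * sum((m - grand) ** 2 for m in a_mean)
    ss["BA"] = r * sum((ab[i][j] - a_mean[i] - b_mean[j] + grand) ** 2
                       for i in range(a) for j in range(b))
    ss["Error(b)"] = ss["Total"] - sum(
        ss[name] for name in ("Blocks", "B", "Error(a)", "A", "BA"))
    df = {"Total": r * a * b - 1, "Blocks": r - 1, "B": b - 1,
          "Error(a)": (r - 1) * (b - 1), "A": a - 1,
          "BA": (a - 1) * (b - 1), "Error(b)": b * (r - 1) * (a - 1)}
    ms = {name: ss[name] / df[name] for name in ss}
    f_ratio = {"Blocks": ms["Blocks"] / ms["Error(a)"],
               "B": ms["B"] / ms["Error(a)"],
               "A": ms["A"] / ms["Error(b)"],
               "BA": ms["BA"] / ms["Error(b)"]}
    return ss, df, ms, f_ratio


ss, df, ms, f_ratio = split_plot_anova(LIFE)
for name in ("Blocks", "B", "A", "BA"):
    print(f"{name:7s} SS={ss[name]:11.4f} F={f_ratio[name]:7.3f}")
```

Testing oven temperature against Error (a) rather than Error (b) is what distinguishes the split-plot analysis from an ordinary factorial: the whole-plot factor has only 6 error degrees of freedom behind it.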

This conclusion is supported, as should be expected, by the more 'liberal' LSD procedure 
shown on the next multiple comparison output. 

If one runs this data set through the Factorial Analysis in order to separate the replication interaction terms as suggested by Hicks, one finds a highly questionable interaction between replications and baking time. To do this, you specify factor A as replication, factor B as baking time, and factor C as oven temperature in the FACTORIAL program.

Note that in Hicks the printed AOV table shows the mean square for AB (replication by baking time) is 1755.32, which is substantially larger than any of the other replication interactions.

After looking at the data set, we believe that Hicks may have rearranged the original data, since you would ordinarily not expect the replication interaction terms to differ by that much in a split plot. See if you agree.



********************************************************************************
* DATA MANIPULATION *
********************************************************************************

Enter DATA TYPE (Press CONTINUE for RAW DATA):
1                                                   Raw data
Mode number = ?
2                                                   On mass storage
Is data stored on program's scratch file (DATA)?
NO
Data file name = ?
HICKS:INTERNAL
Was data stored by the BS&DM system ?
YES
Is data medium placed in device INTERNAL ?
YES
Is program medium placed in correct device ?
YES

HICKS SPLIT PLOT ON COMPONENT LIFE TIME

Data file name: HICKS:INTERNAL
Data type is: Raw data

Number of observations:   36
Number of variables:       1

Variable names:
   1. LIFETIME

Subfiles: NONE

SELECT ANY KEY                                      Select special function key labeled-LIST
Option number = ?
1



HICKS SPLIT PLOT ON COMPONENT LIFE TIME
Data type is: Raw data

                             VARIABLE # 1 (LIFETIME)
  I       OBS(I)      OBS(I+1)     OBS(I+2)     OBS(I+3)     OBS(I+4)
  1     217.00000    158.00000    229.00000    223.00000    233.00000
  6     138.00000    186.00000    227.00000    175.00000    152.00000
 11     155.00000    156.00000    188.00000    126.00000    160.00000
 16     201.00000    201.00000    130.00000    170.00000    181.00000
 21     195.00000    147.00000    161.00000    172.00000    162.00000
 26     122.00000    167.00000    182.00000    170.00000    185.00000
 31     181.00000    201.00000    213.00000    180.00000    182.00000
 36     199.00000
Option number = ?

SELECT ANY KEY                                      Select special function key labeled-ADV STAT
                                                    Remove BSDM media
                                                    Insert AOV2 media
Enter number of desired funtion:
3                                                   Split plot designs
Number of factors in design ? (2 or 3)
2
Number of levels of factor A ?
3
Number of levels of factor B ?
4
Number of blocks in this design ?
3
No. obs per trt combination in each block (sample) ?
1
Do YOU want to assign names to the factors ?
YES
Enter the name for factor A (<11 characters) ?
BAKINGTIME
Enter the name for factor B (<11 characters) ?
OVEN TEMP.
Which factor(s) are in the whole plots ?
B
Which factor(s) are in the split plots ?
A
Is the above information correct ?
YES
No. of decimal places to print calculated values.
4

SPLIT PLOT ANALYSIS OF VARIANCE
HICKS SPLIT PLOT ON COMPONENT LIFE TIME

DESIGN

Number of factors = 2
No. of levels of factor A = 3
No. of levels of factor B = 4
No. of major replications (blocks) = 3
No. of minor replications (samples) = 1

Subfiles will be ignored

Whole plot factor(s) are :
   Factor B
Split-plot factor(s) are :
   Factor A
Response variable(s) are :
   Variable no. 1  LIFETIME

MEANS

* Overall mean = 178.4722

* Block and Main Effect Means :

Factor Blocks           Levels ( 1 - 3 ) :
    187.4167    169.3333    178.6667
Factor A - BAKINGTIME   Levels ( 1 - 3 ) :
    177.9167    183.5833    173.9167
Factor B - OVEN TEMP.   Levels ( 1 - 4 ) :
    194.8889    148.6667    176.7778    193.5556

* Two Way Interaction Means :

Factor A - BAKINGTIME down and Factor B - OVEN TEMP. across
               1            2            3            4
   1       189.0000     135.3333     185.3333     202.0000
   2       201.3333     151.0000     179.0000     203.0000
   3       194.3333     159.6667     166.0000     175.6667

ANOVA TABLE

                     Split Plot Analysis of Variance
Source (Name)     df   Sums of Squares   Mean Square   F Ratio   F-Prob
Total             35       29330.9722       838.0278
Blocks             2        1962.7222       981.3611     3.319    .1070
B OVEN TEMP.       3       12494.3056      4164.7685    14.086    .0040
Error (a)          6        1773.9444       295.6574

A BAKINGTIME       2         566.2222       283.1111      .456    .6418
BA                 6        2600.4444       433.4074      .698    .6551
Error (b)         16        9933.3333       620.8333

NOTE: F tests assume that all factors are fixed



Enter desired number: 

1 

Is the design displayed on the CRT the latest one? 

YES 



Only factor B (oven temperature) shows a significant
difference among its level effects.
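
The F ratios in the split-plot table can be checked by hand: the whole-plot sources (Blocks and factor B) are divided by the Error (a) mean square, and the split-plot sources (A and BA) by the Error (b) mean square. The Python sketch below is only an illustration of that arithmetic using the mean squares printed above; the dictionary layout and names are assumptions of this example and are not part of the HP program.

```python
# Mean squares copied from the split-plot ANOVA table.
ms = {
    "Blocks": 981.3611,
    "B": 4164.7685,       # OVEN TEMP. (whole-plot factor)
    "Error(a)": 295.6574,
    "A": 283.1111,        # BAKINGTIME (split-plot factor)
    "BA": 433.4074,
    "Error(b)": 620.8333,
}

# Error term used for each source when all factors are fixed.
error_term = {
    "Blocks": "Error(a)",
    "B": "Error(a)",
    "A": "Error(b)",
    "BA": "Error(b)",
}

f_ratio = {src: ms[src] / ms[err] for src, err in error_term.items()}

for src, value in f_ratio.items():
    print(f"{src:7s} F = {value:7.3f}")
```

The printed ratios should agree with the F Ratio column of the table to three decimals.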

Orthogonal polynomial comparisons 



Orthogonal Polynomial Comparisons 

Orthogonal polynomial comparisons on FACTOR 1 

? 

YES 

Enter the max degree of orthogonal poly 

2 

Value associated with level # 1 of FACTOR 1
?
5
Value associated with level # 2 of FACTOR 1
?
10
Value associated with level # 3 of FACTOR 1
?
15
Is the above information correct ?
YES
Enter Error Mean square, degrees of freedom
620.83,16                   From AOV table



Orthogonal Polynomial Decomposition on BAKINGTIME 

Degree          SS    F-Ratio    F-Prob
   1       96.0000      .1546    .69934
   2      470.2222      .7574    .39701

Level of Treatments : 5 10 15
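
The linear and quadratic sums of squares for BAKINGTIME can be reproduced from the three level means. The sketch below assumes the textbook orthogonal polynomial coefficients for three equally spaced levels (5, 10, 15) and 12 observations behind each mean (4 oven temperatures x 3 blocks); the function name is hypothetical and the code is not taken from the HP routine.

```python
# Level means of BAKINGTIME from the split-plot output above.
means = [177.9167, 183.5833, 173.9167]
n_per_level = 12          # 4 oven temperatures x 3 blocks
error_ms = 620.83         # Error (b) mean square entered at the prompt

# Orthogonal polynomial coefficients for 3 equally spaced levels.
contrasts = {1: [-1, 0, 1], 2: [1, -2, 1]}


def contrast_ss(coefs, level_means, n):
    """Single-d.f. sum of squares for a contrast among equally replicated means."""
    estimate = sum(c * m for c, m in zip(coefs, level_means))
    return n * estimate ** 2 / sum(c * c for c in coefs)


ss = {deg: contrast_ss(c, means, n_per_level) for deg, c in contrasts.items()}
f = {deg: ss[deg] / error_ms for deg in ss}

for deg in ss:
    print(f"degree {deg}: SS = {ss[deg]:.4f}, F = {f[deg]:.4f}")
```

Because the means are rounded to four decimals, the quadratic sum of squares agrees with the printed 470.2222 only to about two decimals.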

Orthogonal poly comparisons on another FACTOR? 

YES 

Orthogonal polynomial comparisons on FACTOR 1

? 

NO 

Orthogonal polynomial comparisons on FACTOR 2 

? 

YES 

Enter the max degree of orthogonal poly 

3 

Value associated with level # 1 of FACTOR 2

? 

580 

Value associated with level # 2 of FACTOR 2 

? 

600 

Value associated with level # 3 of FACTOR 2 

? 

620 

Value associated with level # 4 of FACTOR 2 

? 

640 

Is the above information correct ? 

YES 

Enter Error mean square, degrees of freedom
295.66,6                    From AOV table

Orthogonal Polynomial Decomposition on OVEN TEMP.

Degree          SS    F-Ratio    F-Prob
   1      261.6056      .8848    .38320
   2     8930.2500    30.2045    .00152
   3     3302.4500    11.1698    .01557

Level of Treatments = 580 600 620 640
Orthogonal poly comparisons on another FACTOR?
NO

Enter number of desired funtion:
6                           Multiple comparisons

Is the design displayed on the CRT the latest one? 

YES 



**************************************************
Multiple Comparisons
**************************************************
Enter 1 or 2 to specify type of means
1
Which Factor/Main Effect (A or B) should be used ?
B
Error Mean Square, associated Degrees of Freedom
295.66,6
Which procedure would you like to use ?
                            Tukey's HSD
What level of Alpha are you going to use ?
.05
for 4 means, d.f.= 6
?
4.9
Is a plot of HSD desired ?
YES
Plot on CRT ?
NO
Plotter indentifier string (press CONT if 'HPGL')?
Enter select code, bus # (defaults are 7,5)?
Which PEN color should be used?
1
Enter name for labelling Y axis (<11 characters)
LIFE TIME
Beep will sound when plot done, then press CONT
To interrupt plotting press 'STOP' key.



[Plot: MULTIPLE COMPARISON PLOT : TUKEY'S HSD, HICKS SPLIT PLOT ON COMPONENT LIFE TIME. Y axis: LIFE TIME (134.0 to 209.0); X axis: OVEN TEMP. LEVEL NUMBER.]



Tukey's HSD

Error mean square = 295.66
Degrees of freedom = 6
Harmonic average sample size = 9.0000
Alpha level = .05
Table value from Studentized range = 4.9
HSD value = 28.0848

Multiple Comparisons on Factor OVEN TEMP.

Level        Mean    Sample Size    Separation
  2      148.6667         9             a
  3      176.7778         9             b
  4      193.5556         9             b
  1      194.8889         9             b
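
The HSD value printed above is the Studentized range table value multiplied by the standard error of a mean, sqrt(error mean square / sample size). The sketch below redoes that calculation and lists the pairs of oven temperature levels whose means differ by more than the HSD. It is an illustration under the stated formula, not the HP program's code, and the function and variable names are made up for this example.

```python
import math


def tukey_hsd(table_q, error_ms, n):
    """Honestly significant difference for means based on n observations each."""
    return table_q * math.sqrt(error_ms / n)


hsd = tukey_hsd(4.9, 295.66, 9)

# OVEN TEMP. level means from the table above.
means = {2: 148.6667, 3: 176.7778, 4: 193.5556, 1: 194.8889}

# Pairs of levels separated by more than the HSD.
different = sorted(
    (a, b)
    for a in means
    for b in means
    if a < b and abs(means[a] - means[b]) > hsd
)

print(f"HSD = {hsd:.4f}")
print("significantly different pairs:", different)
```

Only pairs involving level 2 exceed the HSD, which matches the separation letters: level 2 stands alone in group a and levels 3, 4, and 1 share group b.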



Another Separation Procedure on Factor 2
?
NO
Another Factor to be used ?
NO
Multiple Comparison Procedures on Two-Way Means ?
NO
Enter number of desired funtion:
6                           Multiple comparisons
Is the design displayed on the CRT the latest one?
YES

**************************************************
Multiple Comparisons
**************************************************
Enter 1 or 2 to specify type of means
1
Which Factor/Main Effect (A or B) should be used ?
B
Error Mean Square, associated Degrees of Freedom
295.66,6
Which procedure would you like to use ?
1                           Least significant difference
What level of Alpha are you going to use ?
.05
Enter table value from Student's t with d.f.= 6
?
2.447
Is a plot of LSD desired ?
YES
Plot on CRT ?
NO
Plotter indentifier string (press CONT if 'HPGL')?
Enter select code, bus # (defaults are 7,5)?
Which PEN color should be used?
1
Enter name for labelling Y axis (<11 characters)
LIFE TIMES
Beep will sound when plot done, then press CONT
To interrupt plotting press 'STOP' key.



[Plot: MULTIPLE COMPARISON PLOT : LSD, HICKS SPLIT PLOT ON COMPONENT LIFE TIME. Y axis: LIFE TIMES; X axis: OVEN TEMP. LEVEL NUMBER.]



Least Significant Difference

Error mean square = 295.66
Degrees of freedom = 6
Harmonic average sample size = 9.0000
Alpha level = .05
Table value from Student's t = 2.447
LSD value = 19.8346

Multiple Comparisons on Factor OVEN TEMP.

Level        Mean    Sample Size    Separation
  2      148.6667         9             a
  3      176.7778         9             b
  4      193.5556         9             b
  1      194.8889         9             b
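
The LSD value follows from the Student's t table value and the standard error of a difference between two means, sqrt(2 x error mean square / sample size). A minimal sketch of that calculation with the values entered above; the function name is an assumption of this example.

```python
import math


def lsd(table_t, error_ms, n):
    """Least significant difference for two means, each based on n observations."""
    return table_t * math.sqrt(2.0 * error_ms / n)


lsd_oven = lsd(2.447, 295.66, 9)
print(f"LSD = {lsd_oven:.4f}")
```

Compared with the HSD of 28.0848 for the same data, the smaller LSD reflects the fact that it controls the error rate per comparison rather than for the whole family of comparisons.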



Another Separation Procedure on Factor 2
?
NO
Another Factor to be used ?
NO
Multiple Comparison Procedures on Two-Way Means ?
NO
Enter number of desired funtion:
1                           Factorial design
Number of factors in design ? (2, 3, or 4)
3
Number of levels of factor A
?
3
Number of levels of factor B
?
3
Number of levels of factor C
?
4
Number of blocks in this design ?
1
No. obs per trt combination in each block (sample) ?
1
Is the above information correct ?
YES
Do YOU want to assign names to the factors ?
YES
Enter the name for factor A (<11 characters)
?
REP
Enter the name for factor B (<11 characters)
?
BAKE TIME
Enter the name for factor C (<11 characters)
?
OVEN TEMP.
No. of decimals for printing calc. values(<=7).
4

* FACTORIAL ANALYSIS OF VARIANCE *
**************************************************

HICKS SPLIT PLOT ON COMPONENT LIFE TIME

DESIGN

Number of factors = 3
No. of levels of factor A = 3
No. of levels of factor B = 3
No. of levels of factor C = 4
No. of major replications (blocks) = 1
No. of minor replications (samples) = 1

Subfiles will be ignored
Response variable(s) are :
Variable no. 1 LIFETIME

MEANS

* Overall mean = 178.4722



* Main Effect Means

Factor A - REP Levels ( 1 - 3 ) =
    187.4167    169.3333    178.6667
Factor B - BAKE TIME Levels ( 1 - 3 ) =
    177.9167    183.5833    173.9167
Factor C - OVEN TEMP. Levels ( 1 - 4 ) =
    194.8889    148.6667    176.7778    193.5556



* Two Way Interaction Means

Factor A - REP down and Factor B - BAKE TIME across
            1           2           3
1    206.7500    196.0000    159.5000
2    168.7500    170.5000    168.7500
3    158.2500    184.2500    193.5000

Factor A - REP down and Factor C - OVEN TEMP. across
            1           2           3           4
1    208.3333    149.3333    190.0000    202.0000
2    194.6667    134.3333    163.6667    184.6667
3    181.6667    162.3333    176.6667    194.0000

Factor B - BAKE TIME down and Factor C - OVEN TEMP. across
            1           2           3           4
1    189.0000    135.3333    185.3333    202.0000
2    201.3333    151.0000    179.0000    203.0000
3    194.3333    159.6667    166.0000    175.6667

Should the 3-way means be printed ?
YES

* Three Way Interaction Means



Factor A - REP, Level 1
Factor B - BAKE TIME down and Factor C - OVEN TEMP. across
            1           2           3           4
1    217.0000    158.0000    229.0000    223.0000
2    233.0000    138.0000    186.0000    227.0000
3    175.0000    152.0000    155.0000    156.0000

Factor A - REP, Level 2
Factor B - BAKE TIME down and Factor C - OVEN TEMP. across
            1           2           3           4
1    188.0000    126.0000    160.0000    201.0000
2    201.0000    130.0000    170.0000    181.0000
3    195.0000    147.0000    161.0000    172.0000

Factor A - REP, Level 3
Factor B - BAKE TIME down and Factor C - OVEN TEMP. across
            1           2           3           4
1    162.0000    122.0000    167.0000    182.0000
2    170.0000    185.0000    181.0000    201.0000
3    213.0000    180.0000    182.0000    199.0000



ANOVA TABLE

            Factorial Analysis of Variance
Source (Name)     df   Sums of Squares   Mean Square
Total             35       29330.9722       838.0278
A  REP             2        1962.7222       981.3611
B  BAKE TIME       2         566.2222       283.1111
C  OVEN TEMP.      3       12494.3056      4164.7685
AB                 4        7021.2778      1755.3194
AC                 6        1773.9444       295.6574
BC                 6        2600.4444       433.4074
ABC               12        2912.0556       242.6713



We can see that the interaction between baking
temperature and replication is significant.



Enter desired number:
7

Enter number of desired funtion:
4

Exit factorial design.

Return to BSDM.



One Way AOV 

Example 

Tissue culture growth was studied after exposure to five 'sugar' treatments: control, 2%
glucose, 2% fructose, 1% glucose and 1% fructose, and 2% sucrose. The response, Y, is length
(in ocular units) of pea section grown in tissue culture with auxin present.

The data were entered using One-Way AOV mode 2, in which all treatments are stored in one
variable. Each treatment has ten observations (samples). Hence, observations 1 to 10 are in
the first treatment, observations 11 to 20 are in the second treatment, etc. The F ratio for
treatments gives a very strong indication that the population treatment means are significantly
different. Both the LSD and Duncan multiple comparison procedures separate the treatments
into three non-overlapping groups: treatments 4, 3, and 2; treatment 5; and treatment 1
(control). Hence, if you add either glucose (2) or fructose (3) or both (4), you get shorter
lengths than if you use just sucrose (5), which in turn gives shorter lengths than the control
treatment.
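
The mode 2 layout described above (one variable, treatments stored one after another) can be pictured as slicing a single list by the number of observations per treatment. The sketch below uses the 50 GROWTH values from the data listing in the run that follows; the helper name `split_by_counts` is hypothetical and only illustrates the storage convention.

```python
# The 50 GROWTH observations in storage order, treatment 1 first.
growth = [
    75, 67, 70, 75, 65, 71, 67, 67, 76, 68,   # 1 control
    57, 58, 60, 59, 62, 60, 60, 57, 59, 61,   # 2 2% glucose
    58, 61, 56, 58, 57, 56, 61, 60, 57, 58,   # 3 2% fructose
    58, 59, 58, 61, 57, 56, 58, 57, 57, 59,   # 4 1% glucose + 1% fructose
    62, 66, 65, 63, 64, 62, 65, 65, 62, 67,   # 5 2% sucrose
]
counts = [10, 10, 10, 10, 10]   # observations per treatment, as entered at the prompts


def split_by_counts(values, sizes):
    """Cut one long variable into consecutive treatment groups."""
    groups, start = [], 0
    for size in sizes:
        groups.append(values[start:start + size])
        start += size
    return groups


groups = split_by_counts(growth, counts)
totals = [sum(g) for g in groups]
means = [sum(g) / len(g) for g in groups]

for i, (t, m) in enumerate(zip(totals, means), start=1):
    print(f"treatment {i}: total = {t}, mean = {m:.4f}")
```

The totals and means should match the Treatment Statistics table printed by the program.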



* DATA MANIPULATION * 

Enter DATA TYPE (Press CONTINUE for RAW DATA):
1                           Raw data
Mode number = ?
2                           On mass storage
Is data stored on program's scratch file (DATA)?
NO
Data file name = ?
TISSUE INTERNAL
Was data stored by the BS&DM system ?
YES
Is data medium placed in device INTERNAL
?
YES
Is program medium placed in correct device ?
YES

TISSUE CULTURE GROWTH

Data file name: TISSUE:INTERNAL
Data type is: Raw data
Number of observations: 50
Number of variables: 1

Variable names:
1. GROWTH

Subfile name      beginning observation    number of observations
1. CONTROL                  1                        10
2. 2% GLUCOSE              11                        10
3. 2% FRUCT.               21                        10
4. 1%GLU+1FRU              31                        10
5. 2%SUCROSE               41                        10



SELECT ANY KEY
Option number = ?
1                           Select special function key labeled-LIST
                            List all data

TISSUE CULTURE GROWTH
Data type is: Raw data

VARIABLE # 1 (GROWTH)

  I     OBS(I)   OBS(I+1)   OBS(I+2)   OBS(I+3)   OBS(I+4)
  1   75.00000   67.00000   70.00000   75.00000   65.00000
  6   71.00000   67.00000   67.00000   76.00000   68.00000
 11   57.00000   58.00000   60.00000   59.00000   62.00000
 16   60.00000   60.00000   57.00000   59.00000   61.00000
 21   58.00000   61.00000   56.00000   58.00000   57.00000
 26   56.00000   61.00000   60.00000   57.00000   58.00000
 31   58.00000   59.00000   58.00000   61.00000   57.00000
 36   56.00000   58.00000   57.00000   57.00000   59.00000
 41   62.00000   66.00000   65.00000   63.00000   64.00000
 46   62.00000   65.00000   65.00000   62.00000   67.00000

Option number = ?           Exit list procedure
SELECT ANY KEY              Select ADV STAT
                            Remove BSDM media
                            Insert AOV1 media
Enter number of desired funtion:
1                           Select one way classification
How many treatments in this analysis ?
5
Enter name for treatment/factor (<11 characters)
TISSUE
Do YOU want to assign names to the treatments ?
YES
Enter the name for treatment 1 (<11 characters)
?
CONTROL
Enter the name for treatment 2 (<11 characters)
?
2% GLUCOSE
Enter the name for treatment 3 (<11 characters)
?
2% FRUCT.
Enter the name for treatment 4 (<11 characters)
?
1%GLU+FRU
Enter the name for treatment 5 (<11 characters)
?
2%SUCROSE
Are the names displayed on the CRT correct ?
YES
Treatment definition mode = ?
2
Enter the number of observations in treatment 1
?
10
Enter the number of observations in treatment 2
?
10
Enter the number of observations in treatment 3
?
10
Enter the number of observations in treatment 4
?
10
Enter the number of observations in treatment 5
?
10

Subfile # (enter to ignore subfile) = ?

Is the design description on the CRT correct ?
YES

**************************************************
ONE-WAY ANALYSIS OF VARIANCE:
TISSUE CULTURE GROWTH

# of decimals for printing calculated values(<=7)?
4

DESIGN

# of treatments = 5
# of observations in treatment 1 = 10
# of observations in treatment 2 = 10
# of observations in treatment 3 = 10
# of observations in treatment 4 = 10
# of observations in treatment 5 = 10
Response = GROWTH



SUMMARY STATISTICS

                    Treatment Statistics
Treatment          Total       Mean   Stan. Dev     N
CONTROL         701.0000    70.1000      3.9847    10
2% GLUCOSE      593.0000    59.3000      1.6364    10
2% FRUCT.       582.0000    58.2000      1.8738    10
1%GLU+FRU       580.0000    58.0000      1.4142    10
2%SUCROSE       641.0000    64.1000      1.7920    10

Overall        3097.0000    61.9400      5.1958    50

ANOVA TABLE

             One-Way Analysis of Variance Table
Source     Df          SS          MS    F-Ratio    F-Prob
Total      49   1322.8200
TISSUE      4   1077.3200    269.3300    49.3680   0.00000
Error      45    245.5000      5.4556



We can see that the effects of population 
treatment levels are significantly different. 
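
The columns of the one-way table are related by simple arithmetic: the Error sum of squares is Total minus TISSUE, each mean square is a sum of squares divided by its degrees of freedom, and the F ratio is the treatment mean square over the error mean square. A short Python check using the printed sums of squares (variable names are illustrative only):

```python
n_total, k = 50, 5                      # observations and treatments
ss_total, ss_treat = 1322.82, 1077.32   # from the table above

ss_error = ss_total - ss_treat
df_treat, df_error = k - 1, n_total - k
ms_treat = ss_treat / df_treat
ms_error = ss_error / df_error
f_ratio = ms_treat / ms_error

print(f"Error SS = {ss_error:.4f}, MS(TISSUE) = {ms_treat:.4f}, "
      f"MS(Error) = {ms_error:.4f}, F = {f_ratio:.4f}")
```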

Bartlett's test of homogeneity of variance :

Chi-square value = 13.939 with degrees of freedom = 4

    χ²(4, .05) = 9.488 and χ²(4, .01) = 13.277. Both are smaller than the
    calculated χ² value of 13.9386, so we know that the variances are not
    homogeneous.

Do you wish to specify another subfile ?
NO
Enter desired number:
3                           Multiple comparisons
Is the design displayed on the CRT the latest one?
YES
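
Bartlett's chi-square reported above can be reproduced with the usual textbook form of the statistic. The sketch below assumes that form (with the standard correction factor) and uses within-treatment sums of squared deviations recomputed from the listed observations; they add up to the Error SS of 245.5. The function name is hypothetical, and this is not the HP routine itself.

```python
import math

# Within-treatment sums of squared deviations, recomputed from the data
# listing (treatments 1 to 5); their sum is the Error SS, 245.5.
ss_within = [142.9, 24.1, 31.6, 18.0, 28.9]
sizes = [10, 10, 10, 10, 10]


def bartlett_chi_square(ss, n):
    """Bartlett's statistic for homogeneity of k variances, with k - 1 d.f."""
    k = len(ss)
    df = [size - 1 for size in n]
    df_total = sum(df)
    pooled_var = sum(ss) / df_total
    m = df_total * math.log(pooled_var) - sum(
        d * math.log(s / d) for s, d in zip(ss, df)
    )
    c = 1.0 + (sum(1.0 / d for d in df) - 1.0 / df_total) / (3.0 * (k - 1))
    return m / c


chi_square = bartlett_chi_square(ss_within, sizes)
print(f"Chi-square = {chi_square:.4f} with {len(sizes) - 1} d.f.")
```

The large control-group variance (142.9 / 9) relative to the other four treatments is what drives the statistic above both tabled critical values.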



MULTIPLE COMPARISONS
**************************************************

Which procedure would you like to use ?
1                           Least significant difference
What level of Alpha are you going to use ?
.05
Enter table value from Student's t with d.f.= 45
?
2.014
Is a plot of LSD desired ?
YES
Plot on CRT ?
NO
Plotter indentifier string (press CONT if 'HPGL')?
Plotter select code, bus # (defaults are 7,5)?
Beep will sound when plot done, then press CONT.
Which PEN color should be used?
1
Enter name for labelling Y axis(<11 characters)
LENGTH
To interrupt plotting, press 'STOP' key.

[Plot: MULTIPLE COMPARISON PLOT : LSD, TISSUE CULTURE GROWTH. Y axis: LENGTH (56.0 to 72.0); X axis: TISSUE LEVEL NUMBER.]



Least Significant Difference

Error mean square = 5.4556
Degrees of freedom = 45
Harmonic average sample size = 10.0000
Alpha level = .05
Table value from Student's t = 2.014
LSD value = 2.1037

Multiple Comparisons on TISSUE

Level       Mean    Sample Size    Separation
  4      58.0000        10             a
  3      58.2000        10             a
  2      59.3000        10             a
  5      64.1000        10             b
  1      70.1000        10             c

    This separates the treatments into three non-overlapping groups:
    treatments 4, 3, and 2 in one group, 5 in another, and 1 in the last.

Another Separation Procedure on TISSUE
?
YES
Which procedure would you like to use ?
3
What level of Alpha are you going to use ?
.05



Select Duncan's Test 



Duncan's Test

Error mean square = 5.4556
Degrees of freedom = 45
Harmonic average sample size = 10.0000
Alpha level = .05

Means Separated    Table Value    Required Difference

for 5 means and d.f.= 45
?
3.16
       5              3.1600            2.3340
for 4 means and d.f.= 45
?
3.095
       4              3.0950            2.2860
for 3 means and d.f.= 45
?
3.005
       3              3.0050            2.2195
for 2 means and d.f.= 45
?
2.85
       2              2.8500            2.1051

Multiple Comparisons on TISSUE

Level       Mean    Sample Size    Separation
  4      58.0000        10             a
  3      58.2000        10             a
  2      59.3000        10             a
  5      64.1000        10             b
  1      70.1000        10             c

    Same conclusion as above.
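
Each "Required Difference" in the Duncan output is the table value for the number of means spanned, multiplied by sqrt(error mean square / sample size). The sketch below repeats that multiplication for the four table values entered above; the names are illustrative and not taken from the program.

```python
import math

error_ms, n = 5.4556, 10
# Duncan table values entered at the prompts, keyed by number of means spanned.
table_values = {5: 3.16, 4: 3.095, 3: 3.005, 2: 2.85}

std_error = math.sqrt(error_ms / n)
required = {p: q * std_error for p, q in table_values.items()}

for p in sorted(required, reverse=True):
    print(f"{p} means: required difference = {required[p]:.4f}")
```

Unlike Tukey's HSD, the required difference shrinks as fewer means are spanned, so adjacent means are compared against the smallest value.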






Another Separation Procedure on TISSUE 

? 

NO 



NOTE: HARMONIC AVER SAMPLE SIZE OF 10 USED
      IN CALCULATING THE MULTIPLE COMPARISONS.
Enter number of desired funtion:
9                           Return to BSDM



Two Way (Unbalanced) 

Example 

The following data from Bancroft (1968, Ex. 1.3) form a two-way classification, with factor A
representing five different batches of silver and factor B representing two batches of iodine,
which are used to make silver iodide. The response, Y, is the reacting weight (coded).
Apparently several samples were lost, because the design is unbalanced.

                        Iodine
Silver            1                2
   1          22, 25          -1, 40, 18
   2          41, 41          23, 13
   3          29, 20, 37      (empty)
   4          49, 50          61
   5          55              (empty)





The data are entered using two variables. Variable one identifies the row and column,
and variable two contains the response, Y. Hence, a value of 0301 in variable one indicates
that the observation in variable two is from the third level of silver (A) and the first level of
iodine (B). The Two-Way Unbalanced routine is used, with the method of fitting constants
selected as the desired procedure because of the presence of empty cells. This analysis
indicates that the sampled batches of silver do not support the hypothesis of equality for the
population means.
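
The packed identifier can be decoded by splitting it into a row part and a column part. The sketch below assumes two decimal digits for each index, which is inferred only from the example value 0301 and is not stated in the manual; `unpack_cell` is a hypothetical helper. It also tallies the cells to show which ones are empty.

```python
def unpack_cell(code):
    """Split a packed identifier such as 301 (entered as 0301) into (row, column).

    Assumes two decimal digits per index, inferred from the 0301 example.
    """
    return divmod(int(code), 100)


# Variable 1 (packed row/column) and variable 2 (response) from the listing.
codes = [101, 101, 201, 201, 301, 301, 301, 401, 401, 501,
         102, 102, 102, 202, 202, 402]
weights = [22, 25, 41, 41, 29, 20, 37, 49, 50, 55,
           -1, 40, 18, 23, 13, 61]

cells = {}
for code, y in zip(codes, weights):
    cells.setdefault(unpack_cell(code), []).append(y)

# Cells of the 5 x 2 layout that received no observations.
empty = [(r, c) for r in range(1, 6) for c in range(1, 3) if (r, c) not in cells]

print("cell counts:", {cell: len(ys) for cell, ys in sorted(cells.items())})
print("empty cells:", empty)
```

The two empty cells (silver 3 and silver 5 with iodine 2) are the reason the method of fitting constants is chosen in this example.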



The multiple comparison procedure of Student, Newman, and Keuls (SNK) shows no separation
among the five samples of silver. This can probably be explained by both the conservative
nature of the SNK procedure and the fact that the AOV procedure uses an adjusted mean
square for silver.



************************************************* 

* DATA MANIPULATION * 

Enter DATA TYPE (Press CONTINUE for RAW DATA):
1                           Raw data
Mode number = ?
                            On mass storage
Is data stored on program's scratch file (DATA)?
NO
Data file name = ?
SLVRIODN INTERNAL
Was data stored by the BS&DM system ?
YES
Is data medium placed in device INTERNAL
?
YES
Is program medium placed in correct device ?
YES

CODED REACTIN WEIGHTS OF SLIVER IODINE

Data file name: SLVRIODN:INTERNAL
Data type is: Raw data
Number of observations: 16
Number of variables: 2

Variable names:
1. ROWjCOLUMN
2. RWEIGHT

Subfiles: NONE

SELECT ANY KEY              Select special function key labeled-LIST
Option number = ?
1
Enter method for listing data:
3



CODED REACTIN WEIGHTS OF SLIVER IODINE
Data type is: Raw data

        Variable # 1    Variable # 2
OBS#    (ROWjCOLUMN)    (RWEIGHT   )
  1       101.00000        22.00000
  2       101.00000        25.00000
  3       201.00000        41.00000
  4       201.00000        41.00000
  5       301.00000        29.00000
  6       301.00000        20.00000
  7       301.00000        37.00000
  8       401.00000        49.00000
  9       401.00000        50.00000
 10       501.00000        55.00000
 11       102.00000        -1.00000
 12       102.00000        40.00000
 13       102.00000        18.00000
 14       202.00000        23.00000
 15       202.00000        13.00000
 16       402.00000        61.00000



Option number = ?           Exit list routine
SELECT ANY KEY              Select special function key labeled-ADV STAT
                            Remove BSDM media
                            Insert AOV1 media
Enter number of desired funtion:
2                           Two-way unbalanced design
Data storage type =
2
Variable number for packed identification =
1
Enter # of rows, # of columns (separate by comma)
5,2
Do YOU wish to label the row and column factors ?
YES
Enter name of row factor (<11 characters)
SILVER
Enter name of column factor (<11 characters)
IODINE
Enter the variable number for response
2
Is the above information correct ?
YES

TWO-WAY UNBALANCED ANALYSIS OF VARIANCE:
CODED REACTIN WEIGHTS OF SLIVER IODINE

# of decimal places for calculated values (<=7)?
4

DESIGN

# of rows = 5
# of columns = 2
Response = RWEIGHT

SUMMARY STATISTICS

                    Subclass Statistics
Row   Column       Total       Mean   Stan. Dev    N
 1      1        47.0000    23.5000      2.1213    2
 1      2        57.0000    19.0000     20.5183    3
 2      1        82.0000    41.0000       .0000    2
 2      2        36.0000    18.0000      7.0711    2
 3      1        86.0000    28.6667      8.5049    3
 4      1        99.0000    49.5000       .7071    2
 4      2        61.0000    61.0000      0.0000    1
 5      1        55.0000    55.0000      0.0000    1



            Row Statistics
Row        Total       Mean    N
 1      104.0000    20.8000    5
 2      118.0000    29.5000    4
 3       86.0000    28.6667    3
 4      160.0000    53.3333    3
 5       55.0000    55.0000    1

            Column Statistics
Col        Total       Mean    N
 1      369.0000    36.9000   10
 2      154.0000    25.6667    6

ANOVA TABLE

Preliminary AOV ( Test two way Model )

Source          Df          SS          MS    F-Ratio    F-Prob
Total           15   4255.4375
Subclass         7   3213.7708    459.1101     3.5260    .04908
Error            8   1041.6667    130.2083

Preliminary AOV ( Test for Interaction )

Source          Df          SS          MS    F-Ratio    F-Prob
Total           15   4255.4375
Main             5   2722.2592    544.4518     4.1814    .03641
Int              2    491.5116    245.7558     1.8874    .21308
Error            8   1041.6667    130.2083

**************************************************

Analysis of Variance ( Method of Fitting Constants )

Source          Df          SS          MS    F-Ratio    F-Prob
Total           15   4255.4375
SILVER           4   2572.3042    643.0760
IODINE (Adj)     1    149.9550    149.9550     1.1517    .31450
IODINE           1    473.2042    473.2042
SILVER (Adj)     4   2249.0550    562.2638     4.3182    .03749
Int              2    491.5116    245.7558
Error            8   1041.6667    130.2083

**************************************************

Enter desired number:
3                           Multiple comparisons
Is the design displayed on the CRT the latest one?
YES

**************************************************



MULTIPLE COMPARISONS

Enter 1 or 2 to specify type of means
1
Which Factor/Main Effect (A or B) should be used ?
A
Which procedure would you like to use ?
4                           Student-Newman-Keuls
What level of Alpha are you going to use ?
.05

Student-Newman-Keuls Test

Error mean square = 130.2083
Degrees of freedom = 8
Harmonic average sample size = 2.3622
Alpha level = .05

Means Separated    Table Value    Required Difference

for 5 means and d.f.= 8
?
4.89
       5              4.8900           36.3053
for 4 means and d.f.= 8
?
4.53
       4              4.5300           33.6325
for 3 means and d.f.= 8
?
4.04
       3              4.0400           29.9945
for 2 means and d.f.= 8
?
3.26
       2              3.2600           24.2035

Multiple Comparisons on SILVER

Level       Mean    Sample Size    Separation
  1      20.8000         5             a
  3      28.6667         3             a
  2      29.5000         4             a
  4      53.3333         3             a
  5      55.0000         1             a



Another Separation Procedure on SILVER
?
NO
Another Factor to be used ?
NO
Multiple Comparison Procedures on Two-Way Means ?
NO

NOTE: HARMONIC AVER SAMPLE SIZE OF 2.36220472441 USED
      IN CALCULATING THE MULTIPLE COMPARISONS.
Enter number of desired funtion:        Return to BSDM

V 
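
With unequal sample sizes the program substitutes the harmonic average of the row sample sizes (5, 4, 3, 3, 1) for n. The sketch below recomputes that average and the SNK required differences, again as table value x sqrt(error mean square / n); it is an illustration of the arithmetic with made-up variable names, not the HP code.

```python
import math

sizes = [5, 4, 3, 3, 1]                 # observations per silver batch
harmonic_n = len(sizes) / sum(1.0 / s for s in sizes)

error_ms = 130.2083
# SNK table values entered at the prompts, keyed by number of means spanned.
table_values = {5: 4.89, 4: 4.53, 3: 4.04, 2: 3.26}
required = {p: q * math.sqrt(error_ms / harmonic_n) for p, q in table_values.items()}

# Largest difference among the silver means (level 5 minus level 1).
largest_gap = 55.0000 - 20.8000

print(f"harmonic average sample size = {harmonic_n:.11f}")
for p in sorted(required, reverse=True):
    print(f"{p} means: required difference = {required[p]:.4f}")
print("largest observed difference =", round(largest_gap, 4))
```

Because even the largest observed difference (34.2) falls short of the 5-mean required difference, SNK stops at the first step and every level receives the same letter.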






One Way Analysis of Covariance

Example 

An experiment to evaluate the effects of various growth stimulants (X-4, BC, F32, and OX) on
tomato seedlings was performed, in which:

X = Initial length of seedling (mm)

Y = Growth in length (mm) during experiment



Stimulant X-4    Stimulant BC    Stimulant F32    Stimulant OX
   X     Y         X     Y          X     Y         X     Y
  29    22        15    30         16    12         5    23
  20    22         9    32         31     8        25    31
  14    20         1    26         26    13        16    28
  21    24         6    25         35    25        10    26
   6    12        19    37         12     7        24    33



The data were entered using the first mode of storage for the covariance program. That is,
each X,Y pair was stored in two variables, and each of the four treatments used a different
variable pair. Hence, for the stimulant X-4, the initial length, X, was stored in Variable 1 and
the growth, Y, was stored in Variable 2; while for the stimulant OX, the X value was stored in
Variable 7 and the Y in Variable 8. Each variable has five observations.

The first part of the output from the One-Way Covariance routine shows the within-treatment
statistics, including totals, means, standard deviations, sample sizes, correlation coefficients,
and regression coefficients. Note that the "Overall" correlation coefficient and regression
coefficient are for all of the data points taken together without regard to treatment group.
Hence, it should not be surprising that no overall relationship exists between the X and Y
variables. The test for homogeneity of regression coefficients confirms that we can accept the
hypothesis that all treatment regression coefficients are essentially the same. The test for
significance of pooled regression confirms that the relationship between X and Y pooled
across all treatments is significant (level = .0003).

Whereas the F ratio for treatment differences on the X's is non-significant (level = .12117), the
F ratio on the original Y's is significant at the .00037 level. The analysis of covariance
adjustment to the original data does not change the significance of the treatment effect
(α = .00000), but rather makes the difference in the means even more pronounced. This is
shown by studying the "Table of Means" and noting the adjustment made in the original Y
means after the use of the covariate X.

The use of the Tukey HSD multiple comparison procedure shows that stimulants one and
three differ from all other stimulants, while no significant difference can be shown between
two and four.



* DATA MANIPULATION * 

****************************************************** 

Enter DATA TYPE (Press CONTINUE for RAW DATA):
1                           Raw data
Mode number = ?
                            On mass storage
Is data stored on program's scratch file (DATA)?
NO
Data file name = ?
TOMATO INTERNAL
Was data stored by the BS&DM system ?
YES
Is data medium placed in device INTERNAL
?
YES
Is program medium placed in correct device ?
YES

EFFECTS OF GROWTH STIMULANTS ON TOMATO SEEDLING LENGTHS

Data file name: TOMATO:INTERNAL
Data type is: Raw data
Number of observations: 5
Number of variables: 8



Variable names:
1. X-4:I
2. X-4:G
3. BC:I
4. BC:G
5. F32:I
6. F32:G
7. OX:I
8. OX:G

Subfiles: NONE

SELECT ANY KEY              Select LIST key
Option number = ?
1
Enter method for listing data:
3

EFFECTS OF GROWTH STIMULANTS ON TOMATO SEEDLING LENGTHS
Data type is: Raw data

        Variable # 1   Variable # 2   Variable # 3   Variable # 4   Variable # 5
OBS#    (X-4:I     )   (X-4:G     )   (BC:I      )   (BC:G      )   (F32:I     )
  1       29.00000       22.00000       15.00000       30.00000       16.00000
  2       20.00000       22.00000        9.00000       32.00000       31.00000
  3       14.00000       20.00000        1.00000       26.00000       26.00000
  4       21.00000       24.00000        6.00000       25.00000       35.00000
  5        6.00000       12.00000       19.00000       37.00000       12.00000

        Variable # 6   Variable # 7   Variable # 8
OBS#    (F32:G     )   (OX:I      )   (OX:G      )
  1       12.00000        5.00000       23.00000
  2        8.00000       25.00000       31.00000
  3       13.00000       16.00000       28.00000
  4       25.00000       10.00000       26.00000
  5        7.00000       24.00000       33.00000



Option number = ?           Exit list procedure
SELECT ANY KEY              Select ADV STAT key
                            Remove BSDM media
                            Insert AOV1 media
Enter number of desired funtion:
3                           One way analysis of covariance
How many treatments in this analysis ?
4
Enter a name for treatment/factor (<11 characters)
TREATMENT
Do YOU want to assign names to the treatments ?
YES
Enter the name for trt. 1 (<=10 characters)
?
X-4
Enter the name for trt. 2 (<=10 characters)
?
BC
Enter the name for trt. 3 (<=10 characters)
?
F32
Enter the name for trt. 4 (<=10 characters)
?
OX
Are the names displayed on the CRT correct ?
YES
Treatment definition mode = ?
1
Enter the X var., Y var. for treatment 1
?
1,2
Enter the X var., Y var. for treatment 2
?
3,4
Enter the X var., Y var. for treatment 3
?
5,6
Enter the X var., Y var. for treatment 4
?
7,8
Is the design description on the CRT correct ?
YES

**************************************************
ONE-WAY ANALYSIS OF COVARIANCE
EFFECTS OF GROWTH STIMULANTS ON TOMATO SEEDLING LENGTHS
**************************************************

# of decimal places for calculated values(<=7) ?
4



DESIGN

# of treatments = 4
# of observations in treatment 1 = 5
# of observations in treatment 2 = 5
# of observations in treatment 3 = 5
# of observations in treatment 4 = 5
Covariate X = X-4:I
Response Y = X-4:G



SUMMARY STATISTICS

                    Treatment Statistics
Treatment           Total       Mean   Stan. Dev    N
X-4        X      90.0000    18.0000      8.5732    5
           Y     100.0000    20.0000      4.6904    5
BC         X      50.0000    10.0000      7.1414    5
           Y     150.0000    30.0000      4.8477    5
F32        X     120.0000    24.0000      9.7724    5
           Y      65.0000    13.0000      7.1764    5
OX         X      80.0000    16.0000      8.6891    5
           Y     141.0000    28.2000      3.9623    5

Overall    X     340.0000    17.0000      9.4088
           Y     456.0000    22.8000      8.5076

            Within Treatment Regressions
Treatment     Corr.Coef.    Regression Coef.
X-4              .8331            .4558
BC               .8449            .5735
F32              .6310            .4634
OX               .9730            .4437

Overall         -.0487           -.0440

ANQVA TABLE 

One-Way Analysis of Variance Table(X-Var iable) 

MS F-Ratio 
2.2561 



Source 


Df 




SS 


Total 


19 


1682. 


0000 


Treatment 


3 


500. 


0000 


Error 


16 


1182 


.0000 



Source 


Df 


SS 


Total 


19 


1375.2000 


Treatment 


3 


924.4000 


Error 


16 


450.8000 



166.6667 
73.8750 



One-Way Analysis of Variance TabIe<Y-Var iable) 

MS F~Ratio 

308.1333 10.9364 
28.1750 



F~Prob 
.12117 



F-Prob 
.00037 



We can see that the treatment means of the X variable
(the covariate) do not differ significantly, but the
treatment means of the Y variable do.
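The mean squares and F-ratios in the two analysis of variance tables can be checked by hand from the sums of squares and degrees of freedom. The following is a minimal Python sketch, not part of the HP program; the function name is illustrative, and the F-Prob column is not reproduced because it requires the F distribution.

```python
def mean_squares_and_f(ss_treatment, df_treatment, ss_error, df_error):
    """Return (MS treatment, MS error, F-ratio) for a one-way layout."""
    ms_treatment = ss_treatment / df_treatment
    ms_error = ss_error / df_error
    return ms_treatment, ms_error, ms_treatment / ms_error


# Sums of squares and degrees of freedom taken from the printed tables.
x_ms_t, x_ms_e, x_f = mean_squares_and_f(500.0, 3, 1182.0, 16)
y_ms_t, y_ms_e, y_f = mean_squares_and_f(924.4, 3, 450.8, 16)

print("X-variable: MS = %.4f, %.4f  F = %.4f" % (x_ms_t, x_ms_e, x_f))
print("Y-variable: MS = %.4f, %.4f  F = %.4f" % (y_ms_t, y_ms_e, y_f))
```

The printed values should agree with the MS and F-Ratio columns of the tables to four decimal places.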






Test of homogeneity of regression coefficients :

F-value = .0538 with 3 and 12 degrees of freedom
P(F > .05) = .98277

We consider all treatment regression coefficients to be the same.

Test of significance of pooled regression coefficient :

F-value = 21.8324 with 1 and 15 degrees of freedom
P(F > 21.83) = .00030

We can see that the relationship between X and Y pooled across all
treatments is significant.

Pooled Regression Coefficient = .475465313029

Pooled Correlation Coefficient = .7699

********************************************************************************

One Way Analysis of Covariance Table

Source        Df          SS           MS       F-Ratio     F-Prob

Total         18      1371.9444
Treatment      3      1188.3559     396.1186    32.3647    0.00000
Error         15       183.5885      12.2392

******************************************************************************** 

We can see that the effects of treatments 
are significantly different. 
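The adjusted Y means reported for this example are consistent with the usual covariance adjustment, in which each treatment's Y mean is shifted along the pooled regression line to the overall X mean. The Python sketch below assumes that formula (the manual does not state it here) and uses the treatment means and pooled regression coefficient printed in this example; the function and variable names are illustrative only.

```python
# Pooled within-treatment regression coefficient and overall X mean,
# both taken from the printed output of this example.
POOLED_SLOPE = 0.475465313029
OVERALL_X_MEAN = 17.0

# Treatment name -> (X mean, unadjusted Y mean), from the summary statistics.
TREATMENT_MEANS = {
    "X-4": (18.0, 20.0),
    "BC": (10.0, 30.0),
    "F32": (24.0, 13.0),
    "OX": (16.0, 28.2),
}


def adjusted_mean(y_mean, x_mean, overall_x_mean, slope):
    """Assumed adjustment: move the Y mean to the overall X mean along the pooled slope."""
    return y_mean - slope * (x_mean - overall_x_mean)


adjusted = {
    name: adjusted_mean(y_mean, x_mean, OVERALL_X_MEAN, POOLED_SLOPE)
    for name, (x_mean, y_mean) in TREATMENT_MEANS.items()
}

for name, value in adjusted.items():
    print("%-4s %.4f" % (name, value))
```

Under this assumption the results match the Adjusted Y Mean column of the Table of Y Means to four decimal places.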

Table of Y Means

Treatment name   Unadjusted Y Mean   Adjusted Y Mean   Stand. Dev   N

X-4                  20.0000             19.5245         1.5646     5
BC                   30.0000             33.3283         1.5646     5
F32                  13.0000              9.6717         1.5646     5
OX                   28.2000             28.6755         1.5646     5

********************************************************************************

Do you want to change response for this subfile?
NO

Enter desired number:

3    Multiple comparisons

Is the design displayed on the CRT the latest one?

YES

********************************************************************************

MULTIPLE COMPARISONS

********************************************************************************

Which procedure would you like to use ?

2    Tukey's HSD

What level of Alpha are you going to use ?

.05

for 4 means and d.f.= 15

?

4.08

Is a plot of HSD desired ?

YES

Plot on CRT ?

NO

Plotter indentifier string (press CONT if 'HPGL')?

Plotter select code, bus # (defaults are 7,5)?






Beep will sound when plot done, then press CONT . 

Which PEN color should be used?

1

Enter name for labelling Y axis (<11 characters)

GROWTH 

To interrupt plotting, press 'STOP' key. 



[Plot: MULTIPLE COMPARISON PLOT : TUKEY'S HSD. EFFECTS OF GROWTH STIMULANTS ON TOMATO SEEDLING LEN. Horizontal axis: LEVEL NUMBER; vertical axis scaled from 6.0 to 37.0.]



Tukey's HSD

Error Mean square = 12.2392

Degrees of freedom = 15

Harmonic average sample size = 5.0000

Alpha level = .05

Table value from Studentized range = 4.08

HSD value = 6.3834



Level 3 differs from Level 1 , which differs 
from Level 4 & 2 
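The HSD value in this listing can be reproduced from the other printed quantities. The Python sketch below assumes the standard Tukey formula, HSD = q * sqrt(MSE / n), with q the Studentized range table value entered by the user; the function names are illustrative and are not part of the HP program.

```python
import math


def tukey_hsd(table_value, error_mean_square, sample_size):
    """Assumed standard formula: HSD = q * sqrt(MSE / n)."""
    return table_value * math.sqrt(error_mean_square / sample_size)


# Values printed in the listing above.
hsd = tukey_hsd(4.08, 12.2392, 5.0)

# Adjusted treatment means by level number, from the Table of Y Means.
adjusted_means = {1: 19.5245, 2: 33.3283, 3: 9.6717, 4: 28.6755}


def differs(level_a, level_b):
    """Two levels are separated when their means differ by more than the HSD."""
    return abs(adjusted_means[level_a] - adjusted_means[level_b]) > hsd


print("HSD = %.4f" % hsd)
print("3 vs 1:", differs(3, 1))
print("1 vs 4:", differs(1, 4))
print("4 vs 2:", differs(4, 2))
```

Levels 4 and 2 differ by less than the HSD, which is why they are not separated from each other in this example.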






Multiple Comparisons on TREATMENT

Level       Mean       Sample Size    Separation

  3         9.6717          5             a
  1        19.5245          5             b
  4        28.6755          5             c
  2        33.3283          5             c



Another Separation Procedure on TREATMENT 
? 

NO 



NOTE: HARMONIC AVER SAMPLE SIZE OF 5 USED 

IN CALCULATING THE MULTIPLE COMPARISONS. 
Enter number of desired funtion=



Return to BSDM 






Principal Components 
and Factor Analysis 



General Information 

Description 

The Principal Components and Factor Analysis Software performs a variety of factor-analytic
techniques. Input may be raw data, a correlation matrix, a covariance matrix, or a
factor matrix. Factors are extracted from the correlation matrix. You may choose either the 
principal axes method or the maximum likelihood method to extract the initial factors. 
Orthogonal varimax or quartimax rotations and/or oblique oblimin rotations may be applied 
to the factor matrix. In the oblique rotation, you can control the degree of correlations 
among factors. Graphical presentation of the relationship between pairs of initial or rotated 
factors is also available. 

The program computes the case scores and provides a plot for the case scores between each 
pair of factors if the raw data has been input. Case scores may be stored on a new file for 
further study. 

For a brief discussion of the techniques and computing formulas used in these programs, 
see the Discussion Section. 

Setting Up the Data 

The first thing you need to do is to enter the data by using the Basic Statistics and Data 
Manipulation (BSDM) routines. The input may be the raw data, a correlation matrix, a covar- 
iance matrix, or a factor matrix. If a correlation matrix or a covariance matrix is to be entered, 
only the distinct elements will be requested, i.e., only the portion on and above the main 
diagonal. After the data has been loaded into memory, you are ready to use the Principal 
Components and Factor Analysis programs. 

Special Considerations 

Factor or Principal Component Scores

In the case where an observation has one or more missing values, the score for that observa- 
tion will not be calculated and a blank line will be printed. 

Storing the Correlation Matrix 

In the case where it would be desirable to continue analysis at another time, you may store 
the correlation matrix. Note that the correlation matrix can later be input as data in BSDM. 






Principal Components 

Object of Program 

A principal components analysis for a correlation matrix may be performed by selecting this 
option. Principal components will be printed. A table of eigenvalues is then printed. This 
includes the eigenvectors as well as the proportion and the cumulative proportion of the 
total variance accounted for by each component. 

If raw data has been input, case scores on the components may be computed and stored. If 
a missing value is encountered in the calculation of component scores, the program will 
ignore that particular observation. Case scores are calculated for all observations in the data 
set even if the principal components were developed for only one subfile. 

Special Considerations 

Component Output Options 

Four output options are available and are described on the CRT display. Each option tells
the program how to determine the number of components to output.
When using the minimum eigenvalue size option, many researchers choose a value of 1.00, 
while the maximum cumulative percent some researchers use is about 90 percent. The 
calculations, however, will be done for all principal components, i.e., one for each variable 
which has been included in the analysis. The number of components which result from your 
selected option will be used to determine the number printed later on in this routine. 
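As an illustration of two of the selection rules mentioned above, the Python sketch below counts components by a minimum eigenvalue and by a cumulative percent of total variance. The exact rules used by the program are not spelled out in this section, so both functions are assumptions (components are kept while the eigenvalue is at least the minimum, or until the cumulative percent first reaches the limit), and the eigenvalues are hypothetical values chosen to sum to p = 5.

```python
def components_by_min_eigenvalue(eigenvalues, minimum=1.0):
    """Count components whose eigenvalue is at least the chosen minimum (assumed rule)."""
    return sum(1 for value in eigenvalues if value >= minimum)


def components_by_cumulative_percent(eigenvalues, percent=90.0):
    """Smallest number of leading components reaching the cumulative percent (assumed rule)."""
    total = sum(eigenvalues)
    running = 0.0
    for count, value in enumerate(sorted(eigenvalues, reverse=True), start=1):
        running += value
        if 100.0 * running / total >= percent:
            return count
    return len(eigenvalues)


# Hypothetical eigenvalues of a 5 x 5 correlation matrix (they sum to 5).
example_eigenvalues = [2.0, 1.5, 0.8, 0.5, 0.2]

by_size = components_by_min_eigenvalue(example_eigenvalues, 1.0)
by_percent = components_by_cumulative_percent(example_eigenvalues, 90.0)
print(by_size, by_percent)
```

With these hypothetical eigenvalues the two rules give different answers, which shows why the choice of output option matters.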

Plots 

For both the principal components plot and the component scores plot, you may select 
component numbers up to and including the number of variables you originally specified for 
the present analysis. Of course, if you originally had twenty variables, a plot of the 19th or 
20th components may not be very useful. 

Storing Principal Components Scores 

The component scores are calculated and stored in the data matrix for all components 
which you specify. Component scores are generated for all observations in the data set 
across all subfiles. This feature may be useful for cross validation of the components be- 
tween subfiles. 






Factor Analysis 

Object of Program 

The extraction and rotation of the initial factors may be performed by selecting this option. 
Factors are extracted from a correlation matrix by the principal axes method or by the 
maximum likelihood method. If the principal axes method is used, three types of initial 
communality estimates may be used as diagonal elements of the correlation matrix; namely, 
squared multiple correlations, maximum absolute raw correlations or user-specified values. 

For the principal axes method, you determine the number of factors to be extracted from 
the original matrix. (The number of factors to be extracted can be specified by you or you 
can specify the minimum eigenvalue bound). The maximum likelihood method provides a 
statistical basis for judging the adequacy of a model with a specified number of factors. 

The unrotated factors do not generally represent useful scientific factor constructs and 
hence it is usually necessary to rotate. Orthogonal quartimax or varimax rotations and/or 
oblique rotations may be performed on a factor matrix. After rotation, a table of the 
variance extracted by each factor is printed along with the new factor loading matrix. 

The program can graphically represent the original variables in terms of their factor loadings 
in a space that corresponds to the common factors. Thus, using pairs of axes, one obtains p 
points (where p is the number of variables) whose coordinates are factor loadings with 
respect to pairs of the common factors (before and after rotations). 

If the raw data has been input, factor scores for each factor may be computed and stored 
after each rotation. These factor scores can be plotted in pairs. 

Special Considerations 

Factor Extraction Methods 

For more information on the comparisons between the principal axes and maximum likeli- 
hood methods of factor extraction, see references 1,2 and 3. 

Principal Axes Method 

a. The maximum number of factors must be less than p, the number of variables in 
the analysis and must also be less than 15. 

b. In choosing the minimum eigenvalue size for inclusion of a factor some analysts 
use a value around 1.00. Keep in mind that if the variables were uncorrelated, 
each eigenvalue would be 1.00 with the sum (total variance) equal to p. 

c. The maximum number of iterations is set by default at 25. Some analysts believe 
that this number should be very small, say one or two. 

d. The total variance is by convention, p, the number of variables in the analysis. 






Maximum Likelihood Method (MLM) 

a. If p is the number of variables in the analysis, then the maximum number of 
factors (m) which can be extracted by the MLM cannot exceed the largest integer 
satisfying 



$$m < \tfrac{1}{2}\left((2p+1) - (8p+1)^{0.5}\right).$$



This quantity is calculated in the program and displayed as the maximum number 
of factors that you may extract. See reference 11 for a more detailed discussion. 

b. This method may be very time consuming. If you have a large number of vari- 
ables, we suggest that you consider using the principal axes method instead. 

c. This method may not converge at all. If this seems to be the case (i.e., the number 
of iterations and/or "tries" within an iteration is excessive), the program will allow 
you to stop and change to the principal axes method. 

d. The chi-square statistic and hence the accuracy of the probability value depend 
on the number of observations being quite large. If your sample size is small you 
should interpret the chi-square values as only an approximation to the adequacy 
of the model. Some authors suggest that you should specify a fairly large value for 
alpha in the goodness-of-fit test, especially when the sample size is small. 
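The bound in item (a) can be evaluated without floating-point arithmetic. For m less than p, the strict inequality m < ((2p+1) - (8p+1)^.5)/2 is algebraically equivalent to (p - m)^2 > p + m, so an integer search suffices. The Python sketch below assumes the inequality is strict, as it is read from the text above; the function name is illustrative and this is not the program's own routine.

```python
import math


def max_ml_factors(p):
    """Largest integer m (m < p) with (p - m)**2 > p + m."""
    m = 0
    while m + 1 < p and (p - (m + 1)) ** 2 > p + (m + 1):
        m += 1
    return m


def closed_form_bound(p):
    """The bound from item (a): ((2p + 1) - sqrt(8p + 1)) / 2."""
    return 0.5 * ((2 * p + 1) - math.sqrt(8 * p + 1))


for p in (5, 10, 20):
    print(p, max_ml_factors(p), round(closed_form_bound(p), 3))
```

For the five variables of Sample Problem #1 this gives at most two factors under the maximum likelihood method.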

Rotations 

Oblique rotation schemes available in this set of programs consist of solutions generated 
under the oblimin criterion. A whole class of rotations may be performed, as the oblimin 
solution is indexed by a constant ranging between 0 and 1. The most important and generally
applicable special case is bi-quartimin, which corresponds to an index value .5. Other
important special cases are quartimin (index = 0) and covarimin (index = 1.0). A thor- 
ough discussion of these methods is given in (3). 

Kaiser normalization will be used automatically in the program. 

Output at each rotation stage consists of both primary factors and reference factors. These 
two types of factors are related by transformation though they are subject to different inter- 
pretations. In fact, columns of the primary factor matrix are simply multiples of the corres- 
ponding columns in the reference factor matrix. It should be noted, that since they are the 
elements of the primary factors (as in the orthogonal case), these elements may be larger than
1.00. It is the primary factors which are used in factor score calculations. The distinction 
between the aforementioned concepts is well explained in (2) and (3). 

Select New Variables 

After completing an analysis on certain variables and subfiles, you may wish to select other 
variables and/or subfiles for further analyses. You may specify the variables and subfiles 
you wish to investigate by choosing this option. 

When you decide to select new variables, the program will go back to the beginning of the 
PC and FA procedures. 

When entering the variable numbers, you may enter the numbers separated by commas, or 
by dashes when denoting consecutive variables, e.g., 1, 3, 6, 8-11 for variables 1, 3, 6, 8, 9,
10, 11.
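The entry convention for variable numbers can be illustrated with a short parser. This Python sketch is only a model of the convention described above (commas separate entries, a dash denotes a run of consecutive variables); the function name is hypothetical and the program's own input handling may differ.

```python
def parse_variable_list(spec):
    """Expand an entry such as '1, 3, 6, 8-11' into a list of variable numbers."""
    numbers = []
    for token in spec.split(","):
        token = token.strip()
        if not token:
            continue  # ignore empty entries such as a trailing comma
        if "-" in token:
            first, last = (int(part) for part in token.split("-"))
            numbers.extend(range(first, last + 1))
        else:
            numbers.append(int(token))
    return numbers


print(parse_variable_list("1, 3, 6, 8-11"))
```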






Discussion 

The purpose of this section is to reacquaint you with some of the fundamentals of principal 
components and factor analysis. Of course, it will not be possible to cover all of the material 
that would be necessary to understand all aspects of principal components and factor 
analysis in this section. Several of the references do have very good discussions on the 
basics of factor analysis and how it can be used. In particular, Sections 1.1, 1.2, and 1.3 of 
reference #11 have a very good discussion of the basics of Factor Analysis. In addition, 
reference #9 has some good material in Chapters 1, 3, 4 and 5. The other references also 
have some useful material. 

The basic idea of multivariate statistical methods which fall into the category labeled Factor 
Analysis is to examine a matrix expressing the dependence structure of the response variables
and to determine certain factors which have generated the dependence in these
responses. We measure p variables on n individuals. These p variables frequently are 
interrelated, that is, they are not independent of one another. The objective of factor 
analysis and principal components is to find certain hidden, or latent, factors which are 
fewer in number than the original p variables. Ideally, the observable variables may be 
represented as functions of the latent factors in such a way that the original dependence 
structure among the responses will be generated by the new system, to some degree of 
accuracy. Hopefully, the number of latent variables or factors will be considerably less than 
p, the original number of variables. In simplest terms, the responses may be thought of as 
linear combinations of the latent factors, and the goal of factor analysis is to estimate the 
coefficients of these linear combinations. 

If we are fortunate, the coefficients of the latent factors, sometimes called factor scores, will 
have some meaningful interpretation in terms of the original p variables. We would hope 
that the number of factors, or latent variables, would be considerably less than p. Ideally, 
two or three primary latent variables can be used in interpreting the results of the experiment.
They are essentially new variables - new response variables that we can use in
evaluating the results of the experiment. 

This program performs a principal component analysis and factor analysis on a correlation
matrix. Given the response variables $X_1, X_2, \ldots, X_p$, the technique of principal components
tries to find the coefficients, say, $A_{11}, A_{21}, \ldots, A_{p1}$ such that the linear combination

$$Y_1 = A_{11}X_1 + A_{21}X_2 + \cdots + A_{p1}X_p$$

"explains" the greatest proportion of the total response variance. Having found the desired
set of values, we then seek new coefficients, say, $A_{12}, A_{22}, \ldots, A_{p2}$ such that the linear
combination

$$Y_2 = A_{12}X_1 + A_{22}X_2 + \cdots + A_{p2}X_p$$

is uncorrelated with $Y_1$ and so that $Y_2$ explains the largest portion of the response variance
remaining after $Y_1$ has been removed. In principal component analysis, we proceed in this
manner until we have obtained $Y_1, \ldots, Y_p$. Since the Y's are chosen to be uncorrelated, their
total response variance will be the same as that of the original $X_1, \ldots, X_p$. These linear combinations
of the X's are called principal components, $Y_1$ being the first principal component, $Y_2$ being
the second principal component, etc. In fact, the coefficients $A_{1j}, A_{2j}, \ldots, A_{pj}$ of the jth
principal component are the elements of the eigenvector of the sample correlation matrix R
corresponding to the jth largest eigenvalue $l_j$. The importance of the jth component is
measured by $l_j/p$. Then, if a large proportion, say 80%, of the total response variance for the
X's is accounted for by a few of the Y's, we will have obtained a smaller description of the 
initial dependence structure. This is the main object of principal component and factor 
analyses - reduction of dimensionality. The program computes the principal components, 
eigenvalues, proportion of the total variance, and cumulative proportion of the total variance 
accounted for by each component. 

For a study of the dependence structure, factor analysis is another technique for explaining 
the covariance of the responses. Principal components is simply a transformation of the 
responses. Factor analysis proposes a model for the responses which may be written as 

$$X_1 = \lambda_{11}Y_1 + \lambda_{12}Y_2 + \cdots + \lambda_{1m}Y_m + e_1$$

$$\vdots$$

$$X_p = \lambda_{p1}Y_1 + \lambda_{p2}Y_2 + \cdots + \lambda_{pm}Y_m + e_p$$

where $Y_j$ is called the jth common factor variable, $\lambda_{ij}$ is a coefficient reflecting the importance
of the jth factor for the ith response variable, and $e_i$ is called a specific factor variable. Under
this model, each response variable $X_i$ is expressed as a linear combination of a few common
factor variables $Y_1, \ldots, Y_m$. Let $F = (\lambda_{ij})$; then F is the so-called factor loading matrix,
the quantity

$$h_i^2 = \sum_{j=1}^{m} \lambda_{ij}^2$$

is called the communality of the ith variable, and the variance of $e_i$ is called the unique
variance of the ith variable. If we replace the diagonal elements of the sample correlation
matrix R with communalities and denote it by $R^*$, then

$$R^* = FF'$$

This equation has been called "the fundamental factor theorem". 
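A small numerical illustration may help here. The Python sketch below uses a hypothetical 3 x 2 factor loading matrix (the values are invented for illustration only) to compute the communalities as row sums of squared loadings and to form the product FF', whose diagonal reproduces those communalities.

```python
# Hypothetical factor loading matrix F: 3 variables (rows) by 2 factors (columns).
F = [
    [0.8, 0.3],
    [0.7, 0.4],
    [0.2, 0.9],
]


def communalities(loadings):
    """Communality of each variable: the sum of its squared loadings."""
    return [sum(value * value for value in row) for row in loadings]


def reproduced_matrix(loadings):
    """Form F F': element (i, k) is the inner product of rows i and k of F."""
    return [
        [sum(a * b for a, b in zip(row_i, row_k)) for row_k in loadings]
        for row_i in loadings
    ]


h2 = communalities(F)
r_star = reproduced_matrix(F)

print([round(value, 2) for value in h2])
for row in r_star:
    print([round(value, 2) for value in row])
```

The off-diagonal elements of FF' are the correlations implied by the common factors alone, and the diagonal elements equal the communalities rather than 1.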

You can choose either the principal axes method or the maximum likelihood method to 
extract the initial factors. A brief comparison between these two methods can be found in 
reference 2. Factors which are not rotated do not generally represent useful scientific factor 
constructs and hence it is usually necessary to rotate. The desire for correlated (oblique) 
factors or uncorrelated (orthogonal) factors leads to either an oblique rotation or orthogonal 
rotation of the initial factor solution. 

The program computes the case scores for either principal components or factors if the raw 
data has been input. For detailed information on the calculation and the interpretation of 
case scores, see Chapter 16 of reference 3. 

The program also provides a graphical presentation of the initial and rotated factors. 






Methods and Formulae 

Correlation Matrix: 

Raw Data Input: 

Let the input consist of N cases with p variates per case and let $X = (x_{ij})$, $i = 1, \ldots, N$; $j = 1,
\ldots, p$, denote the data input matrix. The covariance matrix $S = (s_{ij})$ is computed from

$$(N - 1)S = \sum_{i=1}^{N} x_i x_i' - N\bar{x}\bar{x}'$$

where $x_i' = (x_{i1}, \ldots, x_{ip})$ and

$$\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i$$

The correlation matrix, which is used for the principal components analysis and/or factor
analysis, is then given by

$$R = (r_{ij}) \quad \text{where} \quad r_{ij} = s_{ij}/(s_{ii}s_{jj})^{1/2}$$

Covariance or Correlation Matrix Input: 

Let the input consist of a matrix for p variates. For a covariance matrix, the p(p + 1)/2
distinct elements of the matrix S are entered and the correlation matrix $R = (r_{ij})$ is computed
by

$$r_{ij} = s_{ij}/(s_{ii}s_{jj})^{1/2}$$

In the third method of input, the distinct elements of R are entered directly. 
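The raw-data formulas above can be followed step by step in a few lines of code. The Python sketch below is an independent illustration, not the program's own routine; it applies the covariance and correlation formulas to the 20 cases and 5 variables of Sample Problem #1 from the Examples section. The function names are illustrative.

```python
import math

# The 20 cases of Sample Problem #1 (columns X1 through X5).
DATA = [
    [7, 9, 6, 5, 2], [5, 5, 4, 6, 2], [1, 2, 3, 4, 5], [1, 6, 5, 2, 3],
    [4, 6, 5, 2, 5], [7, 9, 6, 6, 5], [6, 5, 3, 2, 1], [9, 8, 6, 5, 3],
    [4, 6, 5, 2, 1], [6, 5, 4, 3, 5], [3, 2, 1, 6, 5], [5, 6, 5, 2, 3],
    [6, 5, 4, 5, 4], [1, 6, 5, 8, 9], [9, 8, 9, 6, 5], [7, 3, 1, 9, 5],
    [1, 5, 9, 3, 7], [3, 5, 0, 7, 9], [6, 2, 4, 8, 6], [4, 6, 4, 2, 8],
]


def covariance_matrix(rows):
    """(N - 1) S = sum of x_i x_i' minus N times xbar xbar'."""
    n = len(rows)
    p = len(rows[0])
    mean = [sum(row[j] for row in rows) / n for j in range(p)]
    s = [[0.0] * p for _ in range(p)]
    for j in range(p):
        for k in range(p):
            cross = sum(row[j] * row[k] for row in rows)
            s[j][k] = (cross - n * mean[j] * mean[k]) / (n - 1)
    return s


def correlation_matrix(s):
    """r_jk = s_jk / sqrt(s_jj * s_kk)."""
    p = len(s)
    return [[s[j][k] / math.sqrt(s[j][j] * s[k][k]) for k in range(p)]
            for j in range(p)]


R = correlation_matrix(covariance_matrix(DATA))
print("r(X1, X2) = %.7f" % R[0][1])
print("r(X2, X3) = %.7f" % R[1][2])
```

The two printed correlations can be compared with the corresponding entries of the CORRELATION MATRIX output in Sample Problem #1.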



Principal Components Analysis: 

The eigenvalues and corresponding eigenvectors of R are obtained by a variant of the QR
method (see page 219 of reference 5). Let the eigenvalues of R be denoted by
$\theta_1 \ge \theta_2 \ge \cdots \ge \theta_p$ and let $W = (w_{ij})$ be a p x p matrix of column eigenvectors (i.e., the jth column
of W consists of the elements of the eigenvector corresponding to the jth eigenvalue $\theta_j$).
Then W is a matrix of principal components and $\theta_i$ is the variance accounted for by the ith
component.

Case Scores: 

For each data case a vector of component scores f is computed by 

f = Wz 

where W is the matrix of principal components and z is the vector of standardized values of 
the variables. 






Factor Extractions 

Principal Axes Method: 

The main diagonal elements of R are either unaltered or adjusted by one of the following 
options: 

(i) squared multiple correlations on the main diagonal, where $r_{ii}$ is given by $r_{ii} = 1 - 1/r^{ii}$ and
$r^{ii}$ is the ith diagonal element of $R^{-1}$. The Cholesky square root method is used to obtain $R^{-1}$.
(ii) maximum absolute row value among $r_{ij}$, $j = 1, \ldots, p$
(iii) User specified values.

The p eigenvalues and corresponding eigenvectors of R are obtained by the QR method.
Let the eigenvalues of R be denoted by $\theta_1 \ge \theta_2 \ge \cdots \ge \theta_p$ and the matrix of column eigenvectors
be denoted by $W = (w_1, w_2, \ldots, w_p)$. The number of factors obtained is
$M = \min\{m, \#\text{ of } \theta_i \text{ such that } \theta_i \ge c\}$, where m is the maximum number of factors (user specified) and c is
the minimum eigenvalue for factor inclusion (also user specified). Then the jth column of
the factor loading matrix $F = (f_{ij})$ is $\sqrt{\theta_j}\,w_j$. New estimates of communalities are then given by

$$h_i^2 = \sum_{j=1}^{M} f_{ij}^2$$

If more than one iteration is requested, the diagonal of R is adjusted by the new estimates of 
communalities and the extraction procedure is repeated. Iterations are continued until the 
maximum number is reached or until the maximum change in the communality estimates is 
less than 0.0001. If for a particular iteration any of the estimates of communalities exceed 
one, the process will terminate, a message will be printed, and the factor matrix for the 
previous iteration will be printed. Note that the number of factors may change during the 
iterative process. 

Maximum Likelihood Method: 

The Enslein procedure (see reference 13) is used to obtain the maximum likelihood solutions
of the factor loading matrix F and the unique variance $\phi_{ii}$ of the ith variable. If k is the
number of factors and

$$f_k(\Phi) = -\log \prod_{i=k+1}^{p} \theta_i + \sum_{i=k+1}^{p} \theta_i - (p - k)$$

where $\theta_1 \ge \theta_2 \ge \cdots \ge \theta_p$ are the eigenvalues of $\Phi^{-1/2} R \Phi^{-1/2}$ and where
$\Phi = \mathrm{diag}(\phi_{11}, \phi_{22}, \ldots, \phi_{pp})$, the ML solution of $\phi_{ii}$ is the value $\hat{\phi}_{ii}$ which minimizes the value of $f_k(\Phi)$. The factor
loading matrix F is then computed by

$$F = \Phi^{1/2} W (H - I)^{1/2}$$

where $W = (w_1, w_2, \ldots, w_k)$, $H = \mathrm{diag}(\theta_1, \theta_2, \ldots, \theta_k)$ and where $w_1, w_2, \ldots, w_k$ are the eigenvectors
corresponding to the k largest roots. The initial estimate of $\phi_{ii}$ is

$$\phi_{ii} = (1 - k/2p)/r^{ii}$$

where $r^{ii}$ is the ith diagonal element of $R^{-1}$. The minimization procedure of the method of
Fletcher and Powell is applied to the function $f_k(\Phi)$. For a detailed explanation of the computation
procedure, see reference 13.






The program performs a sequence of maximum likelihood factor analyses for $k = k_1, k_1 + 1,
k_1 + 2, \ldots, k_2$, where $k_1$ is the minimum number of factors. The sequence terminates when
the maximum number of factors $k_2$ is reached or when a proper solution has been found and
is acceptable from the point of view of goodness-of-fit at a user specified level of significance.
If for a particular k the solution is improper (Heywood, see reference 3), having q < k
of the unique variances equal to "zero", the corresponding q variables are eliminated and
the partial correlation matrix $R_{22 \cdot 1}$ is computed as follows:

(i) Find $R^{-1}$ by square root method

(ii) Delete the q columns and rows from $R^{-1}$ and evaluate the inverse of the resulting
matrix denoted by $R_1$

(iii) $R_{22 \cdot 1} = D_1^{-1/2} R_1 D_1^{-1/2}$ where $D_1$ is a diagonal matrix with the diagonal elements of $R_1$

The matrix $R_{22 \cdot 1}$ of order (p-q) is analyzed as before with the number of factors k-q, and
the resulting solution is again examined for properness. The procedure repeats until a proper
solution has been found for some k>0. A goodness-of-fit test is performed on this solution by
computing

$$\chi^2 = \left[N - 1 - (2p + 5)/6 - 2k/3\right] \log \frac{|\Phi + FF'|}{|R|}$$

with degrees of freedom

$$\nu = [(p - k)^2 - p - k]/2$$





Note that R can be either the original correlation matrix or the partial correlation matrix, and 
p is the order of R. If the computed chi-square value is greater than the tabled value with a 
prescribed level of significance, the value of k is increased by one and the above procedure 
is repeated. If the solution is acceptable, then the process terminates. 

The final solution is combined with the principal components of the eliminated variables
(see equations (56), (57) of reference 4), if any, to give a complete solution for all the
original variables.

Factor Rotation: 

Orthogonal Rotation: 

(i) Quartimax method: The object of the quartimax method is to determine the orthogonal
transformation matrix T which will carry the original factor matrix F into a new factor matrix
$B = (b_{ij})$ for which

$$Q = \sum_{i=1}^{p} \sum_{j=1}^{k} b_{ij}^4$$

is a maximum. See page 298 of reference 3 for a detailed discussion. 






(ii) Varimax method: The orthogonal varimax criterion requires that the final factor matrix
$B = (b_{ij})$ maximize the function

$$V = p \sum_{i=1}^{p} \sum_{j=1}^{k} (b_{ij}/h_i)^4 - \sum_{j=1}^{k} \left( \sum_{i=1}^{p} b_{ij}^2/h_i^2 \right)^2$$

where

$$h_i^2 = \sum_{j=1}^{k} f_{ij}^2,$$



the communality of the ith variable of the initial factor matrix. See page 304 of reference 3 
for a detailed discussion. 
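The two orthogonal criteria can be evaluated directly for a given rotated matrix. The Python sketch below is only an illustration of the criterion values, not the rotation algorithm used by the program. It computes the communalities from the rows of B itself, which is valid because an orthogonal rotation leaves the row sums of squares of the initial factor matrix unchanged; the example matrices are hypothetical.

```python
import math


def quartimax_criterion(b):
    """Q: the sum of fourth powers of all loadings."""
    return sum(value ** 4 for row in b for value in row)


def varimax_criterion(b):
    """Normalized varimax criterion computed from a rotated loading matrix B."""
    p = len(b)
    k = len(b[0])
    # Communalities; unchanged by orthogonal rotation, so B's rows can be used.
    h2 = [sum(value * value for value in row) for row in b]
    first = p * sum((b[i][j] ** 2 / h2[i]) ** 2 for i in range(p) for j in range(k))
    second = sum(sum(b[i][j] ** 2 / h2[i] for i in range(p)) ** 2 for j in range(k))
    return first - second


# Hypothetical loadings: a "simple structure" and the same factors turned 45 degrees.
simple = [[1.0, 0.0], [0.0, 1.0]]
c = math.sqrt(0.5)
turned = [[c, c], [-c, c]]

print(quartimax_criterion(simple), quartimax_criterion(turned))
print(varimax_criterion(simple), varimax_criterion(turned))
```

Both criteria are larger for the simple structure than for the turned axes, which is the behavior a maximizing rotation exploits.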

Oblique Rotation: 

Oblique oblimin rotation may be performed to minimize the value

$$B = \sum_{j<q=1}^{k} \left[ p \sum_{i=1}^{p} (v_{ij}^2/h_i^2)(v_{iq}^2/h_i^2) - \lambda \sum_{i=1}^{p} v_{ij}^2/h_i^2 \sum_{i=1}^{p} v_{iq}^2/h_i^2 \right]$$

where

$$h_i^2 = \sum_{j=1}^{k} f_{ij}^2$$

is the communality of the ith variable of the initial factor matrix. $\lambda$ is the rotation constant in the
range 0 to 1. Values of $\lambda$ which yield standard oblique rotations are:

(i) Quartimin: $\lambda = 0$; most oblique
(ii) Biquartimin: $\lambda = 0.5$; less oblique
(iii) Covarimin: $\lambda = 1$; least oblique

Both reference and primary factors are obtained. See page 324 of reference 3 for a detailed 
discussion. 

Factor Scores: 

Computation of factor scores begins with the calculation of a factor score coefficient matrix
C where C is P x M, P is the number of variables and M the number of factors. If we let F be
the given factor matrix (either orthogonal or oblique factors), and R the correlation matrix
for the original data, C is calculated in one of two ways.

Orthogonal Factors:
$$C = R^{-1}F$$

Oblique Factors:
$$C = R^{-1}FQ$$

where F is an oblique primary factor matrix and Q is the correlation matrix of the primary
factors.

Once C has been computed, the factor scores, f, for each data case are computed by
$$f = C'z$$

where z is the vector of standardized values of the variables. For detailed information on the 
calculation of the primary factor matrix and the Q matrix above, interpretation of the 
primary factors, reference structure matrix, and factor scores, see reference 3. 

References 

1. Enslein, K., Ralston, A., and Wilf, H. S. (eds.) (1977) Statistical Methods for Digital 
Computers, John Wiley & Sons, Inc., New York. 

2. Gnanadesikan, R. (1977) Methods for Statistical Data Analysis of Multivariate 
Observations, John Wiley & Sons, New York. 

3. Harman, H. H. (1967) Modern Factor Analysis, 2nd ed., University of Chicago Press, 
Chicago. 

4. Joreskog, K. G. (1967), "Some Contributions to Maximum Likelihood Factor Analy- 
sis". Psychometrika, Vol. 32, p 443-482. 

5. Martin, K. (1978) 9845B Numerical Analysis Library, Vol. 1., Hewlett-Packard Part 
No. 09845-10351. 

6. Morrison, D. F. (1976) Multivariate Statistical Methods, 2nd ed., McGraw-Hill Book 
Company, New York. 

7. Vecchia, D. F. Unpublished Notes for 9830A Factor Analysis. 

8. Cooley, William W. and Lohnes, Paul R. (1971) Multivariate Data Analysis, John 
Wiley and Sons, Inc., New York. 

9. Guertin, Wilson H. and Bailey, John P., Jr. (1970) Introduction to Modern Factor
Analysis, Edwards Brothers, Inc., Ann Arbor.

10. Horst, Paul, (1965) Factor Analysis of Data Matrices, Holt, Rinehart and Winston, 
Inc., New York. 

11. Morrison, Donald A. (1965) Multivariate Statistical Methods, Holt, Rinehart and Win- 
ston, Inc., New York. 

12. Comrey, Andrew L. (1973) A First Course in Factor Analysis, Academic Press, New 
York. 

13. Enslein, Kurt (Ralston, A. & Wilf, H. eds.) Statistical Methods for Digital Computers, 
Volume 4, John Wiley and Sons, Inc., New York. 






Examples 

Sample Problem #1 

This example uses a simple artificial data set which is given below. The raw data was entered 
in keyboard mode. The principal component analysis was performed. Notice the "% of total 
variance" row corresponds to random data. Component plots of component 1 vs. compo- 
nent 2 and component 1 vs. component 3 were generated. Component scores were output 
and a plot of component scores was made, again for the same pairs of components. 

Factor analysis by the principal axes method was done. Communalities were found by 
iteration. The iterations are not output on the printer but do appear on the CRT. The 
number of factors chosen to explain the variation was 3 in this example. Factor rotation 
plots were made for factor 1 vs. factor 2 and factor 1 vs. factor 3. An orthogonal varimax 
rotation was performed. The contribution of factors, % of total variance, and factor plots 
were output. Factor scores were also output. 



Case No.    X1    X2    X3    X4    X5

    1        7     9     6     5     2
    2        5     5     4     6     2
    3        1     2     3     4     5
    4        1     6     5     2     3
    5        4     6     5     2     5
    6        7     9     6     6     5
    7        6     5     3     2     1
    8        9     8     6     5     3
    9        4     6     5     2     1
   10        6     5     4     3     5
   11        3     2     1     6     5
   12        5     6     5     2     3
   13        6     5     4     5     4
   14        1     6     5     8     9
   15        9     8     9     6     5
   16        7     3     1     9     5
   17        1     5     9     3     7
   18        3     5     0     7     9
   19        6     2     4     8     6
   20        4     6     4     2     8






* DATA MANIPULATION *
********************************************************************************

Enter DATA TYPE (Press CONTINUE for RAW DATA) :

1

Mode number = ? 



Is data stored on program's scratch file (DATA)? 

NO 

Data file name = ? 

PFACSMPB1:INTERNAL

Was data stored by the BS&DM system ? 

YES 

Is data medium placed in device INTERNAL 

? 

YES 

Is program medium placed in correct device ? 

YES 



Raw data 

On mass storage 



SAMPLE PROBLEM #1

Data file name: PFACSMPB1:INTERNAL

Data type is: Raw data

Number of observations: 20
Number of variables: 5

Variable names:
1. X1
2. X2
3. X3
4. X4
5. X5

Subfiles: NONE



SELECT ANY KEY 

Option number = ?

1

Enter method for listing data: 

3 



Press special function key labeled-LIST 
List all data 



SAMPLE PROBLEM #1

Data type is: Raw data

        Variable # 1   Variable # 2   Variable # 3   Variable # 4   Variable # 5
        (X1)           (X2)           (X3)           (X4)           (X5)

OBS#
   1       7.00000        9.00000        6.00000        5.00000        2.00000
   2       5.00000        5.00000        4.00000        6.00000        2.00000
   3       1.00000        2.00000        3.00000        4.00000        5.00000
   4       1.00000        6.00000        5.00000        2.00000        3.00000
   5       4.00000        6.00000        5.00000        2.00000        5.00000
   6       7.00000        9.00000        6.00000        6.00000        5.00000
   7       6.00000        5.00000        3.00000        2.00000        1.00000
   8       9.00000        8.00000        6.00000        5.00000        3.00000
   9       4.00000        6.00000        5.00000        2.00000        1.00000
  10       6.00000        5.00000        4.00000        3.00000        5.00000
  11       3.00000        2.00000        1.00000        6.00000        5.00000
  12       5.00000        6.00000        5.00000        2.00000        3.00000
  13       6.00000        5.00000        4.00000        5.00000        4.00000
  14       1.00000        6.00000        5.00000        8.00000        9.00000
  15       9.00000        8.00000        9.00000        6.00000        5.00000
  16       7.00000        3.00000        1.00000        9.00000        5.00000
  17       1.00000        5.00000        9.00000        3.00000        7.00000
  18       3.00000        5.00000        0.00000        7.00000        9.00000
  19       6.00000        2.00000        4.00000        8.00000        6.00000
  20       4.00000        6.00000        4.00000        2.00000        8.00000



Option number = 



SELECT ANY KEY 



Exit list procedure 

Select special function key labeled-ADV STAT 

Remove BSDM media 

Insert Principal Components & Factor Analysis 

media 



Use all the variables in the analysis (YES/NO) ? 

YES 

Is the above information correct ? 

YES 

PRINCIPAL COMPONENTS AND FACTOR ANALYSIS 
SAMPLE PROBLEM #1

where variables to be used are:
  1. X1
  2. X2
  3. X3
  4. X4
  5. X5



CORRELATION MATRIX

            X1          X2          X3          X4

X2       .4204206
X3       .1753833    .6175669
X4       .2259743   -.2043786   -.2764709
X5      -.3753400   -.2005056   -.1251464    .3879237
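
As a cross-check on the correlation matrix, the X1/X2 entry can be recomputed from the raw data. The following is a minimal Python sketch, not part of the HP program; the function name `pearson_r` is a hypothetical helper, and the two columns are transcribed from the 20-observation listing of Sample Problem #1.

```python
from math import sqrt

# X1 and X2 columns transcribed from the raw-data listing (observations 1-20).
x1 = [7, 5, 1, 1, 4, 7, 6, 9, 4, 6, 3, 5, 6, 1, 9, 7, 1, 3, 6, 4]
x2 = [9, 5, 2, 6, 6, 9, 5, 8, 6, 5, 2, 6, 5, 6, 8, 3, 5, 5, 2, 6]


def pearson_r(a, b):
    """Product-moment correlation of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    s_ab = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    s_aa = sum((u - mean_a) ** 2 for u in a)
    s_bb = sum((v - mean_b) ** 2 for v in b)
    return s_ab / sqrt(s_aa * s_bb)


r12 = pearson_r(x1, x2)
print("r(X1, X2) = %.7f" % r12)
```

The result should agree with the X2/X1 entry of the matrix (.4204206) to the printed precision.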



Do you want to store the correlation Matrix ? 
NO 

Enter number of desired funtion: 

2 

Press "CONTINUE" when ready.



We could store the correlation matrix for later 
use, if we wished. 

Select principal component analysis 



********************************
* PRINCIPAL COMPONENT ANALYSIS *
********************************

Enter the option for components output (1,2,3,or 4)



Output all principal components 



COMPONENT MATRIX

                                  COMPONENT
Variable Name        1          2          3          4          5
1. X1             .383267    .637731   -.297991   -.092255    .590843
2. X2             .574271    .138684    .330269   -.584914   -.446965
3. X3             .513971   -.090709    .507831    .673708    .125823
4. X4            -.305216    .741708    .133991    .322234   -.484690
5. X5            -.407427    .125325    .725451   -.302733    .447628

Eigenvalue       2.084182   1.255467   1.046971    .363811    .249569

% of total
variance         41.68365   25.10934   20.93941    7.27622    4.99139

Cumulative %
variance         41.68365   66.79298   87.73240   95.00861  100.00000



Do you wish to plot the principal components ? 

YES 

Plot on CRT ? 

NO 

Plotter identifier string (press CONT if 'HPGL')? 

Enter select code, HPIB bus (defaults are 7,5)? 

A beep will signify the end of the plot. 

Which pen number should be used ?
1 



Note: The first 3 principal components have
eigenvalues greater than 1.0.
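
The percentage and cumulative-percentage rows of the component matrix follow directly from the eigenvalues. The short Python sketch below (an illustration only, not the library's code; all variable names are hypothetical) reproduces them and counts the components whose eigenvalue exceeds 1.0, which is the rule of thumb the note applies.

```python
# Eigenvalues as printed in the COMPONENT MATRIX output for Sample Problem #1.
eigenvalues = [2.084182, 1.255467, 1.046971, 0.363811, 0.249569]

# For a correlation matrix the eigenvalues sum to the number of variables (5).
total = sum(eigenvalues)

percent = [100.0 * ev / total for ev in eigenvalues]

cumulative = []
running = 0.0
for p in percent:
    running += p
    cumulative.append(running)

# Components with eigenvalue greater than 1.0 (numbered from 1).
retained = [i + 1 for i, ev in enumerate(eigenvalues) if ev > 1.0]

for i, (ev, p, c) in enumerate(zip(eigenvalues, percent, cumulative), start=1):
    print("component %d: eigenvalue %.6f  %% of variance %8.5f  cumulative %9.5f" % (i, ev, p, c))
print("components with eigenvalue > 1.0:", retained)
```

Small differences in the last printed digit are expected, because the eigenvalues above are already rounded to six places.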



Enter the pair of component numbers which will be used in this plot ?

[Plot: SAMPLE PROBLEM #1, Component Plot. Horizontal axis: COMPONENT 1. Axes scaled from -1.0 to 1.0.]



Plot for another two factors ?

YES

Which pen number should be used ?

1

Enter the pair of component numbers which will be used in this plot ?

1,3

[Plot: SAMPLE PROBLEM #1, Component Plot. Horizontal axis: COMPONENT 1. Axes scaled from -1.0 to 1.0.]



Plot for another two factors ?

YES

Which pen number should be used ?

1

Enter the pair of component numbers which will be used in this plot ?

2,3

[Plot: SAMPLE PROBLEM #1, Component Plot. Horizontal axis: COMPONENT 2. Axes scaled from -1.0 to 1.0.]



Plot for another two factors ? 

NO 

Enter the option number (1,2,or 3)=

1



Select component scores 



COMPONENT SCORES

                                 COMPONENT
Observation #       1          2          3          4          5
      1          2.07540     .71235    -.15044    -.23088    -.72271
      2           .09139     .34176    -.93465     .51003    -.65276
      3         -1.81738   -1.30509    -.35682     .53949    -.01545
      4           .33929   -1.86345    -.00753    -.01155    -.72163






b 

6 

7 

8 

9 

10 

ii 

12 

13 

14 

15 

16 

17 

18 

19 

20 





.44941 


-1 


.00182 




.25200 


- 


.37656 


. 35664 


1 


. 42788 


1 


.19038 




.82627 


- 


.47569 


-.36426 




.71513 


- 


.69652 


-1 


.81193 


- 


.24860 


.17100 


1 


.93132 


1 


.20276 


- 


.23760 


- 


.15167 


.14705 


1 


.13760 


-1 


.21350 


- 


.97337 




.13479 


-.39946 




.12078 


_ 


.20532 


- 


.30636 


_ 


.32603 


.77361 


2 


.22775 


- 


.08321 


- 


.92196 




. 153S7 


-.07616 




.94491 


- 


. 85573 


- 


.47841 


- 


. 1S733 


.21200 




.03008 




.38027 


- 


.49735 




.07921 


.16732 


1 


.48126 




. 36963 


2 


.17657 




.05362 


-.83926 


2 


.13151 


1 


.50862 


1 


.10035 




.61701 


.48187 


1 


.74141 


1 


. 94865 


-1 


.06175 




.14396 


.01767 




.14576 


-i 


.55787 


2 


.00757 


1 


.07660 


.26029 


2 


.44801 




.68660 




.61275 


-1 


.35411 


-.22557 


1 


.53268 


1 


.24477 


- 


.18584 


1 


.07945 


.56124 


- 


.29196 


- 


.80330 




.94850 


-1 


.05530 


.86858 



Do you wish to plot the case scores ? 

YES 

Plot on CRT ? 

NO 

Plotter identifier string (press CONT if 'HPGL')?

Enter select code, HPIB bus (defaults are 7,5)?

A beep will signify the end of the plot. 

Which pen number should be used ? 

1 



Enter the pair of component numbers which will be used in this plot ?
1,2

[Plot: SAMPLE PROBLEM #1, Component Scores Plot. Horizontal axis: COMPONENT 1. Axes scaled from -3.0 to 3.0.]



Plot for another two factors ?

YES

Which pen number should be used ?

1

Enter the pair of component numbers which will be used in this plot ?
1,3

[Plot: SAMPLE PROBLEM #1, Component Scores Plot. Horizontal axis: COMPONENT 1. Axes scaled from -3.0 to 3.0.]



Plot for another two factors ?

YES

Which pen number should be used ?

1

Enter the pair of component numbers which will be used in this plot ?

2,3

[Plot: SAMPLE PROBLEM #1, Component Scores Plot. Horizontal axis: COMPONENT 2. Axes scaled from -3.0 to 3.0.]



Plot for another two factors ? 

NO 

Store the principal component case scores ?

NO

Enter number of desired funtion:

3

Max. # of factors to be extracted (<= 15)

3



Select factor analysis

We must specify how many factors we want
to use. From the principal component analysis
it appears that three might be correct.



********************************************
* FACTOR ANALYSIS BY PRINCIPAL AXES METHOD *
********************************************

A maximum of 3 factors will be extracted.
Enter Communality Estimate type (1,2,3,or 4) =

2    Squared multiple correlation used on the diagonal
     of the correlation matrix as the initial estimates.

COMMUNALITY ESTIMATION

Squared Multiple Correlation has been used to compute the communality estimates.
Initial Estimated Communalities of Variables :
   Variable       Communality
   1. X1            .47407
   2. X2            .50461
   3. X3            .40850
   4. X4            .42089
   5. X5            .39380

Starting values



Do you wish to specify a min. eigenvalue for factor inclusion ?

NO

Do you want to refine the communality estimates using iteration ?

YES

Enter the maximum # of iterations (default=25)

5 

Max. number of iterations for factor extraction = 5 



Communalities of Variables after 5 iterations
   Variable       Communality
   1. X1            .74634
   2. X2            .72370
   3. X3            .57824
   4. X4            .67900
   5. X5            .63413

Final estimates



UNROTATED FACTOR MATRIX

                            FACTOR
Variable Name        1          2          3
1. X1             .540204    .628415   -.244171
2. X2             .784661    .093539    .315046
3. X3             .644004   -.120257    .386055
4. X4            -.386153    .713522    .144134
5. X5            -.522787    .114566    .589658

Contribution
of factor         1.74468     .94036     .67638

% of total
Variance
Extracted        34.89350   18.80713   13.52766
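
The summary rows of the unrotated factor matrix can be checked by hand: a variable's communality is the sum of its squared loadings across the factors, and a factor's contribution is the sum of squared loadings down its column. The Python sketch below illustrates this with the printed loadings; it is a cross-check under those standard definitions, not the library's own routine, and the variable names are hypothetical.

```python
# Unrotated loadings as printed above: one row per variable, one column per factor.
loadings = {
    "X1": (0.540204, 0.628415, -0.244171),
    "X2": (0.784661, 0.093539, 0.315046),
    "X3": (0.644004, -0.120257, 0.386055),
    "X4": (-0.386153, 0.713522, 0.144134),
    "X5": (-0.522787, 0.114566, 0.589658),
}
n_factors = 3

# Communality of a variable: sum of squared loadings along its row.
communality = {name: sum(a * a for a in row) for name, row in loadings.items()}

# Contribution of a factor: sum of squared loadings down its column.
contribution = [sum(row[j] ** 2 for row in loadings.values()) for j in range(n_factors)]

# Percent of total variance: the total is the number of variables (5).
percent_extracted = [100.0 * c / len(loadings) for c in contribution]

for name, h2 in communality.items():
    print("%s communality %.5f" % (name, h2))
for j in range(n_factors):
    print("factor %d contribution %.5f (%.5f%% of total variance)"
          % (j + 1, contribution[j], percent_extracted[j]))
```

The communalities agree with the "Final estimates" printed after five iterations, and the column sums agree with the "Contribution of factor" row, to within rounding of the six-place loadings.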



Do you wish to perform any factor rotations ?

YES

*******************
* FACTOR ROTATION *
*******************

Do you wish to plot the original factors ?

YES

Plot on CRT ?

NO

Plotter identifier string (press CONT if 'HPGL')?

Enter the select code, HP bus (defaults are 7,5)?

Which PEN number should be used?

1

The pair of factor numbers used in this plot =? 

A beep will signify the end of the plot. 



[Plot: SAMPLE PROBLEM #1, UNROTATED Factor Plot. Horizontal axis: FACTOR 1. Axes scaled from -1.0 to 1.0.]



Plot for another two factors ?

YES

Which PEN number should be used?

1

The pair of factor numbers used in this plot =?

1,3

A beep will signify the end of the plot.

[Plot: SAMPLE PROBLEM #1, UNROTATED Factor Plot. Horizontal axis: FACTOR 1. Axes scaled from -1.0 to 1.0.]



Plot for another two factors ?

YES

Which PEN number should be used?

1

The pair of factor numbers used in this plot =?

2,3

A beep will signify the end of the plot.

[Plot: SAMPLE PROBLEM #1, UNROTATED Factor Plot. Horizontal axis: FACTOR 2. Axes scaled from -1.0 to 1.0.]



Plot for another two factors ? 

NO 

Enter the type of rotation (1 or 2) =

1

Enter the method of orthogonal rotation (1 or 2)

1 



Orthogonal rotation 
Choose varimax method 



FACTOR MATRIX

ORTHOGONAL VARIMAX ROTATION

                            FACTOR
Variable Name        1          2          3
1. X1             .218231    .041559   -.834861
2. X2             .796148   -.099285   -.282820
3. X3             .747073   -.139647   -.024948
4. X4            -.244315    .738402   -.272169
5. X5            -.026678    .656311    .450191

Contribution
of factor         1.30000    1.00707    1.05435

% of total
Variance
Extracted        25.99992   20.14135   21.08702



Note by the factor coefficients that factor 1 
seems to be a weighted average of X2 and 
X3 ; factor 2 is a weighted average of X4 and 
X5, while factor 3 seems to be essentially X1 
(and maybe X5). 
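
Two properties of the rotated solution can be checked numerically. An orthogonal rotation redistributes variance among the factors but leaves each variable's communality unchanged, and the interpretation in the note amounts to picking out the large loadings in each column. The Python sketch below does both; the 0.6 cutoff for a "large" loading is an assumption made for this illustration, not a value used by the program.

```python
# Loadings as printed in the unrotated and varimax-rotated factor matrices.
unrotated = {
    "X1": (0.540204, 0.628415, -0.244171),
    "X2": (0.784661, 0.093539, 0.315046),
    "X3": (0.644004, -0.120257, 0.386055),
    "X4": (-0.386153, 0.713522, 0.144134),
    "X5": (-0.522787, 0.114566, 0.589658),
}
rotated = {
    "X1": (0.218231, 0.041559, -0.834861),
    "X2": (0.796148, -0.099285, -0.282820),
    "X3": (0.747073, -0.139647, -0.024948),
    "X4": (-0.244315, 0.738402, -0.272169),
    "X5": (-0.026678, 0.656311, 0.450191),
}


def communality(row):
    """Sum of squared loadings for one variable."""
    return sum(a * a for a in row)


# Largest change in any variable's communality caused by the rotation.
max_diff = max(abs(communality(unrotated[v]) - communality(rotated[v])) for v in unrotated)

# Variables whose rotated loading exceeds the (assumed) cutoff, factor by factor.
CUTOFF = 0.6
dominant = [[v for v, row in rotated.items() if abs(row[j]) > CUTOFF] for j in range(3)]

print("largest communality change: %.6f" % max_diff)
for j, names in enumerate(dominant, start=1):
    print("factor %d is dominated by: %s" % (j, ", ".join(names)))
```

With this cutoff, factor 1 picks up X2 and X3, factor 2 picks up X4 and X5, and factor 3 picks up X1, matching the reading given in the note.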



Do you wish to plot the rotated factors ? 

YES 

Plot on CRT ? 

NO 

Plotter identifier string (press CONT if 'HPGL')?

Enter the select code, HP bus (defaults are 7,5)?

Which PEN number should be used?
1 



The pair of factor numbers used in this plot =?

1,2

A beep will signify the end of the plot.

[Plot: SAMPLE PROBLEM #1, VARIMAX ROTATED Factor Plot. Horizontal axis: FACTOR 1. Axes scaled from -1.0 to 1.0.]



Plot for another two factors ?

YES

Which PEN number should be used?

1

The pair of factor numbers used in this plot =?

1,3

A beep will signify the end of the plot.

[Plot: SAMPLE PROBLEM #1, VARIMAX ROTATED Factor Plot. Horizontal axis: FACTOR 1. Axes scaled from -1.0 to 1.0.]



Plot for another two factors ?

YES

Which PEN number should be used?

1

The pair of factor numbers used in this plot =?

A beep will signify the end of the plot.

[Plot: SAMPLE PROBLEM #1, VARIMAX ROTATED Factor Plot. Horizontal axis: FACTOR 2. Axes scaled from -1.0 to 1.0.]



Plot for another two factors ?

NO

Enter the option number (1,2,or 3)=

1



Print out factor scores 



FACTOR SCORE COEFFICIENTS

FACTOR MATRIX

                            FACTOR
Variable Name        1          2          3
1. X1            -.014160    .060858    .682742
2. X2             .576544    .074114    .043713
3. X3             .392323    .018432    .099292
4. X4            -.078039    .558876    .207201
5. X5             .162978    .479519    .277970



FACTOR SCORES

                            FACTOR
Observation #        1          2          3
      1           1.03930    -.25987    -.95596
      2           -.43066    -.22543    -.50906
      3          -1.13434    -.30973    1.11956
      4            .24275   -1.03780    1.06651
      5            .36361    -.56069     .49214
      6           1.21218     .58816    -.69300
      7           -.54262   -1.37420    -.58291
      8            .82101    -.04477   -1.35708
      9            .08832   -1.37066     .02261
     10           -.12901    -.31560    -.15906
     11          -1.55654     .20332     .31475
     12            .22038    -.94163    -.01234
     13           -.26501    -.03697    -.45482
     14            .45414    1.62051    1.23567
     15           1.44080     .62500   -1.08097
     16          -1.40375    1.05664   -1.05258
     17            .89618     .00956    1.64180
     18           -.65896    1.35218     .58881
     19          -1.05594     .98328    -.42485
     20            .39817     .03870     .80077



Do you wish to plot the factor scores ? 

YES 

Plot on CRT ? 

NO 

Plotter identifier string (press CONT if 'HPGL')? 

Enter the select code, HP bus (defaults are 7,5)?



Which PEN number should be used?
1 



The pair of factor numbers used in this plot =?

1,2

A beep will signify the end of the plot.

[Plot: SAMPLE PROBLEM #1, VARIMAX ROTATED Factor Scores Plot. Horizontal axis: FACTOR 1. Axes scaled from -3.0 to 3.0.]



Plot for another two factors ? 

NO 

Do you wish to store the factor scores ? 

YES 

Enter a title for the new data set : 

FACTOR SCORES 

How many factor scores do you want to store ?

1

Name of data file =

SCORE:INTERNAL

Is data medium placed in device INTERNAL ?

YES 



PROGRAM NOW STORING FACTOR SCORES

Is program medium replaced in device INTERNAL ?

YES



*** The 1 factor analysis scores were stored in SCORE:INTERNAL ***

Do you wish to perform another rotation ?

NO

Enter number of desired funtion:

4    Return to BSDM



Sample Problem #2 

The correlation matrix for a set of six bone measurements of White Leghorn fowl is
considered. The correlation matrix is the subject of Example 7.5, page 243 of Morrison (see
reference 11).

The six measurements are:

X1 = Skull length
X2 = Skull breadth
X3 = Humerus
X4 = Ulna
X5 = Femur
X6 = Tibia

Extraction of the principal components for the matrix reveals that 76% of the variance is
explained by the first component and 88% by the first two components together. Thus, if
one were interested in data reduction, it may be practical to use only the first two
components (or factors).

This particular example permits an easy interpretation of the factors or components. For
example, the first factor may be interpreted as a general average dimension of all bones,
with the wing and leg bones receiving slightly higher loadings. Further explanation of the
components may be obtained in Morrison (11).

The data were input as a correlation matrix. A principal component analysis was done and it
showed that two components accounted for over 88% of the total variance. Component
plots were done for component 1 vs. component 2, component 1 vs. component 3, and
component 2 vs. component 3.

Factor analysis by the method of principal axes was done. Communalities were calculated.
Three factors were used in the factor analysis. The first two factors accounted for about 80%
of the total variance. Factor plots were done for the unrotated factors. Then an orthogonal
varimax rotation was performed. The result of the rotation and new factor plots were output.
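
The 76% figure for the first component can be reproduced independently of the library. The Python sketch below is an illustration only: it uses plain power iteration (a swapped-in technique, not necessarily the method the HP program uses) on the correlation values listed in the run that follows, and the function name `largest_eigenpair` is a hypothetical helper.

```python
from math import sqrt

# Correlation matrix for the six bone measurements (order: skull length,
# skull breadth, humerus, ulna, femur, tibia), as listed in the sample run.
R = [
    [1.000, 0.584, 0.615, 0.601, 0.570, 0.600],
    [0.584, 1.000, 0.576, 0.530, 0.526, 0.555],
    [0.615, 0.576, 1.000, 0.940, 0.875, 0.878],
    [0.601, 0.530, 0.940, 1.000, 0.877, 0.886],
    [0.570, 0.526, 0.875, 0.877, 1.000, 0.924],
    [0.600, 0.555, 0.878, 0.886, 0.924, 1.000],
]


def largest_eigenpair(matrix, iterations=200):
    """Power iteration: returns the dominant eigenvalue and unit eigenvector."""
    n = len(matrix)
    v = [1.0] * n
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v'Rv for the converged unit vector.
    rv = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(v[i] * rv[i] for i in range(n))
    return eigenvalue, v


eigenvalue, vector = largest_eigenpair(R)
share = 100.0 * eigenvalue / len(R)  # trace of a correlation matrix = number of variables

print("largest eigenvalue: %.6f" % eigenvalue)
print("percent of total variance: %.3f" % share)
print("first component:", ["%.6f" % x for x in vector])
```

All six elements of the first component come out positive and of similar size, which is why it reads as a general size dimension.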



*********************
* DATA MANIPULATION *
*********************

Enter DATA TYPE (Press CONTINUE for RAW DATA):

3    This data was stored as a correlation matrix.

Mode number = ?

2

Is data stored on program's scratch file (DATA)?

NO

Data file name = ?

BONELNGTH:INTERNAL

Was data stored by the BS&DM system ?

YES

Is data medium placed in device INTERNAL ?

YES

Is program medium placed in correct device ?

YES



BONE LENGTHS OF WHITE LEGHORN FOWL (MORRISON P. 243)

Data file name: BONELNGTH:INTERNAL

Data type is: Correlation Matrix

Number of observations: 6
Number of variables: 6

Variable names:
  1. SKULL LGTH
  2. SKULL BDTH
  3. HUMERUS
  4. ULNA
  5. FEMUR
  6. TIBIA

Subfiles: NONE



SELECT ANY KEY 



Press special function key labeled-LIST 



BONE LENGTHS OF WHITE LEGHORN FOWL (MORRISON P. 243)
Data type is: Correlation Matrix

        Variable # 1  Variable # 2  Variable # 3  Variable # 4  Variable # 5
VAR#    (SKULL LGTH)  (SKULL BDTH)  (HUMERUS   )  (ULNA      )  (FEMUR     )
   1      1.00000        .58400        .61500        .60100        .57000
   2       .58400       1.00000        .57600        .53000        .52600
   3       .61500        .57600       1.00000        .94000        .87500
   4       .60100        .53000        .94000       1.00000        .87700
   5       .57000        .52600        .87500        .87700       1.00000
   6       .60000        .55500        .87800        .88600        .92400

        Variable # 6
VAR#    (TIBIA     )
   1       .60000
   2       .55500
   3       .87800
   4       .88600
   5       .92400
   6      1.00000



SELECT ANY KEY 



Use all the variables in the analysis (YES/NO) ? 

YES 

Is the above information correct ?

YES 



Select special function key labeled-ADV STAT 

Remove BSDM media 

Insert Principal Components & Factor Analysis 
media 



PRINCIPAL COMPONENTS AND FACTOR ANALYSIS
BONE LENGTHS OF WHITE LEGHORN FOWL (MORRISON P. 243)

where variables to be used are:
  1. SKULL LGTH
  2. SKULL BDTH
  3. HUMERUS
  4. ULNA
  5. FEMUR
  6. TIBIA

CORRELATION MATRIX

              SKULL LGTH  SKULL BDTH  HUMERUS     ULNA        FEMUR

SKULL BDTH     .5840000
HUMERUS        .6150000    .5760000
ULNA           .6010000    .5300000    .9400000
FEMUR          .5700000    .5260000    .8750000    .8770000
TIBIA          .6000000    .5550000    .8780000    .8860000    .9240000



Do you want to store the correlation Matrix ? 

NO 



Enter number of desired funtion:

2

Press "CONTINUE" when ready.



Select principal component analysis 



******************************** 

* PRINCIPAL COMPONENT ANALYSIS *

********************************
Enter the option for components output (1,2,3, or 4)
1    Output all the principal components



COMPONENT MATRIX 



Variable Naeie 

1 . SKULL LGTH 

2. SKULL BDTH 

3. HUMERUS 

4. ULNA 

5. FEMUR 

6. TIBIA 



1 
,347463 
.326404 
.443411 
.439972 
.434532 
.440140 



COMPONENT 
2 
.536959 
.696453 
.187321 
.251402 
.278188 
.225718 



3 
. 766673 

636305 
.040071 

011196 
.059205 
.045735 



4 
.049099 
,0 02033 
.524079 
.488769 
.514259 
.468582 



5 
.027212 
.008031 
.168550 
.151309 
.669453 
.706912 



6 
.0 02378 
,058829 
.680900 
.693763 
. 132887 
.184237 



Eigenvalue       4.567571    .714123    .412129    .173189    .075859    .057129

% of total
variance         76.12618   11.90205    6.86882    2.88648    1.26431     .95216

Cumulative %
variance         76.12618   88.02823   94.89705   97.78353   99.04784  100.00000



Do you wish to plot the principal components ?

YES

Plot on CRT ?

NO

Plotter identifier string (press CONT if 'HPGL')?

Enter select code, HPIB bus (defaults are 7,5)?

A beep will signify the end of the plot.

Which pen number should be used ?

1

Enter the pair of component numbers which will be used in this plot ?

1,2

[Plot: BONE LENGTHS OF WHITE LEGHORN FOWL, Component Plot. Horizontal axis: COMPONENT 1. Axes scaled from -1.0 to 1.0.]



Plot for another two factors ?

YES

Which pen number should be used ?

1

Enter the pair of component numbers which will be used in this plot ?

1,3

[Plot: BONE LENGTHS OF WHITE LEGHORN FOWL, Component Plot. Horizontal axis: COMPONENT 1. Axes scaled from -1.0 to 1.0.]



Plot for another two factors ?

YES

Which pen number should be used ?

1

Enter the pair of component numbers which will be used in this plot ?

2,3

[Plot: BONE LENGTHS OF WHITE LEGHORN FOWL, Component Plot. Horizontal axis: COMPONENT 2. Axes scaled from -1.0 to 1.0.]



Plot for another two factors ? 

NO 

Enter number of desired funtion:

3

Method for extracting factors (1 OR 2)

1

Max. # of factors to be extracted (<= 15) 

3 



Select factor analysis 
Use principal axes method 



********************************************
* FACTOR ANALYSIS BY PRINCIPAL AXES METHOD *
********************************************

A maximum of 3 factors will be extracted.
Enter Communality Estimate type (1,2,3, or 4)



Squared multiple correlation 



COMMUNALITY ESTIMATION

Squared Multiple Correlation has been used to compute the communality estimates.
Initial Estimated Communalities of Variables :
   Variable           Communality
   1. SKULL LGTH        .46814
   2. SKULL BDTH        .42741
   3. HUMERUS           .90169
   4. ULNA              .90232
   5. FEMUR             .87345
   6. TIBIA             .88329



Do you wish to specify a min. eigenvalue for factor inclusion ?

NO

Do you want to refine the communality estimates using iteration ?

YES

Enter the maximum # of iterations (default=25)

5

Max. number of iterations for factor extraction = 5



Communalities of Variables after 5 iterations
   Variable           Communality
   1. SKULL LGTH        .60294
   2. SKULL BDTH        .56058
   3. HUMERUS           .93835
   4. ULNA              .94385
   5. FEMUR             .91719
   6. TIBIA             .93088



UNROTATED FACTOR MATRIX

                               FACTOR
Variable Name           1          2          3
1. SKULL LGTH        .684976   -.365703    .003721
2. SKULL BDTH        .636078   -.393993   -.027403
3. HUMERUS           .951391    .081564    .162951
4. ULNA              .945555    .150044    .165112
5. FEMUR             .928596    .176294   -.154345
6. TIBIA             .942826    .125079   -.162222

Contribution
of factor            4.42422     .36486     .10472

% of total
Variance
Extracted           73.73696    6.08099    1.74530



Do you wish to perform any factor rotations ?
YES

*******************
* FACTOR ROTATION *
*******************

Do you wish to plot the original factors ?

YES

Plot on CRT ?

NO

Plotter identifier string (press CONT if 'HPGL')?

Enter the select code, HP bus (defaults are 7,5)?

Which PEN number should be used?

1

The pair of factor numbers used in this plot =?

1,2

A beep will signify the end of the plot. 



[Plot: BONE LENGTHS OF WHITE LEGHORN FOWL, UNROTATED Factor Plot. Horizontal axis: FACTOR 1. Axes scaled from -1.0 to 1.0.]

Note that factors lie on top of each other.
(See factor matrix)



Plot for another two factors ?

YES

Which PEN number should be used?

1

The pair of factor numbers used in this plot =?

1,3

A beep will signify the end of the plot.

[Plot: BONE LENGTHS OF WHITE LEGHORN FOWL, UNROTATED Factor Plot. Horizontal axis: FACTOR 1. Axes scaled from -1.0 to 1.0.]

Note that factors lie on top of each other.
(See factor matrix)



Plot for another two factors ?

YES

Which PEN number should be used?

1

The pair of factor numbers used in this plot =?

2,3

A beep will signify the end of the plot.

[Plot: BONE LENGTHS OF WHITE LEGHORN FOWL, UNROTATED Factor Plot. Horizontal axis: FACTOR 2. Axes scaled from -1.0 to 1.0.]



Plot for another two factors ? 

NO 

Enter the type of rotation (1 or 2) = 

1 

Enter the method of orthogonal rotation (1 or 2)

1



FACTOR MATRIX

ORTHOGONAL VARIMAX ROTATION

                               FACTOR
Variable Name           1          2          3
1. SKULL LGTH        .351827   -.689172    .064838
2. SKULL BDTH        .298532   -.686028    .028647
3. HUMERUS           .809812   -.465665    .256342
4. ULNA              .843788   -.405943    .259001
5. FEMUR             .873357   -.388363    .060132
6. TIBIA             .856571   -.438891    .067387

Contribution
of factor            3.07714    1.67068     .14597

% of total
Variance
Extracted           51.28572   27.84462    2.43291



Do you wish to plot the rotated factors ? 

YES 

Plot on CRT ? 

NO 

Plotter identifier string (press CONT if 'HPGL')? 

Enter the select code, HP bus (defaults are 7,5)? 



Which PEN number should be used?
1

The pair of factor numbers used in this plot =?

1,2

A beep will signify the end of the plot.

[Plot: BONE LENGTHS OF WHITE LEGHORN FOWL, VARIMAX ROTATED Factor Plot. Horizontal axis: FACTOR 1. Axes scaled from -1.0 to 1.0.]

Note that factors lie on top of each other.
(See factor matrix)



Plot for another two factors ?

YES

Which PEN number should be used?

1

The pair of factor numbers used in this plot =?

1,3

A beep will signify the end of the plot.

[Plot: BONE LENGTHS OF WHITE LEGHORN FOWL, VARIMAX ROTATED Factor Plot. Horizontal axis: FACTOR 1. Axes scaled from -1.0 to 1.0.]

Note that factors lie on top of each other.
(See factor matrix)



Plot for another two factors ?

YES

Which PEN number should be used?

1

The pair of factor numbers used in this plot =?

2,3

A beep will signify the end of the plot.

[Plot: BONE LENGTHS OF WHITE LEGHORN FOWL, VARIMAX ROTATED Factor Plot. Horizontal axis: FACTOR 2. Axes scaled from -1.0 to 1.0.]



Plot for another two factors ? 

NO 

Do you wish to perform another rotation ?

NO

Enter number of desired funtion:
4



Return to BSDM






Monte Carlo Simulations 



General Information 

Description 

The programs in this software package are meant primarily as a library of utility routines to be
combined with the user's own programs. Hence, each routine is set up as an independent,
modular unit with a standard set of input and output parameters. These subprograms perform no
actual input or output, with the exception of error messages.

With each routine, the package provides a general-purpose front-end driver. In some cases, 
such as the Spectral and Run tests, the driver plus the routine make sense as a stand-alone unit. 
In other cases, such as the various random number deviates, the drivers are simply meant to 
introduce the user to the subprogram itself. 

The software package does not establish the printers or the mass storage devices. It is the user's 
responsibility to select the printer and mass storage device before using any of these routines. 

The 9826/36 operating system includes a random number generator, RND. 

General Instructions 

How Do I Load A Stand Alone Program? 

1. Insert the program disc into the computer. 

2. None of the drivers ask for the desired printer or mass storage device. This must be set by 
the user from the keyboard. 

3. Type: LOAD "File name", 10 
Press: EXECUTE. 

4. At this point, appropriate inputs are requested, computations are performed, and the 
results are printed or saved on a mass storage device. 






How Do I Add One Of The Utility Subprograms Onto My Program? 

Each program file has a driver and then one or more subprograms. If you want to incorporate 
just one of these subprograms into your routine, how do you do it? 

The entire file needs to be loaded into memory first, and then the particular subprogram needs 
to be saved in a temporary file. Finally, after you have written your own code, you can link the 
temporary file containing the desired subprogram on after your code. 

1. Insert the program cartridge or disc into the computer. 

2. Type: LOAD "File name" 
Press: EXECUTE 

3. After the program has been loaded, 
Type: EDIT 

Press: EXECUTE 

4. At this point, the screen looks as follows: 

10 Beginning of driver program.
20

Driver program

END



100 SUB Sub_to_be_linked



SUBEND 



5. If subprogram Sub_to_be_linked is the one desired and it goes from line 100 to line 500, 
then 

Type: SAVE "TEMP", 100,500 
Press: EXECUTE. 

6. Type: SCRATCH A 
Press: EXECUTE. 

7. After you enter your program into memory, for this example assume that the last line of 
your code is line 2500. Then 

Type: GET "TEMP", 2510 
Press: EXECUTE. 

8. The desired subprogram is then linked on behind your routine. 






Special Considerations 

1. All the programs in this package have been set up using the random number generator 
RND. This may be replaced by the super random generator contained in RSUPER. 

2. You now have two different random number generators at your disposal. 

RND: a randomly generated generator. (See the section further on in General
Information for more details.)

RSUPER: a combination generator. (See "RSUPER" for further details.)

It is strongly suggested that any serious Monte Carlo simulation should be run with both of 
these generators. 

3. This package is meant to provide a set of subprogram utilities which you can combine to 
meet your particular needs. Each utility may be viewed as an independent modular unit. 
This allows you to combine these building blocks into your own program. 

4. In order to get a feel for how each utility works and, in the case of the various generators, 
how much confidence you can place in them, driver routines have been provided. So, it 
is suggested that you first use these driver programs as is, and then later adapt them to 
your particular need. 

5. In order to allow you the most flexibility, no references are made to printers or mass 
storage devices. Hence, to have a particular program run from a floppy disk in the 
internal disc drive and have all information printed on the CRT, you would type in the 
following before running your program: 

1. a. Type: MASS STORAGE IS ":INTERNAL"
b. Press: EXECUTE

2. a. Type: PRINTER IS 1
b. Press: EXECUTE

6. Each of the driver programs for the random deviates allows you to: 

1. generate a set of random numbers to be printed or saved on a mass storage device, 
or 

2. get a feeling for the quality of the generator by running through some randomly 
generated tests. 






7. There may be occasions where you will not have enough memory to store all the random 
numbers you would like to have. A number of possible tricks are available to you: 

a. Presently all deviates are set up in full precision arrays. Can you store the deviates in 
an integer? Where a full precision array requires 8 bytes per number, an integer only 
requires 2. Care must be taken here to dimension your array using an INTEGER 
statement rather than a DIM. Also, the parameters in the SUB statement must be 
changed to INTEGER. 

b. Can you generate and use the random numbers in a partitioned fashion? For example,
generate 1000 deviates, use them; generate 1000 more, use them; etc.

c. If b is not possible, can you make use of your mass storage device to recall the 
deviates as you need them? For example: 

i. generate 1000 deviates; store them; generate 1000 more, store them; etc. 

ii. bring the first 1000 deviates into memory; use them; bring the next 1000 in, use them;
etc.

8. Entering a value of 1 for the printer's select code automatically causes the program to 
skip over the question requesting the printer's bus address. 

9. If you choose to check through some examples of random data sets produced by one of 
the generators, default values are supplied for the parameters. For example, you may see 
a prompt such as: 

# OF RANDOM DEVIATES IN EACH SET? 
100 

If the default number, 100, is acceptable to you simply press CONTINUE and 100 
deviates will be generated in each set. If you wish to have a different number generated, 
edit the number in the response line before pressing CONTINUE. 

10. If you store a set of random numbers produced by one of the generators, the data set 
may be read into a statistical data base created by Basic Statistics and Data Manipulation 
(BSDM) and then accessed by any other statistics routine. 

To access the data using BSDM, remember that the data was not stored by BSDM. Thus, 
you will need to supply a name for the data set, a variable name, number of observations, 
etc. 






9826/36 Random Number Generator: RND 

This generator uses a standard "multiplicative congruential generator". In this generator, a
starting value called the seed is multiplied by a positive integer constant, and the result is taken
modulo M:

X(i+1) = A*X(i) MOD M

The algorithm used in RND has a starting seed of 37480660. This seed may be set by the
program to any new value by using the RANDOMIZE statement.

In this routine, the value A = 16807 is used for the multiplier. The modulus is M = 2^31 - 1. The
exact steps used in the algorithm are presented below.

The algorithm below is the one used to generate the next random number in a sequence from 
the previous one (i.e., the seed) using RND: 

1. Multiply the current seed by 16807.

2. Take the result of Step 1 modulo M.

3. Save the result of Step 2 as the new seed.

4. Convert the result of Step 2 to a number between 0 and 1.
(Divide by 2^31 - 1.)

5. Go to Step 1. 
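
The steps above can be sketched in Python as follows. This is only an illustration of the recurrence with the multiplier and modulus quoted in the text; the function name `lcg` and its return convention are assumptions made for the example, not part of the package.

```python
M = 2**31 - 1   # modulus quoted in the text
A = 16807       # multiplier quoted in the text


def lcg(seed, count):
    """Return (values, new_seed) after `count` steps of X(i+1) = A*X(i) MOD M."""
    values = []
    for _ in range(count):
        seed = (A * seed) % M      # steps 1 to 3: multiply, reduce, save as new seed
        values.append(seed / M)    # step 4: scale to a number between 0 and 1
    return values, seed


# Three values starting from the default seed quoted in the text.
values, seed = lcg(37480660, 3)
```

Because the seed fully determines the sequence, repeating a run with the same seed reproduces the same deviates.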
References 

1. Camp, Warren V. and Lewis, T.G., "Implementing a Pseudo-Random Number Generator
on a Minicomputer", IEEE Transactions on Software Engineering, May, 1977.

2. Knuth, Donald E., The Art of Computer Programming, Volume 2: Seminumerical
Algorithms, Addison-Wesley, Reading, Mass., 1969.

3. Learmonth, J. and Lewis, P.A.W., "Naval Postgraduate School Random Number Generator
Package LLRANDOM", Naval Postgraduate School, Monterey, Calif., 1973.

4. Learmonth, J. and Lewis, P.A.W., "Statistical Tests of Some Widely Used and Recently
Proposed Uniform Random Number Generators", Naval Postgraduate School,
Monterey, Calif., 1973.

5. MacLaren, M.D. and Marsaglia, G., "Uniform Random Number Generators", JACM
12, Jan. 1965, p. 83-89.

6. Marsaglia, G. and Bray, T.A., "One-line Random Number Generators and Their Use in
Combinations", CACM, Vol. 11, 1968, p. 757-759.

7. Musyck, E., "Search For a Perfect Generator of Random Numbers", Studiecentrum
Voor Kernenergie, E. Plaskylaan 144, Brussels 4, Belgium, January, 1977.

8. Reddy, Y.V., "PL/I Process Generators", SIMULETTER, Vol. III, Oct. 1976, p. 25-29.

9. Wheeler, Robert E., "Random Variable Generators", SIMULETTER, Vol. III, Oct. 1976,
p. 16-22.






Random Number Generators 

Object of Program 

Subprograms with optional drivers are provided to generate random deviates from some standard
statistical distributions.

The subprograms have been set up as independent modules. Hence, it is quite simple to use
these routines in your own programs. Choose values for the required input parameters, call the
subprogram, and the resulting outputs are returned to you. See the General Information section
of this manual for detailed instructions.

Optional drivers have also been set up for your use. In general, the drivers: i) allow you to 
directly generate a set of deviates to be printed or saved on a mass storage device; and ii) 
provide the ability to check out the particular generator through the use of some standard tests 
in order to get a feel for the quality of the deviates produced. 

Typical Program Flow

[Flowchart. Boxes, in reading order: Choose to check through examples; Choose to consider a
specific data set; Use default parameter values; Enter parameters; Numbers are generated and
statistics on the deviates are printed; Print out the data set; Store the data set.]






(RBETA) 

Random Numbers Generated 

from a Beta Distribution 

Description 

Given a Beta distribution with V1 and V2 degrees of freedom, respectively, this subprogram
generates a set of random deviates. The probability density function is:

f(x) = [x^(V1/2-1)][(1-x)^(V2/2-1)]/[B(V1/2,V2/2)]
for 0<=x<=1, where B(*,*) is the beta function.

File Name 

"RBETA" 

Calling Syntax 

CALL Random_beta (N,V1,V2,X(*) ) 

Input Parameters 

N number of deviates desired. 

V1, V2 degrees of freedom on the Beta distribution.

Output Parameters 

X(*) array of dimension (1:N) containing the N deviates. 

Algorithm 

This routine generates deviates for the beta distribution with v1, v2 degrees of freedom. The
method used is valid for both integer and non-integer v1 and v2:

1. Generate uniform random deviates u1 and u2.

2. Set y1 = u1^(2/v1); y2 = u2^(2/v2), repeating this process until finding y1+y2 <= 1.

3. Then x = y1/(y1+y2).
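
The rejection loop in these steps can be sketched in Python as follows. The function name `random_beta` and the optional `rng` argument are assumptions for the example, and Python's built-in uniform generator stands in for RND.

```python
import random


def random_beta(n, v1, v2, rng=random):
    """Return n beta deviates with v1, v2 degrees of freedom (steps 1 to 3)."""
    out = []
    while len(out) < n:
        u1 = 1.0 - rng.random()        # uniform on (0, 1]
        u2 = 1.0 - rng.random()
        y1 = u1 ** (2.0 / v1)
        y2 = u2 ** (2.0 / v2)
        if y1 + y2 <= 1.0:             # otherwise reject and draw again
            out.append(y1 / (y1 + y2))
    return out
```

The expected number of uniform pairs consumed per deviate grows as v1 and v2 grow, so this method is best suited to small degrees of freedom.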

Reference 

1. Knuth, Donald E., The Art of Computer Programming, Volume 2, Seminumerical
Algorithms. Reading, Mass.: Addison-Wesley, 1969, p. 115.






(RBINOM) 

Random Integers Generated From a 

Binomial Distribution (T,P) 

Description 

Given that some event occurs with probability P and that we carry out T independent trials, this 
subprogram generates a set of integers with the binomial distribution (T,P). The probability 
density function is: 

f(x) = C(T,x)[P^x][(1-P)^(T-x)]
for x = 0,1,...,T, where C(T,x) is the binomial coefficient.

File Name 

"RBINOM" 

Calling Syntax 

CALL Random_binomial (N,P,T,X(*) )

Input Parameters 

N number of deviates 

P probability of the event occurring. 

T number of independent trials. 



Output Parameters 

X(*) array of dimension (1:N) containing integers randomly 

generated for the number of occurrences. 



Algorithm 

Given T and P: 

1. Set Sum = 0. 

2. For I = 1 to T.

3. Generate a uniform random deviate U.

4. If U <= P then Sum = Sum + 1.

5. Next I. 

6. The binomial deviate is equal to Sum. 
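
A minimal Python sketch of this counting loop is shown below. The name `random_binomial` mirrors the calling syntax above, but the `rng` argument and the use of Python's uniform generator in place of RND are assumptions.

```python
import random


def random_binomial(n, p, t, rng=random):
    """Return n binomial(T, P) deviates by counting successes in t trials."""
    out = []
    for _ in range(n):
        total = 0
        for _ in range(t):
            if rng.random() <= p:      # step 4: the event occurred on this trial
                total += 1
        out.append(total)
    return out
```

Each deviate costs T uniform numbers, so the running time is proportional to N*T.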

Reference 

1. Reddy, Y.V., "PL/I Process Generators", SIMULETTER, Vol III, Oct. 1976, p. 25-26. 






(RCHISQ) 

Random Numbers From a Chi-square 

Distribution 

Description 

Given the number of degrees of freedom and the number of deviates desired, this subprogram 
generates a set of random numbers with the Chi-square distribution. The probability density 
function is: 

f(x) = [.5^(v/2)][x^(v/2-1)][exp(-.5x)]/[G(v/2)] for x > 0, where v is the degrees of
freedom and G(*) is the gamma function.

File Name 
"RCHISQ" 

Calling Syntax 

CALL Random_chi_sq (N,V,X(*) )

Input Parameters 

N number of deviates desired. 

V degrees of freedom. 

Output Parameters 

X(*) array of dimension (1:N) containing the N deviates. 

Algorithm 

This utility generates random deviates for the Chi-square distribution with v degrees of 
freedom. 

For each deviate, if v = 2*k, where k is an integer,

set x = 2*(y1 + y2 + ... + yk) where the y's are independent random variables with the
exponential distribution, each with mean = 1.

If v = 2*k + 1,

set x = 2*(y1 + y2 + ... + yk) + z^2 where the y's are as before, and z is a random
variable independent of the y's, with the normal distribution (mean = 0, standard
deviation = 1).
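
The even and odd cases can be sketched together in Python. This is an illustration only: the exponential deviates are drawn by the logarithm method and the normal deviate by Python's `gauss`, both of which are stand-ins for the package's own REXPON and RNORM routines, and the name `random_chi_sq` with its `rng` argument is assumed.

```python
import math
import random


def random_chi_sq(n, v, rng=random):
    """Return n chi-square deviates with integer v degrees of freedom."""
    k, odd = divmod(v, 2)              # v = 2*k or v = 2*k + 1
    out = []
    for _ in range(n):
        # sum of k exponential deviates, each with mean 1
        y_sum = sum(-math.log(1.0 - rng.random()) for _ in range(k))
        x = 2.0 * y_sum
        if odd:
            z = rng.gauss(0.0, 1.0)    # independent standard normal deviate
            x += z * z
        out.append(x)
    return out
```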

Reference 

1. Knuth, Donald E., The Art of Computer Programming, Volume 2, Seminumerical
Algorithms. Reading, Mass.: Addison-Wesley, 1969, p. 115.






(REXPON) 
Random Numbers From an Exponential Distribution 

Description 

Given a mean, which you supply, this subprogram generates a set of exponential deviates. The 
probability density function is: 

f(x) = [exp(-x/μ)]/μ

for x>0, where μ is the mean of the distribution = Mu.

File Name 
"REXPON" 

Calling Syntax 

CALL Random_expon (N,Mu,X(*) ) 

Input Parameters 

N number of deviates desired. 

Mu mean of the distribution. 

Output Parameters 

X(*) array of dimension (1:N) containing the N deviates. 

Algorithm 

This routine uses the random minimization method (due to George Marsaglia) to compute an 
exponentially distributed variable without using the logarithm subroutine. Although this routine 
takes slightly more space, it is much faster than the traditional algorithm. 

Reference 

1. Knuth, Donald E., The Art of Computer Programming, Volume 2, Seminumerical
Algorithms. Reading, Mass.: Addison-Wesley, 1969, p. 114.






(RF) 
Random Numbers Generated From an F-Distribution 

Description 

Given an F-distribution (variance-ratio distribution) with V1 and V2 being the numerator and
denominator degrees of freedom, respectively, this subprogram generates a set of corresponding
random deviates. The probability density function is:

f(x) = [G(V1/2 + V2/2)][(V1/V2)^(V1/2)][x^(V1/2-1)] / {G(V1/2)G(V2/2)[(1 + (V1/V2)x)^(V1/2 + V2/2)]}

for x>0, V1 and V2 positive integers.



File Name
"RF"

Calling Syntax

CALL Random_f (N,V1,V2,X(*) )

Input Parameters 

N number of deviates desired. 

V1, V2 degrees of freedom on the F-distribution.

Output Parameters 

X(*) array of dimension (1:N) containing the N random numbers. 

Reference 

1. Knuth, Donald E., The Art of Computer Programming, Volume 2, Seminumerical
Algorithms. Reading, Mass.: Addison-Wesley, 1969, p. 116.






(RGAMM1) 
Random Integers Generated From a Gamma (Alpha) 

Distribution 

Description 

This subprogram generates a set of Gamma (Alpha) deviates. The probability density function 
is: 

f(x) = [(x^(Alpha-1))(exp(-x))]/G(Alpha)

where Alpha>0 is the distribution parameter and G(*) is the gamma function. 

File Name 

"RGAMM1" 

Calling Syntax 

CALL Random_gamma1 (N,Alpha,X(*) )

Input Parameters 

N number of random numbers desired. 

Alpha Gamma parameter. 

Output Parameters 

X(*) array of dimension (1:N) containing numbers randomly generated with the given 

Gamma distribution. 






(RGAMM2) 
Random Numbers Generated From a Gamma 

(A,B) Distribution 

Description 

This subprogram generates a set of Gamma (A,B) random deviates. The probability density 
function is: 

f(x) = [x^(B-1)][exp(-x/A)]/[G(B)A^B]

for x, A and B>0, where G(*) is the gamma function. 

File Name 

"RGAMM2" 

Calling Syntax 

CALL Random_gamma2 (N,A,B,X(*) ) 

Input Parameters 

N number of random deviates desired. 

A,B Gamma parameters, B must be an integer. 

Output Parameters 

X(*) array of dimension (1:N) containing deviates randomly generated with the Gamma 

distribution. 



Algorithm 

1. Given Gamma parameters A and B, generate B independent exponential deviates with
mean = A.

2. The corresponding Gamma deviate is equal to the sum of the B exponential deviates.






(RGEOM) 
Random Integers Generated From a Geometric 

Distribution 

Description 

Given that a certain event occurs with probability P, this subprogram generates N random 
integers with the appropriate Geometric distribution; that is, each random integer represents 
the number of individual trials needed until the given event first occurs (or between occurrences 
of the event). The probability density function is: 

f(x) = P(1-P)^(x-1)
for x = 1,2,... .

File Name
"RGEOM"

Calling Syntax

CALL Random_geom (N,P,Integer(*) )

Input Parameters 

N number of random integers desired. 

P probability of a given event occurring. 

Output Parameters 

Integer(*) array of dimension (1:N) containing integers randomly generated for the number 
of independent trials needed until the given event occurs. 

Algorithm 

The probability of the event first occurring on the Rth trial is P*(1-P)^(R-1).

A convenient way to generate a variable with this distribution when P is small is to set R = the
least integer function of [ln(U)/ln(1-P)] where U is a uniformly generated random number.
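
As an illustration, the least-integer (ceiling) formula can be written in Python as below. The name `random_geom` follows the calling syntax, while the `rng` argument and the guard for the boundary value U = 1 are assumptions added for the sketch.

```python
import math
import random


def random_geom(n, p, rng=random):
    """Return n geometric deviates: trials needed until the event first occurs."""
    out = []
    log_q = math.log(1.0 - p)
    for _ in range(n):
        u = 1.0 - rng.random()                 # uniform on (0, 1], so ln(u) is defined
        r = math.ceil(math.log(u) / log_q)     # least integer >= ln(U)/ln(1-P)
        out.append(max(r, 1))                  # u == 1.0 would give 0; count it as trial 1
    return out
```

Unlike a trial-by-trial simulation, this needs one uniform number per deviate regardless of how small P is.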

Reference 

1. Knuth, Donald E., The Art of Computer Programming, Volume 2, Seminumerical
Algorithms, Reading, Mass.: Addison-Wesley, p. 116.






(RLNORM) 
Random Lognormal Deviates 

Description 

This subprogram generates a set of random deviates such that the natural logarithm of the 
deviates follows a normal distribution with mean = Mu and standard deviation = Sigma. The 
probability density function is: 

f(x) = [exp(-.5[(ln x - Mu)/Sigma]^2)]/[x((2*PI)^.5)*Sigma]

File Name

"RLNORM"

Calling Syntax

CALL Random_lognorm (N,Mu,Sigma,X(*) )

Input Parameters 

N number of deviates desired. 

Mu mean of the associated normal distribution. 

Sigma standard deviation of the associated normal distribution. 

Output Parameters 

X(*) array of dimension (1:N) containing the N lognormal deviates. 

Algorithm 

1. Let S = log[(Sigma^2)/(Mu^2) + 1].

2. Let U = log(Mu) - 0.5*S.

3. Generate a normal deviate A, with mean = U and standard deviation = Square Root of 
(S). 

4. Then the lognormal deviate is equal to exp (A). 
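
The four steps can be sketched in Python as follows. The function name and `rng` argument are assumptions, and Python's `gauss` stands in for the package's normal generator. Note that with S and U defined as in steps 1 and 2, the mean of exp(A) works out to Mu, so the steps as written appear to treat Mu and Sigma as the mean and standard deviation of the lognormal deviates themselves.

```python
import math
import random


def random_lognorm(n, mu, sigma, rng=random):
    """Return n lognormal deviates following steps 1 to 4 as written."""
    s = math.log(sigma ** 2 / mu ** 2 + 1.0)    # step 1
    u = math.log(mu) - 0.5 * s                  # step 2
    root_s = math.sqrt(s)
    # steps 3 and 4: normal deviate with mean u, sd sqrt(s), then exponentiate
    return [math.exp(rng.gauss(u, root_s)) for _ in range(n)]
```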

Reference 

1. Reddy, Y.V., "PL/I Process Generators", SIMULETTER, Vol. III, Oct., 1976, p. 27.






(RNEGBI) 

Random Numbers Generated From a Negative 

Binomial Distribution 

Description 

This subprogram generates a set of Negative Binomial random deviates, that is, each random 
integer represents the number of trials needed until a given event occurs R times. The probabil- 
ity density function is: 

f(x) = C(x-1,R-1)(P^R)((1-P)^(x-R))

for 0<=P<=1, and x = 1,2,... , where C(*,*) is the binomial coefficient.

File Name 

"RNEGBI" 

Calling Syntax 

CALL Random_neg_bin (N,R,P,X(*) ) 

Input Parameters 

N number of random integers desired. 

R failure value. 

P probability. 

Algorithm 

1. Given parameters R and P, generate R random geometric deviates with parameter P. 

2. The corresponding Negative Binomial Deviate is equal to the sum of the R geometric 
deviates. 
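
A short Python sketch of this sum-of-geometrics construction follows. The geometric deviates are generated here by direct trial counting, which is an assumption made to keep the example self-contained; the package's RGEOM routine could equally be used.

```python
import random


def random_neg_bin(n, r, p, rng=random):
    """Return n negative binomial deviates: trials needed for r occurrences."""

    def geometric():
        count = 1
        while rng.random() > p:    # event did not occur; run another trial
            count += 1
        return count

    # step 2: each deviate is the sum of r independent geometric deviates
    return [sum(geometric() for _ in range(r)) for _ in range(n)]
```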

Reference 

1. Wheeler, R.E., "Random Variable Generators", SIMULETTER, Vol. IV, April, 1973, p. 
22. 






(RNORM) 

Normal Random Deviates With Mean = 0

And Standard Deviation = 1

Description

This subprogram calculates an even number of normally distributed variables with mean = 0
and standard deviation = 1. The probability density function is:

f(x) = [exp(-.5(x^2))]/[(2*PI)^.5]

File Name 

"RNORM" 

Calling Syntax 

CALL Random_normal (N,X(*) )

Input Parameters 

N number of normal deviates desired. N must be even. 

Output Parameters 

X(*) array of dimension (1:N) containing the N normal deviates. 

Algorithm 

This utility generates random deviates for the normal distribution with mean = 0 and standard
deviation = 1. An adapted form of the Polar Method is used. (See Reference 1.)
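
For orientation, the textbook polar method can be sketched in Python as below. This is the standard form described by Knuth, not necessarily the adapted form used in the package; the function name and `rng` argument are assumptions. It shows why deviates come in pairs, which is the reason N must be even.

```python
import math
import random


def random_normal(n, rng=random):
    """Return an even number n of standard normal deviates (polar method)."""
    if n % 2:
        raise ValueError("n must be even: the method yields deviates in pairs")
    out = []
    while len(out) < n:
        v1 = 2.0 * rng.random() - 1.0      # uniform on [-1, 1)
        v2 = 2.0 * rng.random() - 1.0
        s = v1 * v1 + v2 * v2
        if 0.0 < s < 1.0:                  # keep only points inside the unit circle
            factor = math.sqrt(-2.0 * math.log(s) / s)
            out.extend((v1 * factor, v2 * factor))
    return out
```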

Special Considerations 

1. Due to the nature of the algorithm used, this routine generates an even number of normal
deviates. If an odd number is requested, an error message is printed and the routine has
to be re-entered.

2. This method is rather slow, but it has essentially perfect accuracy and takes a minimum of 
storage space. 

Reference 

1. Knuth, Donald E., The Art of Computer Programming, Volume 2, Seminumerical
Algorithms. Reading, Mass.: Addison-Wesley, 1969, p. 104.






(RNORM1) 

Normal Random Deviates With Specified 

Mean and Standard Deviation 

Description 

This subprogram generates a set of normal random deviates with mean = Mu and standard 
deviation = Sigma. The probability density function is: 

f(x) = exp[-(x-Mu)^2/(2*Sigma^2)]/[(2*PI)^.5*Sigma]
where Sigma>0.

File Name 

"RNORM1" 

Calling Syntax 

CALL Random_normal1 (N,Mu,Sigma,X(*) )

Input Parameters 

N number of deviates desired 

Mu assume a normal distribution with mean = Mu. 

Sigma assume a normal distribution with Standard Deviation = Sigma. 

Output Parameters 

X(*) array of dimension (1:N) containing the N normal deviates. 

Algorithm 

Given a mean = u and standard deviation = s,

1. Generate a deviate x with a normal distribution with mean = 0 and standard deviation = 1.

2. Then y = u + s * x.

Reference 

1. Knuth, Donald E., The Art of Computer Programming, Volume 2, Seminumerical
Algorithms. Reading, Mass.: Addison-Wesley, 1969, p. 113.






(RNORM2) 

Dependent Normally Distributed Random Variables 

(Bivariate Normal Deviates) 

Description 

This subprogram generates two dependent random variables which have a bivariate normal
distribution with marginal means = Mu1, Mu2, marginal standard deviations = Sigma1,
Sigma2, and Correlation Coefficient = Rho.

File Name 

"RNORM2" 

Calling Syntax 

CALL Random_normal2 (Mu1,Mu2,Sigma1,Sigma2,Rho,X1(*),X2(*) )

Input Parameters

Mu1, Mu2 marginal means.

Sigma1, Sigma2 marginal standard deviations.

Rho marginal correlation coefficient.

Output Parameters

X1(*), X2(*) two vectors of dependent normally distributed random variables.

Algorithm 

If x1 and x2 are independent normal deviates with mean = 0 and standard deviation = 1, and
if

y1 = Mu1 + Sigma1*x1, and y2 = Mu2 + Sigma2*(Rho*x1 + SQR(1 - Rho^2)*x2)

then y1 and y2 are dependent random variables, normally distributed with means Mu1, Mu2
and standard deviations Sigma1 and Sigma2, and with correlation coefficient Rho.
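
The transformation can be sketched in Python as follows. The count argument `n` is hypothetical (the calling syntax above does not list one), as are the function name and `rng` argument; Python's `gauss` stands in for the package's normal generator.

```python
import math
import random


def random_normal2(n, mu1, mu2, sigma1, sigma2, rho, rng=random):
    """Return two lists of n correlated normal deviates (y1 values, y2 values)."""
    c = math.sqrt(1.0 - rho * rho)
    y1s, y2s = [], []
    for _ in range(n):
        x1 = rng.gauss(0.0, 1.0)           # independent standard normal deviates
        x2 = rng.gauss(0.0, 1.0)
        y1s.append(mu1 + sigma1 * x1)
        y2s.append(mu2 + sigma2 * (rho * x1 + c * x2))
    return y1s, y2s
```

Because y2 reuses x1 with weight Rho, the two outputs share exactly the requested correlation while keeping their own means and standard deviations.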

Reference 

1. Knuth, Donald E., The Art of Computer Programming, Volume 2, Seminumerical
Algorithms. Reading, Mass.: Addison-Wesley, 1969, p. 113.






(RPAR1) 
Random Pareto Generator Of The First Kind 

Description 

This program generates sets of random Pareto deviates of the first kind. The probability density 
function is defined as follows: 

f(x) = [N*A^N]/x^(N+1) for x>A

File Name 

"RPAR1" 

Calling Syntax 

CALL Random_pareto1 (Number,A,N,X(*) )

Input Parameters 

Number number of random deviates desired. 

A,N Pareto parameters. 

Output Parameters 

X(*) array of dimension (1:Number) containing the Pareto deviates of the first kind.

Algorithm 

1. Given parameters A and N, generate a uniform deviate U. 

2. Then the Pareto deviate is equal to: A/(1-U)^(1/N).
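
This inverse-transform step can be sketched in Python as below; the function name follows the calling syntax, and the `rng` argument is an assumption.

```python
import random


def random_pareto1(number, a, n, rng=random):
    """Return `number` Pareto deviates of the first kind with parameters a, n."""
    # rng.random() lies in [0, 1), so 1 - U lies in (0, 1] and never divides by zero
    return [a / (1.0 - rng.random()) ** (1.0 / n) for _ in range(number)]
```

Solving F(x) = 1 - (A/x)^N = U for x gives the formula in step 2, which is why a single uniform deviate suffices per Pareto deviate.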






(RPAR2) 
Random Pareto Generator Of The Second Kind 

Description 

This program generates sets of random Pareto deviates of the second kind. The probability 
density function is defined as follows: 

f(x) = [N*B^N]/[B + x]^(N+1) for x>0.

File Name 

"RPAR2" 

Calling Syntax 

CALL Random_pareto2 (Number,B,N,X(*) )

Input Parameters 

Number number of random deviates desired. 

B,N Pareto parameters. 

Output Parameters 

X(*) array of dimension (1:Number) containing the Pareto deviates of the second kind.

Algorithm 

1. Given parameters B and N, generate a uniform deviate U. 

2. Then the Pareto deviate is equal to: B/(1-U)^(1/N) - B.






(RPOISS) 

Random Integers Generated From 

A Poisson Distribution 

Description 

This subprogram generates a set of Poisson deviates with a specified mean. The probability 
density function is: 

f(x) = [exp(-Mu)(Mu^x)]/x!

for x = 0,1,..., where Mu is the mean of the distribution, and Mu>0 

File Name 

"RPOISS" 

Calling Syntax 

CALL Random_poisson (N,Mu,X(*) ) 

Input Parameters 

N number of random integers desired. 

Mu mean of the Poisson distribution. 

Output Parameters 

X(*) array of dimension (1:N) containing integers randomly generated with the given 

Poisson distribution. 

Algorithm 

Given a mean of the distribution Mu,

1. Set: P = exp(-Mu)
N = 0
Q = 1

2. Generate a random variable U, uniformly distributed between 0 and 1.

3. Set: Q = Q*U

4. If Q>P, then set N = N + 1 and return to step 2.
Else, terminate the algorithm with output N.
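
A Python sketch of this multiplication loop is given below. The name follows the calling syntax and the `rng` argument is an assumption; the local variable `count` plays the role of N in the steps, since `n` here is the number of deviates requested.

```python
import math
import random


def random_poisson(n, mu, rng=random):
    """Return n Poisson deviates with mean mu using the product-of-uniforms loop."""
    p = math.exp(-mu)                  # step 1
    out = []
    for _ in range(n):
        count, q = 0, 1.0
        while True:
            q *= rng.random()          # steps 2 and 3
            if q > p:                  # step 4
                count += 1
            else:
                break
        out.append(count)
    return out
```

The loop uses about Mu + 1 uniform numbers per deviate on average, so it is best suited to modest values of Mu.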

Reference 

1. Knuth, Donald E., The Art of Computer Programming, Volume 2, Seminumerical
Algorithms. Reading, Mass.: Addison-Wesley, 1969, p. 116.






(RSPHER) 

Random Points on an M-dimensional 

Sphere of Radius One 

Description 

This subprogram generates a set of random points on an M-dimensional sphere of radius one. 

File Name 

"RSPHER" 

Calling Syntax 

CALL Random_sphere (N,M,X(*) )

Input Parameters 

N number of random points desired. 

M number of dimensions of the sphere. 

Output Parameters 

X(*) array of dimension (1:N) containing the N random points. 

Algorithm 

1. Let X1, X2, ..., Xm be independent normal deviates (mean = 0, standard
deviation = 1).

2. Let R = SQR(X1^2 + X2^2 + ... + Xm^2).

3. Then the point (X1/R, X2/R, ..., Xm/R) is a random point on the M-dimensional sphere of
radius one.
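
The normalization in these steps can be sketched in Python as follows. Returning each point as a list of M coordinates is an assumption made for the example, as are the function name and `rng` argument.

```python
import math
import random


def random_sphere(n, m, rng=random):
    """Return n points, each a list of m coordinates, on the unit sphere."""
    points = []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(m)]    # step 1
        r = math.sqrt(sum(v * v for v in x))           # step 2
        points.append([v / r for v in x])              # step 3
    return points
```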

Reference 

1. Knuth, Donald E., The Art of Computer Programming, Volume 2, Seminumerical
Algorithms. Reading, Mass.: Addison-Wesley, 1969, p. 116.






(RSUPER) 
Super Uniform Random Number Generator 

Description 

Given methods for generating two random sequences, this shuffling algorithm successively
outputs the terms of a 'considerably more random' sequence. This routine uses RND twice to
generate 'super' random numbers and, due to the slow execution speed, should be used only in
cases where no regular random number generator will do. The probability density function is:

f(x) = 1
for 0<=x<=1

File Name 

"RSUPER" 

Calling Syntax 

CALL Random_super (N,X(*) ) 

Input Parameters 

N number of random deviates desired. 

Output Parameters 

X(*) array of dimension (1:N) containing N uniformly generated random numbers on 

the range (0,1). 

Algorithm 

This method has been suggested by Bays and Durham (see Ref. 1). Given methods for generating
two pseudo-random sequences Xn and Yn, this routine will output terms of a 'considerably
more random' sequence.

A temporary table V(1:107) is used in the generation of sequence Yn.

1. Fill table V with the first 107 elements of sequence Xn.

2. Set X,Y equal to the next numbers of the sequences Xn,Yn, respectively.

3. Set J = INT(101*Y + 1)

4. Output V(J) and set V(J) = X.
Go to step 2.

In our routine, both sequences Xn and Yn are generated using RND. 

Knuth contends that the sequence obtained by applying this algorithm will satisfy virtually 
anyone's requirements for randomness in a computer-generated sequence. 
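
The shuffling idea can be sketched in Python as follows. This is an illustration under stated assumptions: Python's uniform generator stands in for RND, the index is scaled by the table size (a hypothetical `table_size` argument) rather than by the constant shown in step 3, and indexing is 0-based.

```python
import random


def random_super(n, table_size=107, rng=random):
    """Return n uniform deviates produced by a table-shuffling scheme."""
    v = [rng.random() for _ in range(table_size)]    # step 1: fill the table
    out = []
    for _ in range(n):
        x = rng.random()               # step 2: next term of the X sequence
        y = rng.random()               #         next term of the Y sequence
        j = int(table_size * y)        # step 3: table index chosen by Y
        out.append(v[j])               # step 4: output the stored value...
        v[j] = x                       #         ...and replace it with X
    return out
```

Each output costs two draws from the underlying generator plus the one-time cost of filling the table, which is consistent with the speed penalty the routine's description mentions.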

Reference 

1. Knuth, Donald E., The Art of Computer Programming, Vol. II, Seminumerical
Algorithms, Second Edition, Reading, Mass.: Addison-Wesley, 1969, 1981.






Special Considerations 

1. As a result of our own tests, this generator comes highly recommended. It performed 
extremely well on all of our tests of randomness. In terms of execution speed and storage 
space, it is approximately three times as slow as RND alone, plus it requires an extra 856 
or so bytes for storage of the temporary array. 

2. In using this routine, it is suggested that as many random deviates be generated on one 
call as is possible. Each time the subprogram is entered, 107 new table values are 
created. 

3. If you are interested in repeatability of an experiment, remember that initial seeds must 
be set for RND (using RANDOMIZE). 

4. If you plan on calling this routine a large number of times, a significant amount of time 
would be saved if the table V is set up once in your calling routine and then passed as an 
additional parameter to Random_super. This will avoid the overhead of redoing this table 
each time you enter the routine. 

(RT) 
Random Numbers Generated From A T-Distribution 

Description 

This subprogram generates a set of random deviates for a T-distribution with V degrees of 
freedom. The probability density function is: 

f(x) = G((V+1)/2)/[G(V/2)((V*PI)^.5)((1 + (x^2)/V)^((V+1)/2))]
for V = 1,2,...

File Name 
"RT" 

Calling Syntax 

CALL Random_t (N,V,X(*) ) 

Input Parameters 

N number of random deviates desired. 

V degrees of freedom. 

Output Parameters 

X(*) array of dimension (1:N) containing the N random deviates. 

Algorithm 

1. Let y1 be a normal deviate (mean = 0, standard deviation = 1).

2. Let y2 be independent of y1, having the Chi-square distribution with v degrees of
freedom.

3. Then x = y1/(SQR(y2/v)) has the T distribution with v degrees of freedom.
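
These steps can be sketched in Python as below. For brevity the chi-square deviate is formed here as a sum of v squared standard normal deviates, which is an assumption of the sketch and not the method the RCHISQ routine describes; the function name and `rng` argument are also assumed.

```python
import math
import random


def random_t(n, v, rng=random):
    """Return n deviates from the T distribution with integer v degrees of freedom."""
    out = []
    for _ in range(n):
        y1 = rng.gauss(0.0, 1.0)                                 # step 1
        y2 = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(v))     # step 2
        out.append(y1 / math.sqrt(y2 / v))                       # step 3
    return out
```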

Reference 

1. Knuth, Donald E., The Art of Computer Programming, Volume 2, Seminumerical
Algorithms. Reading, Mass.: Addison-Wesley, 1969, p. 116.






(RT1EXT) 
Random Type I Extreme-Value Generator 

Description 

This program generates sets of random Type I Extreme-Value deviates. The cumulative
distribution function is defined as follows:

F(x) = exp(-exp[-Alpha*(x-Mu)] )

File Name 

"RT1EXT" 

Calling Syntax 

CALL Random_type1ext (Number,Alpha,Mu,X(*) )

Input Parameters

Number number of random deviates desired.
Alpha, Mu Type I parameters.

Output Parameters

X(*) array of dimension (1:Number) containing the Number Type I deviates.

Algorithm 

1. Given parameters Alpha and Mu, generate a uniform deviate U. 

2. Then the Type I deviate is equal to: -log[-log(U)]/Alpha + Mu.






(RT2EXT) 
Random Type II Extreme-Value Generator

Description

This program generates sets of random Type II Extreme-Value deviates. The cumulative
distribution function is defined as follows:

F(x) = exp[-(V/x)^K]

File Name 

"RT2EXT" 

Calling Syntax 

CALL Random_type2ext (Number,V,K,X(*) )

Input Parameters

Number number of random deviates desired.
V,K Type II parameters.

Output Parameters

X(*) array of dimension (1:Number) containing the Number Type II deviates.

Algorithm 

1. Given parameters V and K, generate a uniform deviate U. 

2. Then the Type II deviate is equal to: V*[-log(U)]^(-1/K).






(RUNIF) 
Uniform Random Number Generator 

Description 

This program generates sets of uniform random numbers. The probability density function is: 

f(x) = 1
for 0<=x<=1

Calling Syntax 

CALL Random_uniform (N,X(*) ) 

Input Parameters 

N number of random deviates desired. 

Output Parameters 

X(*) array of dimension (1:N) containing N uniformly generated random numbers on 

the range (0,1).






(RWEIBU) 
Random Integers Generated 
From a Weibull Distribution 

Description 

This subprogram generates a set of Weibull deviates. The cumulative distribution function is: 
F(x) = 1 - exp[-(x^Beta)/Alpha]

File Name 

"RWEIBU" 

Calling Syntax 

CALL Random_weibull (N,Alpha,Beta,X(*) ) 

Input Parameters 

N number of random deviates desired. 

Alpha, Beta Weibull parameters. 

Output Parameters 

X(*) array of dimension (1:N) containing deviates randomly generated with the 

given Weibull distribution. 

Reference 

1. Wheeler, R.E., "Random Variable Generators", SIMULETTER, Vol. IV, April 1973, p. 
22. 






Tests for Randomness 



Object of Programs 

A standard set of statistical tests for randomness is provided. These tests are designed as 
independent subprograms with optional drivers. These driver programs have been set up to test 
the binary random number generator RND for randomness. The aim here is twofold: i) to 
actually allow you to check the randomness of RND; and ii) to show you how a typical test 
might be set up. 



(TCHISQ) 
Chi-square Test 

Description 

This subprogram performs a Chi-square test on a set of observations placed in a set of
categories with given probabilities.

File Name 

"TCHISQ" 

Calling Syntax 

CALL Chi_sq_test (N,Cats,Prob(*),Obs(*),V,P)

Input Parameters 

N number of observations. This should be at least 5*Cats, but preferably much
larger, for a valid test.

Cats number of categories.

Prob(*) array of dimension (1:Cats) containing the probabilities of any event occurring in
a particular category. Care must be taken to ensure that no probability value is too
small.

Obs(*) array of dimension (1:Cats) containing the number of observations occurring in
each category.

Output Parameters

V Chi-square statistic. V is expected to have the Chi-square distribution with
(Cats-1) degrees of freedom.

P right-tailed probability; Prob (X>V).






Special Considerations 

1. The Chi-square method can only be used with sets of independent observations. 

2. The proper choice of N is somewhat obscure. Large values of N will tend to smooth out
'locally' non-random behavior, that is, blocks of numbers with a strong bias followed by
blocks of numbers with the opposite bias. But N should be large enough that the
expected value N*Prob(I) is at least 5 for every category.
Preferably, N should be taken much larger than this. So, the method should probably
be used with a number of different values of N.

3. From the Chi-square formula, we can see that a very small probability value would 
severely influence the Chi-square statistic. Hence, it is suggested that categories with 
very small probabilities be grouped together into one larger category. 

4. You must supply the routine with the number of categories into which the data is to be 
partitioned. For example, to check the randomness of the first digit, ten categories will 
be sufficient. To check the first two digits, 100 categories are recommended. 

Algorithm 

A fairly large number, N, of independent observations is made. We count the number of
observations falling into each of K categories, and compute the quantity

V = (1/N) * SUM[I = 1 to K] ( (observed(I)^2)/Prob(I) ) - N

In the associated driver program, the right-tailed probability P(X>V) is then calculated using
(K-1) as the number of degrees of freedom.
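
The statistic V can be computed with a few lines of Python, sketched below. The function name and argument order are assumptions; only V is computed, and the right-tailed probability is left to a chi-square distribution routine.

```python
def chi_sq_statistic(obs, prob):
    """Return V = (1/N) * sum(obs[i]^2 / prob[i]) - N for the given categories."""
    n = sum(obs)                       # total number of observations
    return sum(o * o / p for o, p in zip(obs, prob)) / n - n


# Two equally likely categories, with all 20 observations falling in the first.
v = chi_sq_statistic([20, 0], [0.5, 0.5])
```

This form is algebraically equal to the more familiar sum of (observed - expected)^2 / expected, but needs only one pass over the counts.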

Reference 

1. Knuth, Donald E., The Art of Computer Programming, Volume 2, Seminumerical 
Algorithms. Reading, Mass.: Addison-Wesley, 1969, p. 35-40. 






(TKS) 
Kolmogorov-Smirnov Test 

Description 

Given a continuous cumulative distribution function F(X), this subprogram calculates the
standard Kolmogorov-Smirnov statistics of maximum deviation.

File Name 

"TKS" 

Calling Syntax 

CALL K_s_test (N,Knp,Knn)

Input Parameters 

N number of observations 

The distribution function F(X) must be provided as an in-line function to the subprogram. 

Output Parameters 

Knp positive K-S statistic. 

Knn negative K-S statistic. 

Algorithm 

Given a distribution function F(x) = probability that (X <= x) for a random variable X, the
statistics Knp (Kn positive) and Knn (Kn negative) can be obtained as follows:

1. Obtain the observations x1, x2, ..., xn.

2. Sort the observations: x1 <= x2 <= ... <= xn.

3. Knp = SQR(n) * maximum of [j/n - F(xj)] where 1 <= j <= n.

Knn = SQR(n) * maximum of [F(xj) - (j-1)/n] where 1 <= j <= n.
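
The two statistics can be sketched in Python as follows. Passing the observations and the distribution function as arguments is an assumption made for the example; the subprogram itself takes F(X) as an in-line function.

```python
import math


def k_s_test(xs, cdf):
    """Return (Knp, Knn) for observations xs against distribution function cdf."""
    xs = sorted(xs)                    # step 2
    n = len(xs)
    root_n = math.sqrt(n)
    # enumerate is 0-based, so j + 1 is the rank j used in step 3
    knp = root_n * max((j + 1) / n - cdf(x) for j, x in enumerate(xs))
    knn = root_n * max(cdf(x) - j / n for j, x in enumerate(xs))
    return knp, knn


# Two observations tested against the uniform distribution F(x) = x.
knp, knn = k_s_test([0.75, 0.25], lambda x: x)
```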

Special Considerations 

1. The method used in the driver program (using several tests for moderately sized N, then 
combining the observations later in another K-S test), tends to detect both local and 
global nonrandom behavior. 

Reference 

1. Knuth, Donald E., The Art of Computer Programming, Volume 2, Seminumerical 
Algorithms. Reading, Mass.: Addison-Wesley, 1969, p. 41-48. 






(TMAXT) 
Maximum of T Test 

Description 

This routine generates groups of uniform random numbers, finds the maximum of each group 
and then applies the Kolmogorov-Smirnov test to the resulting set of numbers. 

File Name 

"TMAXT" 

Calling Syntax 

CALL Max_of_t (N,T,Knp,Knn) 

Input Parameters 

N number of groups to be tested. 

T size of each group. 

Output Parameters 

Knp positive Kolmogorov-Smirnov statistic. 

Knn negative Kolmogorov-Smirnov statistic. 

Algorithm 

For 0 <= j < n, let V(j) = max(U(tj), U(tj+1), ..., U(tj+t-1)) where the U's are uniformly
distributed random numbers.

Now apply the Kolmogorov-Smirnov test to the sequence V(0), V(1), ..., V(n-1), with the
distribution function F(x) = x^t, (0 <= x <= 1).
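A minimal Python sketch of the grouping step and of the distribution function used in the subsequent Kolmogorov-Smirnov test is given below. The function names are hypothetical and the input numbers are made up for illustration; the K-S step itself is not repeated here.

```python
def group_maxima(u, t):
    """Return V(0), ..., V(n-1), where V(j) is the maximum of the j-th group of t values."""
    n = len(u) // t  # number of complete groups; leftover values are ignored
    return [max(u[t * j: t * j + t]) for j in range(n)]


def max_cdf(x, t):
    """Distribution function of the maximum of t uniform deviates: F(x) = x^t."""
    return x ** t


# Hypothetical example: six deviates split into n = 2 groups of t = 3.
maxima = group_maxima([0.1, 0.5, 0.3, 0.9, 0.2, 0.4], 3)
print(maxima, [max_cdf(v, 3) for v in maxima])
```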

Reference 

1. Knuth, Donald E., The Art of Computer Programming, Vol. II, Seminumerical Algo-
rithms. Reading, Mass.: Addison-Wesley, 1969, p. 64.



388 



(TPOKER) 
Modified Poker Test 

Description 

This subprogram calculates the number of distinct values in a given set of observations. A 
Chi-square test is then applied to the set of data. 

File Name 

"TPOKER" 

Calling Syntax 

CALL Poker_test (K,N,Digits,V,P) 

Input Parameters 

K number of possible different digits in a set. The number of degrees of freedom is
then (K-1). A reasonable number here is 5.

N number of test sets to be used. N should be at least 5*(K-1), but preferably much
larger, for a valid Chi-square test.

Digits range of the allowed digits, [0,Digits-1]; 13 or 10 would be reasonable values
here.

Output Parameters

V Chi-square statistic. V is expected to have the Chi-square distribution with (K-1)
degrees of freedom.

P right-tailed probability; Prob(X>V).

Algorithm 

In general, we look at n groups of k successive numbers. We count the number of k-tuples 
with r different values. For example, generate 1000 groups of 5 successive numbers, where 
the numbers range from 1 to 13. How many sets have all 5 numbers different? How many 
have 4 different? How many 3? 2? 1? 

A Chi-square test is then made, using the probability

P(r) = d*(d-1)*...*(d-r+1) / (d^k) * S(k,r)

where d is the number of possible digits considered and S(k,r) is the Stirling number of the
second kind for k,r.
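The probabilities P(r) can be checked with a short Python sketch. It is an illustration under the assumption that S(k,r) is the Stirling number of the second kind, computed here by its standard recurrence; the function names are hypothetical.

```python
from fractions import Fraction


def stirling2(k, r):
    """Stirling number of the second kind, S(k, r) = r*S(k-1, r) + S(k-1, r-1)."""
    table = [[0] * (r + 1) for _ in range(k + 1)]
    table[0][0] = 1
    for i in range(1, k + 1):
        for j in range(1, min(i, r) + 1):
            table[i][j] = j * table[i - 1][j] + table[i - 1][j - 1]
    return table[k][r]


def poker_probability(d, k, r):
    """P(r): probability that a group of k digits (d possible) has r distinct values."""
    falling = 1
    for i in range(r):
        falling *= d - i          # d*(d-1)*...*(d-r+1)
    return Fraction(falling * stirling2(k, r), d ** k)


# Groups of 5 numbers drawn from 13 possible values, as in the example above.
for r in range(1, 6):
    print(r, float(poker_probability(13, 5, r)))
```

Exact fractions are used so that the probabilities over r = 1, ..., k can be seen to sum to exactly 1.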

Special Considerations 

You will be required to enter a starting and ending value for the number of groups desired, as 
well as the increment between values. At each value, three independent tests are run. 

Reference 

1. Knuth, Donald E., The Art of Computer Programming, Volume 2, Seminumerical 
Algorithms. Reading, Mass.: Addison-Wesley, 1969, p. 57-58. 



389 



(TRUNS) 
Runs Test 

Description 

This subprogram sets up N random numbers and calculates the number of ascending or 
descending runs in the sequence. A special Chi-square statistic is then produced. 

File Name 

"TRUNS" 

Calling Syntax 

CALL Runs_test (N,Direction,V,P) 

Input Parameters 

N number of random deviates used. The value of N should be 4000 or more. 

Direction Direction = 1 means an ascending run.
Direction = -1 means a descending run.

Output Parameters 

V Chi-square statistic. Since adjacent runs are not independent, a standard Chi- 

square test cannot be used here. A special test, with six degrees of freedom is 
used instead. 

P Right-tailed probability; Prob (X>V). 

Algorithm 

In this algorithm, we examine the length of monotone subsequences of an original sequence
of random numbers; that is, segments which are increasing or decreasing.

1. Calculate the increasing (or decreasing) run lengths and count how many runs have 
length 1, 2, ..., 6 or greater. 

2. Since adjacent runs are not independent, we cannot apply a standard Chi-square test to 
the above data. Instead, we calculate a special statistic V (see Ref. 1, p. 61) which 
should have the Chi-square distribution with six degrees of freedom, when N is large. 
The value of N should be at least 4000 for a valid test. This test may also be used for 
decreasing runs. 
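Step 1 (counting the run lengths) can be sketched in Python as follows. The function name and its arguments are hypothetical, and the special statistic V of step 2 is not computed here because its coefficients are given only in the reference.

```python
def run_length_counts(seq, direction=1, max_len=6):
    """Count runs of length 1, 2, ..., max_len-or-greater.

    direction = 1 counts ascending runs, direction = -1 descending runs.
    Returns a list whose entry i holds the number of runs of length i+1
    (the last entry collects all runs of length max_len or more).
    """
    counts = [0] * max_len
    length = 1
    for prev, cur in zip(seq, seq[1:]):
        if (cur - prev) * direction > 0:
            length += 1                      # the run continues
        else:
            counts[min(length, max_len) - 1] += 1
            length = 1                       # a new run starts here
    counts[min(length, max_len) - 1] += 1    # close the final run
    return counts


# Small made-up sequence: ascending runs are 1 2 9 | 8 | 5 | 3 6 7 | 0 4.
print(run_length_counts([1, 2, 9, 8, 5, 3, 6, 7, 0, 4]))
```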

Reference 

1. Knuth, Donald E., The Art of Computer Programming, Volume 2, Seminumerical 
Algorithms. Reading, Mass.: Addison-Wesley, 1969, p. 60-61. 



390 



(TSERAL) 
Serial Test 

Description 

This subprogram tests whether pairs of successive numbers are uniformly distributed in an 
independent manner. 

File Name 

"TSERAL" 

Calling Syntax 

CALL Serial_test (N,D,D_squared,V,P) 

Input Parameters 

N number of uniform random numbers to be tested. 

D number of digits permitted; 5 or 10 is a reasonable number here. 

D_squared D*D; this must be passed as a parameter to allow for dynamic allocation of 
arrays. 

Output Parameters 

V Chi-square statistic. V is expected to have the Chi-square distribution with
(D*D - 1) degrees of freedom.

P right-tailed probability; Prob(X>V). 

Algorithm 

Given n = total number of uniform random numbers.

d = number of digits permitted; that is, the deviates created are used to create
integers 1, 2, ..., d.

y(j) = jth random integer.

Then for each pair of integers (q,r) with 0 <= q, r < d, count the number of times the pair
(y(2j), y(2j+1)) = (q,r) occurs, for 0 <= j < n.

Finally, apply the Chi-square test to these k = d*d equi-probable categories with probability
1/(d*d) in each case.
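The pair counting and the Chi-square statistic can be sketched in Python as below. This is an illustration, not the TSERAL subprogram: the function name is hypothetical, the integers are assumed to lie in 0, ..., d-1 (matching the range given for q and r), and non-overlapping pairs are used.

```python
from collections import Counter


def serial_test_statistic(y, d):
    """Chi-square statistic over the d*d categories of non-overlapping pairs."""
    n_pairs = len(y) // 2
    counts = Counter((y[2 * j], y[2 * j + 1]) for j in range(n_pairs))
    expected = n_pairs / (d * d)     # each category has probability 1/(d*d)
    v = 0.0
    for q in range(d):
        for r in range(d):
            v += (counts[(q, r)] - expected) ** 2 / expected
    return v


# Hypothetical example with d = 2 and four pairs: (0,1), (0,1), (1,0), (1,1).
print(serial_test_statistic([0, 1, 0, 1, 1, 0, 1, 1], 2))
```

A sample this small is far below the n > 5*d*d guideline and is shown only to make the arithmetic easy to follow.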

Special Considerations 

1. The number of digits permitted may be chosen as any convenient number. But care 
must be taken since a valid Chi-square test should have n large compared to k; that is, 
n>5*d*d at least. 

So, if 

d = 10 then n>500 
d = 20 then n>2000 
etc. 






2. This test may easily be adapted to triples, quadruples, etc., instead of pairs. But the 
value of d must be severely limited in order to avoid having too many categories. Fre- 
quently, in this case, less exact tests, such as the poker test or the maximum t test are 
used instead. 

Reference 

1. Knuth, Donald E., The Art of Computer Programming, Volume 2, Seminumerical 
Algorithms. Reading, Mass.: Addison-Wesley, 1969, p. 55-66. 



(SPCTRL) 
Spectral Test 

Description 

This test is used in theoretically determining the value of coefficient A, given the word size of the 
computer, M, in the linear congruential model described in the General Information section of 
this manual. The value of A is crucial in setting up a good uniform random number generator. 
This is by far the most powerful test currently available on any sized machine. It tends to 
measure the statistical independence of adjacent n-tuples of numbers and is generally applied 
for N = 2, 3, 4 and perhaps a few higher values of N.

File Name 

"SPCTRL" 

Calling Syntax 

CALL Spectral (A,M,N,Info,Q,V,Cn) 

Input Parameters 

A the multiplier to be tested. It is essential that the linear congruential sequence be of 

maximal period. 

M modulus used in the model; in our case, M = 2^49 - 1.

N size of n-tuple to be measured. This test is generally applied for N = 2,3, 4 and 

perhaps a few higher values of N. 

Info intermediate information on program execution (printed each time a particular
section of code has been entered), as well as the total number of iterations required
for convergence, can be printed out at the user's option:

Info = 1 ==> print out intermediate information.
Info = 0 ==> do not print out the information.






Output Parameters 

Q V^2, the wave number squared.

V smallest non-zero wave number in the spectrum.

Cn = PI^(N/2) * V^N / ( (N/2)! * M )
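The quantity Cn can be evaluated with a few lines of Python once V is known. This sketch does not perform the spectral test itself; the function name is hypothetical, and (N/2)! is taken as the gamma function of N/2 + 1 so that odd N is handled as well.

```python
import math


def spectral_cn(v, m, n):
    """Cn = PI^(N/2) * V^N / ((N/2)! * M), with (N/2)! computed as gamma(N/2 + 1)."""
    return math.pi ** (n / 2) * v ** n / (math.gamma(n / 2 + 1) * m)


# Made-up values for illustration: V = 10 and M = 100 with N = 2 give Cn = PI.
print(spectral_cn(10.0, 100, 2))
```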

Special Considerations 

1. Since BASIC string routines are used to perform the multi-precision arithmetic, this 
program is very slow. 

2. The subprogram allows at most 12 digits for A and M. If larger numbers are desired, 
some parameters must be changed to strings before entering the routine. 

Change: SUB Spectral (A,M,N,Info,Q,V,Cn) 

DIM 

Coef$ = VAL$(A) 
CALL Clean-up (Coef$) 
Base$ = VAL$(M) 
CALL Clean-up (Base$) 



To: SUB Spectral (Coef$,Base$,N,Info,Q,V,Cn) 

3. As suggested in the literature, the driver has been set up for N = 2,3,4,5,6. 

4. The multi-precision arithmetic routines are set up as independent subprograms so that 
the user may apply them to other contexts as well. Presently, each of these routines 
allows for up to 90 digits of accuracy. This can be increased simply by changing the 
DIM statements at the beginning of each routine. 

Note 

This test is quite slow. It is not unusual for it to run for a couple of 
hours with one pair. 

5. The program has been set up with n-tuples of size 2, 3, 4, 5 and 6. For each of these 
values, the quantity Cn is calculated. Large values of Cn correspond to randomness, 
small values correspond to nonrandomness. Knuth suggests that the multiplier A passes 
the spectral test if the Cn values are all greater than or equal to 0.1, and it passes the 
test with flying colors if all are greater than or equal to 1. 

Reference 

1. Knuth, Donald E., The Art of Computer Programming, Vol. II, Seminumerical Algo- 
rithms. Reading, Mass.: Addison-Wesley, 1969, p. 69-100. 



393 



Elementary Sampling Techniques 

Object of Programs 

This section provides some elementary sampling and shuffling techniques. Independent sub- 
programs with optional driver routines are provided. 



(SSEL) 
Selection Sampling 

Description 

Given a set of N objects, this program will select n of them at random in an unbiased manner 
(a simple random sample without replacement). 

File Name 

"SSEL" 

Calling Syntax 

CALL Sel_sampling (T_number,S_number,X(*) ) 

Input Parameters 

T_number total number of records in the set. 

S_number number of records to be selected. 

Output Parameters 

X(*) array of size (1:N) containing the index numbers of the records to be sampled. 

Algorithm 

To select n records at random from a set of N, where 0<n< = N: 

1. Set t = 0, m = 0.

2. Generate a random number U, uniformly distributed between zero and one. 

3. If (N - t)*U> = (n - m), then go to step 5. 
Else go to step 4. 

4. Select the next record index for the sample. 

m = m + 1. 
t = t+1. 
If m<n then go to step 2. 
Else the sample is complete and the algorithm terminates. 

5. Skip the next record index. 

t = t+1 
Go to step 2. 
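Steps 1 through 5 can be sketched in Python as follows. This is an illustration rather than the SSEL subprogram; the function name and the optional generator argument are hypothetical, and Python's own random number generator stands in for RND.

```python
import random


def selection_sample(total, wanted, rng=None):
    """Select `wanted` record indices (1-based, in increasing order) out of `total`."""
    rng = rng or random.Random()
    chosen = []
    t = 0  # records examined so far
    m = 0  # records selected so far
    while m < wanted:
        u = rng.random()                       # step 2: uniform on [0, 1)
        if (total - t) * u >= (wanted - m):    # step 3
            t += 1                             # step 5: skip the next record
        else:
            chosen.append(t + 1)               # step 4: select the next record
            m += 1
            t += 1
    return chosen


print(selection_sample(10, 3, random.Random(12345)))
```

Because U is always less than one, a record is always selected once the number of remaining records equals the number still needed, so the loop cannot run past the end of the set.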






Special Considerations 

1. In order to avoid connections between samples obtained on different runs, care must be 
taken to use different starting seeds each time this program is run. RND (using RANDO- 
MIZE) allows for this. The seed can either be initialized in the calling program or the 
subprogram itself. 

A simple way of initializing different seeds for different runs is to do the following: use the 
digits from the month, day, and time that the program is run as the seed. For example, if 
you are running the program on June 19 at 9:47 am, then your seed would be 6190947. 
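The seed construction in the example can be sketched in Python as below. The function name is hypothetical, and the zero-padding of day, hour, and minute to two digits is an assumption; the manual's example only shows the single case 6190947.

```python
from datetime import datetime


def seed_from_datetime(moment):
    """Build a seed from the digits of month, day, hour, and minute."""
    return int(f"{moment.month}{moment.day:02d}{moment.hour:02d}{moment.minute:02d}")


# June 19 at 9:47 am, as in the example above (the year does not enter the seed).
print(seed_from_datetime(datetime(1982, 6, 19, 9, 47)))
```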

Reference 

1. Knuth, Donald E., The Art of Computer Programming, Vol. II, Seminumerical Algo-
rithms. Reading, Mass.: Addison-Wesley, 1969, p. 122.



(SSHUFL) 
Shuffling 

Description 

Given an array of numbers, this program randomly shuffles the array. 

File Name 

"SSHUFL" 

Calling Syntax 

CALL Sshuffle (N,X(*) ) 

Input Parameters 

N number of digits in the array to be shuffled. 

X(*) array of dimension (1:N) containing the digits to be shuffled. 

Output Parameters 

X(*) array of dimension (1:N) containing the shuffled digits. 

Algorithm 

Let X1, X2, ..., Xt be a set of t numbers to be shuffled.

1. Set: j = t.

2. Generate a random number U, uniformly distributed between zero and one.

3. Set: k = greatest integer in [j*U + 1]. Hence, k is a random integer between 1 and j.
Exchange Xk and Xj.

4. j = j-1.

If j>1 then return to step 2.

Else the algorithm terminates at this point.
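The shuffling steps can be sketched in Python as follows. This is an illustration, not the SSHUFL subprogram; the function name and the optional generator argument are hypothetical, and the 1-based indices k and j of the algorithm are mapped onto Python's 0-based lists.

```python
import random


def shuffle_in_place(x, rng=None):
    """Shuffle the list x in place, following steps 1 to 4, and return it."""
    rng = rng or random.Random()
    j = len(x)                       # step 1: j = t
    while j > 1:
        u = rng.random()             # step 2: uniform on [0, 1)
        k = int(j * u) + 1           # step 3: random integer between 1 and j
        x[k - 1], x[j - 1] = x[j - 1], x[k - 1]
        j -= 1                       # step 4
    return x


print(shuffle_in_place(list(range(1, 11)), random.Random(7)))
```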






Reference 

1. Knuth, Donald E., The Art of Computer Programming, Volume 2, Seminumerical 
Algorithms. Reading, Mass.: Addison-Wesley, 1969, p. 124-125. 



396 



Notes 



397 



Appendix 



Changes Necessary For Larger Data Sets 

CAUTION 

INCREASING THE SIZE OF THE DATA SET MAY CAUSE A 
PROBLEM. THERE MAY NOT BE ENOUGH ROOM ON THE 
PROGRAM DISC TO STORE THE ENLARGED DATA SET. TO 
FIND OUT, PROCEED AS FOLLOWS. 

A. Perform the following check on each of your program tapes or discs (excluding Monte 
Carlo Random Number Generator): 

1. Make sure nothing of value is in the scratch file "DATA". If there is, use the STORE 
routine to save it. 

2. Type: PURGE "DATA" 

3. Press: EXECUTE

4. Type: CREATE V "DATA", 2 + (8*n) DIV 1280,1280 where n is the maximum num- 
ber of data values you wish to use in the statistics routines (and is equal to number of 
variables times number of observations per variable). 

5. Press: EXECUTE 

In addition, follow the above procedure for the file named "BACKUP" on Basic Statistics 
and Data Manipulation. 

If you obtain an error using the above procedure on any of the program tapes or discs,
you must transfer all data to a larger medium in order to expand the data set.
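The record count requested in step 4, 2 + (8*n) DIV 1280, can be worked out ahead of time with a short Python sketch. The function name is hypothetical; the 8 bytes per value and 1280 bytes per record are the figures used in the CREATE statement above.

```python
def data_file_records(n_values, bytes_per_value=8, record_size=1280):
    """Number of records asked for by the CREATE statement: 2 + (8*n) DIV 1280."""
    return 2 + (bytes_per_value * n_values) // record_size


# Example: the 1500-value maximum, and a small 3-variable by 10-observation set.
print(data_file_records(1500), data_file_records(30))
```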






B. Make the following change to Basic Statistics and Data Manipulation: 

1. Type: LOAD "FILE1"

2. Press: EXECUTE

3. Type: EDIT 80 

4. Press: EXECUTE 

5. By editing, make the line read 

Mno = n 
where n is the maximum number of data values you wish to use in the statistics 
routines. This must be less than or equal to 1500. 

6. Press: ENTER 

7. Press: shift RESET 

8. Type: PURGE "FILE1" 

9. Press: EXECUTE 

10. Type: STORE "FILE1"

11. Press: EXECUTE 



Note 
Maximum number of variables is 50 and cannot be changed by the 
user. 

Statistics Library Data Formats 

The following is a description of the data format used in the Statistics Library. Also included is 
an explanation of the steps you need to perform to have a program create data compatible with 
the library. 

Method 1 Numeric Data Only 

If you wish to have another program write a data file that is compatible with the library, it is
important to note that the actual numeric data could be written in one of two forms:



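[Figure: two layouts of the data matrix. In the first, each row is a variable (V1, V2, V3, ..., Vn) and each column is an observation (O1, O2, O3, O4, ..., ON). In the second, each row is an observation (O1, O2, O3, ..., ON) and each column is a variable (V1, V2, V3, ..., Vn).]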



The statistics library will prompt you for additional information such as sample size (n), number 
of variables (p), title of the data set, and names of the variables. 






The statements needed to store the data are as follows: 



05   OPTION BASE 1
10   P=3                            ! P = no. of variables
20   N=10                           ! N = no. of observations
30   ALLOCATE X(P,N)                ! THIS COULD BE X(N,P)
40   !
50   ! Put data into matrix X
60   !
70   CREATE BDAT "FILE1",INT((8*P*N)/1280)+2,1280
                                    ! 8 bytes per entry and
                                    ! 1280 bytes per logical
                                    ! record
80   ASSIGN @File1 TO "FILE1"
90   OUTPUT @File1;X(*)
100  ASSIGN @File1 TO *
110  END



Method 2 Numeric Data and Descriptive Data 

If you wish to have another program write a data file that is compatible with the library, and
you wish to have it store descriptive information as well, you need to prepare the file in a
slightly different manner.

The following data is stored in record 1 of the data file: 



Data set title               T$[80]
Number of observations       No
Number of variables          Nv             (max. is 50)
Variable names               Vn$(50)[10]
Number of subfiles           Ns             (max. is 20)
Subfile names                Sn$(20)[10]
Subfile characterizations    Sc(20)



Note 

No, Nv, Ns, and the array Sc(*) should be declared in real precision. 

Starting with record 2, the Statistics Library expects to find the data array. 
The statements needed to store the data are as follows: 



05   OPTION BASE 1
10   P=3                            ! P = no. of variables
20   N=10                           ! N = no. of observations
30   ALLOCATE X(P,N)
35   DIM T$[80],Vn$(50)[10],Sn$(20)[10],Sc(20)
40   !
50   ! Put data into matrix X and descriptive data into other variables
60   !
70   CREATE BDAT "FILE1",INT((8*P*N)/1280)+2,1280
80   ASSIGN @File1 TO "FILE1"
85   OUTPUT @File1,1;T$,No,Nv,Vn$(*),Ns,Sn$(*),Sc(*)      ! Write record 1
90   OUTPUT @File1,2;X(*)                                  ! Write records 2,3,...
100  ASSIGN @File1 TO *
110  END



When using this format and the Statistics Library asks you the question, "Was the data stored 
by the BS&DM system?", answer Yes. This will tell the library to expect the header record as 
record #1. 



400 



Statistical Tables 

Quantiles of the Spearman Test Statistic (a)

  n    p = .900    .950    .975    .990    .995    .999
  4       .8000   .8000
  5       .7000   .8000   .9000   .9000
  6       .6000   .7714   .8286   .8857   .9429
  7       .5357   .6786   .7450   .8571   .8929   .9643
  8       .5000   .6190   .7143   .8095   .8571   .9286
  9       .4667   .5833   .6833   .7667   .8167   .9000
 10       .4424   .5515   .6364   .7333   .7818   .8667
 11       .4182   .5273   .6091   .7000   .7455   .8364
 12       .3986   .4965   .5804   .6713   .7273   .8182
 13       .3791   .4780   .5549   .6429   .6978   .7912
 14       .3626   .4593   .5341   .6220   .6747   .7670
 15       .3500   .4429   .5179   .6000   .6536   .7464
 16       .3382   .4265   .5000   .5824   .6324   .7265
 17       .3260   .4118   .4853   .5637   .6152   .7083
 18       .3148   .3994   .4716   .5480   .5975   .6904
 19       .3070   .3895   .4579   .5333   .5825   .6737
 20       .2977   .3789   .4451   .5203   .5684   .6586
 21       .2909   .3688   .4351   .5078   .5545   .6455
 22       .2829   .3597   .4241   .4963   .5426   .6318
 23       .2767   .3518   .4150   .4852   .5306   .6186
 24       .2704   .3435   .4061   .4748   .5200   .6070
 25       .2646   .3362   .3977   .4654   .5100   .5962
 26       .2588   .3299   .3894   .4564   .5002   .5856
 27       .2540   .3236   .3822   .4481   .4915   .5757
 28       .2490   .3175   .3749   .4401   .4828   .5660
 29       .2443   .3113   .3685   .4320   .4744   .5567
 30       .2400   .3059   .3620   .4251   .4665   .5479



a The entries in this table are selected quantiles w(p) of the Spearman rank correlation
coefficient rho when used as a test statistic. The lower quantiles may be obtained from the
equation

w(p) = -w(1-p)

The critical region corresponds to values of rho smaller than (or greater than) but not
including the appropriate quantile. Note that the median of rho is 0.



This table was reprinted from Practical Nonparametric Statistics by W. J. Conover, with permission from John Wiley and Sons, Inc., and authors Dr.
Gerald J. Glasser and Dr. Winter.



401 



Quantiles of the Wilcoxon Signed Ranks Test Statistic (a)

          w.005   w.01  w.025   w.05   w.10   w.20   w.30   w.40   w.50   n(n+1)/2
 n =  4       0      0      0      0      1      3      3      4      5         10
      5       0      0      0      1      3      4      5      6    7.5         15
      6       0      0      1      3      4      6      8      9   10.5         21
      7       0      1      3      4      6      9     11     12     14         28
      8       1      2      4      6      9     12     14     16     18         36
      9       2      4      6      9     11     15     18     20   22.5         45
     10       4      6      9     11     15     19     22     25   27.5         55
     11       6      8     11     14     18     23     27     30     33         66
     12       8     10     14     18     22     28     32     36     39         78
     13      10     13     18     22     27     33     38     42   45.5         91
     14      13     16     22     26     32     39     44     48   52.5        105
     15      16     20     26     31     37     45     51     55     60        120
     16      20     24     30     36     43     51     58     63     68        136
     17      24     28     35     42     49     58     65     71   76.5        153
     18      28     33     41     48     56     66     73     80   85.5        171
     19      33     38     47     54     63     74     82     89     95        190
     20      38     44     53     61     70     82     91     98    105        210



a The entries in this table are quantiles w(p) of the Wilcoxon signed ranks test statistic
T, for selected values of p <= .50. Quantiles w(p) for p > .50 may be computed from the
equation

w(p) = n(n+1)/2 - w(1-p)

where n(n+1)/2 is given in the right hand column in the table. Note that P(T < w(p)) <= p
and P(T > w(p)) <= 1 - p if H0 is true. Critical regions correspond to values of T less than
(or greater than) but not including the appropriate quantile.
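The equation for the upper quantiles can be applied with a one-line Python helper. The function name is hypothetical; the lower quantile passed in is read from the table.

```python
def wilcoxon_upper_quantile(n, lower_quantile):
    """Return w(p) for p > .50, given the tabled w(1-p): n(n+1)/2 - w(1-p)."""
    return n * (n + 1) / 2 - lower_quantile


# Example: for n = 10 the table gives w(.05) = 11 and n(n+1)/2 = 55,
# so the .95 quantile is 55 - 11.
print(wilcoxon_upper_quantile(10, 11))
```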



This table was reprinted from the Journal of the American Statistical Association, Dr. Robert L. McCornack, author, with the permission of the
American Statistical Association.



402 



Quantiles of the Kolmogorov Test Statistic (a)

One-Sided Test
         p = .90   .95   .975   .99   .995             p = .90   .95   .975   .99   .995
Two-Sided Test
         p = .80   .90   .95    .98   .99              p = .80   .90   .95    .98   .99

 n =  1     .900  .950  .975   .990  .995     n = 21      .226  .259  .287   .321  .344
      2     .684  .776  .842   .900  .929         22      .221  .253  .281   .314  .337
      3     .565  .636  .708   .785  .829         23      .216  .247  .275   .307  .330
      4     .493  .565  .624   .689  .734         24      .212  .242  .269   .301  .323
      5     .447  .509  .563   .627  .669         25      .208  .238  .264   .295  .317
      6     .410  .468  .519   .577  .617         26      .204  .233  .259   .290  .311
      7     .381  .436  .483   .538  .576         27      .200  .229  .254   .284  .305
      8     .358  .410  .454   .507  .542         28      .197  .225  .250   .279  .300
      9     .339  .387  .430   .480  .513         29      .193  .221  .246   .275  .295
     10     .323  .369  .409   .457  .489         30      .190  .218  .242   .270  .290
     11     .308  .352  .391   .437  .468         31      .187  .214  .238   .266  .285
     12     .296  .338  .375   .419  .449         32      .184  .211  .234   .262  .281
     13     .285  .325  .361   .404  .432         33      .182  .208  .231   .258  .277
     14     .275  .314  .349   .390  .418         34      .179  .205  .227   .254  .273
     15     .266  .304  .338   .377  .404         35      .177  .202  .224   .251  .269
     16     .258  .295  .327   .366  .392         36      .174  .199  .221   .247  .265
     17     .250  .286  .318   .355  .381         37      .172  .196  .218   .244  .262
     18     .244  .279  .309   .346  .371         38      .170  .194  .215   .241  .258
     19     .237  .271  .301   .337  .361         39      .168  .191  .213   .238  .255
     20     .232  .265  .294   .329  .352         40      .165  .189  .210   .235  .252

Approximation for n > 40:
         1.07/SQR(n)   1.22/SQR(n)   1.36/SQR(n)   1.52/SQR(n)   1.63/SQR(n)



a The entries in this table are selected quantiles w(p) of the Kolmogorov test statistics T1, T1+,
and T1- as defined by (6.1.1) for two-sided tests and by (6.1.2) and (6.1.3) for one-sided tests.
Reject H0 at the level alpha if T exceeds the 1 - alpha quantile given in this table. These quantiles are
exact for n <= 20 in the two-tailed test. The other quantiles are approximations which are equal to
the exact quantiles in most cases.
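The approximation row of the table (for n greater than 40) can be evaluated with a short Python sketch. The function name and the dictionary are hypothetical; the coefficients are the five values printed in the table, keyed by the one-sided p.

```python
import math

# Numerators from the table's approximation row, keyed by one-sided p.
ONE_SIDED_COEFFICIENTS = {0.90: 1.07, 0.95: 1.22, 0.975: 1.36, 0.99: 1.52, 0.995: 1.63}


def kolmogorov_quantile_approx(n, p):
    """Approximate quantile for n > 40: coefficient / SQR(n)."""
    if n <= 40:
        raise ValueError("use the tabled values for n <= 40")
    return ONE_SIDED_COEFFICIENTS[p] / math.sqrt(n)


# Example: n = 100, one-sided p = .975 (two-sided p = .95).
print(kolmogorov_quantile_approx(100, 0.975))
```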



This table was reprinted from the Journal of the American Statistical Association with the permission of the American Statistical Association, author 
Dr. J L. Miller. 



403 



Quantiles of the Mann-Whitney Test Statistic

  n  p     m=2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20
  2  .001    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
     .005    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   1   1
     .01     0   0   0   0   0   0   0   0   0   0   0   1   1   1   1   1   1   2   2
     .025    0   0   0   0   0   0   1   1   1   1   2   2   2   2   2   3   3   3   3
     .05     0   0   0   1   1   1   2   2   2   2   3   3   4   4   4   4   5   5   5
     .10     0   1   1   2   2   2   3   3   4   4   5   5   5   6   6   7   7   8   8
  3  .001    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   1   1   1   1
     .005    0   0   0   0   0   0   0   1   1   1   2   2   2   3   3   3   3   4   4
     .01     0   0   0   0   0   1   1   2   2   2   3   3   3   4   4   5   5   5   6
     .025    0   0   0   1   2   2   3   3   4   4   5   5   6   6   7   7   8   8   9
     .05     0   1   1   2   3   3   4   5   5   6   6   7   8   8   9  10  10  11  12
     .10     1   2   2   3   4   5   6   6   7   8   9  10  11  11  12  13  14  15  16
  4  .001    0   0   0   0   0   0   0   0   1   1   1   2   2   2   3   3   4   4   4
     .005    0   0   0   0   1   1   2   2   3   3   4   4   5   6   6   7   7   8   9
     .01     0   0   0   1   2   2   3   4   4   5   6   6   7   8   8   9  10  10  11
     .025    0   0   1   2   3   4   5   5   6   7   8   9  10  11  12  12  13  14  15
     .05     0   1   2   3   4   5   6   7   8   9  10  11  12  13  15  16  17  18  19
     .10     1   2   4   5   6   7   8  10  11  12  13  14  16  17  18  19  21  22  23
  5  .001    0   0   0   0   0   0   1   2   2   3   3   4   4   5   6   6   7   8   8
     .005    0   0   0   1   2   2   3   4   5   6   7   8   8   9  10  11  12  13  14
     .01     0   0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17
     .025    0   1   2   3   4   6   7   8   9  10  12  13  14  15  16  18  19  20  21
     .05     1   2   3   5   6   7   9  10  12  13  14  16  17  19  20  21  23  24  26
     .10     2   3   5   6   8   9  11  13  14  16  18  19  21  23  24  26  28  29  31
  6  .001    0   0   0   0   0   0   2   3   4   5   5   6   7   8   9  10  11  12  13
     .005    0   0   1   2   3   4   5   6   7   8  10  11  12  13  14  16  17  18  19
     .01     0   0   2   3   4   5   7   8   9  10  12  13  14  16  17  19  20  21  23
     .025    0   2   3   4   6   7   9  11  12  14  15  17  18  20  22  23  25  26  28
     .05     1   3   4   6   8   9  11  13  15  17  18  20  22  24  26  27  29  31  33
     .10     2   4   6   8  10  12  14  16  18  20  22  24  26  28  30  32  35  37  39
  7  .001    0   0   0   0   1   2   3   4   6   7   8   9  10  11  12  14  15  16  17
     .005    0   0   1   2   4   5   7   8  10  11  13  14  16  17  19  20  22  23  25
     .01     0   1   2   4   5   7   8  10  12  13  15  17  18  20  22  24  25  27  29
     .025    0   2   4   6   7   9  11  13  15  17  19  21  23  25  27  29  31  33  35
     .05     1   3   5   7   9  12  14  16  18  20  22  25  27  29  31  34  36  38  40
     .10     2   5   7   9  12  14  17  19  22  24  27  29  32  34  37  39  42  44  47
  8  .001    0   0   0   1   2   3   5   6   7   9  10  12  13  15  16  18  19  21  22
     .005    0   0   2   3   5   7   8  10  12  14  16  18  19  21  23  25  27  29  31
     .01     0   1   3   5   7   8  10  12  14  16  18  21  23  25  27  29  31  33  35
     .025    1   3   5   7   9  11  14  16  18  20  23  25  27  30  32  35  37  39  42
     .05     2   4   6   9  11  14  16  19  21  24  27  29  32  34  37  40  42  45  48
     .10     3   6   8  11  14  17  20  23  25  28  31  34  37  40  43  46  49  52  55
  9  .001    0   0   0   2   3   4   6   8   9  11  13  15  16  18  20  22  24  26  27
     .005    0   1   2   4   6   8  10  12  14  17  19  21  23  25  28  30  32  34  37
     .01     0   2   4   6   8  10  12  15  17  19  22  24  27  29  32  34  37  39  41
     .025    1   3   5   8  11  13  16  18  21  24  27  29  32  35  38  40  43  46  49
     .05     2   5   7  10  13  16  19  22  25  28  31  34  37  40  43  46  49  52  55
     .10     3   6  10  13  16  19  23  26  29  32  36  39  42  46  49  53  56  59  63
 10  .001    0   0   1   2   4   6   7   9  11  13  15  18  20  22  24  26  28  30  33
     .005    0   1   3   5   7  10  12  14  17  19  22  25  27  30  32  35  38  40  43
     .01     0   2   4   7   9  12  14  17  20  23  25  28  31  34  37  39  42  45  48
     .025    1   4   6   9  12  15  18  21  24  27  30  34  37  40  43  46  49  53  56
     .05     2   5   8  12  15  18  21  25  28  32  35  38  42  45  49  52  56  59  63
     .10     4   7  11  14  18  22  25  29  33  37  40  44  48  52  55  59  63  67  71
 11  .001    0   0   1   3   5   7   9  11  13  16  18  21  23  25  28  30  33  35  38
     .005    0   1   3   6   8  11  14  17  19  22  25  28  31  34  37  40  43  46  49
     .01     0   2   5   8  10  13  16  19  23  26  29  32  35  38  42  45  48  51  54
     .025    1   4   7  10  14  17  20  24  27  31  34  38  41  45  48  52  56  59  63
     .05     2   6   9  13  17  20  24  28  32  35  39  43  47  51  55  58  62  66  70
     .10     4   8  12  16  20  24  28  32  37  41  45  49  53  58  62  66  70  74  79



404 



Quantiles of the Mann-Whitney Test Statistic (continued)

  n  p     m=2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20
 12  .001    0   0   1   3   5   8  10  13  15  18  21  24  26  29  32  35  38  41  43
     .005    0   2   4   7  10  13  16  19  22  25  28  32  35  38  42  45  48  52  55
     .01     0   3   6   9  12  15  18  22  25  29  32  36  39  43  47  50  54  57  61
     .025    2   5   8  12  15  19  23  27  30  34  38  42  46  50  54  58  62  66  70
     .05     3   6  10  14  18  22  27  31  35  39  43  48  52  56  61  65  69  73  78
     .10     5   9  13  18  22  27  31  36  40  45  50  54  59  64  68  73  78  82  87
 13  .001    0   0   2   4   6   9  12  15  18  21  24  27  30  33  36  39  43  46  49
     .005    0   2   4   8  11  14  18  21  25  28  32  35  39  43  46  50  54  58  61
     .01     1   3   6  10  13  17  21  24  28  32  36  40  44  48  52  56  60  64  68
     .025    2   5   9  13  17  21  25  29  34  38  42  46  51  55  60  64  68  73  77
     .05     3   7  11  16  20  25  29  34  38  43  48  52  57  62  66  71  76  81  85
     .10     5  10  14  19  24  29  34  39  44  49  54  59  64  69  75  80  85  90  95
 14  .001    0   0   2   4   7  10  13  16  20  23  26  30  33  37  40  44  47  51  55
     .005    0   2   5   8  12  16  19  23  27  31  35  39  43  47  51  55  59  64  68
     .01     1   3   7  11  14  18  23  27  31  35  39  44  48  52  57  61  66  70  74
     .025    2   6  10  14  18  23  27  32  37  41  46  51  56  60  65  70  75  79  84
     .05     4   8  12  17  22  27  32  37  42  47  52  57  62  67  72  78  83  88  93
     .10     5  11  16  21  26  32  37  42  48  53  59  64  70  75  81  86  92  98 103
 15  .001    0   0   2   5   8  11  15  18  22  25  29  33  37  41  44  48  52  56  60
     .005    0   3   6   9  13  17  21  25  30  34  38  43  47  52  56  61  65  70  74
     .01     1   4   8  12  16  20  25  29  34  38  43  48  52  57  62  67  71  76  81
     .025    2   6  11  15  20  25  30  35  40  45  50  55  60  65  71  76  81  86  91
     .05     4   8  13  19  24  29  34  40  45  51  56  62  67  73  78  84  89  95 101
     .10     6  11  17  23  28  34  40  46  52  58  64  69  75  81  87  93  99 105 111
 16  .001    0   0   3   6   9  12  16  20  24  28  32  36  40  44  49  53  57  61  66
     .005    0   3   6  10  14  19  23  28  32  37  42  46  51  56  61  66  71  75  80
     .01     1   4   8  13  17  22  27  32  37  42  47  52  57  62  67  72  77  83  88
     .025    2   7  12  16  22  27  32  38  43  48  54  60  65  71  76  82  87  93  99
     .05     4   9  15  20  26  31  37  43  49  55  61  66  72  78  84  90  96 102 108
     .10     6  12  18  24  30  37  43  49  55  62  68  75  81  87  94 100 107 113 120
 17  .001    0   1   3   6  10  14  18  22  26  30  35  39  44  48  53  58  62  67  71
     .005    0   3   7  11  16  20  25  30  35  40  45  50  55  61  66  71  76  82  87
     .01     1   5   9  14  19  24  29  34  39  45  50  56  61  67  72  78  83  89  94
     .025    3   7  12  18  23  29  35  40  46  52  58  64  70  76  82  88  94 100 106
     .05     4  10  16  21  27  34  40  46  52  58  65  71  78  84  90  97 103 110 116


.10 


7 


13 


19 


26 


32 


39 


46 


53 


59 


66 


73 


80 


86 


93 


100 


107 


114 


121 


128 


.001 





1 


4 


7 


11 


15 


19 


24 


28 


33 


38 


43 


47 


52 


57 


62 


67 


72 


77 


.005 





3 


7 


12 


17 


22 


27 


32 


38 


43 


48 


54 


59 


65 


71 


76 


82 


88 


93 


18 .01 


1 


5 


10 


15 


20 


25 


31 


37 


42 


48 


54 


60 


66 


71 


77 


83 


89 


95 


101 


.025 


3 


8 


13 


19 


25 


31 


37 


43 


49 


56 


62 


68 


75 


81 


87 


94 


100 


107 


113 


.05 


5 


10 


17 


23 


29 


36 


42 


49 


56 


62 


69 


76 


83 


89 


96 


103 


110 


117 


124 


.10 


7 


14 


21 


28 


35 


42 


49 


56 


63 


70 


78 


85 


92 


99 


107 


114 


121 


129 


136 


.001 





1 


4 


8 


12 


16 


21 


26 


30 


35 


41 


46 


51 


56 


61 


67 


72 


78 


83 


.005 


1 


4 


8 


13 


18 


23 


29 


34 


40 


46 


52 


58 


64 


70 


75 


82 


88 


94 


100 


19 .01 


2 


5 


10 


16 


21 


27 


33 


39 


45 


51 


57 


64 


70 


76 


83 


89 


95 


102 


108 


.025 


3 


8 


14 


20 


26 


33 


39 


46 


53 


59 


66 


73 


79 


86 


93 


100 


107 


114 


120 


.05 


5 


11 


18 


24 


31 


38 


45 


52 


59 


66 


73 


81 


88 


95 


102 


110 


117 


124 


131 


.10 


8 


15 


22 


29 


37 


44 


52 


59 


67 


74 


82 


90 


98 


105 


113 


121 


129 


136 


144 


.001 





1 


4 


8 


13 


17 


22 


27 


33 


38 


43 


49 


55 


60 


66 


71 


77 


83 


89 


.005 


1 


4 


9 


14 


19 


25 


31 


37 


43 


49 


55 


61 


68 


74 


80 


87 


93 


100 


106 


20 .01 


2 


6 


11 


17 


23 


29 


35 


41 


48 


54 


61 


68 


74 


81 


88 


94 


101 


108 


115 


.025 


3 


9 


15 


21 


28 


35 


42 


49 


56 


63 


70 


77 


84 


91 


99 


106 


113 


120 


128 


.05 


5 


12 


19 


26 


33 


40 


48 


55 


63 


70 


78 


85 


93 


101 


108 


116 


124 


131 


139 


.10 


8 


16 


23 


31 


39 


47 


55 


63 


71 


79 


87 


95 


103 


111 


120 


128 


136 


144 


152 



This table was reprinted from Practical Nonparametric Statistics by W.J. Conover, with permission from John Wiley and Sons, Inc., and author L.R. Verdooren. 



405 



Percentage Points of the Duncan New Multiple Range Test 



n2 \ p     2     3     4     5     6     7     8     9    10    12    14    16    18    20    50   100
     1  18.0  18.0  18.0  18.0  18.0  18.0  18.0  18.0  18.0  18.0  18.0  18.0  18.0  18.0  18.0  18.0
     2  6.09  6.09  6.09  6.09  6.09  6.09  6.09  6.09  6.09  6.09  6.09  6.09  6.09  6.09  6.09  6.09
     3  4.50  4.50  4.50  4.50  4.50  4.50  4.50  4.50  4.50  4.50  4.50  4.50  4.50  4.50  4.50  4.50
     4  3.93  4.01  4.02  4.02  4.02  4.02  4.02  4.02  4.02  4.02  4.02  4.02  4.02  4.02  4.02  4.02
     5  3.64  3.74  3.79  3.83  3.83  3.83  3.83  3.83  3.83  3.83  3.83  3.83  3.83  3.83  3.83  3.83
     6  3.46  3.58  3.64  3.68  3.68  3.68  3.68  3.68  3.68  3.68  3.68  3.68  3.68  3.68  3.68  3.68
     7  3.35  3.47  3.54  3.58  3.60  3.61  3.61  3.61  3.61  3.61  3.61  3.61  3.61  3.61  3.61  3.61
     8  3.26  3.39  3.47  3.52  3.55  3.56  3.56  3.56  3.56  3.56  3.56  3.56  3.56  3.56  3.56  3.56
     9  3.20  3.34  3.41  3.47  3.50  3.52  3.52  3.52  3.52  3.52  3.52  3.52  3.52  3.52  3.52  3.52
    10  3.15  3.30  3.37  3.43  3.46  3.47  3.47  3.47  3.47  3.47  3.47  3.47  3.47  3.48  3.48  3.48
    11  3.11  3.27  3.35  3.39  3.43  3.44  3.45  3.46  3.46  3.46  3.46  3.46  3.47  3.48  3.48  3.48
    12  3.08  3.23  3.33  3.36  3.40  3.42  3.44  3.44  3.46  3.46  3.46  3.46  3.47  3.48  3.48  3.48
    13  3.06  3.21  3.30  3.35  3.38  3.41  3.42  3.44  3.45  3.45  3.46  3.46  3.47  3.47  3.47  3.47
    14  3.03  3.18  3.27  3.33  3.37  3.39  3.41  3.42  3.44  3.45  3.46  3.46  3.47  3.47  3.47  3.47
    15  3.01  3.16  3.25  3.31  3.36  3.38  3.40  3.42  3.43  3.44  3.45  3.46  3.47  3.47  3.47  3.47
    16  3.00  3.15  3.23  3.30  3.34  3.37  3.39  3.41  3.43  3.44  3.45  3.46  3.47  3.47  3.47  3.47
    17  2.98  3.13  3.22  3.28  3.33  3.36  3.38  3.40  3.42  3.44  3.45  3.46  3.47  3.47  3.47  3.47
    18  2.97  3.12  3.21  3.27  3.32  3.35  3.37  3.39  3.41  3.43  3.45  3.46  3.47  3.47  3.47  3.47
    19  2.96  3.11  3.19  3.26  3.31  3.35  3.37  3.39  3.41  3.43  3.44  3.46  3.47  3.47  3.47  3.47
    20  2.95  3.10  3.18  3.25  3.30  3.34  3.36  3.38  3.40  3.43  3.44  3.46  3.46  3.47  3.47  3.47
    22  2.93  3.08  3.17  3.24  3.29  3.32  3.35  3.37  3.39  3.42  3.44  3.45  3.46  3.47  3.47  3.47
    24  2.92  3.07  3.15  3.22  3.28  3.31  3.34  3.37  3.38  3.41  3.44  3.45  3.46  3.47  3.47  3.47
    26  2.91  3.06  3.14  3.21  3.27  3.30  3.34  3.36  3.38  3.41  3.43  3.45  3.46  3.47  3.47  3.47
    28  2.90  3.04  3.13  3.20  3.26  3.30  3.33  3.35  3.37  3.40  3.43  3.45  3.46  3.47  3.47  3.47
    30  2.89  3.04  3.12  3.20  3.25  3.29  3.32  3.35  3.37  3.40  3.43  3.44  3.46  3.47  3.47  3.47
    40  2.86  3.01  3.10  3.17  3.22  3.27  3.30  3.33  3.35  3.39  3.42  3.44  3.46  3.47  3.47  3.47
    60  2.83  2.98  3.08  3.14  3.20  3.24  3.28  3.31  3.33  3.37  3.40  3.43  3.45  3.47  3.48  3.48
   100  2.80  2.95  3.05  3.12  3.18  3.22  3.26  3.29  3.32  3.36  3.40  3.42  3.45  3.47  3.53  3.53
     ∞  2.77  2.92  3.02  3.09  3.15  3.19  3.23  3.26  3.29  3.34  3.38  3.41  3.44  3.47  3.61  3.67



*Using special protection levels based on degrees of freedom. 



This table was reprinted from Biometrics, Vol. 11, with the permission of the Biometric Society and author D.B. Duncan. 



406 



Percentage Points of the Studentized Range, q = (x_n - x_1)/s_v 

Upper 10% points 



v \ n 


2 


3 


4 


5 


6 


7 


8 


9 


10 


1 


8-93 


13-44 


16-36 


18-49 


20-15 


21-51 


22-64 


23-62 


24-48 


2 


413 


5-73 


6-77 


7-64 


8-14 


8-63 


905 


9-41 


9-72 


3 


3-33 


4-47 


5-20 


5-74 


616 


6-51 


6-81 


706 


7-29 


4 


301 


3-98 


4-59 


603 


5-39 


5-68 


5-93 


6-14 


6-33 


5 


2-85 


3-72 


4-26 


4-66 


4-98 


6-24 


5-46 


5-65 


6-82 


6 


2-75 


3-56 


407 


4-44 


4-73 


4-97 


617 


5-34 


5-60 


7 


2-68 


3-45 


3-93 


4-28 


4-55 


4-78 


4-97 


514 


5-28 


8 


2-63 


3-37 


3-83 


417 


4-43 


4-65 


4-83 


4-99 


5-13 


9 


2-59 


3-32 


3-76 


4-08 


4-34 


4-54 


4-72 


4-87 


601 


10 


2-56 


3-27 


3-70 


402 


4-26 


4-47 


4-64 


4-78 


4-91 


11 


2-54 


3-23 


3-66 


3-96 


4-20 


4-40 


4-57 


4-71 


4-84 


12 


2-52 


3-20 


3-62 


3-92 


416 


4-35 


4-51 


4-65 


4-78 


13 


2-50 


318 


3-59 


3-88 


412 


4-30 


4-46 


4-60 


4-72 


14 


2-49 


3-16 


3-56 


3-85 


4-08 


4-27 


4-42 


4-66 


4-68 


15 


2-48 


3-14 


3-54 


3-83 


405 


4-23 


4-39 


4-52 


4-64 


16 


2-47 


312 


3-52 


3-80 


403 


4-21 


4-36 


4-49 


4-61 


17 


2-46 


311 


3-50 


3-78 


400 


4-18 


4-33 


4-46 


4-58 


18 


2-45 


3-10 


3-49 


3-77 


3-98 


4-10 


4-31 


4-44 


4-55 


19 


2-45 


309 


3-47 


3-75 


3-97 


4-14 


4-29 


4-42 


4-53 


20 


2-44 


308 


3-46 


3-74 


3-95 


4-12 


4-27 


4-40 


4-51 


24 


2-42 


3-05 


3-42 


3-69 


3-90 


407 


4-21 


4-34 


4-44 


30 


2-40 


3-02 


3-39 


3-65 


3-85 


402 


4-16 


4-28 


4-38 


40 


2-38 


2-99 


3-35 


3-60 


3-80 


3-96 


4-10 


4-21 


4-32 


60 


2-36 


2-96 


3-31 


3-56 


3-75 


3-91 


4-04 


416 


4-25 


120 


2-34 


2-93 


3-28 


3-52 


3-71 


3-86 


3-99 


410 


4-19 


∞ 


2-33 


2-90 


3-24 


3-48 


3-66 


3-81 


3-93 


404 


4-13 



v \ n 


11 


12 


13 


14 


15 


16 


17 


18 


19 


20 


1 


25-24 


25-92 


26-54 


27-10 


27-62 


2810 


28-54 


28-96 


29-35 


29-71 


2 


10-01 


10-26 


10-49 


10-70 


10-89 


1107 


11-24 


11-39 


11-54 


11-68 


3 


7-49 


7-67 


7-83 


7-98 


812 


8-25 


8-37 


8-48 


8-58 


8-68 


4 


6-49 


6-65 


6-78 


6-91 


702 


713 


7-23 


7-33 


7-41 


7-50 


5 


6-97 


610 


6-22 


6-34 


6-44 


6-54 


6-63 


6-71 


6-79 


6-86 


6 


5-64 


5-76 


5-87 


6-98 


607 


616 


6-25 


6-32 


6-40 


6-47 


7 


6-41 


5-53 


5-64 


5-74 


5-83 


5-91 


5-99 


606 


613 


619 


8 


5-25 


5-36 


5-46 


5-56 


5-64 


5-72 


6-80 


5-87 


6-93 


600 


9 


513 


5-23 


6-33 


5-42 


5-51 


5-58 


5-66 


5-72 


6-79 


5-85 


10 


503 


513 


6-23 


5-32 


5-40 


5-47 


5-54 


5-61 


5-67 


5-73 


11 


4-95 


5-05 


515 


5-23 


5-31 


6-38 


5-45 


5-51 


5-57 


5-63 


12 


4-89 


4-99 


6-08 


5-16 


6-24 


5-31 


5-37 


5-44 


6-49 


5-55 


13 


4-83 


4-93 


6-02 


510 


518 


6-25 


5-31 


6-37 


5-43 


5-48 


14 


4-79 


4-88 


4-97 


5-05 


512 


5-19 


5-26 


5-32 


5-37 


5-43 


15 


4-75 


4-84 


4-93 


601 


5-08 


5-15 


5-21 


6-27 


5-32 


5-38 


16 


4-71 


4-81 


4-89 


4-97 


604 


611 


6-17 


5-23 


5-28 


5-33 


17 


4-68 


4-77 


4-86 


4-93 


501 


507 


5-13 


6-19 


5-24 


5-30 


18 


4-65 


4-75 


4-83 


4-90 


4-98 


5-04 


510 


516 


5-21 


5-28 


19 


4-63 


4-72 


4-80 


4-88 


4-95 


501 


507 


513 


5-18 


5-23 


20 


4-61 


4-70 


4-78 


4-85 


4-92 


4-99 


505 


5-10 


616 


5-20 


24 


4-54 


4-63 


4-71 


4-78 


4-85 


4-91 


4-97 


502 


507 


512 


30 


4-47 


4-56 


4-64 


4-71 


4-77 


4-83 


4-89 


4-94 


4-99 


603 


40 


4-41 


4-49 


4-56 


4-63 


4-69 


4-75 


4-81 


4-86 


4-90 


4-95 


60 


4-34 


4-42 


4-49 


4-56 


4-62 


4-67 


4-73 


4-78 


4-82 


4-86 


120 


4-28 


4-35 


4-42 


4-48 


4-54 


4-60 


4-65 


4-69 


4-74 


4-78 


∞ 


4-21 


4-28 


4-35 


4-41 


4-47 


4-52 


4-67 


4-61 


4-65 


4-69 



n: size of sample from which range obtained. v: degrees of freedom of independent s_v. 
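
The statistic tabulated above can be computed directly from its definition. The following is a minimal sketch in Python, not part of the original library; the function name `studentized_range` and the sample figures are hypothetical, and `s_v` is assumed to be an independent standard-deviation estimate on v degrees of freedom supplied by the caller.

```python
def studentized_range(sample, s_v):
    """Return q = (x_n - x_1) / s_v.

    sample : the n observations (or group means) whose range is taken
    s_v    : independent estimate of their standard deviation,
             based on v degrees of freedom (assumed supplied by the caller)
    """
    if len(sample) < 2:
        raise ValueError("need at least two values to form a range")
    if s_v <= 0:
        raise ValueError("s_v must be positive")
    # x_n is the largest value and x_1 the smallest
    return (max(sample) - min(sample)) / s_v


# Hypothetical example: three group means with s_v = 1.2 on v = 10 d.f.
means = [10.0, 12.0, 15.5]
q = studentized_range(means, 1.2)
print(round(q, 4))
```

The resulting q would then be compared with the tabulated upper percentage point for the same n and v at the chosen level.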



407 



Percentage Points of the Studentized Range, q = (x_n - x_1)/s_v (continued) 

Upper 5% points 



v \ n 


2 


3 


4 


5 


6 


7 


8 


9 


10 


1 


17-97 


26-98 


32-82 


3708 


40-41 


4312 


45-40 


47-36 


4907 


2 


6-08 


8-33 


9-80 


10-88 


11-74 


12-44 


1303 


13-54 


13-99 


3 


4-50 


5-91 


6-82 


7-50 


804 


8-48 


8-85 


918 


9-46 


4 


3-93 


504 


5-76 


6-29 


6-71 


7-05 


7-35 


7-60 


7-83 


5 


3-64 


4-60 


5-22 


5-67 


6-03 


6-33 


6-58 


6-80 


6-99 


6 


3-46 


4-34 


4-90 


5-30 


8-63 


5-90 


6-12 


6-32 


6-49 


7 


3-34 


4-16 


4-68 


508 


5-36 


5-61 


5-82 


6-00 


616 


8 


3-26 


4-04 


4-53 


4-89 


5-17 


5-40 


5-60 


5-77 


5-92 


9 


3-20 


3-95 


4-41 


4-76 


502 


5-24 


5-43 


5-59 


5-74 


10 


3-15 


3-88 


4-33 


4-63 


4-91 


5-12 


5-30 


5-46 


5-60 


11 


3-11 


3-82 


4-26 


4-57 


4-82 


5-03 


5-20 


5-35 


5-49 


12 


3-08 


3-77 


4-20 


4-51 


4-75 


4-95 


512 


5-27 


5-39 


13 


306 


3-73 


415 


4-45 


4-69 


4-88 


505 


5-19 


5-32 


14 


3-03 


3-70 


4-11 


4-41 


4-64 


4-83 


4-99 


5-13 


5-25 


15 


301 


3-67 


4-08 


4-37 


4-59 


4-78 


4-94 


5-08 


5-20 


16 


300 


3-65 


405 


4-33 


4-56 


4-74 


4-90 


5-03 


5-15 


17 


2-98 


3-63 


402 


4-30 


4-52 


4-70 


4-86 


4-99 


511 


18 


2-97 


3-61 


400 


4-23 


4-49 


4-67 


4-82 


4-96 


507 


19 


2-96 


3-59 


3-98 


4-25 


4-47 


4-65 


4-79 


4-92 


504 


20 


2-95 


3-58 


396 


4-23 


4-45 


4-62 


4-77 


4-90 


501 


24 


2-92 


3-53 


3-90 


4-17 


4-37 


4-54 


4-68 


4-81 


4-92 


30 


2-89 


3-49 


3-85 


4-10 


4-30 


4-46 


4-60 


4-72 


4-82 


40 


2-86 


3-44 


3-79 


4-04 


4-23 


4-39 


4-52 


4-63 


4-73 


60 


2-83 


3-40 


3-74 


3-98 


416 


4-31 


4-44 


4-55 


4-65 


120 


2-80 


3-36 


3-68 


3-92 


410 


4-24 


4-36 


4-47 


4-56 


∞ 


2-77 


3-31 


3-63 


3-86 


403 


4-17 


4-29 


4-39 


4-47 



v \ n 


11 


12 


13 


14 


15 


16 


17 


18 


19 


20 


1 


50-59 


51-96 


53-20 


54-33 


55-36 


56-32 


57-22 


58-04 


58-83 


59-56 


2 


14-39 


14-75 


1508 


15-38 


15-65 


15-91 


16-14 


16-37 


16-57 


16-77 


3 


9-72 


9-95 


1015 


10-35 


10-52 


10-69 


10-84 


10-98 


11-11 


11-24 


4 


803 


8-21 


8-37 


8-52 


8-66 


8-79 


8-91 


903 


913 


9-23 


5 


7-17 


7-32 


7-47 


7-60 


7-72 


7-83 


7-93 


803 


812 


8-21 


6 


6-65 


6-79 


6-92 


703 


7-14 


7-24 


7-34 


7-43 


7-51 


7-59 


7 


6-30 


6-43 


6-55 


6-66 


6-76 


6-85 


6-94 


7-02 


7-10 


7-17 


8 


605 


6-18 


6-29 


6-39 


6-48 


6-57 


6-65 


6-73 


6-80 


6-87 


9 


5-87 


5-98 


609 


6-19 


6-28 


6-36 


6-44 


6-51 


6-58 


6-64 


10 


5-72 


6-83 


5-93 


603 


6-11 


6-19 


6-27 


6-34 


6-40 


6-47 


11 


5-61 


5-71 


5-81 


5-90 


5-98 


606 


6-13 


6-20 


6-27 


6-33 


12 


6-51 


5-61 


5-71 


5-80 


5-88 


6-95 


602 


609 


6-15 


6-21 


13 


5-43 


6-53 


5-63 


6-71 


5-79 


6-86 


5-93 


5-99 


605 


6-11 


14 


5-36 


6-46 


5-55 


5-64 


5-71 


5-79 


5-85 


5-91 


5-97 


603 


15 


5-31 


5-40 


5-49 


6-57 


5-65 


5-72 


5-78 


5-85 


5-90 


6-96 


16 


5-26 


6-35 


6-44 


5-52 


5-59 


6-66 


6-73 


5-79 


5-84 


5-90 


17 


5-21 


5-31 


539 


5-47 


5-54 


5-61 


5-67 


6-73 


5-79 


5-84 


18 


5-17 


5-27 


5-35 


5-43 


5-50 


5-57 


5-63 


5-69 


5-74 


6-79 


19 


5-14 


6-23 


5-31 


5-39 


5-46 


5-53 


5-59 


6-65 


5-70 


6-75 


20 


5-11 


5-20 


5-28 


5-36 


5-43 


5-49 


5-55 


5-61 


5-66 


6-71 


24 


501 


6-10 


5-18 


5-25 


5-32 


5-38 


5-44 


5-49 


5-55 


5-59 


30 


4-92 


500 


5-08 


5-15 


5-21 


6-27 


5-33 


6-38 


5-43 


6-47 


40 


4-82 


4-90 


4-98 


504 


5-11 


516 


5-22 


6-27 


5-31 


6-36 


60 


4-73 


4-81 


4-88 


4-94 


500 


606 


6-11 


5-15 


5-20 


5-24 


120 


4-64 


4-71 


4-78 


4-84 


4-90 


4-95 


5-00 


504 


509 


6-13 


∞ 


4-55 


4-62 


4-68 


4-74 


4-80 


4-86 


4-89 


4-93 


4-97 


6-01 



n: size of sample from which range obtained. v: degrees of freedom of independent s_v. 



408 



Percentage Points of the Studentized Range, q = (x_n - x_1)/s_v (continued) 

Upper 1% points 



v \ n 


2 


3 


4 


5 


6 


7 


8 


9 


10 


1 


9003 


1350 


164-3 


1856 


202-2 


215-8 


227-2 


2370 


245-6 


2 


1404 


1902 


22-29 


24-72 


26-63 


28-20 


29-53 


30-68 


31-69 


3 


8-26 


10-62 


1217 


13-33 


14-24 


15-00 


15-64 


16-20 


16-69 


4 


6-51 


8-12 


9-17 


9-96 


10-58 


1110 


11-55 


11-93 


12-27 


5 


5-70 


6-98 


7-80 


8-42 


8-91 


9-32 


9-67 


9-97 


10-24 


6 


5-24 


6-33 


703 


7-56 


7-97 


8-32 


8-61 


8-87 


9-10 


7 


4-95 


5-92 


6-54 


7-01 


7-37 


7-68 


7-94 


8-17 


8-37 


8 


4-75 


5-64 


6-20 


6-62 


6-96 


7-24 


7-47 


7-68 


7-86 


9 


4-60 


5-43 


5-96 


6-35 


6-66 


6-91 


713 


7-33 


7-49 


10 


4-48 


5-27 


5-77 


6-14 


6-43 


6-67 


6-87 


7-05 


7-21 


11 


4-39 


5-15 


5-62 


5-97 


6-25 


6-48 


6-67 


6-84 


6-99 


12 


4-32 


5-05 


5-50 


5-84 


6-10 


6-32 


6-51 


6-67 


6-81 


13 


4-26 


4-96 


5-40 


5-73 


5-98 


6-19 


6-37 


6-53 


6-07 


14 


4-21 


4-89 


5-32 


5-63 


5-88 


608 


6-26 


6-41 


6-54 


15 


417 


4-84 


6-25 


5-56 


5-80 


5-99 


6-16 


6-31 


6-44 


16 


413 


4-79 


5-19 


5-49 


5-72 


5-92 


6-08 


6-22 


6-35 


17 


410 


4-74 


5-14 


5-43 


5-66 


5-85 


601 


6-15 


6-27 


18 


4-07 


4-70 


509 


5-38 


6-60 


5-79 


5-94 


6-08 


6-20 


19 


4-05 


4-67 


505 


5-33 


5-55 


5-73 


6-89 


6-02 


6-14 


20 


402 


4-64 


5-02 


5-29 


5-51 


5-69 


6-84 


5-97 


609 


24 


3-96 


4-55 


4-91 


517 


5-37 


5-54 


5-69 


5-81 


5-92 


30 


3-89 


4-45 


4-80 


505 


5-24 


5-40 


5-54 


5-65 


5-76 


40 


3-82 


4-37 


4-70 


4-93 


511 


5-26 


5-39 


5-50 


5-60 


60 


3-76 


4-28 


4-59 


4-82 


4-99 


513 


6-25 


5-36 


5-45 


120 


3-70 


4-20 


4-50 


4-71 


4-87 


5-01 


612 


5-21 


5-30 


∞ 


3-64 


4-12 


4-40 


4-60 


4-76 


4-88 


4-99 


508 


5-16 



v \ n 


11 


12 


13 


14 


15 


16 


17 


18 


19 


20 


1 


253-2 


2600 


266-2 


271-8 


277-0 


281-8 


286-3 


290-4 


294-3 


2980 


2 


32-59 


33-40 


3413 


34-81 


35-43 


3600 


36-53 


•3703 


37-50 


37-95 


3 


1713 


17-53 


17-89 


18-22 


18-52 


18-81 


19-07 


19-32 


19-55 


19-77 


4 


12-57 


12-84 


1309 


13-32 


13-53 


13-73 


13-91 


14-08 


14-24 


14-40 


5 


10-48 


10-70 


10-89 


11-08 


11-24 


11-40 


11-55 


11-68 


11-81 


11-93 


6 


9-30 


9-48 


9-65 


9-81 


9-95 


1008 


10-21 


10-32 


10-43 


10-54 


7 


8-55 


8-71 


8-86 


900 


9-12 


9-24 


9-35 


946 


9-55 


9-65 


8 


8-03 


8-18 


8-31 


8-44 


8-55 


8-66 


8-76 


8-85 


8-94 


903 


9 


7-65 


7-78 


7-91 


803 


8-13 


8-23 


8-33 


8-41 


8-49 


8-57 


10 


7-36 


7-49 


7-60 


7-71 


7-81 


7-91 


7-99 


8-08 


815 


8-23 


11 


713 


7-25 


7-36 


7-46 


7-56 


7-65 


7-73 


7-81 


7-88 


7-95 


12 


6-94 


706 


7-17 


7-26 


7-36 


7-44 


7-52 


7-59 


7-66 


7-73 


13 


6-79 


6-90 


7-01 


710 


7-19 


7-27 


7-35 


7-42 


7-48 


7-55 


14 


6-66 


6-77 


6-87 


6-96 


7-05 


713 


7-20 


7-27 


7-33 


7-39 


15 


6-55 


6-66 


6-76 


6-34 


6-93 


700 


7-07 


714 


7-20 


7-26 


16 


6-46 


6-56 


6-66 


6-74 


6-82 


6-90 


6-97 


703 


7-09 


715 


17 


6-38 


6-48 


6-57 


6-66 


6-73 


6-81 


6-87 


6-94 


7-00 


7-05 


18 


6-31 


641 


6-50 


6-58 


6-65 


6-73 


6-79 


6-85 


6-91 


6-97 


19 


6-25 


6-34 


643 


6-51 


6-58 


6-65 


6-72 


6-78 


6-84 


6-89 


20 


6-19 


6-28 


6-37 


6-45 


6-52 


6-59 


6-65 


6-71 


6-77 


6-82 


24 


602 


6-11 


6-19 


6-26 


6-33 


6-39 


6-45 


6-51 


6-56 


6-61 


30 


5-85 


5-93 


601 


6-08 


614 


6-20 


6-26 


631 


6-36 


6-41 


40 


5-69 


5-76 


5-83 


5-90 


5-96 


602 


607 


6-12 


616 


6-21 


60 


5-53 


5-60 


5-67 


5-73 


5-78 


5-84 


5-89 


5-93 


6-97 


601 


120 


5-37 


5-44 


5-50 


5-56 


5-61 


5-66 


5-71 


5-75 


5-79 


5-83 


∞ 


5-23 


5-29 


5-35 


5-40 


5-45 


5-49 


5-54 


5-57 


5-61 


5-65 



This table was reprinted from Biometrika Tables for Statisticians, Vol. 1, 3rd Edition, Table 29, with the permission of the Biometrika Trustees. 



The Normal Probability Function 



409 



The integral P(X) and ordinate Z(X) in terms of the standardized deviate X 



P(X) 



■00 
01 
•02 
•03 

•04 
•06 

•06 
■07 
■08 
■09 
•10 

■11 
•It 

•IS 

•14 
•16 

•16 
■17 
■18 
■19 
•SO 

•tl 
•It 

■ts 

Si 
S6 

S6 
S7 
S8 
S9 
■SO 

•SI 
•31 
S3 

■S4 
■35 

■S6 
37 
■38 
•39 
■40 

•41 
■4* 
■43 

■44 
■45 

•46 
■47 
■48 
■49 
■60 



•5000000 
•5039S94 
■6079783 
•5119665 
•5159534 
•6199388 

•6239222 
•5279032 
•5318814 
•5358564 
•5398278 

•6437953 
•5477684 
•5517168 
•6556700 
•5596177 

•5635595 
•5674949 
•5714237 
•5753454 
•6792597 

■6831662 
•5870644 
•5909541 
•5948349 
■5987063 

•6025681 
•6064199 
•6102612 
•6140919 
•6179114 

■6217195 
•6255153 
•6293000 
•6330717 
•6368307 

•6405764 
•6443088 
•6480273 
•6517317 
■6654217 

•6590970 
•6627573 
•6664022 
•6700314 
•6730448 

•6772419 
■6808225 
•6843863 
•6879331 
•6914625 



6 

+ 



39894 
39890 
39882 
39870 
39854 
39834 

39810 
39782 
39750 
39714 
39075 

39631 
39584 
39532 
39477 
39418 

39355 
39288 
39217 
39143 
39065 

38983 
38897 
38808 
38715 
38618 

38518 
38414 
38306 
38195 
38081 

379C3 
37842 
37717 
37589 
37468 

37323 
37185 
37044 
36900 
36753 

36602 
36449 
36293 
36133 
35971 

35806 
35638 
35467 
35294 





4 

8 

12 

16 

20 

24 

28 
32 
36 
40 

44 
48 
51 
65 
69 

63 

67 
71 

74 
78 



86 
89 
93 
97 

100 
104 
107 
111 
114 

118 
121 
125 
128 
131 

135 
138 
141 
144 
147 

160 
153 
156 
169 
162 

165 
168 
171 
173 
176 



Z(X) 



•3989423 
•3989223 
•3988625 
•3987628 
•3986233 
■3984439 

■3982248 
•3979661 
•3976677 
•3973298 
•3969525 

•3965360 
•3960802 
•3956854 
•3950517 
•3944793 

•3938684 
•3932190 
■3925315 
•3918060 
■3910427 

•3902419 
•3894038 
•3885286 
•3876166 
■38G6681 

•3856834 
■3846627 
•3836063 
•3825146 
•3813878 

•38022C4 
•3790305 
•3778007 
•3705372 
•3752403 

■3739106 
•3725483 
•3711539 
•3697277 
■3682701 

■3667817 
■3652627 
■3637136 
■3621349 
•3605270 

•3588903 
•3572253 
•3565325 
•3538124 
•3520653 



199 

598 

997 

1395 

1793 

2191 

2588 
2984 
3379 
3773 

416G 

4558 
4948 
5337 
5724 
6110 

6493 
C875 
7255 
7633 

8008 

8381 
8752 
9120 
9485 
9847 

10207 
10564 
10917 
11268 
11615 

11 958 

12298 
12635 
12968 
13297 

13623 
13944 
14262 
14575 
14886 

15190 
15491 
15787 
16079 
16367 

16650 
16928 
17202 
17470 



P 



399 
399 
399 
398 
398 
397 

397 
39G 
395 
394 
393 

392 
390 
389 
387 
386 

384 
382 
380 
378 
375 

373 
371 
368 
365 
362 



357 
354 
350 
347 

344 
340 
337 
333 
329 

325 
322 
318 
313 
309 

305 
301 
296 
292 
288 



278 
274 
269 
264 



■50 
■51 
■5t 
■53 
■54 
■55 

■56 
■57 
■58 
■59 

■60 

■61 
■6S 
•6S 
■64 
■65 

•66 
■67 
■68 
■69 
10 

•71 
•7t 
•73 
14 
•75 

•76 
•77 
•78 

■79 
■80 

■81 
■8S 
■83 
■84 
■85 

■86 

•87 
■88 
■89 
■90 

■91 
■91 
■93 

■94 
•95 

•96 
■97 
•98 
99 
1-00 



P(X) 



•6914625 
•6949743 
•6984682 
•7019440 
•7054015 
•71)88403 

•7122603 
•7156612 
•7190-127 
•7224047 
•7257469 

•7290691 
•7323711 
•7356527 
•7389137 
■7421539 

•7453731 
•7485711 
•7517478 
•7549029 
•7580363 

•7611479 
•7642375 
7673049 
•7703500 
•7733726 

•7763727 
•7793501 

•7823046 
•7852361 
•7881446 

•7910299 
•7938919 
•7967306 
■7995458 
•8023375 

•8051055 
■8078498 
•8105703 
•8132671 
•8159399 

■8185887 
•6212136 
■8238145 
•8263912 
■6289439 

■8314724 
•8339768 
■8364569 
•8389129 
•8413447 



i 

+ 



35118 
34939 
34758 
34574 
34388 
34200 

34009 
33815 
33620 
33422 
33222 

33020 
32816 
32610 
32402 
32192 

31930 
31767 
31551 
31334 
31116 

30896 
30674 
30451 
30226 
30001 

29773 
29545 
29316 
29085 
28853 

28620 
28387 
28152 
27917 
27680 

27443 
27205 
26967 
26728 
26489 

26249 
26008 
25768 
25527 
25285 

26044 
24802 
24560 
24318 



176 
179 
181 
184 
186 
189 

191 
193 
196 
198 
200 

202 
204 
206 
208 
210 

212 

214 
215 
217 
219 

220 
222 
223 
225 

226 

227 
228 
230 
231 
232 

233 

234 
235 
235 
236 

237 
238 
238 
239 
239 

240 
240 
241 
241 
241 

242 
242 
242 

242 
242 



Z(X) = e^{-X^2/2} / \sqrt{2\pi},   P(X) = 1 - Q(X) = \int_{-\infty}^{X} Z(u)\,du. 
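
The ordinate and integral defined above can be evaluated directly, which gives a quick check on individual table entries. This is a minimal sketch in Python using only the standard library; the function names `Z`, `P` and `Q` simply mirror the table's notation and are not taken from the original programs.

```python
import math


def Z(x):
    """Ordinate of the standard normal curve at deviate x."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def P(x):
    """Integral of Z(u) from minus infinity to x (lower-tail area)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def Q(x):
    """Upper-tail area, so that P(x) = 1 - Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


# Spot checks against tabulated seven-decimal entries
for x in (0.0, 0.5, 1.0, 1.5):
    print(f"{x:4.2f}  P = {P(x):.7f}  Z = {Z(x):.7f}")
```

Using `erfc` for the upper tail, rather than computing 1 - P(x), avoids loss of precision when x is large.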



410 



The Normal Probability Function (continued) 



Z(X) 



•3520653 
•3502919 
■3484925 
•3466677 
•3448180 
■3429439 

•3410458 
■3391243 
•3371799 
-3352132 
•3332246 

•3312147 
•3291840 
•3271330 
•3250623 
•3229724 

•3208638 
•3187371 
•3165929 
3144317 
•3122539 

•3100603 
•3078513 
•3056274 
•3033893 
•3011374 

■2988724 
■2965948 
•2943050 
2920038 
•2896916 

•2873689 
•2850364 
•2826945 
■2803438 
■2779849 

•2756182 
•2732444 
•2708640 
•2684774 
•2660852 

•2636880 
•2612863 
•2588805 
•2564713 
■2540591 

•2516443 
•2492277 
•2468095 
•2443904 
■2419707 



17734 
17994 

18248 
18497 
18741 
18981 

19215 
19444 
19667 
19886 
20099 

20307 
20510 
20707 
20899 
21086 

21267 
21442 
21613 
21777 
21936 

22090 
22239 
22381 
22519 
22650 

22777 
22897 
23013 
23122 
23227 

23325 
23419 
23507 
23589 
23666 

23738 
23805 
23866 
23922 
83972 

24017 
24058 
24093 
24122 
24147 

24167 
24182 
24191 
24196 



264 
259 
254 
249 
244 
239 

234 
229 
224 
219 
213 

208 
203 
197 
192 
187 

181 
176 
170 
165 
159 

154 
148 
143 
137 
132 

126 
121 
115 
110 
104 

99 
93 

88 
63 

77 

72 
66 
61 
56 
51 

45 
40 
35 
30 
25 

20 

15 

10 

5 





100 
1-01 
102 
V03 

1-04 
V05 

106 
1-07 
108 
109 
110 

111 
lit 
IIS 
1H 
115 

1-16 
1-17 
1-18 
119 

ISO 

1*1 

vn 

1-23 
V24 
1-25 

1-26 
1-27 
128 
1-29 
ISO 

1-31 
1-32 
133 
134 
1-35 

1-36 
1ST 
138 
1-39 
Vlfi 

11,1 

1-lfi 
143 

l-U 

1-45 

1-46 

1-47 
1-48 
1-49 
1-50 



P(X) 



■8413447 
•8437524 
•8461358 
•8484950 
•8508300 
•8531409 

•8554277 
■8576903 
■8599289 
■8621434 
•8643339 

•8665005 
•8686431 
•8707619 
•8728568 
■8749281 

■8769756 
•8789995 
•8809999 
•8829768 
■8849303 

■8868606 
•8887676 
•8906514 
•8925123 
•8943502 

•8961653 
•8979577 
•8997274 
•9014747 
•9031995 

•9049021 
•9065825 
•9082409 
•9098773 
•9114920 

•9130850 
•9146565 
•9162067 
•91773515 
■9192433 

■9207302 
•9221962 
•9236415 
•9250663 
•9264707 

•9278550 
■9292191 
■9305634 
■9318879 
•9331928 



6 

•+ 



24076 
23834 
23592 
23351 
23109 
22868 

22626 
22386 
22145 
21905 
21665 

21426 
21188 

20950 
20712 

20475 

20239 
20004 
19769 
19535 
19302 

19070 
18839 
18609 
18379 
18151 

17924 
17697 
17472 
17248 
17026 

16804 
16584 
16365 
16147 
15930 

15715 
15501 
15289 
15078 
14868 

14660 
14453 
14248 
14044 
13842 

13642 
13443 
13245 
13049 



δ² 



242 
242 
242 
242 
242 
241 

241 
241 
240 
240 
240 

239 
239 
238 
237 
237 

236 
235 
235 
234 
233 

232 
231 

230 

229 
228 

227 
226 
225 
224 
223 

222 
220 
219 
218 
217 

215 
214 
212 
211 
210 

208 
207 
205 
204 
202 

201 
199 
197 
196 
194 



Z(X) 



•2419707 
•2395511 
•2371320 
•2347138 
•2322970 
■2298821 

•2274696 
•2250599 
•2226535 
•2202508 
•2178522 

2154582 
2130691 
2106856 
2083078 
2059363 

2035714 
2012135 
1988631 
1965205 
1941861 

1918602 
1895432 
1872354 
1849373 
1826491 

1803712 
1781038 
1758474 
1736022 
1713686 

1691468 
1669370 
1647397 
1625551 
1603833 

1582248 
1560797 
1539483 
1518308 
1497275 

1476385 
1455641 
1435046 
1414600 
1394306 

1374165 
1354181 
1334353 
1314684 
1295176 



24196 
24191 
24182 
24168 
24149 
24126 

24097 
24064 
24027 
23986 
23940 

23890 
23836 
23778 
23715 
23649 

23578 
23504 
23426 
23344 
23259 

23170 

23077 
22981 
22882 
22779 

22673 
22564 
22452 
22337 

22218 

22097 
21973 
21847 
21717 
21585 

21451 
21314 
21175 
21033 

20890 

20744 
20596 
20446 
20294 
20140 

19985 
19828 
19669 
19508 



+ 


5 

10 
14 
19 
24 

28 
33 
37 
41 
46 

00 
54 
68 
62 
66 

70 
74 
78 
82 
85 

89 
93 

96 

99 

103 

106 
109 
112 
115 
118 

121 
124 
127 
129 
132 

134 
137 
139 
142 
144 

146 
148 
150 
152 
164 

155 
157 
159 
160 
162 



Note sign of second difference, δ². 
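
Tabulated second differences are typically used to interpolate between entries. The sketch below applies Everett's formula, one common scheme for this; that the manual intends this particular formula is an assumption, and the function name `everett` is hypothetical. The sign of each second difference must be supplied by the caller: for P(X) with X > 0 the curve is concave, so the differences are entered as negative here.

```python
import math


def everett(p, f0, f1, d2_0, d2_1):
    """Everett interpolation to second differences.

    p    : fraction of the tabular interval past the first entry (0 <= p <= 1)
    f0   : tabulated value at the first entry
    f1   : tabulated value at the next entry
    d2_0 : signed second difference at the first entry
    d2_1 : signed second difference at the next entry
    """
    q = 1.0 - p
    return (q * f0 + q * (q * q - 1.0) / 6.0 * d2_0
            + p * f1 + p * (p * p - 1.0) / 6.0 * d2_1)


# P(1.00) and P(1.01) from the table; second differences are in units of
# the seventh decimal and are negative because P(X) is concave for X > 0.
estimate = everett(0.5, 0.8413447, 0.8437524, -242e-7, -242e-7)
exact = 0.5 * (1.0 + math.erf(1.005 / math.sqrt(2.0)))
print(f"interpolated P(1.005) = {estimate:.7f}, direct = {exact:.7f}")
```

Entering a second difference with the wrong sign moves the correction in the wrong direction, which is why the sign matters when reading the δ² columns.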



The Normal Probability Function (continued) 



411 



X 


P(X) 


S 

+ 


δ² 


Z(X) 


1*0 


•9331928 


12855 
12662 
12471 
1228S 
12094 
11908 


194 


•1295176 


1*1 


■9344783 


193 


•1275830 


l*t 


-9357445 


191 


•1256646 


1*3 


-9369916 


189 


•1237628 


1*4 


■9382198 


188 


•1218775 


1-55 


■9394292 


186 


•1200090 


1*6 


■9406201 


11724 
11541 
11360 
11181 
11004 


184 


•1181573 


1*7 


■9417924 


183 


•1163225 


1*3 


■9429466 


181 


•1145048 


1*9 


■9440826 


179 


•1127042 


1*0 


•9452007 


177 


•1109208 


1*1 


•9463011 


10828 
10654 
10482 
10311 
10142 


176 


•1091548 


l*t 


•9473839 


174 


•1074061 


1*3 


■9484493 


172 


•1056748 


1*4 


■9494974 


170 


•1039611 


1*5 


■9605285 


169 


■1022649 


1*6 


■9515428 


9975 
9810 
9647 
9485 
9326 


167 


•1005864 


1*7 


■9525403 


165 


■0989255 


1*3 


■9535213 


163 


•0972823 


1*9 


■9544860 


162 


■0956568 


1-70 


■9554345 


160 


-0940491 


1-71 


■9563671 


9167 
9011 

8856 
8704 
8553 


158 


■0924591 


17t 


-9572838 


156 


■0908870 


1-73 


■9581849 


155 


•0893326 


1-74 


•9590705 


153 


■0877961 


1-76 


-9599408 


151 


■0862773 


1-76 


■9607961 


8403 

8256 
6110 
7966 
7824 


149 


■0847764 


ITT 


■9616364 


147 


-0832932 


178 


•9624620 


146 


•0818278 


1-79 


•9632730 


144 


■0803801 


1*0 


■9640697 


142 


■0789502 


181 


-9648521 


7684 
7545 
7409 
7273 
7140 


140 


■0775379 


V8t 


•9656205 


139 


•0761433 


1*3 


•9663750 


137 


0747663 


1*4 


•9671159 


135 


•0734068 


1*5 


•9678432 


133 


O720C49 


186 


■9685572 


7009 
6879 
6751 
6624 
6500 


132 


■0707404 


1*7 


■9692581 


130 


0694333 


1*8 


■9699460 


128 


0681436 


1*9 


•9706210 


126 


•0668711 


V90 


•9712834 


125 


0656158 


V91 


•9719334 


6377 
6255 
6136 
6018 
5902 


123 


0643777 


1-99 


•9725711 


121 


0631566 


V93 


■9731966 


120 


0619524 


1*4 


•9738102 


118 


0607602 


1-95 


-9744119 


116 


0595947 


1-96 


9750021 


6787 
6674 
6563 
5453 


115 


0584409 


1-97 


•9755808 


113 


0573038 


1-98 


•9761482 


in 


0561831 


1-99 


•9767045 


110 


0550789 


too 


•9772499 


108 


0539910 



19346 
19183 
19018 
18853 
18685 
18517 

18348 

18177 
18006 
17834 
17661 

17487 
17312 
17137 
16962 
16786 

16609 
16432 

16255 
16077 
15899 

15722 
15544 
15366 
15188 
15010 

14832 
14654 
14477 
14300 
14123 

1394ft 
13770 
13594 
13419 
13245 

13071 
12897 
12725 
12553 
12382 

12211 
12041 
11873 
11705 
11538 

11372 
11206 
11042 
10879 



+ 



162 
163 
165 
166 
167 
168 

169 
170 
171 
172 
173 

174 
174 
175 
176 
176 

177 
177 
177 
178 
178 

178 
178 
178 
178 
178 

178 
178 
177 
177 
177 

176 
176 
176 
175 
175 

174 
173 
173 
172 
171 

170 
170 
169 
168 
167 

166 
165 
164 
163 
162 



X    P(X)    δ +    δ²


too 


■9772499 


6345 
5239 
5134 
6031 
4929 
4829 


108 


toi 


•9777844 


106 


tot 


•9783083 


105 


tos 


•9788217 


103 


104 


•9793248 


102 


tos 


■9798178 


100 


t-06 


■9803007 


4731 
4634 
4539 
4445 
4352 


98 


tort 


•9807738 


97 


t-08 


•9812372 


95 


t-09 


•9816911 


94 


t-io 


•9821356 


92 


til 


•9825708 


4262 
4172 
4084 
3998 
3913 


91 


tit 


•9829970 


89 


113 


•9834142 


88 


t-14 


■9838226 


86 


tis 


•9842224 


85 


tie 


■9846137 


3829 
3747 
3666 
3587 
3509 


84 


tn 


■9849966 


82 


t-18 


•9853713 


81 


tl9 


•9857379 


79 


tto 


-9860966 


78 


ttl 


-9864474 


3432 
3357 
3283 
3210 
3138 


77 


ttt 


•9867906 


75 


tts 


•9871263 


74 


t*4 


•9874545 


73 


tts 


•9877755 


71 


t-t6 


■9880894 


3068 
2999 
2932 
2865 
2800 


70 


tn 


■9883962 


69 


tts 


■9886962 


68 


t-29 


■9889893 


66 


t*o 


■9892759 


65 


t*l 


•9895559 


2736 
2674 
2612 
2552 
2492 


64 


tst 


■9898296 


63 


1*3 


■9900969 


62 


t*4 


■9903581 


60 


t*5 


•9906133 


59 


t*6 


•9908625 


2434 
2377 
2321 
2267 
2213 


68 


t*7 


■9911060 


67 


t*8 


-9913437 


56 


$39 


•9915758 


65 


t-40 


•9918025 


54 


t-41 


•9920237 


2160 
2108 
2058 
2008 
1960 


63 


t'42 


•9922397 


52 


t-43 


•9924506 


51 


t-44 


•9926564 


50 


t-45 


•9928572 


49 


t-46 


"9930531 


1912 
1865 
1820 
1775 


48 


247 


•9932443 


47 


S-48 


■9934309 


46 


t-49 


■9936128 


45 


t*0 


■9937903 


44 



$Z(X) = e^{-X^2/2}/\sqrt{2\pi}$, $P(X) = 1 - Q(X) = \int_{-\infty}^{X} Z(u)\,du$.
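The two definitions above can be checked numerically. The following is a minimal sketch, assuming a modern Python interpreter (not part of the original library); the function names `Z`, `P` and `Q` are chosen here only to mirror the table headings, and `P` is evaluated through the standard error function rather than by direct integration.

```python
import math

def Z(x):
    """Ordinate of the standard normal curve, exp(-x^2/2) / sqrt(2*pi)."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

def P(x):
    """Area under Z from -infinity to x, written in terms of erf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def Q(x):
    """Upper-tail area, Q(x) = 1 - P(x); erfc avoids cancellation for large x."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

if __name__ == "__main__":
    # Compare with the tabulated row for X = 2.00.
    print(f"P(2.00) = {P(2.0):.7f}")
    print(f"Z(2.00) = {Z(2.0):.7f}")
```

Using `erfc` for the tail is a design choice of this sketch: for large X, forming 1 - P(X) in floating point loses most of its significant digits.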






Z{X) 



■0539910 
■0529192 
•0518636 
■0508239 
•0498001 
■0487920 

■0477996 
■0468226 
•0458611 
•0449148 
■0439836 

■0430674 
•0421661 
■0412795 
•0404076 
•0395500 

•0387069 
•0378779 
•0370629 
•0362619 
•0354746 

•0347009 
•0339408 
0331939 
■0324603 
•0317397 

•0310319 
■0303370 
•0296546
■0289847 
•0283270 

•0276816 
•0270481 
•0264265 
•0258166 
•0252182 

■0246313 
■0240556 
■0234910 
•0229374
•0223945

•0218624
•0213407
•0208294
•0203284
■0198374 

0193563 
■0188850 
•0184233 
•0179711 
•0175283 



10717 
10557 
10397 
10238 
10081 
9924 

9769 
9616 
9463 
9312 
9162 

9013 
8866 
8720 
8575 
8432 

8290 
8149 
8010 

7873 
7737 

7602 
7468 
7337 
7206 
7077 

6950 

6824 
6699 
6576 
6455 

6335 
6216 
6009 
6984 
6870 

5757 
6646 
5536 
5428 
6322 

6817 
6113 
6011 
4910 
4811 

4713 
4617 
4522 

4428 



+ 



162 
161 
160 
159 
157 
156 

155 
154 
153 
161 
150 

149 
147 
146 
145 
143 

142 
140 
139 
138 
136 

135 
133 
132 
130 
129 

1S7 
126 
125 
123 
122 

120 
119 
117 
116 
114 

113 
HI 
110 
108 
107 

105 
104 
102 
101 
99 

98 
96 
95 
93 
92 



t-50 
tSl 
SSi 
t-53 
t-5+ 
t-55 

t-58 
t-57 
t-58 
t-59 

teo 
tei 

t-6t 
t-63 
t-6* 
S-65 

t-66 
t-67 
1-68 
tH9 
t-70 

til 

t-72 
t-73 

en 

t-75 

t-76 
t-77 
g-78 
t-79 
t-80 

t-81 
t8t 
t-83 
i-84 
t-85 

t-86 
t-87 
t-88 
t-89 
t-90 

t-91 
t-9t 
t-93 
t-91, 
t-95 

toe 

t-97 
t-98 
2-99 
300 



P(X) 



•9937903 
■9939634 
■9941323 
■9942969 
■9944574 
■9946139 

■9947664 
■9949151 
■9950600 
•9952012 
■9953388 

•9954729 
•9956035 
•9957308 
•9958547 
•9959754

■9960930 
■9962074 
■9963189 
•9964274 
•9965330 

•9966358 
•9967359 
•9968333 
•9969280
■9970202 

•9971099 
•9971972 
•9972821 
■9973646 
■9974449 

•9975229 
■9975988 
•9976726
•9977443 
■9978140 

■9978818 
■9979476 
•9980116 
■9980738 
■9981342 

■9981929 
■9982498 
•9983052
•9983589
•9984111

■9984618 
•9985110 
•9985588 
■9986051 

•9986501



t 

+ 



1731 
1688 
1646 
1606 
166S 
1626 

1487 
1449 
1412 
1376 
1341 

1306 
1272 
1230 
1207 
1176 

1146 
1115 
1085 
1050 
1028 

1001 
974 
948 
922 
897 

873 
849 
825 
803 
781 

759 

738 
717 
697 
678 

658 
640 
622 

604 
687 

570 
653 
637 
622 
607 

492 
478 
464 
450 



44 
43 
42 

41 
40 
39 

39 
38 
37 
36 
35 

35 
34 
33 
32 
32 

31 

30 
29 
29 
28 

27 
27 
26 
26 
25 

24 
24 

23 
23 
22 

22 

21 
21 
20 
20 

19 
19 
18 
18 
17 

17 
16 
16 
16 
16 

15 
14 
14 
14 
13 



Z(X) 



•0175283 
■0170947 
■0166701 
•0162545 
■0158476 
0154493 

0150596 
0146782 
0143051 
0139401 
0135830 

Ol 32337 
Ol 28921 
0125581 
0122316 
OH9122 

•0116001 
Ol 12951 
0109969 
Ol 07056 
■0104209 

O101428 

0098712 
•00960.-.8 
O0934C0 
0090936 

0088465 

O086058 
O083f.07 
0081308 
O079155 

O076965 
0074829 
0072744 
0070711 
0068728 

O066793 
0064907 
O063067 
0061274 
0059525 

O067821 
0060160 
■0051541 
O052963 
O061426 

0049929 
0048470 
0047050 
O045066 
0044318 



4336 
4246 
4167 
4069 

3982 
3897 

3814 
3731 
3650 
3571 
3493 

3416 

3340 
3266 
3193 
3121 

3051 
2981 
2013 
2«47 
2781 

2717 
2G54 
2502 
2631 
•471 

2413 
2355 

2209 
2244 

2189 

2136 
2084 
2033 
1983 
1934 

1886 
1839 
1793 
1748 
1704 

1661 
1619 
1578 
1537 
1497 

1459 
1421 

1384 
1347 



+ 

92 
91 
89 
88 
60 
86 

84 

82 
81 
80 

78 

77 
76 
74 
73 
72 

70 
69 
68 
67 
66 

64 
C3 
62 
61 
60 

69 
57 
56 
55 
54 

63 
52 
51 
60 
49 

48 
47 
46 
45 



43 

42 
41 
40 
40 

39 
38 
37 
36 
35 



Note sign of second difference, δ².
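The tabulated second differences are normally used for interpolating between entries. The sketch below assumes Everett's central-difference formula is the intended method (the manual does not say so here) and assumes the δ² column for P(X) prints magnitudes in units of the seventh decimal, with a negative sign for positive X. The function name `everett` and the choice of X = 2.005 are illustrative only.

```python
import math

def everett(f0, f1, d2_0, d2_1, p):
    """Everett interpolation a fraction p of the way from f0 to f1.

    d2_0 and d2_1 are the signed second differences at the two
    tabular points; terms beyond the second difference are ignored.
    """
    q = 1.0 - p
    return (q * f0 + p * f1
            + q * (q * q - 1.0) / 6.0 * d2_0
            + p * (p * p - 1.0) / 6.0 * d2_1)

# Entries read from the table at X = 2.00 and X = 2.01.
p_200, p_201 = 0.9772499, 0.9777844
# Second differences 108 and 106 in the 7th decimal, taken as negative.
d2_200, d2_201 = -108e-7, -106e-7

estimate = everett(p_200, p_201, d2_200, d2_201, 0.5)   # X = 2.005
exact = 0.5 * (1.0 + math.erf(2.005 / math.sqrt(2.0)))

if __name__ == "__main__":
    print(f"interpolated P(2.005) = {estimate:.7f}")
    print(f"direct       P(2.005) = {exact:.7f}")
```

Taking the second differences with the wrong sign moves the result away from the true value by about twice the correction, which is presumably why the table carries the note about their sign.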






X    P(X)    δ +    δ²    Z(X)    δ    δ² +    X    P(X)    δ +    δ²


s-oo 


•9986501 


437 

424 
411 
399 
387 
375 


13 


•0044318 


1312 
1277 
1243 
1210 
1178 
1146 


35 


3-50 


•9997674 




3 


301 


•9986938 


13 


•0043007 


35 




3-51 


•9997759 


86 


3 


S-02 


•9987361 


13 


•0041729 


34 




SSI 


•9997842 


83 


3 


SOS 


•9987772 


12 


•0040486 


33 




3-53 


•9997922 


80 


3 


3-04 


■9938171 


12 


0039276 


32 




3-54 


■9997999 


77 


3 


3-05 


■9988558 


12 


•0038098 


32 




355 


•9998074 


74 
72 


3 


3-06 


•9988933 


364 
353 
342 
332 
322 


11 


■0036951 


1115 

1085 

1056 

1027 

999 


31 




356 


-9998146 




3 


3-07 


•9989297 


11 


•0035836 


30 




357 


■9998215 


69 


2 


3-08 


•9989650 


11 


■0034751 


29 




3-58 


•9998282 


67 


2 


309 


■9989992 


10 


•0033695 


29 




359 


•9998347 


65 


2 


3-10 


•9990324 


10 


O0326G8 


28 




360 


9998409 


62 

60 


2 


3:11 


•9990646 


312 

302 
293 
284 
276 


10 


•0031669 


971 
944 
918 
893 
668 


27 




361 


9998469 




2 


Sit 


•9990957 


10 


•0030698 


27 




S-6S 


•9998527 


58 


2 


313 


■9991260 


9 


•0029754 


26 




3-63 


■9998583 


56 


2 


3H 


•991)1553 


9 


•0028835 


26 




3-64 


•9998637 


54 


2 


SIS 


•9991836 


9 


•0027943 


25 




3-65 


■9998689 


52 
60 


2 


3-16 


•9992112 


267 
258 
250 
242 
235 


9 


•0027076 


843 
820 
797 
774 
752 


24 




3-66 


•9998739 


48 


2 


317 


■9992378 


8 


•0026231 


24 




3-67 


•9998787 


2 


318 


■9992636 


8 


■0025412 


23 




3-68 


•9998834 


47 


2 


319 


■9992886 


8 


■0024615 


23 




369 


•9998879 


45 
43 
42 


2 


3-20 


■9993129 


8 


■0023341 


22 




3-70 


•9996922 


2 


SSI 


■9993363 


227 
220 
213 

206 
200 




■0023089 


731 
710 
689 
669 
650 


21 




3-71 


•9998964 


40 
39 
37 
36 
35 


2 


331 


•9993590 




•0022358 


21 




3-79 


■9999004 




383 


•9993810 




■0021649 


20 




3-73 


-9999043 




3-H 


-9994024 




•0020960 


20 




3-74 


■99990B0 




385 


-9994230 




-0020290 


19 




375 


■9999116 




sue 


•9994429 


193 
187 
181 
175 
169 


6 


•0019641 


631 
612 
695 
677 
660 


19 




3-76 


■9999150 


33 
32 
31 

30 
29 




sun 


■9994623 


6 


■0019010 


18 




3-77 


•9999184 




3S8 


•9994810 


6 


■0018397 


18 




3-78 


■9999216 




SS9 


■9994991 


6 


•0017803 


17 




3-79 


■9999247 




3-30 


-9995166 


6 


•0017226 


17 




3-80 


•9999277 




SSI 


■9995335 


164 
159 
153 
148 
143 


6 


■0016666 


643 
627 
612 
496 
481 


17 




381 


■9999305 


28 
27 
26 
25 
24 




331 


■9995499 


5 


-0016122 


16 




383 


-9999333 




333 


■9995658 


6 


■0015695 


16 




383 


•9999359 




3-34 


■9995811 




■0016084 


15 




384 


•9999385 




3-85 


-9995959 




■0014587 


15 




S-85 


■9999409 




3-36 


-9996103 


139 
134 
130 
125 
121 




O014106 


467 
453 
439 
426 
413 


16 




3-86 


•9999433 


23 

22 
21 
20 
19 




3-37 


-9996242 




■0013639 


14 




387 


•9999456 




338 


•9996376 




•0013187 


14 




3-88 


0999478 




339 


•9996505 




■0012748 


13 




3-89 


O099499 




3-40 


-9996631 




•0012322 


13 




3-90 


-9999519 




3-41 


■9990752 


117 
113 

109 
106 
102 




001 18M) 


400 
388 
376 
364 
353 


13 




3-91 


-9999539 


19 
18 
17 
17 
16 




3-4* 


■9990869 




■0011510 


12 




3-9t 


0999557 




3-43 


•999C982 




■0011122 


12 




3-93 


O990575 




3-u 


■9997091 




■0010747 


12 




394 


■9990593 




3-45 


■9997197 




-0010383 


n 




396 


■9999609 




3-46 


■9997299 


99 
95 
92 
89 


3 


-0010030 


342 
331 
320 
310 


11 




3-96 


0999625 


16 
15 
14 
14 




3-47 


•9997398 


3 


0009689 


11 




3-97 


•9999641 




3-48 


■9997493 


3 


0009358 


10 




3-98 


•9999655 




3-49 


•9997586 


3 


0009037 


10 




3-99 
400 | 


0999670 




3-60 


•9997674 


3 


0008727 


10 




0999683 





$Z(X) = e^{-X^2/2}/\sqrt{2\pi}$, $P(X) = 1 - Q(X) = \int_{-\infty}^{X} Z(u)\,du$.






Z(X) 



•0008727 
•0008426 
•0008135 
•0007853 
■0007681 
•0007317 

•0007061 
■0006814 
•0006575 
•0006343 
•0006118 

•0005902 
•0005693 
•0005400 
•0005294 
•0005105 

■0004921 

■000-1744 
•0004573 
■0004408 
•0004248 

■0004093 
■0003944 
■0003800 
•0003661 
■0003526 

•0003396 
•0003271 
•0003149 
•0003032 
•0002919 

■0002810 
•0002705 
•0002004 
•0002506 
•0002411 

■0002320 
•0002232 
•0002147 
■0002005 
•0001987 

■0001910 
■0001837 
•0<O1766 
■0001698 
■0001633 

■0001569 
•0001508 
■0001449 
•0001393 
■0001338 



301 
291 
282 
273 
264 
256 

247 
239 
232 
224 
217 

210 
203 
196 
189 
183 

177 
171 
1G5 
160 
155 

149 
144 
139 
135 
130 

125 
121 
117 
113 
109 

105 

102 

98 

95 

91 

88 
85 
82 
79 
76 

73 

71 
68 
60 
63 

61 

59 
67 
65 



+ 



10 
10 
» 
» 
9 
8 

8 
8 
8 
8 
7 

7 
7 
7 
6 
6 

6 
6 
6 
6 
6 

6 
5 
5 
6 
5 

4 
4 
4 
4 
4 

4 
4 
4 
3 

3 







X    P(X)    δ +    δ²    Z(X)    δ    δ² +


400 


•9999683 


13 
13 

12 
12 
11 
11 


1 


■0001333 


63 
61 
49 
47 
45 
43 


2 


4-oi 


•999'J(;06 


1 


•0001286 


2 


402 


•9999709 





-0001235 


2 


40s 


■9999721 




■0001186 


2 


4-04 


■9999733 




■0001140 


2 


40s 


-9999744 




■0001094 


2 


409 
407 


•9999755 
■9099765 


10 

10 

9 

9 

9 




■0001051 
•0001009 


42 
40 
39 
37 
36 




408 


■9999775 




■0000969 




409 
410 


•0999784 
■9999793 




■0000930 
•0000893 




411 


•9999802 


8 
8 
8 
7 
7 




•0000857 


35 
33 
32 
31 

30 




411 


•9999811 




■0000822 




413 


■9999819 




O000789 




4U 


■9999826 




•0000757 




41s 


■9999834 




•0000726 




419 


■9999841 


7 
7 
6 

e 
e 




■0000697 


28 
27 
26 
25 
24 




417 
418 


•9999848 

■9999864 




•0000668 
■0000641 




4-19 


•9999861 




•0000615 




4*0 


■99998U7 




•0000589 




421 


■9999872 


6 
6 
6 
6 
6 




•0000565 


23 
22 
22 
21 
20 




422 


■9999378 




•0000542 




4-23 


-9999883 




•0000519 




4H 


•9999888 




•0000498 




425 


■99U9893 




•0000477 




4-26 


■9999893 






■0000467 


19 
18 
18 
17 
16 




4-27 


•9999902 






-0000438 




4-28 


•9999907 






-0000420 




429 


-9999911 






•0000402 




4S0 


-9999916 






■0000385 




431 


■9999918 


4 
3 
3 
3 
3 




•0000369 


16 
15 
14 




4-st 


•9999922 




■0000351 




433 


•9999925 




•0000339 




4*4 


-9999929 




■0000324 


14 




4-35 


•9999932 




■0000310 


13 




436 


•9999033 


3 
3 
3 
3 

a 




■0000297 


13 




4-37 


•9991W38 




■00002H4 


12 




438 


■9999941 




■0000272 


12 





4-39 


•9999943 




•0000261 


XI 




440 


•9999946 




•0000249 


11 




iil 


•9999948 


s 
2 
2 

2 
2 




0000239 


10 




4V 


•9999951 




■0000228 


10 




4-43 


■9999953 




•0000218 


9 




4-U 


•9990958 




•0000209 


9 




445 


■9999957 




•00002UO 


9 




4-46 


•9999959 


1 
3 
2 
> 




•0000191 


s 




447 


•9999961 




■0000183 


g 




448 


■9999963 




■0000175 


8 




449 


•9999064 




•0000167 


7 




4-50 


•9999966 




•00001C0 


1 



Note sign of second difference, δ².






X    P(X)*    Z(X)*


450 


66023 


159837 


4-61 


67586 


162797 


*5t 


69080 


146051 


4-53 


70508 


139500 


454 


71873 


133401 


4-66 


73177 


127473 


466 


74423 


121797 


457 


75614 


116362 


4-68 


76751 


111159 


469 


77838 


106177 


460 


78875 


101409 


4-61 


79867 


96845 


16* 


80813 


92477 


463 


81717 


88297 


464 


82580 


84208 


465 


83403 


80472 


466 


84100 


76812 


467 


84940 


73311 


468 


86656 


69962 


469 


8C340 


66760 


4-70 


86992 


63698 


4-71 


87614 


60771 


47t 


88208 


67972 


473 


88774 


66296 


W4 


89314 


62739 


4-75 


89829 


50295 


476 


90320 


47960 


4-77 


90789 


45728 


4-78 


91235 


43596 


4-79 


91C61 


41559 


480 


92067 


39613 


431 


92453 


37755 


4-8t 


92822 


35980 


483 


93173 


34285 


4 84 


93508 


32067 


485 


93827 


31122 


486 


94131 


29647 


4-87 


94420 


28239 


4 88 


94696 


26895 


489 


94958 


25613 


490 


96208 


24390 


491 


95446 


23222 


49t 


96673 


22108 


493 


95889 


21046 


4-94 


96094 


20033 


495 


96289 


19066 


496 


96475 


18144 


497 


96652 


17265 


498 


96821 


1G428 


499 


9G981 


15629 



X    P(X)*    Z(X)*


6-00 


97133 


148C7 


6-01 


97278 


14141 


6-Ot 


97416 


13450 


603 


97548 


12791 


604 


97672 


12162 


605 


97791 


11564 


606 


97904 


10994 


6-07 


98011 


10451 


6-Ot 


98113 


9934 


609 


98210 


9441 


610 


98302 


8972 


611 


98389 


8626 


Sit 


98472 


8101 


613 


98551 


7696 


614 


98626 


7311 


616 


98698 


6944 


616 


98766 


6695 


617 


98830 


6263 


618 


98891 


6947 


619 


98949 


6647 


6*0 


99004 


6361 


6*1 


99056 


6089 


6*t 


99105 


4831 


6*3 


99152 


4586 


6*4 


99197 


4351 


6*5 


99240 


4128 


6*6 


99280 


3917 


6*7 


99318 


3716 


6*8 


99354 


3525 


6*9 


99388 


3344 


680 


99421 


3171 


631 


99452 


3007 


631 


99481 


2852 


6-33 


99509 


2704 


6-34 


99535 


2563 


636 


99660 


2430 


636 


99584 


2303 


6-37 


99606 


2183 


638 


99628 


2069 


639 


99648 


1960 


640 


99667 


1857 


641 


99085 


1760 


6-4$ 


9970:2 


1667 


543 


99718 


'579 


644 


99734 


1495 


645 


99748 


1416 


646 


99762 


1341 


6-47 


99775 


1270 


6-48 


99787 


1202 


64$ 


99799 


1138 



X    P(X)*    Z(X)*


650 


99810 


1077 


651 


99821 


1019 


651 


99831 


965 


663 


99840 


913 


654 


99849 


864 


655 


99867 


817 


6-66 


99865 


773 


657 


99873 


731 


658 


99880 


691 


6-59 


99886 


654 


660 


99893 


618 


661 


99899 


585 


66t 


99905 


553 


663 


99910 


522 


664 


99915 


494 


6-65 


99920 


467 


6-66 


99924 


441 


6-67 


99929 


417 


668 


99933 


394 


6-69 


99936 


372 


6-70 


99940 


351 


6-71 


99944 


332 


61t 


99947 


313 


6-73 


99950 


296 


6-74 


99953 


280 


6-76 


99955 


264 


676 


99958 


249 


677 


99960 


235 


6-73 


99963 


222 


6-79 


99965 


210 


680 


999C7 


198 


581 


99969 


187 


68t 


99971 


176 


6*3 


99972 


166 


6*4 


99974 


167 


686 


99975 


148 


6-86 


99977 


139 


6H7 


991)78 


131 


688 


99979 


124 


689 


99981 


117 


690 


99982 


110 


691 


999H3 


104 


6-9t 


99984 


98 


6-93 


99985 


92 


694 


99986 


87 


695 


99987 


82 


696 


99987 


77 


697 


99988 


73 


698 


99989 


68 


699 


99990 


65 


600 


99990 


61 



$Z(X) = e^{-X^2/2}/\sqrt{2\pi}$, $P(X) = 1 - Q(X) = \int_{-\infty}^{X} Z(u)\,du$.



* The entries for P(X) and Z(X) on this page are given to 10 decimal places; thus 0.99999 should be prefixed to each entry for P(X), and a decimal point followed by four, five, ..., eight zeros, as appropriate, to each entry for Z(X).
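The footnote's convention amounts to treating each printed entry as an integer count of units in the tenth decimal place. A minimal sketch, assuming that reading; the helper names `decode_p` and `decode_z` are hypothetical, and the sample entries are those printed for X = 4.50.

```python
import math

UNIT = 1e-10  # the entries are given to 10 decimal places

def decode_p(entry):
    """Prefix 0.99999 to a tabulated P(X) entry (five further digits)."""
    return 0.99999 + entry * UNIT

def decode_z(entry):
    """Pad a tabulated Z(X) entry with leading zeros to 10 decimals."""
    return entry * UNIT

# Entries printed for X = 4.50: P(X)* = 66023, Z(X)* = 159837.
p_450 = decode_p(66023)    # 0.9999966023
z_450 = decode_z(159837)   # 0.0000159837

if __name__ == "__main__":
    print(f"P(4.50) = {p_450:.10f}")
    print(f"Z(4.50) = {z_450:.10f}")
    # Cross-check against the defining formula for Z.
    print(f"formula = {math.exp(-4.5 ** 2 / 2) / math.sqrt(2 * math.pi):.10f}")
```

Because a Z(X) entry loses a digit each time another leading zero is needed, multiplying by the fixed unit handles the "four, five, ..., eight zeros" cases without special treatment.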



This table was reprinted from Biometrika Tables for Statisticians, Vol 1, 3rd Edition, Table 1, with the permission of the Biometrika Trustees. 






Percentage Points of the F-distribution (Variance Ratio) 

Upper 25 % points 





1 


2 


3 


4 


5 


6 


7 


8 


9 


10 


12 


15 


20 


24 


30 


40 


60 


120 


∞


1 


583 


7-60 


8 20 


8-58 


8-82 


898 


9- 10 


919 


926 


932 


941 


9-49 


9-58 


9 63 


967 


971 


9-76 


9-80 


985 


2 


257 


300 


3 15 


323 


328 


331 


334 


3 35 


337 


338 


339 


341 


343 


343 


344 


3-45 


346 


347 


3-48 


3 


202 


2-28 


230 


239 


241 


2-42 


243 


2 44 


2-44 


2-44 


2-45 


246 


2-46 


246 


2-47 


247 


247 


2-47 


2-47 


4 


1-81 


200 


205 


206 


207 


208 


208 


208 


208 


208 


208 


208 


208 


208 


208 


2 08 


208 


2 08 


208 


5 


1-69 


1 86 


1-88 


1-89 


1 89 


1-89 


1-89 


1-89 


1-89 


1-89 


1-89 


189 


1-88 


1-88 


1-88 


1 88 


1-87 


1-87 


1-87 


6 


162 


1 76 


1-78 


1-79 


1-79 


1-78 


1-78 


1-78 


1-77 


1-77 


1-77 


1 76 


1-76 


1-75 


1-75 


1-75 


174 


1-74 


1-74 


7 


1 57 


1-70 


1-72 


1-72 


1 71 


1-71 


1-70 


1-70 


1 69 


1 69 


1 68 


1 68 


1-67 


1-67 


1-66 


1 66 


1-65 


1 65 


1-65 


8 


154 


1 66 


1-67 


106 


1 66 


1-65 


1-64 


1-64 


163 


1 63 


1 62 


1 62 


1 61 


1 60 


160 


1-69 


1-69 


1-58 


1-58 


9 


1-51 


162 


1-63 


1-63 


1 62 


1-61 


1-60 


1-60 


159 


1-59 


1-68 


1-57 


1 56 


1-66 


1-55 


1-54 


1-54 


1-53 


1-63 


10 


1 49 


1-60 


1 60 


1-59 


159 


1 58 


1-57 


1-66 


156 


155 


1 54 


1 53 


1-52 


1-52 


1-51 


1-51 


1-50 


1 49 


1-48 


11 


1-47 


1-58 


1-58 


1 57 


1 56 


1-55 


1-54 


1-53 


1.53


1-52 


1 51 


1-50 


1-49 


1-49 


1 48 


1 47 


1-47 


1 46 


145 


12 


1 46 


1 56 


1-56 


1 55 


1 54 


1-53 


1-52 


1-61 


1-51 


1-50 


1-49 


148 


1-47 


1 46 


1 45 


1-46 


1-44 


1-43 


1-42 


13 


1 45 


1-55 


1 55 


1 53 


1-52 


1 51 


1 50 


149 


149 


1-48 


1-47 


1-46 


1 45 


1-44 


1 43 


1-42 


1 42 


1 41 


1-40 


14 


1-44 


1-53 


1-53 


1 52 


1-51 


1-50 


1-49 


1-48 


1-47 


1 46 


145 


144 


1-43 


1-42 


1-41 


1-41 


1-40 


1-39 


1-38 


15 


1-43 


1-52 


1 52 


1-61 


1-49 


1-48 


1-47 


1-46 


1 46 


1-45 


1 44 


1-43 


1-41 


1 41 


1-40 


1 39 


1-38 


1-37 


1-36 


16 


1 42 


1-51 


1 51 


1-60 


1-48 


1-47 


1-46 


1 45 


1 44 


1-44 


1 43 


1 41 


1 40 


1-39 


1-38 


1 37 


1 36 


1-35 


1-34 


17 


1-42 


1.51


1-50 


1-49 


1-47 


1 46 


1 45 


1 44 


1 43 


1 43 


141 


1 40 


1 39 


1 38 


1-37 


1 36 


1-35 


1 34 


1-33 


18 


1-41 


1 50 


1-49 


1-48 


1 46 


1 45 


144 


143 


1-42 


1-42 


1-40 


1 39 


1 38 


1-37 


1-36 


1 35 


1-34 


1 33 


1 32 


19 


1-41 


149 


1-49 


1-47 


1 46 


144 


143 


1-42 


1 41 


1-41 


1 40 


1-38 


1 37 


1 36 


1-35 


134 


133 


1 32 


130 


20 


1-40 


1-49 


1-48 


1-47 


1-45 


1 44 


143 


1-42 


1-41 


1-40 - 


1-39 


1 37 


1-30 


1-35 


1-34 


1-33 


1-32 


1-31 


1-29 


21 


1-40 


1 48 


1-48 


1 46 


1-44 


143 


1 42 


1 41 


1-40 


1-39 


1-38 


1-37 


1-35 


1 34 


1-33 


1-32 


1 31 


1 30 


1-28 


22 


1 40 


1-48 


1-47 


145 


144 


1-42 


1 41 


1-40 


1-39 


1-39 


1 37 


1 36 


1 3t 


1 33 


1-32 


1-31 


1 30 


1 29 


1-28 


23 


1 39 


1 47 


1-47 


1 45 


1 43 


1-42 


1 41 


1-40 


139 


1-38 


1-37 


1 35 


1-31 


1-33 


1-32 


1-31 


1 30 


1 28 


1-27 


24 


1 39 


1-47 


146 


1 44 


1-43 


1-41 


1-40 


1-39 


1-38 


1-38 


1 36 


1-35 


1-33 


1 32 


1-31 


1-30 


1-29 


1-28 


1-26 


25 


1-39 


147 


1 46 


1 44 


142 


1 41 


1-40 


1-39 


1 38 


1-37 


1-36 


134 


1 33 


132 


131 


1-29 


1-28 


1-27 


1 25 


26 


1-38 


1 46 


1 45 


1 44 


142 


141 


1 39 


1-38 


1-37 


1-37 


1 35 


1 34 


1-32 


1-31 


1-30 


1-29 


1 28 


1-26 


1 25 


27 


1-38 


1.46


1 45 


143 


142 


1-40 


1 39 


1-38 


1-37 


1 36 


1 35 


1 33 


1-32 


1 31 


1-30 


1-28 


1-27 


1 26 


1 24 


28 


1 38 


1 46 


1 45 


1 43 


1-41 


1 40 


1 39 


1-38 


1-37 


1-36 


1 34 


1 33 


1-31 


1 30 


1-29 


1-28 


1 27 


1 25 


1 24 


29 


1 38 


1 45 


145 


1 43 


1-41 


1-40 


1-38 


1-37 


1-36 


1-35 


1 34 


1-32 


l-3t 


1 30 


1-29 


1-27 


1 26 


1 25 


1 23 


30 


1-38 


145 


144 


1-42 


1 41 


139 


1-38 


1-37 


1-36 


1-35 


1-34 


1-32 


1-30 


1 29 


1-28 


1-27 


1 26 


1-24 


1-23 


40 


1 36 


1-44 


1-42 


140 


1-39 


1 37 


1-36 


1 35 


1 34 


1-33 


1-31 


1-30 


1-28 


1 26 


1 25 


1-24 


1-22 


1 21 


1 19 


60 


1 35 


1-42 


1-41 


1-38 


1-37 


1 35 


1-33 


1-32 


1-31 


1-30 


1-29 


1-27 


1-25 


1-24 


1-22 


1-21 


1 19 


1 17 


115 


120 


1 34 


1-40 


1 39 


1-37 


1 35 


1 33 


1-31 


1-30 


1-29 


1-28 


1-26 


1-24 


1-22 


1 21 


119 


118 


1 16 


113 


110 


∞


1 32 


1-39 


1-37 


1 35 


133 


1 31 


1 29 


1-28 


1-27 


1-25 


124 


122 


119 


118 


1-16 


1 14 


1 12 


108 


100 



$F = \dfrac{s_1^2}{s_2^2} = \dfrac{S_1/\nu_1}{S_2/\nu_2}$, where $s_1^2 = S_1/\nu_1$ and $s_2^2 = S_2/\nu_2$ are independent mean squares estimating a common variance $\sigma^2$ and based on $\nu_1$ and $\nu_2$ degrees of freedom, respectively.
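As an illustration of how the variance ratio is formed and compared with a tabulated percentage point, here is a small sketch. The two samples are invented for the example, the helper names are hypothetical, and the critical value 1.59 is the upper 25% point read from the table above for 4 and 10 degrees of freedom (column 4, row 10).

```python
def mean_square(sample):
    """Return (s^2, nu): sum of squares about the mean over nu = n - 1."""
    n = len(sample)
    mean = sum(sample) / n
    sum_sq = sum((x - mean) ** 2 for x in sample)   # S in the footnote
    nu = n - 1                                      # degrees of freedom
    return sum_sq / nu, nu

def variance_ratio(numerator_sample, denominator_sample):
    """Return (F, nu1, nu2) with F = s1^2 / s2^2."""
    s1_sq, nu1 = mean_square(numerator_sample)
    s2_sq, nu2 = mean_square(denominator_sample)
    return s1_sq / s2_sq, nu1, nu2

# Invented data: 5 observations (nu1 = 4) and 11 observations (nu2 = 10).
a = [1, 2, 3, 4, 5]
b = [0, 2, 0, 2, 0, 2, 0, 2, 0, 2, 1]

F, nu1, nu2 = variance_ratio(a, b)
UPPER_25_PERCENT_4_10 = 1.59   # tabulated value, assumed read correctly

if __name__ == "__main__":
    print(f"F = {F:.2f} on ({nu1}, {nu2}) degrees of freedom")
    print("exceeds upper 25% point:", F > UPPER_25_PERCENT_4_10)
```

The columns of the table are indexed by the numerator degrees of freedom and the rows by the denominator degrees of freedom, so the order of the two samples matters when looking up a percentage point.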



Percentage Points of the F-distribution (Variance Ratio) (continued) 

Upper 10 % points 





1 


2 


3 


4 


5 


6 


7 


8 


9 


10 


12 


15 


20 


24 


30 


40 


60 


120 


∞


1 


3986 


4950 


63-59 


65-83 


67-24 


68 20 


5891 


69-44 


69-86 


60 19 


60-71 


61-22 


6174 


6200 


6226 


62-53 


6279 


6306 


63-33 


2 


853 


9-00 


916 


9-24 


9-29 


9-33 


9-35 


9-37 


938 


9-39 


9-41 


942 


944 


9-45 


9-46 


9-47 


9-47 


9-48 


9-49 


3 


6-54 


6-46 


6-39 


6-34 


6-31 


5 28 


5-27 


625 


6 24 


623 


6-22 


6-20 


618 


618 


517 


6-16 


615 


6-14 


513 


4 


4 64 


4-32 


419 


411 


4-05 


401 


3-98 


3-95 


3-94 


392 


3 90 


3-87 


384 


383 


3-82 


3-80 


3-79 


3-78 


3-76 


5 


406 


3-78 


362 


3-52 


345 


3-40 


3-37 


3 34 


332 


330 


3-27 


3-24 


321 


319 


317 


316 


314 


312 


310 


6 


378 


3-46 


3 29 


3-18 


311 


305 


301 


298 


296 


2-94 


290 


2-87 


2-84 


2-82 


280 


2-78 


2-76 


2-74 


2-72 


7 


369 


326 


307 


2-96 


2-88 


2-83 


2-78 


2-75 


272 


2-70 


267 


263 


2-69 


258 


2-56 


254 


251 


2-49 


2-47 


8 


346 


311 


292 


2-81 


2-73 


2-67 


2-62 


2-69 


2-66 


2 54 


2-60 


2-46 


2-42 


2-40 


238 


236 


234 


2-32 


2-29 


9 


3 36 


301 


2-81 


269 


2-61 


2-65 


251 


2-47 


2-44 


2-42 


2-38 


2-34 


230 


2-28 


2-26 


2-23 


2-21 


218 


216 


10 


3-29 


2-92 


2-73 


261 


2-52 


246 


241 


238 


236 


2-32 


2-28 


224 


2-20 


2-18 


216 


213 


211 


208 


206 


11


323 


2-86 


266 


2-64 


2-45 


239 


2 34 


230 


2-27 


2-25 


2-21 


217 


212 


2- 10 


208 


205 


203 


200 


1-97 


12 


318 


2-81 


261 


2-48 


2-39 


233 


2-28 


2-24 


221 


2-19 


216 


210 


206 


204 


201 


1 99 


1-96 


1-93 


1-90 


13 


314 


276 


256 


243 


235 


2-28 


2-23 


2-20 


216 


214 


210 


205 


201 


1 98 


1 96 


1 93 


1-90 


1-88 


1 85 


14 


310 


2-73 


2-52 


2 39 


231 


2-24 


2 19 


216 


212 


2-10 


205 


201 


1-96 


1-94 


1-91 


1-89 


1-86 


1-83 


1-80 


15 


307 


2-70 


2-49 


236 


2-27 


221 


2-16 


2-12 


209 


206 


2-02 


1 97 


1-92 


1-90 


1-87 


1 85 


1-82 


1-79 


1-76 


16 


305 


2-67 


246 


233 


224 


218 


213 


209 


206 


203 


1-99 


1-94 


1-89 


1-87 


184 


181 


1-78 


1-75 


1-72 


17 


303 


264 


244 


231 


2-22 


215 


210 


206 


203 


200 


1 96 


1-91 


1-86 


1-84 


181 


1-78 


1-75 


1-72 


169 


18 


3-01 


2-62 


2-42 


2-29 


2-20 


213 


208 


204 


200 


1-98 


1-93 


1-89 


1-84 


1-81 


1-78 


1-75 


1-72 


169 


166 


19 


2 99 


261 


2 40 


2-27 


218 


211 


2-06 


202 


1-98 


1-96 


1-91 


1-86 


1-81 


1-79 


176 


1-73 


1-70 


167 


163 


20 


2-97 


2 59 


2-38 


225 


216 


209 


204 


200 


1-96 


1-94 


1-89 


1-84 


1-79 


1-77 


1-74 


171 


1 68 


1 64 


1-61 


21 


296 


2-67 


236 


223 


2 14 


208 


202 


1 98 


1 95 


1 92 


1-87 


1 83 


1-78 


1-75 


1 72 


1 69 


1-66 


1 62 


169 


22 


295 


2 66 


235 


2 22 


2 13 


206 


201 


1-97 


1-93 


1-90 


1-86 


1-81 


1-76 


1 73 


1-70 


1-67 


164 


160 


1-67 


23 


2 94 


2-55 


2-34 


221 


2 11 


2-05 


1-99 


1-95 


1-92 


1 89 


1-84 


1-80 


1-74 


1-72 


1 69 


1-66 


1-62 


1-59 


1-65 


24 


2 93 


2 64 


2-33 


219 


2 10 


204 


1-98 


1-94 


191 


1-88 


183 


1-78 


1-73 


1-70 


167 


1-64 


161 


1-67 


1-53 


25 


292 


2-63 


2-32 


2-18 


2-09 


202 


1-97 


1 93 


1-89 


1-87 


1-82 


1-77 


1-72 


169 


166 


1-63 


1 59 


156 


1 52 


26 


291 


2-62 


2-31 


217 


208 


201 


1-96 


1 92 


1-88 


1 86 


1-81 


1 76 


1-71 


1 68 


1-65 


1-61 


1 58 


1-54 


1-60 


27 


290 


2-61 


2-30 


217 


2 07 


200 


1-95 


1-91 


1-87 


1-85 


1-80 


1-75 


1-70 


1-67 


1 64 


1-60 


1-57 


1-53 


1-49 


28 


2-89 


2 50 


229 


216 


206 


200 


1-94 


1-90 


187 


1-84 


1-79 


1-74 


1 69 


166 


1-63 


1-69 


1-56 


1 52 


1-48 


29 


2-89 


2-60 


2-28 


216 


206 


1-99 


1-93 


1-89 


1-86 


1-83 


1-78 


1-73 


1-68 


1-66 


1-62 


1-58 


1-66 


1-61 


1-47 


30 


2-88 


249 


2-28 


214 


205 


1 98 


1-93 


1-88 


1-85 


1-82 


1-77 


1-72 


1-87 


1-64 


161 


1-57 


1-64 


1-60 


146 


40 


284 


2 44 


223 


209 


200 


1 93 


1-87 


1-83 


1-79 


1-76 


1-71 


166 


161 


1-57 


164 


151 


1-47 


142 


1 38 


60 


2-79 


239 


2-18 


204 


1-95 


1-87 


1-82 


1-77 


1-74 


1-71 


1-66 


1-60 


1-64 


161 


1-48 


1-44 


140 


135 


1-29 


120 


2-75 


235 


2-13 


1-99 


1 90 


1-82 


1-77 


1-72 


1 68 


1 65 


1-60 


1-55 


148 


1-45 


1 41 


1-37 


1 32 


1-26 


1-19 


∞


2-71 


2 30 


208 


1-94 


1-86 


1-77 


1-72 


167 


1-63 


1-60 


1-65 


1 49 


1-42 


1-38 


1-34 


1-30 


1-24 


117 


100 



$F = \dfrac{s_1^2}{s_2^2} = \dfrac{S_1/\nu_1}{S_2/\nu_2}$, where $s_1^2 = S_1/\nu_1$ and $s_2^2 = S_2/\nu_2$ are independent mean squares estimating a common variance $\sigma^2$ and based on $\nu_1$ and $\nu_2$ degrees of freedom, respectively.






Percentage Points of the F-distribution (Variance Ratio) (continued) 
Upper 5 % points 



y. 


1 


2 


3 


4 


5 


6 


7 


8 


9 


10 


12 


15 


20 


24 


30 


40 


60 


120 


∞


1


1614 


199-6 


215-7 


224-6 


2302 


2340 


2368 


238-9 


240-5 


241 9 


2439 


245 9 


2480 


249 1 


2501 


2511 


2522 


2533 


254 3 


2 


18-61 


1900 


19 16 


19 25 


1930 


19-33 


19 35 


1937 


1938 


1940 


19-41 


1943 


19-45 


1945 


1946 


1947 


1948 


19 49 


19 50 


3 


1013 


9-65 


9-28 


9- 12 


901 


8-94 


889 


8-86 


8-81 


8-79 


8-74 


8-70 


866 


864 


862 


859 


857 


8-55 


8 53 


4 


7-71 


694 


6-59 


639 


6-26 


6 16 


6 00 


604 


600 


696 


5-91 


6-86 


6-80 


6-77 


575 


6-72 


6-69 


6 66 


563 


5 


661 


6-79 


641 


519 


505 


4-95 


488 


482 


4-77 


474 


468 


462 


4 58 


453 


4 50 


446 


443 


4-40 


4-36 


6 


6-99 


614 


476 


4 63 


439 


4-28 


421 


415 


410 


4-06 


400 


394 


387 


3 84 


381 


3-77 


374 


3-70 


367 


7 


6-59 


4-74 


435 


412 


3-97 


3-87 


3-79 


373 


3-68 


3 64 


3-57 


3 51 


3 44 


341 


338 


334 


3-30 


3 27 


323 


8 


6-32 


4-46 


407 


3 84 


369 


358 


3-60 


344 


3-39 


335 


3 28 


322 


3 15 


3-12 


3 08 


304 


301 


297 


2-93 


9 


6-12 


4-26 


3-86 


3 63 


3 48 


337 


3-29 


323 


3-18 


314 


307 


301 


294 


290 


2 86 


2 83 


279 


275 


271 


10 


4-96 


410 


3-71 


348 


333 


3-22 


3-14 


307 


302 


2-98 


291 


285 


2-77 


274 


270 


266 


262 


2-68 


254 


11 


484 


3-98 


3 69 


336 


3-20 


309 


301 


2 05 


290 


2-85 


2-79 


2-72 


285 


261 


2-57 


253 


249 


245 


240 


12 


476 


3-89 


349 


3-26 


311 


300 


291 


2-85 


2-80 


2'75 


269 


2-62 


2 54 


251 


247 


243 


238 


234 


230 


13 


467 


3-81 


341 


3-18 


303 


2-92 


2-83 


2-77 


2-71 


2-67 


260 


2-53 


246 


2-42 


2 38 


234 


230 


225 


221 


14 


460 


3-74 


3 34 


311 


2-96 


2-86 


2-76 


2-70 


265 


260 


2-53 


2-46 


239 


235 


231 


2-27 


222 


218 


2 13 


15 


4 64 


368 


329 


306 


2-90 


2-79 


2-71 


2 64 


259 


2-54 


248 


2-40 


233 


229 


2-25 


220 


2 16 


2 11 


2 07 


16 


4-49 


363 


3-24 


3-01 


2-85 


2-74 


2-66 


259 


264 


2-49 


2-42 


2-35 


2-28 


2 24 


2 19 


2 15 


2 11 


206 


201 


17 


445 


3-69 


3 20 


2-96 


2-81 


2-70 


2-61 


255 


2-49 


2-45 


238 


231 


2-23 


2 19 


2 15 


2- 10 


206 


201 


1-96 


18 


441 


3-65 


316 


2-93 


2-77 


266 


2 58 


251 


246 


2-41 


234 


227 


2 19 


2 15 


211 


206 


202 


1-97 


1-92 


19 


438 


362 


3-13 


2-90 


2-74 


2-63 


2-64 


248 


2-42 


238 


231 


223 


2 16 


2 11 


207 


203 


1 98 


1 93 


1-88 


20 


436 


349 


3-10 


2-87 


2-71 


260 


2-51 


245 


2-39 


235 


228 


220 


2 12 


208 


204 


1 99 


1 95 


1 90 


1 84 


21 


4 32 


3-47 


307 


2-84 


2 68 


2-67 


2-49 


2-42 


2-37 


232 


2 25 


218 


2 10 


205 


201 


1-96 


1 92 


1-87 


1-81 


22 


4 30 


3 44 


305 


2-82 


266 


2-55 


246 


2-40 


234 


2-30 


223 


215 


207 


203 


1 98 


1 94 


1 89 


1 84 


1-78 


23 


4-28 


3-42 


303 


2-80 


2-64 


2-63 


2-44 


237 


232 


227 


2-20 


213 


205 


201 


1 96 


1 91 


186 


1-81 


1-76 


24 


4-26 


3-40 


301 


2-78 


262 


2-61 


2-42 


236 


230 


2-25 


218 


2-11 


203 


1 98 


1-94 


1 89 


1-84 


1-79 


1-73 


25 


4 24 


3-39 


2-99 


2-76 


2-60 


2-49 


240 


234 


2-28 


224 


216 


209 


201 


196 


192 


1-87 


182 


1-77 


1-71 


26 


4-23 


337 


298 


2-74 


259 


2-47 


239 


2-32 


2-27 


222 


2-15 


207 


1 99 


1 95 


1 90 


1-85 


1 80 


1-75 


1 69 


27 


4-21 


3-35 


296 


2-73 


267 


2-40 


2-37 


231 


2-25 


2-20 


2-13 


206 


1 97 


1 93 


1 88 


1 84 


1 79 


1 73 


1-67 


28 


4-20 


3-34 


295 


271 


2 66 


2-45 


2-36 


2-29 


2-24 


219 


212 


204 


1-96 


191 


1-87 


1 82 


1-77 


1-71 


1 05 


29 


4-18 


3-33 


2-93 


2-70 


2-65 


2-43 


235 


228 


2-22 


218 


210 


203 


194 


190 


1-85 


1 81 


1-75 


1-70 


1-64 


30 


417 


3-32 


2 92 


269 


263 


2-42 


2-33 


227 


2-21 


216 


209 


201 


1 93 


1 89 


1 84 


1 79 


1 74 


1 68 


1.62


40 


408 


3-23 


2-84 


261 


2-45 


2 34 


2-25 


218 


212 


208 


200 


1 92 


1 84 


1 79 


1-74 


1 69 


1 64 


1-58 


1-51 


60 


400 


3-15 


2-76 


2 53 


2-37 


2 25 


217 


210 


204 


1 99 


1-92 


1 84 


1-75 


1-70 


1-65 


1-69 


1.53


1 47 


1-39 


120 


3-92 


307 


268 


245 


229 


217 


209 


202 


1-96 


1 91 


1-83 


1 75 


1-66 


1 61 


1 55 


1 50 


1 43 


1 35 


1-25 


∞


384 


300 


2-60 


2 37 


221 


2-10 


201 


194 


1-88 


1 83 


1-75 


1 67 


1 57 


1 52 


1-46 


1 39 


1 32 


1 22 


1 00 



$F = \dfrac{s_1^2}{s_2^2} = \dfrac{S_1/\nu_1}{S_2/\nu_2}$, where $s_1^2 = S_1/\nu_1$ and $s_2^2 = S_2/\nu_2$ are independent mean squares estimating a common variance $\sigma^2$ and based on $\nu_1$ and $\nu_2$ degrees of freedom, respectively.
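The variance ratio defined in the footnote above can be sketched in a few lines of Python. This is only an illustration: the function name `variance_ratio` and the sums of squares are hypothetical, and the critical value 3.48 is the upper 5% point read from the table for 4 and 10 degrees of freedom.

```python
def variance_ratio(S1, nu1, S2, nu2):
    """Return (F, s1_sq, s2_sq) for sums of squares S1, S2
    based on nu1 and nu2 degrees of freedom."""
    s1_sq = S1 / nu1  # mean square for the numerator
    s2_sq = S2 / nu2  # mean square for the denominator
    return s1_sq / s2_sq, s1_sq, s2_sq


# Hypothetical sums of squares, chosen only for illustration.
F, s1_sq, s2_sq = variance_ratio(S1=40.0, nu1=4, S2=25.0, nu2=10)

# Upper 5% point for nu1 = 4, nu2 = 10, as tabulated above.
F_CRIT_5PCT = 3.48
significant = F > F_CRIT_5PCT

print(f"s1^2 = {s1_sq:.2f}, s2^2 = {s2_sq:.2f}, F = {F:.2f}")
print(f"Exceeds the upper 5% point: {significant}")
```

The larger mean square conventionally goes in the numerator when the table is used for a one-sided comparison of two variance estimates.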



Percentage Points of the F-distribution (Variance Ratio) (continued) 

Upper 2.5% points





1 


2 


3 


4 


5 


6 


7 


8 


9 


10 


12 


15 


20 


24 


30 


40 


60 


120 


∞


1 


647-8 


7995 


864-2 


899-6 


921-8 


9371 


948-2 


956-7 


963-3 


968-6 


976-7 


9849 


993- 1 


997-2 


1001 


1006 


1010 


1014 


1018 


2 


38-51 


3900 


3917 


39-25 


39 30 


3933 


3936 


39-37 


3939 


39-40 


39-41 


39-43 


39-45 


3946 


39-46 


39-47 


39-48 


39-49 


3950 


3 


17-44 


1604 


16-44 


1510 


14-88 


14-73 


14-62 


14-54 


14-47 


14-42 


1434 


1425 


14-17 


14-12 


14-08 


1404 


13-99 


13-95 


13-90 


4 


12-22 


1065 


998 


9-60 


9-36 


9-20 


907 


8-98 


8 90 


8-84 


8-76 


8-66 


8 66 


8-51 


8-46 


8-41 


8 36 


8-31 


8-26 


5 


1001 


843 


7-76 


7-39 


715 


698 


685 


6-76 


6-68 


6-62 


6-62 


6-43 


6 33 


6-28 


6-23 


6-18 


6 12 


607 


602 


6 


881 


7-26 


660 


623 


6-99 


5-82 


6-70 


6-60 


5-52 


5-48 


6-37 


6-27 


617 


512 


507 


501 


4-96 


4-90 


4-85 


7 


807 


6 54 


6-89 


6-52 


5-29 


612 


4-99 


4-90 


482 


4-76 


4-67 


4-57 


4-47 


4 42 


4-36 


431 


4-25 


4 20 


414 


8 


7-57 


606 


5-42 


605 


4-82 


4-65 


4-53 


443 


436 


4-30 


4-20 


4-10 


400 


3-95 


3-89 


384 


3-78 


3-73 


3-67 


9 


7-21 


6-71 


508 


4-72 


4-48 


4-32 


4-20 


410 


4-03 


3-96 


3-87 


3-77 


367 


3-61 


3-56 


361 


3-45 


3-39 


333 


10 


6 94 


646 


4-83 


4-47 


4-24 


407 


395 


385 


3-78 


3-72 


3'62 


3-52 


3-42 


3-37 


3-31 


3-26 


3-20 


314 


308 


11 


672 


5-26 


4-63 


4-28 


404 


3-88 


376 


3-66 


3-59 


353 


343 


333 


323 


317 


312 


3-06 


300 


294 


288 


12 


6 55 


5- 10 


4-47 


412 


3-89 


3-73 


3-61 


3-51 


344 


337 


3-28 


318 


307 


3-02 


2-96 


2-91 


2-85 


2-79 


2-72 


13 


641 


4-97 


435 


400 


3-77 


360 


348 


339 


331 


3-25 


3 15 


305 


2-95 


2-89 


2-84 


2-78 


2-72 


266 


2-60 


14 


6-30 


4-86 


4-24 


3-89 


3-66 


360 


3 38 


329 


3-21 


3-15 


305 


295 


2-84 


2-79 


2-73 


2-67 


2-61 


2-55 


2-49 


15 


6 20 


4-77 


415 


3-80 


3-58 


3-41 


3 29 


3-20 


312 


306 


296 


2-86 


2-76 


2-70 


2 64 


2-59 


2-52 


246 


240 


16 


612 


4-69 


408 


3-73 


3-50 


3 34 


322 


312 


305 


2-99 


289 


2-79 


268 


2-63 


2-57 


2-51 


245 


238 


2-32 


17 


604 


4-62 


401 


366 


3-44 


328 


316 


306 


2-98 


2-92 


282 


2-72 


262 


256 


2-50 


244 


238 


232 


2 25 


18 


6 98 


4 56 


3-95 


3-61 


3-38 


322 


310 


301 


293 


2-87 


2-77 


2 67 


2-56 


250 


244 


2-38 


2-32 


2 26 


219 


19 


592 


4-61 


3-90 


3-56 


3-33 


317 


305 


2-96 


2-88 


2-82 


2-72 


262 


251 


245 


2 39 


2 33 


2-27 


2-20 


2-13 


20 


6-87 


4 46 


386 


351 


329 


313 


301 


291 


2-84 


2-77 


268 


2-67 


2-46 


241 


2 35 


2-29 


2-22 


2-16 


2 09 


21 


683 


442 


3-82 


3-48 


325 


309 


297 


2-87 


2-80 


2-73 


264 


2-53 


242 


2-37 


2-31 


225 


218 


2-11 


204 


22 


6-79 


438 


3-78 


3-44 


322 


305 


293 


2-84 


276 


2-70 


260 


2-50 


239 


233 


2-27 


221 


2-14 


208 


200 


23 


6-75 


4-35 


375 


341 


3 18 


302 


2-90 


2-81 


2-73 


2-67 


2-57 


2-47 


2 36 


2-30 


2 24 


218 


211 


204 


1-97 


24 


6-72 


432 


372 


338 


3-15 


299 


2-87 


2-78 


2-70 


2-64 


254 


2-44 


2-33 


2-27 


2-21 


215 


2-08 


201 


1-94 


25 


669 


4-29 


369 


335 


313 


297 


2-85 


2-75 


2-68 


2-61 


2-51 


241 


230 


2-24 


218 


212 


205 


1-98 


1-01 


26 


566 


4-27 


367 


333 


310 


2-94 


282 


2-73 


265 


2-59 


249 


2-39 


2-28 


2-22 


216 


209 


203 


1-95 


1-88 


27 


5-63 


424 


365 


331 


308 


292 


280 


271 


263 


2-57 


2-47 


236 


2-25 


219 


213 


207 


200 


1-93 


1-85 


28 


5-61 


4-22 


363 


3 29 


306 


2-90 


2-78 


2-69 


261 


2-55 


2-45 


2-34 


2-23 


217 


211 


205 


1-98 


1-91 


183 


29 


569 


4-20 


361 


3-27 


3-04 


2-88 


2-76 


2-67 


2-69 


2 53 


2-43 


2-32 


2-21 


216 


209 


203 


1-96 


1-89 


1-81 


30 


5-67 


4-18 


3-59 


3-25 


303 


2-87 


2-75 


265 


2-57 


251 


241 


2-31 


2-20 


214 


207 


201 


1-94 


1-87 


1-79 


40 


6-42 


405 


346 


313 


290 


2-74 


262 


253 


2-45 


2-39 


2-29 


218 


207 


201 


1-94 


1-88 


1-80 


1-72 


1-64 


60 


6-29 


3-93 


3-34 


301 


2-79 


2 63 


251 


241 


233 


2-27 


217 


2-06 


1-94 


1-88 


1-82 


1-74 


1 67 


1-58 


1-48 


120 


6-15 


3-80 


3-23 


2-89 


2-67 


2-52 


2-39 


2-30 


2-22 


2 16 


205 


1-94 


1-82 


1 76 


1-69 


1 61 


1 53 


143 


1 31 


∞


602 


3-69 


312 


2-79 


2-57 


2-41 


2-29 


2-19 


211 


205 


194 


1-83 


1-71 


164 


1-67 


1-48 


1-39 


1-27 


100 



$F = \dfrac{s_1^2}{s_2^2} = \dfrac{S_1/\nu_1}{S_2/\nu_2}$, where $s_1^2 = S_1/\nu_1$ and $s_2^2 = S_2/\nu_2$ are independent mean squares estimating a common variance $\sigma^2$ and based on $\nu_1$ and $\nu_2$ degrees of freedom, respectively.



Percentage Points of the F-distribution (Variance Ratio) (continued) 

Upper 1% points



^ 


1 


2 


3 


4 


5 


6 


7 


8 


9 


10 


12 


15 


20 


24 


30 


40 


60 


120 


∞


1 


4052 


49996 


5403 


5625 


5764 


5859 


6928 


5981 


6022 


6056 


6106 


6167 


6209 


6235 


6261 


6287 


6313 


6339 


6366 


2 


0850 


99 00 


99 17 


99-25 


9930 


9933 


9936 


99-37 


99-39 


9940 


99-42 


9943 


9945 


9946 


9947 


99 47 


99 48 


99-49 


99 50 


3 


3412 


30 82 


29-46 


28-71 


2824 


27-91 


27-67 


27-49 


27-36 


27-23 


2705 


26-87 


26-69 


26-60 


2650 


2641 


2632 


26-22 


26-13 


4 


21-20 


1800 


16-69 


16-98 


16-62 


16-21 


14-98 


14-80 


14-66 


14-66 


14-37 


14-20 


1402 


13-93 


1384 


13-75 


1365 


13-56 


5 


1626 


1327 


1206 


11-39 


10-97 


1067 


10-46 


10-29 


1016 


1005 


9-89 


9-72 


955 


947 


938 


9-29 


9-20 


911 


902 


6 


13-75 


10-92 


9-78 


915 


8-75 


8-47 


8-26 


8-10 


7-98 


7-87 


7-72 


7-66 


7-40 


7-31 


7-23 


714 


706 


6-97 


6-88 


7 


12 25 


9 55 


8-45 


7-85 


7-46 


719 


6-99 


6-84 


6-72 


6-62 


6-47 


6-31 


6 16 


607 


6-99 


691 


6-82 


6-74 


6-65 


8 


11 26 


865 


7-69 


7-01 


6-63 


6-37 


618 


603 


6-91 


5-81 


5-67 


5-62 


636 


6-28 


5-20 


6-12 


603 


4-95 


4-86 


9 


10 56 


802 


699 


6-42 


606 


5-80 


5-61 


6-47 


6-35 


6-26 


6-11 


4-96 


4-81 


473 


4-65 


4-67 


4-48 


4-40 


4-31 


10 


10-04 


7-56 


665 


699 


5-64 


6-39 


5-20 


606 


4 94 


4-85 


4-71 


4-56 


4-41 


4 83 


4 25 


417 


408 


400 


3-91 


11 


965 


7-21 


622 


5-67 


632 


507 


4-89 


4-74 


4 63 


4 54 


440 


4-25 


4-10 


402 


394 


3 86 


3-78 


3-69 


3-60 


12 


633 


693 


6-95 


5-41 


606 


4-82 


4-64 


4-60 


439 


430 


416 


401 


3-86 


3-78 


3-70 


362 


3-54 


3-45 


3-36 


13 


907 


670 


6-74 


6-21 


4-86 


462 


444 


4-30 


4 19 


410 


3-96 


382 


3-66 


3-59 


351 


3-43 


3 34 


3-25 


3-17 


14 


886 


651 


6-56 


604 


4-69 


4 46 


4-28 


414 


403 


3-94 


3-80 


3-66 


3-61 


3-43 


3-35 


3-27 


3-18 


3-09 


3-00 


15 


8-68 


636 


542 


489 


4-56 


432 


414 


400 


3-89 


3-80 


367 


3-52 


3-37 


3-29 


321 


313 


305 


2 96 


2-87 


16 


853 


623 


5-29 


4-77 


4 44 


4-20 


403 


389 


3-78 


369 


3-55 


3-41 


3-26 


3-18 


310 


302 


293 


2-84 


2-76 


17 


8-40 


611 


5-18 


4-67 


434 


410 


393 


3-79 


3 68 


3-59 


346 


331 


316 


308 


300 


292 


2 83 


2-75 


2-65 


18


8-29 


601 


609 


4-58 


4-25 


401 


3-84 


3-71 


3-60 


351 


3-37 


323 


3-08 


3-00 


292 


2-84 


2-75 


2-66 


2-67 


19 


8-18 


593 


601 


4 50 


4-17 


3 94 


377 


363 


3-62 


3-43 


330 


315 


300 


2-92 


2 84 


2-76 


2-67 


2-68 


2-49 


20 


810 


5-85 


494 


4-43 


410 


387 


3-70 


3-66 


346 


337 


3-23 


309 


2-94 


2-86 


2-78 


269 


261 


252 


2-42 


21 


802 


5-78 


4-87 


4-37 


404 


381 


364 


351 


340 


3-31 


317 


303 


2-88 


2-80 


2-72 


2-64 


2-55 


2-46 


2-38 


22 


7-95 


6-72 


4 82 


4-31 


3-99 


3-76 


3-59 


3-45 


3 36 


3-26 


3 12 


298 


283 


2-75 


2 67 


2 58 


2-50 


2-40 


2-31 


23 


7-88 


5-66 


4-76 


426 


394 


3-71 


3-64 


341 


330 


321 


307 


293 


2-78 


2-70 


2-62 


2 54 


246 


235 


2-28 


24 


7-82 


6-61 


4-72 


4-22 


3-90 


3-67 


3 50 


336 


3-28 


317 


303 


2-89 


2-74 


266 


2-58 


2-49 


2-40 


2-31 


2-21 


25 


7-77 


6-67 


468 


4 18 


385 


363 


346 


332 


322 


313 


2 99 


2-85 


2-70 


2-62 


2 54 


245 


2 36 


2-27 


217 


26 


7-72 


653 


464 


414 


382 


3-59 


3-42 


3-29 


318 


309 


2 96 


2-81 


266 


2-58 


2 50 


242 


233 


2 23 


2-13 


27 


7-68 


6-49 


460 


411 


3-78 


3 66 


339 


3-26 


3-15 


306 


293 


2-78 


263 


2-55 


2-47 


238 


2-29 


2-20 


2-10 


28 


7-64 


6-45 


4-57 


407 


3-76 


353 


336 


3 23 


3 12 


3-03 


2-90 


2-75 


2-60 


252 


244 


235 


2 26 


217 


2-08 


29 


7-60 


642 


4-64 


404 


3-73 


3 60 


333 


3 20 


309 


300 


2-87 


273 


2'67 


2-49 


241 


2-33 


2 23 


2 14 


2-03 


30 


7-56 


6-39 


451 


4-02 


3-70 


3-47 


3 30 


317 


307 


208 


284 


2-70 


256 


2-47 


2 39 


2-30 


221 


211 


201 


40 


7-31 


6-18 


431 


383 


351 


3-29 


312 


2-99 


2-89 


2-80 


266 


2-52 


237 


2 26 


2-20 


2 11 


202 


1 92 


1-80 


60 


708 


4-98 


4-13 


365 


334 


312 


295 


282 


2-72 


2-63 


2-50 


235 


2-20 


2-12 


203 


1 94 


1-84 


1-73 


1-60 


120 


685 


4-79 


395 


348 


317 


2-96 


2-79 


2 66 


2 56 


2-47 


234 


219 


203 


1-95 


1 86 


1-76 


1-66 


1-53 


1-38 


∞


663 


4-61 


3-78 


332 


302 


2-80 


264 


261 


2-41 


232 


218 


204 


1-88 


1-79 


1-70 


1-59 


1-47 


1-32 


1-00 



$F = \dfrac{s_1^2}{s_2^2} = \dfrac{S_1/\nu_1}{S_2/\nu_2}$, where $s_1^2 = S_1/\nu_1$ and $s_2^2 = S_2/\nu_2$ are independent mean squares estimating a common variance $\sigma^2$ and based on $\nu_1$ and $\nu_2$ degrees of freedom, respectively.



Percentage Points of the F-distribution (Variance Ratio) (continued) 

Upper 0.5% points



V 


1 


2 


3 


4 


5 


6 


7 


8 


9 


10 


12 


15 


20 


24 


30 


40 


60 


120 


∞


1 


16211 


20000 


21615 


22500 


23056 


23437 


23715 


23925 


24091 


24224 


24426 


24630 


24836 


24940 


25044 


25148 


25253 


25359 


25465 


2 


198 5 


1990 


1992 


199-2 


199-3 


199-3 


1994 


199-4 


1994 


1994 


1994 


199-4 


199-4 


199-5 


1995 


199-6 


199-5 


199-5 


1995 


3 


65-55 


4980 


47-47 


4619 


4539 


44 84 


44-43 


4413 


43-88 


43-69 


4339 


4308 


42-78 


42-62 


4247 


4231 


4215 


41-99 


41-83 


4 


31-33 


26-28 


24-26 


2315 


2246 


21-97 


21-62 


21-35 


21-14 


2097 


20-70 


20-44 


2017 


2003 


1989 


19-75 


1961 


19-47 


19-32 


5 


22-78 


1831 


16 53 


15 56 


14-94 


1451 


14-20 


1396 


13-77 


13-62 


13-38 


13-15 


12-90 


12-78 


1266 


12-53 


12-40 


12-27 


12-14 


6 


1863 


14-54 


1292 


1203 


11 46 


1107 


1079 


1057 


1039 


1025 


1003 


981 


9-59 


9-47 


936 


924 


912 


900 


8-88 


7 


1624 


1240 


10-88 


1005 


952 


9-16 


8-89 


868 


851 


8-38 


8-18 


7-97 


7-75 


7-65 


7-53 


7-42 


7-31 


719 


708 


8 


1469 


11-04 


960 


881 


8-30 


7-95 


769 


7-60 


7-34 


7-21 


701 


681 


661 


650 


6-40 


629 


6-18 


606 


6-95 


9 


1361 


1011 


8-72 


7-96 


7-47 


713 


6-88 


6 69 


664 


6-42 


623 


6-03 


5-83 


6-73 


6-62 


6-52 


641 


5 30 


6 19 


10 


12-83 


943 


808 


7-34 


687 


6-54 


630 


612 


6-97 


5-85 


5-66 


6-47 


6-27 


617 


507 


497 


4-86 


4-75 


4 64 


11 


12-23 


8-91 


7-60 


688 


6-42 


610 


6-86 


668 


5-54 


5-42 


6-24 


505 


4-86 


4-76 


4-65 


4-55 


444 


4-34 


4 23 


12 


11-75 


851 


7-23 


6-52 


607 


6-76 


6-52 


6 35 


6-20 


609 


4-91 


4-72 


4-53 


443 


433 


423 


412 


401 


390 


13 


11-37 


8 19 


693 


623 


5-79 


5-48 


5-25 


508 


494 


4-82 


4-64 


4-46 


4-27 


417 


407 


397 


387 


3 76 


365 


14 


11-06 


7-92 


6 68 


600 


656 


6-26 


503 


486 


4-72 


4-60 


4-43 


425 


406 


3-96 


386 


3-76 


366 


305 


344 


15 


1080 


7-70 


6-48 


5-80 


837 


607 


485 


4-67 


454 


4-42 


4-25 


407 


3-88 


3-79 


369 


3-58 


3-48 


337 


3-26 


16 


1058 


7-51 


630 


5-64 


621 


4-91 


469 


452 


4-38 


4-27 


4-10 


3-92 


3-73 


3 64 


3 54 


3-44 


333 


322 


3-11 


17 


1038 


7-35 


6 16 


5-50 


607 


4-78 


456 


4-39 


4-25 


414 


3-97 


379 


361 


351 


341 


331 


3-21 


3 10 


298 


18 


1022 


721 


603 


6-37 


4-96 


466 


4-44 


4-28 


414 


403 


3-86 


3-68 


3-50 


3-40 


3-30 


320 


3 10 


2 99 


2-87 


19 


1007 


709 


6-92 


6-27 


4-85 


4-56 


4-34 


418 


404 


393 


3-76 


359 


3-40 


331 


321 


311 


300 


289 


2-78 


20 


0-94 


6-99 


5-82 


617 


4-76 


4-47 


4-26 


409 


396 


385 


3-68 


3 50 


332 


322 


3 12 


302 


2-92 


281 


2-69 


21 


9-83 


6-89 


6-73 


609 


4-68 


439 


418 


401 


3-88 


3-77 


3-60 


343 


324 


315 


305 


2-95 


2-84 


273 


2-61 


22 


9-73 


681 


5 65 


502 


461 


432 


411 


3-94 


3-81 


3-70 


354 


336 


3-18 


308 


2-98 


2-88 


2-77 


266 


255 


23 


963 


6-73 


5-58 


4-95 


4-54 


4-26 


405 


388 


3-75 


364 


347 


3-30 


3-12 


302 


292 


2-82 


2-71 


2-60 


2-48 


24 


9 55 


6 66 


6-62 


4-89 


4-49 


4-20 


399 


3 83 


3-69 


3-59 


342 


3-25 


3-06 


297 


2-87 


2-77 


266 


2 55 


2-43 


25 


948 


660 


6-46 


4-84 


4-43 


415 


394 


3-78 


3-64 


3-54 


3-37 


3-20 


301 


292 


282 


2-72 


261 


2 50 


2-38 


26 


941 


6 54 


6-41 


4-79 


438 


4-10 


389 


3-73 


3-60 


349 


333 


315 


2-97 


287 


277 


2-67 


2-50 


245 


233 


27 


9 34 


649 


6-36 


4-74 


434 


406 


385 


3-69 


3-56 


345 


328 


311 


293 


283 


2 73 


263 


252 


241 


2-29 


28 


928 


6-44 


5-32 


4-70 


4-30 


402 


381 


365 


3-62 


3 41 


3-25 


307 


2-89 


2-79 


269 


2-59 


2-48 


237 


2 25 


29 


9 23 


640 


6-28 


4-66 


4-26 


3-98 


3-77 


3-61 


3-48 


3-38 


3-21 


304 


2-86 


2-76 


266 


2-56 


2-45 


233 


221 


30 


9-18 


635 


6 24 


462 


4-23 


395 


3-74 


3-58 


3-45 


334 


318 


301 


282 


273 


263 


2-52 


2-42 


230 


2-18 


40 


883 


607 


498 


4-37 


3.99


371 


351 


335 


3-22 


3 12 


2-95 


2-78 


2 60 


2-50 


2-40 


230 


218 


2 06 


1 93 


60 


849 


5-79 


4-73 


414 


3-70 


349 


329 


3-13 


3-01 


2-90 


2-74 


2-57 


2-39 


2-29 


219 


208 


1-96 


1 83 


1-69 


120 


8- 18 


554 


4-50 


3-92 


3-55 


3 28 


309 


293 


281 


2-71 


254 


237 


2 19 


209 


1-98 


1-87 


1-75 


1 61 


1-43 


∞


7-88 


6-30 


4-28 


372 


3-35 


309 


2-90 


2-74 


2 62 


2-52 


236 


219 


200 


1-90 


1-79 


167 


1 53 


1 36 


100 



$F = \dfrac{s_1^2}{s_2^2} = \dfrac{S_1/\nu_1}{S_2/\nu_2}$, where $s_1^2 = S_1/\nu_1$ and $s_2^2 = S_2/\nu_2$ are independent mean squares estimating a common variance $\sigma^2$ and based on $\nu_1$ and $\nu_2$ degrees of freedom, respectively.



Percentage Points of the F-distribution (Variance Ratio) (continued)

Upper 0.1% points



>\ 


1 


2 


3 


4 


5 


6 


7 


8 


9 


10 


12 


15 


20 


24 


30 


40 


60 


120 


∞


1 


4053* 


5000* 


5404* 


5626* 


5764* 


6859* 


5929* 


6981* 


6023* 


6056* 


6107* 


6158* 


6209* 


6235* 


6261* 


6287* 


6313* 


6340* 


6366* 


2 


998-5 


9990 


9992 


999-2 


9993 


999-3 


999-4 


999-4 


999-4 


999-4 


9994 


9994 


999-4 


9995 


999-5 


9995 


9996 


9995 


9995 


3 


1670 


1485 


141 1 


1371 


1346 


132-8 


1316 


1306 


129-9 


1292 


128-3 


127-4 


126-4 


1259 


1254 


1250 


1245 


124-0 


1235 


4 


7414 


61-26 


6618 


53-44 


61-71 


60-53 


49-66 


4900 


48-47 


4805 


47-41 


46-76 


4610 


46-77 


45-43 


4509 


44-76 


44-40 


4406 


5 


4718 


3712 


3320 


31-09 


2975 


28 84 


28 16 


27 64 


27-24 


2692 


26-42 


25-91 


2539 


25-14 


24-87 


24 60 


24-33 


2406 


23-79 


6 


3551 


2700 


2370 


21-92 


20-81 


2003 


1946 


19 03 


1869 


1841 


17-99 


17-66 


1712 


16-89 


16-67 


1644 


1621 


15 99 


16-75 


7 


2925 


21-69 


18-77 


1719 


1621 


15-52 


1502 


14-63 


1433 


1408 


13-71 


13-32 


1293 


12-73 


12-53 


1233 


1212 


11 91 


11-70 


8 


25-42 


18-49 


15-83 


1439 


1349 


1286 


1240 


1204 


11-77 


11-64 


11-19 


10-84 


1048 


10-30 


1011 


992 


9-73 


9-53 


9-33 


9 


22-86 


1639 


13-90 


12-56 


11-71 


1113 


1070 


10-37 


1011 


9 89 


9-67 


9-24 


8-90 


8-72 


8-65 


837 


819 


800 


7-81 


10 


21-04 


14-91 


12-65 


11-28 


1048 


992 


952 


920 


896 


8-75 


8-45 


813 


7-80 


7-64 


7-47 


7-30 


712 


694 


6-76 


11 


1969 


13-81 


11 56 


1035 


9-58 


905 


866 


835 


8- 12 


7-92 


7-63 


7-32 


701 


6-85 


668 


6-52 


6-35 


617 


600 


12 


18-64 


1297 


10-80 


963 


889 


8-38 


800 


7-71 


7-48 


7-29 


700 


6-71 


6-40 


6-25 


609 


5-93 


6-76 


6-59 


5-42 


13 


17-81 


1231 


1021 


907 


8-35 


7-86 


7-49 


7-21 


698 


6-80 


6-62 


623 


5-93 


6-78 


6 63 


6-47 


5-30 


614 


4-97 


14 


1714 


11-78 


973 


8-62 


7-92 


7-43 


708 


6-80 


6-68 


6-40 


613 


6-85 


6-66 


6-41 


6-25 


610 


4-94 


4-77 


4-60 


15 


1659 


11-34 


934 


8-25 


7-67 


709 


6-74 


647 


626 


608 


6-81 


6-54 


6-25 


610 


4-95 


480 


4-64 


4-47 


4 31 


16 


16 12 


1097 


900 


7-94 


7-27 


681 


6 46 


6 19 


5-98 


5-81 


5-55 


6-27 


4-99 


485 


4-70 


454 


4-39 


423 


406 


17 


1672 


10 66 


8-73 


7-68 


702 


656 


6-22 


5-96 


6-75 


6-58 


5-32 


605 


4-78 


463 


4-48 


433 


418 


402 


385 


18 


1538 


1039 


8-49 


7-46 


681 


635 


602 


676 


5-56 


5-39 


613 


4-87 


4-59 


4-45 


4-30 


4-16 


400 


3-84 


367 


19 


1608 


1016 


8-28 


7-26 


6 62 


6- 18 


6-85 


659 


6-39 


5-22 


4-97 


4-70 


443 


4-29 


414 


399 


3 84 


368 


3-51 


20 


1482 


995 


810 


710 


646 


602 


669 


644 


6-24 


508 


4-82 


4-56 


4 29 


4-15 


400 


386 


370 


354 


3-38 


21 


1459 


9-77 


7-94 


6-95 


632 


6-88 


6-56 


6-31 


611 


4-95 


4-70 


4-44 


417 


403 


3-88 


3-74 


358 


342 


3-26 


22 


1438 


9-61 


7-80 


681 


6 19 


676 


6-44 


6-19 


499 


483 


4-68 


433 


4 06 


3-92 


3-78 


363 


348 


332 


3-15 


23 


1419 


947 


7-67 


669 


608 


565 


633 


609 


489 


4-73 


4-48 


4-23 


3 96 


382 


3 68 


353 


338 


3 22 


305 


24 


1403 


934 


7-55 


659 


6-98 


6-65 


5-23 


4-99 


4-80 


4-64 


439 


4- 14 


3-87 


3-74 


3-59 


345 


329 


314 


2-97 


25 


1388 


922 


7-45 


649 


688 


5-46 


515 


491 


4-71 


4-56 


431 


406 


3-79 


366 


3-52 


337 


322 


306 


2-89 


26 


13-74 


9 12 


736 


6-41 


580 


6-38 


507 


4 83 


464 


4-48 


4-24 


3-99 


3-72 


3-59 


3-44 


3 30 


315 


299 


2-82 


27 


13-61 


902 


7-27 


633 


6-73 


5-31 


500 


4-76 


4-57 


4-41 


417 


3-92 


3-60 


3-52 


3-38 


3 23 


308 


292 


2-75 


28 


13-60 


8-93 


719 


6-25 


666 


524 


4-93 


4-69 


4-50 


435 


411 


3-86 


3-60 


3-46 


3-32 


3 18 


302 


286 


269 


29 


13-39 


885 


712 


6-19 


569 


6-18 


4-87 


4-64 


4-45 


4-29 


405 


380 


3 54 


341 


3-27 


3 12 


2-97 


281 


264 


30 


1329 


877 


705 


612 


553 


6- 12 


482 


4-58 


4-39 


4-24 


400 


3-75 


3-49 


3-36 


3-22 


307 


2-92 


276 


259 


40 


1261 


825 


660 


6-70 


5-13 


4-73 


444 


4 21 


402 


3-87 


3 64 


3-40 


316 


301 


2-87 


273 


2-57 


241 


223 


60 


11-97 


7 76 


6 17 


631 


4-76 


4-37 


409 


3-87 


369 


3-54 


331 


308 


2-83 


2-69 


2-55 


2-41 


2-25 


2 08 


1-89 


120 


11 38 


7-32 


6-79 


495 


4-42 


4-04 


3-77 


3-55 


3-38 


3-24 


302 


2-78 


2-63 


2-40 


2-26 


2-11 


1-95 


1-76 


1 54 


∞


10-83 


691 


542 


4-62 


410 


3-74 


347 


3-27 


310 


2 96 


2-74 


251 


2-27 


213 


1-99 


1 84 


1 66 


145 


100 



* Multiply these entries by 100.

This 0.1% table is based on the following sources: Colcord & Deming (1935); Fisher & Yates (1953, Table V), used with the permission of the authors and of Messrs Oliver and Boyd; Norton (1952).



This table was reprinted from Biometrika Tables for Statisticians, Vol. 1, 3rd Edition, Table 18, with the permission of the Biometrika Trustees.






Percentage Points of the t-distribution 





   ν  Q = 0.4     0.25     0.05    0.025    0.005   0.0025   0.0005
     2Q = 0.8     0.5      0.1     0.05     0.01    0.005    0.001

   1    0.325    1.000    6.314   12.706   63.657   127.32   636.62
   2    0.289    0.816    2.920    4.303    9.925   14.089   31.598
   3    0.277    0.765    2.353    3.182    5.841    7.453   12.924
   4    0.271    0.741    2.132    2.776    4.604    5.598    8.610

   5    0.267    0.727    2.015    2.571    4.032    4.773    6.869
   6    0.265    0.718    1.943    2.447    3.707    4.317    5.959
   7    0.263    0.711    1.895    2.365    3.499    4.029    5.408
   8    0.262    0.706    1.860    2.306    3.355    3.833    5.041
   9    0.261    0.703    1.833    2.262    3.250    3.690    4.781

  10    0.260    0.700    1.812    2.228    3.169    3.581    4.587
  11    0.260    0.697    1.796    2.201    3.106    3.497    4.437
  12    0.259    0.695    1.782    2.179    3.055    3.428    4.318
  13    0.259    0.694    1.771    2.160    3.012    3.372    4.221
  14    0.258    0.692    1.761    2.145    2.977    3.326    4.140

  15    0.258    0.691    1.753    2.131    2.947    3.286    4.073
  16    0.258    0.690    1.746    2.120    2.921    3.252    4.015
  17    0.257    0.689    1.740    2.110    2.898    3.222    3.965
  18    0.257    0.688    1.734    2.101    2.878    3.197    3.922
  19    0.257    0.688    1.729    2.093    2.861    3.174    3.883

  20    0.257    0.687    1.725    2.086    2.845    3.153    3.850
  21    0.257    0.686    1.721    2.080    2.831    3.135    3.819
  22    0.256    0.686    1.717    2.074    2.819    3.119    3.792
  23    0.256    0.685    1.714    2.069    2.807    3.104    3.767
  24    0.256    0.685    1.711    2.064    2.797    3.091    3.745

  25    0.256    0.684    1.708    2.060    2.787    3.078    3.725
  26    0.256    0.684    1.706    2.056    2.779    3.067    3.707
  27    0.256    0.684    1.703    2.052    2.771    3.057    3.690
  28    0.256    0.683    1.701    2.048    2.763    3.047    3.674
  29    0.256    0.683    1.699    2.045    2.756    3.038    3.659

  30    0.256    0.683    1.697    2.042    2.750    3.030    3.646
  40    0.255    0.681    1.684    2.021    2.704    2.971    3.551
  60    0.254    0.679    1.671    2.000    2.660    2.915    3.460
 120    0.254    0.677    1.658    1.980    2.617    2.860    3.373
   ∞    0.253    0.674    1.645    1.960    2.576    2.807    3.291



$Q = 1 - P(t \mid \nu)$ is the upper-tail area of the distribution for $\nu$ degrees of freedom, appropriate for use in a single-tail test. For a two-tail test, $2Q$ must be used.
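As a rough check on how Q and 2Q relate to a tabulated percentage point, the upper-tail area can be computed numerically. The sketch below integrates the Student t density with Simpson's rule; the function names and the panel count are assumptions made for illustration, not part of the library described in this manual.

```python
import math


def t_density(x, nu):
    """Probability density of Student's t with nu degrees of freedom."""
    log_c = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
             - 0.5 * math.log(nu * math.pi))
    return math.exp(log_c - (nu + 1) / 2 * math.log1p(x * x / nu))


def upper_tail_q(t, nu, panels=2000):
    """Q = 1 - P(t | nu) for t >= 0, using Simpson's rule on [0, t].

    panels must be even; the density is symmetric, so the area
    from 0 to t is subtracted from one half.
    """
    h = t / panels
    total = t_density(0.0, nu) + t_density(t, nu)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * t_density(i * h, nu)
    return 0.5 - total * h / 3


# Tabulated point for nu = 10 in the Q = 0.025 (2Q = 0.05) column.
q = upper_tail_q(2.228, 10)
print(f"single-tail Q = {q:.4f}, two-tail 2Q = {2 * q:.4f}")
```

For a tabulated value the computed Q should agree with the column heading to about the precision of the table.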



This table was reprinted from Biometrika Tables for Statisticians, Vol. 1, 3rd Edition, Table 12, with the permission of the Biometrika Trustees.






Percentage Points of the χ²-Distribution



   ν    Q = 0.995          0.990          0.975          0.950          0.900          0.750          0.500

   1  392704×10⁻¹⁰    157088×10⁻⁹    982069×10⁻⁹    393214×10⁻⁸      0.0157908      0.1015308       0.454936
   2     0.0100251      0.0201007      0.0506356       0.102587       0.210721       0.575364        1.38629
   3     0.0717218       0.114832       0.215795       0.351846       0.584374       1.212534        2.36597
   4      0.206989       0.297109       0.484419       0.710723       1.063623        1.92256        3.35669

   5      0.411742       0.554298       0.831212       1.145476        1.61031        2.67460        4.35146
   6      0.675727       0.872090        1.23734        1.63538        2.20413        3.45460        5.34812
   7      0.989256       1.239043        1.68987        2.16735        2.83311        4.25485        6.34581
   8       1.34441        1.64650        2.17973        2.73264        3.48954        5.07064        7.34412
   9       1.73493        2.08790        2.70039        3.32511        4.16816        5.89883        8.34283

  10       2.15586        2.55821        3.24697        3.94030        4.86518        6.73720        9.34182
  11       2.60322        3.05348        3.81575        4.57481        5.57778        7.58414        10.3410
  12       3.07382        3.57057        4.40379        5.22603        6.30380        8.43842        11.3403
  13       3.56503        4.10692        5.00875        5.89186        7.04150        9.29907        12.3398
  14       4.07467        4.66043        5.62873        6.57063        7.78953        10.1653        13.3393

  15       4.60092        5.22935        6.26214        7.26094        8.54676        11.0365        14.3389
  16       5.14221        5.81221        6.90766        7.96165        9.31224        11.9122        15.3385
  17       5.69722        6.40776        7.56419        8.67176        10.0852        12.7919        16.3382
  18       6.26480        7.01491        8.23075        9.39046        10.8649        13.6753        17.3379
  19       6.84397        7.63273        8.90652        10.1170        11.6509        14.5620        18.3377

  20       7.43384        8.26040        9.59078        10.8508        12.4426        15.4518        19.3374
  21       8.03365        8.89720       10.28293        11.5913        13.2396        16.3444        20.3372
  22       8.64272        9.54249        10.9823        12.3380        14.0415        17.2396        21.3370
  23       9.26043       10.19567        11.6886        13.0905        14.8480        18.1373        22.3369
  24       9.88623        10.8564        12.4012        13.8484        15.6587        19.0373        23.3367

  25       10.5197        11.5240        13.1197        14.6114        16.4734        19.9393        24.3366
  26       11.1602        12.1981        13.8439        15.3792        17.2919        20.8434        25.3365
  27       11.8076        12.8785        14.5734        16.1514        18.1139        21.7494        26.3363
  28       12.4613        13.5647        15.3079        16.9279        18.9392        22.6572        27.3362
  29       13.1211        14.2565        16.0471        17.7084        19.7677        23.5666        28.3361

  30       13.7867        14.9535        16.7908        18.4927        20.5992        24.4776        29.3360
  40       20.7065        22.1643        24.4330        26.5093        29.0505        33.6603        39.3353
  50       27.9907        29.7067        32.3574        34.7643        37.6886        42.9421        49.3349
  60       35.5345        37.4849        40.4817        43.1880        46.4589        52.2938        59.3347

  70       43.2752        45.4417        48.7576        51.7393        55.3289        61.6983        69.3345
  80       51.1719        53.5401        57.1532        60.3915        64.2778        71.1445        79.3343
  90       59.1963        61.7541        65.6466        69.1260        73.2911        80.6247        89.3342
 100       67.3276        70.0649        74.2219        77.9295        82.3581        90.1332        99.3341

   X       -2.5758        -2.3263        -1.9600        -1.6449        -1.2816        -0.6745         0.0000





$$Q = Q(\chi^2 \mid \nu) = 1 - P(\chi^2 \mid \nu) = 2^{-\nu/2} \left\{\Gamma(\tfrac{1}{2}\nu)\right\}^{-1} \int_{\chi^2}^{\infty} e^{-x/2}\, x^{\nu/2 - 1}\, dx.$$
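As a cross-check on the tabulated values, the upper-tail probability Q defined by the integral above can be evaluated numerically. The following is a minimal Python sketch, not one of the library's programs: the function name `chi2_upper_tail` is hypothetical, and it assumes the standard power series for the regularized lower incomplete gamma function, which is adequate for the range of this table.

```python
import math

def chi2_upper_tail(chi2, nu):
    """Q(chi2 | nu): probability that a chi-square variate with nu degrees
    of freedom exceeds chi2. Hypothetical helper for illustration only."""
    if chi2 <= 0.0:
        return 1.0
    a = 0.5 * nu      # shape parameter nu/2
    x = 0.5 * chi2    # upper limit after the substitution t = x/2
    # Series: P(a, x) = x^a e^(-x) / Gamma(a+1) * sum_n x^n / ((a+1)...(a+n))
    term = 1.0
    total = 1.0
    for n in range(1, 10000):
        term *= x / (a + n)
        total += term
        if term < 1e-16 * total:
            break
    # Work in logarithms so the prefactor does not overflow for large nu.
    log_prefactor = a * math.log(x) - x - math.lgamma(a + 1.0)
    return 1.0 - math.exp(log_prefactor) * total

# Entries from the Q = 0.050 column of the table should give Q close to 0.05.
for nu, chi2 in [(1, 3.84146), (2, 5.99146), (10, 18.3070)]:
    print(nu, chi2, round(chi2_upper_tail(chi2, nu), 4))
```

The series is summed directly rather than integrated by quadrature because it avoids choosing a cutoff for the infinite upper limit; for values of chi-square far in the upper tail a continued-fraction form would be the more usual choice.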






Percentage Points of the χ²-Distribution (continued)



ν \ Q    0.250      0.100      0.050      0.025      0.010      0.005      0.001

  1    1.32330    2.70554    3.84146    5.02389    6.63490    7.87944     10.828
  2    2.77259    4.60517    5.99146    7.37776    9.21034    10.5966     13.816
  3    4.10834    6.25139    7.81473    9.34840    11.3449    12.8382     16.266
  4    5.38527    7.77944    9.48773    11.1433    13.2767    14.8603     18.467
  5    6.62568    9.23636    11.0705    12.8325    15.0863    16.7496     20.515
  6    7.84080    10.6446    12.5916    14.4494    16.8119    18.5476     22.458
  7    9.03715    12.0170    14.0671    16.0128    18.4753    20.2777     24.322
  8    10.2189    13.3616    15.5073    17.5345    20.0902    21.9550     26.125
  9    11.3888    14.6837    16.9190    19.0228    21.6660    23.5894     27.877
 10    12.5489    15.9872    18.3070    20.4832    23.2093    25.1882     29.588
 11    13.7007    17.2750    19.6751    21.9200    24.7250    26.7568     31.264
 12    14.8454    18.5493    21.0261    23.3367    26.2170    28.2995     32.909
 13    15.9839    19.8119    22.3620    24.7356    27.6882    29.8195     34.528
 14    17.1169    21.0641    23.6848    26.1189    29.1412    31.3194     36.123
 15    18.2451    22.3071    24.9958    27.4884    30.5779    32.8013     37.697
 16    19.3689    23.5418    26.2962    28.8454    31.9999    34.2672     39.252
 17    20.4887    24.7690    27.5871    30.1910    33.4087    35.7185     40.790
 18    21.6049    25.9894    28.8693    31.5264    34.8053    37.1565     42.312
 19    22.7178    27.2036    30.1435    32.8523    36.1909    38.5823     43.820
 20    23.8277    28.4120    31.4104    34.1696    37.5662    39.9968     45.315
 21    24.9348    29.6151    32.6706    35.4789    38.9322    41.4011     46.797
 22    26.0393    30.8133    33.9244    36.7807    40.2894    42.7957     48.268
 23    27.1413    32.0069    35.1725    38.0756    41.6384    44.1813     49.728
 24    28.2412    33.1962    36.4150    39.3641    42.9798    45.5585     51.179
 25    29.3389    34.3816    37.6525    40.6465    44.3141    46.9279     52.618
 26    30.4346    35.5632    38.8851    41.9232    45.6417    48.2899     54.052
 27    31.5284    36.7412    40.1133    43.1945    46.9629    49.6449     55.476
 28    32.6205    37.9159    41.3371    44.4608    48.2782    50.9934     56.892
 29    33.7109    39.0875    42.5570    45.7223    49.5879    52.3356     58.301
 30    34.7997    40.2560    43.7730    46.9792    50.8922    53.6720     59.703
 40    45.6160    51.8051    55.7585    59.3417    63.6907    66.7660     73.402
 50    56.3336    63.1671    67.5048    71.4202    76.1539    79.4900     86.661
 60    66.9815    74.3970    79.0819    83.2977    88.3794    91.9517     99.607
 70    77.5767    85.5270    90.5312    95.0232    100.425    104.215    112.317
 80    88.1303    96.5782    101.879    106.629    112.329    116.321    124.839
 90    98.6499    107.565    113.145    118.136    124.116    128.299    137.208
100    109.141    118.498    124.342    129.561    135.807    140.169    149.449
  X    +0.6745    +1.2816    +1.6449    +1.9600    +2.3263    +2.5758    +3.0902

For ν > 100 take

$$\chi^2 = \nu\left\{1 - \frac{2}{9\nu} + X\sqrt{\frac{2}{9\nu}}\right\}^3 \quad\text{or}\quad \chi^2 = \tfrac{1}{2}\left\{X + \sqrt{2\nu - 1}\right\}^2$$

according to the degree of accuracy required. X is the standardized normal deviate corresponding to
P = 1 - Q, and is shown in the bottom line of the table.
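The two large-ν approximations referred to in this note are the standard cube form, χ² ≈ ν{1 - 2/(9ν) + X√(2/(9ν))}³, and the square form, χ² ≈ ½{X + √(2ν - 1)}². A short Python sketch of both follows; the function names are hypothetical and the code is not part of the library. It is evaluated at ν = 100, where the table itself is available for comparison.

```python
import math

def chi2_cube_approx(nu, x):
    """Cube-form approximation: nu * (1 - 2/(9 nu) + x * sqrt(2/(9 nu)))**3.
    x is the standardized normal deviate from the bottom line of the table."""
    c = 2.0 / (9.0 * nu)
    return nu * (1.0 - c + x * math.sqrt(c)) ** 3

def chi2_square_approx(nu, x):
    """Square-form approximation: 0.5 * (x + sqrt(2 nu - 1))**2."""
    return 0.5 * (x + math.sqrt(2.0 * nu - 1.0)) ** 2

# Compare with the tabulated value 124.342 for nu = 100, Q = 0.050 (X = +1.6449).
nu, x = 100, 1.6449
print("cube form:  ", round(chi2_cube_approx(nu, x), 3))
print("square form:", round(chi2_square_approx(nu, x), 3))
```

At ν = 100 the cube form lands much closer to the tabulated entry than the square form, which is what "according to the degree of accuracy required" amounts to in practice: the square form is quicker by hand, the cube form is the better choice when more figures are wanted.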



This table was reprinted from Biometrika Tables for Statisticians, Vol. 1, 3rd Edition, Table 8, with the permission of the Biometrika Trustees.






Notes 






HEWLETT 
PACKARD 



Part No. 98820-13111 Printed in U.S.A. 

E0782 First Edition; July 1982