# Halcyon Calc - Working With Statistics

Halcyon Calc can perform statistical calculations on data. The following sections explain how to work with statistics on the calculator.

# Entering Statistical Data:

Statistical data is stored in a variable called ∑DAT in the current directory. The contents of this variable should be a real matrix. Every row in the matrix represents a different sample. Every column in the matrix represents a different value being tracked.

For example, imagine we were gathering statistical information about 5 people's weight over time. If we weighed everyone once a week for ten weeks, then we would end up with a real matrix with 5 columns (one for each person) and 10 rows (one for each week).

To analyze the data, you need to get it into a variable called ∑DAT. You can use the standard STO and RCL operations to do so. But, there are more convenient operations for entering statistical data.

If you have your real matrix of stats data on the stack, you can just execute STO∑. It takes the item from the top of the stack and stores it into ∑DAT. Make sure that the item on the top of the stack is a real matrix, or you will get errors when you try to analyze the data in ∑DAT.

Alternatively, you can use the ∑+ operation to add a new sample to your statistics data. The top of the stack should be a real number if you only have a single value per sample. If you have multiple values per sample (like the 5 weights of the different people in the example above), then you can have a real vector on the top of the stack. If the ∑DAT variable does not exist, it will be created and the stat(s) from the top of the stack will be stored in it. If the ∑DAT variable exists, you must make sure that the number of columns in the existing data matches the number of columns in the data you are adding. If it matches, the new sample data will be added at the end of the real matrix in ∑DAT. You can add multiple samples in a single operation if the top of the stack is a real matrix. Again, the number of columns must match existing data but each row in the matrix from the top of the stack will be added at the end of ∑DAT.
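The ∑+ rules described above can be modeled outside the calculator. The following Python sketch is only an illustration, not Halcyon Calc's implementation: the function name `sigma_plus`, the use of nested lists for the real matrix, and `None` standing in for a missing ∑DAT variable are all assumptions made for this example.

```python
def sigma_plus(sigma_dat, item):
    """Return a new matrix with `item` appended, following the ∑+ rules above.

    `sigma_dat` is a list of rows, or None when the variable does not exist.
    `item` may be a number (one value), a list of numbers (one sample),
    or a list of lists (several samples).
    """
    # Normalise the input into a list of rows.
    if isinstance(item, (int, float)):
        new_rows = [[float(item)]]
    elif item and isinstance(item[0], (list, tuple)):
        new_rows = [[float(v) for v in row] for row in item]
    else:
        new_rows = [[float(v) for v in item]]

    if sigma_dat is None:
        # No existing data: the variable is created from the new sample(s).
        return new_rows

    # Existing data: every new row must have the same number of columns.
    width = len(sigma_dat[0])
    for row in new_rows:
        if len(row) != width:
            raise ValueError("column count does not match existing data")
    return sigma_dat + new_rows


# Made-up weights for three people.
data = sigma_plus(None, [70.0, 82.5, 64.0])      # creates the matrix
data = sigma_plus(data, [70.4, 82.1, 64.3])      # appends one sample
data = sigma_plus(data, [[70.1, 81.9, 64.2],
                         [69.8, 81.5, 64.0]])    # appends two samples
```

After these calls, `data` holds four rows of three columns each. Passing a two-element sample at that point raises an error in this sketch, which mirrors the column-count requirement.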

# Modifying Statistical Data:

If you have stats data in a ∑DAT variable, you can modify that data. You can remove the last sample from the ∑DAT variable by using the ∑- operation. The last sample in the ∑DAT variable is pushed onto the stack as either a real number or a real vector, and the sample is removed from the real matrix found in ∑DAT. If no samples are left, the ∑DAT variable is removed entirely.
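As a rough model of the ∑- behavior, the Python sketch below removes the last row and returns it. The name `sigma_minus`, the nested-list matrix, and the use of `None` to represent a deleted ∑DAT variable are assumptions made for illustration only.

```python
def sigma_minus(sigma_dat):
    """Return (remaining_matrix_or_None, removed_sample)."""
    *remaining, last = sigma_dat
    # A one-column sample comes back as a plain number, otherwise as a vector.
    removed = last[0] if len(last) == 1 else list(last)
    # None stands in for "the ∑DAT variable no longer exists".
    return (remaining or None), removed


data = [[1.0, 2.0], [3.0, 4.0]]
rest, sample = sigma_minus(data)          # removes [3.0, 4.0]
rest_after, sample2 = sigma_minus(rest)   # removes the final sample
```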

Another way to modify stats data is to use the RCL∑ operation to put the real matrix containing all of the data on the stack. Then, you can use any means to modify that real matrix. Once you have changed the matrix, you can use the STO∑ operation to put it back into ∑DAT.

Finally, you can quickly delete all of your statistical data by using the

# Summarizing Statistical Data:

Once you have your stats data stored in ∑DAT, you can use a series of operations to summarize that data. Each of these operations works on every column of the real matrix in ∑DAT:

| Operation | Description |
| --- | --- |
| TOT | Calculate the total of each column in the stats data. |
| MEAN | Calculate the mean of each column in the stats data. |
| SDEV | Calculate the standard deviation of each column in the stats data. |
| VAR | Calculate the variance of each column in the stats data. |
| MAX∑ | Calculate the maximum of each column in the stats data. |
| MIN∑ | Calculate the minimum of each column in the stats data. |
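To make the per-column behavior concrete, here is a small Python sketch that computes the same six summaries for a made-up matrix. It is not the calculator's code. In particular, this document does not say whether SDEV and VAR use the sample (n - 1) or population (n) convention; the sketch assumes the sample convention, so check against the calculator before relying on it. The function name `summarize` is hypothetical.

```python
import statistics


def summarize(sigma_dat):
    """Compute one result per column, keyed by operation name."""
    columns = list(zip(*sigma_dat))  # transpose rows into columns
    return {
        "TOT": [sum(c) for c in columns],
        "MEAN": [statistics.mean(c) for c in columns],
        "SDEV": [statistics.stdev(c) for c in columns],     # assumed: sample (n - 1)
        "VAR": [statistics.variance(c) for c in columns],   # assumed: sample (n - 1)
        "MAX∑": [max(c) for c in columns],
        "MIN∑": [min(c) for c in columns],
    }


# Three samples, two values per sample (made-up numbers).
data = [[1.0, 10.0], [2.0, 20.0], [3.0, 60.0]]
summary = summarize(data)
```

Each entry in `summary` is a list with one element per column, matching the idea that every operation returns one result per tracked value.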

# Comparing Statistical Data:

If your data has two or more values per sample, you can do comparisons between these values. In the example above where we had 5 people's weight over time, we can look at the relationship, if any, between any pair of people's weight.

To do these comparisons, you must use the COL∑ operation to specify the two columns you wish to compare. Push two real numbers on the stack for the two column numbers containing the data you are comparing and execute this operation. This will store the column numbers into a new variable called ∑PAR (short for stats parameters) in the current directory. The following operations use this variable and the ∑DAT variable to calculate their results:

| Operation | Description |
| --- | --- |
| CORR | Calculate the correlation between the values in the two columns. |
| COV | Calculate the covariance between the values in the two columns. |
| LR | Perform a linear regression of the data and describe a line with an intercept and slope which best approximates the relationship between the columns. |

The LR operation is special because it also stores the intercept and slope it calculates into the ∑PAR variable. Once you have done a linear regression, you can then use the PREDV operation to make a prediction. Given an input value from one data set, it predicts what the value would be for the other. If you performed a linear regression of two people's weight from the previous example, you could then answer a question like "if person X's weight was Y, what would we predict person Z's weight to be". Of course this is only meaningful if the data is likely correlated in a way that a straight line makes sense (and people's weight is probably not correlated that way).
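The arithmetic behind these comparisons can be sketched in Python as an ordinary least-squares fit. This is a minimal illustration under stated assumptions, not the calculator's implementation: the function names `compare` and `predict` are hypothetical, column numbers are assumed to start at 1, the first column given is treated as the input (independent) variable, and the covariance uses the sample (n - 1) convention.

```python
import math


def compare(sigma_dat, x_col, y_col):
    """Covariance, correlation and least-squares line for two columns."""
    xs = [row[x_col - 1] for row in sigma_dat]  # assumed: 1-based column numbers
    ys = [row[y_col - 1] for row in sigma_dat]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return {
        "COV": sxy / (n - 1),                # assumed: sample covariance
        "CORR": sxy / math.sqrt(sxx * syy),  # Pearson correlation
        "intercept": intercept,
        "slope": slope,
    }


def predict(fit, x):
    """Evaluate the fitted line at x, similar in spirit to PREDV."""
    return fit["intercept"] + fit["slope"] * x


# Made-up data lying exactly on the line y = 2x + 1.
data = [[1.0, 3.0], [2.0, 5.0], [3.0, 7.0], [4.0, 9.0]]
fit = compare(data, 1, 2)
predicted = predict(fit, 5.0)
```

Because the sample data lies exactly on a line, the fit recovers a slope of 2 and an intercept of 1, and the correlation is 1. Real measurements will rarely be this tidy.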

# Calculating Probabilities:

There are some operations which do not directly work with statistics data but can be helpful for calculating the probabilities of a random variable that you know or expect has a particular distribution. These operations do not operate on ∑DAT data but can be useful for analyzing the probability of observing the values that you have obtained:

| Operation | Description |
| --- | --- |
| UTPC | Calculate the probability of a random variable exceeding X given a chi-square distribution. |
| UTPF | Calculate the probability of a random variable exceeding X given an F distribution. |
| UTPN | Calculate the probability of a random variable exceeding X given a normal distribution. |
| UTPT | Calculate the probability of a random variable exceeding X given a t distribution. |
| COMB | Calculate the number of combinations. |
| PERM | Calculate the number of permutations. |
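For a feel of what an upper-tail probability and the counting operations compute, the Python sketch below evaluates the normal case with the standard library, along with combinations and permutations. The function name `upper_tail_normal` and its argument order (mean, variance, x) are assumptions for this example; this document does not specify which arguments the calculator's operations take or in what order.

```python
import math


def upper_tail_normal(mean, variance, x):
    """P(X > x) for a normally distributed X with the given mean and variance."""
    z = (x - mean) / math.sqrt(variance)
    # The upper tail of the standard normal, expressed with the
    # complementary error function.
    return 0.5 * math.erfc(z / math.sqrt(2))


p_center = upper_tail_normal(0.0, 1.0, 0.0)   # half the mass lies above the mean
p_tail = upper_tail_normal(0.0, 1.0, 1.96)    # roughly 2.5% lies above 1.96
combos = math.comb(5, 2)   # ways to choose 2 of 5 items, order ignored
perms = math.perm(5, 2)    # ways to arrange 2 of 5 items, order matters
```

The chi-square, F, and t cases follow the same idea of integrating the upper tail of the distribution, but they need special functions that are not in Python's standard library, so they are left out of this sketch.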