Tag Archives: regression

Durbin-Watson statistic in TI-84

Unlike the more sophisticated TI-89 and Nspire, the TI-84 does not include the Durbin-Watson statistic. Yet calculating it is fairly straightforward using list functions.

This regression statistic is given by

$$ d = \frac{\sum_{t=2}^{n} (e_t - e_{t-1})^2}{\sum_{t=1}^{n} e_t^2} $$

where e is the list of residual values. To obtain this list (using a previous multiple regression example), simply take the difference between the actual values and the values predicted by the regression formula (Y7 below):

durbinwatson1 durbinwatson2

Finally, evaluate the formula below to obtain the statistic.

durbinwatson3
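To cross-check the calculator result off-device, here is a minimal Python sketch of the same list arithmetic. It assumes the standard definition of the statistic (the sum of squared successive differences of the residuals divided by the sum of squared residuals); the function name durbin_watson is only illustrative and does not come from the TI-84.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for a list of regression residuals."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    # Numerator: squared differences between consecutive residuals.
    numerator = sum((b - a) ** 2 for a, b in zip(residuals, residuals[1:]))
    # Denominator: sum of squared residuals.
    denominator = sum(e * e for e in residuals)
    return numerator / denominator
```

Values near 2 suggest little first-order autocorrelation, while values toward 0 or 4 suggest positive or negative autocorrelation respectively.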


Coefficient of determination for Multiple linear regression in TI-84 Plus

After determining the parameters of a multiple linear regression on the TI-84 (which has no built-in function for this calculation), the coefficient of determination can also be calculated easily using the rich set of list functions the TI-84 supports. Following the previous example, the dependent variable is in the Sales list, and the two independent variables are in the Size and Dist lists.

The Yhat list is prepared first. This list stores the predicted values computed from the regression parameters determined in the previous installment.
mreg84rsq3a mreg84rsq3

Next, the means of Y and Yhat are calculated and stored in a helper list S.

mreg84rsq2

Then the three lists SYY, SYhYh, and SYYh are calculated in turn.

mreg84rsq4 mreg84rsq3e

mreg84rsq6 mreg84rsq5

mreg84rsq8 mreg84rsq7

The result is obtained by the formula below.
mreg84rsq9
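The same calculation can be sketched in Python for comparison. The screenshots are not reproduced here, so this is an assumption about what the lists hold: SYY and SYhYh are taken to be the centered sums of squares of Y and Yhat, SYYh their centered cross-product, and the result the squared correlation between Y and Yhat. The function name r_squared is hypothetical.

```python
def r_squared(y, y_hat):
    """Coefficient of determination as the squared correlation of y and y_hat."""
    n = len(y)
    mean_y = sum(y) / n
    mean_y_hat = sum(y_hat) / n
    # Centered sums of squares and cross-products (assumed meaning of the lists).
    syy = sum((a - mean_y) ** 2 for a in y)
    syhyh = sum((b - mean_y_hat) ** 2 for b in y_hat)
    syyh = sum((a - mean_y) * (b - mean_y_hat) for a, b in zip(y, y_hat))
    return syyh ** 2 / (syy * syhyh)
```

For a least-squares fit with an intercept, this squared correlation equals the usual 1 - SSE/SST form of the coefficient of determination.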

White test in TI Nspire and R

The White test is a statistical test of whether the homoskedasticity assumption holds for a data set. It is based on the variance of the residual values. The TI Nspire can compute this test even though it is not a built-in function, because the residual values can be recalled from the regression results. An example involving multiple regression is shown below.
white0

A scatter plot allows visual inspection for heteroskedasticity.
white3

The calculation on the data set, done in spreadsheet mode:
white2

And in R.

R-white1

R-white2
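For readers without either tool at hand, the following Python sketch outlines the usual form of the test: regress the squared residuals on the regressors, their squares, and their cross-products, then take n times the R-squared of that auxiliary regression. It is a sketch under these assumptions, not a transcription of the Nspire or R sessions above; the names white_statistic and solve are hypothetical helpers.

```python
from itertools import combinations


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(a)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: move the largest entry in this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def white_statistic(residuals, regressors):
    """Return (LM statistic, degrees of freedom) for the White test.

    residuals  -- residuals of the original regression
    regressors -- list of columns, one per independent variable
    """
    n = len(residuals)
    # Auxiliary design: constant, levels, squares, pairwise cross-products.
    cols = [[1.0] * n] + [list(c) for c in regressors]
    cols += [[v * v for v in c] for c in regressors]
    cols += [[a * b for a, b in zip(c1, c2)]
             for c1, c2 in combinations(regressors, 2)]
    target = [e * e for e in residuals]
    # Normal equations of the auxiliary regression.
    xtx = [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(a * t for a, t in zip(ci, target)) for ci in cols]
    beta = solve(xtx, xty)
    fitted = [sum(b * c[i] for b, c in zip(beta, cols)) for i in range(n)]
    mean_t = sum(target) / n
    ss_tot = sum((t - mean_t) ** 2 for t in target)
    if ss_tot == 0:
        raise ValueError("squared residuals are constant")
    ss_res = sum((t - f) ** 2 for t, f in zip(target, fitted))
    r2 = 1 - ss_res / ss_tot
    return n * r2, len(cols) - 1
```

The statistic is compared against a chi-square distribution with the returned degrees of freedom; a large value is evidence against homoskedasticity. Computing the p-value is left out because the Python standard library has no chi-square distribution function.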

Quick residual plot in TI Nspire

When working with regression analysis, a residual plot is a handy tool for gaining insight through visualization. The TI Nspire provides easy and convenient access to these plots in just a few clicks.

A simple linear regression is used as an example below:

residualplot1

Accessing the menu 4:Analyze > 7:Residuals shows the two options for residual plots, Show Residual Squares and Residual Plots. The resulting plots are shown below.

residualplot2 residualplot3
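What these plots display can be reproduced numerically. The sketch below fits a simple least-squares line and returns the residuals (actual minus fitted values), which are the quantities a residual plot shows against x. The function names are illustrative and not part of the Nspire.

```python
def linear_fit(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def residuals(x, y):
    """Residuals (actual minus fitted) of the least-squares line."""
    slope, intercept = linear_fit(x, y)
    return [b - (intercept + slope * a) for a, b in zip(x, y)]
```

Squaring each residual gives the areas drawn by the Show Residual Squares option.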

Multiple linear regression in TI-84 Plus

Advanced features like multiple linear regression are not included in the TI-84 Plus SE. However, obtaining the regression parameters needs nothing more than some built-in matrix operations, and the steps are easy. For a simple example, consider a multiple regression analysis with two independent variables, x1 and x2.

First, the values are entered into lists, which are later turned into matrices. L1 and L2 hold x1 and x2, and L3 holds the dependent variable.
ti84+multireg1

Convert the lists into matrices using the List►matr() function. L1 through L3 are converted to matrices C through E.
ti84+multireg2

Create a matrix of all 1s with the same dimensions as L1 and L2. Then use the augment() function to create a matrix whose first column is L1 (Matrix C), second column is L2 (Matrix D), and third column is the all-1s matrix. In this example the result is stored in matrix F. Note that since augment() takes only two arguments at a time, the calls have to be chained.
ti84+multireg3

Matrix F and its transpose look like this:
ti84+multireg4 ti84+multireg5 ti84+multireg6

Finally, the following formula is used to obtain the parameters for the multiple regression:

([F]ᵀ * [F])⁻¹ * [F]ᵀ * [E]

ti84+multireg7

The parameters are expressed in the result matrix and therefore the multiple regression equation is

y = 41.51x1 - 0.34x2 + 65.32
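The matrix steps can be mirrored in plain Python as a rough cross-check. This sketch builds F with columns x1, x2, and a constant 1, then solves the normal equations (Fᵀ F) b = Fᵀ E, which is equivalent to evaluating the formula above without forming the inverse explicitly. The helper names are hypothetical.

```python
def transpose(m):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def solve(a, b):
    """Solve a x = b (b is a column matrix) by Gauss-Jordan elimination."""
    n = len(a)
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def multiple_regression(x1, x2, y):
    """Return [b1, b2, b0] for y = b1*x1 + b2*x2 + b0."""
    f = [[a, b, 1.0] for a, b in zip(x1, x2)]   # matrix F: columns x1, x2, 1
    ft = transpose(f)                            # F transposed
    e = [[v] for v in y]                         # matrix E: dependent variable
    beta = solve(matmul(ft, f), matmul(ft, e))   # (F^T F) beta = F^T E
    return [row[0] for row in beta]
```

The coefficients come back in the same order as the columns of F, so the constant term is last, matching the layout of the result matrix on the calculator.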

See also this installment on determining the coefficient of determination in a multiple linear regression setting, also using the TI-84.

Comparing bug prediction methods by logistic growth and Gompertz curve in Nspire

Analysis can be performed on a sample data set of cumulative bug counts collected over 12 days to obtain model parameters for future prediction. Columns A and B hold the data, and the standard Nspire logistic regression function is executed into columns C and D to obtain the parameters a, b, c. Column E holds the values of the logistic function, computed not with the Nspire built-in parameters but with parameters obtained separately using the Nelder-Mead program from the previous post.
growth2

There are other models besides logistic regression for prediction. One is a sigmoid function called the Gompertz function, which is applied to the same data set so that its fit can be compared with that of the more common logistic function. Since its parameters are obtained in a similar fashion to those of the logistic function, i.e. by minimizing the sum of squared errors, the Nelder-Mead program can be reused. After obtaining the parameters, the function values on the data set are calculated and shown in Column F.

The application of the Nelder-Mead program to obtain the parameters of the logistic regression is shown below. First the logi function is declared, and then the sum of squared errors is declared in the numfunc_logi function, which in turn is passed to the nm function to find the minimum with the Nelder-Mead algorithm. As shown below, the results are the same as those of the Nspire built-in logistic regression function (a=64.003, b=9.0317, c=0.33644, although the Nspire formula names a, b, c differently).
growth3

The application of the Nelder-Mead program to obtain the parameters of the Gompertz function is similar.
growth4
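As a rough illustration of what is being minimized, the sketch below writes out the two growth curves and a sum-of-squared-errors objective in Python. The parameterizations are assumptions, since the Nspire listings are not reproduced here: the logistic curve is taken as a / (1 + b * exp(-c * t)) and the Gompertz curve as a * exp(-b * exp(-c * t)). The function names are hypothetical, and the optimizer itself is not shown.

```python
import math


def logistic(t, a, b, c):
    """Logistic growth curve (assumed form): a / (1 + b * exp(-c * t))."""
    return a / (1.0 + b * math.exp(-c * t))


def gompertz(t, a, b, c):
    """Gompertz growth curve (assumed form): a * exp(-b * exp(-c * t))."""
    return a * math.exp(-b * math.exp(-c * t))


def sum_squared_error(model, params, days, counts):
    """Objective for the optimizer: squared gaps between model and data."""
    a, b, c = params
    return sum((model(t, a, b, c) - y) ** 2 for t, y in zip(days, counts))
```

An objective such as sum_squared_error(logistic, params, days, counts), viewed as a function of params alone, plays the role that numfunc_logi plays above: it is the function handed to the Nelder-Mead routine, with the Gompertz model swapped in for the second fit.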

The bug counts and the fitted values of both functions are plotted in the graph below, alongside the logistic regression curve. It is hard to tell by eye which of the two functions fits better.
growth1

It turns out that some guesses are better than others. As the calculation of the Ru value below shows, the Gompertz function provides a slightly better fit in this bug prediction case. To calculate it, obtain the one-variable statistics for the bug data (only the sum of squared deviations, stat.SSX, is needed), and then plug in the other values accordingly. As with the R coefficient in regression analysis, the larger the value, the better the prediction. In this case, 0.9248 from Gompertz outperforms 0.9107 from logistic.
growth5
Eduguesstimate is what I’d call this conclusion 😉