Stationarity/Nonstationarity Identification

**Contents**

- Purpose
- Stationarity
- Testing for Stationarity
- The Runs Test (nonparametric)
- Stationarity of Tracking Data
- Parametric Approaches
- Conclusions
- References

The purpose of using formal time series analysis methods on sequential data is to learn "something" about the nature of the system generating the data. In our case, the system we are interested in is the human operator interacting with a multiple degree of freedom input device. The "something" we are trying to learn is the nature of human coordination. Our definition of *coordinated movement* is the simultaneous motion along multiple degrees of freedom resulting in an efficient trajectory.

This write-up is part of an exploration of time series analysis for the purpose of learning something about the nature of human coordination. The first step in all time series analysis approaches is to check for __stationarity__, because if the data is stationary then many simplifying assumptions can be made.

The following definition of stationarity is taken from (Challis and Kitney November 1991).

*Stationarity* is defined as a quality of a process in which the statistical parameters (mean and standard deviation) of the process do not change with time.

- The most important property of a stationary process is that the auto-correlation function (acf) depends on lag alone and does not change with the time at which the function was calculated.
- A *weakly stationary* process has a constant mean and acf (and therefore variance).
- A *truly stationary (or strongly stationary)* process has all higher-order moments constant, including the mean and variance.

The previous definition of stationarity is typical of what can be found in the literature. What is usually not explained in the literature is that strongly stationary processes are never seen in practice and are discussed only for their mathematical properties. Weakly stationary processes are sometimes observed in the real world and are usually *assumed* to be "close enough" to stationarity in the strict sense (strong stationarity) to be treated as such. In addition, stationarity is really a relative term, rather than the absolute the definition above may lead one to believe. A process that "really" is stationary can only be seen as stationary if the sampled data record is very long compared to the period of the lowest frequency component in the data. In other words, if one collects data for only a short time, short compared to the wavelength of the data, then even a stationary process will appear to be nonstationary. Finally, no research seems to exist which discusses what effect deviations, large or small, from stationarity may have on analysis techniques that require stationarity.
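To make the record-length point concrete, here is a small numerical sketch (purely illustrative, not drawn from the cited literature): the per-segment means of a slow sinusoid drift when the record is shorter than one period, but settle near a constant once each segment spans a full period.

```python
import numpy as np

def segment_means(x, n_segments):
    """Split a series into equal segments and return the mean of each."""
    usable = len(x) - len(x) % n_segments   # drop any remainder samples
    return x[:usable].reshape(n_segments, -1).mean(axis=1)

dt = 0.01
t = np.arange(0.0, 1000.0, dt)              # a long, 1000 s record
slow = np.sin(2 * np.pi * t / 50.0)         # lowest component: 50 s period

short = slow[:1000]                         # only 10 s: one fifth of a period
print(segment_means(short, 4))              # means drift steadily upward
print(segment_means(slow, 20))              # 50 s segments: all means near 0
```

Over the short window the process looks nonstationary (the mean changes from segment to segment) even though the underlying sinusoid is perfectly regular.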

For the purpose of analysis, the stationarity property is a very good thing to have in one’s data, since it leads to many simplifying assumptions. Again, the first step in using any methodology for time series analysis is to check if one’s data is stationary.

There are two general approaches to testing for stationarity: parametric and nonparametric. Reviews of the literature seem to indicate that parametric approaches are usually used by researchers working in the time domain, such as economists, who make certain assumptions about the nature of their data. Nonparametric approaches are more commonly used by researchers working in the frequency domain, such as electrical engineers, who often treat the system as a "black box" and cannot make any basic assumptions about its nature. Nonparametric tests are not based on the knowledge or assumption that the population is normally distributed (Bethea and Rhinehart 1991). By making no assumptions about the nature of the data, nonparametric tests are more widely applicable than parametric tests, which often require normality in the data. The trade-off for this wider applicability is that nonparametric tests are less powerful: to arrive at the same statistical conclusion with the same confidence level, they require anywhere from 5% to 35% more data than parametric tests (Bethea and Rhinehart 1991).

A run is defined as "a succession of one or more identical symbols, which are followed and preceded by a different symbol or no symbol at all" (Gibbons 1985). For example, in a series of coin flips, where **H** represents heads and **T** represents tails:

**…THHHHHHTTHTT…**

In the example above, the long succession of **H** is counted as a run of heads. Too few or too many runs is evidence of dependency between the observations and, therefore, of nonstationarity. A runs test counts the number of runs in a series and compares the number found to what one would expect if the observations were independent of one another. The stationarity of data can be determined by using a runs test (Bendat and Piersol 1986) as follows:
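Counting runs in a symbol sequence is mechanical; as a small illustration (the helper function is mine, not from Gibbons), consecutive identical symbols can be grouped and counted:

```python
from itertools import groupby

def count_runs(sequence):
    """Count maximal blocks of identical consecutive symbols."""
    return sum(1 for _symbol, _block in groupby(sequence))

# The coin-flip fragment from the text: T | HHHHHH | TT | H | TT
print(count_runs("THHHHHHTTHTT"))  # -> 5
```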

- Divide the series into time intervals of equal length.
- Compute a mean value (or other parameter estimate, see below) for each interval.
- Count the number of runs of mean values above and below the __median__ value of the series.
- Compare the number of runs found to known probabilities of runs for random data.

Note that the runs test works equally well on mean values, mean square values, variance, standard deviation, or any other parameter estimate (Bendat and Piersol 1986). Known probabilities of runs distributions can be found in (Bendat and Piersol 1986), (Bethea and Rhinehart 1991), and (Gibbons 1985).
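The steps above can be sketched in code. The sketch below follows the text (segment means, runs above and below the series median); the function name and the white-noise example are my own illustration, not an implementation taken from the cited references:

```python
import numpy as np
from itertools import groupby

def runs_above_below_median(series, n_segments):
    """Runs test per the steps above: segment the series, take a mean per
    segment, and count runs of means above/below the series median."""
    series = np.asarray(series, dtype=float)
    usable = len(series) - len(series) % n_segments   # drop the remainder
    segment_means = series[:usable].reshape(n_segments, -1).mean(axis=1)
    above = segment_means > np.median(series)         # True/False per segment
    return sum(1 for _value, _run in groupby(above))  # number of runs

rng = np.random.default_rng(0)
white_noise = rng.normal(size=800)                 # independent observations
print(runs_above_below_median(white_noise, 18))    # typically between 7 and 14
```

A strongly trending series, by contrast, produces very few runs (all early segment means fall below the median and all late ones above), which is exactly the dependency the test detects.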

However, the previous references on the runs test all deal with applying the runs test to a single observation series. This is fine if one only has one sequence of data. What about experiments where one can obtain multiple realizations of the same process? It does not seem appropriate to simply select one of the series and apply a runs test only to that series. The answer proposed here is to apply the runs test to all the data available and then to compare the distribution of runs found to the distribution of runs for a random series. For example, say data from 100 realizations of the same process are collected, where nothing is known about the frequency components of the process generating the data. To test for stationarity, divide each data sample into 18 equal sized segments (the number 18 has been arbitrarily chosen; any even division should work equally well). Then, count the number of runs above and below the median value (of the particular sample) in each data sample. In theory, the number of runs can range from 2 to 18 per sample. For a truly random process, one would expect 90 of the 100 samples to have run counts of at least 7 and not more than 14, where 7 and 14 are the 0.05 left- and right-tail cut-offs (from Table L: Number of Runs Distribution (Gibbons 1985)).
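Under the 18-segment scheme just described, the multi-realization check might look like the following sketch; the cut-offs 7 and 14 come from the text, while the synthetic white-noise data (which, being independent, should pass) is my own stand-in for real experimental data:

```python
import numpy as np
from itertools import groupby

def count_runs_vs_median(sample, n_segments=18):
    """Number of runs of segment means above/below the sample median."""
    sample = np.asarray(sample, dtype=float)
    usable = len(sample) - len(sample) % n_segments
    means = sample[:usable].reshape(n_segments, -1).mean(axis=1)
    above = means > np.median(sample)
    return sum(1 for _value, _run in groupby(above))

rng = np.random.default_rng(1)
realizations = rng.normal(size=(100, 720))     # 100 samples of one process
counts = [count_runs_vs_median(r) for r in realizations]

low, high = 7, 14                              # 0.05 tail cut-offs (Gibbons 1985)
inside = sum(low <= c <= high for c in counts)
print(f"{inside}/100 within cut-offs")         # stationarity: expect about 90
```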

Note that using a 0.05 left- and right-tail cut-off in the previous example is a more stringent requirement than using a 0.01 tail probability (the opposite direction from the probability values of a standard ANOVA test). To illustrate this, imagine using a 0.0001 cut-off. A runs distribution table will give corresponding run counts between 2 and 18, which covers all the possibilities. A 0.0001 criterion is not stringent enough and will result in any data set passing a test for stationarity. For the purposes of the work here on human coordination, a 0.05 tail cut-off is considered a sufficient criterion.

Is the error data collected using the MITS software (Zhai 1995) stationary? To test for stationarity, one subject was run through 38 trials of a 40 second six degree-of-freedom tracking task. The tracking error for each degree of freedom is computed as the difference between the user’s cursor position and the required target position. Data was sampled every 0.05 seconds during the task. Using the methodology from (Bendat and Piersol 1986), runs above and below the median were counted for 228 series (38 trials x 6 degrees of freedom), for different segment sizes.

The results of the runs count for the 40 second trials divided into 16, 14, 12, and 10 segments (corresponding to segments of 2.5, 2.9, 3.3, and 4.0 seconds in length) are presented in Figure 1. The 0.05 tail cut-offs are shown for the different segment sizes. In order for a process to be considered stationary, 90% of the number of runs distribution should be between the left and right tails. The results indicate that segments of 3.3 seconds and greater may be considered stationary, while segments shorter than 3.3 seconds are nonstationary.

Figure 1. Runs distribution of tracking data from 228 series, divided into 2.5, 2.9, 3.3, and 4.0 second segments. The 0.05 tails are shown for the different tests. In order for a process to be considered stationary, 90% of the number of runs should be between the left and right tails. Results indicate that segments of 3.3 seconds and greater may be considered stationary, while segments shorter than 3.3 seconds are nonstationary.

According to (Bowerman and O'Connell 1979) and (Box and Jenkins 1976), if a time series is *nonstationary*, then the sample auto-correlation function will neither cut off nor die down quickly, but rather *will die down extremely slowly*. The next question then becomes: exactly what is considered quick or slow? Unfortunately, there does not seem to be a quantifiable answer to this question in the literature. Clearly, the rate at which a function dies down depends upon the frequency of the signal compared to the sampling rate. Basically, parametric approaches assume a certain level of experience with the data; with that experience, one can then tell by looking whether data may be considered stationary or nonstationary.

Figure 2. The auto-correlation function of the Y translation error from a single 40 second tracking trial.

The auto-correlation function of the error from a tracking trial for one of the degrees of freedom is shown in Figure 2. Does this auto-correlation function agree with the results from the runs test? The runs test is essentially a test of independence: are different observations independent of one another, or are they correlated? Results from the runs test indicate that observations 3.3 seconds and farther apart from each other may be considered independent. The auto-correlation function shown in Figure 2 drops to near zero correlation at lags greater than about 3 to 5 seconds. The auto-correlation does seem to agree with the results of the runs test.
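Since the tracking data itself is not included here, the agreement check can only be illustrated on synthetic data. The sketch below uses the standard biased sample autocorrelation estimate and an AR(1) process as a hypothetical surrogate for the error series (correlated, but with decaying memory); the 20 Hz rate matches the 0.05 second sampling described earlier:

```python
import numpy as np

def sample_acf(x, max_lag):
    """Biased sample autocorrelation, normalized so that acf[0] == 1."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    denom = np.dot(x, x)
    return np.array([np.dot(x[:len(x) - k], x[k:]) / denom
                     for k in range(max_lag + 1)])

fs = 20.0                                      # 20 Hz = one sample per 0.05 s
rng = np.random.default_rng(2)
noise = rng.normal(size=800)
error = np.zeros(800)                          # AR(1) surrogate error series
for i in range(1, 800):
    error[i] = 0.9 * error[i - 1] + noise[i]

acf = sample_acf(error, max_lag=int(5 * fs))   # lags out to 5 seconds
print(acf[int(3.3 * fs)])                      # small correlation by ~3.3 s
```

For data like this, one would read the lag (in samples divided by the sampling rate) at which the acf first settles near zero and compare it against the segment length at which the runs test declares stationarity.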

A conclusion of stationarity is reasonable when one considers the nature of the tracking process that is occurring. Stationarity exists if the mean and variance of the data remain constant. In the MITS task, the target is restricted to moving about within a fixed volume. If one assumes that the subject is at all times attempting to track the target with equal effort, then it is reasonable to expect that the subject’s error (and the variance of that error) will remain relatively constant. The only remaining question is what minimum time period must be considered to achieve a "relatively constant" value. This question has been answered using the runs test. Based on the results from the runs test, segments of 3.3 seconds and larger of tracking data from the MITS tracking task may be considered independent and stationary. Segments smaller than about 3.3 seconds capture only the higher frequency components of the error process of a human tracking in six degrees of freedom.

Bendat, J. S., and Piersol, A. G. (1986). *Random Data: Analysis and Measurement Procedures*, John Wiley & Sons, Inc.

Bethea, R. M., and Rhinehart, R. R. (1991). *Applied Engineering Statistics*, Marcel Dekker, Inc., New York, NY.

Bowerman, B. L., and O'Connell, R. T. (1979). *Time Series and Forecasting*, Duxbury Press, North Scituate, Massachusetts.

Box, G. E. P., and Jenkins, G. M. (1976). *Time Series Analysis: Forecasting and Control*, Holden-Day, San Francisco, California.

Challis, R. E., and Kitney, R. I. (November 1991). "Biomedical signal processing (in four parts). Part 1 Time-domain methods." *Medical & Biological Engineering & Computing*, 28, 509-524.

Gibbons, J. D. (1985). *Nonparametric Methods for Quantitative Analysis*, American Sciences Press, Inc., Columbus, Ohio.

Zhai, S. (1995). "Human Performance in Six Degree of Freedom Input Control," Ph.D., University of Toronto, Toronto.