6 Spectral Analysis -- Smoothed Periodogram Method

Any obvious trend should also be removed prior to spectral estimation. Trend produces a spectral peak at zero frequency, and this peak can dominate the spectrum such that other important features are obscured. After detrending, the next steps are computation of the Fourier transform, computation of the raw periodogram, and smoothing of the periodogram.

Discrete Fourier transform. Say $x_0, x_1, \ldots, x_{n-1}$ is an arbitrary time series of length n. The time series can be expressed as the sum of sinusoids at the Fourier frequencies of the series:

$$ x_t = \frac{A(0)}{2} + \sum_{0 < j < n/2} \left[ A(f_j) \cos 2\pi f_j t + B(f_j) \sin 2\pi f_j t \right] + \left\{ \frac{1}{2} A(f_{n/2}) \cos 2\pi f_{n/2} t \right\}, \qquad t = 0, 1, \ldots, n-1 \tag{1} $$

where the summation is over Fourier frequencies

$$ f_j = \frac{j}{n}, \qquad j = 1, 2, \ldots, \lfloor (n-1)/2 \rfloor , $$

and the last term in braces is included only if n is even (Bloomfield 2000, p. 38). Note that the total number of coefficients is n whether n is even or odd. The coefficients in (1) are given by

$$ A(f) = \frac{2}{n} \sum_{t=0}^{n-1} x_t \cos 2\pi f t, \qquad B(f) = \frac{2}{n} \sum_{t=0}^{n-1} x_t \sin 2\pi f t . \tag{2} $$

Equations (2) are sine and cosine transforms that transform the time series $x_t$ into two series of coefficients of sinusoids. The relationships in (2) can be more succinctly expressed in complex notation by making use of the Euler relation

$$ e^{ix} = \cos x + i \sin x \tag{3} $$

and its inverse

$$ \cos x = \frac{1}{2} \left( e^{ix} + e^{-ix} \right), \qquad \sin x = \frac{1}{2i} \left( e^{ix} - e^{-ix} \right) . \tag{4} $$

In general, observed data are strictly real-valued, but they may be regarded as complex numbers with zero imaginary parts. Suppose $x_0, x_1, \ldots, x_{n-1}$ is such a real-valued time series expressed as complex numbers. The discrete Fourier transform (DFT) of $x_t$ is given in complex notation by

$$ d(f) = \frac{1}{n} \sum_{t=0}^{n-1} x_t e^{-2\pi i f t} . \tag{5} $$

Periodogram. The relationships (2) transform the time series into a series of coefficients at its Fourier frequencies. The discrete Fourier transform is the complex expression of these coefficients,

$$ d(f) = \frac{A(f)}{2} - i \frac{B(f)}{2} , \tag{6} $$

where A and B are identical to the quantities defined in (2). The original data can be recovered from the DFT using the inverse transform

$$ x_t = \sum_{j} d(f_j) e^{2\pi i f_j t} , \tag{7} $$

which is the complex equivalent of equation (1).

Notes_6, GEOS 585A, Spring 2013
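The relationships among the cosine and sine transforms, the DFT, and the inverse transform can be checked numerically. The following is a minimal sketch rather than part of the original notes: it assumes NumPy is available, assumes the normalizations A(f) = (2/n) Σ x_t cos 2πft and d(f) = (1/n) Σ x_t e^{-2πift}, and uses hypothetical function names (`dft`, `cos_sin_coefficients`, `inverse_dft`) and an arbitrary random test series.

```python
import numpy as np

def dft(x):
    """Equation (5): d(f_j) = (1/n) * sum_t x_t * exp(-2*pi*i*f_j*t) at f_j = j/n."""
    x = np.asarray(x, dtype=float)
    n = x.size
    t = np.arange(n)
    freqs = np.arange(n) / n
    # n-by-n matrix of complex exponentials; O(n^2), for illustration only
    kernel = np.exp(-2j * np.pi * np.outer(freqs, t))
    return freqs, kernel @ x / n

def cos_sin_coefficients(x, f):
    """Equation (2), assuming the 2/n normalization for A(f) and B(f)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    t = np.arange(n)
    a = 2.0 / n * np.sum(x * np.cos(2 * np.pi * f * t))
    b = 2.0 / n * np.sum(x * np.sin(2 * np.pi * f * t))
    return a, b

def inverse_dft(d):
    """Equation (7): x_t = sum_j d(f_j) * exp(2*pi*i*f_j*t), summed over j = 0..n-1."""
    d = np.asarray(d)
    n = d.size
    t = np.arange(n)
    freqs = np.arange(n) / n
    kernel = np.exp(2j * np.pi * np.outer(t, freqs))
    return kernel @ d

# Arbitrary test series (not data from the notes)
rng = np.random.default_rng(0)
x = rng.standard_normal(16)

freqs, d = dft(x)
a3, b3 = cos_sin_coefficients(x, freqs[3])
x_back = inverse_dft(d)

# d(f) should equal A(f)/2 - i*B(f)/2, as in equation (6)
print(np.isclose(d[3], a3 / 2 - 1j * b3 / 2))
# The inverse transform should recover the original series
print(np.allclose(x_back.real, x))
```

In practice the direct sums would be replaced by an FFT routine. With this normalization, `np.fft.fft(x) / len(x)` gives the same d(f_j).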

Figure 6.4. Raw periodogram of Wolf sunspot number, 1700-2007. Periodogram ordinates give the relative variance contributed at different frequency ranges centered on fundamental frequencies (after padding) of the series. The number of points in the plot is 256 because the series has been padded to length 512 before periodogram analysis (see Section 6.6).

Smoothing the periodogram. The periodogram is a wildly fluctuating estimate of the spectrum with high variance. For a stable estimate, the periodogram must be smoothed. Bloomfield (2000, p. 157) recommends the Daniell window as a smoothing filter for generating an estimated spectrum from the periodogram. The modified Daniell window of span, or length, m, is defined as

$$ g_i = \begin{cases} \dfrac{1}{2(m-1)}, & i = 1 \text{ or } i = m \\[1ex] \dfrac{1}{m-1}, & \text{otherwise} \end{cases} \tag{13} $$

where m is the number of weights, or span of the filter, and $g_i$ is the ith weight of the filter. The Daniell filter differs from an evenly weighted moving average (rectangular filter) only in that the first and last weights are half as large as the other weights. A plot of the filter weights therefore has the form of a trapezoid. For example, Figure 6.5 shows filter weights of 5-weight Daniell and rectangular filters. The advantage of the Daniell filter over the rectangular filter for smoothing the periodogram is that the Daniell filter has less leakage, which refers to the influence of variance at non-Fourier frequencies on the spectrum at the Fourier frequencies. The leakage is related to sidelobes in the frequency response of the filter. Successive smoothing by Daniell filters with different spans gives an increasingly smooth spectrum, and is equivalent to a single application of a resultant filter produced by convolving the individual Daniell filters (Bloomfield 2000, p. 157).

A smoothed periodogram of the Wolf sunspot number is plotted in Figure 6.6. The smoothing for this example was done with successive application of Daniell filters of length 7 and 11. Broader (longer) filters would give a smoother spectrum. Narrower filters would give a rougher spectrum. The proper amount of smoothing is somewhat subjective, and depends on the characteristics of the data. If the natural periodicity of a series is such that peaks in the spectrum are closely spaced in frequency, use of too broad a filter will merge the peaks. The tradeoffs in

variations and emphasize the slower variations. The effects are progressively stronger as the parameter gets closer to -1.

The theoretical spectrum can be written in terms of the AR parameter, the variance of the residuals and the sample size. For an AR(1) model,

$$ S(f) = \frac{4\sigma^2 / N}{1 + a_1^2 + 2 a_1 \cos(2\pi f)}, \qquad 0 \le f \le 1/2 \tag{16} $$

where $\sigma^2$ is the variance of the residuals, N is the number of observations, $a_1$ is the autoregressive parameter, f is frequency in cycles per year (or time unit), and $S(f)$ is the theoretical spectrum. The sign of the cosine term follows the convention implied above, in which the model is written $x_t + a_1 x_{t-1} = e_t$ and a parameter near -1 gives the strongest emphasis on slow variations. The shape is entirely determined by the AR parameter; $\sigma^2$ and N act to scale the spectrum higher or lower, but do not change the relative distribution of variance over frequency.

Equation (16) can be used to generate a theoretical spectrum for any observed time series. The series is first modeled as an AR(1) process. The autoregressive parameter and variance of residuals are estimated from the data. Plots of spectra for different values of $a_1$ show how the spectrum varies as a function of the autoregressive parameter. The special case of $a_1 = 0$ corresponds to a white noise process. By analogy with visible light, white noise contains an equal mixture of variance at all frequencies. The theoretical spectrum of white noise is a horizontal line.

For $a_1 < 0$, the spectrum is enhanced at the low frequencies and depleted at the higher frequencies. By analogy with light, the spectrum is called "red noise."

For $a_1 > 0$, the process tends to create erratic short-term variations, with positive autocorrelation at even lags and negative autocorrelation at odd lags. The spectrum is enriched at the high frequencies, and depleted at the low frequencies. Such series are sometimes called "blue noise."
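The AR(1) null spectrum is simple to evaluate on a grid of frequencies. The sketch below is illustrative only. It assumes NumPy and assumes the sign convention x_t + a_1 x_{t-1} = e_t (so that a_1 near -1 gives red noise, as the text implies), and the function name `ar1_spectrum` is hypothetical.

```python
import numpy as np

def ar1_spectrum(a1, resid_var, n_obs, freqs):
    """Theoretical AR(1) spectrum on 0 <= f <= 1/2.

    Assumes the model is written x_t + a1 * x_{t-1} = e_t, so that
    a1 < 0 gives red noise and a1 > 0 gives blue noise.
    """
    freqs = np.asarray(freqs, dtype=float)
    denom = 1.0 + a1 ** 2 + 2.0 * a1 * np.cos(2.0 * np.pi * freqs)
    return (4.0 * resid_var / n_obs) / denom

freqs = np.linspace(0.0, 0.5, 6)

white = ar1_spectrum(0.0, 1.0, 100, freqs)   # flat spectrum
red = ar1_spectrum(-0.5, 1.0, 100, freqs)    # power concentrated at low frequency
blue = ar1_spectrum(0.5, 1.0, 100, freqs)    # power concentrated at high frequency

print(white)
print(red[0] > red[-1], blue[0] < blue[-1])
```

If the model is instead written x_t = a_1 x_{t-1} + e_t, the sign of the cosine term flips and the roles of positive and negative a_1 are exchanged.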

Figure 6.6. Smoothed periodogram estimate of spectrum of Wolf sunspot number. Raw periodogram (points) smoothed by Daniell filters of length 7 and 11. Bandwidth gives resolution of the spectral estimate. Spectral peak at 10.6 years.

Figure 6.7. Sketch of general shapes of white noise, AR(1) and AR(2) null continua.
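The successive Daniell smoothing named in the caption of Figure 6.6 (spans 7 and 11) can be sketched as follows. This is a minimal illustration, not the code used for the figure: it assumes NumPy, restricts spans to odd values, and handles the ends of the periodogram by reflection, which is one of several possible choices. The function names are hypothetical.

```python
import numpy as np

def daniell_weights(m):
    """Modified Daniell weights of odd span m: end weights are half the interior weights."""
    if m < 3 or m % 2 == 0:
        raise ValueError("this sketch assumes an odd span of at least 3")
    g = np.full(m, 1.0 / (m - 1))
    g[0] = g[-1] = 1.0 / (2 * (m - 1))
    return g

def resultant_filter(spans):
    """Convolve the individual Daniell filters into a single equivalent filter."""
    g = np.array([1.0])
    for m in spans:
        g = np.convolve(g, daniell_weights(m))
    return g

def smooth_periodogram(pgram, spans):
    """Apply the resultant filter; ends are extended by reflection (an assumption)."""
    pgram = np.asarray(pgram, dtype=float)
    g = resultant_filter(spans)
    half = len(g) // 2
    padded = np.pad(pgram, half, mode="reflect")
    return np.convolve(padded, g, mode="valid")

g5 = daniell_weights(5)            # [0.125, 0.25, 0.25, 0.25, 0.125]
g_res = resultant_filter([7, 11])  # 17 weights that sum to 1

# Synthetic stand-in for a raw periodogram (not the sunspot data)
rng = np.random.default_rng(1)
raw = rng.exponential(size=256)
smooth = smooth_periodogram(raw, [7, 11])
print(len(g_res), g_res.sum(), smooth.shape)
```

Because the resultant weights sum to 1, a constant periodogram passes through the smoother unchanged, while the variance of the ordinates is reduced for a fluctuating one.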

where $\hat{s}(f)$ is the spectral estimate at frequency f, $s(f)$ is the true, and unknown, value of the spectrum, assumed to be approximately constant over the interval of averaging, and the summation $\sum_u g_u^2$ is the sum of squared weights of the Daniell filter used to smooth the periodogram. The weights applied to the periodogram must sum to 1 for the spectral estimate to be an unbiased estimate of the true spectrum (Bloomfield 2000, p. 178). The broader the Daniell filter, the lower the sum of squares of weights and the lower the variance of the spectral estimate. For example, for the 3-weight Daniell filter {.25, .50, .25} the sum of squares of weights is 0.375, while for the 5-weight filter {.125, .25, .25, .25, .125} the sum of squares is 0.2188.

An approximate confidence interval for the spectral estimate can be derived by considering that the periodogram estimates are independent and exponentially distributed. The spectral estimate, as a sum of independent exponentially distributed quantities, is approximately $\chi^2$ distributed. The distribution of $\nu \hat{s}(f) / s(f)$ can be shown to be approximately $\chi^2$ with degrees of freedom

$$ \nu = \frac{2}{g^2} \tag{21} $$

where $g^2 = \sum_u g_u^2$ is the sum of squared Daniell weights. The relationship in (21) can be used to place a confidence interval around the spectral estimates. For example, a 95% confidence interval for $s(f)$ is given by

$$ \frac{\nu \hat{s}(f)}{\chi^2_\nu(0.975)} \le s(f) \le \frac{\nu \hat{s}(f)}{\chi^2_\nu(0.025)} \tag{22} $$

where $\chi^2_\nu(0.025)$ and $\chi^2_\nu(0.975)$ are the 2.5% and 97.5% points of the $\chi^2$ distribution with $\nu$ degrees of freedom.

Resolution. Resolution is the ability of the spectrum to represent the fine structure of the frequency properties of the series. The fine structure is the variation in the spectrum between closely spaced frequencies. For example, narrow peaks are part of the fine structure of the spectrum. The raw periodogram measures the variance contributions at the Fourier frequencies, or the finest possible structure. Smoothing the periodogram, for example with a Daniell filter, averages over adjacent periodogram estimates, and consequently lowers the resolution of the spectrum. The wider the Daniell filter, the greater the smoothing and the greater the decrease in resolution.

If two periodic components in the series are close to the same frequency, the smoothed spectrum might be incapable of identifying, or resolving, the individual peaks. The width of the frequency interval applicable to a spectral estimate is called the bandwidth of the estimate. If a hypothetical periodogram were to have just a single peak at a particular Fourier frequency, the smoothed spectrum would be roughly the image of the Daniell filter used to smooth the periodogram, and the peak in the spectrum would be spread out over several Fourier frequencies. How many Fourier frequencies the peak covers depends on the spans of the filter. A reasonable measure of the bandwidth of the spectral estimate is therefore the width of the resultant Daniell filter used to smooth the periodogram. Depending on how the resultant Daniell filter has been constructed, the shape of the filter also varies. Thus one filter may have only a few weights appreciably different from zero, while another filter of the same length may have fewer or more appreciably non-zero weights. Rather than the width of the Daniell filter, therefore, a more effective measure of bandwidth also takes into account the values of the Daniell filter weights. One such measure of bandwidth is the width of the rectangular filter that has the same variance as the Daniell filter.
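The degrees of freedom, the confidence-interval multipliers, and the equal-variance bandwidth all follow from the sum of squared filter weights. The sketch below is a hedged illustration: it assumes NumPy and SciPy (`scipy.stats.chi2.ppf` for the chi-square percentage points), ignores any variance inflation from tapering or padding, and takes the Fourier-frequency spacing as one over the padded length. The function name and the returned dictionary keys are hypothetical.

```python
import numpy as np
from scipy.stats import chi2

def spectral_estimate_properties(weights, n_padded, level=0.95):
    """Degrees of freedom, CI multipliers and bandwidth for a smoothing filter.

    weights  : resultant filter weights (assumed to sum to 1)
    n_padded : length of the (possibly padded) series, so spacing is 1/n_padded
    """
    g = np.asarray(weights, dtype=float)
    sum_sq = np.sum(g ** 2)
    nu = 2.0 / sum_sq                      # degrees of freedom
    alpha = 1.0 - level
    # Multiply the spectral estimate by these factors to get the interval limits
    lower_factor = nu / chi2.ppf(1.0 - alpha / 2.0, nu)
    upper_factor = nu / chi2.ppf(alpha / 2.0, nu)
    # A rectangular filter of n_w equal weights 1/n_w has sum of squares 1/n_w
    n_w = 1.0 / sum_sq
    bandwidth = n_w / n_padded
    return {"sum_sq": sum_sq, "nu": nu, "lower": lower_factor,
            "upper": upper_factor, "n_w": n_w, "bandwidth": bandwidth}

# 5-weight Daniell filter from the text, series padded to length 512
props = spectral_estimate_properties([0.125, 0.25, 0.25, 0.25, 0.125], 512)
for key, value in props.items():
    print(f"{key}: {value:.4f}")
```

Note that the degrees of freedom need not be an integer; `chi2.ppf` accepts fractional values, so no rounding is needed in this sketch.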

The variance of the estimator is proportional to the sum of squares of the filter weights. The bandwidth for a given Daniell filter can therefore be computed as follows:

1. Compute the sum of squares of the Daniell filter weights.
2. Compute the number of weights $n_w$ of the evenly weighted moving average that has the same sum of squares as computed in (1). An evenly weighted filter of $n_w$ weights, each equal to $1/n_w$, has sum of squares $1/n_w$, so $n_w$ is the reciprocal of the sum of squares.
3. Compute the bandwidth as $bw = n_w \, \Delta f$, where $\Delta f$ is the spacing of the Fourier frequencies. (Note that if the series has been padded to length $N'$, the spacing is taken as $1/N'$.)

Differences in smoothness, stability and resolution are illustrated for spectra of the Wolf sunspot series in Figure 6.9. A lesser amount of smoothing of the raw periodogram yields the spectrum in Figure 6.9A. A greater amount of smoothing yields the spectrum in Figure 6.9B. The bandwidths indicate the differences in resolution of the two versions of the spectrum. Both versions clearly show the main spectral peak near 11 years, but the peak is narrower and much higher for the spectrum with less smoothing. On the other hand, the confidence interval around the spectrum is much tighter for the spectrum with greater smoothing. Trial-and-error computing and plotting of spectra with different degrees of smoothing for the smoothed periodogram method is analogous to the "window smoothing" approach described in lesson 4 for the Blackman-Tukey method of spectral estimation.

Note that the sunspot series also exhibits a spectral peak near frequency 0.01 (wavelength 100 years). This lower-frequency fluctuation is evident also in the time plot of the series (Figure 6.2). Too much smoothing (e.g., Figure 6.9B) makes it impossible to resolve this peak from trend (peak at zero frequency).

6.5 Testing for periodicity

A peak in the estimated spectrum can be tested for significance by comparing the spectral estimate at a given frequency with the confidence interval for the estimate. Two considerations for the testing are:

1. A significance test requires a null hypothesis. For the spectrum, the null hypothesis is that the spectrum at the specified frequency is not different from some "null" spectrum, or null continuum. An earlier section described a white noise null continuum, an autoregressive null continuum and a null continuum based on a greatly smoothed raw periodogram. The null hypothesis is then that the estimated spectrum is no different than this underlying spectrum.
2. The confidence bands developed above (equation 22) are not simultaneous. In other words, the bands should be used strictly to test for significance of a peak at a specified frequency, and that frequency should be specified before running the spectral analysis. This approach can be contrasted with a "fishing expedition", in which the spectrum is estimated and then browsed to identify "significant" peaks. Simultaneous confidence bands, which would be much wider than those given by equation 22, are needed if the spectrum is to be used in such an exploratory mode to pick out significant peaks.

To summarize, the test for periodicity begins with specification of a period or frequency of interest. Second, the spectrum and its confidence interval are estimated, possibly using a window-closing procedure. Third, a null continuum is drawn so that the peaks in the spectrum can be compared to a "null" spectrum without those peaks but with the same broad underlying spectral shape. Finally, the peak is judged significant at 95% if the confidence interval around the peak does not include the null continuum, that is, if the lower confidence limit lies above the null continuum.
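The final step of the test, comparing the lower confidence limit with the null continuum at one pre-specified frequency, can be sketched as below. This is an assumption-laden illustration rather than a prescribed procedure: it assumes NumPy, takes the lower-limit multiplier (the degrees of freedom divided by the 97.5% chi-square point) as an input, and uses invented toy values for the spectrum and the null continuum. The function name is hypothetical.

```python
import numpy as np

def peak_is_significant(freqs, s_hat, null_continuum, lower_factor, f_test):
    """Test one frequency, chosen before looking at the spectrum.

    lower_factor is the multiplier that turns the spectral estimate into
    its lower confidence limit (for example 0.39 for 5 degrees of freedom).
    Returns (significant, index of the Fourier frequency nearest f_test).
    """
    freqs = np.asarray(freqs, dtype=float)
    s_hat = np.asarray(s_hat, dtype=float)
    null_continuum = np.asarray(null_continuum, dtype=float)
    k = int(np.argmin(np.abs(freqs - f_test)))
    lower_limit = lower_factor * s_hat[k]
    return bool(lower_limit > null_continuum[k]), k

# Toy values for illustration only (not the sunspot spectrum)
freqs = np.linspace(0.0, 0.5, 6)                    # 0, 0.1, ..., 0.5
s_hat = np.array([1.0, 5.0, 1.0, 1.0, 1.0, 1.0])    # peak at f = 0.1
null = np.ones_like(s_hat)                          # white noise null continuum

print(peak_is_significant(freqs, s_hat, null, 0.39, f_test=0.1))
print(peak_is_significant(freqs, s_hat, null, 0.39, f_test=0.3))
```

The function deliberately tests a single frequency. Scanning it over all frequencies would amount to the "fishing expedition" that the non-simultaneous confidence bands do not support.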

Figure 6.9. Spectra of Wolf sunspot number using two levels of smoothing of raw periodogram. (A) Smoothing with Daniell filter spans [3 5 7]. (B) Smoothing with Daniell filter spans [11 15 23]. Note the difference in range of y-axis.

6.6 Additional considerations: tapering, padding and leakage

Tapering and padding. Tapering and padding are mathematical manipulations sometimes performed on the time series before periodogram analysis to improve the statistical properties of the spectral estimates or to speed up the computations. In spectral analysis, a time series is regarded as a finite sample of an infinitely long series, and the objective is to infer the properties of the infinitely long series. If the observed time series is viewed as repeating itself an infinite number of times, the sample can be considered as resulting from applying a data window to the infinite series. The data window is a series of weights equal to 1 for the N observations of the time series and zero elsewhere. This data window is rectangular in appearance. The effect of the rectangular data window on spectral estimation is to distort the estimated spectrum of the unknown infinite-length series by introducing leakage. Leakage refers to the phenomenon by which variance at an important frequency (say a frequency of a strong periodicity) "leaks" into other frequencies in the estimated spectrum. The net effect is to produce misleading peaks in the estimated spectrum.

The objective of tapering is to reduce leakage. Tapering consists of altering the ends of the mean-adjusted time series so that they taper gradually down to zero. Before tapering, the mean is subtracted so that the series has mean zero. A mathematical taper is then applied. A frequently used taper function is the split cosine bell, given by

$$ w_p(t) = \begin{cases} \dfrac{1}{2} \left[ 1 - \cos(2\pi t / p) \right], & 0 \le t < p/2 \\[1ex] 1, & p/2 \le t < 1 - p/2 \\[1ex] \dfrac{1}{2} \left[ 1 - \cos(2\pi (1 - t) / p) \right], & 1 - p/2 \le t \le 1 \end{cases} \tag{23} $$

where p is the proportion of data desired to be tapered, t is the time index rescaled to the interval from 0 to 1, and $w_p(t)$ are the taper weights. A suggested proportion is 10%, or p = 0.10, which means that 5% is tapered on each end (Bloomfield 2000, p. 69).

Padding. The Fast Fourier Transform (FFT), introduced by Cooley and Tukey (1965), is a computational algorithm that can greatly speed up computation of the Fourier transform and spectral analysis. The FFT is most effective if the length of the time series, n, has only small prime factors. One way of achieving this is to pad the time series with zeros until the length of the series is a power of 2 before computing the Fourier transform. The padded data are defined as

$$ x'_t = \begin{cases} x_t, & 0 \le t < n \\ 0, & n \le t < n' \end{cases} \tag{24} $$

where $x_t$ is the original time series, after subtracting the mean, and $n'$ is the padded length. It can be shown (Bloomfield 2000, p. 61) that the discrete Fourier transform of the padded series differs trivially from that of the original series.

As a side effect of padding, the grid of frequencies on which the transform is calculated is changed to a finer spacing. This change suggests that padding with zeros can also be used to alter the Fourier frequencies such that some period of a priori interest falls near a Fourier frequency. This is an acceptable procedure (e.g., Mitchell et al. 1966). The finer spacing of Fourier frequencies for a given span of Daniell filter gives a spectral estimate with a narrower bandwidth (see #3 under "Resolution" above), but the increase in resolution comes at the expense of a decrease in stability of the spectral estimate (see eqn (26) below).
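Tapering and padding, as described in this section, can be combined in a few lines. The sketch below assumes NumPy and makes two choices that the notes do not specify: the time index is rescaled to the unit interval using midpoints, (t + 1/2)/n, and the series is padded to the next power of 2 at or above its length. The function names are hypothetical.

```python
import numpy as np

def split_cosine_bell(n, p=0.10):
    """Split cosine bell taper weights for a series of length n.

    p is the total proportion tapered (p/2 at each end). The midpoint
    rescaling (t + 0.5) / n is an assumption of this sketch.
    """
    t = (np.arange(n) + 0.5) / n
    w = np.ones(n)
    left = t < p / 2
    right = t > 1 - p / 2
    w[left] = 0.5 * (1 - np.cos(2 * np.pi * t[left] / p))
    w[right] = 0.5 * (1 - np.cos(2 * np.pi * (1 - t[right]) / p))
    return w

def taper_and_pad(x, p=0.10):
    """Subtract the mean, taper, then append zeros up to a power of 2."""
    x = np.asarray(x, dtype=float)
    n = x.size
    tapered = (x - x.mean()) * split_cosine_bell(n, p)
    n_padded = 1 << (n - 1).bit_length()   # smallest power of 2 >= n
    padded = np.zeros(n_padded)
    padded[:n] = tapered
    return padded

w = split_cosine_bell(300, p=0.20)
# Synthetic series of length 300 (not data from the notes)
rng = np.random.default_rng(2)
x_padded = taper_and_pad(rng.standard_normal(300), p=0.20)
print(np.sum(w < 1), x_padded.size)
```

With a length of 300 and p = 0.20, 30 points are tapered at each end and the series is padded to length 512.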

Effect of padding and tapering on stability. Tapering and padding both have the effect of increasing the variance of the spectral estimate. If the time series is tapered by the split cosine bell taper and the total proportion of the series tapered is p, the variance of the spectral estimate (see eqn (20)) is increased by a factor of

$$ c_T = \frac{128 - 93p}{2(8 - 5p)^2} . \tag{25} $$

If the time series is padded from an initial length of N to a padded length of $N'$, the variance is increased by a factor of

$$ c_P = \frac{N'}{N} . \tag{26} $$

If a time series has been padded and tapered, an equation of form (22) can still be used for the confidence interval for the spectrum, except with an effective degrees of freedom defined as

$$ \nu = \frac{2}{g_*^2} \tag{27} $$

where

$$ g_*^2 = c_T \, c_P \sum_u g_u^2 . \tag{28} $$

A simple example will serve to illustrate the computation of a confidence interval when the series has been padded and tapered before computation of the spectrum. Say the original time series has a length of 300 years, a total of 20% of the series has been tapered, and the tapered series has then been padded to length 512 by appending zeros. Equations (25) and (26) give variance inflation factors

$$ c_T = \frac{128 - 93p}{2(8 - 5p)^2} = \frac{128 - 93(.2)}{2(8 - 5(.2))^2} = 1.1163 \tag{29} $$

and

$$ c_P = \frac{512}{300} = 1.7067 . \tag{30} $$

If the periodogram is smoothed by a 5-weight Daniell filter, {.125 .25 .25 .25 .125}, the quantity $g_*^2$ is given by

$$ g_*^2 = c_T \, c_P \sum_u g_u^2 = 1.1163 (1.7067)(.2188) = 0.4169 , \tag{31} $$

the equivalent degrees of freedom are

$$ \nu = \frac{2}{g_*^2} = \frac{2}{0.4169} = 4.80 \approx 5 , \tag{32} $$

and the 95% confidence interval is

$$ \frac{5 \hat{s}(f)}{12.83} \le s(f) \le \frac{5 \hat{s}(f)}{0.8312} \quad \text{or} \quad 0.39 \, \hat{s}(f) \le s(f) \le 6.01 \, \hat{s}(f) . \tag{33} $$
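The arithmetic of the worked example can be reproduced directly. The sketch below uses only the Python standard library and takes the chi-square percentage points for 5 degrees of freedom (12.83 and 0.8312) from the text rather than computing them. The function names are hypothetical, and rounding the degrees of freedom to the nearest integer mirrors the example rather than being a general requirement.

```python
def taper_inflation(p):
    """Variance inflation factor for a split cosine bell taper of total proportion p."""
    return (128 - 93 * p) / (2 * (8 - 5 * p) ** 2)

def padding_inflation(n, n_padded):
    """Variance inflation factor for padding from length n to n_padded."""
    return n_padded / n

weights = [0.125, 0.25, 0.25, 0.25, 0.125]   # 5-weight Daniell filter
sum_sq = sum(w * w for w in weights)         # 0.21875, quoted as .2188 in the text

c_t = taper_inflation(0.20)
c_p = padding_inflation(300, 512)
g_star_sq = c_t * c_p * sum_sq
nu = 2 / g_star_sq
nu_rounded = round(nu)

# 97.5% and 2.5% points of chi-square with 5 degrees of freedom, as quoted in the text
chi2_975, chi2_025 = 12.83, 0.8312
lower_factor = nu_rounded / chi2_975
upper_factor = nu_rounded / chi2_025

print(f"c_T = {c_t:.4f}, c_P = {c_p:.4f}, g*^2 = {g_star_sq:.4f}")
print(f"nu = {nu:.2f} (rounded to {nu_rounded})")
print(f"{lower_factor:.2f} * s_hat <= s <= {upper_factor:.2f} * s_hat")
```

Small differences in the fourth decimal place relative to the text arise because the text rounds the sum of squares to .2188 before multiplying.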

6.7 References

Anderson, O.D., 1976, Time series analysis and forecasting: the Box-Jenkins approach: London, Butterworths, 182 p.
Blackman, R.B., and Tukey, J.W., 1959, The measurement of power spectra, from the point of view of communications engineering: New York, Dover.
Bloomfield, P., 2000, Fourier analysis of time series: an introduction, second edition: New York, John Wiley & Sons, Inc., 261 p.
Chatfield, C., 2004, The analysis of time series, an introduction, sixth edition: New York, Chapman & Hall/CRC.
Cooley, J.W., and Tukey, J.W., 1965, An algorithm for the machine computation of complex Fourier series: Math. Comput., v. 19, p. 297-301.
Daniell, P.J., 1946, Discussion on the symposium on autocorrelation in time series: J. Roy. Statist. Soc. (Suppl.), v. 8, p. 88-90.
Einstein, A., 1914, Arch. Sci. Phys. Natur., v. 4.37, p. 254-256.
Hamming, R.W., and Tukey, J.W., 1949, Measuring noise color: Bell Telephone Laboratories Memorandum.
Hayes, M.H., 1996, Statistical digital signal processing and modeling: New York, John Wiley & Sons, Inc.
Kay, S.M., 1988, Modern spectral estimation: Englewood Cliffs, NJ, Prentice Hall.
Lagrange, 1873, Recherches sur la manière de former des tables des planètes d'après les seules observations, in Oeuvres de Lagrange, v. VI, p. 507-627.
Mitchell, J.M., Jr., Dzerdzeevskii, B., Flohn, H., Hofmeyr, W.L., Lamb, H.H., Rao, K.N., and Wallén, C.C., 1966, Climatic change: Technical Note No. 79, report of a working group of the Commission for Climatology; WMO No. 195 TP 100: Geneva, Switzerland, World Meteorological Organization, 81 p.
Percival, D.B., and Walden, A.T., 1993, Spectral analysis for physical applications: Cambridge University Press.
Schuster, A., 1897, On lunar and solar periodicities of earthquakes: Proc. Roy. Soc., p. 455-465.
The MathWorks, I., 1996, Matlab signal processing toolbox: Natick, MA, The MathWorks, Inc.
Thomson, W., 1876, On an instrument for calculating the integral of the product of two given functions: Proc. Roy. Soc., v. 24, p. 266-268.
Vautard, R., and Ghil, M., 1989, Singular spectrum analysis in nonlinear dynamics, with applications to paleoclimatic time series: Physica D, v. 35, p. 395-424.
Welch, P.D., 1967, The use of fast Fourier transform for the estimation of power spectra: A method based on time averaging over short, modified periodograms: IEEE Trans. Audio Electroacoust., v. AU-15, p. 70-73.
Wilks, D.S., 1995, Statistical methods in the atmospheric sciences: New York, Academic Press, 467 p.
