<< Chapter < Page Chapter >> Page >

Plans for higher vector ranks

These plans extract a vector loop to reduce a DFT problem to a problem of lower vector rank, which is then solved recursively. Anyof the vector loops of V could be extracted in this way, leading to a number of possible plans corresponding to different looporderings.

Formally, to solve dft ( N , V , I , O ) , where V = ( n , ι , o ) V 1 , FFTW generates a loop that, for all k such that 0 k < n , invokes a plan for dft ( N , V 1 , I + k · ι , O + k · o ) .

Indirect plans

Indirect plans transform a DFT problem that requires some data shuffling (or discontiguous operation) into a problem that requires noshuffling plus a rank-0 problem that performs the shuffling.

Formally, to solve dft ( N , V , I , O ) where N > 0 , FFTW generates a plan that first solves dft ( , N V , I , O ) , and then solves dft ( copy-o ( N ) , copy-o ( V ) , O , O ) . Here we define copy-o ( t ) to be the I/O tensor ( n , o , o ) ( n , ι , o ) t : that is, it replaces the input strides with the output strides. Thus, an indirect plan firstrearranges/copies the data to the output, then solves the problem in place.

Plans for prime sizes

As discussed in "Goals and Background of the FFTW Project" , it turns out to be surprisingly useful to be able to handle large prime n (or large prime factors). Rader plans implement the algorithm from [link] to compute one-dimensional DFTs of prime size in Θ ( n log n ) time. Bluestein plans implement Bluestein's “chirp-z” algorithm, which can also handle prime n in Θ ( n log n ) time [link] , [link] , [link] . Generic plans implement a naive Θ ( n 2 ) algorithm (useful for n 100 ).

Discussion

Although it may not be immediately apparent, the combination of the recursive rules in "The space of plans in FFTW" can produce a number of useful algorithms. To illustrate these compositions, we discuss three particular issues: depth- vs. breadth-first, loop reordering,and in-place transforms.

Two descriptions of size-30 DFT, one depth-first, and one breadth-first.
Two possible decompositions for a size-30 DFT, both for the arbitrary choice of DIT radices 3 then 2 then 5, and prime-sizecodelets. Items grouped by a " { " result from the plan for a single sub-problem. In the depth-first case, the vector rank wasreduced to zero as per "Plans for higher vector ranks" before decomposing sub-problems, and vice-versa in the breadth-first case.

As discussed previously in sections "Review of the Cooley-Tukey FFT" and "Understanding FFTs with an ideal cache" , the same Cooley-Tukey decomposition can be executed in either traditionalbreadth-first order or in recursive depth-first order, where the latter has some theoretical cache advantages. FFTW is explicitlyrecursive, and thus it can naturally employ a depth-first order. Because its sub-problems contain a vector loop that can be executed ina variety of orders, however, FFTW can also employ breadth-first traversal. In particular, a 1d algorithm resembling thetraditional breadth-first Cooley-Tukey would result from applying "Cooley-Tukey plans" to completely factorize the problem size before applying the loop rule "Plans for higher vector ranks" to reduce the vector ranks, whereas depth-first traversal would result fromapplying the loop rule before factorizing each subtransform. These two possibilities are illustrated by an example in [link] .

Get Jobilize Job Search Mobile App in your pocket Now!

Get it on Google Play Download on the App Store Now




Source:  OpenStax, Fast fourier transforms. OpenStax CNX. Nov 18, 2012 Download for free at http://cnx.org/content/col10550/1.22
Google Play and the Google Play logo are trademarks of Google Inc.

Notification Switch

Would you like to follow the 'Fast fourier transforms' conversation and receive update notifications?

Ask
Prateek Ashtikar
Start Quiz
Marion Cabalfin
Start Quiz
Royalle Moore
Start Quiz
Jemekia Weeden
Start Quiz
David Bourgeois
Start Quiz