Overview

Getting Started

Basic usage of NFFT.jl is shown in the following example for 1D:

using NFFT

J, N = 8, 16
k = range(-0.4, stop=0.4, length=J)  # nodes at which the NFFT is evaluated
f = randn(ComplexF64, J)             # data to be transformed
p = plan_nfft(k, N, reltol=1e-9)     # create plan
fHat = adjoint(p) * f                # calculate adjoint NFFT
y = p * fHat                         # calculate forward NFFT

In the same way the 2D NFFT can be applied:

J, N = 16, 32
k = rand(2, J) .- 0.5
f = randn(ComplexF64, J)
p = plan_nfft(k, (N,N), reltol=1e-9)
fHat = adjoint(p) * f
y = p * fHat

Currently, the eltype of the arguments f and fHat must be compatible with that of the sampling nodes k used in the plan_nfft call. For example, if one wants to use Float32 types to save memory, one can create the plan like this:

k = Float32.(LinRange(-0.5,0.5,64))
p = plan_nfft(k, N)

The plan will then internally use Float32 types. The signals f and fHat then need to have the eltype Complex{Float32} or equivalently ComplexF32. Otherwise there will be error messages.

In the previous example, the output vector was allocated within the * method. To avoid this allocation one can use the interface

mul!(fHat, p, f)
mul!(y, adjoint(p), fHat)

which allows to pass the output vector as the first argument.

One can also perform NFFT computations directly without first creating a plan:

fHat = nfft(k, f)
y = nfft_adjoint(k, N, fHat)

These forms are more forgiving about the types of the input arguments. The versions based on a plan are more optimized for repeated use with the same sampling nodes k. Note that nfft_adjoint requires the extra argument N since this cannot be derived from the input vector as can be done for nfft.

Note

The constructor plan_nfft is meant to be a generic factory function that can be implemented in different packages. If you want to use a concrete constructor of NFFT.jl use NFFTPlan instead.

Parameters

The NFFT has the several parameters that can be passed as a keyword argument to the constructor or the nfft and the nfft_adjoint function.

ParameterDescriptionExample Values
reltolRelative tolerance that the NFFT achieves.reltol $=10^{-9}$
mKernel size parameter. The convolution matrix has 2m+1 non-zero entries around each sampling node in each dimension.m $\in \{2,\dots,8\}$
σOversampling factor. The inner FFT is of size σNσ $\in [1.25, 2.0]$
windowConvolution window $\hat{\varphi}$:kaiser_bessel
precomputeFlag indicating the precomputation strategy for the convolution matrixTENSOR
blockingFlag block partitioning should be used to speed up computationtrue
sortNodesFlag if the nodes should be sorted in a lexicographic wayfalse
storeDeconvolutionIdxFlag if the deconvolve indices should be stored. Currently this option is necessary on the GPUtrue
fftflagsflags passed to the inner AbstractFFT as flags. This can for instance be FFTW.MEASURE in order to optimize the inner FFTFFTW.ESTIMATE

In practice the default values are properly chosen and there is in most cases no need to change them. The only parameter you sometimes need to care about are the accuracy parameters reltol, m,\sigma and the fftflags.

Accuracy

On a high-level it is possible to set the accuracy using the parameter reltol. It will automatically set the low-level parameters m and \sigma. You only need to change the later if you run into memory issues. It is important that you change only reltol or the pair m,\sigma.

The relation between reltol, m, and σ depends on the window function and the NFFT implementation. We use the formula

\[w = 2m + 1 = \left\lceil \text{log}_{10} \frac{1}{\text{reltol}} \right\rceil + 1\]

which was verified for σ and the default window function. If you change the window function, you should use the parameter m, and σ instead of reltol.

Window Functions

It is possible to change the internal window function $\hat{\varphi}$. Available are

  • :gauss
  • :spline
  • :kaiser_bessel_rev
  • :kaiser_bessel
  • :cosh_type

and one can easily add more by extending the windowFunctions.jl file in Github.

However, the possibility of changing the window function is only important for NFFT researcher and not for NFFT users. Right now :kaiser_bessel provides the best accuracy and thus there is no reason to change the parameter window to something different.

Precomputation

There are different precomputation strategies available:

ValueDescription
NFFT.POLYNOMIALThis option approximates the window function by a polynomial with high degree and evaluates the polynomial during the actual convolution.
NFFT.LINEARThis option uses a look-up table to first sample the window function and later use linear interpolation during the actual convolution.
NFFT.FULLThis option precomputes the entire convolution matrix and stores it as a SparseMatrixCSC. This option requires more memory and the longest precomputation time. This allows simple GPU implementations like realized in CuNFFT.
NFFT.TENSORThis option calculates the window on demand but exploits the tensor product structure for multi-dimensional plans.

Again you don't need to change this parameter since the default NFFT.POLYNOMIAL is a good choice in most situations. You may want to use NFFT.TENSOR if you are applying the same transform multiple times since it is a little bit faster than NFFT.POLYNOMIAL but has a higher pre-computation time.

Block Partitioning

Internally NFFT can use block partitioning to speedup computation. It helps in two ways

  • It helps improving the memory efficiency by grouping sampling points together which allows for better use of CPU caches.
  • Block partitioning is a mandatory to enable multi-threading in the adjoint NFFT, which would otherwise not be possible because of a data dependency.

We enable block partitioning by default since it helps also in the single-threaded case and thus, there usually is no reason to switch it off.

Multi-Threading

Most parts of NFFT are multi-threaded when running on the CPU. To this end, start Julia with the option

julia -t T

where T it the number of desired threads. NFFT.jl will use all threads that are specified.

Directional

There are special methods for computing 1D NFFT's for each 1D slice along a particular dimension of a higher dimensional array.

J = 11

y = rand(J) .- 0.5
N = (16,20)
p1 = plan_nfft(y, N, dims=1)
f = randn(ComplexF64, N)
fHat = p1 * f

Here size(f) = (16,20) and size(fHat) = (11,20) since we compute an NFFT along the first dimension. To compute the NFFT along the second dimension

p2 = plan_nfft(y, N, dims=2)
fHat = p2 * f

Now size(fHat) = (16,11).