Overview
Getting Started
Basic usage of NFFT.jl is shown in the following example for 1D:
using NFFT
J, N = 8, 16
k = range(-0.4, stop=0.4, length=J) # nodes at which the NFFT is evaluated
f = randn(ComplexF64, J) # data to be transformed
p = plan_nfft(k, N, reltol=1e-9) # create plan
fHat = adjoint(p) * f # calculate adjoint NFFT
y = p * fHat # calculate forward NFFT
In the same way the 2D NFFT can be applied:
J, N = 16, 32
k = rand(2, J) .- 0.5
f = randn(ComplexF64, J)
p = plan_nfft(k, (N,N), reltol=1e-9)
fHat = adjoint(p) * f
y = p * fHat
Currently, the eltype of the arguments f
and fHat
must be compatible with that of the sampling nodes k
used in the plan_nfft
call. For example, if one wants to use Float32
types to save memory, one can create the plan like this:
k = Float32.(LinRange(-0.5,0.5,64))
p = plan_nfft(k, N)
The plan will then internally use Float32
types. The signals f
and fHat
then need to have the eltype Complex{Float32}
or equivalently ComplexF32
. Otherwise there will be error messages.
In the previous example, the output vector was allocated within the *
method. To avoid this allocation one can use the interface
mul!(fHat, p, f)
mul!(y, adjoint(p), fHat)
which allows to pass the output vector as the first argument.
One can also perform NFFT computations directly without first creating a plan:
fHat = nfft(k, f)
y = nfft_adjoint(k, N, fHat)
These forms are more forgiving about the types of the input arguments. The versions based on a plan are more optimized for repeated use with the same sampling nodes k
. Note that nfft_adjoint
requires the extra argument N
since this cannot be derived from the input vector as can be done for nfft
.
The constructor plan_nfft
is meant to be a generic factory function that can be implemented in different packages. If you want to use a concrete constructor of NFFT.jl
use NFFTPlan
instead.
Parameters
The NFFT has the several parameters that can be passed as a keyword argument to the constructor or the nfft
and the nfft_adjoint
function.
Parameter | Description | Example Values |
---|---|---|
reltol | Relative tolerance that the NFFT achieves. | reltol $=10^{-9}$ |
m | Kernel size parameter. The convolution matrix has 2m+1 non-zero entries around each sampling node in each dimension. | m $\in \{2,\dots,8\}$ |
σ | Oversampling factor. The inner FFT is of size σN | σ $\in [1.25, 2.0]$ |
window | Convolution window $\hat{\varphi}$ | :kaiser_bessel |
precompute | Flag indicating the precomputation strategy for the convolution matrix | TENSOR |
blocking | Flag block partitioning should be used to speed up computation | true |
sortNodes | Flag if the nodes should be sorted in a lexicographic way | false |
storeDeconvolutionIdx | Flag if the deconvolve indices should be stored. Currently this option is necessary on the GPU | true |
fftflags | flags passed to the inner AbstractFFT as flags . This can for instance be FFTW.MEASURE in order to optimize the inner FFT | FFTW.ESTIMATE |
In practice the default values are properly chosen and there is in most cases no need to change them. The only parameter you sometimes need to care about are the accuracy parameters reltol
, m
,\sigma
and the fftflags
.
Accuracy
On a high-level it is possible to set the accuracy using the parameter reltol
. It will automatically set the low-level parameters m
and \sigma
. You only need to change the later if you run into memory issues. It is important that you change only reltol
or the pair m
,\sigma
.
The relation between reltol
, m
, and σ
depends on the window function and the NFFT implementation. We use the formula
\[w = 2m + 1 = \left\lceil \text{log}_{10} \frac{1}{\text{reltol}} \right\rceil + 1\]
which was verified for σ
and the default window function. If you change the window function, you should use the parameter m
, and σ
instead of reltol
.
Window Functions
It is possible to change the internal window function $\hat{\varphi}$. Available are
:gauss
:spline
:kaiser_bessel_rev
:kaiser_bessel
:cosh_type
and one can easily add more by extending the windowFunctions.jl file in Github.
However, the possibility of changing the window function is only important for NFFT researcher and not for NFFT users. Right now :kaiser_bessel
provides the best accuracy and thus there is no reason to change the parameter window
to something different.
Precomputation
There are different precomputation strategies available:
Value | Description |
---|---|
NFFT.POLYNOMIAL | This option approximates the window function by a polynomial with high degree and evaluates the polynomial during the actual convolution. |
NFFT.LINEAR | This option uses a look-up table to first sample the window function and later use linear interpolation during the actual convolution. |
NFFT.FULL | This option precomputes the entire convolution matrix and stores it as a SparseMatrixCSC . This option requires more memory and the longest precomputation time. This allows simple GPU implementations like realized in CuNFFT. |
NFFT.TENSOR | This option calculates the window on demand but exploits the tensor product structure for multi-dimensional plans. |
Again you don't need to change this parameter since the default NFFT.POLYNOMIAL
is a good choice in most situations. You may want to use NFFT.TENSOR
if you are applying the same transform multiple times since it is a little bit faster than NFFT.POLYNOMIAL
but has a higher pre-computation time.
Block Partitioning
Internally NFFT can use block partitioning to speedup computation. It helps in two ways
- It helps improving the memory efficiency by grouping sampling points together which allows for better use of CPU caches.
- Block partitioning is a mandatory to enable multi-threading in the adjoint NFFT, which would otherwise not be possible because of a data dependency.
We enable block partitioning by default since it helps also in the single-threaded case and thus, there usually is no reason to switch it off.
Multi-Threading
Most parts of NFFT are multi-threaded when running on the CPU. To this end, start Julia with the option
julia -t T
where T
it the number of desired threads. NFFT.jl will use all threads that are specified.
Directional
There are special methods for computing 1D NFFT's for each 1D slice along a particular dimension of a higher dimensional array.
J = 11
y = rand(J) .- 0.5
N = (16,20)
p1 = plan_nfft(y, N, dims=1)
f = randn(ComplexF64, N)
fHat = p1 * f
Here size(f) = (16,20)
and size(fHat) = (11,20)
since we compute an NFFT along the first dimension. To compute the NFFT along the second dimension
p2 = plan_nfft(y, N, dims=2)
fHat = p2 * f
Now size(fHat) = (16,11)
.