Benchmark, accuracy and test API

Here are the objects and functions used to test pyvkfft:

accuracy testing vs scipy
benchmarking of transforms for the OpenCL and CUDA pyvkfft interfaces, also comparing with cuFFT (scikit-cuda) and clFFT (gpyfft)
test module

Accuracy module

pyvkfft.accuracy.exhaustive_test(backend, vn, ndim, dtype, inplace, norm, use_lut, r2c=False, dct=False, dst=False, nproc=None, verbose=True, return_res=False)

Run tests on a large range of sizes using multiprocessing. This function is meant to be called manually.

Parameters:

backend -- either 'pyopencl', 'pycuda' or 'cupy'
vn -- the list/iterable of sizes n.
ndim -- the number of dimensions. The array shape will be [n]*ndim
dtype -- either np.complex64 or np.complex128, or np.float32/np.float64 for r2c/dct/dst
inplace -- True or False
norm -- either 0, 1 or "ortho"
use_lut -- if True, 1, False or 0, will set useLUT=1 or 0 for VkFFT. If None, the default VkFFT behaviour is used. The LUT is always enabled by default for double precision, so there is no need to force it.
r2c -- if True, test an r2c transform. If inplace, the last dimension (x, fastest axis) must be even
dct -- either 1, 2, 3 or 4 to test different dct. Only norm=1 can be tested (native scipy normalisation).
dst -- either 1, 2, 3 or 4 to test different dst. Only norm=1 can be tested (native scipy normalisation).
nproc -- the maximum number of parallel processes to use. If None, the number of detected cores will be used (this may use too much memory!)
verbose -- if True, prints 1 line per test
return_res -- if True, return the list of result dictionaries.

Returns:

True if all tests passed, False otherwise. If return_res is True, return the list of result dictionaries instead.
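As an illustration of the parameters above, here is a minimal sketch of a 1D single-precision run. The argument names come from the documented signature; the chosen values (sizes 2 to 64, two processes) and the wrapper name run_exhaustive are assumptions made for this example only. Running it needs pyvkfft and a working OpenCL device.

```python
import numpy as np

# Keyword arguments matching the documented signature of
# pyvkfft.accuracy.exhaustive_test. The values are illustrative only.
params = dict(
    backend="pyopencl",
    vn=range(2, 65),       # sizes n to test; the array shape is [n]*ndim
    ndim=1,
    dtype=np.complex64,
    inplace=False,
    norm=1,
    use_lut=None,          # None: keep the default VkFFT behaviour
    nproc=2,               # limit the number of processes to save memory
    verbose=False,
    return_res=True,       # return the list of result dictionaries
)


def run_exhaustive(kwargs):
    """Hypothetical wrapper: import lazily so this file loads without a GPU."""
    from pyvkfft.accuracy import exhaustive_test
    return exhaustive_test(**kwargs)
```

On a machine with a GPU, run_exhaustive(params) would return the list of result dictionaries because return_res is True. Since the function uses multiprocessing, it is safest to call it from under an if __name__ == "__main__": guard.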

pyvkfft.accuracy.l2(a, b): L2 norm

pyvkfft.accuracy.li(a, b): Linf norm
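The one-line descriptions above do not spell out the formulas. As a minimal sketch, assuming both norms are normalised by the first (reference) array, which is what the "normalised norms" wording in the test_accuracy documentation suggests, they could be written with numpy as follows. The names l2_rel and li_rel are hypothetical and are not part of pyvkfft.

```python
import numpy as np


def l2_rel(a, b):
    """Relative L2 difference between a (reference) and b (assumed definition)."""
    return np.sqrt((np.abs(a - b) ** 2).sum() / (np.abs(a) ** 2).sum())


def li_rel(a, b):
    """Relative Linf difference between a (reference) and b (assumed definition)."""
    return np.abs(a - b).max() / np.abs(a).max()


ref = np.array([3.0, 4.0])
calc = np.array([3.0, 5.0])
err_l2 = l2_rel(ref, calc)   # sqrt(1 / 25)
err_li = li_rel(ref, calc)   # 1 / 4
```

The Linf norm reports the single worst element, so it is the stricter of the two measures for spotting isolated errors.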

pyvkfft.accuracy.test_accuracy(backend, shape, ndim, axes, dtype, inplace, norm, use_lut, r2c=False, dct=False, dst=False, gpu_name=None, opencl_platform=None, stream=None, queue=None, return_array=False, init_array=None, verbose=False, colour_output=False, ref_long_double=True, order='C')

Measure the FT accuracy by comparing to the result from scipy (if available), or numpy.

Parameters:

backend -- either 'pyopencl', 'pycuda' or 'cupy'
shape -- the shape of the array to test. If this is an inplace r2c transform, the fast-axis length must be even; two extra values will be appended along x, so the actual transform shape is the one supplied
ndim -- the number of FFT dimensions. Can be None if axes is given
axes -- the transform axes. Supersedes ndim
dtype -- either np.complex64 or np.complex128, or np.float32/np.float64 for r2c, dct and dst
inplace -- if True, make an inplace transform. Note that for inplace r2c transforms, the size for the last (x, fastest) axis must be even.
norm -- either 0, 1 or "ortho"
use_lut -- if True, 1, False or 0, will set useLUT=1 or 0 for VkFFT. If None, the default VkFFT behaviour is used.
r2c -- if True, test an r2c transform. If inplace, the last dimension (x, fastest axis) must be even
dct -- either 1, 2, 3 or 4 to test different dct. Only norm=1 can be tested (native scipy normalisation).
dst -- either 1, 2, 3 or 4 to test different dst. Only norm=1 can be tested (native scipy normalisation).
gpu_name -- the name of the gpu to use. If None, the first available for the backend will be used.
opencl_platform -- the name of the OpenCL platform to use. If None, the first available will be used.
stream -- the cuda stream to use, or None
queue -- the opencl queue to use (mandatory for the 'pyopencl' backend)
return_array -- if True, will return the generated random array so it can be re-used for different parameters
init_array -- the initial (numpy) random array to use (should be filled with uniform random numbers between +/-0.5 for both real and imaginary fields), to save time. The correct type will be applied. If None, a random array is generated.
verbose -- if True, print a 1-line info for both fft and ifft results
colour_output -- if True, use some colour to tag the quality of the accuracy
ref_long_double -- if True and scipy is available, long double precision will be used for the reference transform. Otherwise, this is ignored.
order -- either 'C' (default C-contiguous) or 'F' to test a different stride. Note that for the latter, a 3D transform on a 4D array will not be supported as the last transform axis would be on the 4th dimension (once ordered by stride).
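The in-place r2c rule described for the shape parameter can be illustrated with numpy alone. This is a sketch under the assumption that the two appended values are real-valued padding along x; the helper name inplace_r2c_buffer_shape is hypothetical and not part of pyvkfft.

```python
import numpy as np


def inplace_r2c_buffer_shape(shape):
    """Hypothetical helper: padded real-buffer shape for an in-place r2c transform."""
    *lead, nx = shape
    if nx % 2:
        raise ValueError("the fast-axis length must be even for an in-place r2c")
    return (*lead, nx + 2)


shape = (16, 32)                           # transform shape, as supplied
padded = inplace_r2c_buffer_shape(shape)   # real buffer with 2 extra values along x

# The half-Hermitian spectrum holds nx//2 + 1 complex values along x,
# i.e. nx + 2 real values, which is why two extra reals are needed.
spectrum = np.fft.rfftn(np.zeros(shape, dtype=np.float32))
complex_as_reals = spectrum.shape[-1] * 2
```

Here complex_as_reals equals padded[-1], so the padded real buffer is exactly large enough to hold the complex result in place.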

Returns:

a dictionary with (l2_fft, li_fft, l2_ifft, li_ifft, tol, dt_array, dt_app, dt_fft, dt_ifft, src_unchanged_fft, src_unchanged_ifft, tol_test, str). It holds the normalised L2 and Linf norms comparing pyvkfft's result with the reference (scipy or numpy), the reference tolerance, and the times spent preparing the initial random array, creating the VkFFT app, and performing the forward and backward transforms. The transform timings include the GPU and reference transforms plus the L2 and Linf computations, so don't use them for benchmarking. 'src_unchanged_fft' and 'src_unchanged_ifft' are True if, for an out-of-place transform, the source array is actually unmodified (which is not true for r2c ifft with ndim>=2). 'tol_test' is True if both li_fft and li_ifft are smaller than tol, and 'str' is the string summarising the results (printed if verbose is True). If return_array is True, the initial random array used is returned as 'd0'. All input parameters are also returned as key/values, except stream, queue, return_array, init_array and verbose.
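A short sketch of how the returned dictionary might be used. The keys li_fft, li_ifft, tol and tol_test come from the description above; the helper names check_result and run_accuracy_cupy, the 64x64 shape, and the choice of the 'cupy' backend (so that no OpenCL queue is needed) are assumptions for this example. Running the second function needs pyvkfft, cupy and a CUDA GPU.

```python
import numpy as np


def check_result(res):
    """Recompute the documented pass criterion from a result dictionary."""
    return res["li_fft"] < res["tol"] and res["li_ifft"] < res["tol"]


def run_accuracy_cupy(shape=(64, 64)):
    """Hypothetical wrapper: one 2D out-of-place accuracy test with cupy."""
    # Imported lazily so this file can be loaded without a GPU.
    from pyvkfft.accuracy import test_accuracy

    res = test_accuracy(
        backend="cupy", shape=shape, ndim=2, axes=None,
        dtype=np.complex64, inplace=False, norm=1, use_lut=None,
    )
    # Per the description above, this should agree with res["tol_test"].
    return res, check_result(res)
```

The timing entries (dt_fft, dt_ifft) are deliberately left out of this sketch, since they include the reference transform and are not meant for benchmarking.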
