Post by Dima KoganPost by Chris MarshallAfter working through a couple of iterations on this code,
I can see why you weren't a fan of handling the alignment
stuff directly. With this fix, we should be able to support
existing PDL versions until we eventually get to the PDL3
work. I'm still trying to wrap up the final fixes for the
64bit index support for PDL-2.008. We definitely need to add
better support for alignment issues in PDL3.
Hi.
OK. Wow, that was a pain. Thank you very much for the patch,
Chris. It was exactly what I was planning on doing, once I'd
get around to it. It turns out that more was needed, however.
Without more stuff, lots of tests still fail on x86. I resolved
most of the issues, and things look much better now, although
still not perfect. Some questionable "features" of the current
- I have a mostly-finished-but-not-yet-complete set of tests to
exercise various alignments. These aren't done, and appear
in a disabled state at the end of the .t file. They are done
enough to discover that for particularly poorly-aligned data
the FFTW3 planner crashes. On amd64, with double-precisiou
data this happens with alignment 2 or worse. Because of
this I'm wary of turning on these tests, since I don't
want the test suite to crash on some other platform.
I'm relatively comfortable that this isn't something you'll
ever see when actually using the module, since you get memory
from malloc() ultimately, and it's never going to give you
anything that badly aligned. So I'm tempted to just leave the
tests disabled; don't know.
Leave them disabled. Since most malloc operations will
align to the basic type (at least), the extra testing
is effort that would be best spent on better type and
alignment in the PDL3 work. If testing shows a problem,
it should be possible to implement a re-alignment warning
or error to cover those cases.
Post by Dima Kogan- The test suite was checking for plan creations, since only
the input size/type was dictating when new plans would be
made. With variable alignment, there are other factors, so
those checks are now wrong much of the time. I disabled them.
Concur.
Post by Dima Kogan- Current architecture supports running multiple FFTs in
parallel with PDL threading. I however create a single plan
for ALL the threads. It thus should be possible to have a
situation where some part of the thread is aligned and some
other part is not, thus making the computer plan incorrect
for some of the threads. I constructed a test case to tickle
this issue, but for some reason I do not see a failure. So
the current code tests for a known problem, but the known
problem does not appear to be a problem. This really needs
more investigation before a "release" can happen, whatever
that means.
If the entire data block of a pdl is aligned properly
(according to what the plan expects), then as long as
the size of the FFT dimension is a multiple of 2 for
complex float data, then all loop computations should
be aligned. For the case of a real float or complex
FFT the requirement would be a multiple of 4 or a
multiple of 2, respectively.
Documenting this would be sufficient and we could check
alignment for each FFT if needed but I think it would
be simpler to enforce the dimension size constraints
and say, for fully general operation, use double complex
FFTs.
Otherwise, I think we should push a CPAN developers
release of PDL::FFTW3 with Alien::FFTW3 listed as
an optional requirement for testing purposes. My
guess is that we may be good for a general release
assuming nothing weird pops up from the testing.
The Alien::FFTW3 stuff needs to be addressed but I
would like to get the 64bit index support fix in
first and I've spend more time than I had planned
on FFTW3. :-)
Cheers,
Chris
Post by Dima KoganI'm done looking at this for today, but the code is pushed up
and feel free to have a look.
dima