[Pdl-porters] 64bit indices on 32bit arch?

Discussion:

Henning Glawe

2014-09-24 06:59:31 UTC

Moin,
while debugging slatec failures on the debian powerpc port, I discovered
that 64bit indexing support is enabled on that architecture, although it
is a 32bit arch (ppc64 is handled separately).

The reason is that perl is configured explicitly with 64bit (long long int)
ivtype on all of debian's architectures.
While the latter is not bad per se, as ivtype is, as far as I understand it,
responsible for perl scalars, it leads to some problems with the interface
to fortran77 code:
- slatec declares matrix size arguments to be 'integer'
- gfortran uses, at least on ppc, 32bit as default 'integer' size
- the PDL slatec interface hands pointers to 64bit ints (PDL_Indx) to the
  f77 code

On big-endian architectures, the Fortran code will see 0 when a (small)
64bit number is read as a 32bit int. Given that the matrix size arguments
are used as loop boundaries in slatec, and Fortran uses 1-based array
indices by default, this results in out-of-bounds memory access.
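
The mismatch described above can be reproduced without a PowerPC machine. The following is a minimal C sketch, assuming the callee reads only the first four bytes at the address it is handed. Both byte orders are simulated explicitly so the result is the same on any host. All function names are hypothetical and are not taken from PDL or slatec.

```c
#include <assert.h>
#include <stdint.h>

/* All names below are hypothetical; none of them come from PDL or slatec. */

/* Lay out v as a big-endian machine would: most significant byte first. */
void store_be64(unsigned char buf[8], uint64_t v) {
    for (int i = 0; i < 8; i++)
        buf[i] = (unsigned char)(v >> (56 - 8 * i));
}

/* Lay out v as a little-endian machine would: least significant byte first. */
void store_le64(unsigned char buf[8], uint64_t v) {
    for (int i = 0; i < 8; i++)
        buf[i] = (unsigned char)(v >> (8 * i));
}

/* Read only the first four bytes at buf as a big-endian 32bit integer. */
uint32_t load_be32(const unsigned char *buf) {
    return ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8)  |  (uint32_t)buf[3];
}

/* Read only the first four bytes at buf as a little-endian 32bit integer. */
uint32_t load_le32(const unsigned char *buf) {
    return  (uint32_t)buf[0]        | ((uint32_t)buf[1] << 8) |
           ((uint32_t)buf[2] << 16) | ((uint32_t)buf[3] << 24);
}

/* Value a 32bit reader sees when handed the address of a 64bit n. */
uint32_t seen_on_big_endian(uint64_t n) {
    unsigned char buf[8];
    store_be64(buf, n);
    return load_be32(buf);
}

uint32_t seen_on_little_endian(uint64_t n) {
    unsigned char buf[8];
    store_le64(buf, n);
    return load_le32(buf);
}

/* A matrix size of 5 vanishes on big-endian but survives on little-endian. */
void check_size_mismatch(void) {
    assert(seen_on_big_endian(5) == 0);
    assert(seen_on_little_endian(5) == 5);
}
```

On a little-endian layout the low-order bytes come first, so small sizes happen to be read correctly. That would explain why the same type mismatch can go unnoticed there.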

Somehow I think, as PDL_Indx is used for addressing memory, it should _not_
be derived from ivtype/ivsize, but from the architectures native pointer
size.

Any opinions?

--
c u
henning

Chris Marshall

2014-09-24 15:26:44 UTC

Henning Glawe

2014-09-24 21:02:08 UTC

Post by Chris Marshall

Perl uses a special typedef IV which is a simple signed integer type
that is guaranteed to be large enough to hold a pointer (as well as an
integer). Additionally, there is the UV, which is simply an unsigned
IV.
This seemed to be the ideal way to portably determine the platform
capabilities regarding 64bit index support. It sounds like some fixes
are still needed in the PDL slatec bindings (and presumably for other
fortran-based interfaces).

I have a hack (involving gfortran option '-fdefault-integer-8' and some type
changes in Lib/Slatec/slatec.pd) running/successfully-tested both on powerpc
(32bit) and amd64 (64bit).
Unfortunately, '-fdefault-integer-8' broke the minuit bindings on both
architectures; I will continue tomorrow.
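
For comparison, the size mismatch can also be handled on the C side of a binding instead of through a compiler flag. This is a different technique from the '-fdefault-integer-8' hack described above, shown only as a sketch. It assumes a 32bit default Fortran 'integer', as reported for gfortran on ppc earlier in the thread. The names `sketch_indx`, `f77_integer`, `narrow_for_f77`, `f77_style_sum` and `sum_checked` are hypothetical, and `f77_style_sum` is a plain C stand-in for a Fortran routine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int64_t sketch_indx;  /* hypothetical stand-in for a 64bit PDL_Indx */
typedef int32_t f77_integer;  /* assumed width of the Fortran default 'integer' */

/* Narrow n to the Fortran integer width.
   Returns 0 on success, -1 if n does not fit in 32 bits. */
int narrow_for_f77(sketch_indx n, f77_integer *out) {
    assert(out != NULL);
    if (n < INT32_MIN || n > INT32_MAX)
        return -1;
    *out = (f77_integer)n;
    return 0;
}

/* C stand-in for a Fortran routine: the size arrives by reference,
   as Fortran 77 arguments do, and is used as the loop bound. */
double f77_style_sum(const double *x, const f77_integer *n) {
    double s = 0.0;
    for (f77_integer i = 0; i < *n; i++)
        s += x[i];
    return s;
}

/* Wrapper: copy the 64bit size into a correctly sized temporary and
   pass the address of that temporary, never the address of the 64bit value. */
int sum_checked(const double *x, sketch_indx n, double *result) {
    f77_integer n32;
    if (narrow_for_f77(n, &n32) != 0)
        return -1;
    *result = f77_style_sum(x, &n32);
    return 0;
}
```

The range check matters because a size that does not fit in 32 bits cannot be passed to such a routine at all. Rejecting it is safer than silently truncating it.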

Agreed, having working support for big piddles is of higher priority; but,
given that 32bit architectures are defined by 32bit pointer sizes/memory
addressing ranges (which also applies to mmapped files), I think that on 32bit
architectures, a 64bit PDL_Indx is (1) pointless and (2) implies significant
computational cost, due to the complexity of 64bit arithmetic on 32bit
architectures.
The latter may be negligible when perl deals with perl scalars, but less so
in the context of PDL...

--
c u
henning

Chris Marshall

2014-09-24 21:30:02 UTC

Post by Henning Glawe

I have a hack (involving gfortran option '-fdefault-integer-8' and some type
changes in Lib/Slatec/slatec.pd) running/successfully-tested both on powerpc
(32bit) and amd64 (64bit).
Unfortunately, '-fdefault-integer-8' broke the minuit bindings on both
architectures; I will continue tomorrow.

Agreed, having working support for big piddles is of higher priority; but,
given that 32bit architectures are defined by 32bit pointer sizes/memory
addressing ranges (which also applies to mmapped files), I think that on 32bit
architectures, a 64bit PDL_Indx is (1) pointless and (2) implies significant
computational cost, due to the complexity of 64bit arithmetic on 32bit
architectures.

It should be possible to optimize the selection of the appropriate
type for PDL_Indx but initially I was trying for simple and correct.
Once the support is working fully, I do plan to have a config time
option and improvement to the type selection code will be welcome.
(Maybe adding a check against $Config{ptrsize}?)

Post by Henning Glawe
The latter may be negligible when perl deals with perl scalars, but less so
in the context of PDL...

True. After the 64bit index support works, we definitely should
check for performance impact. For the PDL3 work after the PDL-2.008
release, I would like to revisit performance from scratch while
refactoring the core computation engine.

--Chris

Henning Glawe

2014-09-24 22:15:02 UTC

Post by Chris Marshall
It should be possible to optimize the selection of the appropriate
type for PDL_Indx but initially I was trying for simple and correct.
Once the support is working fully, I do plan to have a config time
option and improvement to the type selection code will be welcome.
(Maybe adding a check against $Config{ptrsize}?)

Seems like the most reasonable choice; and then, the C 'int' type could be
selected by $Config{i${bits}type}, derived from $Config{ptrsize}.
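
The thread proposes making this choice at configure time in Perl, from $Config{ptrsize}. As a rough compile-time analogue in C, an index type can be picked from the pointer width. This is only a sketch under the assumption that `uintptr_t` has the same width as a data pointer, which holds on mainstream platforms but is not guaranteed by the C standard. The names `sketch_indx`, `SKETCH_INDX_BITS` and `indx_matches_pointer` are hypothetical and are not PDL's.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; this is not how PDL defines PDL_Indx. */
#if UINTPTR_MAX == 0xFFFFFFFFu
typedef int32_t sketch_indx;          /* 32bit pointers: 32bit index */
#define SKETCH_INDX_BITS 32
#elif UINTPTR_MAX == 0xFFFFFFFFFFFFFFFFu
typedef int64_t sketch_indx;          /* 64bit pointers: 64bit index */
#define SKETCH_INDX_BITS 64
#else
#error "pointer width not covered by this sketch"
#endif

/* Returns 1 if the chosen index type is exactly as wide as a data pointer. */
int indx_matches_pointer(void) {
    return sizeof(sketch_indx) == sizeof(void *);
}

/* Self-check for the assumption stated above. */
void check_indx_width(void) {
    assert(indx_matches_pointer());
    assert(SKETCH_INDX_BITS == 8 * (int)sizeof(sketch_indx));
}
```

With this selection, a 32bit architecture gets a 32bit index regardless of how wide perl's IV was configured to be.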

Anyhow, when considering Fortran bindings, maybe we should also provide a
fortran(90) module, providing the same information as the C header 'pdl.h',
such as typedefs (in fortran context: 'kind' parameters usable in variable,
especially subroutine argument, declarations).

Post by Chris Marshall

Post by Henning Glawe
The latter may be negligible when perl deals with perl scalars, but less so
in the context of PDL...

Great, not only performance-wise: we also should think about how we should
proceed with the 'ranking' of datatypes, e.g. the implicit data conversions
to double-precision floats that are happening now.

--
c u
henning

Chris Marshall

2014-09-25 13:50:28 UTC

Post by Henning Glawe

Seems like the most reasonable choice; and then, the C 'int' type could be
selected by $Config{i${bits}type}, derived from $Config{ptrsize}.

Anyhow, when considering Fortran bindings, maybe we should also provide a
fortran(90) module, providing the same information as the C header 'pdl.h',
such as typedefs (in fortran context: 'kind' parameters usable in variable,
especially subroutine argument, declarations).

Sounds like a good idea. I would like to see PDL with more
uniform, consistent support for fortran.

Post by Henning Glawe

Post by Chris Marshall

Post by Henning Glawe
The latter may be negligible when perl deals with perl scalars, but less so
in the context of PDL...

Great, not only performance-wise: we also should think about how we should
proceed with the 'ranking' of datatypes, e.g. the implicit data conversions
to double-precision floats that are happening now.

This is the debugging I was referring to (in the longlong-double-fix
branch of the sf.net git repo). I've been stalled for a while due to
a lack of time and no convenient 64bit system for debugging.
I'm hoping to set up a 64bit linux VM to work with but now we're
back to the problem of not enough time... :-)

N.B. The approach I've taken is pretty much a hack to enable
64bit index support in PDL-2.x by replacing the double type
by a union type of double and 64bit int. The changes were
made by inspection and now most of the tests pass, but the
remaining failures are of the pointer/memory/datatype kind, which
are much more difficult to track down without a good debug
environment (gdb, valgrind,...)
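
One plausible reason for preferring a union over a plain double here is that a double cannot represent every 64bit integer exactly. The C sketch below illustrates that point only and is not the definition used in the longlong-double-fix branch. It assumes IEEE 754 doubles with a 53bit significand, and the names `sketch_anyval`, `survives_double_roundtrip` and `roundtrip_through_union` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type: holds either a double or a 64bit integer. */
typedef union {
    double  d;
    int64_t i;
} sketch_anyval;

/* Returns 1 if n is unchanged after being stored in a double.
   The argument is restricted to |n| < 2^62 so the cast back is always defined. */
int survives_double_roundtrip(int64_t n) {
    assert(n > -(INT64_C(1) << 62) && n < (INT64_C(1) << 62));
    volatile double d = (double)n;   /* volatile: force rounding to double */
    return (int64_t)d == n;
}

/* Storing through the integer member keeps all 64 bits. */
int64_t roundtrip_through_union(int64_t n) {
    sketch_anyval v;
    v.i = n;
    return v.i;
}

/* 2^53 + 1 is the smallest positive integer a double cannot hold exactly. */
void check_precision(void) {
    const int64_t big = (INT64_C(1) << 53) + 1;
    assert(!survives_double_roundtrip(big));
    assert(roundtrip_through_union(big) == big);
}
```

The cost of such a union is that every consumer has to know which member is currently valid, which fits the kind of datatype-related failures mentioned above.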

Cheers,
Chris