David is quite right that roles are problematic in OO C.
I think Glib / Vala is a working system that is quite good. I also believe that if you can accept "interfaces" rather than full, formal "roles", Glib can do that. How important is it to have "proper" roles?
From: David Mertens
Sent: Wednesday, June 11, 2014 12:54 AM
To: Chris Marshall
Cc: Dmitry Karasik ; pdl-porters
Subject: Re: [Pdl-porters] PDL OO thoughts and Prima OO
Hey everyone,
If we just wanted a C object system that supported inheritance, I would say we should take Prima's object system as a starting point and build new classes based on Prima's Object class. I spent a weekend on this a while ago, but it'll take quite a bit more work: Prima's purpose as a GUI toolkit is pretty deeply embedded.
However, if you want roles as well, that's a different beast, and the bulk of my thoughts lie there. I think we can do it, but it'll be tricky. My concerns and ideas, in brief:
In C, object method invocation is efficient because the compiler knows
exactly where to find your function pointers. As a result, in C an object can
be cast into another class if it has the same binary layout. The problem with
roles in C is that there is no promise of binary layout. If two different
classes implement the same role, you don't know where in the vtable to find
the same method. This can be solved in a handful of ways. The most efficient
way I can think of is to make role methods a different kind of method than a
class method.
Thoughts?
David
---
In C, object method invocation is efficient because the compiler knows
exactly where to find your function pointers. Typically, the first element
of a struct defining a class is a vtable pointer, and the rest of the items
are the data elements.
struct my_string_vtable;
struct my_string {
    struct my_string_vtable * methods;
    char * string;
    STRLEN allocated_length;
};
struct my_string_vtable {
    struct my_string * (*copy)(struct my_string *);
    struct my_string * (*delete)(struct my_string *);
    STRLEN (*get_length)(struct my_string *);
    char * (*get_string)(struct my_string *);
};
You then invoke the methods by calling the function pointers directly from
the vtable. To get this string's length, I would say:
STRLEN the_len = the_string->methods->get_length(the_string);
(Presumably get_length would iterate through the characters byte-by-byte until
it found the null byte. We might consider storing the length in the object
itself, but I use it this way for illustrative purposes.) The C compiler
knows that the first element of the_string is a pointer to the vtable. The C
compiler also knows where the get_length method resides in this vtable, so it
can locate the function to call with two pointer dereferences.
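To make the two-dereference dispatch above concrete, here is a minimal, compilable sketch. It is an illustration under assumptions, not Prima's or PDL's actual layout: size_t stands in for Perl's STRLEN, the vtable is trimmed to two methods, and demo_length is a hypothetical helper invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct my_string;

/* One table of function pointers, shared by every my_string object. */
struct my_string_vtable {
    size_t (*get_length)(struct my_string *self);
    const char *(*get_string)(struct my_string *self);
};

struct my_string {
    struct my_string_vtable *methods; /* first member: the vtable pointer */
    const char *string;
};

static size_t my_string_get_length(struct my_string *self) {
    return strlen(self->string); /* walks the bytes up to the null byte */
}

static const char *my_string_get_string(struct my_string *self) {
    return self->string;
}

static struct my_string_vtable my_string_methods = {
    my_string_get_length,
    my_string_get_string
};

/* Hypothetical helper: wrap a C string and ask for its length via the vtable. */
size_t demo_length(const char *text) {
    struct my_string the_string;
    assert(text != NULL);
    the_string.methods = &my_string_methods;
    the_string.string = text;
    /* Two dereferences: object -> vtable, then vtable -> function pointer. */
    return the_string.methods->get_length(&the_string);
}
```

Both offsets (the vtable pointer inside the object, and get_length inside the vtable) are fixed at compile time, which is why no lookup is needed at run time.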
As a result, in C an object can be cast into another class if it has the same
binary layout. Suppose, for example, that the length method is supposed to
return the number of printable characters. If I had a different (i.e. Unicode)
implementation of a string, then it would need its own length method. We
might want to add a few additional methods, but as long as the first four
elements in our my_unicode_string_vtable are a copy, delete, get_length, and
get_string function, we can cast our unicode object pointer to be a string
object pointer. Any functions that operate on the string through provided
methods will Just Work. This is how C inheritance works. (Introspection
requires an additional wrinkle, but it's not important here.)
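A small sketch of that cast, under some assumptions: the base vtable is trimmed to a single get_length method, the derived types embed the base types as their first member (one way to guarantee the matching layout), and the "Unicode" length simply counts UTF-8 code points as a stand-in for printable characters. The names length_of, demo_plain_length and demo_unicode_length are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct my_string;

struct my_string_vtable {
    size_t (*get_length)(struct my_string *self);
};

struct my_string {
    struct my_string_vtable *methods;
    const char *string;
};

/* Derived vtable: the base table comes first, extra methods follow it. */
struct my_unicode_string_vtable {
    struct my_string_vtable base;
    size_t (*get_byte_length)(struct my_string *self);
};

/* Derived object: the base object comes first, so the cast below is valid. */
struct my_unicode_string {
    struct my_string base;
};

static size_t plain_get_length(struct my_string *self) {
    return strlen(self->string);
}

/* Count UTF-8 code points by skipping continuation bytes (10xxxxxx). */
static size_t unicode_get_length(struct my_string *self) {
    const unsigned char *p = (const unsigned char *)self->string;
    size_t count = 0;
    for (; *p != 0; p++)
        if ((*p & 0xC0) != 0x80) count++;
    return count;
}

static struct my_string_vtable plain_methods = { plain_get_length };
static struct my_unicode_string_vtable unicode_methods = {
    { unicode_get_length },
    plain_get_length
};

/* Written against the base class only; works for anything layout-compatible. */
size_t length_of(struct my_string *s) {
    return s->methods->get_length(s);
}

size_t demo_plain_length(const char *text) {
    struct my_string s;
    assert(text != NULL);
    s.methods = &plain_methods;
    s.string = text;
    return length_of(&s);
}

size_t demo_unicode_length(const char *utf8) {
    struct my_unicode_string u;
    assert(utf8 != NULL);
    u.base.methods = &unicode_methods.base;
    u.base.string = utf8;
    return length_of((struct my_string *)&u); /* derived used as base */
}
```

length_of never learns which class it was handed; it only relies on get_length sitting at the same position in both vtables.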
The problem with roles in C is that there is no promise of binary layout. If
two different classes implement the same role, you don't know where in the
vtable to find the same method. Suppose I have a role, does_logger. Obviously,
if I compose the role into the class, then the class knows where those
methods live and can call them out of the vtable without trouble. But what
if I write a function that can take *anything* that does_logger? (So long as
it sticks just to the logger role's functions, it should just work.) If I
have a widget class and I create a derived widget class that does_logger,
where in the vtable are the logger methods? The derived widget must be binary
compatible with the base widget, so the logger methods must go at the end of
the vtable. The binary location of these methods will certainly be different
from the binary location of the logger methods composed into a derivative of
my_string because that class only has four methods. We are left with a
conundrum: the function cannot invoke logger commands as if they were normal
object methods because the role does_logger makes no promises about binary
layout. The compiler does not know where to look for them.
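The mismatch can be shown directly with offsetof. This is only a sketch: the widget methods (draw, resize, show, hide) and the generic method_fn type are made up for illustration, and delete is spelled destroy here purely as a naming choice.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder signature; real methods would each have their own type. */
typedef void (*method_fn)(void *self);

struct my_string_vtable {
    method_fn copy;
    method_fn destroy;
    method_fn get_length;
    method_fn get_string;
};

/* Hypothetical widget class with six methods of its own. */
struct widget_vtable {
    method_fn copy;
    method_fn destroy;
    method_fn draw;
    method_fn resize;
    method_fn show;
    method_fn hide;
};

/* Both derived classes compose does_logger, so its method is appended
 * after whatever the base class already occupies. */
struct logging_string_vtable {
    struct my_string_vtable base;
    method_fn log_something;
};

struct logging_widget_vtable {
    struct widget_vtable base;
    method_fn log_something;
};

size_t logging_string_log_offset(void) {
    return offsetof(struct logging_string_vtable, log_something);
}

size_t logging_widget_log_offset(void) {
    return offsetof(struct logging_widget_vtable, log_something);
}

/* Returns nonzero only if a single compiled-in offset would serve both. */
int logger_offsets_match(void) {
    return logging_string_log_offset() == logging_widget_log_offset();
}
```

A function that accepts "anything that does_logger" would need one offset for log_something, and these two classes cannot agree on it.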
This can be solved in a handful of ways. One way is to fetch role methods via
a string hash or other index lookup. This method lookup could return a void
pointer that is the function pointer, and it would have to be cast to the
appropriate function type. Another mechanism, which I think would be more
efficient at the cost of an extra argument, would involve looking up a role
vtable (once) and including that as a separate argument to role methods.
For comparison, here is a normal object method invocation:
object->vtable->method(object, arg1, arg2, ...)
Here is an index lookup, which relies upon a previously defined typedef and
global index "does_logger_log_something_idx":
typedef void (*does_logger_log_something)(void * object, ...);
does_logger_log_something log_func
= (does_logger_log_something)
object->vtable->lookup_role_method(does_logger_log_something_idx);
log_func(object, arg1, arg2, ...);
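Here is a compilable sketch of the index-lookup variant. It departs from the description in one labeled way: the lookup returns a generic function pointer type (generic_fn) rather than a void pointer, because ISO C does not guarantee that a function pointer survives a round trip through void *. The counter class, the int return type and log_via_index are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed global index identifying one role method. */
enum { does_logger_log_something_idx = 0 };

/* Generic function pointer used only to carry the result of the lookup. */
typedef void (*generic_fn)(void);
typedef int (*does_logger_log_something)(void *object, int amount);

struct class_vtable {
    generic_fn (*lookup_role_method)(int method_idx);
};

struct object_header {
    struct class_vtable *vtable;
};

/* Hypothetical class that does_logger: "logging" just adds to a total. */
struct counter {
    struct object_header header;
    int total;
};

static int counter_log_something(void *object, int amount) {
    struct counter *c = object;
    c->total += amount;
    return c->total;
}

static generic_fn counter_lookup_role_method(int method_idx) {
    if (method_idx == does_logger_log_something_idx)
        return (generic_fn)counter_log_something;
    return NULL; /* this class does not provide that method */
}

static struct class_vtable counter_class = { counter_lookup_role_method };

/* Works on any object whose class can answer the lookup. */
int log_via_index(void *object, int amount) {
    struct object_header *obj = object;
    does_logger_log_something log_func = (does_logger_log_something)
        obj->vtable->lookup_role_method(does_logger_log_something_idx);
    if (log_func == NULL)
        return -1;
    return log_func(object, amount);
}

int demo_index_lookup(void) {
    struct counter c = { { &counter_class }, 0 };
    log_via_index(&c, 2);
    return log_via_index(&c, 3);
}
```

Note that every call through log_via_index repeats the lookup and the cast, which is the cost this approach pays.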
And here is the role vtable lookup, which relies upon a global index for the
role, not the methods:
struct does_logger_vtable * logger_vtable;
logger_vtable = object->vtable->get_role_vtable(object, does_logger_role_idx);
logger_vtable->log_something(logger_vtable, object, arg1, arg2, ...);
The advantage of this third approach is that you perform the vtable lookup
once. Any role-based function calls within log_something have the vtable on
hand and can perform role method lookups with a simple struct member lookup.
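A sketch of the role-vtable variant, assuming every object starts with a pointer to a class vtable that exposes get_role_vtable. The widget and plain_thing classes, the message argument and log_twice are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed global index identifying the role itself, not its methods. */
enum { does_logger_role_idx = 1 };

struct does_logger_vtable {
    int (*log_something)(struct does_logger_vtable *role, void *object,
                         const char *message);
};

struct class_vtable {
    void *(*get_role_vtable)(void *object, int role_idx);
};

struct object_header {
    struct class_vtable *vtable;
};

struct widget {              /* hypothetical class that does_logger */
    struct object_header header;
    int log_count;
};

struct plain_thing {         /* hypothetical class without the role */
    struct object_header header;
};

static int widget_log_something(struct does_logger_vtable *role, void *object,
                                const char *message) {
    struct widget *w = object;
    (void)role;
    (void)message;           /* a real logger would write this somewhere */
    return ++w->log_count;
}

static struct does_logger_vtable widget_logger = { widget_log_something };

static void *widget_get_role_vtable(void *object, int role_idx) {
    (void)object;
    return role_idx == does_logger_role_idx ? &widget_logger : NULL;
}

static void *plain_get_role_vtable(void *object, int role_idx) {
    (void)object;
    (void)role_idx;
    return NULL;
}

static struct class_vtable widget_class = { widget_get_role_vtable };
static struct class_vtable plain_class = { plain_get_role_vtable };

/* Accepts anything that does_logger: one lookup, then plain member calls. */
int log_twice(void *object) {
    struct object_header *obj = object;
    struct does_logger_vtable *logger =
        obj->vtable->get_role_vtable(object, does_logger_role_idx);
    if (logger == NULL)
        return -1;           /* object's class does not do the role */
    logger->log_something(logger, object, "first");
    return logger->log_something(logger, object, "second");
}

int demo_widget_logging(void) {
    struct widget w = { { &widget_class }, 0 };
    return log_twice(&w);
}

int demo_missing_role(void) {
    struct plain_thing p = { { &plain_class } };
    return log_twice(&p);
}
```

After the single get_role_vtable call, both log_something calls are ordinary struct member lookups.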
One more thing about role vtables: they don't need to be separate vtables.
They can simply be offsets into the class's vtable. You could implement
roles so that if the passed role vtable is null, the first thing the role
method does is look up the role vtable. In this way, if you know the class of
your object, you could simply invoke the logger method as
object->vtable->log_something(0, object, arg1, arg2, ...)
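One possible reading of this, sketched under assumptions: the role's slots are embedded as a nested struct inside the class vtable, so the "role vtable" is just a pointer into it. Because of the nesting, the call is spelled vtable->logger.log_something rather than vtable->log_something. The get_count method and both demo functions are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { does_logger_role_idx = 1 };

struct does_logger_vtable {
    int (*log_something)(struct does_logger_vtable *role, void *object);
    int (*get_count)(struct does_logger_vtable *role, void *object);
};

struct widget_vtable {
    void *(*get_role_vtable)(void *object, int role_idx);
    struct does_logger_vtable logger; /* role slots inside the class vtable */
};

struct widget {
    struct widget_vtable *vtable;
    int log_count;
};

static void *widget_get_role_vtable(void *object, int role_idx);
static int widget_log_something(struct does_logger_vtable *role, void *object);
static int widget_get_count(struct does_logger_vtable *role, void *object);

static struct widget_vtable widget_methods = {
    widget_get_role_vtable,
    { widget_log_something, widget_get_count }
};

/* The role vtable is simply an offset into the class vtable. */
static void *widget_get_role_vtable(void *object, int role_idx) {
    (void)object;
    if (role_idx == does_logger_role_idx)
        return &widget_methods.logger;
    return NULL;
}

static int widget_get_count(struct does_logger_vtable *role, void *object) {
    struct widget *w = object;
    (void)role;
    return w->log_count;
}

static int widget_log_something(struct does_logger_vtable *role, void *object) {
    struct widget *w = object;
    if (role == NULL)        /* caller knew the class and passed 0 */
        role = w->vtable->get_role_vtable(object, does_logger_role_idx);
    assert(role != NULL);
    w->log_count++;
    /* Nested role calls reuse the vtable that is now on hand. */
    return role->get_count(role, object);
}

/* Caller knows the class: no lookup at the call site, null role vtable. */
int demo_known_class_call(void) {
    struct widget w = { &widget_methods, 0 };
    return w.vtable->logger.log_something(NULL, &w);
}

/* Caller only knows the role: look it up once, then pass it along. */
int demo_looked_up_call(void) {
    struct widget w = { &widget_methods, 0 };
    struct does_logger_vtable *role =
        w.vtable->get_role_vtable(&w, does_logger_role_idx);
    role->log_something(role, &w);
    return role->log_something(role, &w);
}
```

The same function body serves both callers; the null check is the only extra cost on the known-class path.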
On Tue, Jun 10, 2014 at 2:52 PM, David Mertens <***@gmail.com> wrote:
Hey everyone,
I know I originally started rumblings along these lines (at least on the PDL end) a couple of years back, so obviously I've thought about it. In an effort to not spam the list with a lengthy, disorganized response, I'm going to try to collect my thoughts and be a bit more systematic about it.
Didn't want my silence to indicate indifference. :-)
David
On Wed, Jun 4, 2014 at 10:59 AM, Chris Marshall <***@gmail.com> wrote:
Hi Dmitry-
I wanted to bring your attention to a number of items tied
into PDL3 development/plans (at least mine) that could be
of interest to you and might be an opportunity for some
synergy with Prima.
* David has been working on TinyCC based JIT compiling
with his C::Blocks development. This would allow for
runtime compilation of PDL+C code.
* My current thinking for the PDL3 kernel is that it would
be lightweight, support modern perl OO similar to Moose
and perl5-mop ideas, *and* be symmetrically usable from
C and from Perl (this is where the JIT really helps!).
* I think it is important that the PDL3 kernel be C-based
for portability and compatibility across platforms, libraries,
compilers... With C++ you lose that.
* Going with C I've come across a number of techniques
for implementation:
http://www.cs.rit.edu/~ats/books/ooc.pdf (uses a custom
code generator from a specification language)
https://wiki.gnome.org/Projects/Vala (the Vala OOC-to-C
compiler based on the Gnome Object system)
Prima (I finally was able to work through the internals
API and saw that it is very similar to the above and it
supports symmetric access to methods from either
C or perl layers)