Thursday, May 29, 2008

What is really a POD class?

For a project at hand, I'm using the offsetof() macro with a C++ class, and in an attempt to avoid a warning, I stumbled over the following, to my eyes, strange behavior.

Let's start with the following code:

#include <cstddef>  // offsetof

class MyClass {
public:
  MyClass() { /* ... */ }

  int get_offset() const { return offsetof(MyClass, y); }  // Gives warning
private:
  int x,y,z;
};
When compiling, I get the warning
offsetof.cc: In member function ‘int MyClass::get_offset() const’:
offsetof.cc:14: warning: invalid access to non-static data member ‘MyClass::y’ of NULL object
offsetof.cc:14: warning: (perhaps the ‘offsetof’ macro was used incorrectly)
The source of the problem is that the offsetof() macro can only be used with POD structures. Clause 9, paragraph 4 of ISO/IEC 14882:2003 defines them as follows:
A POD-struct is an aggregate class that has no non-static data members of type non-POD-struct, non-POD-union (or array of such types) or reference, and has no user-defined copy assignment operator and no user-defined destructor.
Cool, so let's move all the members to a base class and inherit from it instead. That will make the base class a POD, so we can get the offset of the member from there and just add the other stuff in the subclass.
struct MyBase {
  int x,y,z;
};

class MyClass : public MyBase {
public:
  MyClass() { /* ... */ }

  int get_offset() const { return offsetof(MyBase, y); }
};
... and as a result, I got no warnings! Very nice.

Aw, shoot, I cannot have x, y, and z public. I'd better make them protected, so that MyClass can work with them but they are not available to anybody else (encapsulating the state like a good programmer):

struct MyBase {
protected:
  int x,y,z;
};

class MyClass : public MyBase {
public:
  MyClass() { /* ... */ }

  int get_offset() const { return offsetof(MyBase, y); }
};
... and then let's just compile it and off we go:
$ g++ -ggdb -Wall -ansi -pedantic    offsetof.cc   -o offsetof
offsetof.cc: In member function ‘int MyClass::get_offset() const’:
offsetof.cc:8: error: ‘int MyBase::y’ is protected
offsetof.cc:16: error: within this context
offsetof.cc:16: warning: invalid access to non-static data member ‘MyBase::y’ of NULL object
offsetof.cc:16: warning: (perhaps the ‘offsetof’ macro was used incorrectly)
Hey! What is going on now? Just making the member variables protected does not make the struct non-POD... or does it? Well, it turns out that it actually does. Let us read that paragraph again, this time paying attention to the words "aggregate class" (emphasis mine):
A POD-struct is an aggregate class that has no non-static data members of type non-POD-struct, non-POD-union (or array of such types) or reference, and has no user-defined copy assignment operator and no user-defined destructor.
So, what is an aggregate class then? Moving on to 8.5.1/1, we have:
An aggregate is an array or a class with no user-declared constructors, no private or protected non-static data members, no base classes, and no virtual functions.
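To make the definition a bit more concrete, here is a minimal sketch. It assumes a C++17 compiler, since std::is_aggregate did not exist in the 2003 standard quoted here, and later standards have adjusted parts of the aggregate definition (C++17, for instance, allows public base classes). The type names AllPublic, ProtectedMembers, and WithConstructor, and the helper second_member(), are made up for illustration.

```cpp
#include <type_traits>  // std::is_aggregate (C++17)

// Hypothetical types for illustration; none of these names are from the post.
struct AllPublic {             // no constructors, only public data members
  int x, y, z;
};

struct ProtectedMembers {      // protected non-static data members
protected:
  int x, y, z;
};

struct WithConstructor {       // user-declared constructor
  WithConstructor() : x(0), y(0), z(0) {}
  int x, y, z;
};

static_assert(std::is_aggregate<AllPublic>::value,
              "a struct with only public members is an aggregate");
static_assert(!std::is_aggregate<ProtectedMembers>::value,
              "protected data members make it a non-aggregate");
static_assert(!std::is_aggregate<WithConstructor>::value,
              "a user-declared constructor makes it a non-aggregate");

// An aggregate can be brace-initialized member by member.
inline int second_member() {
  AllPublic p = {1, 2, 3};
  return p.y;
}
```

If the static_asserts compile, the three cases behave as the quoted wording suggests: only the struct with public members and no constructor counts as an aggregate.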
So, making the members protected actually made the base class non-POD. Shoot... now what? Well, that restriction only applies to the MyBase class, not to the way we inherit from that class, so by just using protected inheritance, I will make the member variables protected within MyClass while MyBase is a POD, like this:
struct MyBase {
  int x,y,z;
};

class MyClass : protected MyBase {
public:
  MyClass() { x = 1; y = 2; z = 3; }

  int get_offset() const {
    return offsetof(MyBase, y);
  }

};
This also means that I finally found a good reason for protected inheritance, which is one of the language gadgets I had never really seen any use for until now.
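As a quick sanity check of the final version, here is a sketch that uses the offset to read the member back. The helper read_at() is hypothetical and not part of the code in this post; it only exists to show that the offset is relative to the MyBase subobject.

```cpp
#include <cstddef>  // offsetof
#include <cstring>  // std::memcpy

struct MyBase {
  int x, y, z;
};

class MyClass : protected MyBase {
public:
  MyClass() { x = 1; y = 2; z = 3; }

  // Offset of y within the MyBase subobject.
  int get_offset() const { return offsetof(MyBase, y); }

  // Hypothetical helper: copies out the int stored at the given byte
  // offset from the start of the MyBase subobject.
  int read_at(int offset) const {
    const MyBase* base = this;  // conversion is accessible inside MyClass
    const char* bytes = reinterpret_cast<const char*>(base);
    int value;
    std::memcpy(&value, bytes + offset, sizeof value);
    return value;
  }
};
```

With this in place, MyClass().read_at(MyClass().get_offset()) should give back the value stored in y. Note that the starting point is the MyBase subobject, not the MyClass object itself.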

2 comments:

Unknown said...

This may be stretching the standard's wording a bit much? While the compiler is silenced, I don't think get_offset() really does what's intended anymore. Consider:

[1]
class Another {
public:
  int a,b,c;
};

class MyClass : protected Another, protected MyBase {
public:
  MyClass() { x = 1; y = 2; z = 3; }

  int get_offset() const {
    return offsetof(MyBase, y);
  }
};

[2] If MyClass had a vptr.

In either case get_offset() would not point to the intended MyClass::y member.
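Case [1] can be checked with a small sketch. It assumes an implementation that lays out base classes in declaration order, which common ABIs do but the standard does not promise. The member functions offset_in_base() and offset_in_object() are hypothetical names introduced only for this comparison.

```cpp
#include <cstddef>  // offsetof, std::ptrdiff_t

struct Another { int a, b, c; };
struct MyBase  { int x, y, z; };

class MyClass : protected Another, protected MyBase {
public:
  MyClass() { x = 1; y = 2; z = 3; }

  // What get_offset() computes: the offset of y inside MyBase.
  std::ptrdiff_t offset_in_base() const { return offsetof(MyBase, y); }

  // Hypothetical helper: the distance in bytes from the start of the
  // complete MyClass object to its y member.
  std::ptrdiff_t offset_in_object() const {
    const char* start = reinterpret_cast<const char*>(this);
    const char* member = reinterpret_cast<const char*>(&y);
    return member - start;
  }
};
```

On such an implementation the two values differ, because the Another subobject sits in front of MyBase inside MyClass.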

Mats Kindahl said...

You are entirely correct that it does not do the same thing for the cases you mention. Also, the intention of a class is to encapsulate data, but by casting the subclass into the base class, the encapsulation can be circumvented, which is not really a good thing.

The main problem, however, is that we get a warning for a case that ought to be safe: that of using offsetof() to get the offset of a member variable in a context where the member variable is accessible. It would be better not to produce a warning for the first case, and hence not force programmers to work around the warning with a construct that is not really safe. Note that the compiler should still produce a warning (or even an error) for the cases you give.

Also, the main point of the post is to clarify what an aggregate class is and what the restrictions on the offsetof() macro are.