Sunday 14 February 2016

Explicit template instantiation as an organization tool

When you create a class template, the easiest way to organize your source code is to splat everything in a header file and be done with it. Users of your class just #include your header, and Bob's an annoying (in an endearing way) relative you get to see once a year.

Of course, the easiest way has a couple of drawbacks.
  1. Your header's code will get compiled for each translation unit (TU) that #includes it, even for similar types, i.e., if it has already been compiled before. During linking, all those equal copies (instantiations for the same types) will be discarded.
  2. Since your header contains the implementation, it's chock-full of... implementation details. Among other things, this means it probably #includes other headers. See #1.
This means your compilation time for class templates will put on some weight. Wouldn't it be nice if we could put it on a diet?

I've been taking a shot a this issue for the last few days, with the following goals:
  • The header should contain only declarations.
  • The implementation should go on a separate file.
  • The class template user should have as little work, and as many options available, as possible.
So, we'll begin with the header file. This file contains the class template declaration, and this is the file that will be #included by all the code that uses our class template.

Note: I'm leaving out the include guards, but they're necessary, same as in any other header.

listener.h
template <typename T>
class Listener
{
public:
    Listener();
    ~Listener();
private:
    T i;
};


We place the implementation in another file.

listener.ipp
#include "listener.h"
#include <iostream>

#include <string>

void print(std::string s)
{
    std::cout << s << '\n';
}

template <typename T>
Listener<T>::Listener()
{
    print("Listener:ctor");
}

template <typename T>
Listener<T>::~Listener()
{
    print("Listener:dtor");
}


Note we don't have to #include <iostream> or <string> in the header.

However, if we tried to build our program like this, it would fail, because the TU containing the header would not see the implementation. Which means we'll get missing symbols at link-time.

So, here we get to the first task for the class template user - creating an artificial TU to generate the code he needs.

listener_session.cpp
#include "listener.ipp"
#include "session.h"

template class Listener<Session>;


This explicit instantiation of the class template will generate all the template's members, which will allow the linker to find them and resolve the undefined references to those symbols.

Simple, heh?

Well, keeping the template declaration and definition in separate files is all fine and dandy, but if it were that simple, everyone would be doing it. There is a trade-off, and, as is usually the case, 80% of the time it’s probably not necessary; in fact, I suspect the old 80/20 rule may even be more skewed in this case.

Which begs the question – why am I spending time with this? Because that’s how I learn.

So, what’s the trade-off here? In order for this to work you must create explicit instantiations for all the types you will be using. So, instead of just #include-ing the header and defining your Listener<Whatever> variables, your project must have one or more TUs that #include both the header and the implementation and then contain explicit instantiations for the types you’ll be using, just as shown with listener_session.cpp, above.

And this means the projects that stand to benefit more from this practice are the ones where it requires the most work, namely, large projects.

The class template author can’t predict all the types that will parametrize the template, so he can’t supply the explicit instantiations (although sometimes he does, more about that in a minute). So, this means it’s up to the class template user to create the TUs with the required explicit instantiations. I happen to think the trade-off is worth it, but I’ve never been involved in a huge C++ project, so I could be wrong.

Has it been a minute, already? OK, let’s go to the case where the class template author organizes his code like this and supplies the explicit instantiations – when he wants to limit the types available to instantiate the template. By doing this, and withholding the source code, the class author guarantees that users of his template can’t instantiate it with any type other than those supplied.

So, we get the first two goals, but so far the third goal doesn't look good. There's not much we can do, within the language, to reduce the work required from our user. But we can give him more options.

Suppose our user doesn't actually care about all this. Maybe he's just developing a hello_template_world, or he loves coffee, and doesn't mind a 10 min. coffee-break every 30 mins.

In that case, we have several options.

Just include the .ipp file.
That's it, nothing more needs to be done.

Create a header the includes the .ipp file.
This allows the author some liberty with the naming of the .ipp file, if necessary. Other than that, it's just the same as including the .ipp file.

Quite simple, heh?

Yes, indeed. And quite wrong, too. We won't have much problem with the class template members, but that void print(std::string s) will send us straight into duplicate-symbol-land (BTW, this is actually the only reason why it exists).

So, how can we go about it? There's a simple solution (really), and it doesn't require that much extra work.

listener.h
template <typename T>
class Listener
{
public:
    Listener();
    ~Listener();
private:
    T i;
};

#if defined(LISTENER_FULL_HEADER) || defined(ALL_FULL_HEADER)
#define LISTENER_INLINE inline
#include "listener.ipp"
#else
#define LISTENER_INLINE

#endif

We give our users two options, they can either #define LISTENER_FULL_HEADER, which will only apply full inclusion to this class template; or they can #define ALL_FULL_HEADER, which will apply full inclusion to every class template.

And our implementation becomes this:

listener.ipp
#include "listener.h"
#include <iostream>

#include <string>

LISTENER_INLINE
void print(std::string s)
{
    std::cout << s << '\n';
}

template <typename T>
LISTENER_INLINE
Listener<T>::Listener()
{
    print("Listener:ctor");
}

template <typename T>

LISTENER_INLINE
Listener<T>::~Listener()
{
    print("Listener:dtor");
}


#undef LISTENER_INLINE

The #undef at the end is just for preprocessor hygiene.

How do we know this actually works as advertised? Here’s what we get when we build the program with each option. The code was built with VS 2015, and we used dumpbin /symbols on the resulting .obj files.

The results are edited, to fit on a single line.

Full inclusion gives us this:

ear.obj
1B0 SECT65 notype External Listener?3?5dtor?6?$AA@
1B4 SECT66 notype External Listener?3?5ctor?6?$AA@


main.obj
1C9 SECT6B notype External Listener?3?5dtor?6?$AA@
1CD SECT6C notype External Listener?3?5ctor?6?$AA@


The symbols are defined on the two object files, one of these will be dropped when linking.

Separate implementation gives us this:

ear.obj
020 UNDEF notype () External Listener<Session>::Listener<Session>(void))
021 UNDEF notype () External Listener<Session>::~Listener<Session>(void))


main.obj
075 UNDEF notype () External Listener<Session>::Listener<Session>(void))
076 UNDEF notype () External Listener<Session>::~Listener<Session>(void))


listener_instant.obj
199 SECT60 notype External Listener?3?5dtor?6?$AA@
19D SECT61 notype External Listener?3?5ctor?6?$AA@


We get the undefined symbols on ear.obj and main.obj, which will be resolved during linking.

So far, so good. Now, time to move this out of Proof-of-Concept-Land.

No comments:

Post a Comment