Thursday, 18 February 2016

Explicit template instantiation - Exhibit 1

As I promised last time, let's put our design to work on some code.

Back in 2013, I wrote a couple of template functions as a quick solution to a very specific problem - outputting non-ASCII characters on a Windows console in a program compiled with Mingw. Here's the header, and here's the source.

I've now applied the idea from the last post, and changed the code from full inclusion to explicit instantiation. There are other changes to be made, but those will wait.

If you go back in the header's history, you'll see that the previous version used full inclusion. As such, this was included in every translation unit (TU) that used this code:

#include "boost/locale.hpp"
#include "windows.h"
#include <cassert>
#include <locale>
#include <sstream>
#include <string>

Not necessarily outrageous. However, all of this was included because of a couple of functions. Not only that, but because the implementation was in the header, some helper functions had to be declared in the header, too, thus leaking implementation details.

Now, applying our new-fangled design, here's what we get.

Header (win_console_out.h)

Contains just the declaration for the functions.

template <typename CharT>
std::string ConvertOutput(CharT const* s);

template <typename CharT>
std::string ConvertOutput(std::basic_string<CharT> const& s);

template <typename CharT = DefaultCharT, typename T = void>
std::string ConvertOutput(T const& t);

Because the only #include we need is <string>, everything else has been removed from the header.

Source (win_console_out.cpp)

Just as before, contains the non-template helper functions. These are not part of the interface, and were removed from the header.

Implementation (win_console_out.ipp)

Most of the previous content of the header ended up here - all the #includes, the declarations for the helper functions, and the template functions definitions.

Explicit instantiation

Finally, each user of this code will supply its own explicit instantiation. In this case, you can see here what we're defining:

template std::string ConvertOutput(char const* s);
template std::string ConvertOutput(

    std::basic_string<char> const& s);
template std::string ConvertOutput(wchar_t const* s);
template std::string ConvertOutput(

    std::basic_string<wchar_t> const& s);

And that's it. I'll use my extensive (hah!) body of published code as a test for this design. If it works, I plan to keep using it, in order to reveal its ugly warts.

I've also created the mechanism for reverting to full inclusion via #defined variables, either globally or on a header-by-header basis. As I usually say, I believe users should have the right to choose, and I strive to keep to that principle.

Sunday, 14 February 2016

Explicit template instantiation as an organization tool

When you create a class template, the easiest way to organize your source code is to splat everything in a header file and be done with it. Users of your class just #include your header, and Bob's an annoying (in an endearing way) relative you get to see once a year.

Of course, the easiest way has a couple of drawbacks.
  1. Your header's code will get compiled for each translation unit (TU) that #includes it, even for similar types, i.e., if it has already been compiled before. During linking, all those equal copies (instantiations for the same types) will be discarded.
  2. Since your header contains the implementation, it's chock-full of... implementation details. Among other things, this means it probably #includes other headers. See #1.
This means your compilation time for class templates will put on some weight. Wouldn't it be nice if we could put it on a diet?

I've been taking a shot a this issue for the last few days, with the following goals:
  • The header should contain only declarations.
  • The implementation should go on a separate file.
  • The class template user should have as little work, and as many options available, as possible.
So, we'll begin with the header file. This file contains the class template declaration, and this is the file that will be #included by all the code that uses our class template.

Note: I'm leaving out the include guards, but they're necessary, same as in any other header.

listener.h
template <typename T>
class Listener
{
public:
    Listener();
    ~Listener();
private:
    T i;
};


We place the implementation in another file.

listener.ipp
#include "listener.h"
#include <iostream>

#include <string>

void print(std::string s)
{
    std::cout << s << '\n';
}

template <typename T>
Listener<T>::Listener()
{
    print("Listener:ctor");
}

template <typename T>
Listener<T>::~Listener()
{
    print("Listener:dtor");
}


Note we don't have to #include <iostream> or <string> in the header.

However, if we tried to build our program like this, it would fail, because the TU containing the header would not see the implementation. Which means we'll get missing symbols at link-time.

So, here we get to the first task for the class template user - creating an artificial TU to generate the code he needs.

listener_session.cpp
#include "listener.ipp"
#include "session.h"

template class Listener<Session>;


This explicit instantiation of the class template will generate all the template's members, which will allow the linker to find them and resolve the undefined references to those symbols.

Simple, heh?

Well, keeping the template declaration and definition in separate files is all fine and dandy, but if it were that simple, everyone would be doing it. There is a trade-off, and, as is usually the case, 80% of the time it’s probably not necessary; in fact, I suspect the old 80/20 rule may even be more skewed in this case.

Which begs the question – why am I spending time with this? Because that’s how I learn.

So, what’s the trade-off here? In order for this to work you must create explicit instantiations for all the types you will be using. So, instead of just #include-ing the header and defining your Listener<Whatever> variables, your project must have one or more TUs that #include both the header and the implementation and then contain explicit instantiations for the types you’ll be using, just as shown with listener_session.cpp, above.

And this means the projects that stand to benefit more from this practice are the ones where it requires the most work, namely, large projects.

The class template author can’t predict all the types that will parametrize the template, so he can’t supply the explicit instantiations (although sometimes he does, more about that in a minute). So, this means it’s up to the class template user to create the TUs with the required explicit instantiations. I happen to think the trade-off is worth it, but I’ve never been involved in a huge C++ project, so I could be wrong.

Has it been a minute, already? OK, let’s go to the case where the class template author organizes his code like this and supplies the explicit instantiations – when he wants to limit the types available to instantiate the template. By doing this, and withholding the source code, the class author guarantees that users of his template can’t instantiate it with any type other than those supplied.

So, we get the first two goals, but so far the third goal doesn't look good. There's not much we can do, within the language, to reduce the work required from our user. But we can give him more options.

Suppose our user doesn't actually care about all this. Maybe he's just developing a hello_template_world, or he loves coffee, and doesn't mind a 10 min. coffee-break every 30 mins.

In that case, we have several options.

Just include the .ipp file.
That's it, nothing more needs to be done.

Create a header the includes the .ipp file.
This allows the author some liberty with the naming of the .ipp file, if necessary. Other than that, it's just the same as including the .ipp file.

Quite simple, heh?

Yes, indeed. And quite wrong, too. We won't have much problem with the class template members, but that void print(std::string s) will send us straight into duplicate-symbol-land (BTW, this is actually the only reason why it exists).

So, how can we go about it? There's a simple solution (really), and it doesn't require that much extra work.

listener.h
template <typename T>
class Listener
{
public:
    Listener();
    ~Listener();
private:
    T i;
};

#if defined(LISTENER_FULL_HEADER) || defined(ALL_FULL_HEADER)
#define LISTENER_INLINE inline
#include "listener.ipp"
#else
#define LISTENER_INLINE

#endif

We give our users two options, they can either #define LISTENER_FULL_HEADER, which will only apply full inclusion to this class template; or they can #define ALL_FULL_HEADER, which will apply full inclusion to every class template.

And our implementation becomes this:

listener.ipp
#include "listener.h"
#include <iostream>

#include <string>

LISTENER_INLINE
void print(std::string s)
{
    std::cout << s << '\n';
}

template <typename T>
LISTENER_INLINE
Listener<T>::Listener()
{
    print("Listener:ctor");
}

template <typename T>

LISTENER_INLINE
Listener<T>::~Listener()
{
    print("Listener:dtor");
}


#undef LISTENER_INLINE

The #undef at the end is just for preprocessor hygiene.

How do we know this actually works as advertised? Here’s what we get when we build the program with each option. The code was built with VS 2015, and we used dumpbin /symbols on the resulting .obj files.

The results are edited, to fit on a single line.

Full inclusion gives us this:

ear.obj
1B0 SECT65 notype External Listener?3?5dtor?6?$AA@
1B4 SECT66 notype External Listener?3?5ctor?6?$AA@


main.obj
1C9 SECT6B notype External Listener?3?5dtor?6?$AA@
1CD SECT6C notype External Listener?3?5ctor?6?$AA@


The symbols are defined on the two object files, one of these will be dropped when linking.

Separate implementation gives us this:

ear.obj
020 UNDEF notype () External Listener<Session>::Listener<Session>(void))
021 UNDEF notype () External Listener<Session>::~Listener<Session>(void))


main.obj
075 UNDEF notype () External Listener<Session>::Listener<Session>(void))
076 UNDEF notype () External Listener<Session>::~Listener<Session>(void))


listener_instant.obj
199 SECT60 notype External Listener?3?5dtor?6?$AA@
19D SECT61 notype External Listener?3?5ctor?6?$AA@


We get the undefined symbols on ear.obj and main.obj, which will be resolved during linking.

So far, so good. Now, time to move this out of Proof-of-Concept-Land.

Monday, 8 February 2016

This behavior is by design

Visual customization (or personalization, or whatever else you want to call the ability to configure software to suit your specific needs) is a big thing for me. Especially for software that I use for long periods of time. And, for this kind of software, there's nothing more important than changing the color scheme. Sure, changing the font can be nice, changing the font size is definitely important, but changing the color scheme is, as far as I'm concerned, essential.

For me, there's nothing worse than staring at a glaring white background for hours. I've got nothing against all you 0xFFFFFF lovers out there, it's just not my thing.

These last few days, I've been comparing several visual customization alternatives, from the user's point of view. Why from the user's point of view? Because I consider that to be the most important point of view, we create software for users.

Also, when selecting the criteria for classification, there was one aspect that out-weighted the others by several orders of magnitude: Time. It's not just about changing the look of software, but doing it quickly.

As a final note, when I talk about visual customization, I mean the ability to change just about every visual aspect of the interface. There is a name for software that has a "customization" option that only allows you to change the color of the title bar or the toolbar, but I won't say it here, there may be children reading this (I hear it's great for insomnia).

So, here's the final result, from best to worst:

1. The software provides official alternatives to the default look

"Official" here means "created or curated by the software developer", and easily accessible/installable via some sort of manager that takes care of download/installation. To qualify, these alternatives must be diverse (not just variations on glaring white) and must actually work (e.g., no blue text on a black background - yes, Linux shell, I'm looking at you).

Oh, and "curated" means someone actually looked at the whole thing and confirmed that we're not getting the blue-on-black stroke of genius. Being free of bugs/exploits/nasty surprises is a must, but it's not enough.

For software that makes this grade, I don't really care that much about how difficult it may be to change individual aspects of the provided looks/themes/whatever, because we have a coherent whole that works.

Out of the software I use, the winners are Visual Studio and Qt Creator. Congratulations to these teams, top quality work. Android Studio follows right behind, and the only reason it's not up there with those two is because does have some hard-to-read combinations, where the text and background colors are very similar.

2. The software is easy to configure

Since we're not guaranteed to have a coherent alternative here, it must be easy to either change the whole look (e.g., via theming) or individual aspects of it.

So, here we may have software that has a "wealth of community-provided looks/themes/whatever", where trying out a theme is trivial, but changing each individual aspect is not - e.g., Chrome/Firefox extensions.

And we also have software that may or may not have all that wealth, but has a trivial way of either changing the whole look or individual aspects of it - e.g., Notepad++ or gedit.

Changing individual aspects may not give the user a complete coherent workspace, but it will provide a good starting point and, since it's easy to configure, it will allow the user to quickly solve problems as they appear. Again, a very time-efficient solution. Not as good as #1, but a positive experience, all things considered.

3. The software is not easy to configure

Here, it's irrelevant if we're talking about the whole look or individual aspects of it.

This is the software that expects the user to manually download some sort of file (usually, an archive file), copy it to some directory and expand it; or to copy some existing files from a system-protected directory to a user directory and then edit those, looking for some more-or-less cryptic configuration key and changing some usually-even-more-or-less cryptic value; and then, maybe, having to restart the software. Bonus (negative) points if it's a system-wide configuration, where "restart" actually means "reboot".

Most Linux desktops I've tried fall in this category. In order to change a simple visual aspect, if you can't find a recipe for changing the exact item you want, you're in for a good reading of Gtk, or Qt, or some-other-widget-toolkit docs (assuming what you want is properly documented), followed by some file copying, key-searching, and value tweaking. And, since what you'll get will usually be a good starting point, you'll have the enormous pleasure of repeating the process as further problems appear.

Oh, and if you do find the recipe you're looking for, check its date, and make sure the toolkit/software version match yours.

Here, I usually do just enough work to splash a dark background on the software I use the most and ignore everything else. I definitely don't want to waste more time than absolutely necessary.

4. The abomination

This is a special category. You may have already got that impression from the heading, but it's also special in that it has only one entry: Windows Aero.

Let's make this clear - I often look at design options and think "This is incredibly stupid, but maybe there is some logic that I'm missing here". As the man sang, "I'm not the sharpest tool in the shed", so there's definitely room for error on my part. However, when we get to Windows Aero, I can't get past the comma in that sentence. I've considered it several times, and I can see no logic here.

Let's look at the symptoms:
  • Countless posts asking what should be a very simple question, "How do I change the background color for Windows Explorer?", and getting one of two answers: Either "Switch from Aero to Basic/Classic/Anything-Else-That's-Not-Aero" or "Why would you want to do that, there's plenty of people that use white and love it". Both these answers actually mean "You can't".

  • Of course, that's not entirely true. You can. You just have to reverse-engineer the binary format that stores the color configuration (basically it's a DLL containing resources). Or pay for a utility created by someone who went through that effort; then, you'll still have to create your "visual style", and you'll still have to patch Windows to allow you to use your non-signed "visual style".

  • Yes, you read that correctly. Binary format? Check! DLL? Check! Patch Windows? Check! Options such as the background color for applications are stored (although "buried", "concealed", or even "encrypted" would be more suited here) in an undocumented binary format, in a DLL, stored in a protected system folder.

This is the only example I've found where your best option is to actually pay for a third-party application to do something as simple as changing a background color. BTW, this is also why you find successful commercial alternatives to Windows Explorer. It's not just the extra functionality these alternatives have (and they do add value to their offerings), it's also this brain-dead design by the Aero team, that means that something as simple as "the ability to change the background color" is a value point.

Here, I don't even waste my time. I'm just grateful that whoever designed Windows Aero didn't get to unleash its remarkable genius anywhere near Visual Studio.