Sunday 1 September 2013

Boost.Locale, ICU and MinGW

I've been writing a few small command-line programs and needed to output text like this on a Windows terminal: "Opção inválida".

That led me to the Interesting World of Internationalization. Which is a lot more Interesting on Windows than on Linux, I must say. This should work, right?

UINT oldcodepage = GetConsoleOutputCP();
SetConsoleOutputCP(CP_UTF8);
cout << "Olá\n" << endl;
SetConsoleOutputCP(oldcodepage);


In fact, not only did it not work reliably on all the Win7 machines I tried, but the endl wasn't output either (hence the "\n"). So I decided I needed something more powerful, like, say, ICU. Fortunately, the size of the ICU DLLs would not be a problem.
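To see why a console can show garbage when its code page and the program's bytes disagree, it helps to look at the raw bytes. The sketch below is only an illustration, not part of my original program: hex_bytes, ola_utf8, ola_cp850 and check_ola_bytes are names I made up for it. It spells out "Olá" as UTF-8 and as code page 850 (the code page used in the conversion later in this post) and dumps both as hex.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Render each byte of `s` as two lowercase hex digits, separated by spaces.
std::string hex_bytes(const std::string& s)
{
    std::string out;
    char buf[4];
    for (std::size_t i = 0; i < s.size(); ++i) {
        const unsigned int byte = static_cast<unsigned char>(s[i]);
        std::snprintf(buf, sizeof buf, "%02x", byte);
        if (i != 0) out += ' ';
        out += buf;
    }
    return out;
}

// "Olá" spelled out byte by byte (hypothetical names, for illustration only).
const std::string ola_utf8  = "Ol\xC3\xA1"; // á is the two bytes C3 A1 in UTF-8
const std::string ola_cp850 = "Ol\xA0";     // á is the single byte A0 in code page 850

// The same word has different bytes in each encoding.
void check_ola_bytes()
{
    assert(hex_bytes(ola_utf8)  == "4f 6c c3 a1");
    assert(hex_bytes(ola_cp850) == "4f 6c a0");
}
```

A console that expects code page 850 but receives the UTF-8 bytes will render C3 and A1 as two unrelated characters, which is the kind of mismatch a converter has to resolve.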

Building ICU was easy. I just ran this in an MSYS terminal:

runConfigureICU MinGW --prefix=<DBG_PATH> --enable-debug --disable-release
make
make install

make clean

runConfigureICU MinGW --prefix=<REL_PATH> --disable-debug --enable-release
make
make install


No, I didn't bother with the recommended C++ changes; I'll look at them next time.

Next, building Boost. The first thing was finding out what I needed to do to let Boost.Locale (and, as I found out later, also Boost.Regex) know that I had ICU available.

When invoking b2, I had to pass -sICU_PATH=<PATH_TO_ICU_HOME>. However, since I wanted to use different ICU builds for debug and release, I couldn't just build Boost as I usually do, with --build-type=complete. Instead, I went for something like this:

b2 -sICU_PATH=<ICU_DBG_PATH> --prefix=<BOOST_INSTALL_DIR> 
    --build-dir=<BOOST_BUILD_DIR> toolset=gcc link=static,shared 
    runtime-link=shared threading=single,multi variant=debug 
    install > boost_install.log 2>&1

I kept an eye on boost_install.log as Boost started running its configuration checks, including the one for ICU. That was a good thing, because I spotted this early on:

- has_icu builds           : no

Boost was complaining about icui18n.dll and icudata.dll. And, sure enough, looking at my ICU lib folder, neither was anywhere to be found.

So I went to the Jamfile.v2 of Boost.Locale (and of Boost.Regex), looking for these dependencies. The first thing I noticed was that they didn't apply to MSVC. The MSVC dependencies, however, named DLLs that I did have on my system, so I renamed icui18n to icuin and removed icudata. I also had to remove a couple of this_is_an_invalid_library_name entries, and finally I got this:

- has_icu builds           : yes

which made me quite happy. So, after Boost debug got built, I ran this little thingie to check that everything was OK:

#include "boost/locale.hpp"
using namespace boost::locale;

#include <algorithm>
using std::for_each;
#include <iostream>
using std::cout;
using std::endl;
#include <locale>
#include <string>
using std::string;

   
int main()
{
    // List the backends Boost.Locale was built with.
    localization_backend_manager lbm = 
        localization_backend_manager::global();
    auto s = lbm.get_all_backends();
    for_each(s.begin(), s.end(), [](const string& x){ cout << x << endl; });

    // Build the system default locale, then convert the literal from that
    // locale's encoding to cp850.
    generator gen;
    std::locale loc = gen("");
    cout << boost::locale::conv::between("Olá", "cp850", 
        std::use_facet<boost::locale::info>(loc).encoding()) << endl;
}


This printed:

icu
winapi
std
Olá


This shows not only that Boost.Locale was built with ICU support (first line of the output), but also that the conversions are working (fourth line).
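The "cp850" in the program above is hard-coded, which is fine for my machine but probably not for every console. Here is a minimal sketch of one way around that, assuming the converter accepts "cpNNN"-style names for other code pages as it does for cp850 (I have not verified this for every code page). encoding_for_code_page and check_encoding_names are hypothetical helpers of mine, not part of Boost.Locale or ICU.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: turn a Windows code page number into an encoding name.
// Assumption: names of the form "cp850", "cp437", "cp1252" are understood by
// the conversion backend; 65001 is the value of CP_UTF8.
std::string encoding_for_code_page(unsigned int code_page)
{
    if (code_page == 65001) {
        return "UTF-8";
    }
    std::ostringstream name;
    name << "cp" << code_page;
    return name.str();
}

// A few spot checks of the mapping.
void check_encoding_names()
{
    assert(encoding_for_code_page(850)   == "cp850");
    assert(encoding_for_code_page(65001) == "UTF-8");
}
```

On Windows, the number would come from GetConsoleOutputCP(), and the result would replace the "cp850" literal in the call to conv::between.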

Next: Testing this on a few machines.