Sunday, 30 June 2013

A pointer is an iterator is... not exactly a pointer

I'll start by stating the obvious - while the last change I made in the previous post built, it would blow up in your face if you executed it, because this

(&*buffer.end()) + 0

is a not-so-good idea. Since end() gives you one-past-the-end, dereferencing it will crash immediately (if you're lucky), or will crash during the presentation your soon-to-be-ex-CEO is making to the press on the launch of your brand new service (if you're not). I should've mentioned it, but I forgot to, because my goal was not to execute the code, just to get it to build.

Now then, the next step was making sure the types involved were what I was expecting. To that end, I used this:

auto a = &*buffer.begin();
auto b = buffer.begin();
std::cout << typeid(a).name() << std::endl;
std::cout << typeid(b).name() << std::endl;

VC gave me this:

unsigned __int64 *
class std::_Vector_iterator<class std::_Vector_val<
    struct std::_Simple_types<unsigned __int64> > >

So, uint64* on one end, and vector<uint64>::iterator on the other.

GCC gave me this:


Yep, mangled. No biggie; after running it through c++filt -t, I got this:

unsigned long long*
__gnu_cxx::__normal_iterator<unsigned long long*, 
    std::vector<unsigned long long, std::allocator<unsigned long long> > >

And there we have it, ulong long* on one end, and vector<ulong long>::iterator on the other.

So, the types were as I was expecting, i.e., there was nothing fancy going on with the types themselves. Which means that my understanding of the dereferencing of iterators was not entirely correct; i.e., iterators may share some traits with pointers (namely, dereferencing), but they're not interchangeable, as far as argument deduction is concerned. Let's see.

#include <vector>
#include <iostream>

void print(int i) {
    std::cout << "int: " << i << std::endl;
}

void print(int* i) {
    std::cout << "int*: " << *i << std::endl;
}

void print(std::vector<int>::iterator i) {
    std::cout << "vector<int>::iterator: " << *i << std::endl;
}

int main() {
    std::vector<int> v;
    v.push_back(42); // GIVE THE ITERATOR SOMETHING TO POINT AT

    print(42); // CALLS void print(int i)

    auto i = v.begin();

    print(i); // CALLS void print(std::vector<int>::iterator i)

    print(*i); // CALLS void print(int i)
               // IF WE HAVE ONLY void print(int* i), THE ERROR IS
               // INVALID CONVERSION FROM 'int' TO 'int*'

    // THIS CALLS void print(int* i)
    print(&*i + 0);
}

In order to get a call to void print(int* i), we must first dereference the iterator (thus, getting an int), and then take its address (the + 0 is optional).

OK, so the solution for the original code is simple. We must change this

template <typename I>
std::pair<double, double> minmax_times(I first, I last, size_t iterations) {
  typedef typename std::iterator_traits<I>::value_type T;

  std::pair<I, I> m0, m1;
  // ...

to this

template <typename I>
std::pair<double, double> minmax_times(I first, I last, size_t iterations) {
  typedef typename std::iterator_traits<I>::value_type T;

  std::pair<T*, T*> m0, m1;
  // ...

This time, I've built and executed it. It works.

Friday, 28 June 2013

The joy of type deduction

I've been following Alexander Stepanov's A9 lectures. They have been a good way to work on some fundamental algorithms (and they're also a great reminder that fundamental != basic).

I'm on lecture 5, and I had just designed my own algorithm for minmax_element(...), and wanted to measure it against the lecture's algorithm. So, I fired up Qt Creator, created a new project, added the lecture files, built and... Boom!

c:\dev\qt\qt5.0.1\tools\mingw\bin\../lib/gcc/i686-w64-mingw32/4.7.2/include/c++/bits/stl_pair.h:180:4: error: no match for 'operator=' in '((std::pair<__gnu_cxx::__normal_iterator<long long unsigned int*, std::vector<long long unsigned int, std::allocator<long long unsigned int> > >, __gnu_cxx::__normal_iterator<long long unsigned int*, std::vector<long long unsigned int, std::allocator<long long unsigned int> > > >*)this)->std::pair<__gnu_cxx::__normal_iterator<long long unsigned int*, std::vector<long long unsigned int, std::allocator<long long unsigned int> > >, __gnu_cxx::__normal_iterator<long long unsigned int*, std::vector<long long unsigned int, std::allocator<long long unsigned int> > > >::first = std::forward<long long unsigned int*>((* & __p.std::pair<long long unsigned int*, long long unsigned int*>::first))'

As is often the case with C++, depending on what compiler/compiler version you're using, attempting to compile the same code may lead to strange and wondrous results. In order to get another point of view, I attempted to build the same code on VC11 (VS2012); and, lo and behold... I got the same result, although the error message was, to put it mildly, ever so slightly saner:

1>c:\program files (x86)\microsoft visual studio 11.0\vc\include\utility(190): error C2679: binary '=' : no operator found which takes a right-hand operand of type 'unsigned __int64 *' (or there is no acceptable conversion)

Ah, here we have two C++ compilers that are complaining about the exact same thing (albeit with differing levels of whinin... er, I mean, verbosity). So much for my "The world's an oyster filled with mismatched pearls" chaos theory.

Anyway, let's follow the invocations, shall we? It all starts here:

std::pair<double, double> result = 
    minmax_times(buffer.begin(), buffer.end(), n / i);

Which calls a function with this definition:

template <typename I> 
std::pair<double, double> minmax_times(I first, I last, size_t iterations)

So, I is a vector iterator. Then, we get this:

std::pair<I, I> m0, m1;
m0 = minmax_element_simple((&*seq.begin()) + i * n,
    (&*seq.begin()) + (i + 1) * n, std::less<T>());
m1 = course::minmax_element((&*seq.begin()) + i * n,
    (&*seq.begin()) + (i + 1) * n, std::less<T>());

BTW, the course:: qualifier on minmax_element(...) was something I had to add, because both compilers (again, conspiring to destroy my chaotic view of the world) complained about ambiguity between course::minmax_element(...) and std::minmax_element(...).

Here's the declaration of minmax_element_simple(...):

template <typename I, typename Compare>
std::pair<I, I> minmax_element_simple(I first, I last, Compare comp)

And here's the declaration of course::minmax_element(...):

template <typename I, typename Compare>
std::pair<I, I> minmax_element(I first, I last, Compare cmp)

This means, on the one hand, we have a simple iterator expression; on the other hand we have a more complex expression, where we're taking the address of a dereferenced iterator. I then changed it to this:

//std::pair<I, I> m0, m1;
auto m0 = minmax_element_simple((&*seq.begin()) + i * n,
    (&*seq.begin()) + (i + 1) * n, std::less<T>());
auto m1 = course::minmax_element((&*seq.begin()) + i * n,
    (&*seq.begin()) + (i + 1) * n, std::less<T>());
//if (m0 != m1) std::cout << "Failed: different mins or maxs\n";

And it built.

Next, I did what I often do in cases like this, where I'm not completely grasping what's going on - I followed a hunch. The reasoning behind it was as follows:
  • The template functions weren't being called with explicit template arguments, so the compiler was doing type deduction.
  • This deduction was leading to different types on each side of the assignment, and that was causing the error. This was confirmed when I changed the definition from std::pair<I,I> to auto, thus allowing the compiler to apply the same type deduction to both sides of the assignment.
  • I couldn't easily change the expression in the invocation to minmax_element(...) and minmax_element_simple(...), but I could change the expression in the invocation to minmax_times(...).
So, I removed the autos, put everything back the way it was, and changed this, instead:

std::pair<double, double> result =
    minmax_times((&*buffer.begin()) + 0, (&*buffer.end()) + 0, n / i);

And, again, it built.

Next step: Figuring out the types of these different expressions.

I wish there were some tool to output what the compiler's doing when it's "hiding" all the complexity from the developer. It could even be a different - and slower - version of the compiler, with this code included via conditional compilation (thus keeping the normal compiler as fast as possible, which is what we all want, naturally). As someone whose day-to-day work largely consists of acting when things go wrong, I know the value of having information that allows us to quickly review the succession of events that led to a failure, rather than having to recreate those events from incomplete information.

Of course, I can also accept the fact that I have to know these rules in order to work with C++. That's my next step, as I want to know what's going on here, and that's what I've been constantly doing - learning.

But these are not exclusive options. Actually, a tool like the one I mentioned above would improve such knowledge, by making it more easily available.

Thursday, 27 June 2013

Those pesky evil macros

I've always been suspicious of Absolutes. Well, not always; after all, that's what our formative years are for - to be filled with certainties that we'll eventually replace with doubt throughout the rest of our lives. Anyway...

When I started developing, one of the first Absolutes I included in the "suspicious" category was the "X is evil" statement. And these days, with all the good sources of information available (Stack Overflow or reddit's r/cpp, to name a few), I believe there's little reason for an Absolute like this. With a few exceptions, most "X is evil" statements need some qualifying, such as "X is evil, except in these cases, where it's perfectly acceptable", and it's this qualified (and quality) information that needs to be spread around, not some utter hogwash Absolute.

So, let's tackle some evil, shall we?

I've been doing some research on C++ macros. Well, C macros, actually, but let's not get worked up over it. Namely, something like this, which would allow me to write this:

#define FE_1(WHAT, X) WHAT (X)
#define FE_2(WHAT, X, ...) WHAT (X) FE_1(WHAT, __VA_ARGS__)
#define FE_3(WHAT, X, ...) WHAT (X) FE_2(WHAT, __VA_ARGS__)
#define FE_4(WHAT, X, ...) WHAT (X) FE_3(WHAT, __VA_ARGS__)
#define FE_5(WHAT, X, ...) WHAT (X) FE_4(WHAT, __VA_ARGS__)

#define GET_MACRO(_1,_2,_3,_4,_5,NAME,...) NAME
#define FOR_EACH(action,...) \
    GET_MACRO(__VA_ARGS__,FE_5,FE_4,FE_3,FE_2,FE_1)(action,__VA_ARGS__)

#define MYPC_LOG(level, ...) \
    BOOST_LOG_TRIVIAL(level) \
    FOR_EACH(<<, __VA_ARGS__)

MYPC_LOG(info, "info message: ", GetCode(), "some more info");
MYPC_LOG(info, "another info message: ", GetCode(), "some more info: ", 42);
MYPC_LOG(error, "error: ", GetCode(), "error info: ", 42, 12);

Why this solution? Because it's a simple way (once you understand it) to translate function call syntax (comma-separated list of values) into stream syntax. That means I can further isolate my code from the knowledge of how the logger works, and that's a Good Thing(TM).

Ah, but I can almost hear you say: "You don't need macros for that". And you're absolutely right, I don't. So, macros being - generally speaking and, you know, in a qualified way - evil, and there being better alternatives, why macros?

For two reasons, actually.
  1. As I've written in previous posts, I'm using __FILE__ and __LINE__, and the preprocessor relies on their position in the code to generate their values. So, I need these to be at the actual call site, and not in some function (even an inline one).
  2. One fundamental property of logging code is to generate as little overhead as possible. Look at this: MYPC_LOG(info, "info message: ", GetCode(), "some more info"). See the GetCode() in there? Ideally, we want to evaluate this only if it will be logged, i.e., if our logging level is INFO or lower (I'm ignoring other possible filters, here). I couldn't do this if MYPC_LOG() were a function.
Well, regarding point #2, yes, I could, using lambdas or functors. However, that would lead to a more complex solution and I'd still have point #1 to deal with.

Finally, with regards to spreading qualified information, here it goes. Note that logging/debugging with __FILE__ and __LINE__ shows up a few times as a perfectly good reason for using macros.

So, whenever someone berates you for using "evil constructs", do some research and see what your peers are saying. If it turns out there's a better alternative, learn and adapt; otherwise, sometimes you're stuck with some degree of evil. Just don't get used to it, or you could end up like this fellow, who only turned out all right in the end by sheer luck; or, as someone put it, because of someone else's pity.

Friday, 21 June 2013

Adapting to Boost Log

I've decided to see how Boost Log implements its non-evaluation magic. And, as far as I can see, it turns out to be clever, but it's not magic.

I've built Boost 1.54.0 beta1, and changed the trivial logging example to this:

#include <boost/log/core.hpp>
#include <boost/log/trivial.hpp>
#include <boost/log/expressions.hpp>

int GetCode() {
    static int val = 0;

    return ++val;
}

int main(int argc, char *argv[]) {
    boost::log::core::get()->set_filter(
        boost::log::trivial::severity >= boost::log::trivial::info);

    BOOST_LOG_TRIVIAL(trace) << "trace message: " << GetCode();
    BOOST_LOG_TRIVIAL(debug) << "debug message: " << GetCode();
    BOOST_LOG_TRIVIAL(info) << "info message: " << GetCode();
    BOOST_LOG_TRIVIAL(warning) << "warning message: " << GetCode();
    BOOST_LOG_TRIVIAL(error) << "error message:" << GetCode();
    BOOST_LOG_TRIVIAL(fatal) << "fatal message: " << GetCode();

    return 0;
}
And this was the output:

[2013-06-15 20:26:00.401384] [0x000010ec] [info]    info message: 1
[2013-06-15 20:26:00.411384] [0x000010ec] [warning] warning message: 2
[2013-06-15 20:26:00.411384] [0x000010ec] [error]   error message:3
[2013-06-15 20:26:00.421384] [0x000010ec] [fatal]   fatal message: 4

So, this means that the calls to GetCode() in these lines didn't run:

BOOST_LOG_TRIVIAL(trace) << "trace message: " << GetCode();
BOOST_LOG_TRIVIAL(debug) << "debug message: " << GetCode();

Looking at the preprocessor output, this is what we have (edited for readability):

    for (BLR _boost_log_record_23 =
            (BLT::logger::get()).open_record((BLK::severity = BLT::trace));
         !!_boost_log_record_23;)
        BLA::make_record_pump((BLT::logger::get()), _boost_log_record_23).stream()
            << "trace message: " << GetCode();

BLR = ::boost::log::record
BLT = ::boost::log::trivial
BLK = ::boost::log::keywords
BLA = ::boost::log::aux

It uses a for so it can initialize the record and test the initialization in the same expression. That's where the filter is applied (I didn't know the !! idiom, but from what I've read, it's a safe way of converting to bool), and if the record fails the filter, the body of the for is not executed.

Comparing Boost and Poco loggers, I like Boost's stream syntax better. I find it more convenient, since it frees the client app from having to worry about string formatting/concatenation. OTOH, I look at Poco's implementation, and not only can I understand most of it, but I can also reason about the bits I find more challenging. Not so much with Boost, so far.

I also prefer Poco's organization - e.g., all the convenience macros are in Poco/Logger.h. The idea I got from Boost (and I'm still checking this) is that there is no simplified way of getting a list of what's available (and judging from Qt Creator's auto-complete list, there is a lot available).

I won't give up my goal of being able to change logger implementations with little effort, but I may have to give up the goal of not burdening the client code with my code's requirements, if using Boost Log. I'm still trying to figure out if this will pose a problem, and, if so, how to solve it.

One thing is certain - I'll keep on using macros. I won't give up __FILE__ and __LINE__, and I certainly don't want to have to write both all over the place, so I need actual textual replacement, and that's something only the preprocessor can give me.

Next stop: Getting a better understanding of Boost Log.

Tuesday, 18 June 2013

Speechless... In the best possible sense

Sometimes, I find something that is so utterly brilliant, I can do nothing more than just link to it.

Yes, it's a short post. No, the link won't open in a new window. After all, it's not like this post has much to read, right?

Oh, OK. What am I doing? Getting my head around Boost Log (or, rather, watching it dutifully go over said head), and trying to figure out a way to create a wrapper around it that a) won't evaluate logging expressions unnecessarily (without having to implement that same behaviour in my wrapper); and b) will translate "function call" syntax to "stream" (operator<<) syntax.

Saturday, 15 June 2013

On user choice...

While considering how to format my log lines (mainly, whether to use JSON for the entire log line or only for the context), and how to implement a request/context ID (main problem: how to propagate the ID through the call stack while keeping interface pollution to a minimum for classes/functions that don't need to know what a request/context ID is; so far, I've got no solution), I noticed Boost 1.54 beta is out, and one of its new libraries is a logging library. I'm currently taking a look at it, to check how wrapper-friendly it is and whether using a wrapper will negate any of its advantages. E.g., if it uses any sort of lazy evaluation on the logged expressions, wrapping it may not be such a good idea. I'm still looking at it.

In the meantime...

As I've said before, I believe in user choice, i.e., that software should be customizable. Most software is, these days. Games? Not so much. And when I read stuff like this, I fear it may not get any better anytime soon.

Maybe I'm suffering from a huge amount of bias, but the main point I got out of this was: The game designer knows best. The [gamer/player/user/(dare I say it?) paying customer]'s view on the subject is actually quite irrelevant, and apparently the less he can tinker with the final product, the better, lest he disturbs the delicate and near-perfect balance that was created for him; for all of us, actually; for all the hundreds, thousands, tens of thousands, hundreds of thousands... well, you get the picture.

I'm at a loss on how to describe this belief that any one team (or person, for that matter) can routinely design an experience that is going to be completely satisfactory for this sizable a universe. Arrogant comes to mind, but it seems weak; I find it beyond arrogance to believe that what I have produced is so close to perfection that I should then attempt to intentionally limit users' ability to rebalance it to their taste.

There's a difference between games and other forms of entertainment: Interaction. The player is not a passive receptor, as in literature, music, or cinema, which are one-way activities.

Not so with a game. While I'm experiencing whatever the game designer created, my input is part of what drives the experience forward. In The Witcher 2, I can spend all my time in the forest around Flotsam killing monsters, and not advance the main quest. Which means I'll remain there indefinitely. That's not what the game designer envisioned, but it's what I'm considering fun at the moment. The game exists to satisfy me, the player. That's why I buy it.

It's a matter of control. I accept I can't control a song, a book, or a movie; but, with a game, I want to have as much control in tailoring my game experience as possible. Because I'm supposed to have an active role in that experience (i.e., playing the game), and the more I can customize it, the better I can match it to my idea of fun (which may be vastly different from the game designer's).

But there's another factor in this equation; let me illustrate with an example - I'm replaying The Witcher 2 on Hard. I expect there will be parts of the game where I can't progress on Hard, and I'll just change the difficulty to Normal (or even Easy) when I reach the limit of time I'm willing to spend on that part of the game. Note the key phrase: I'm spending my time to play the game you created. Time is a finite resource, and if I don't consider my time well spent, I'll stop playing. And I'll probably think twice before buying anything from you again.

Because Witcher 2 gives me a level of control I'm comfortable with, when Witcher 3 comes out, I'll buy it at full price on day 1, instead of waiting until it goes on sale (or ignoring it entirely). Allowing me to tailor my experience as I see fit means a higher chance of you getting my money.

We can take this one step further. Still in the gaming realm, I'll give my money to way more easily than to any other company (yes, I'll choose over Steam, it's a no-brainer for me). Because they're the ones that have given me greater freedom. I'm not going to buy a console, but if I were, I'd definitely skip the XB1, based on MS's announcements; if you want me to "license" (basically, rent) software, you need to significantly lower its price. If I'm paying full price, I'm buying it, and I expect to be able to treat it like everything else I buy - it's mine.

Moving away from the gaming world, let's consider smartphones and tablets. I can't see myself buying an iAnything. For that exact same reason I've been talking about - I'm not supporting a business model where I'm expected to cede control over something I bought.

You make the rules, so obviously you decide how much control I'll have. Me, I get to vote with my wallet.

A final note on the comparison with books - in high school, we had to read several mandatory books, classic Portuguese literature. An "industry" of sorts quickly developed for - much smaller - books with a "compact analysis" of the book we should be reading. I remember some of us didn't have the actual book, but we all had the analysis book. Just because the medium doesn't support customization, it doesn't mean your customers won't seek to customize the experience; sometimes, at your expense.

Monday, 3 June 2013

How to use logging levels

As the title says - what to log, and at which level? 

Note: I'll be giving examples far more complex than my personal project. Part of my day-to-day activity involves working with logs, so my knowledge and experience on this subject goes beyond what I'm coding. I've rarely taken the time to organize my thoughts in writing, so this is my first attempt at correcting that omission.

TRACE

I'm using TRACE for two situations:
  • Log records that I won't miss in production - e.g., a ctor that has an initializer list and an empty body.
  • Operations that could cause "log spamming" (no relation) - e.g., when I set up the asio io_service loop to call libssh2 functions, these are repeatedly called, many times per second, until their task is over. The last thing I want in my logs is this kind of spam, unless I really need it, while investigating an issue.
TRACE statements can be #defined out of the code, so the first point is important.

DEBUG

Here, I decided to apply some caution; I created two DEBUG levels: DEBUG and DEBUGPRD.

DEBUG

Again, this level applies to stuff I won't miss in a production log. A very obvious example: move and copy ctors.

DEBUGPRD

This level is for stuff that I won't regularly want in my production log, but that I may find occasionally useful while investigating an issue I can't replicate in QA. E.g., the point at which a RAII class releases its resource.

And here ends the selective code exclusion of my logging setup. From here on, as far as exclusion goes, it's all or nothing.

INFORMATION

Business as usual.
  • We invoked some other platform and got back a success.
  • We recorded a purchase on our DB, with no error.
  • Someone invoked us to validate a token and it was valid.
This is the level where we start logging mainly functionality, rather than technical details - e.g., we log that we're running an SSH command, and we're not interested in the details of the process - what classes are being created, moved, etc.

Some may ask "Why is this in a production log?". Well, suppose you get a ticket about a customer transaction on your platform that was erroneously approved; the customer's balance was lower than the transaction amount and, as a result, the customer went into the red. Now, you need to find out what went wrong and how to make sure it doesn't happen again. You try to replicate it in QA, with no success. You set up a test in production (the horror!) and, lo and behold, you can't replicate the issue.

Well, no biggie. Since you're logging at INFORMATION level, before you do all this, you check the log for the request/response to the system that validated the customer's balance, to see where the point of failure was - if you got an (incorrect) OK on the customer balance validation, the analysis must be performed on the validation platform; if you got a NO-GO and your application went ahead anyway, then the investigation is on your app's end.

Without this knowledge, you don't know on which system to focus the effort, and this is where logging at the INFORMATION level can be invaluable.

NOTICE

As a general rule of thumb, I'd say: "Just like INFORMATION, but for business issues". Taking the example above: we validated the customer's balance, and it was negative; or the customer tried to log in using the wrong password. These are not hard errors, but rather perfectly acceptable situations. Still, we take notice.

Why? Because, by doing so, we can then mine the logs for this information.

Let's assume, as is often the case, our system was built with "aggressive" goals; meaning, neither enough budget, nor time. Under this scenario, anything that's not "core functionality" will fall prey to the usual corner-cutting; and monitoring isn't usually considered "core functionality".

Which is a shame, because monitoring, when properly implemented (i.e., when the application/system is designed to either implement monitoring, or to facilitate its ad-hoc implementation), is the most valuable prevention tool I know.

And this is what a well-implemented NOTICE level gives us: a way to easily mine a log file for things that may require our attention (provided we have the resources to spare said attention, obviously) - we have a customer that has had 10 failed balance validations in the last hour; or we have a login with 12 failed attempts in the last hour (and, since we block the login for 15 minutes every 3 failures, we know someone has been "constantly" trying to login). We can fire an automated contact to our customer, saying "We have just detected such and such, and we'd like to know if you're having any difficulty and if we can be of assistance".

All this is easier to detect because we're logging events at their optimal level.

WARNING

Here, the balance begins shifting back a bit to the technical side of the pendulum. Adding to the business issues (e.g., a client's balance decreased over 20% in a single transaction), we can also log things like resource usage (are we getting uncomfortably close to any limit?) or performance data (we invoked platform X; the expected response time is 1 sec and it took 15 secs).

Some of this can be complemented with external monitoring processes - e.g., keeping watch of CPU/memory/disk usage. Meaning, the application itself doesn't have to include everything and the kitchen sink - after all, in a high throughput app, we may not want to burden the component responsible for running the transactions with the task of verifying its own performance. So, if this task was not budgeted for the app, it can be built later on by sweeping the logs.

ERROR

I'll start by saying something that should be obvious, but often isn't: log only errors at this level. Namely:

  • Business errors. As stated above, a failed balance validation because of a negative balance is not an error; a failed balance validation because the validation platform says the account doesn't exist, is. Thus, in this context, "business errors" means "conditions which are not covered by any business rule, be it success or failure"; a negative balance is a failure, but it's covered by a business rule, so it's not an error.

  • Technical errors. The balance validation platform is unavailable. The DB update failed because it couldn't acquire a lock. The DB insert failed because of a failed check constraint. This last one may not be a technical error - if the UI doesn't validate the constraint and passes any input to the DB, I'd call this a business error, because the DB's check constraint is actually validating the input (actually, I'd call it a design error, unless someone could justify the enormous burden of maintaining the check both in the DB and the UI).

CRITICAL

From here on, the pendulum has swung all the way back to technical issues.

That said, there are business issues that can indicate a critical situation - if every customer balance validation for the last 5 minutes is always returning negative balance, I'd like to have an alarm ring, thank you very much; similarly, if every login in the past 5 minutes has failed, let's have some of that alarm ring, and be quick about it.

However, I think this analysis should be performed by a process created specifically for this, and not by the application itself. The application has a "login" operation, and this operation shouldn't have to know that its failed login is one among dozens that are failing at that same moment.

As such, I can't think of many single business issues that could be classified as critical. An example could be a special high-visibility promotion we're having on our online store, and we want anything at all related to this promotion to bubble to the surface immediately. So, we could, via some configuration, say that any issue related to this promotion ID (or product ID, or whatever) should be considered critical. Still, while I find this idea valid, I don't think it belongs in the logging framework.

On the technical issue side, what is critical? The balance validation platform has been unavailable for 5 minutes. The DB has been unavailable for 5 minutes. Our user repository has been unavailable for 5 minutes. Or they're available, but we're getting times way above the expected performance for 10 minutes. Or our resource usage has gone way beyond "uncomfortably close to the limit".

Again, this requires either monitoring processes, or that the application be developed to "look at the big picture". Here, I also tend towards removing these checks from the application to monitoring processes; but, and this may be just bias, implementing these checks in the application itself doesn't strike me as odd.

FATAL

Ah... what do we put here?

Application startup gives me a few ideas - can't create the DB connection pool; or can create the pool but the DB is unavailable. During startup, anything that's fundamental for the task(s) the application performs should be available, otherwise, we might as well shut down.

When the application is already in flight, I can think of a few issues, mainly concerned with running out of resources - memory, file system, sockets, threads; and even these may not merit a shutdown - e.g., sockets may be freed, in which case the application may resume its normal work and the incident could be just a momentary blip on the radar.

tl;dr - The importance of choosing the correct logging level comes from how it can simplify (and, in some cases, actually make possible) the task of mining the logs for useful data; "useful" in the sense that it gives you an image of the application's status and allows you to take action based on said status.