Quality, Microsoft, Quality

My day job involves writing software.  Today, actually, it involves porting software from Windows to Linux.  Until now, our software has been built with Microsoft Visual C++ but, strangely enough, VC++ doesn’t support targeting Linux, so we’re having to build it with GCC.

The application is command-line-only.  It involves no networking, only basic file IO, no console input and minimal console output.  For the most part, it is a number-cruncher.  How hard can this be?

Very tricky, it turns out.  Mostly because the Microsoft C++ compiler, even after all these years, is a buggy heap of shite.

To be fair, some would say that VC++ has always been a buggy heap of shite.  The impression that’s put around the industry is that this used to be the case, but now it’s pretty much on a par with other compilers.

Well, I’m here to tell you that it just ain’t so.  VC++ is still full of bugs.  Most of the ones I’ve come across so far have been bugs of the worst sort for maintainability and portability – the compiler allows things that the language specification says are illegal.  So your developers can sit and happily spew out non-conforming code, compile it and run it, never even realising they are writing rubbish that won’t compile anywhere else.

Here are the top five I love to hate:

1.  Preprocessor phases are done in the wrong order.

Seriously.  The standard specifies exactly what order things happen in.  How hard is it to just implement that order?  Specifically, the preprocessor is supposed to remove comments before it expands macros.  It doesn’t.  It means that an enterprising developer can write this piece of genius:

#ifdef INCLUDE_SOME_CODE
#define SOME_CODE
#else
#define SOME_CODE /##/
#endif

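That ## in the second definition of the SOME_CODE macro means, ‘Paste these two things together and put the result in the output.’  Pasting / onto / gives //, which VC++ then treats as the start of a one-line comment.  On a conforming compiler this can’t work: comments have already been stripped by the time macros are expanded, and // isn’t a valid token for ## to produce in the first place.  Someone has found a way to conditionally comment out code – so long as your compiler (specifically the preprocessor) is completely broken.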
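For what it’s worth, the same effect is available without relying on the bug.  Here is a minimal sketch, assuming C++11 variadic macros are available: SOME_CODE becomes a function-like macro that either passes its argument through or swallows it.  The count_steps function is a made-up example, not anything from our code base.

```cpp
// Define INCLUDE_SOME_CODE (for example on the compiler command line) to keep
// the optional statements; leave it undefined to drop them.
#ifdef INCLUDE_SOME_CODE
#define SOME_CODE(...) __VA_ARGS__
#else
#define SOME_CODE(...)
#endif

// Hypothetical function used only to show the macro in action.
int count_steps() {
    int steps = 0;
    ++steps;               // always compiled
    SOME_CODE(++steps;)    // compiled only when INCLUDE_SOME_CODE is defined
    return steps;
}
```

With INCLUDE_SOME_CODE undefined, count_steps() returns 1; with it defined, it returns 2.  Either way the preprocessor only does things the standard says it may do.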

2.  Templates are parsed at the wrong time.

Seriously.  The compiler is supposed to parse template definitions as it finds them, then instantiate them when they are used.  Microsoft decided that was too easy.  When the VC++ compiler finds a template definition, it sticks its thumb in the page there to remember where that template is defined.  Then, when it finds an instantiation of that template, it goes back and parses the template.  This causes grief, mainly because it means that the compiler already knows what the template arguments are when it parses the template.  This lets it try to be clever.  In the process of being clever, it deviates from the standard in all sorts of ways.  For instance:

template<typename T>
class A {
    My compiler is a smelly, buggy heap of shite.
};

template<>
class A<int>
{
    // Actually do something useful in here.
};

int main() {
    A<int> a;
    return 0;
}

This compiles just fine.  When the compiler is parsing this file, it doesn’t bother to parse the templated class definitions until it gets to the instantiation in the main function.  When it sees

A<int>

it spots that this is an instantiation of the template A.  Of course, it now knows what the template parameter is, so it knows that it only needs to parse the specialisation of the template for T=int, so that’s exactly what it does.  It never bothers to parse the general template class.  Hey, what’s the point, anyway?  It never gets used.

If you can’t see the problem here, your brain needs looking at (or you don’t write software for a living).  Suppose for a minute that the template definition is in a library header file somewhere.  Suppose it’s at the bottom of a long chain of template definitions.  Suppose that chain of template definitions is rarely used; in fact, it’s not used at all in your library code.  Some people who use your library might want to use it in some obscure corner cases, though.  Your library compiles fine, but when they use this obscure feature of your library, their code won’t compile any more, and they get several pages of incomprehensible template error messages.  Whose code is to blame?  It can’t be yours; your library compiles fine!  Your poor user will probably spend several days poring over his code before he even realises it could be your fault.
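One defence, sketched here on the assumption that you can name at least one sensible argument type for each template you ship, is to put an explicit instantiation in a test source file.  That asks the compiler to instantiate the class and its members there and then, so at least the obvious breakage shows up in your build rather than your user’s.  Accumulator is a hypothetical stand-in for a rarely used library template.

```cpp
// Hypothetical library template; the name Accumulator is made up for this sketch.
template <typename T>
class Accumulator {
public:
    Accumulator() : total_() {}
    void add(const T& value) { total_ += value; }
    T total() const { return total_; }
private:
    T total_;
};

// Explicit instantiation definition: asks the compiler to instantiate the
// class and all of its non-template members for T = int right here, so
// errors surface when this file is compiled rather than in a user's code.
template class Accumulator<int>;
```

It only covers the argument types you choose to instantiate, so it is a safety net rather than a proof, but it is cheap to add.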

3.  Dependent and non-dependent names are lumped together.

I guess this is a consequence of number 2, but I think it deserves a separate item.

When the compiler is parsing a template, it can come across two different types of name, dependent and non-dependent.  Dependent names are ones that depend on the template parameters; non-dependent names don’t.  When the compiler is looking up a non-dependent name, it isn’t supposed to search scopes that depend on the template parameters, such as a base class of the form Base<T>.  This is because, until you know what the template parameters are, you don’t know what those scopes contain.

To demonstrate this, consider this line of code:

A<B> C;

What is this?  Is it a declaration of a variable called C with type A<B>?  Or is it two comparison operations, meant to be equivalent to

( A<B ) > C

?  Either would be legal, and without knowing what A, B and C are, you can’t tell.  This is why writing a parser for C++ is so hard and why people designing new languages bang on so long and hard about the importance of context-free grammars.
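To make the two readings concrete, here is a small sketch in which the same tokens, A<B> C, compile both ways.  The namespaces and function names are invented for the illustration; expect a sensible compiler to warn about the chained comparison.

```cpp
namespace as_declaration {
    template <typename T> struct A {};
    struct B {};

    // Here A is a class template and B is a type, so "A<B> C;" declares
    // a variable named C of type A<B>.
    inline void demo() {
        A<B> C;
        (void)C;  // silence the unused-variable warning
    }
}

namespace as_expression {
    // Here A, B and C are all ints, so "A<B> C" is the expression
    // (A < B) > C.
    inline bool demo() {
        int A = 1, B = 2, C = 0;
        return A<B> C;
    }
}
```

The parser can only choose between the two once it has looked up A, which is exactly the information it lacks for a dependent name inside a template.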

Now, suppose the compiler finds a line like that in a template definition, and the name on the left depends on a template parameter: say it is T::A, a member of the template parameter T.  If template parsing is done right (according to the standard) then the compiler doesn’t know what T::A is until the template is instantiated, so it can’t tell what the statement is meant to mean.  The standard provides a way out, though: a dependent qualified name is assumed not to be a type unless you say otherwise.  If you’re in this situation and you’re trying to declare a variable C of type T::A, you’re supposed to write:

typename T::A C;

This tells the compiler that you’re naming a type, not constructing an expression.  It’s rather an ugly kludge, but it’s what we have in the standard.
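In practice, the place you most often need typename is a nested name hanging off a template parameter.  A minimal sketch follows; first_or_default is a made-up helper, and the only assumption is that the container provides the usual value_type and const_iterator members.

```cpp
#include <vector>

// Returns the first element of c, or a value-initialised element if c is empty.
template <typename Container>
typename Container::value_type first_or_default(const Container& c) {
    // Container::const_iterator is a dependent name, so typename is required
    // to tell the compiler that it names a type.
    typename Container::const_iterator it = c.begin();
    if (it == c.end()) {
        return typename Container::value_type();
    }
    return *it;
}
```

Take any of the three typename keywords out and a conforming compiler should reject the template when it parses the definition, before anyone instantiates it.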

But if you do template parsing the Microsoft way, you already know what the template parameters are when you parse the template.  You know what A, B and C are.  You can tell what is meant by that line of code.  What’s the point of issuing an error message, just because the standard says you should?

To drive this home, here’s an example of code with an error according to the standard but which the Microsoft compiler has no problem with:

#include <vector>
template<typename T>
class Base {
protected:
    std::vector<T> data;
};

template<typename T>
class Deriv : public Base<T>{
public:
    Deriv() {
        data.push_back(T());
    }
};

Spotted the problem?  No?  Who would, without a compiler to tell you it’s there?  The error is a name lookup that should fail: data should not be directly available in class Deriv.  The reason is that it is actually called Base<T>::data, a dependent name!  Plain data looks like a non-dependent name, so the compiler isn’t supposed to search the dependent base class Base<T> to find it.  (The declaration std::vector<T> data; is fine as it stands: std::vector is a known template, so no typename is needed there.  You would need typename for a nested name such as std::vector<T>::iterator.)  To write this code correctly, you’re supposed to say:

#include <vector>
template<typename T>
class Base {
protected:
    std::vector<T> data;
};

template<typename T>
class Deriv: public Base<T> {
protected:
    using Base<T>::data;
public:
    Deriv() {
        data.push_back(T());
    }
};

And, while you’re thinking about how hard it was to spot that bug in the original version, remember that the compiler might not even be parsing the content of those templates if they’re not instantiated.
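The using declaration isn’t the only fix.  As a sketch of the two other common spellings, you can make the name dependent at each point of use, either with this-> or by qualifying it with the base class.  The size member is added here purely so there is something to call; it isn’t part of the original example.

```cpp
#include <cstddef>
#include <vector>

template <typename T>
class Base {
protected:
    std::vector<T> data;
};

template <typename T>
class Deriv : public Base<T> {
public:
    Deriv() {
        // this-> makes the name dependent, so lookup is deferred until
        // instantiation, when Base<T> is known.
        this->data.push_back(T());
    }

    // Hypothetical accessor: qualifying with Base<T>:: works the same way.
    std::size_t size() const {
        return Base<T>::data.size();
    }
};
```

Which one you pick is mostly taste: the using declaration fixes it once per class, the other two fix it once per use.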

4.  What is a static enum, anyway?

I have no idea who thought this was necessary, or a good idea, but someone had written code like this:

class A {
    static enum B {
        V1
    };
};

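It turns out that static isn’t the only possibility; you can declare your enums to be volatile, register or (if your version of VC++ is pre-C++11) auto.  And various combinations of the above.  None of them means anything here: these specifiers apply to variables and functions, and the declaration above declares neither.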
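I can only guess at what the original author wanted, but if the aim was an enum scoped to the class, plus perhaps something static that uses it, a conforming version might look like this sketch.  The fallback function is invented for the example.

```cpp
class A {
public:
    // An enum nested in a class needs no storage-class specifier: its
    // enumerators are already shared by every instance of A.
    enum B {
        V1
    };

    // Hypothetical static member: static applies to the function, not the enum.
    static B fallback() { return V1; }
};
```

Both A::V1 and A::fallback() can be used without ever creating an A, which is presumably what the static was groping towards.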

5.  const_iterator is for what again?

Don’t try this at home:

template<typename T>
void Erase(std::vector<T>& t, std::vector<T>::const_iterator& ci) {
    t.erase(ci);
}

And you thought const_iterators were harmless…
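For the record, here is one way to write that function so that a conforming compiler accepts it, even in C++03.  It is a sketch under two assumptions of mine: that the missing typename is one of the problems, and that the caller is happy to pass a plain iterator, since before C++11 std::vector::erase accepted only iterator, not const_iterator.

```cpp
#include <vector>

// Erases the element at 'it' from 't' and returns an iterator to the element
// that followed it.
template <typename T>
typename std::vector<T>::iterator
Erase(std::vector<T>& t, typename std::vector<T>::iterator it) {
    // std::vector<T>::iterator is a dependent name, hence the typename
    // keywords above.  T itself is deduced from the first argument only.
    return t.erase(it);
}
```

Taking the iterator by value rather than by reference also means the caller isn’t left holding an invalidated iterator that looks as if it had been updated.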