• Node.js packages in Mountain Lion

    tl;dr: make sure you add /usr/local/share/npm/bin to your PATH when installing node.js to be able to access the package binaries.

    Developing in Ruby on Rails on a Mountain Lion environment can be a pain. Although it’s a UNIX-like environment, most of the tools created for web development have been made with Linux in mind, and making the switch from a Linux box to Mac OS X is far from harmless.

    Anyway, the other day I needed to tweak Bootstrap to make the default grid wider, and instead of using the Bootstrap web site customiser, I decided to download the source code from GitHub and build it myself.

    In order to do this, you need node.js and some of the packages that come with it. I had never developed with node.js, or even played with it, so I needed to install it on the computer. That was fairly easy thanks to Homebrew: simply run brew install node.

    After node has been installed you have access to npm, the node package manager. Following the Bootstrap instructions, I installed the necessary packages:

    npm install recess connect uglify-js jshint -g

    After that I thought I was ready to build Bootstrap, but the make command complained that it could not find some of the node.js binaries I had installed a minute earlier.

    The solution to the problem, though, was rather simple. It turns out the default Homebrew formula for node doesn’t tell you which folder the binaries of globally installed packages end up in. Unless that folder is on your PATH, the system obviously can’t find the files it’s supposed to execute.

    Simply add the folder /usr/local/share/npm/bin to your PATH environment variable (for example with export PATH="/usr/local/share/npm/bin:$PATH" in your ~/.bash_profile) and you’ll be good to go.

  • Mac OS X, iTerm and the meta key

    If you use your Mac OS X as a development machine and are a regular user of the shell, chances are you are going to be using the movement commands a lot. Chances are, too, that you are using iTerm instead of the system provided Terminal app.

    Using the arrow keys is usually enough, but more often than not you need to move between words. These movements, unless you redefine them in your global or local bashrc profile (or the equivalent for whatever shell you may be using), are done with the keys b and f. Pressing C-b or C-f moves the cursor one character back or forward. Doing it with M-b or M-f does the same but with a word (if you are an Emacs user you will be familiar with those key shortcuts).

    The C stands for control key, while the M stands for meta key. On most keyboards (or keymaps, to be precise), the control key is mapped to the ctrl key and the meta key is mapped to the alt key. On Mac OS X, the meta key is also mapped to the alt key, but as you may very well know, this alt key is known as the option key and has its peculiarities.

    Now, if you open a shell in iTerm and press C-b or C-f, the output will be as expected, but not if you press M-b or M-f. Instead of moving forward or backward a word, you will see that some weird character is written on the command line.

    Fortunately this is really easy to fix in iTerm. You just need to go to the Profiles menu, edit your profile (which is most likely the default one), and then go to the Keys tab. At the bottom of the key mappings list, you will see that you can configure the behaviour of the option key. Set it to the last option (+Esc) as shown in the screenshot, and the alt key in iTerm will then send the shell the appropriate escape sequence, so all meta mappings work as expected.

    iTerm profile editor

    EDIT (30/11/2012): looks like this breaks some of the characters that are typed using the meta key, e.g. the # character (meta + 3). Another way to achieve what we want is to manually map all the meta key shortcuts. This can be done in the same window as before. Select Normal instead of +Esc and, for each key shortcut you want to map, click on the + button. In the dialog that opens, type the combination you want to map, for example alt + d, and select Send escape sequence from the drop-down. Then, in the last textbox, insert the escape sequence character you want to send (typically the same key you pressed along with the meta key).

    Select Send Escape Sequence
    Type the character to send
  • Why I will never buy an Apple product again

    Well, here it is. This is not a tech post. Not a programming post either. This is just a rant I really needed to put online for some people to know. Also, I know this will never appear on Hacker News but I always wanted to write one of those “why I <type here some random shit nobody really cares about>” posts :)

    tl;dr: Apple are a bunch of jokers.

    Here’s the story. Last week my MacBook decided to refuse to boot. It’s a late 2008 model (the first Unibody), and I was hoping it wasn’t because of a hardware issue. Actually, the machine booted, but it would refuse to show the login screen after the initial load and the second appearance of the Apple logo. For the record, I had Mountain Lion installed on it and had had no problems until then. After trying some things like repairing disk permissions, clearing the NVRAM, doing a safe boot and some other black magic suggested by the Apple support pages and by a good friend who happens to know a lot about the Mac world, I came to the conclusion that the problem was really not repairable and decided to go for a clean Mountain Lion install.

    After booting the box from an Ubuntu Live CD and backing up some non-essential files that I’d rather not lose either, I reinstalled Mountain Lion. Everything was fine. I now had a clean Mountain Lion installation on a laptop without any noticeable hardware issues. I had only lost a few hours of my time. No big deal.

    But then I went to the Mac App Store to redownload and reinstall iPhoto. To my surprise, the App did not appear as purchased. The system was asking me to pay the £13 or so it cost. The thing is, I had already purchased iPhoto 3 months ago. So I decided to email Apple support and ask for help.

    This was the answer: “You purchased iPhoto when your Apple ID country was Spain, and then you changed your Apple ID country to United Kingdom, so you lost all purchases made while your ID was linked to Spain”. And this is in fact true: I moved from Barcelona to London 3 months ago and decided to change my Apple ID country to the UK. What I certainly did not know about was that stupid policy of losing all your purchases when changing countries.

    So I replied to Customer Service to get a clarification on that, and the answer was crystal clear: “yes, all purchases in the App Store are linked to the country of your Apple ID, so if you change it, you lose the purchases”.

    It seems this is not recent news, as a search on the internet showed different people having to deal with the same issue. But that does not make it any less stupid. What kind of nonsensical, stupid policy is that?

    I could understand a similar policy with movies or music, as all those monster major distribution companies issue rights to watch or listen to certain material on a country basis. This is obviously a matter for another post and another site. But for software? And even worse, software being developed and sold by Apple themselves?

    When did we all go so fucking crazy about everything?

    So I asked Apple again: “are you telling me that I bought a piece of software from you 3 months ago to run on this machine, and now, 3 months later, after having to do a clean system installation, on the same fucking machine, I have to pay AGAIN for the same fucking software?” The answer was clear again: “yes, I’m aware this is not the answer you were expecting but it’s how it works”. And then this hilarious predefined quote at the bottom of the email telling me “how happy we are to have you as a member of the Apple family”. Ha!

    Searching on Google again, I found out that some people had managed to get a refund because of this, so I thought I had nothing to lose by trying. I emailed again (and did the same through the feedback links on Apple’s web site), asking for a refund and telling them, as nicely as I was able to given the fucking circumstances, that I felt I had been treated like garbage.

    Let’s be honest. I am not one of those really old Apple customers. This MacBook was my second one and besides that I’ve only owned an iPod mini, two iPhones and an iPad (and I have to say the overall experience with both those products and the company was clearly positive). So no, I am not one of those poor sad bastards that go queue during the night to get a fucking gadget the day it gets released (although I have to admit I’ve done that with the World of Warcraft: The Burning Crusade release). But this really has nothing to do with it. They just crossed the line. Again. And yes, I know there are some shitty legal things involved in all this regarding VAT and some other things, but this is NOTHING Apple cannot get over to keep an App purchase valid if you fucking move countries.

    In the end Apple resolved my issue by giving me some redeem codes, not only for iPhoto but also for iMovie and GarageBand (apps that I do not use and I’m very unlikely to do so in the future), but not because of what happened, according to the Customer Service email, but because “we have checked that the MacBook you bought came with iLife, so we are generous enough to let you fucking download again a software you already paid for 3 months ago”.

    Well, you know what? You’ve lost a customer for a fucking £13 App.

    Good job, Apple.

  • Common linking issues in C++

    Introduction

    C++ is a language derived from C, so in essence all problems at link time boil down to declaring something but not defining it.

    Declaring something in C++ means bringing the entity into existence in the program, so it can be used after the declaration point. Defining something means giving a complete description of the entity itself. You can declare a class or a function, and it means this class and this function do exist. But to completely describe a class and a function you have to define them. A class definition provides a list of base classes of that class, a list of members (data members and member functions) of that class, etc. A function definition provides the executable code of that function. All definitions are declarations but not all declarations are definitions.

    // Defines variable 'x'
    int x;
    // Declares variable 'y'
    extern int y;
    // Declares class 'A'
    struct A;
    // Declares function 'f(int)'
    void f(int);
    
    // Defines class 'A'
    struct A
    {
        // Declares member function 'A::g(float)'
        void g(float);
    
        // Defines member function 'A::h(char)'
        void h(char) 
        { 
          // Code
        }
    
        // Defines data member 'A::x'
        int x;
    
        // Declares static data member 'A::y'
        static int y;
    };
    
    // Defines function 'f(int)'
    void f(int)
    {
     // Code
    }
    
    // Defines member function 'A::g(float)'
    void A::g(float)
    {
     // Code
    }
    
    // Defines static data member 'A::y'
    int A::y;
    

    C++, in contrast to C, sticks strictly to the One Definition Rule, which states that an entity can be defined at most once in an entire program. Of course this may not be completely true depending on your own definition of "entity": function templates, when instantiated by the compiler, can end up defined more than once in the program (once per object file that uses them), and some magic happens at link time so this does not become a problem.

    Anyway, C++ brings its own set of linking issues which may fool even the most experienced C++ developer.

    Static data members are only declared inside the class specifier

    Some might argue that this is one of the most common sources of linking issues when using C++. Truth be told, static data members are just global variables in disguise, so most people will avoid them. However, there are cases where a static data member may come in handy, for instance when implementing the singleton pattern.

    The problem is that, although usual (nonstatic) data members are defined when they are declared inside a class (like A::x, in line 23 of the example above), static data members are only declared. Thus the static int y; in line 26 of the example above only declares A::y. Its actual definition is the int A::y; given in line 42. The actual definition of a static data member goes in the implementation file (usually a .cpp or .cc file).

    So the usual case goes like this: you realize you need a static data member. You add it to the class. Your code compiles fine but does not link. The linker says that 'A::y', the static data member you just added, is undefined. How can this be?

    Now you know the reason.

    Why is this issue hit so many times? Well, there are three reasons. A historical one: early versions of C++ compilers allowed this. A quirk in the C++ language itself: const static data members of integral or enumeration type can be declared and initialized in the class itself, and as long as only their value is used no separate definition is needed. And finally, a linguistic one: in Java and C#, static fields are declared like any other field plus a static specifier.
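
    The language quirk can be sketched with a hypothetical Buffer class (every name below is made up for illustration). The const integral member gets its value inside the class, while the other static member still needs its definition outside it:

```cpp
#include <cassert>

// Hypothetical class, for illustration only
struct Buffer
{
    // Const integral static member: may be given its value in the class
    static const int capacity = 16;

    // Any other static data member is only declared here
    static int instances;

    // Using 'capacity' as a compile-time constant needs no definition
    char data[capacity];

    Buffer() { ++instances; }
};

// Definition of the non-const member; without it the link fails
int Buffer::instances = 0;

// Definition of the const member (no initializer here). It is required
// once 'capacity' is bound to a reference or has its address taken
const int Buffer::capacity;

// Counts how many Buffer objects two local variables create
int buffers_created_by_two_locals()
{
    const int before = Buffer::instances;
    Buffer a, b;
    return Buffer::instances - before;
}

void check_buffer()
{
    assert(Buffer::capacity == 16);
    assert(buffers_created_by_two_locals() == 2);
}
```

    Commenting out the definition of Buffer::instances should reproduce the undefined reference described above, since the constructor uses the member.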

    // -- Header file
    class MySingleton
    {
    public:
        static MySingleton& getInstance()
        {
            if (singleton_ == 0)
                singleton_ = new MySingleton;
            return *singleton_;
        }
    private:
        // Usual private constructor
        MySingleton() { }
        // Declaration
        static MySingleton *singleton_;
    };
    
    // -- Implementation file
    // Definition
    MySingleton* MySingleton::singleton_ = 0;
    

    Not all headers are created equal

    The usual myth is that C++ is a superset of C. Well, it looks like a superset of C, but they are actually two different languages. That said, they share so many things that interfacing C++ and C is pretty straightforward, in particular when the former must call the latter (the opposite may be a bit more challenging).

    Thus, it is not unusual to see a C++ program #include C header files. Chances are that the headers of your operating system will be in C. Being able to #include a C header and use the entities declared in it is one of the strengths of C++. And this is the source of our second problem.

    Remember that in C++ functions may be overloaded. This means that we can use the same name when declaring two functions in the same scope as long as they have different enough parameter types.

    // Declaration of 'f(int)'
    void f(int);
    // Declaration of 'f(float)'
    void f(float);
    // Redeclaration of 'f(int)' since, in a parameter, 'const int' cannot
    // be distinguished from 'int'
    void f(const int);
    

    It may not be obvious, but in the object file these two functions cannot both be given the name f. So the compiler crafts an artificial name for f(int) and another for f(float) (this is called a decorated name or a mangled name). For instance they could be f_1_int and f_1_float (here 1 would mean the number of parameters). The C++ compiler will internally use these names when generating code, and the lower levels will just see two different names.

    But overloading does not exist in C. Thus we run into a problem here. The functions declared in a C header cannot be overloaded, so the C compiler generated their code using the plain (undecorated) name of each function. If our C++ compiler always uses a decorated name when calling them, there will be an unresolved symbol. The C++ compiler cannot tell if this is C or C++. Can it?

    Good news, it can. You can define the linkage of declarations in the code. By default linkage is C++, so overloading works as described above. When you want to #include a C header, you have to tell the C++ compiler that the linkage of its declarations is C, not C++. Most of the time you will find these lines at the beginning of a C header intended to be used from C++.

    /* Remember this is a C header, so protect ourselves when it is compiled as C */
    #ifdef __cplusplus 
    // This 'extern "C"' syntax is only valid in C++, not in C.
    extern "C" { 
    // From now everything has C linkage. 
    #endif
    
    /* Library declarations in C */
    
    #ifdef __cplusplus 
    // Close the brace opened above
    }
    #endif
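
    The difference between the two linkages can be sketched in a single C++ file. The names describe and c_add are hypothetical, and in a real project c_add would be defined in a C source file and only declared in the header:

```cpp
#include <cassert>
#include <string>

// C++ linkage: two overloads, each with its own mangled symbol name
std::string describe(int)   { return "describe(int)"; }
std::string describe(float) { return "describe(float)"; }

// C linkage: the symbol is the plain name 'c_add', which is what a
// C compiler would emit and what a C caller would look for
extern "C" int c_add(int a, int b) { return a + b; }

// Uncommenting the next line is an error, since two functions with
// C linkage cannot share a name
// extern "C" double c_add(double a, double b);

void check_linkage()
{
    assert(describe(1) == "describe(int)");
    assert(describe(1.0f) == "describe(float)");
    assert(c_add(2, 3) == 5);
}
```

    On a typical Linux toolchain, running nm on the resulting object file should show c_add undecorated next to the mangled names of the two describe overloads.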
    

    Virtual member functions and virtual tables

    Finally, one of the most confusing link errors (in my opinion) when using a C++ compiler: unresolved references to virtual tables.

    Virtual member functions are, in C++ parlance, what other programming languages (like Java) call polymorphic methods. Virtual member functions can be overridden by derived classes (descendant classes), thus when called they must be dispatched dynamically.

    struct A
    {
        virtual void vmf(float*);
        virtual void vmf2(float*);
    };
    struct B : A
    {
        virtual void vmf(float*);
        virtual void vmf3(float*);
    };
    
    void B::vmf(float*)
    {
        // Code
    }
    
    void g(A* pa, float *pfl)
    {
      // Dynamic dispatch 
      // we don't really know if A::vmf or B::vmf will be called
      pa->vmf(pfl);
    
      // Static call to A::vmf since we qualified the function being called
      pa->A::vmf(pfl);
    
      B b;
      // Static call to B::vmf, no doubts here since the dynamic type (in memory)
      // of 'b' and its declared type must be the same
      b.vmf(pfl);
    
      A& ra(*pa);
      // Dynamic dispatch again
      ra.vmf(pfl);
    }
    

    Dynamic dispatch is implemented using a virtual method table (or vtable). Every class with virtual methods (called a dynamic class) has a vtable. This vtable is a sequence of addresses of member functions. Every virtual member function is assigned an index in this table, and the address at that index points to the function implementing the virtual member function for that class. For instance, class A above has two virtual member functions, vmf and vmf2. The vtable of A will then have two entries, 0 and 1, pointing to the functions A::vmf and A::vmf2 respectively. The vtable of B will have three entries, 0, 1 and 2, pointing to the functions B::vmf, A::vmf2 and B::vmf3 respectively.

    Every object of a dynamic class has a hidden data member (called the virtual pointer) that points to the vtable of its class. When C++ specifies that a call goes through dynamic dispatch (in C++ parlance, a call to the final overrider), we do not call any function directly. Instead, through this hidden data member, we reach the vtable and, using the index of the virtual member function being called, retrieve the entry containing the address of the real function. This address is then used in an indirect call.
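
    The mechanism can be imitated by hand. The following is only a sketch of the idea, not what any particular compiler generates, and every name in it (Obj, VTable, call_vmf and so on) is made up:

```cpp
#include <cassert>

struct Obj;

// One slot per virtual member function: index 0 is vmf, index 1 is vmf2
typedef int (*Slot)(Obj*);
struct VTable { Slot slots[2]; };

struct Obj
{
    const VTable* vptr; // plays the role of the hidden virtual pointer
};

int A_vmf(Obj*)  { return 10; }
int A_vmf2(Obj*) { return 11; }
int B_vmf(Obj*)  { return 20; } // B overrides vmf only

// One statically initialized table per class
const VTable vtable_A = { { A_vmf, A_vmf2 } };
const VTable vtable_B = { { B_vmf, A_vmf2 } };

// What constructors do behind the scenes: set the virtual pointer
Obj make_A() { Obj o = { &vtable_A }; return o; }
Obj make_B() { Obj o = { &vtable_B }; return o; }

// pa->vmf() becomes: load vptr, index the table, call indirectly
int call_vmf(Obj* p)  { return p->vptr->slots[0](p); }
int call_vmf2(Obj* p) { return p->vptr->slots[1](p); }

void check_dispatch()
{
    Obj a = make_A();
    Obj b = make_B();
    assert(call_vmf(&a) == 10);  // reaches A_vmf
    assert(call_vmf(&b) == 20);  // reaches B_vmf
    assert(call_vmf2(&b) == 11); // inherited slot still reaches A_vmf2
}
```

    Note that vtable_B cannot be built without the address of A_vmf2, which is the hand-written equivalent of a vtable referring to a base class function.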

    Since both the virtual table and the virtual pointer are hidden from the eyes of the developer, sometimes errors in our code may cause link errors.

    The compiler does not emit a vtable

    This may not apply to all C++ compilers, but usually a C++ compiler only emits a vtable when it finds the definition of a virtual member function (g++, for instance, ties the vtable to the first virtual member function of the class that is neither pure nor defined inline). Note that virtual member function definitions for a given class may be scattered in several files. Magic happens again so more than one definition of the vtable of a given class in several files does not become a problem at link time.

    But what if you forget to define a virtual function? This may look contrived, but in my experience it does happen by accident. The problem lies in the error at link time, which is really confusing.

    struct A
    {
        int x_;
        A(int x) : x_(x) { }
    
        // We forget to define A::foo
        virtual void foo();
    };
    
    void quux(A* a)
    {
        // Dynamic dispatch
        a->foo();
    }
    
    int main(int argc, char * argv[])
    {
        A a(3);
        quux(&a);
    }
    

    If you compile and link this with g++ (I use -g since it improves link error messages by using the debugging information), you get the following:

    $ g++ -o prova test.cc -g
    /tmp/ccl71r2A.o: In function `A':
    test.cc:4: undefined reference to `vtable for A'
    collect2: ld returned 1 exit status

    But line 4 is the constructor. You see now how confusing this message is, don't you? What is going on?

    Well, everything makes sense if we remember that hidden data member I mentioned above, the virtual pointer. As a data member of a class it must be initialized in the constructor. It is initialized with the address of the virtual table of A. But the virtual table of A was not emitted since we forgot to define all virtual member functions. Thus, unresolved reference for the virtual table.
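
    A minimal sketch of the fix follows: the same class with the missing definition supplied. To make the result observable, foo and quux return int here, which is a change of mine with respect to the example above:

```cpp
#include <cassert>

struct A
{
    int x_;
    A(int x) : x_(x) { }

    virtual int foo();
};

// The definition that was missing. Once the compiler sees it, the
// vtable for A is emitted and the constructor can refer to it
int A::foo()
{
    return x_;
}

int quux(A* a)
{
    // Dynamic dispatch
    return a->foo();
}

void check_fixed()
{
    A a(3);
    assert(quux(&a) == 3);
}
```

    With A::foo defined, the reference to the vtable made by the constructor is resolved and the program should link.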

    Missing virtual member functions in base classes

    Remember that the vtable contains entries for all the virtual member functions of the base classes. The vtable is statically initialized (that is, the compiler "hardcodes" the address of each entry in the data section of the generated code). What if we forget to define a virtual member function of a base class?

    Consider this example:

    struct A
    {
        int x_;
        A(int x) : x_(x) { }
    
        virtual void foo();
        // We forget to define A::foo2
        virtual void foo2();
    };
    
    void A::foo() 
    {
        // Definition of A::foo
    }
    
    struct B : A 
    {
        B(int x) : A(x) { }
    
        virtual void foo() 
        { 
            // Definition of B::foo
        }
    };
    
    void quux(A* a)
    {
        a->foo();
    }
    
    int main(int argc, char * argv[])
    {
        B b(3);
        quux(&b);
    }
    

    If we compile and link with g++ we get:

    /tmp/cc4t9NG3.o:(.rodata._ZTV1B[vtable for B]+0xc): undefined reference to `A::foo2()'
    /tmp/cc4t9NG3.o:(.rodata._ZTV1A[vtable for A]+0xc): undefined reference to `A::foo2()'
    collect2: ld returned 1 exit status

    This happens because vtables of A and B refer to A::foo2, but we forgot to define it. Fortunately, now the error message is easier to grasp: some function is missing.
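
    Besides simply defining A::foo2, another way out, when the base class has no sensible implementation to offer, is to declare the function pure virtual. This is a sketch of that alternative rather than the only possible fix, and the functions return int only so that the result can be checked:

```cpp
#include <cassert>

struct A
{
    int x_;
    A(int x) : x_(x) { }

    virtual int foo();
    // Pure virtual: no definition of A::foo2 is required any more,
    // but A can no longer be instantiated on its own
    virtual int foo2() = 0;
};

int A::foo()
{
    return x_;
}

struct B : A
{
    B(int x) : A(x) { }

    virtual int foo()  { return x_ + 1; }
    // B has to override foo2 to be instantiable
    virtual int foo2() { return x_ * 2; }
};

int quux(A* a)
{
    // Both calls are dispatched dynamically to B
    return a->foo() + a->foo2();
}

void check_pure_virtual()
{
    B b(3);
    assert(quux(&b) == 10); // (3 + 1) + (3 * 2)
}
```

    The trade-off is that every concrete derived class must now provide foo2, which the compiler checks at compile time instead of leaving it to the linker.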

    Obviously, many more link errors caused by C++ exist, but I think the ones shown here are quite common and the error messages related to them are quite confusing.

  • Crazy stuff in C++ (1)

    Introduction

    C++ is a controversial language: you love it or you hate it. As always, knowing something better allows one to make better arguments for or against it. This is what this series is about. Here I’ll explain some of the lesser-known (except for C++ expert programmers, of course) features of C++.

    Let’s start with templates, which account for such a huge part of the language.

    Templates

    Everyone knows that C++ has templates. Templates are an implementation of the «I have an algorithm that can be used with many different types» idea. This idea is called generic programming and is a pretty powerful one. This is why it is present in almost all modern languages.

    Back to C++. C++ defines two kinds of templates: class templates and function templates. Class templates define an infinite set of classes while function templates define an infinite set of functions. The elements of these sets of classes or functions are called specializations. Every template has a template-name which will be used to name a specific specialization.

    Template declarations

    Consider these two declarations

    template <typename T>
    struct my_list { ... };
    
    template <typename T>
    T max(T a, T b) { return a > b ? a : b; }
    

    These are template declarations. The first one declares a class template and its template-name is my_list; the second one defines a function template and its template-name is max. A template declaration is just a declaration preceded by something without an official name that starts with template <…>. I will call it the template header (but note that this name is not blessed by the C++ standard at all; it just makes sense to me to call it like this).

    The template header defines what the parameters of the template are. These are called the template parameters. A type-template parameter, like the T shown above, is a “type variable”. This is the most usual template parameter, as it allows us to parameterize the declaration over one or more type variables. C++, though, has two more kinds of template parameters: nontype-template parameters and (the funnily named) template-template parameters. A nontype-template parameter allows us to parameterize the declaration over a (compile-time) constant integer value. Here “integer value” is a very broad term: of course it includes all integers, but also enum values (enumerators) and addresses of (statically allocated) variables and functions. A template-template parameter allows us to parameterize a declaration over another class template with appropriate template parameters.

    template <typename T, int N> // N is a nontype-template parameter
    struct my_fixed_array { };
    
    template <template <typename T> class MyContainer> // MyContainer is a template-template parameter
    struct adaptor { };
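
    To see both kinds of parameters doing something, the two declarations can be given a body. This is just one possible sketch: the members and the my_box container are hypothetical additions of mine:

```cpp
#include <cassert>

// N is a nontype-template parameter: it fixes the size at compile time
template <typename T, int N>
struct my_fixed_array
{
    T elems[N];
    int size() const { return N; }
};

// A hypothetical one-parameter container to plug into the adaptor
template <typename T>
struct my_box
{
    T value;
};

// MyContainer is a template-template parameter: the adaptor itself
// decides which element types the container is instantiated with
template <template <typename T> class MyContainer>
struct adaptor
{
    MyContainer<int>    ints;
    MyContainer<double> doubles;
};

void check_parameters()
{
    my_fixed_array<char, 8> a;
    assert(a.size() == 8);
    assert(sizeof(a.elems) == 8);

    adaptor<my_box> ad; // we pass the template my_box, not a type
    ad.ints.value = 7;
    ad.doubles.value = 0.5;
    assert(ad.ints.value == 7);
    assert(ad.doubles.value == 0.5);
}
```

    Note that adaptor<my_box> receives the template itself, while something like adaptor<my_box<int> > would be rejected because my_box<int> is a type.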
    

    Specializations

    I said above that a class template or function template defines an infinite set of classes or functions and that each element of that set is called a specialization. There is a specialization for every possible value that a template parameter can have. Such values are not bounded, thus there is an infinite number of specializations (well, we could argue that constant integer values are finite in the language, but types are clearly not finite).

    We give value to template parameters of a template by means of template arguments. These template arguments always appear in what is called a template-id. A template-id is just the template-name followed by a list of template-arguments enclosed in < and >.

    my_list<int> l; // Here T has value int, we will write it as T ← int
    max<float>(3.4f, 5.6f); // T ← float
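
    A small sketch of template-ids at work. The function is renamed my_max here (my choice) so that it cannot clash with std::max, and the pick_* helpers are hypothetical:

```cpp
#include <cassert>

template <typename T>
T my_max(T a, T b) { return a > b ? a : b; }

// T ← int
int pick_int() { return my_max<int>(3, 7); }

// T ← float
float pick_float() { return my_max<float>(3.4f, 5.6f); }

// T ← double: the template-id decides the specialization, so the
// int argument 2 is converted to double before the comparison
double pick_mixed() { return my_max<double>(2, 3.5); }

void check_template_ids()
{
    assert(pick_int() == 7);
    assert(pick_float() == 5.6f);
    assert(pick_mixed() == 3.5);
}
```

    Each distinct template-id (my_max<int>, my_max<float>, my_max<double>) names a different specialization, that is, a different function in the generated code.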
    

    Primary template and partial specializations

    When we first declare a class template or a function template, such declaration defines the primary template.

    template <typename T>
    struct my_list { ... };
    
    template <typename T>
    void swap(T& a, T& b);
    

    Class templates (but not function templates!) can have an extra set of template declarations called partial specializations. A partial specialization looks like a normal class template declaration but the template-name is now a template-id where the template-arguments partially specialize the given template parameters.

    // 1) Partial specialization for "pointer to (any) P" type
    template <typename P>
    struct my_list<P*> { }; 
    
    // 2) Partial specialization for "array of (any) Size of (any) Element"
    template <typename Element, int Size>
    struct my_list<Element[Size]> { };
    
    // 3) Partial specialization for "pointer to function with two parameters 
    // Arg1 and Arg2 returning Ret"
    template <typename Ret, typename Arg1, typename Arg2>
    struct my_list<Ret (*)(Arg1, Arg2)> { };
    

    A C++ compiler will always pick the partial specialization (if any) that is “closer” to the one requested in the template arguments. If no partial specialization matches, the primary template is chosen instead. The exact algorithm is not important here.

    my_list<int> l0; // will pick the primary template T ← int
    
    my_list<int*> l1; // will pick partial specialization 1) 
    // where P ← int (note that respect to the primary template this is T ← int*)
    
    my_list<int[10]> l2; // will pick partial specialization 2) 
    // where Element ← int and Size ← 10
    
    my_list<int (*)(float, double)> l3; // will pick partial specialization 3) 
    // where Ret ← int, Arg1 ← float and Arg2 ← double
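
    The selection can be observed directly by giving the primary template and each partial specialization a member that says who it is. The kind_of template below is a hypothetical stand-in for my_list with the same three partial specializations:

```cpp
#include <cassert>
#include <string>

// Primary template
template <typename T>
struct kind_of
{
    static const char* name() { return "primary"; }
};

// 1) Pointer to (any) P
template <typename P>
struct kind_of<P*>
{
    static const char* name() { return "pointer"; }
};

// 2) Array of (any) Size of (any) Element
template <typename Element, int Size>
struct kind_of<Element[Size]>
{
    static const char* name() { return "array"; }
};

// 3) Pointer to function with two parameters
template <typename Ret, typename Arg1, typename Arg2>
struct kind_of<Ret (*)(Arg1, Arg2)>
{
    static const char* name() { return "function pointer"; }
};

void check_selection()
{
    assert(std::string(kind_of<int>::name()) == "primary");
    assert(std::string(kind_of<int*>::name()) == "pointer");
    assert(std::string(kind_of<int[10]>::name()) == "array");
    // Both 1) and 3) match a pointer to function; 3) is the closer one
    assert(std::string(kind_of<int (*)(float, double)>::name())
           == "function pointer");
}
```

    The last case is the interesting one: a pointer to function is still a pointer, so two partial specializations are candidates and the compiler picks the more specialized of the two.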
    

    I think this is enough for today regarding C++ templates. More craziness to come. Stay tuned.