Howard E. Hinnant
2006-05-06

Some Design Rationale for the Rvalue References

A common request or question about the rvalue reference work is:

Why can't move constructors and move assignment operators be automatically generated like copy constructors and copy assignment operators?

Indeed this would be very convenient, and most of the time it would generate correct code. The problem is that it won't always generate correct code. And in this case, simply recompiling working C++03 code in a C++0X compiler could produce code with run time errors in it (silent breakage, the worst kind).

A simple example of a class which can not have compiler generated move members is one which is self referencing.

Aren't self referencing classes pretty rare? Isn't such code so poorly designed it's broken anyway?

Self referencing classes aren't common, but they are not rare either. And no, there are some very good designs which exhibit self referencing behavior. For example two commercial implementations of the following classes from the std::lib are self referencing for very good reasons:

The implementations in question embed the "end node" for these types into the class itself, instead of putting it on the heap. This makes default construction both wicked fast and a nothrow operation (both extremely valuable properties of a type).
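
A minimal sketch of the hazard, assuming a hypothetical TinyList whose end node is a data member. The names NodeBase, TinyList and memberwise_t are illustrative only and are not taken from any std::lib implementation. The tagged constructor simulates what a memberwise move would do: it copies the end node's pointers verbatim, so the new object still points into the old one.

```cpp
#include <cassert>

// Hypothetical node header; illustrative only.
struct NodeBase
{
    NodeBase* next;
    NodeBase* prev;
};

// Hypothetical list that embeds its end node instead of heap-allocating it.
class TinyList
{
    NodeBase end_;
public:
    struct memberwise_t {};  // tag selecting the simulated memberwise move

    // Default construction only links the end node to itself:
    // no allocation, so nothing can throw.
    TinyList() { end_.next = &end_; end_.prev = &end_; }

    // Simulated compiler-generated move: copies the pointers verbatim.
    TinyList(TinyList& other, memberwise_t) : end_(other.end_) {}

    // Invariant for an empty list: the end node links to itself.
    bool points_to_self() const
        { return end_.next == &end_ && end_.prev == &end_; }
};

// Returns whether an empty list still satisfies its invariant after
// the simulated memberwise move.
bool memberwise_move_keeps_invariant()
{
    TinyList a;
    assert(a.points_to_self());               // holds for the source
    TinyList b(a, TinyList::memberwise_t());
    return b.points_to_self();                // b's end node links to a's
}
```

Here memberwise_move_keeps_invariant() returns false: the end node of b refers to the end node of a rather than to itself, so b is left dependent on the lifetime and address of a.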

Can move members be automatically (and implicitly) generated in special circumstances? Perhaps if a class consists only of other classes which already have move members, then it would always be safe to implicitly and automatically generate the move members. For example:

class A
{
private:
    std::vector<B> some_data;
public:
    // ...
};

A's move constructor is always going to look like:

A(A&& a) : some_data(std::move(a.some_data)) {}

So why can't we just have the compiler generate that automatically?

Up until recently I had only generated a fairly contrived example that would break under these rules. Below I present very reasonable example code which shows the potential for working correctly in C++03, but breaking (if move members are automatically generated) when recompiled under C++0X.

The code below is based on the Observer Pattern described in Design Patterns by Gamma, Helm, Johnson and Vlissides. There are two base class types: Listener and Broadcaster. Clients derive from these base classes for the purpose of having the Broadcaster types transmit a message upon some event to all of the Listener types that have registered with it. The Broadcaster stores a std::vector<Listener*> as a data member so that it knows whom to broadcast the message to. In the code shown below the Listener also keeps a std::vector<Broadcaster*> so that it can subscribe to multiple Broadcasters and so that it can tell a Broadcaster when it wants to unsubscribe (such as upon destruction of the Listener).

The example is coded with four variants below:

  1. CPP03: As it would be for C++03. No move semantics.
  2. CPP0X_NOM: As it would be for C++0X, but completely ignorant of move semantics.
  3. CPP0X_CGM: As it would be for C++0X, completely ignorant of move semantics, but with simulated compiler-generated move constructors and move assignment operators.
  4. CPP0X_UGM: As it would be for C++0X, and with correct user written move constructors and move assignment operators.
#include <algorithm>  // std::find
#include <iostream>
#include <utility>    // std::move
#include <vector>

// C++03
#define CPP03      1
// C++0X with rvalue-reference in language and std::lib, but not user code
#define CPP0X_NOM  2
// C++0X with rvalue-reference in language and std::lib, user code has compiler-generated move
#define CPP0X_CGM  3
// C++0X with rvalue-reference in language and std::lib, user code has user-written move
#define CPP0X_UGM  4

#define MODE CPP0X_CGM

class Listener;
class Broadcaster;

class BroadcasterBase
{
    friend class Broadcaster;
private:
    BroadcasterBase(const BroadcasterBase&);
    BroadcasterBase& operator=(const BroadcasterBase&);
protected:
    std::vector<Listener*> listeners_;

    BroadcasterBase() {}
    ~BroadcasterBase();
#if MODE == CPP0X_CGM || MODE == CPP0X_UGM
    BroadcasterBase(BroadcasterBase&& b)
        : listeners_(std::move(b.listeners_))
        {}
#endif
#if MODE == CPP0X_CGM
    BroadcasterBase& operator=(BroadcasterBase&& b)
        {listeners_ = std::move(b.listeners_); return *this;}
#endif
};

class Broadcaster
    : private BroadcasterBase
{
    friend class BroadcasterBase;
private:

    void removeAllListeners();
public:
    Broadcaster() {}
    virtual ~Broadcaster() {}

    Broadcaster(const Broadcaster&);
    Broadcaster& operator=(const Broadcaster&);

#if MODE == CPP0X_CGM
    Broadcaster(Broadcaster&& b) : BroadcasterBase(std::move(b)) {}
    Broadcaster& operator=(Broadcaster&& b) {BroadcasterBase::operator=(std::move(b)); return *this;}
#endif

#if MODE == CPP0X_UGM
    Broadcaster(Broadcaster&& b);
    Broadcaster& operator=(Broadcaster&& b);
#endif

    void broadcast();
    void removeListener(Listener& x);
    void addListener(Listener&);
};

class Listener
{
    friend class BroadcasterBase;
    friend class Broadcaster;
private:
    Listener(const Listener&);             // undefined
    Listener& operator=(const Listener&);  // undefined
    void addBroadcaster(Broadcaster&);
    void removeBroadcaster(Broadcaster&);
    virtual void receive_message(Broadcaster*) = 0;
protected:
    std::vector<Broadcaster*> broadcasters_;

public:
    Listener() {}
    virtual ~Listener();

};

BroadcasterBase::~BroadcasterBase()
{
    for (std::vector<Listener*>::iterator i = listeners_.begin(), e = listeners_.end();
            i != e; ++i)
        (*i)->removeBroadcaster(static_cast<Broadcaster&>(*this));
}

void
Broadcaster::removeAllListeners()
{
    for (std::vector<Listener*>::iterator i = listeners_.begin(), e = listeners_.end();
            i != e; ++i)
        (*i)->removeBroadcaster(*this);
    // Forget the old listeners so that copy assignment does not keep them.
    listeners_.clear();
}

void
Broadcaster::broadcast()
{
    for (std::vector<Listener*>::iterator i = listeners_.begin(), e = listeners_.end();
            i != e; ++i)
        (*i)->receive_message(this);
}

Broadcaster::Broadcaster(const Broadcaster& b)
{
    listeners_.reserve(b.listeners_.size());
    for (std::vector<Listener*>::const_iterator i = b.listeners_.begin(), e = b.listeners_.end();
            i != e; ++i)
    {
        (*i)->addBroadcaster(*this);
        listeners_.push_back(*i);
    }
}

Broadcaster&
Broadcaster::operator=(const Broadcaster& b)
{
    if (this != &b)
    {
        removeAllListeners();
        listeners_.reserve(b.listeners_.size());
        for (std::vector<Listener*>::const_iterator i = b.listeners_.begin(), e = b.listeners_.end();
                i != e; ++i)
        {
            (*i)->addBroadcaster(*this);
            listeners_.push_back(*i);
        }
    }
    return *this;
}

#if MODE == CPP0X_UGM
Broadcaster::Broadcaster(Broadcaster&& b)
    : BroadcasterBase(std::move(b))
{
    for (std::vector<Listener*>::iterator i = listeners_.begin(), e = listeners_.end();
            i != e; ++i)
    {
        (*i)->removeBroadcaster(b);
        (*i)->addBroadcaster(*this);
    }
}

Broadcaster&
Broadcaster::operator=(Broadcaster&& b)
{
    removeAllListeners();
    listeners_ = std::move(b.listeners_);
    for (std::vector<Listener*>::iterator i = listeners_.begin(), e = listeners_.end();
            i != e; ++i)
    {
        (*i)->removeBroadcaster(b);
        (*i)->addBroadcaster(*this);
    }
    return *this;
}
#endif

void
Broadcaster::removeListener(Listener& x)
{
    std::vector<Listener*>::iterator i = std::find(listeners_.begin(), listeners_.end(), &x);
    if (i != listeners_.end())
        listeners_.erase(i);
}

void
Broadcaster::addListener(Listener& x)
{
    if (listeners_.size() == listeners_.capacity())
        listeners_.reserve(2*listeners_.size());
    x.addBroadcaster(*this);
    listeners_.push_back(&x);
}

Listener::~Listener()
{
    for (std::vector<Broadcaster*>::iterator i = broadcasters_.begin(), e = broadcasters_.end();
            i != e; ++i)
        (*i)->removeListener(*this);
}

void
Listener::removeBroadcaster(Broadcaster& x)
{
    std::vector<Broadcaster*>::iterator i = std::find(broadcasters_.begin(), broadcasters_.end(), &x);
    if (i != broadcasters_.end())
        broadcasters_.erase(i);
}

void
Listener::addBroadcaster(Broadcaster& x)
{
    broadcasters_.push_back(&x);
}

class ConcreteListener
    : public Listener
{
private:
    int id_;
public:
    explicit ConcreteListener(int id)
        : id_(id) {}

    virtual void receive_message(Broadcaster* b)
    {
        if (std::find(broadcasters_.begin(), broadcasters_.end(), b) != broadcasters_.end())
            std::cout << "Listener " << id_ << " received message\n";
        else
            std::cout << "Listener " << id_ << " received stray message\n";
    }
};

class ConcreteBroadcaster
    : public Broadcaster
{
private:
public:
    ConcreteBroadcaster() {}
#if MODE == CPP0X_CGM || MODE == CPP0X_UGM
    ConcreteBroadcaster(ConcreteBroadcaster&& x) : Broadcaster(std::move(x)) {}
#endif
};

extern bool x;

ConcreteBroadcaster ConcreteBroadcaster_factory(Listener& l1, Listener& l2)
{
    ConcreteBroadcaster broadcaster;
    broadcaster.addListener(l1);
    broadcaster.addListener(l2);
    if (x)  // fake out RVO
         return broadcaster;
    return ConcreteBroadcaster();
}

bool x = true;

int main()
{
    ConcreteListener listener1(1);
    ConcreteListener listener2(2);
    ConcreteBroadcaster broadcaster = ConcreteBroadcaster_factory(listener1, listener2);
    broadcaster.broadcast();
}

With MODE set to CPP03, there are no rvalue references, and the std::vector doesn't perform any move semantics. The code example should compile on your compiler today and print out:

Listener 1 received message
Listener 2 received message

With MODE set to CPP0X_NOM, there are actually no source code changes in the example above. But in this mode I turn on rvalue reference support as currently proposed to the C++ committee, and implemented in CodeWarrior. Although it does not directly change code in the example, the underlying std::vector which is used in the example now understands move semantics. This mode models move semantics without automatically generated move constructors and move assignment operators. The result is the same:

Listener 1 received message
Listener 2 received message

With MODE set to CPP0X_CGM, I have given Broadcaster and ConcreteBroadcaster move constructors identical to what the compiler would automatically generate if C++0X specified implicit generation. The output is:

Listener 1 received stray message

... followed by an application crash. Two things have gone wrong.

  1. A Listener has received a message from a Broadcaster which it does not recognize.
  2. When the Listener destructs, it tries to remove itself from a Broadcaster that itself has already destructed. This results in a bad memory access on my system.

Both of these problems are due to an incorrectly written move constructor for Broadcaster, which failed to notify the Listeners that the address of the Broadcaster they were subscribed to was changing.

With MODE set to CPP0X_UGM, the example once again works and has the correct output. In this mode I've coded the move members as they should be (which is obviously different from the compiler-generated variants).
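
The difference between the memberwise and the hand-written move can be reduced to a few lines. The sketch below is not part of the example program: Hub and Spoke are hypothetical, stripped-down stand-ins for Broadcaster and Listener, and the sketch assumes each Spoke is attached to at most one Hub. It shows that moving the vector alone leaves every back-pointer aimed at the source object, while the corrected transfer re-points them.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

struct Hub;

// Hypothetical stand-in for Listener: remembers the hub it is attached to.
struct Spoke
{
    Hub* hub;
    Spoke() : hub(0) {}
};

// Hypothetical stand-in for Broadcaster.
struct Hub
{
    std::vector<Spoke*> spokes;

    void attach(Spoke& s)
    {
        assert(s.hub == 0);   // sketch assumes one hub per spoke
        spokes.push_back(&s);
        s.hub = this;
    }
};

// Memberwise transfer: the vector moves, the back-pointers are untouched.
void memberwise_transfer(Hub& to, Hub& from)
{
    to.spokes = std::move(from.spokes);
}

// Corrected transfer: also re-point every spoke at its new hub.
void correct_transfer(Hub& to, Hub& from)
{
    to.spokes = std::move(from.spokes);
    for (std::size_t i = 0; i < to.spokes.size(); ++i)
        to.spokes[i]->hub = &to;
}

// True if every spoke held by h points back at h.
bool back_pointers_valid(const Hub& h)
{
    for (std::size_t i = 0; i < h.spokes.size(); ++i)
        if (h.spokes[i]->hub != &h)
            return false;
    return true;
}
```

After memberwise_transfer(b, a), back_pointers_valid(b) is false because each Spoke still names a as its hub. After correct_transfer(d, c), back_pointers_valid(d) is true.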

Despite the dangers of having the compiler implicitly generate move members, the fact remains that most of the time, the compiler would generate correct code. N1717 proposes an explicit syntax for auto-generating some special members which I do support. The difference here is that the class author must explicitly request the automatic code generation.
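
As an illustration of what an explicit request can look like, the sketch below uses the "= default" spelling found in later C++ standards. This is an assumption made for the sake of a compilable example; it is not N1717's proposed syntax. Record is a hypothetical class whose only member already has correct move members, so memberwise move is safe, and the author says so in the class definition.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical class whose members all have correct move members.
class Record
{
    std::vector<std::string> fields_;
public:
    Record() = default;

    // The author opts in to memberwise copy and move explicitly.
    Record(const Record&) = default;
    Record& operator=(const Record&) = default;
    Record(Record&&) = default;
    Record& operator=(Record&&) = default;

    void add(const std::string& s) { fields_.push_back(s); }
    std::size_t size() const { return fields_.size(); }
};

// Moves a one-field Record and reports the size of the destination.
std::size_t size_after_move()
{
    Record a;
    a.add("x");
    assert(a.size() == 1);
    Record b(std::move(a));   // uses the explicitly requested move constructor
    return b.size();
}
```

A class such as Broadcaster would simply not make this request, and would supply hand-written move members instead.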