Modern Maintainable Code

Summary:

2) It's easy to accidentally eliminate overloads from overload resolution (the set of overloads the compiler considers when calling a function), especially when doing the above. We'll discuss a simple technique to ensure that you don't leave your users in an unfortunate position by accidentally being too specific.

We'll also discuss a few other things along the way, like one reason why non-member functions are more flexible than member functions. We'll highlight why the modern C++11 guideline: 'prefer using the non-member std::begin, std::end, and std::swap functions' exists and why things like non-member std::size are in our near future [1].

One thing people often forget is that overloading lets you extend your use of other people's code. For example: Suppose I'd like to pull in some random library from the internet, and because the designer of that library is "particular" (read: silly), they wrote a container Foo that correctly implemented the forward iterator interface, but provided only a begin method on the Foo class, without an accompanying end method.

I'd love to be able to write templated functions that take containers (including both Foo and STL containers like vector) and run STL algorithms on them, but because this class is weird, it's not apparent how to make that happen.

Let me back up a minute, show what I mean, and explain why someone might do this. This is what's going on:
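Here's a minimal sketch of the kind of class I have in mind. The names Foo and FooIterator come from this article, but everything inside them (the int elements, the vector storage, the pointer members) is my own assumption for illustration, not the library's real code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <iterator>
#include <vector>

// Hypothetical stand-in for the library's forward iterator.
// A default-constructed FooIterator holds no position at all.
class FooIterator {
public:
    using iterator_category = std::forward_iterator_tag;
    using value_type = int;
    using difference_type = std::ptrdiff_t;
    using pointer = const int*;
    using reference = const int&;

    FooIterator() = default;
    FooIterator(const int* cur, const int* last)
        : cur_(cur == last ? nullptr : cur), last_(last) {}

    reference operator*() const { return *cur_; }
    FooIterator& operator++() {
        if (++cur_ == last_) cur_ = nullptr;  // ran past the last element: drop the position
        return *this;
    }
    FooIterator operator++(int) { FooIterator old = *this; ++*this; return old; }
    friend bool operator==(const FooIterator& a, const FooIterator& b) { return a.cur_ == b.cur_; }
    friend bool operator!=(const FooIterator& a, const FooIterator& b) { return !(a == b); }

private:
    const int* cur_ = nullptr;
    const int* last_ = nullptr;
};

// Hypothetical stand-in for the library's container: begin() but no end().
class Foo {
public:
    Foo(std::initializer_list<int> values) : data_(values) {}
    FooIterator begin() const {
        return FooIterator(data_.data(), data_.data() + data_.size());
    }

private:
    std::vector<int> data_;
};

// Get an iterator from begin(), dereference it, and advance it.
void demo() {
    Foo myContainer{3, 5, 8};
    FooIterator it = myContainer.begin();
    assert(*it == 3);
    ++it;
    assert(*it == 5);
}
```

Calling demo() from main exercises the iterator. Notice that nothing on Foo itself tells us where the range stops.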

The question is: how do we know when it's no longer safe to dereference the iterator? How do we know that we've exhausted all of the elements inside the Foo container? The standard C++ way would be to call myContainer.end() and compare the iterator to that; if the iterator equals myContainer.end(), we've exhausted the range.

The fundamental issue: This developer chose to represent the "end" iterator as a default-constructed FooIterator.

This design decision isn't actually so far-fetched, as it's exactly what is done with std::istream_iterator (though stream classes also don't have a begin member).

The original author got away with this decision because all they ever did was call functions directly, like so: std::find(f.begin(), FooIterator{}, 5);, where they specified the "end" iterator directly.

Unfortunately, although their choice works out ok at that level, it doesn't compose well:

We could write a function bar that takes in a templated Container type and runs find over that container. Again, this isn't so far-fetched: why not reuse code between several containers? The problem is that I can either write an implementation that works with STL-like containers (things that implement container.end()) or an implementation that works with Foo and knows how to construct its equivalent to the "end" iterator.

I could write an overload of bar to solve this, one for each style of container, but what if I've also got another function baz that has the same fundamental problem? I'd have to duplicate most of baz too in another overload.
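To show how quickly that gets repetitive, here's a rough sketch of the two-overload approach. Foo and FooIterator are hypothetical stand-ins I wrote for this example (int elements, vector storage), and searching for the value 5 is just borrowed from the earlier find call.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <iterator>
#include <vector>

// Hypothetical stand-in for the library's iterator: a default-constructed
// FooIterator plays the role of "end".
class FooIterator {
public:
    using iterator_category = std::forward_iterator_tag;
    using value_type = int;
    using difference_type = std::ptrdiff_t;
    using pointer = const int*;
    using reference = const int&;

    FooIterator() = default;
    FooIterator(const int* cur, const int* last)
        : cur_(cur == last ? nullptr : cur), last_(last) {}

    reference operator*() const { return *cur_; }
    FooIterator& operator++() {
        if (++cur_ == last_) cur_ = nullptr;  // exhausted: become "end"
        return *this;
    }
    FooIterator operator++(int) { FooIterator old = *this; ++*this; return old; }
    friend bool operator==(const FooIterator& a, const FooIterator& b) { return a.cur_ == b.cur_; }
    friend bool operator!=(const FooIterator& a, const FooIterator& b) { return !(a == b); }

private:
    const int* cur_ = nullptr;
    const int* last_ = nullptr;
};

// Hypothetical stand-in for the library's container: begin() but no end().
class Foo {
public:
    Foo(std::initializer_list<int> values) : data_(values) {}
    FooIterator begin() const {
        return FooIterator(data_.data(), data_.data() + data_.size());
    }

private:
    std::vector<int> data_;
};

// Overload 1: any STL-like container with begin() and end() members.
template <typename Container>
bool bar(const Container& c) {
    return std::find(c.begin(), c.end(), 5) != c.end();
}

// Overload 2: Foo only. The body is a near copy of overload 1, differing
// only in how the "end" iterator is produced.
bool bar(const Foo& c) {
    return std::find(c.begin(), FooIterator{}, 5) != FooIterator{};
}

void demo() {
    std::vector<int> v{1, 5, 9};
    Foo f{3, 5, 8};
    assert(bar(v));  // picks the template
    assert(bar(f));  // picks the Foo overload (non-template wins the tie)
}
```

It works, but every generic function written this way needs its own Foo twin.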

I could also write a couple overloads of find that take the containers directly (rather than iterators), one version for the STL containers, one for Foo, and that's a bit better than our first idea, but if I wanted to also call std::transform on my container, I again need to do more work.

No. Boil the problem down to its smallest component: There is no "end" method on Foo. Solve that problem.

Now get frustrated because you "can't". The code is a library from the internet. Either you don't have access to the source code, and therefore cannot add an end member function, or you do have access, but know that you're going to have to remember to add an end member function every time they release a new version.

There's a better way. C++11 introduced a non-member version of the begin and end functions for this exact reason. I cannot ever add a member function to a built-in type like a fixed-size array (say, int x[20];), but it would be meaningful to have a begin and end function on them for use with standard algorithms. Similarly, I cannot add an end member function to Foo, but it still makes sense to talk about that concept. The way we do that is via a normal (non-member) function.

The implementation of the non-member function, std::end, is simple. There are 2 overloads in the STL, one which is templated over any type T and returns the result of calling instanceOfT.end() and another overload for fixed-size arrays that returns the arrayName + sizeOfArray [2]. Why not add one of our own that takes a Foo?

This looks like:
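Here's a sketch of that overload, assuming (as described earlier) that a default-constructed FooIterator is the "end" value. The Foo and FooIterator definitions are hypothetical stand-ins of mine so the example compiles on its own; the only line that matters is the end function.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <iterator>
#include <vector>

// Hypothetical stand-in for the library's iterator: a default-constructed
// FooIterator plays the role of "end".
class FooIterator {
public:
    using iterator_category = std::forward_iterator_tag;
    using value_type = int;
    using difference_type = std::ptrdiff_t;
    using pointer = const int*;
    using reference = const int&;

    FooIterator() = default;
    FooIterator(const int* cur, const int* last)
        : cur_(cur == last ? nullptr : cur), last_(last) {}

    reference operator*() const { return *cur_; }
    FooIterator& operator++() {
        if (++cur_ == last_) cur_ = nullptr;  // exhausted: become "end"
        return *this;
    }
    FooIterator operator++(int) { FooIterator old = *this; ++*this; return old; }
    friend bool operator==(const FooIterator& a, const FooIterator& b) { return a.cur_ == b.cur_; }
    friend bool operator!=(const FooIterator& a, const FooIterator& b) { return !(a == b); }

private:
    const int* cur_ = nullptr;
    const int* last_ = nullptr;
};

// Hypothetical stand-in for the library's container: begin() but no end().
class Foo {
public:
    Foo(std::initializer_list<int> values) : data_(values) {}
    FooIterator begin() const {
        return FooIterator(data_.data(), data_.data() + data_.size());
    }

private:
    std::vector<int> data_;
};

// Our own non-member overload, written outside the library and outside std.
FooIterator end(const Foo&) { return FooIterator{}; }

void demo() {
    Foo f{3, 5, 8};
    // std::begin forwards to f.begin(); end(f) is the overload above.
    FooIterator it = std::find(std::begin(f), end(f), 5);
    assert(it != end(f));
    assert(*it == 5);
}
```

A const-qualified parameter is enough here because this sketch's iterator is read-only; real code would likely want the const and non-const flavors mentioned in footnote [2].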

We can use the existing non-member std::begin() function for free because our container implements a begin method. We're golden there: the STL's non-member begin function will just call that.

The one thing we still need to fix is bar: bar calls the member version of begin and end. We should change that so that it uses the more general/extensible non-member form:

In general, you should always prefer using the non-member forms of functions like begin and end. They will work in more cases.

How to avoid accidentally removing overloads from scope:

There's an easy way to accidentally nullify our solution to the last problem:

Yep. Just add the std:: prefix to begin and end and this won't compile when instantiated with the Foo type. Why? Because we wrote our overload of end in a different namespace (possibly no namespace at all). By saying "look for end in the std namespace" (which is exactly what we do with the code std::end), we don't look at any overloads in other namespaces. We effectively restrict overload resolution (the process of looking at all overloads in scope and selecting the best match) to the std namespace.
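To make the failure concrete, here's a sketch of bar written with fully qualified calls. Foo, FooIterator, and the body of bar are my own illustrative assumptions. The Foo call is left commented out because, as far as I can tell, no std::end overload accepts a Foo and the line would be rejected at compile time.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <iterator>
#include <vector>

// Hypothetical stand-in for the library's iterator: a default-constructed
// FooIterator plays the role of "end".
class FooIterator {
public:
    using iterator_category = std::forward_iterator_tag;
    using value_type = int;
    using difference_type = std::ptrdiff_t;
    using pointer = const int*;
    using reference = const int&;

    FooIterator() = default;
    FooIterator(const int* cur, const int* last)
        : cur_(cur == last ? nullptr : cur), last_(last) {}

    reference operator*() const { return *cur_; }
    FooIterator& operator++() {
        if (++cur_ == last_) cur_ = nullptr;  // exhausted: become "end"
        return *this;
    }
    FooIterator operator++(int) { FooIterator old = *this; ++*this; return old; }
    friend bool operator==(const FooIterator& a, const FooIterator& b) { return a.cur_ == b.cur_; }
    friend bool operator!=(const FooIterator& a, const FooIterator& b) { return !(a == b); }

private:
    const int* cur_ = nullptr;
    const int* last_ = nullptr;
};

// Hypothetical stand-in for the library's container: begin() but no end().
class Foo {
public:
    Foo(std::initializer_list<int> values) : data_(values) {}
    FooIterator begin() const {
        return FooIterator(data_.data(), data_.data() + data_.size());
    }

private:
    std::vector<int> data_;
};

// Our overload lives in the global namespace, not in std.
FooIterator end(const Foo&) { return FooIterator{}; }

template <typename Container>
bool bar(const Container& c) {
    // Qualified names: only the overloads inside namespace std are candidates,
    // so the global end(const Foo&) above is never considered.
    return std::find(std::begin(c), std::end(c), 5) != std::end(c);
}

void demo() {
    std::vector<int> v{1, 5, 9};
    assert(bar(v));  // fine: std::end knows how to call v.end()

    // Foo f{3, 5, 8};
    // bar(f);       // would not compile: no std::end overload accepts Foo
}
```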

The solution here is NOT to write our overload into the std namespace (never do that with anything!), but rather, to ensure that all appropriate namespaces are considered for overload resolution.

How do we do that?

Well, step 1 is to avoid fully-qualifying names (in other words, don't prefix with someNamespace::).

Step 2 is to bring all overloads you want/need into the current scope. Any overloads declared in your current namespace (or no namespace at all) will already be in scope. You'll need to somehow bring the STL's ones into scope though (it'd be just as bad to be unable to compile bar with vector because we didn't have the std::begin overloads in scope). We do this with using statements. Add the lines using std::begin; and using std::end; to the top of your cpp file if you're calling them in a cpp file.

Using statements bring symbols into the current scope. In this case, they bring the STL's overloads of begin and end into scope so that they will participate in overload resolution (along with the one you wrote for Foo).

If you're working in a header file, where the usual guidance is to never use using statements (to avoid both namespace pollution and unexpected behavior changes), fear not: there is an answer. The guidance I spoke of is somewhat incomplete: You should never use using statements in a header file at file or namespace scope. It's perfectly fine to use them inside a class or inside a function, even when inside a header, like so:
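What follows is a minimal sketch of that pattern rather than anyone's production code. Foo and FooIterator are hypothetical stand-ins I wrote so the example compiles on its own, and bar searching for 5 is borrowed from the earlier find call.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <iterator>
#include <vector>

// Hypothetical stand-in for the library's iterator: a default-constructed
// FooIterator plays the role of "end".
class FooIterator {
public:
    using iterator_category = std::forward_iterator_tag;
    using value_type = int;
    using difference_type = std::ptrdiff_t;
    using pointer = const int*;
    using reference = const int&;

    FooIterator() = default;
    FooIterator(const int* cur, const int* last)
        : cur_(cur == last ? nullptr : cur), last_(last) {}

    reference operator*() const { return *cur_; }
    FooIterator& operator++() {
        if (++cur_ == last_) cur_ = nullptr;  // exhausted: become "end"
        return *this;
    }
    FooIterator operator++(int) { FooIterator old = *this; ++*this; return old; }
    friend bool operator==(const FooIterator& a, const FooIterator& b) { return a.cur_ == b.cur_; }
    friend bool operator!=(const FooIterator& a, const FooIterator& b) { return !(a == b); }

private:
    const int* cur_ = nullptr;
    const int* last_ = nullptr;
};

// Hypothetical stand-in for the library's container: begin() but no end().
class Foo {
public:
    Foo(std::initializer_list<int> values) : data_(values) {}
    FooIterator begin() const {
        return FooIterator(data_.data(), data_.data() + data_.size());
    }

private:
    std::vector<int> data_;
};

// Our overload, in the global namespace alongside Foo.
FooIterator end(const Foo&) { return FooIterator{}; }

template <typename Container>
bool bar(const Container& c) {
    using std::begin;  // visible only inside bar
    using std::end;
    // Unqualified calls: the std overloads and our end(const Foo&) all compete.
    return std::find(begin(c), end(c), 5) != end(c);
}

void demo() {
    std::vector<int> v{1, 5, 9};
    Foo f{3, 5, 8};
    int xs[] = {4, 5, 6};
    assert(bar(v));   // std::begin / std::end call the members
    assert(bar(f));   // std::begin calls f.begin(); end(f) is our overload
    assert(bar(xs));  // the std array overloads handle fixed-size arrays
}
```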

We only pull in std::begin and std::end for overload resolution in the scope of bar this way. The using statements will not impact anything else.

So the general rules are:
1) Avoid fully qualifying function names (specifying the namespace) when calling a function. Doing so potentially removes candidates from overload resolution.
2) Bring the overloads you need into scope with using statements (for example, using std::begin; and using std::end;).
3) Never use a using statement in a header file unless it is done in the scope of a class or function.

Make these general habits, as the same guidance applies to functions like swap too! You never know who's going to end up calling your code; by following these rules you can make your generic code/algorithms work with other people's user-defined types.
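The same habit applied to swap might look like the sketch below. The widgets namespace, the Widget type, the custom_swaps counter, and exchange_values are all hypothetical names invented for this example.

```cpp
#include <cassert>
#include <utility>

namespace widgets {  // hypothetical user namespace

struct Widget {
    int id;
};

int custom_swaps = 0;  // counts calls to the custom overload, for demonstration

// A user-provided swap living next to the type it swaps.
void swap(Widget& a, Widget& b) {
    ++custom_swaps;
    std::swap(a.id, b.id);
}

}  // namespace widgets

// Generic code that swaps two values without knowing their type.
template <typename T>
void exchange_values(T& a, T& b) {
    using std::swap;  // makes std::swap available as the fallback
    swap(a, b);       // unqualified: a swap in T's own namespace can still win
}

void demo() {
    int x = 1, y = 2;
    exchange_values(x, y);  // no custom swap for int: std::swap is used
    assert(x == 2 && y == 1);

    int before = widgets::custom_swaps;
    widgets::Widget a{1}, b{2};
    exchange_values(a, b);  // widgets::swap is chosen
    assert(a.id == 2 && b.id == 1);
    assert(widgets::custom_swaps == before + 1);
}
```

Had exchange_values called std::swap(a, b) directly, the Widget overload would never have been considered.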

Conclusions:

Overloading is [perhaps surprisingly] a great way to facilitate the generalization of your code when used with templates. It lets us adapt foreign interfaces to work in familiar ways so that we can reuse more code (like the STL's algorithms).

In order to "play nice" with overloads that may not all be defined in the same namespace, it's important to avoid fully-qualifying functions when calling them, and to ensure that all overloads you need have been pulled into scope with a using statement.

Footnotes:

[1] You can actually implement the non-member size function today, among others.

[2] There are actually a few other overloads of std::end for const-correctness. You'd probably also want to implement the const overloads specific to Foo in real code. See cppreference for details.