Monday, March 31, 2008

The Superiority of Ranges

When it comes to describing an interval (a range in memory, a range on the screen, a range of times), if you work in C/C++, your life will become easier once you recognize the truth: the language wants you to describe ranges as "start/end+1".

To clarify the problem: when we have to describe a range of values, there are typically three ways that programmers use a pair of integers to do it:
  1. First item, number of items ("start/length")
  2. First item, last included item ("start/end")
  3. First item, one past last item ("start/end+1")
This third way is the best way...I don't say that lightly...but after years and years of programming, I find that code using start/end+1 comes out cleaner and simpler every time.

First, why not start/end? Well, the nice thing about start/end+1 is that the length of the range is just end-start. When you use start/end, you have to do the awkward (end-start+1).
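To make the length difference concrete, here is a minimal sketch. The function names are made up for illustration, and the ranges are assumed to be plain int indices:

```cpp
// Hypothetical helpers for illustration; none of these names come from a library.

// start/end+1: 'end' is one past the last item.
constexpr int length_half_open(int start, int end) {
    return end - start;
}

// start/end: 'last' is the last included item.
constexpr int length_closed(int start, int last) {
    return last - start + 1;  // the extra +1 is the part that gets forgotten
}
```

Items 3, 4, 5 and 6 are length_half_open(3, 7) in one notation and length_closed(3, 6) in the other. An empty range is simply start == end in the first form; in the second it has to be written with last == start - 1.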

Range checks also become inconsistent...basically you want to have your regions defined inclusive on one side and exclusive on the other. In other words, x is in the range if x >= x1 && x < x2. What's good about this is that if the end value of one range is equal to the start value of another, the two align perfectly: together they cover the combined interval with no gap and no overlap. With start/end notation, you get tons of off-by-one situations.
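Here is a small sketch of that range check. The Range struct and the helper names are my own invention for the example, not anything standard:

```cpp
// Hypothetical type and helpers for illustration.
struct Range {
    int begin;  // first item in the range
    int end;    // one past the last item
};

// Inclusive on the left, exclusive on the right: x >= x1 && x < x2.
constexpr bool contains(Range r, int x) {
    return x >= r.begin && x < r.end;
}

// Counts how many of two ranges claim x.
constexpr int owners(Range a, Range b, int x) {
    return (contains(a, x) ? 1 : 0) + (contains(b, x) ? 1 : 0);
}
```

With a = {0, 10} and b = {10, 20}, owners(a, b, 10) is 1: the shared boundary value belongs to b and only to b. With start/end ranges the same two pairs would both claim 10, and you would have to remember to write {0, 9} for the first one.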

Similarly, start/length simply requires you to write more code. Consider start/end+1...a number of useful operations become trivial:
  • The union of [x1,x2) and [y1,y2), when the two overlap or touch, is just [min(x1,y1),max(x2,y2))
  • The intersection of the two is [max(x1,y1),min(x2,y2))
  • A range is empty if x2 <= x1
  • Containment and asymmetric set differences are all equally clean.
Try rewriting the above logic with either start/end or start/length and I think you'll find that start/end+1 gives you the simplest code.
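For reference, here is roughly what those operations look like in code. This is a sketch under my own assumptions: the Range and Span types and all function names are hypothetical, and constexpr std::min/std::max need C++14 or later. Note that the min/max "union" is really the smallest range covering both inputs, which matches the true union only when the inputs overlap or touch.

```cpp
#include <algorithm>

// Hypothetical types for illustration.
struct Range { int begin; int end; };     // start/end+1
struct Span  { int start; int length; };  // start/length

constexpr bool is_empty(Range r) {
    return r.end <= r.begin;
}

// Smallest range covering both a and b.
constexpr Range cover(Range a, Range b) {
    return Range{std::min(a.begin, b.begin), std::max(a.end, b.end)};
}

// May come back empty (end <= begin) when a and b do not overlap.
constexpr Range intersect(Range a, Range b) {
    return Range{std::max(a.begin, b.begin), std::min(a.end, b.end)};
}

// True when every item of a non-empty 'inner' lies inside 'outer'.
constexpr bool contains_range(Range outer, Range inner) {
    return inner.begin >= outer.begin && inner.end <= outer.end;
}

// The same intersection with start/length: it has to compute both
// end+1 values anyway, then convert back to a length.
constexpr Span intersect_spans(Span a, Span b) {
    return Span{std::max(a.start, b.start),
                std::min(a.start + a.length, b.start + b.length)
                    - std::max(a.start, b.start)};
}
```

The start/length version ends up doing the start/end+1 arithmetic internally and then throwing the end value away, which is the extra code I am talking about.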
