Monday, November 17, 2014

Not Getting Software Wrong


The majority of time creating software is typically spent making sure that you got the software right (or if it's not, it should be!).  But, sometimes, this focus on making software right gets in the way of actually ensuring the software works.  Instead, in many cases it's more important to make sure that software is not wrong.  This may seem like just a bit of word play, but I've come to believe that it reflects a fundamental difference in how one views software quality.

Consider the following code concept for stopping four wheel motors on a robot:

// Turn off all wheel motors
  for (uint_t motor = 1; motor < maxWheels; motor++)  {
    SetSpeed(motor, OFF);
  }

Now consider this code:
// Turn off all wheel motors
  SetSpeed(leftFront,  OFF);
  SetSpeed(rightFront, OFF);
  SetSpeed(leftRear,   OFF);
  SetSpeed(rightRear,  OFF);

(Feel free to make obvious assumptions, such as leftFront being an const uint_t, and modify to your favorite naming conventions and style guidelines. None of that is the point here.)

Which code is better?   Well, from our intro to programming course probably we want to write the first type of code. It is more flexible, "elegant," and uses the oh-so-cool concept of iteration.  On the other hand, the second type of code is likely to get our fingers smacked with a ruler in a freshman programming class.  Where's the iteration?  Where is the flexibility? What if there are more than four items that need to be turned off? Where is the dazzling display of (not-so-advanced) computer science concepts?

But hold on a moment. It's a four wheeled robot we're building.  Are you really going to change to a six wheeled robot?  (Well, maybe, and I've even worked a bit with such robots.  But sticking on two extra wheels isn't all that common after the frame has been welded!)  So what are you really buying by optimizing for a change that is unlikely to happen?

Let's look at it a different way. Many times I'm not interested in elegance, clever use of computer science concepts, or even number of lines of code. What I am interested in is that the code is actually correct. Have you ever written the loop type of code and gotten it wrong?  Be honest!  There is a reason that "off by one error" has its own wikipedia entry. How long did it take you to look at the loop and make sure it was right? You had to assume or go look up the value of maxWheels, right?  If you are writing unit tests, you need to somehow figure out a way to test that all the motors got turned off -- and account for the fact that maxWheels might not even be set to the correct value.

But, with the second set of code, it's pretty easy to see that all four wheels are getting turned off.  There's no loop to get wrong.  (You do have to make sure that the four wheel consts are set properly.)  There is no loop to test.  Cyclomatic complexity is lower because there is no looping branch.  You don't have to mentally execute the loop to make sure there is no off-by-one error. You don't have to ask whether the loop can (or should) execute zero times. Instead, with the second set of codes you can just look at it and count up that the wheels are being turned off.

Now of course this is a pretty trivial example, and no doubt one that at least someone will take exception to.  But the point I'm making is the following.  The first code segment is what we were all taught to do, and arguably easier to write. But the second code segment is arguably easier to check for correctness.  In peer reviews someone might miss a bug in the first code. I'd say that missing a bug is less likely in the second code segment.

Am I saying that you should avoid loops if you have 10,000 things to initialize?  No, of course not. Am I saying you should cut and paste huge chunks of code to avoid a loop?  Nope. And maybe you actually do want to use the loop approach in your situation even with 3 or 4 things to initialize for a good reason. And if you get lots of lines of code that is likely to be more error-prone than a well-considered loop. All that I'm saying is that when you write code, consider carefully the tradeoff between writing concise, but clever, code and writing code that is hard to get wrong. Importantly, this includes the risk of a varying style causing an increased risk of getting things wrong. How that comes out will depend on your situation.  What I am saying is think about what is likely to change, what is unlikely to change, and what the path is to least risk of having bugs.

By the way, did you see the bug?  If you did, good for you.  If not, well, then I've already proved my point, haven't I?  The wheel number should start at 0 or the limit should be maxWheels+1.  Or the "<" should be "<=" -- depending on whether wheel numbers are base 0 or base 1.  As it is, assuming maxWheels is 4 as you'd expect, only 3 out of 4 wheels actually get turned off.  By the way, this wasn't an intentional trick on the reader. As I was writing this article, after a pretty long day, that's how I wrote the code. (It doesn't help that I've spent a lot of time writing code in languages with base-1 arrays instead of base-0 arrays.) I didn't even realize I'd made that mistake until I went back to check it when writing this paragraph.  Yes, really.  And don't tell me you've never made that mistake!  We all have. Or you've never tried to write code at the end of a long day.  And thus are bugs born.  Most are caught, as this one would have been, but inevitably some aren't.  Isn't it better to write code without bugs to begin with instead of find (most) bugs after you write the code?

Instead of trying to write code and spending all our effort to prove that whatever code we have written is correct, I would argue we should spend some of our effort on writing code that can't be wrong.  (More precisely, code that is difficult to get wrong in the first place, and easy to detect is wrong.)  There are many tools at our disposal to do this beyond the simple example shown.  Writing code in a style that makes very rigorous static analysis tools happy is one way. Many other style practices are designed to help with this as well, chief among them being various type checking strategies, whether automated or implemented via naming conventions. Avoiding high cyclomatic complexity is another way that comes to mind. So is creating a statechart before actually writing the code, then using a switch statement that traces directly to the statechart. Avoiding premature optimization is also important in avoiding bugs.  The list goes on.

So the next time you are looking at a piece of code, don't ask yourself "do I think this is right?"  Instead, ask yourself "how easy is it to be sure that it is not wrong?"  If you have to think hard about it -- then maybe it really is incorrect code even if you can't see a bug.  Ask whether it could be rewritten in a way that is more obviously not wrong.  Then do it.

Certainly I'm not the first to have this idea. But I see small examples of this kind of thing so often that it's worth telling this story once in a while to make sure others have paused to consider it.  Which brings us to a quote that I've come to appreciate more and more over time.  Print it out and stick it on the water cooler at work:

"There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies and the other way is to make it so complicated that there are no obvious deficiencies."

        — C.A.R. Hoare, The 1980 ACM Turing Award Lecture  (CACM 24(2), Feb 1981, p. 81)

(I will admit that this quote is a bit clever and therefore not a sterling example of making a statement easy to check for correctness.  But then again he is the one who got the Turing Award, so we'll allow some slack for clever wording in his acceptance essay.)

Job and Career Advice

I sometimes get requests from LinkedIn contacts about help deciding between job offers. I can't provide personalize advice, but here are...