Upstream Decisions, Downstream Costs

Windows Tech Journal

Issue: November 1997

“How long do you think version 3.0 will take?” Kim asked the assembled project team members. She’d been assigned to lead a major upgrade to Square-Calc, Square-Tech’s signature spreadsheet program. Square-Calc 3.0 would use Square-Calc 2.5’s code base as the jumping-off point for a series of major enhancements.

“I’ve reviewed each of the features carefully,” José said. “It looks to me like it would take about 6 months for our 4-person development team to implement this functionality from scratch. I wasn’t involved in version 2.5, and I haven’t had a chance to look over the code base, but from a user’s point of view, the product looks pretty good, so I think 6 months is a safe estimate.”

“I wouldn’t be too sure about that,” George said. George was a senior developer who had worked on several of Square-Tech’s projects. “I didn’t work on Square-Calc 2.5, but I know that that team had problems. The contract programmer who was in charge of the graphics updates delivered his code 2 months late. Then the team had problems integrating his code with the rest of the code, and it took another month to stabilize the software. They released a good product, but I don’t know what will happen once we start changing the code. If it would take 6 months to implement the version 3.0 features from scratch, I’d allow at least 8 months to add the same features to the existing code base—especially since none of us worked on version 2.5.”

“I’ll tell my boss that we need at least 6 months, but that he should be prepared to accept a 2-month overrun,” Kim said. “How do you want to proceed with requirements specification and design?”

“The company has been over this ground before,” Bob, the third developer, said. “I think we all have a pretty good idea about what the software should look like. I don’t see the point of wasting our time on a huge, bureaucratic requirements document.”

“That’s OK with me,” José said. “I used to work on military software projects, and those government specs are a total waste of time. Let’s just create a checklist of features and implement the software based on the list.”

“I think formal design is about as useless as formal requirements,” Bob said. “We all know what we’re doing. We can just noodle some ideas around on the whiteboard. I think that will be good enough.”

“Sounds good to me,” José said.

“OK,” Kim said. “Let’s get to work.”

The team members had decided earlier that they needed to make quick progress to gain upper management’s confidence, so they coded a handful of the easiest features first. Some features took a little longer than expected, but they attributed that to learning to work with an unfamiliar code base. They made their first internal release 2 months into their 6-month schedule, and then began work on the meatier features.

José turned first to adding two new graphs. After working for a few weeks, the team held a status meeting. José reported encountering a few problems. “The graphics code is really unstable,” he said. “I’m doing the best I can with it, but it’s slow going. I can’t seem to add any new graphs without breaking all the existing graphs. That contractor’s code from version 2.5 is terrible. It doesn’t follow the project’s coding conventions, and I don’t see how they ever got most of it to work at all.” Bob and George reported that the code they were working on was less problematic, and they were making good progress.

“Let’s just see how your work goes a little longer,” Kim said to José. “If you keep running into problems, Bob or George can give you a hand.”

A few weeks later, the team met again. “I’m running into even more problems than before,” José reported. “It seems like every time I fix one bug in that old graphics code, I uncover two or three more. I think that whole graphics engine needs to be rewritten.”

“Bob, can you spare some time to help José?” Kim asked.

“I’m running into problems myself,” Bob said. “I’ve been working on speeding up the spreadsheet recalculation engine. I thought the approach I had chosen would work, but I’ve run into a flaw in my design, and I have to rewrite a big chunk of what I’ve already coded.”

“How about you, George?”

“I’ve been doing my own design work as I go along,” George said, “so I haven’t run into any major problems. But working with that old code is tough. The code is really brittle, so I have to make my changes carefully. There are some squirrelly workarounds that the version 2.5 team used to hack the integration with that contractor’s graphics code. When José changes that code, I keep needing to change my code. I don’t think I’m making progress fast enough to be able to help José.”

“OK, we’ll keep the assignments as they are,” Kim decided.

As the project wore on, the development team found that the pattern of slow progress didn’t change much. José eventually did rewrite the graphics subsystem, and the redesign of that code caused George to rewrite many of the interfaces to the code he had been working on. Bob’s work on the recalculation engine roughly kept pace with José’s graphics work, and the team reached its feature-complete milestone after 9 months. At that point, the marketing department reviewed the software and found several “essential” features that had not been included. The development team spent 2 more months adding those features. The developers needed 2 months after that to stabilize the software, and they finally released Square-Calc 3.0 after 13 months.

How Fast Can You Swim Upstream?

Although the Square-Calc case study is fictional, the general pattern is one that has been repeated many times. Development teams take shortcuts early in the project to save time, and in the end those shortcuts come back to haunt both the current project team and the project teams on future projects that try to extend those designs and code.

You’ll sometimes hear experienced software developers talk about the “upstream” and “downstream” parts of a software project. The word “upstream” refers to the early parts of a project, and “downstream” refers to the later parts. I have found this distinction between “upstream” and “downstream” to be a fundamentally useful way to think about software projects. The work developers perform early in the project flows into the later part of the project. Good early work sets up the project for a gentle float through placid waters downstream. Poor quality work sets up the project for a rough and tumble ride through rocky whitewater downstream.

Barry Boehm and Philip Papaccio found that an error created early in the project, for example during requirements specification or architecture, costs 50 to 200 times as much to correct late in the project as it does to correct close to the point where it was originally created. Figure 1 illustrates this effect.

Figure 1. Increase in defect cost as time between defect creation and defect correction increases. Effective projects practice “phase containment”—detecting and correcting defects in the same phase they’re created.

Why are errors so much more costly to correct downstream? One sentence in a requirements specification can easily turn into several design diagrams. Later in the project, those diagrams can turn into hundreds of lines of source code, dozens of test cases, many pages of end-user documentation, help screens, instructions for technical support personnel, and so on.

If the project team has an opportunity to correct a mistake at requirements time when the only work that has been done is the creation of a one-sentence requirements statement, it will be wise to correct that single sentence rather than waiting until it also has to correct all the various manifestations of the requirements problem downstream. This idea is sometimes called “phase containment,” which refers to detecting and correcting defects in the same phase in which they’re created.
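As a rough illustration of the arithmetic behind phase containment, the sketch below applies the 50-to-200-times range quoted above to a hypothetical in-phase fix cost. The function name and the half-hour figure are assumptions made for this example, not data from the studies cited.

```python
# Multiplier range quoted in the text for defects corrected late in the project.
LATE_MULTIPLIER_LOW = 50
LATE_MULTIPLIER_HIGH = 200


def late_fix_cost_range(in_phase_cost_hours):
    """Return the (low, high) cost in hours of fixing the same defect late.

    in_phase_cost_hours is a hypothetical cost of fixing the defect in the
    phase where it was created.
    """
    if in_phase_cost_hours < 0:
        raise ValueError("cost must be non-negative")
    return (in_phase_cost_hours * LATE_MULTIPLIER_LOW,
            in_phase_cost_hours * LATE_MULTIPLIER_HIGH)


# Assumed example: rewording one requirements sentence takes half an hour.
low, high = late_fix_cost_range(0.5)
print(f"In phase: 0.5 hours; late in the project: {low:.0f} to {high:.0f} hours")
```

Under these assumed numbers, a half-hour correction at requirements time becomes 25 to 100 hours of rework if it is left until late in the project.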

Swimming Lessons

The Square-Calc 3.0 team fell victim to a major upstream problem: It didn’t perform the upstream activities of requirements development and design, which meant that it didn’t give itself an opportunity to detect and correct upstream problems cheaply. It could correct problems only downstream, during coding, and only at great cost.

Well-run software projects root out problems early. They create opportunities to correct upstream problems. They develop and review their requirements carefully to correct as many requirements problems as they can “in phase.” They develop and review their designs with just as much care. In the case study, Bob shouldn’t have spent several weeks writing new code for the recalc engine before discovering a design flaw. That flaw should have been corrected cheaply during an upstream design review.

Because no code is being generated while these upstream activities are being conducted, these activities might seem as though they are delaying “the real work” of the project. In the case study, Bob was eager to begin coding. The temptation to start coding before nailing down the requirements or thinking through high-level designs is strong for developers who enjoy coding (including me). But the upstream activities of requirements development and design don’t delay the project; they lay the groundwork for the project’s success, and they shouldn’t be abbreviated.

You might worry about erring on the side of spending too much time on upstream work and increasing the project’s overhead. That is a slight risk, but very few projects err on that side. The more severe risk is erring on the side of too little upstream work, which allows defects to slip through that must be corrected at 50 to 200 times the cost of correcting them earlier. The smart money errs on the side of too much upstream work rather than too little.
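One hedged way to see why the risk is lopsided is to compare the fixed cost of an upstream review against the late fixes it avoids. The model and every number below are hypothetical; only the 50-to-200 multiplier comes from the text.

```python
def review_pays_off(review_cost_hours, defects_caught, in_phase_fix_hours,
                    late_multiplier):
    """Return True if an upstream review costs less than the late fixes it avoids.

    Simplified model (an assumption of this sketch): every defect the review
    catches would otherwise have been fixed late, at late_multiplier times
    the in-phase cost.
    """
    cost_with_review = review_cost_hours + defects_caught * in_phase_fix_hours
    cost_without_review = defects_caught * in_phase_fix_hours * late_multiplier
    return cost_with_review < cost_without_review


# Hypothetical figures: a 40-hour design review that catches a single defect
# costing 1 hour to fix in phase, using the low end of the quoted range.
print(review_pays_off(40, 1, 1.0, 50))  # compares 41 hours against 50 hours
```

In this simplified model, even a review that catches only one defect comes out ahead at the low end of the range, while a review that catches nothing costs only its own hours.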

Life Preservers for Maintenance Projects

Project teams create software under intense schedule pressure, which leads them to take shortcuts and produce low quality design and code bases. When the project team finally stabilizes the software enough to release it, the team members heave huge sighs of relief and cross their fingers, hoping that the next project will work out better.

The next project hardly ever works out better. Even though a project team knows that the old designs and code are shaky, because of pressure to release timely upgrades it will continue to build the next version of its product on that shaky foundation. The project pays a steep price for using that previous project’s low quality code. Newly added code will expose latent defects in the old code, which must be corrected. Debugging is time consuming because new bugs can arise from either new code or low quality old code. As problems mount, the project team realizes that the notion of using a poor quality code base to support quick development is a cruel joke—in reality, doing that severely hobbles the project team’s ability to extend its software and ultimately delays the software’s delivery. José might have been right that implementing Square-Calc 3.0’s functionality from scratch would take only 6 months, but implementing it within the context of Square-Calc 2.5’s existing low-quality code base took much longer.

It is bad enough when developers have to maintain low quality code they wrote themselves, but Richard Thomas found that 10 generations of maintenance programmers work on an average program before it’s rewritten. If the code isn’t designed and written well in the first place, those 10 generations of programmers will have a devil of a time modifying it.

Because so many generations of programmers work on any particular program, it isn’t surprising that each of them has to invest a substantial amount of time understanding the code they’re revising. Girish Parikh and Nicholas Zvegintzov report that maintenance programmers spend 50 to 60 percent of their time simply trying to understand the code they’re working on. Create high quality designs and code in the first place, and 10 generations of maintenance programmers will thank you for it!

Watch for the Undertow

The desire to make the maintenance programmers’ jobs easier is noble, but a more immediate reason to develop high quality designs and code is that high quality designs and code make initial development easier. Software changes substantially during its initial development. When requirements change in the middle of a release, developers find themselves revising their own designs and code to accommodate the changes. They are downstream from their own work. In the case study, marketing had the developers make two months’ worth of changes after they were already 9 months into their project. Capers Jones reports that even on well-managed projects, requirements change by 1 to 4 percent per month. If you’re looking for a reason to do high quality work upstream, realize that “downstream” might be closer than it appears.
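To put the 1 to 4 percent figure in perspective, the sketch below accumulates monthly change over a project’s length. Whether the monthly rate compounds or simply adds up is not stated in the text, so the sketch treats that as a switch. The function name and the 9-month duration (borrowed from the case study’s feature-complete milestone) are choices made for this example.

```python
def cumulative_requirements_change(monthly_rate, months, compound=False):
    """Return total requirements change as a fraction of the original scope.

    monthly_rate: fraction per month, e.g. 0.01 for 1 percent.
    compound: assumption switch; False adds the rate each month, True
    applies it to the already-grown requirements set.
    """
    if compound:
        return (1 + monthly_rate) ** months - 1
    return monthly_rate * months


for rate in (0.01, 0.04):
    simple = cumulative_requirements_change(rate, 9)
    compounded = cumulative_requirements_change(rate, 9, compound=True)
    print(f"{rate:.0%} per month over 9 months: "
          f"{simple:.0%} simple, {compounded:.1%} compounded")
```

Under the simple reading, a 9-month project sees roughly 9 to 36 percent of its requirements change before release, which is why developers so often end up downstream from their own work.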