Keep It Simple
If Alexander the Great could conquer the known world by the time he was 30 years old, you would think that grown adults could conquer the little bits of complexity contained in the taupe-colored boxes by their desks.
Unfortunately, this “little bit of complexity” isn’t as little as some people assume. Computing is the only profession in which a single mind is obliged to span the intellectual distance from a bit to a few hundred megabytes, a ratio of 1 to 10^9, or nine orders of magnitude. The immensity of this ratio is staggering. As Edsger Dijkstra says, “Compared to that number of semantic levels, the average mathematical theory is almost flat. By evoking the need for deep conceptual hierarchies, the automatic computer confronts us with a radically new intellectual challenge that has no precedent in our history” (“On the Cruelty of Really Teaching Computing Science,” Communications of the ACM, December 1989).
In his 1972 Turing Award lecture, Dijkstra argued that most of programming is an attempt to compensate for the strictly limited size of our skulls, that is, to manage the enormous complexity associated with modern software systems (“The Humble Programmer,” Communications of the ACM, October 1972). A certain amount of software complexity is inherent in the problems we try to solve, but a large part depends as much on the nature of the solution as on the problem. The best solutions are those created by people who realize just how small their skulls are and tailor their solutions accordingly.
Hierarchies and Abstraction
Two of the most effective general means of managing complexity are the use of hierarchies and abstractions.
A hierarchy is a tiered, structured organization in which a problem space is divided into levels that are ordered and ranked. In a hierarchy, you handle different details at different levels. The details don’t go away completely; you simply push them to a different level so that you can think about them when you want to rather than all at the same time. Hierarchies come into play most obviously in the module hierarchy of a functional design, but they also come into play in the inheritance hierarchies of object-oriented design, in nested data structures, and in many other cases.
Using hierarchies comes naturally to most people. When people draw a complex object, such as a house, they draw it as a hierarchy. First they draw the outline of the house, then the windows and doors, then additional details, as desired (Herbert Simon, The Sciences of the Artificial, MIT Press, 1969). They don’t draw the house brick-by-brick, shingle-by-shingle, or nail-by-nail.
Abstraction is also a means of reducing complexity by handling different details at different levels. Any time you work with an aggregate entity, you’re working with an abstraction. If you refer to an object as a “house” rather than a combination of glass, wood, and nails, you’re making an abstraction. If you refer to a collection of houses as a “town,” you’re making another abstraction. Abstraction is a more general concept than hierarchy. It can reduce complexity by spreading details across a loose network of components, for example, rather than among a hierarchy’s strictly tiered levels.
Programming productivity has advanced largely through increasing the abstractness of program components. Fred Brooks argues that the biggest single productivity gain ever made in software development arose from the move from machine language to higher-level languages. That move freed programmers from worrying about the detailed quirks of individual pieces of hardware and allowed them to focus on programming (“No Silver Bullet: Essence and Accidents of Software Engineering,” Computer, April 1987).
More recently, the advent of visual programming environments has greatly reduced the complexity associated with creating GUI applications. Visual programming environments allow programmers to work at a level of abstraction at which they can forget about many GUI-related housekeeping details and focus on the particulars of the application itself.
Neither hierarchies nor abstractions reduce the total number of details in a program—they might actually increase the total number. Their benefit arises from organizing details in such a way that fewer details have to be considered at any particular time.
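The house-and-town picture can be sketched in a few lines of Python. This is only an illustration: the House and Town classes, their fields, and the numbers are hypothetical names invented here, not anything from the sources cited above. The point it tries to show is that the town level exposes one small question (how many houses?) while the brick-level details remain available one level down.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class House:
    """An aggregate of low-level parts (hypothetical fields for illustration)."""
    windows: int
    doors: int
    bricks: int

    def detail_count(self) -> int:
        # All the low-level details of one house.
        return self.windows + self.doors + self.bricks


@dataclass
class Town:
    """One level up: a town is reasoned about as a collection of houses."""
    houses: list[House] = field(default_factory=list)

    def house_count(self) -> int:
        # At this level only the houses are in view, not their bricks.
        return len(self.houses)

    def detail_count(self) -> int:
        # The details have not gone away; they live one level down.
        return sum(house.detail_count() for house in self.houses)


town = Town([
    House(windows=8, doors=2, bricks=5000),
    House(windows=6, doors=1, bricks=4000),
])
print(town.house_count())   # 2
print(town.detail_count())  # 9017
```

The total number of details is no smaller than before (the two classes add a little code of their own), but a reader working at the town level has to hold only one idea in mind at a time.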
Focusing on the goal of minimizing complexity yields significant design guidance.
Subsystem design. At the software-architecture level, the complexity of the problem can be reduced by dividing a system into subsystems. The more independent you make the subsystems, and the more strictly you separate their concerns, the more you reduce complexity and the more you enable programmers to focus on one thing at a time.
Classes and modules. Without classes or modules, the traditional advice to keep individual routines short becomes a double-edged sword. Keeping routines short helps a reader to understand each individual routine, but it tends to multiply the number of routines system-wide, which makes the system harder to understand as a whole.
Classes and modules, and for that matter subsystems, are helpful complexity-reduction tools because they provide an intermediate level of aggregation between individual routines and entire systems. With classes and modules, you can keep routines short, but combine them into meaningful groups to keep complexity from exploding at the whole-system level.
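As a rough sketch of that intermediate level of aggregation, the Python class below groups three short routines that all operate on the same text. The class name WordStats and its methods are assumptions made up for this example. Each routine stays a few lines long, yet a reader of the wider system needs to remember only one name.

```python
class WordStats:
    """Groups several short routines that all work on one piece of text."""

    def __init__(self, text: str) -> None:
        self._words = text.split()

    def word_count(self) -> int:
        return len(self._words)

    def longest_word(self) -> str:
        # Returns an empty string when there are no words at all.
        return max(self._words, key=len, default="")

    def average_length(self) -> float:
        if not self._words:
            return 0.0
        return sum(len(word) for word in self._words) / len(self._words)


stats = WordStats("keep it simple")
print(stats.word_count())      # 3
print(stats.longest_word())    # simple
print(stats.average_length())  # 4.0
```

Written as three free-standing functions, the same code would add three names to the system-wide namespace; grouped this way, it adds one.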
Cohesion and coupling. The structured design guideline to build programs with strong cohesion and loose coupling arises from the need to manage complexity. The more loosely coupled two routines or classes are, the fewer interactions are possible and the less complex their relationship is. The stronger the cohesion of a routine, the neater a mental package it fits into, and the less your brain has to remember and account for in the operation of the code inside that routine.
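One way to picture the difference in coupling is the sketch below, written in Python with hypothetical names (Order, total_from_order, total_with_tax). The first routine accepts a whole record even though it uses only two of its fields, so it is exposed to every change in that record. The second receives just the two values it needs, which leaves fewer possible interactions to think about. Both do one narrow job, so both are cohesive.

```python
from dataclasses import dataclass


@dataclass
class Order:
    """Hypothetical record with more fields than the calculation needs."""
    subtotal: float
    tax_rate: float
    customer_name: str


def total_from_order(order: Order) -> float:
    # More tightly coupled: the routine depends on the shape of Order,
    # although it reads only two of the three fields.
    return order.subtotal * (1 + order.tax_rate)


def total_with_tax(subtotal: float, tax_rate: float) -> float:
    # More loosely coupled: the routine depends only on the two numbers
    # it actually uses.
    return subtotal * (1 + tax_rate)


order = Order(subtotal=100.0, tax_rate=0.25, customer_name="A. Customer")
print(total_from_order(order))                         # 125.0
print(total_with_tax(order.subtotal, order.tax_rate))  # 125.0
```

Neither form is always right; passing the whole record can be the simpler choice when most of its fields are used.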
Fan-out. The classic advice to limit “fan-out” (the number of routines a routine calls) to 5 to 9 might seem arbitrary until you realize that the underlying motivation for the advice is to limit the complexity that a programmer has to contend with at any one time. The computer is capable of handling virtually any degree of fan-out; it’s the human software developer with the small skull who needs a limit on the number of possibilities that have to be considered simultaneously.
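Fan-out is easy to estimate mechanically. The sketch below uses Python's standard ast module to count, for each top-level function in a piece of source code, how many distinct names it calls directly. It is a simplification under stated assumptions: the function fan_out and the SAMPLE program are invented for this illustration, calls to built-ins such as print are counted, and method calls such as obj.method() are ignored.

```python
from __future__ import annotations

import ast


def fan_out(source: str) -> dict[str, int]:
    """Map each top-level function to the number of distinct names it calls."""
    tree = ast.parse(source)
    result = {}
    for node in tree.body:
        if isinstance(node, ast.FunctionDef):
            # Collect plain-name calls such as load() or print(x).
            callees = {
                call.func.id
                for call in ast.walk(node)
                if isinstance(call, ast.Call) and isinstance(call.func, ast.Name)
            }
            result[node.name] = len(callees)
    return result


SAMPLE = '''
def load(): return [3, 1, 2]
def clean(xs): return sorted(xs)
def report(xs): print(xs)
def main():
    data = load()
    data = clean(data)
    report(data)
    report(data)
'''

print(fan_out(SAMPLE))
# {'load': 0, 'clean': 1, 'report': 1, 'main': 3}
```

Here main calls report twice but has a fan-out of 3, because the measure counts distinct routines rather than call sites.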
Information hiding. Information hiding is the practice of hiding design and implementation details behind abstract routine, module, and class interfaces. From a complexity viewpoint, information hiding is perhaps the most powerful design heuristic because it explicitly focuses on hiding details, which ipso facto reduces a program’s complexity when viewed from any particular point of view.
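A minimal Python sketch of the idea follows. The IdAllocator class and its next_id method are hypothetical names chosen for illustration. Callers learn one fact, that next_id returns a fresh ID; how IDs are produced is a detail kept behind the interface, so it could later change (to random values or database sequences, say) without callers having to know.

```python
import itertools


class IdAllocator:
    """Hands out unique IDs; how they are produced is hidden behind next_id()."""

    def __init__(self) -> None:
        # Hidden detail: a simple counter starting at 1. The leading
        # underscore marks it as private by Python convention.
        self._counter = itertools.count(1)

    def next_id(self) -> int:
        return next(self._counter)


ids = IdAllocator()
first = ids.next_id()
second = ids.next_id()
print(first, second)  # 1 2
```

Python enforces this privacy only by convention, so the hiding here is a matter of discipline rather than a language guarantee.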
Focusing on complexity reduction also helps to cut through many historically nettlesome coding issues.
Global data. The existence of global data introduces the possibility that virtually any part of a program can interact with any other part of a program through their operations on the same data. The use of even a few global variables dramatically increases the complexity that a human reader has to deal with when trying to understand a program, and for that reason use of global data compromises the programmer’s primary objective of keeping complexity to a minimum.
Gotos. What guidance does complexity reduction provide for the historically controversial goto debate? Because gotos don’t necessarily follow any specific pattern, your brain can’t simplify their operation in any standard way. Gotos introduce a degree of flexibility that dramatically increases a program’s complexity and therefore should be avoided.
By the same reasoning, if you need to use gotos with discipline in a systematic way to compensate for weaknesses in a programming language, you should—if such use serves to reduce a program’s complexity from both the local and global viewpoints.
Coding standards. The complexity lens also brings the purpose of coding standards into focus. From a complexity-reduction point of view, the details of your coding standard almost don’t matter. The primary benefit of a coding standard is that it reduces the complexity burden associated with revisiting formatting, documentation, and naming decisions with every line of code you write. When you standardize such decisions, you free up mental resources that can be focused on more challenging aspects of the programming problem.
One of the reasons that coding standards are often controversial is that the choice among many candidate standards is essentially arbitrary. Standards are most useful when they spare you the trouble of making and defending arbitrary decisions. They’re less valuable when they impose restrictions in more meaningful areas.
When programming is seen predominantly as an attempt to manage complexity, the litmus test for any design or implementation approach becomes clear: Does the approach increase or decrease overall system complexity? If a design seems simple and yet accounts for all the possible cases, it is a good design. If an implementation results in code that is easy to read and is more simple than clever, it is a good implementation.
Our brains might not be capable of fully encompassing the world of mind-numbing details associated with creating a modern software system. But, paradoxically, if we approach software problems with a keen awareness that our human skulls are smaller than we would like and tailor our approaches accordingly, we just might be able to conquer that whole world of details after all.