Software Project Failure: The Reasons, The Costs

Software project failure is often devastating to an organization. Schedule
slips, buggy releases and missing features can mean the end of the project
or even financial ruin for a company. Oddly, there is disagreement over what
it means for a project to fail.

This article uses economic criteria to define what it means for a project
to fail. It then categorizes how projects fail and, finally, examines
common traps that contribute to or accelerate project failure.

The cost, feature, product spiral

Economics determines the success of any software project and its value to a
company. The amount of money spent on development determines the cost of
the asset. The return generated by the product is its value. The difference
between the return and the cost is the return on investment (ROI).
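As a minimal sketch of this definition, the arithmetic can be written out directly. The function name and the dollar figures below are hypothetical and chosen only for illustration; they do not come from any real project.

```python
def roi(total_return: float, total_cost: float) -> float:
    """ROI as defined in this article: the return minus the cost."""
    return total_return - total_cost


# Hypothetical figures, in thousands of dollars.
profitable = roi(total_return=900.0, total_cost=600.0)
unprofitable = roi(total_return=400.0, total_cost=600.0)

print(profitable)    # 300.0  (positive ROI)
print(unprofitable)  # -200.0 (negative ROI)
```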

Economics of Adding Features

Figure 1. The ROI of a product is the difference between its cost of production and its return. If the return is greater than the cost of production then it is said to possess a positive ROI.

Organizations must consider the cost of adding features to a product. Figure 1 shows a software project whose returns outpace the cost of production, thus producing a positive ROI.

Figure 2 depicts a product that initially has a positive ROI, but whose
added features cost (marginal cost) more than the amount of return generated
by the features. This initially profitable product becomes a drag on the
company.

Figure 2. ROI is said to be negative if it costs more to produce a product than the product generates.

Figures 1 and 2 are deceptive because under most software processes, the cost of changing software is not linear, but exponential. Brooks (1) attributes the exponential rise in costs to the cost of communication. Changes to software include new features, bug fixes and scaling.

The effects of exponential cost of production can be characterized by three properties. First, new projects are successful because the cost curve is flat. Second, once the costs start increasing, they quickly overcome any additional value added from the new features. Finally, if changes are made after the costs become exponential, the additional costs will quickly overwhelm all returns garnered from the product to date. Figure 3 details the effects of an exponential cost of change.

Figure 3. Exponential costs of change can quickly subsume the worth of any product.
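The three properties above can be illustrated with a small simulation. This is only a sketch under assumed numbers: every feature is assumed to return the same value, and feature k is assumed to cost base_cost * growth**k. Neither the model nor the parameter values come from Brooks or from measured data.

```python
def cumulative_roi(n_features, value_per_feature, base_cost, growth):
    """Cumulative ROI after each feature when feature k costs base_cost * growth**k."""
    roi_by_feature = []
    total_return = 0.0
    total_cost = 0.0
    for k in range(n_features):
        total_return += value_per_feature
        total_cost += base_cost * growth ** k
        roi_by_feature.append(total_return - total_cost)
    return roi_by_feature


# Hypothetical parameters: each feature is worth 10, the first costs 2,
# and every later feature costs 1.5 times as much as the one before it.
curve = cumulative_roi(n_features=12, value_per_feature=10.0, base_cost=2.0, growth=1.5)

best = max(range(len(curve)), key=lambda k: curve[k])
first_loss = next((k for k, r in enumerate(curve) if r < 0), None)

print(best + 1)        # 4: ROI peaks after the fourth feature
print(first_loss + 1)  # 8: the eighth feature wipes out all returns to date
```

Under these assumptions the early features are cheap, the fifth feature already costs more than it returns, and only three features later the product as a whole has lost money.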

Software processes are designed to manage the cost of change. An examination of cost management and processes is beyond the scope of this article but will be the topic of a future article. Briefly, processes that follow waterfall and iterative models control costs by reducing the need for change as costs increase. In contrast, processes based on the spiral model ensure that the cost of change is fixed. This article assumes an exponential cost of change, as most projects are based on waterfall or iterative models.

Changes are often unavoidable because there are no successful medium-sized software projects. Successful projects require a significant amount of development and become a company asset.

Maximizing ROI means expanding the market and adding features, which in turn increases the investment in the product. If the next version is successful, this increased investment leads to an even greater desire to maximize returns. If the cost of change becomes exponential, the high cost makes adding features impractical and development must stop. Unfortunately, most companies do not realize this point exists and spend huge sums on dead products.

Software Failure Modes

Exponential costs of change imply a stark reality: unless the product is shipped before the cost of change becomes exponential, it will very likely fail. Many projects become races to see if enough features can be created to make a viable product before adding the remaining required features becomes too expensive.

There are four failure modes that prevent product completion:

Hitting the wall before release: A small team of programmers is making good progress adding features to a product. Before the needed features can be delivered, some event makes the cost of change exponential and all progress stops. These events may include losing a key team member, adding team members to accelerate production, unforeseen difficulties with technology choices, unforeseen requirements, and major changes in target audience/market. Figure 4 shows how the minimum number of features will never be reached.

Figure 4. Teams hit a wall in production when some event causes the cost of change to be exponential so that all progress seems to stop.
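The race described above can be sketched as a budget calculation. The model is an assumption made for illustration: features cost a flat amount until some event occurs, after which each feature costs a fixed multiple of the previous one. The function name, the budget, and all other numbers are hypothetical.

```python
def features_delivered(budget, base_cost, wall_at, growth):
    """Count the features finished before the budget runs out.

    Each feature costs base_cost until wall_at features are done; from then
    on every feature costs `growth` times as much as the previous one.
    """
    delivered = 0
    cost = base_cost
    while budget >= cost:
        budget -= cost
        delivered += 1
        if delivered >= wall_at:
            cost *= growth
    return delivered


REQUIRED = 15  # hypothetical minimum feature count for a viable product

no_wall = features_delivered(budget=100, base_cost=5, wall_at=10**9, growth=2)
with_wall = features_delivered(budget=100, base_cost=5, wall_at=8, growth=2)

print(no_wall, no_wall >= REQUIRED)      # 20 True
print(with_wall, with_wall >= REQUIRED)  # 10 False
```

With the same budget, the team that hits the wall after its eighth feature delivers only two more features and never reaches the minimum.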

90% done: A team of programmers is making steady progress but never finishes the required features because of a gradual rise in the cost of change.

This failure mode is often unavoidable because the riskiest features are often put off until last. These features often require so much complexity that their solutions overwhelm the development process. Proper risk mitigation is essential to avoiding this failure mode.

Figure 5. 90% done is difficult to detect because the cost of change increases more slowly, with no precipitating event. The higher the value on delivery and the more required features, the more likely this failure mode.

Endless QA: Endless QA occurs when a product reaches QA with all features completed, but still has too many bugs to make it into production. If the cost curve has become exponential, these bugs will take longer and longer to fix. As the cost of change increases, any given change will likely cause more bugs.

Figure 6 demonstrates how the fixing of bugs once the product is released to QA can ruin ROI. The higher the cost of change before delivery to QA, the larger the number of bugs. Indeed, the number of bugs at QA is a good indirect metric of the cost of change.

Figure 6. A product shipped to QA with a fair number of bugs when the cost of change is exponential is in danger of never being shipped as each attempt to fix bugs will probably produce more, each one costing more to fix.
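A rough way to see why QA can become endless is to model each fix as introducing some number of new bugs on average. The sketch below assumes a fixed team capacity per cycle and a constant regression rate; both are simplifications, and the function name and all numbers are hypothetical.

```python
def qa_cycles_to_ship(open_bugs, fixes_per_cycle, regressions_per_fix,
                      ship_threshold, max_cycles=100):
    """Return the QA cycle in which open bugs reach ship_threshold, or None."""
    bugs = float(open_bugs)
    for cycle in range(1, max_cycles + 1):
        fixed = min(bugs, fixes_per_cycle)
        # Every fix removes one bug but introduces regressions_per_fix new ones.
        bugs = bugs - fixed + fixed * regressions_per_fix
        if bugs <= ship_threshold:
            return cycle
    return None  # the product never becomes shippable within max_cycles


# Low cost of change: few regressions per fix, so QA converges.
print(qa_cycles_to_ship(100, 20, 0.25, ship_threshold=10))  # 6

# High cost of change: every fix breaks something else, so QA never ends.
print(qa_cycles_to_ship(100, 20, 1.0, ship_threshold=10))   # None
```

Once each fix produces one or more new bugs on average, the open bug count stops falling no matter how many cycles are spent.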

Version 2.0: Most failures of version 2.0 of any product can be traced to exponential cost of change. During version 1.x, the cost of change has become exponential. The new features will never generate high enough returns to make up for the costs of producing the version. Figure 7 diagrams this effect. What is most frustrating for many teams is that after a successful first version, the costs of change may have become so high that it is unlikely the second version will ever ship.

Figure 7. Second releases often fail because the cost of change has become so great it easily overwhelms the value of the second release.

Failure traps

If costs do increase exponentially, development teams must ensure cost is managed until delivery of the product. If they don’t, failure is all but guaranteed.

Unfortunately, there are several traps for developers that accelerate the onset of exponential costs of change. Interestingly, all of these techniques are designed to accelerate development at the beginning of the project, but the costs of using them may overwhelm any savings. Here are five of the most common traps:

Prototype trap. Product prototypes are great ways to prove technologies and techniques and to reduce risk. However, unless the economics of development are understood, they become liabilities. The problem is how much money is spent on the prototype. If enough resources are spent on any given prototype, it becomes too valuable to throw away.

Most developers intend to throw away a prototype once it is completed. When the prototype is kept instead, the resulting code quickly becomes expensive to change.

The prototype trap can be avoided by ensuring that no significant investment is spent on any given prototype. There are many situations where prototypes are necessary, but they must never endanger a project by reducing the amount of resources available to finish.

4GL trap. 4GLs such as Visual Basic (VB), Forte, 4GL, and Magic allow developers to rapidly develop applications by making assumptions about how data will be accessed and displayed.

The problem with 4GLs is that the code is very hard to modify after it has been created. This accelerates the cost of change. In addition, a language that makes some applications easy to create becomes a hindrance when the problem domain exceeds the design of that language.

Often, the only way around these limitations is to use some other language such as Java or C++ to solve the unsupported problem. The interfaces between multiple languages are notoriously expensive to maintain and extend. Anyone who has tried to make a VB application perform and look like a professional, highly polished standalone application will immediately realize these limitations.

The 4GL trap is easily avoided by understanding the limitations of each language and only using it if all of the features required by the product fit within the assumed model of the language. The most insidious part of this trap is who makes that decision. Most 4GLs are marketed as being designed for novice programmers with little training. Microsoft has been particularly aggressive in marketing VB to companies as the way to hire ‘cheap’ programmers. Unfortunately, these are precisely the people who should not be making the decision about when a particular language is adequate for solving a given problem. Choosing the wrong language will ensure that the product will never ship.

Scripting trap. Scripting languages allow the easy creation of sophisticated software by sewing together existing applications. Advanced scripting languages such as Perl are very powerful and can be used for a variety of purposes. Operating systems such as Unix are designed to be easily integrated through scripting languages and have far lower cost of ownership than those whose management tools are grafted on with pretty user interfaces.

The trap lies in the sophistication of these languages and the mechanisms that make it easy to write programs. Most scripts are not maintainable or even readable by those people who created them. This does not mean that scripts are bad things. They are the perfect solution for integrating existing tools and making small programs. However, since they are always expensive to maintain, the amount of effort put into any single script should be below the threshold of throwaway code: essentially, it is usually cheaper to rewrite the script than to try to modify it.

A stark example of the scripting trap comes from Excite. Excite built its original search and Web serving infrastructure in Perl on Unix machines. Perl allowed Excite to quickly create products that competed with more mature companies such as Yahoo and WebCrawler. However, by 1998, maintenance expenses made it impossible to add new features. Excite had to stop all production and rewrite its infrastructure in Java. This transition took many months and hindered Excite’s ability to compete in other markets such as online shopping and video streaming.

Avoiding the trap is relatively easy. There are many applications that are small and will remain small forever. These are perfect for scripting languages. If new features are required, this small size makes it easy to rewrite in an OO language to control cost of change.

Integrated Development Environment (IDE) trap. Many companies produce IDEs that allow developers to quickly deploy code that they write. Examples include Microsoft’s Visual Interdev Studio and .NET framework, IBM’s Visual Age and Oracle’s 8i. The problem with these environments is that they make assumptions about the target deployment environment and workgroup configuration.

The problem is that companies do not design these tools to help developers, but to lock developers who use their IDEs into their platforms.

In the real world of changing requirements, platform restrictions are often deadly. These restrictions include limited OS support, limited APIs that may make certain features impossible, or platform bugs. Often, the only way around these restrictions is to rewrite major amounts of code.

The IDE trap is easily avoided by choosing tools that do not lock you into a vendor’s technology. In addition, development teams must deploy to production-style systems early in the development process. This allows adequate time to develop the necessary scripts and procedures to ensure proper delivery.

Reengineering trap. Reengineering projects are designed to address the exponential cost of change of an existing system. Lessons learned in previous versions can be applied to control the cost of change.

Reengineering almost always fails because the existing code cannot be easily changed: its cost of change is exponential. If the cost of change were not exponential, there would be no reason to reengineer. This makes it extremely expensive to work with the existing code. As a result, reengineering usually takes as long as or longer than the original product to complete while producing the same set of features.

If it took 10 man-years to complete the first product, it will probably take 10 man-years to complete the reengineered version with exactly the same features. Ten man-years for a zero-sum gain. This is why reengineering projects are rarely completed.

The reengineering trap is avoided by developing a migration strategy. All new features must be made separate from the original code base to avoid the exponential cost of change and the original code base is mined for completed features.

Whenever a bug is encountered in the original code base or an existing feature needs to be extended, the existing code is removed and refactored into the new code base. These migrations are expensive, but there is no way to avoid them. In this way, an organized reengineering of only those sections that are not currently adequate will be performed. The cost of changing these sections will be exponential, but will hopefully be limited.
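One way to picture this migration strategy is a thin dispatcher that sends each feature either to the original code base or to the new one. The sketch below is an illustration only: the class name, method names, and sample features are all hypothetical, and a real system would route at whatever boundary its architecture allows.

```python
from typing import Callable, Dict, List

Handler = Callable[[str], str]


class MigratingApp:
    """Routes each feature to the new code base once it has been migrated."""

    def __init__(self, legacy: Dict[str, Handler]) -> None:
        self._legacy = dict(legacy)         # original code base, mined as-is
        self._new: Dict[str, Handler] = {}  # new code base

    def add_feature(self, name: str, handler: Handler) -> None:
        """New features are written only in the new code base."""
        self._new[name] = handler

    def migrate(self, name: str, handler: Handler) -> None:
        """Move a legacy feature when it needs a bug fix or an extension."""
        del self._legacy[name]  # raises KeyError if it is not a legacy feature
        self._new[name] = handler

    def run(self, name: str, arg: str) -> str:
        if name in self._new:
            return self._new[name](arg)
        return self._legacy[name](arg)

    def legacy_features(self) -> List[str]:
        return sorted(self._legacy)


# Hypothetical features standing in for a real product.
app = MigratingApp({"export": lambda s: s.upper(), "search": lambda s: s[::-1]})
app.add_feature("share", lambda s: f"shared:{s}")   # brand-new feature
app.migrate("search", lambda s: f"found:{s}")       # bug found, so migrate

print(app.run("export", "a"))  # A (still served by the original code)
print(app.run("search", "a"))  # found:a
print(app.legacy_features())   # ['export']
```

Only the features that actually need work pay the migration cost; everything else keeps running from the original code base untouched.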

Conclusion

In a capitalist economic system, software must possess a positive ROI in order to make sense to an organization.

Many software products fail not because there is no market, but because the cost of creating the software far outstrips any profit. Exponential costs of change exacerbate this problem. Software processes are designed to manage these costs; however, it is crucial that an organization understand how and when the costs of creating software will outstrip the worth of a product.

Fortunately, software products tend to fail in one of four modes. By understanding how these modes arise, organizations can choose the appropriate software process to avoid these failures. Each software process model (waterfall, iterative, spiral) has a different approach to managing costs. How each process attempts to manage costs is beyond the scope of this article. However, understanding how costs contribute to failures is crucial to picking a model and process appropriate for your organization.

Finally, regardless of the chosen software process, there are several traps that can accelerate the exponential cost of software production and must be avoided at all costs. The tools that cause these traps are essential to the existence of any software organization, but inappropriate selection will invariably lead to failure. Fortunately, it is usually possible to avoid these traps.

Carmine Mangione has been teaching Agile Methodologies and Extreme Programming (XP) to Fortune 500 companies for the past two years. He has developed materials to show teams how to move from standard methodologies and non-object oriented programming to Extreme Programming and Object Oriented Analysis and Programming. He is currently CTO of X-Spaces, Inc. where he has created an XP team and delivered a peer-to-peer based communications infrastructure. Mangione is also a professor at Seattle University, where he teaches graduate-level courses in Relational Databases, Object Oriented Design, UI Design, Parallel and Distributed Computing, and Advanced Java Programming. He holds a B.S. in Aerospace Engineering from Cal Poly Institute and earned his M.S. in Computer Science from UC Irvine.

Bibliography
1. Brooks, F.P., The Mythical Man-Month, Addison-Wesley, 1995.