Wednesday, October 5, 2011

Keep it Simple, Keep it Reliable

Hello dear readers,

In this post I want to continue the discussion on software complexity and software reliability. The main idea has been already defined by the Einstein:

"Everything should be as simple as it is, but not simpler."

Now, let me talk a bit about existing SW reliability models (SRM). The earliest concepts of SW reliability engineering were adapted from the older techniques of HW reliability. However, the application of hardware methods to software has to be done with care, since there are fundamental differences in the nature of hardware and software faults. Since the last 20 years software reliability engineering is a separate domain. H. Pham gave a classification of actual software reliability models in his book "System software reliability". According to it, there are the following groups of SRM: error seeding models, failure rate models, curve fitting models, reliability growth models, time-series models, and non-homogeneous Poisson process models. These models are based on software metrics like lines of codes, number of operators and operands, cyclomatic complexity, object oriented metrics and many others. An overview of SW complexity metrics you can find in: "Object-oriented metrics - a survey" and "A survey of software metrics".

All of the defined SRM are Black-box Models that consider the software as an indivisible entity. A separate domain, which is more interesting for me, contains so-called Architecture-based SRM, like the ones described in: "Architecture-based approach to reliability assessment of software systems" and "An analytical approach to architecture-based software performance and reliability prediction". These type of the models consider software as a system of components with given failure rates or fault activation probabilities (those can be evaluated using the black box models). Reliability of the entire systems can be evaluated by processing information of system architecture, failure behavior, and single component properties. Most of these models are based on probabilistic mathematical frameworks like various Markov chains, stochastic Petri nets, stochastic process algebra, and probabilistic queuing networks. The architecture-based models help not only to evaluate reliability but also to detect unreliable parts of the system.

Returning to the topic, I want to refer to a very trivial principle: The simpler is SW, the more reliable is the SW. This idea is very transparent. The majority of SW faults are actually bugs that have been introduced during SW design or implementation. Complex SW contains more bugs.  Hence, the probability that one of these bugs will be activated is higher. This fact can be proven by any SRM. To make a reliable SW system you have to define a function of this system very strictly and clear and develop the SW just for this function. This principle is too straightforward, but it will help you to obtain a system "as simple as it is, but not simpler".