CIS 422/522

Fundamentals of Software Testing

Program testing can be used to show the presence of bugs,
but never their absence! -- Dijkstra

... The type of usage environment influences the type and level of software testing...

What is software testing?

Process of executing a computer program and comparing the actual behavior with the expected behavior...

The comparison is intended to detect deviations (if any) between the actual behavior and the expected behavior.
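
A minimal sketch of this execute-and-compare process. The function absolute_value and the helper run_test are hypothetical names made up for illustration; they are not part of the course material.

```python
def absolute_value(x):
    """Hypothetical unit under test."""
    return x if x >= 0 else -x


def run_test(func, test_input, expected):
    """Execute func on test_input and compare actual with expected behavior."""
    actual = func(test_input)
    return {
        "input": test_input,
        "expected": expected,
        "actual": actual,
        "deviation": actual != expected,  # True when behaviors differ
    }


result = run_test(absolute_value, -3, 3)
print(result["deviation"])
```

A deviation only says that actual and expected behavior differ; it does not say which of the two is wrong.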

Testing serves as a barrier that keeps low quality products from reaching the customer...

What is not software testing

Software testing is not debugging...

Software testing discovers unexpected behavior.

Defect classification identifies the situation as a software error, a design error, a specification error, a testing error, etc. - if available, a "work-around" is identified.

If the incident is a software error, debugging is then performed, which tracks down the cause of the error and attempts to fix it.

What is the goal of software testing?

To find errors
To gain confidence in the software
To ensure that all the functionality is implemented
To ensure the customer will be able to get their work done

Depending on your goal, you'll approach testing in different ways ... what does it mean to completely test a product?

The fundamental questions of software testing

What test cases should I use?
How can I tell if the product behaves correctly?
When am I done?
How well did I do?

Principles of testing

Complete Testing is NOT Possible
Testing Work is Creative and Difficult
An Important Reason for Testing is to Prevent Deficiencies from Occurring
Testing is Risk-Based
Testing Must be Planned
Testing Requires Independence

Types of errors

If one of the primary reasons for testing is to find errors, it's important to understand errors.

Some types:

user interface errors
error handling errors
boundary related errors
calculation errors
initial and later state errors
control flow errors
errors handling or interpreting data
race condition errors
load condition errors
hardware interfacing errors
source and version control errors
documentation errors
requirements/specifications errors
feature errors
structural errors
data errors
coding errors
interface, integration and system errors
test/test design errors

Some testing terminology

Faults - a mistake in the code that causes the software to not behave as expected (causes)
Failures - the act of a product not behaving as expected - the manifestation of a fault (symptoms)
Validation - establishing the fitness of a software product for its use - "are we building the right product?"
Verification - establishing the correspondence between the software and its specification - "are we building the product right?"
Test case - the collection of inputs, predicted results and execution conditions for a single test
Ad-lib/ad-hoc test case - a test executed without prior planning - especially if the expected behavior is not known prior to running the test
Pass/fail criteria - decision rules used to determine whether a product passes or fails a given test
Coincidental correctness - when behavior appears to be what is expected, but it is just a coincidence
Test suite - a collection of test cases necessary to "adequately" test a product
Test plan - a document describing the scope, approach, resources and schedule of intended testing activity - identifies features to be tested, the testing tasks, who will do each task, and any risks requiring contingency planning
Oracle - a procedure, process or magical phenomenon that can determine if the actual behavior matches the expected behavior
Incident - when a test produces an unexpected outcome - further effort is necessary to classify the incident as a software error, a design error, a specification error, a testing error, etc.
Bug report - a method of transmitting the occurrence of a discrepancy between actual and expected output to someone who cares for "follow-up" - also known as discrepancy report, defect report, problem report, etc.
Work-around - a procedure by which an error in the product can be "by-passed" and the desired function achieved.
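
Several of these terms (fault, failure, oracle, test suite, coincidental correctness) can be seen together in a small sketch. The function square_faulty and its seeded fault are invented for illustration, and the oracle here is simply a second, trusted computation, which is an assumption that rarely holds in practice.

```python
def square_faulty(x):
    """Hypothetical unit meant to return x squared; the fault is computing 2 * x."""
    return 2 * x


def oracle(x, actual):
    """Oracle: decides whether the actual behavior matches the expected behavior."""
    return actual == x * x


# Each test case is reduced to a single input; the oracle supplies the predicted result.
test_suite = [0, 2, 3]
verdicts = {x: oracle(x, square_faulty(x)) for x in test_suite}
print(verdicts)
```

Inputs 0 and 2 pass even though the fault is executed: that is coincidental correctness. Only input 3 turns the fault into a visible failure.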

Efficiency in testing

Q: How many possible test cases can you have?
A: Many!

Testing involves inferring the behavior of all inputs from a relatively small number of inputs.
Pick your inputs to ensure the "Biggest Bang for the Buck".
The measure of "bang" is dependent on the goal you've selected for testing...
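
To give a sense of scale for "Many!", the arithmetic below assumes a hypothetical function that takes two 32-bit integer arguments and an assumed (generous) rate of one billion test executions per second. Both figures are illustrative assumptions.

```python
BITS_PER_ARGUMENT = 32
NUMBER_OF_ARGUMENTS = 2
TESTS_PER_SECOND = 10**9  # assumed execution rate

# Every distinct combination of argument values is a distinct possible test input.
total_inputs = 2 ** (BITS_PER_ARGUMENT * NUMBER_OF_ARGUMENTS)
seconds = total_inputs / TESTS_PER_SECOND
years = seconds / (365 * 24 * 60 * 60)
print(total_inputs, years)
```

Under these assumptions exhaustive testing of even this tiny interface would take several hundred years, which is why inputs have to be selected.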

Types of software testing

Defect vs. Statistical testing.

DEFECT Testing is intended to exercise a system so that defects are exposed before the system is delivered. STATISTICAL Testing is a software testing process in which the objective is to measure the actual reliability (or other quality attribute) of the system (and compare it with the intended quality attribute goal), rather than to discover faults.
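
A rough sketch of the statistical idea: draw inputs at random from an assumed usage profile and report the fraction handled correctly, rather than hunting for specific faults. The unit, its seeded fault, and the uniform profile over 1..1000 are all hypothetical.

```python
import random


def unit_under_test(x):
    """Hypothetical unit: meant to return x + 1, but faulty for multiples of 50."""
    return x if x % 50 == 0 else x + 1


def estimate_reliability(runs, seed=0):
    """Fraction of randomly drawn inputs for which actual matches expected."""
    rng = random.Random(seed)
    passes = 0
    for _ in range(runs):
        x = rng.randint(1, 1000)  # assumed operational profile: uniform over 1..1000
        if unit_under_test(x) == x + 1:
            passes += 1
    return passes / runs


reliability = estimate_reliability(10_000)
print(reliability)
```

The estimate is only as meaningful as the profile: if real users mostly supply multiples of 50, the measured reliability would be far lower.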

Software testing is the process of executing a computer program and comparing the actual behavior with the expected behavior ... this is intended to detect deviations (if any) between the actual behavior and the expected behavior.

Applications consist of components such as:

subroutines/functions/libraries
modules
programs
systems

Testing can either occur once all the pieces are assembled (BIG BANG TESTING), or the pieces can be tested as they become ready (INCREMENTAL TESTING)

Incremental testing is the most common approach to testing - it can begin as soon as compilable functions are available - testing at this level is known as UNIT TESTING

Unit testing involves executing a function, or collection of functions and comparing the observed behavior with the behavior expected from that unit of code.
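
A minimal unit test sketch using Python's standard unittest module. The function leap_year is a hypothetical unit chosen for illustration.

```python
import unittest


def leap_year(year):
    """Hypothetical unit under test."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


class LeapYearTest(unittest.TestCase):
    """Each method executes the unit and compares observed with expected behavior."""

    def test_divisible_by_four(self):
        self.assertTrue(leap_year(2024))

    def test_century_not_divisible_by_400(self):
        self.assertFalse(leap_year(1900))

    def test_century_divisible_by_400(self):
        self.assertTrue(leap_year(2000))


suite = unittest.defaultTestLoader.loadTestsFromTestCase(LeapYearTest)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
```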

System test categories

(Basically, testing for the different Quality Attributes...)

Load/Stress (how big a load)
Volume (continuous heavy load)
Configuration
Compatibility
Security
Performance
Installability
Reliability/Availability
Recovery
Serviceability (Maintainability)
Usability/Fitness for Use

In order to test a function...

Units like functions cannot be run by themselves

Functions need a MAIN (in most languages) to call them
Functions may need additional code to expose them to test input
Functions may need additional code to make their behavior visible

The code used to do all this is called a test driver
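
A hand-written test driver might look like the sketch below. The unit clamp and the case list are hypothetical; the driver supplies the three things listed above: a caller, a way to feed test input, and a way to make behavior visible.

```python
def clamp(value, low, high):
    """Hypothetical unit under test: restrict value to the range [low, high]."""
    return max(low, min(value, high))


def driver(cases):
    """Test driver: calls the unit with each input and reports what it did."""
    failures = []
    for args, expected in cases:
        actual = clamp(*args)
        print(f"clamp{args} -> {actual} (expected {expected})")
        if actual != expected:
            failures.append((args, expected, actual))
    return failures


def main():
    """Plays the role of MAIN for a unit that cannot run by itself."""
    return driver([((5, 0, 10), 5), ((-1, 0, 10), 0), ((11, 0, 10), 10)])


failures = main()
```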

For a test case to detect a fault...

There has to be a fault in the code
The fault has to be executed
A state error has to be created
The state error has to propagate (persist)
The state error must be recognized
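
The sketch below walks through these conditions with a seeded fault. The function average_faulty is hypothetical; it is meant to return the floor of the mean of a list of integers.

```python
def average_faulty(values):
    """Hypothetical unit meant to return the floor of the mean."""
    total = 1  # fault: should be 0; executing this line creates a state error
    for v in values:
        total += v
    return total // len(values)  # integer division can hide the state error


def expected_floor_mean(values):
    """Trusted reference used to recognize a wrong result."""
    return sum(values) // len(values)


# Fault executed, state error created, but it does not propagate: 7 // 2 == 6 // 2.
masked = average_faulty([2, 4]) == expected_floor_mean([2, 4])

# Fault executed, state error created, and it propagates: 4 // 2 != 3 // 2.
revealed = average_faulty([1, 2]) != expected_floor_mean([1, 2])
print(masked, revealed)
```

Both test cases execute the faulty statement, yet only the second one detects the fault, because only there does the state error survive to the output and get compared with an expected value.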

What are the implications of these requirements?

First and foremost: the statement with the fault must be executed

In Order to Ensure that each statement with a Fault is Executed...

Ensure each statement is executed ... statement coverage

We could just throw a large number of tests at the function, with the expectation that in the process of running them all, we'll get 100% statement coverage

or ...

We can do "data engineering" to ensure that each statement gets executed at least once...

What does it mean for a "fault to be executed"?

Will simply executing a statement cause a fault to be executed?

Many times a statement is correct except under certain circumstances ... in order to execute a fault, you must execute the statement under the appropriate circumstances...

Statement coverage by itself is not adequate...
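
A small sketch of why this is so. The function per_item_cost is hypothetical: it is meant to return the cost per item when a split is requested and the total otherwise.

```python
def per_item_cost(total, count, split):
    """Hypothetical unit with a fault that only shows when split is False."""
    divisor = 0
    if split:
        divisor = count
    return total / divisor  # fault: divisor is still 0 when the branch is skipped


# This single test executes every statement, so statement coverage is 100%.
statement_coverage_test = per_item_cost(10, 2, True)

# The fault is only "executed" when the same statements are reached another way.
try:
    per_item_cost(10, 2, False)
    fault_revealed = False
except ZeroDivisionError:
    fault_revealed = True
print(statement_coverage_test, fault_revealed)
```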

What does the "appropriate circumstance" usually involve itself with?

Control flow coverage

The way you get to a statement is just as important as the statement itself when "executing a fault".

Branch Coverage: Ensure that every conditional is evaluated as both true and false during testing.

Multi-Condition Coverage: Ensure that every predicate within each conditional is evaluated as both true and false during testing.

Loop Coverage: Ensure that every loop is executed 0 times and more than one time during testing.

Path Coverage: Ensure all permutations of paths through the program are taken.
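
The gap between branch coverage and path coverage can be sketched with a hypothetical unit containing two independent conditionals. The function classify and the chosen inputs are illustrative assumptions.

```python
from itertools import product


def classify(a, b):
    """Hypothetical unit; the returned label records which path was taken."""
    label = ""
    if a > 0:
        label += "A"
    else:
        label += "a"
    if b > 0:
        label += "B"
    else:
        label += "b"
    return label


# Two tests make each conditional evaluate to both true and false.
branch_tests = [(1, 1), (-1, -1)]
# Four tests are needed to take every combination of branches.
path_tests = list(product([1, -1], repeat=2))

branch_paths = {classify(a, b) for a, b in branch_tests}
all_paths = {classify(a, b) for a, b in path_tests}
print(sorted(branch_paths), sorted(all_paths))
```

With n independent conditionals, branch coverage can still be reached with two tests while the number of paths grows as 2 to the power n, which is one reason path coverage is rarely achieved in full.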

This approach to testing is known as Glass Box testing because you can see the code and select test cases based on specific details of the implementation. Glass Box testing is effective because it deals with the way the software is written rather than the way you think the software is written...

As the size of the units increases, or modules consisting of several smaller units are created, engineering test cases based on the code becomes more and more difficult...

In order to deal with the increased complexity, the emphasis switches from code based testing to specification based testing as the size of the components being tested increases.

Specification based testing is known as Black Box testing because the software is viewed as a black box which transforms input to output based on the specifications of what the software is supposed to do.

A common approach is to divide the input domain into categories such that the program can reasonably be expected to behave the same for any points within the category - that is, the behavior of every point in the category is equivalent - these are known as Equivalence Classes.

"Equivalence" means that the input will cause the same operations to take place, or there is some other similarity between the points.

There are no "correct" equivalence classes for a program. Which points are equivalent depends on your view of the problem and implementation!

An equivalence partitioning is perfect if when one point in a partition uncovers a bug, every other point in that partition will also uncover the bug, and if one point in the partition does not uncover a bug, no other point in the partition will either.

Usually identifying a set of equivalence classes is an iterative process - continually refine your equivalence classes
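
A sketch of equivalence partitioning for a hypothetical specification: grade(score) returns "pass" for 60..100, "fail" for 0..59, and rejects anything else. The partitioning shown is one assumed choice among several defensible ones.

```python
def grade(score):
    """Hypothetical unit under test."""
    if not 0 <= score <= 100:
        raise ValueError("score out of range")
    return "pass" if score >= 60 else "fail"


# Each class: (points in the class, behavior expected for every point in it).
equivalence_classes = {
    "below range": (range(-100, 0), ValueError),
    "fail": (range(0, 60), "fail"),
    "pass": (range(60, 101), "pass"),
    "above range": (range(101, 200), ValueError),
}


def behavior(score):
    """Observed behavior, with a rejected input reported as ValueError."""
    try:
        return grade(score)
    except ValueError:
        return ValueError


# Test one representative (the middle point) from each class.
representatives = {name: points[len(points) // 2]
                   for name, (points, _) in equivalence_classes.items()}
results = {name: behavior(x) == equivalence_classes[name][1]
           for name, x in representatives.items()}
print(representatives, results)
```

Four tests stand in for roughly three hundred inputs. If experience shows that faults cluster at the edges (59, 60, 100, 101), that is a signal to refine the classes.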

Transaction-based testing

A transaction is a unit of work as seen from the user's point of view - it consists of a series of tasks which may or may not be visible to the user.

Tasks are activities that the user or the system performs.

Sometimes in the process of a transaction, you may have a choice of several tasks, and depending upon what choice you make, the tasks to be performed may change. This is especially true of modern, windows-based applications.

Modeling the task transitions within a transaction

The task to be performed is usually preceded by one task, and followed by another. In some cases there may be a collection of tasks that may precede/follow a task

Identify the tasks that can occur within the transaction - this would be things the user does and things the system does within the context of that transaction

Then identify what tasks may precede or succeed each task

Task transition coverage

Task Coverage: Make sure every task is carried out at least once by your test suite.

Predecessor/Successor Coverage: Make sure that every task is carried out, and that in the process, each task is preceded/succeeded by each of its predecessors/successors.

Transition Path Coverage: Make sure that every transition path is carried out at least once by your test suite.
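
These three levels can be compared on a small task model. The transaction below (a bank-machine style session) and its task names are hypothetical, and the model is assumed to have no cycles.

```python
# Hypothetical transaction: each task maps to the tasks that may follow it.
successors = {
    "login": ["choose_account"],
    "choose_account": ["withdraw", "deposit"],
    "withdraw": ["print_receipt", "logout"],
    "deposit": ["print_receipt", "logout"],
    "print_receipt": ["logout"],
    "logout": [],
}


def transition_paths(task, path=()):
    """Enumerate every path from task to a task that has no successors."""
    path = path + (task,)
    if not successors[task]:
        return [path]
    paths = []
    for nxt in successors[task]:
        paths.extend(transition_paths(nxt, path))
    return paths


def covered_transitions(paths):
    """Predecessor/successor pairs exercised by a set of paths."""
    return {(a, b) for p in paths for a, b in zip(p, p[1:])}


all_transitions = {(a, b) for a, nexts in successors.items() for b in nexts}
all_paths = transition_paths("login")

# Two paths are enough to carry out every task at least once (task coverage)...
task_cover = [all_paths[0], all_paths[3]]
tasks_seen = {t for p in task_cover for t in p}
# ...but they leave some predecessor/successor pairs untested.
pairs_seen = covered_transitions(task_cover)
print(len(all_paths), len(all_transitions), len(pairs_seen))
```

In this model, task coverage needs two paths, predecessor/successor coverage needs more, and transition path coverage needs all four.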

TESTING TOOLS

Anything (esp. software) a tester uses to facilitate testing a program can be considered a testing tool.

Frequently testing tools are used to automate manual procedures.

Automation tools minimize the effort needed to carry out tests - esp. subsequent applications of the same test - and make it easier to reproduce behavior.

Comparators
- compares the contents of two files - one is the "actual output file" and the other is the "expected output file" - requires that you can capture your "actual output"
Product-level harnesses
- usually batch, command-line oriented
- automatic execution of the product
- automated input
- automated output capture
- automated output analysis
Capture/replay tools
- extend the harness concept to interactive products that "converse" with the user
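
A comparator of the kind described above can be sketched with Python's standard difflib module. The sample outputs are made up, and strings stand in for the captured "actual output file" and the stored "expected output file".

```python
import difflib


def compare_outputs(expected_text, actual_text):
    """Return a unified diff as a list of lines; an empty list means a match."""
    diff = difflib.unified_diff(
        expected_text.splitlines(),
        actual_text.splitlines(),
        fromfile="expected",
        tofile="actual",
        lineterm="",
    )
    return list(diff)


expected_output = "total: 3\nstatus: ok\n"
actual_output = "total: 4\nstatus: ok\n"
differences = compare_outputs(expected_output, actual_output)
for line in differences:
    print(line)
```

A plain comparator flags every difference, including ones that do not matter (timestamps, whitespace), so captured output usually needs to be normalized before it is compared.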

Capture/replay tools

built around the concept of a test script
collection of commands to the "harness" that cause it to represent the behavior of the "user"
commands translate into keyboard activity, mouse events, etc.
many commercial capture/replay tools:
Vermont High Test
Visual Test for Windows

The test plan

Document describing the scope, approach, resources, and schedule of testing activities. Defines test items, features to be tested, testing tasks, who will do each task, and any risks requiring contingency planning.

Goals of a Test Plan:

Raises testing issues
Defines testing work
Coordinates testing effort
Assigns and obtains resources

The test plan as a product

In some situations, the test plan is itself a deliverable - often, the format is prescribed by the customer, but it typically will contain sections on:

Strategy and approach to testing
Specifications of test cases
Responsibilities
Testing, etc. procedures
Procedures for control of the process

The test plan as a tool

The test plan is intended to help organize and manage the testing effort

a test plan is valuable as a tool if it does this

beyond this point, it is a diversion of resources

However, don't forget:

Communication
Accountability
Frail human memories

A standard for test plans and documentation (IEEE 829)

Introduction
Test Items
Features to be Tested
Features Not to be Tested
Approach
Item Pass/Fail Criteria
Suspension Criteria and Resumption Requirements
Test Deliverables
Testing Tasks
Environmental Needs
Responsibilities
Staffing and Training Needs
Schedule
Risks and Contingencies

Additional axioms

A good test case is one that has a high probability of detecting a previously undiscovered defect, not one that shows that the program works correctly.
One of the most difficult problems in testing is knowing when to stop.
A necessary part of every test case is a description of the expected output/behavior.
Avoid nonreproducible or on-the-fly testing.
Write test cases for invalid as well as valid input conditions.
Thoroughly inspect the results of each test.
As the number of detected defects in a piece of software increases, the probability of the existence of more undetected defects also increases.
Ensure that TESTABILITY is a key objective in your software design.
Testing, like almost every other activity, must start with objectives.
Bugs tend to be clustered; programs that have the most bugs in test are the most likely source of future bugs. So we should search for symptoms, rather than bugs. While the bugs found must be fixed, the real value is the data obtained. A little analysis reveals which program sections are gardens and which are jungles.

lloyd.madden@dynamix.com

glenw@uswest.net

johnfl@cs.uoregon.edu