Estimating Test Effort – From Billiard Balls to Electron Clouds

Each of us has very likely had to do an estimate in the past, whether it was for a set of assigned tasks, for a project, or for the budget of an entire organization. As a tester, the question is commonly presented as, “How long will it take to test this product – and what resources will you need?” and then the person asking stands there, likely somewhat impatiently, waiting for your answer.

Warning – What you say will be used against you at a later time! Estimation is a black box (magic?) process for many organizations, and all the more so for their testing activities. The inputs used, the process, and the end results are therefore open to debate or, worse, to un-discussed misinterpretation.

In Steve McConnell’s “Software Development’s Classic Mistakes 2008”, software estimation is mentioned more than once as an area of significant impact. Of the “Top Ten Almost Always” list, confusing estimates with targets was ranked #5, shortchanging quality assurance was ranked #3, and overly optimistic schedules were ranked #1.

“…some organizations actually refer to the target as the ‘estimate,’ which lends it an unwarranted and misleading authenticity as a foundation for creating plans, schedules, and commitments.” – Steve McConnell, “Software Development’s Classic Mistakes 2008”

Capers Jones noted in Assessment and Control of Software Risks that most projects overshoot their estimated schedules by anywhere from 25% to 100%, but a few organizations have achieved consistent schedule-prediction accuracy within 10%, and some even within 5%.

What are these organizations doing differently? Are they the quantum physicists of software? Perhaps, but maybe they have simply taken the time to put in place a formal and repeatable estimation process tailored to the types of projects they undertake – something we all can do. Of course, we will not blindly copy these organizations’ approach. We simply need to employ similar practices in a framework of continuous improvement while recognizing that a single approach will not fit all of our needs all of the time.

Read the full article…

Posted in All, Business of Testing, Estimation for Testing, Planning for Quality, Test Planning & Strategy

Increasing Test Effort Estimation Effectiveness

On May 29, 2008, I presented “Increasing Test Effort Estimation Effectiveness” to VANQ.org, the Vancouver Software Quality Assurance User Group, and I wanted to share that material with you.


Estimates Provide Visibility to Stakeholders

  • To know how much we will do, we need to know how big it is (Vision) relative to our constraints (Capacity)
  • For each scenario (Choices) of effort, resources and constraints we can determine impact (Consequences) on Scope and level of Risk
  • Scope and Risks and Mitigation Approaches (Strategy) are inputs to the test strategy/plan
  • All of the above and the task level effort (work) estimates are inputs to the project plan

Estimates Provide a Foundation for Planning

  • A good estimate makes the project achievable.
  • It identifies the winning combination of features and options.
  • It establishes the initial risks for risk management.

View the full presentation…

As this was a presentation where audience participation was expected, the slides alone do not tell the whole story; you can read the article Estimating Test Effort – From Billiard Balls to Electron Clouds for additional details.

Posted in All, Estimation for Testing, Planning for Quality, Test Planning & Strategy

Why Pay High Prices for Defect Management

This article was published in Volume 2 Issue 10 of Software Test and Performance, www.stpmag.com


If you want quality costs to stop guzzling money, the most effective thing you can do is optimize your project’s defect life cycle. Here’s how.

Testing is much more than just finding bugs to squash. It’s not an event, but a set of diverse activities playing a critical role in identifying problems of varied types throughout the project life cycle, far in advance of public access to the software.

Tracking the real costs of software failure, such as patches, support and rework, can be difficult, but it’s clear that effective testing can help optimize the cost of quality. Within the activities of testing, defect management needs thoughtful consideration to ensure that communication is as efficient and collaborative as possible and turnaround is prompt.

Read the full article…

Posted in All, Automation & Tools, Planning for Quality

Optimizing the Defect Lifecycle – with Resolution

I presented “Optimizing the Defect Lifecycle – with Resolution” to VANQ.org, the Vancouver Software Quality Assurance User Group. Subsequently I presented it to the Southern Idaho Society for Software Quality Assurance and UBC Continuing Studies (Tech.UBC.ca), and I wanted to share that material with you.

Just as defect reports provide valuable insight into the common classes of defects that have occurred in the past, analysis of other attributes can provide useful information for improvement. One of the classifications that can be applied to a defect would be its resolution.

Not all defects are simply fixed by development. Without specific tracking of defect resolutions, the true defect find rate and defect clustering in the code are obscured by duplicates, non-reproducible reports, by-design resolutions, and enhancement requests.

In this presentation, we will look at how including a resolution attribute as part of your defect reports can help optimize one of the core processes of developing software. We will explore how we can use this field to facilitate process automation and data collection, to help make team interactions more efficient, and to ensure that the right people are making the decisions about product functionality.

You can download the slides here: Optimizing the Defect Lifecycle – with Resolution

Posted in All, Automation & Tools, Planning for Quality

Philosophy of Defect Resolutions

One of the foundation processes in any company that produces software is the defect lifecycle. It is primarily this process that describes how Development and Testing interact around an issue or defect report.

There is typically an emphasis on how the severity and/or priority of a defect is set and tracked. That rating ties to the likelihood of the defect being encountered in the field and to the impact or cost of that encounter – clearly a focus on reducing External Failures as they relate to the Total Cost of Quality within the organization. This is certainly an important area in which to look for and implement improvements, given the potential for high return on investment.


However, a test manager’s vision must extend to potential improvements in all areas included in the formula for the Total Cost of Quality (Continuous Quality Improvement and Outsourcing) and to the definition of interfaces between the different organizational groups and the supporting processes. The test manager must also look to building the capability of the test team and to working with the individuals within that team so they can be at their most effective.

Definition of a Defect

“A software error is present when the program does not do what its end user reasonably expects it to do.” (Myers, 1976).

“The extent to which a program has bugs is measured by the extent to which it fails to be useful. This is a fundamentally human measure.” (Beizer, 1984).

Attributes of a Defect

There are many attributes that can be ascribed to a defect that are usable in classifying, organizing, and analyzing the associated issue. Aside from a unique Identifier (DefectID), a Description of the issue with Reproduction Steps, and Expected and Actual Results, a defect report might have some of the following:

  • Status
  • Assigned To
  • Priority
  • Severity
  • Functional Area
  • Feature
  • How Found
  • Type
  • Environment
  • Resolution
  • Opened Version
  • Opened By
  • Opened Date
  • Related Test case(s) or Requirement(s)
  • History or Audit Trail

When including these attributes in the recording of defects, and as part of your defect lifecycle, you can leverage the information to make observations and draw conclusions, typically via metrics (Establishing Effective Metrics).
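To make these attributes concrete, here is a minimal sketch of what a defect record carrying them might look like. It is written in Python purely for illustration; the field names and defaults are assumptions, not taken from any particular defect tracking tool.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional

@dataclass
class DefectReport:
    # Identity and description
    defect_id: str                      # unique identifier (DefectID)
    description: str                    # summary of the issue
    reproduction_steps: str
    expected_result: str
    actual_result: str
    # Classification attributes used for organizing and analyzing defects
    status: str = "New"                 # e.g. New, Resolved, Closed
    assigned_to: Optional[str] = None
    priority: Optional[str] = None
    severity: Optional[str] = None
    functional_area: Optional[str] = None
    feature: Optional[str] = None
    how_found: Optional[str] = None
    defect_type: Optional[str] = None
    environment: Optional[str] = None
    resolution: Optional[str] = None    # e.g. Fixed, As Designed, Duplicate
    opened_version: Optional[str] = None
    opened_by: Optional[str] = None
    opened_date: Optional[date] = None
    related_items: List[str] = field(default_factory=list)  # test cases / requirements
    history: List[str] = field(default_factory=list)        # audit trail entries
```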

Using Defect Resolution

One of the classifications that can be applied to a defect would be its resolution.

Not all defects are simply fixed by development. Developers may resolve a defect as not a bug, by design, not reproducible, a duplicate, and so on, as the reason for moving the defect out of their queues.

Cem Kaner suggests in “How To Win Friends, And Stomp Bugs” the following list of choices for the resolution of a defect in your defect database:

  • Fixed: the programmer says it’s fixed. Now you should check it.
  • Cannot reproduce: The programmer can’t make the failure happen. Add details, and notify the programmer.
  • Deferred: It’s a bug, but we’ll fix it later.
  • As Designed: The program works as it’s supposed to.
  • Need Info: The programmer needs more info from you.
  • Duplicate: This is just a repeat of another bug report (cross-reference it on this report).

The resolution field can contain other options in addition to the above, such as Enhancement, Spec Issue, Not Implemented, and Third Party.

In this field, you can capture much of the philosophy or underlying meaning of the defect resolution process. The core of this philosophy should center on getting a more detailed picture of the defect counts and on allowing an analysis of that picture for more accurate and useful metrics (without providing too many choices).

For example, if a defect is given the resolution:

  • “Fixed” implies that there really was a problem in the code and it has been addressed now.
  • “As Designed” implies that the tester may not have the latest information about the functionality OR may not have the necessary understanding of the product.
  • “Enhancement” implies that the tester has not found a defect per se, but that the issue is a new feature or feature modification request. In other words this is not a defect but it has been implemented in the current release (as opposed to those that have been Deferred). This information is valuable for the future, as these records can then be distinguished from the others for easy collection and inclusion in the requirements document and help files.
  • “Cannot Reproduce” implies that there is not enough information in the report for the developer to be able to reproduce the defect and that the tester needs to clarify or add information, or withdraw the defect. There may be a hardware setup or set-up preconditions required for seeing this defect that need to be added to the report, or pointed out to the developer if already in the report (e.g., the build number or environment information). This resolution should appear only as a transitory value before the true resolution is set.

In each of the examples above, you could come to a similar view if you were looking at just the surface of the defects. However, your experience no doubt also contains many instances (such as under deadline pressure) where some resolutions are inappropriately used as a way for developers to clear out their queues. In these cases it often becomes the job of the tester, through the mechanism of the defect lifecycle, to make sure the defects get to the right audience and aren’t simply let go.

“Need More Info” and “Cannot Reproduce” are examples of resolutions that can create a lot of churn between the developer and the tester. Examining how many defects get these resolutions, and the reasons why, can provide good insight into training opportunities within the team to reduce this rework. Improvements to the application, or tools to help with diagnosis of problems, may also suggest themselves.
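As a rough illustration of that kind of analysis, the sketch below counts how often the churn-prone resolutions occur and who reported those defects. It assumes a simple CSV export of the defect database with “resolution” and “opened_by” columns; the file name and column names are illustrative only.

```python
import csv
from collections import Counter

CHURN_RESOLUTIONS = {"Need Info", "Cannot Reproduce"}

def churn_summary(csv_path):
    """Count churn-prone resolutions overall and per reporter."""
    overall = Counter()
    per_reporter = Counter()
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            resolution = row.get("resolution", "").strip()
            if resolution in CHURN_RESOLUTIONS:
                overall[resolution] += 1
                per_reporter[row.get("opened_by", "unknown")] += 1
    return overall, per_reporter

if __name__ == "__main__":
    overall, per_reporter = churn_summary("defect_export.csv")
    print("Churn-prone resolutions:", dict(overall))
    print("By reporter:", dict(per_reporter))
```

A similar count grouped by functional area or by developer can point to where the rework is concentrated.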

“As Designed” may indicate a defect that should not have been logged, as implied above. But what if the design is flawed? Referring back to the definition of a defect can be helpful in deciding the next step.

A defined and visible defect lifecycle process provides support for improved defect resolution. For example, if a defect is resolved “As Designed” or “Deferred”, perhaps that defect is then assigned to the business analyst responsible for the functional area to confirm before it goes back to the tester for review, or it might even be escalated to the product manager.

“Duplicate” defects could indicate a higher likelihood of the defect being encountered, potentially poor organization in terms of overlapping resource effort, poor processes in terms of making sure the majority of duplicates are caught before going to development, or even poor training of the resources doing the work.

This is another example of where the defect lifecycle can help ensure there is a stronger chance of reducing the effort spent on duplicate defects. Before assigning a defect to a developer, perhaps the tester reviews all currently open defects (or searches for similar defects via keywords). If it is determined that the defect is a duplicate, the tester then may add any additional information to the existing defect or, if the differences are of a greater degree, log a new defect record and relate the two together. Following a similar process may mean that duplicate defects still get logged; however, there is now a valid process behind how they get logged, and the testers avoid creating extra work for development and avoid looking like they just aren’t taking enough care in their work.
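A lightweight way to support that pre-logging search is a simple keyword match against the summaries of currently open defects. The sketch below is only illustrative – the data structure and the scoring rule are assumptions, not a feature of any particular tracker.

```python
def similar_defects(new_summary, open_defects, min_shared=2):
    """Return open defect IDs whose summaries share keywords with the new report."""
    new_words = {w.lower() for w in new_summary.split() if len(w) > 3}
    matches = []
    for defect_id, summary in open_defects.items():
        shared = new_words & {w.lower() for w in summary.split() if len(w) > 3}
        if len(shared) >= min_shared:
            matches.append((defect_id, sorted(shared)))
    return matches

open_defects = {
    "D-101": "Crash when saving report with empty title",
    "D-102": "Toolbar icons misaligned at 125% zoom",
}
print(similar_defects("Application crash on saving a report without a title", open_defects))
```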

Summary

Just as defect reports provide valuable insight into the common classes of defects that have occurred in the past, analysis of other attributes can provide useful information for improvement. Without specific tracking of defect resolutions, the true defect find rate and defect clustering in the code are obscured by duplicates, non-reproducible reports, by-design resolutions, and enhancement requests.

Recording and analyzing this information helps ensure you are able to investigate and address the root causes of these Quality Costs. An adaptive approach to testing processes, communication and training goes a long way to show that you have a strong and capable test team. Including a resolution attribute as part of your defect database, and more specifically within the definition of your defect lifecycle, can help achieve this.

For another view on this, check out my presentation on SlideShare: “Optimizing the Defect Lifecycle – with Resolution”.

Posted in All, Automation & Tools, Planning for Quality

Defect Tracking – Selecting the Right Tool

Effective defect tracking is a critical activity in any software development project, and having the right tool to do the job is just as critical.

A defect tracking system is a tool intended for tracking project issues, defects, or bugs. The tool provides efficient mechanisms for:

  • Recording and communicating such issues to the various team members,
  • Tracking the state of each issue as it moves through the defect lifecycle, and
  • Generating various reports on the data associated with the set of collected issues for further analysis and interpretation.

In “New Project? Where are the Templates?“, it was mentioned that a guiding principle in the software industry, considering the wide range of project scope and constraints, is to “use the processes and tools appropriate to the size, complexity, and impact of your project”. This is certainly true when selecting the right defect tracking solution for your team.

There are literally dozens of publicly available defect tracking tools to choose from. A total of eighty-eight of them are listed on Danny Faught’s www.testingfaqs.org web site.

While it is still possible to track defects by way of email or even on paper, off-the-shelf solutions have the common intent of trying to: accelerate defect resolution; generally improve project organization and resource planning; promote communication within the project team; and increase transparency of, and track quality levels for, every functional area throughout the project. All are reasons to invest some time in looking at the available options for capable tool-based solutions.

Steps to Making the Right Choice

A decision has been made to assess potential solutions, either because you don’t have one or because the current solution is not meeting your present or anticipated needs. The following steps outline a process that you can use as a starting place for undertaking an objective tool selection.

Getting Approval for the Assessment

Selecting a defect tracking tool is not necessarily a simple matter, and it will require time and potentially other resources from the organization to complete. A well-described ROI proposal to management will go a long way toward getting their buy-in to this project.

In a document of only a few pages, describe the anticipated ROI of getting a new tool, the major steps in the selection process, and why it is important to undertake this process as part of choosing the defect tracking tool. You will also want to consider these questions: what are the quantifiable goals the team hopes to achieve with the tool, and can these benefits be measured?

Quantifying these benefits (and the inability of any current solution to provide these benefits) is important to getting through the first gate of determining how much a Defect Tracking solution is worth to the team, and getting a budget defined for the new tool, before you start looking.

Cost will be a big factor and how much you can invest will have a significant impact on which tools you will be able to consider as your solution. Note that the budget must include the actual licenses as well as the assessment itself, and costs for any future training and implementation for the selected tool.

As noted in “Does ROI Matter To You?” by Wolfgang Strigel, ROI is a widely used approach for measuring the value of a new and improved process or product technology. For further information on the ROI calculations that you can apply, refer to “Practical Metrics and Models for Return on Investment” by David F. Rico.
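As a back-of-the-envelope illustration, a simple ROI calculation for the proposal might look like the sketch below. The formula is the generic (benefit - cost) / cost, and every figure in it is a made-up assumption rather than a number from the referenced papers.

```python
def simple_roi(annual_benefit: float, annual_cost: float) -> float:
    """Classic ROI: (benefit - cost) / cost, expressed as a percentage."""
    return (annual_benefit - annual_cost) / annual_cost * 100

# Hypothetical numbers: hours saved on triage and reporting valued at a loaded rate,
# versus licences, training, and administration for the new tool.
benefit = 400 * 75.0            # 400 hours/year saved at $75/hour
cost = 12_000 + 5_000 + 3_000   # licences + training + administration
print(f"Estimated ROI: {simple_roi(benefit, cost):.0f}%")
```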

Document the Process of Assessment

Make it clear and visible in a document what steps will be undertaken as part of the selection process. This will avoid confusion or misunderstandings as to any decision gates, short-list criteria, and possible reasons for choices made or delays in the process. You can use this article as a starting place for the outline of this document.

Determine the Method for Evaluation

Define how you are going to measure each tool against the needs of the team and against each other. The solution being assessed needs to be describable in terms of how it is better or worse than an alternative solution, and the reasons why need to be recorded in an objective manner.

A sample method of evaluation may be to simply give a rating from 0-10 for each functional requirement where: 0-2 = non-existent or unusable; 3-5 = present but not useful in current form; 6-8 = present but requires configuration or changes; 9-10 = present with little to no configuration or changes needed.

Requirements could be enumerated and grouped into the following:

  1. Critical Requirements: List and describe the critical requirements and why they are critical to your company.
  2. Functional Requirements: List and describe what functionality or abilities the tool must have.
  3. Non-Functional Requirements: List and describe what constraints (cost, environment, quality, etc) the tool must meet or comply with.

The following is a sample of what this section of the assessment document might look like:

ID #  | Requ. Type     | Test for Requirement                             | Evaluated | Comments
CR01  | Critical       | [the specific measurable requirement to be met]  | [0-10]    | [any additional comments, questions, or contextual information]
CR02  | Critical       |                                                  |           |
FR01  | Functional     |                                                  |           |
FR02  | Functional     |                                                  |           |
NR01  | Non-Functional |                                                  |           |

At a minimum, you will need to prioritize the requirements so that you can measure each tool against what is most important to your team. In “Evaluating Tools”, Elisabeth Hendrickson recommends a face-to-face meeting as a good forum for prioritizing:

  • Invite everyone with a say in the tool selection decision.
  • Post a list of the requirements, printed large enough to be read from a distance, on the wall or whiteboard.
  • Everyone at the meeting gets three votes. (If you have a very large number of requirements, you may want to give everyone five votes instead of three.)
  • Each person may cast his or her votes in any combination: one at a time, or multiple votes for a particularly important requirement.
  • At the end of the meeting, tally the votes and the requirements are now prioritized.
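Combining the 0-10 ratings with the vote-based priorities, a small scoring sketch can rank the candidate tools. The requirement names, vote counts, and ratings below are purely illustrative assumptions.

```python
# Each requirement carries the number of votes it received as its weight.
requirement_votes = {
    "CR01 integrates with our build system": 5,
    "FR01 configurable workflow": 3,
    "NR01 within licence budget": 4,
}

# 0-10 ratings per tool per requirement, gathered during the evaluation.
ratings = {
    "Tool A": {"CR01 integrates with our build system": 8,
               "FR01 configurable workflow": 6,
               "NR01 within licence budget": 9},
    "Tool B": {"CR01 integrates with our build system": 9,
               "FR01 configurable workflow": 4,
               "NR01 within licence budget": 7},
}

def weighted_score(tool_ratings, votes):
    return sum(weight * tool_ratings.get(req, 0) for req, weight in votes.items())

for tool, tool_ratings in ratings.items():
    print(tool, weighted_score(tool_ratings, requirement_votes))
```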

Determine the Needs from Stakeholders

One of the most important steps in the tool selection process is to involve representatives of the various groups of stakeholders in enumerating the needs for the solution. It is easy to determine that testers and developers need to have their feature and workflow requirements met. But team leads, project management, technical support and others may all have inputs or need outputs from the Defect Tracking system.

Of course, not all functionality is critical – there are “nice-to-have” and “maybe in the future” types of features from each group of stakeholders. Remember to look at both your process and technical needs for requirements. Also, don’t be afraid to challenge or rework the existing processes at this time – it is a great opportunity to improve and strengthen your processes (note: better doesn’t necessarily mean more). In the end, the tool that you choose should support (and perhaps guide) your process, but not impose one of its own.

Re-define Needs in a Form for Evaluation

Although a need may be stated easily enough by a stakeholder, it may not be expressed in a manner that the person evaluating the candidate tools can measure. Similar to business requirements in the development of a new software product, these statements express the needs for which the application must provide a solution. But, in order to implement the business requirement accurately, the actual functional requirements need to be enumerated, scoped, understood, reviewed, and signed off. It is the same case when assessing off-the-shelf solutions.

An example of such an expressed need might be that “the tool must be compatible with our development environment”. If this need was expanded to describe exactly how the tool is expected to be compatible by listing specific operating systems, development tools, or other third-party software with which the tool must integrate, then the evaluation of this need can be performed in a much more systematic and objective manner.

Other examples of requirements that are difficult to objectively evaluate and use to compare two solutions might be:

  • “Is completely customizable”
  • “Has an intuitive interface”
  • “Is reliable”

Select Tools for Detailed In-House Evaluation

In the case of defect tracking systems, your preliminary research will uncover a large listing of possible solutions. The critical requirements should be used to limit which tools even make it into the first cut. This research is typically “hands-off” and will be used to finalize the first cut against the initial criteria. Remember, any tool you can eliminate at this point will allow you to spend more time on potentially better matches in the next steps of the assessment process.

In determining what your choices are, you will want to search the Web for vendor websites and relevant lists of tools (such as www.testingfaqs.org), read and participate in forums or newsgroups, possibly attend tradeshows and conferences, and ask co-workers and colleagues in other companies about tools they have experienced. If you work for a large company, ask other divisions what they use (maybe they even did a similar assessment already).

Note: During the assessment process, you may find you need to add or modify your requirements or constraints (perhaps even the budget or time for the assessment itself) in order to arrive at the best choice. Remember to keep such changes, and the reasons for them, visible to the stakeholder representatives and to those who approve the budget.

Obtain Demos of Top (3) Tools

Prior to deciding on the tools that you want to have demos for, talk with the vendor’s sales people.

Elisabeth Hendrickson warns in her paper that when you ask the sales representatives up front if their tool meets the requirements you have, it’s very likely that they will respond positively. The important part of this conversation is not when the sales person says, “We can do that.” It is when the sales person says, “…and this is how.”

It is also reasonable to ask the sales person how their tool compares to their competition:

  • How can I compare your product with other products on the market?
  • Under what situations is your tool the best choice?
  • Under what situations is your tool probably not the best choice?
  • What features differentiate your tool from the competition?
  • What don’t you like about your tool?

Again, you will want to reduce the number of tools that proceed to the actual demo step. It is more than likely that you do not have all the time you would need to look at every tool in sufficient detail, so it is better to examine a much smaller group than to sacrifice the depth of the evaluation by looking at a larger group of candidate tools.

During the demos, the objective is to get the selected tool vendors to perform a demonstration of their tool to stakeholder representatives, either in person or on-line, where they prove their earlier claims on how their tool matches all your criteria.

Perform Detailed In-House Evaluation in a Practical Environment

After the demo, obtain evaluation versions of the successful tool(s) for further evaluation on your own to identify potentially hidden problems or irritating behaviours not uncovered in the demo, or to investigate the concerns of the stakeholder representatives expressed after the demo.

Try to arrange to use the tool(s) on an actual project if possible so that the users of the system have an opportunity to experience the solution in a real-world situation and offer feedback. If this is not possible, assign resources to use the evaluation copies of the tool(s) in a simulated project environment.

During this stage of the evaluation, make sure to perform the various activities of the entire process (not just the day-to-day) that your team intends to follow.

Final Selection is Made – Roll-out the New Tool

When rolling out the new tool to your team, remember to make time for training the various stakeholders in how to apply the tool and how it addresses their needs. Also, start collecting data immediately that you can use to measure how well the tool is working in your project environments. This will allow you to track metrics over time and to adjust course sooner rather than later if anything unforeseen begins to develop during implementation and ongoing usage.

For more information on selecting tools for your organization and defect tracking tools in particular refer to “Tracking Down a Defect Management Tool” by Hung Quoc Nguyen and “Evaluating Tools” by Elisabeth Hendrickson. The above selection process has been adapted from the tool assessment process described in “Introduction to Practical Test Automation”, a public course available through UBC Continuing Studies as part of the Software Engineering Certificate – Quality Assurance and Testing Track (http://www.tech.ubc.ca/softeng/).

Posted in All, Automation & Tools, Planning for Quality

Pictures for Test Planning

In the fast-changing world of software product development there is a continuous challenge to document the expectations for the system and its internally and externally facing behaviours. Requirements often suffer because of the challenges of keeping up with an iterative project life-cycle, evolving product scope, and uncertain or changing GUI/screens.

However, the need remains for all stakeholders to optimize agreement, minimize risk, and minimize rework costs. From a tester’s point of view this translates, in part, into a question: how can test coverage of the system be assured and made visible to all the stakeholders?

In “Testing Without Requirements“, we suggested using checklists, matrices, and user scenarios as ways to approach testing when requirements are non-existent, not complete, or not ready at the time testing needs to start.

Even when you have minimal or out-of-date requirements, you can use different methods to help you rapidly define the application, describe its functions and derive an efficient plan to drive your testing effort.

A first step in developing these tools is to think in pictures.

“Imagery is the most fundamental language we have. Everything you do the mind processes through images,” says Dennis Gersten, M.D., a San Diego psychiatrist and publisher of Atlantis, a bi-monthly imagery newsletter.

Benefits of creating User Scenarios / Use Cases:

  • Easy for the owner of the functionality to tell/draw the story about how it is supposed to work.
  • System entities and user types are identified.
  • Allows for easy review and ability to fill in the gaps or update as things change.
  • Provides early ‘testing’ or validation of architecture, design, and working demos.
  • Provides a systematic, step-by-step description of the system’s services.
  • Easy to expand the steps into individual test cases as time permits.

User scenarios quickly provide a clearer picture of what the customer is expecting the product to accomplish. Employing these user scenarios can reduce ambiguity and vagueness in the development process and can, in turn, be used to create very specific test cases to validate the functionality, boundaries, and error handling of a program.

And every picture tells a story – and stories, or scenarios, form a basis for testing. Using diagrams can be very effective to visualize the software, not only for the tester but for the whole project team.

Creating user scenarios/use cases can be kick-started by simply drawing a flowchart of the basic and alternate flows through the system. This exercise rapidly identifies the areas for testing including outstanding questions or design issues before you start.

In her article “A Picture’s Worth a Thousand Words”, Elisabeth Hendrickson notes, “Pictures can pack a great deal of information into a small space. They help us to see connections that mere words cannot.”

The Unified Modeling Language (UML), which is a standard language for specifying, visualizing, constructing, and documenting the artifacts of software systems, can be employed to help provide these pictures. However, there are many less formal notations you can use to put together simple diagrams – such as activity flow charts, data flow diagrams, state diagrams, and sequence diagrams – that can be just as useful for meeting your project needs.

As long as you achieve the goal of obtaining enough information to help you with the task of generating comprehensive test case ideas, it doesn’t matter what notation you use. Start with a basic diagram depicting the main modules of the system and when, why, and how they interact; from there, you can create more detailed diagrams for each module.

What should be in the initial picture? The very basic information you have. Is it a client-server application? Is it web-based? Is there a database? What are the major tasks the system is supposed to perform?

You have to focus on how the system behaves. End users can help define user scenarios (or use cases) in a diagram format, providing details of the system that will help you not only understand what the client is expecting but also will allow you to validate the diagrams previously drawn.

Describing the tasks and subtasks in detail will provide test scenarios and analyzing the relationships among the modules will help determine the important inputs for the overall testing strategy.

Flow Charts

A flow chart is commonly seen as a pictorial representation describing a process, defining the logical steps including decision points and activities. Flow charts are useful for defining the paths you want to verify or to force into an error condition.

Flow charts can take different forms, such as top-down flow charts, detailed flow charts, workflow diagrams, and deployment diagrams. Each of the different types of flow charts provides a different view or aspect of a process or task. Flow charts provide an excellent form of documentation for a process, and quite often are useful when examining how various steps in a process work together.
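One way to capture the basic and alternate flows quickly is as a small directed graph, and then enumerate the paths through it; each path is a candidate test scenario. The login-style flow below is a made-up example, purely for illustration.

```python
# Basic and alternate flows expressed as edges of a simple directed graph.
flows = {
    "Start": ["Enter credentials"],
    "Enter credentials": ["Validate"],
    "Validate": ["Show dashboard", "Show error"],
    "Show error": ["Enter credentials", "End"],
    "Show dashboard": ["End"],
}

def enumerate_paths(node="Start", path=None, max_revisits=1):
    """Yield every path from Start to End, limiting how often the error loop is retaken."""
    path = (path or []) + [node]
    if node == "End":
        yield path
        return
    for nxt in flows.get(node, []):
        if path.count(nxt) <= max_revisits:   # allow at most one retry loop
            yield from enumerate_paths(nxt, path, max_revisits)

for i, scenario in enumerate(enumerate_paths(), start=1):
    print(f"Scenario {i}: " + " -> ".join(scenario))
```

Each printed scenario, plus the questions it raises about data and error handling, becomes an area for testing.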

State Diagrams

Another option for capturing the software behaviour is the state diagram. State diagrams describe all of the possible states of an object as events occur, and the conditions for transition between those states. The basic elements of a state diagram are rounded boxes representing the states of the object and arrows indicating the transitions to the next state. The activity section of the state symbol depicts what activities the object will be doing while it is in that state.

All state diagrams begin with an initial state of the object. This is the state of the object when it is created. After the initial state the object begins changing states. Transition conditions based on the activities determine the next state of the object.
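Once the states and transition conditions are written down, each transition (and each event that is not allowed from a state) suggests at least one test. A minimal sketch follows; the defect-report states used here are an assumed example, not a prescribed lifecycle.

```python
# Allowed transitions from a hypothetical state diagram: (state, event) -> next state.
transitions = {
    ("New", "assign"): "Assigned",
    ("Assigned", "resolve"): "Resolved",
    ("Resolved", "verify-pass"): "Closed",
    ("Resolved", "verify-fail"): "ReOpened",
    ("ReOpened", "resolve"): "Resolved",
}

states = {s for s, _ in transitions} | set(transitions.values())
events = {e for _, e in transitions}

# One test idea per allowed transition...
for (state, event), target in transitions.items():
    print(f"TC: in state '{state}', event '{event}' should move the object to '{target}'")

# ...and one per disallowed state/event combination.
for state in sorted(states):
    for event in sorted(events):
        if (state, event) not in transitions:
            print(f"TC: in state '{state}', event '{event}' should be rejected or have no effect")
```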

Summary

Flowcharts and state diagrams provide similar and at times complementary methods for visualizing, or picturing, the core information to be captured in a user scenario or use case. Throughout the process of creating these diagrams, test case ideas will come to the fore to be rapidly captured for later detailing.

Time well-spent to better understand the software to be implemented and tested not only improves your actual testing activities, but also helps improve the organization’s understanding of the product, and thereby can significantly improve the product as a whole.

Posted in All, Agile Testing, Automation & Tools, Requirements & Testing, Test Planning & Strategy

Establishing Effective Metrics

The biggest challenge in establishing an effective metrics programme is not the formulas, statistics, and complex analysis that are often associated with metrics. Rather, the difficulty lies in determining which metrics provide valuable information to the project and/or organization, and which procedures are most efficient for collecting and applying these metrics.

IEEE Standard for a Software Quality Metrics Methodology IEEE Std 1061-1998 (Revision of IEEE Std 1061-1992) relates software quality to metrics in the following: “Software quality is the degree to which software possesses a desired combination of attributes. This desired combination of attributes shall be clearly defined; otherwise, assessment of quality is left to intuition. For the purpose of this standard, defining software quality for a system is equivalent to defining a list of software quality attributes required for that system. In order to measure the software quality attributes, an appropriate set of software metrics shall be identified.”

It is important to be aware that a common misapplication of software metrics is to use them to measure team members’ productivity against an industry standard. Such comparisons do not earn support for the metrics programme and in fact are likely to cause resentment among the project staff. A metrics programme must focus on much more than productivity measures.

Software metrics are used to quantify software, software development resources, and/or the software development process. Consider these areas of software development that can benefit from inclusion in a planned metrics programme:

  • Product quality
  • Product performance
  • Schedule and progress
  • Resources and cost
  • Development process

Goals of a Metrics Programme

A metrics methodology for measuring quality allows an organization to:

  • Identify and increase awareness of quality requirements and goals.
  • Provide a quantitative basis for evaluating and making decisions about software quality in a timely manner.
  • Increase customer satisfaction by predicting and then quantifying the quality of the software before it is delivered.
  • Reduce software life cycle costs by improving process effectiveness and customer satisfaction.
  • Provide feedback on the metrics programme itself and validate the set of metrics being tracked.

Defining the Metrics Programme Framework

The key to effective software metrics within an organization is to prepare a plan describing how metrics will be used to meet strategic management goals.

The first component of a metrics programme is a framework that describes the metrics to be collected, how to collect the data, and how to apply the results of analysis.

  • A software quality metrics framework hierarchy begins with the establishment of quality requirements and quality goals.
  • Then, by the assignment of various quality factors, the project team outlines the definitions for each of the quality requirements.
  • Next, direct metrics for each quality requirement are obtained by decomposing each quality factor into measurable attributes. The direct metrics are concrete attributes that provide more useful definitions than quality factors to analysts, designers, programmers, testers, and maintainers.

The decomposition of quality factors into direct metrics facilitates objective communication between management and technical personnel regarding the quality objectives.

Keep the following questions in mind when considering the direct metric for each quality factor and its quality requirement or goal:

  • What is this metric supposed to tell us?
  • What is the theoretical relationship between the characteristic or attribute to be measured and the measurements being taken?
  • Are you taking these particular measurements because they’re the right ones or because they’re convenient?

Beware: Often there is a lack of relationship between the metrics and what we want to measure. This makes the metric gathering process difficult and drawing valid conclusions improbable.

Example Metric and Interpretation

A sample metric that should be easy to gather from the Defect Database would be the number of existing Defects versus their Status over a series of Builds.

The corresponding abbreviated information table for this metric, rendered here item by item (the general description first, then the example for this metric), would be as follows:

  • Name – The name to be given to this metric. Example: Defects vs. Status per Build (Internal Release).
  • Quality Factors – The quality factors that relate to this metric. Example: Stability, Correctness, Completeness.
  • Target Value – The numerical value of the metric that is to be achieved in order to meet quality requirements, including the critical value and range of the metric. Example: zero known defects un-addressed in the defect database; the ideal target is a trend towards zero defects with status “New” and “ReOpened” and, to a lesser extent, “Resolved”.
  • Application / Impact – A description of how the metric is used and what its area of application is, including whether the metric can be used to alter or halt the project (“Can the metric be used to indicate deficient software quality?”). Example: this metric keeps track of the number of defects in each of the available states in the defect database; it can be used as one reflection of the level of quality/stability of the current application, and in the future these values can be used to calculate defect open/close rates.
  • Data Items – The input values that are necessary for computing the metric values. Example: the number of defects with each of the statuses “New”, “Closed”, “Postponed”, “Rejected”, “ReOpened”, and “Resolved”.
  • Computation – An explanation of the steps involved in the metric’s computation. Example: collect the data items for the range of builds to be considered; plot each data item as a series, with builds along the x-axis and the number of defects along the y-axis.
  • Interpretation – An interpretation of the results of the metric’s computation. Example: the number of un-addressed defects (New, ReOpened, Resolved) gives an idea of the current state of the application and of the amount of effort that will be required to meet the target value(s).
  • Considerations – The appropriateness of the metric (e.g., can data be collected for this metric? Is the metric appropriate for this application?). Example: an addressed defect has not necessarily been fixed (e.g., Postponed, Rejected); equivalent minimum time for testers to perform similar test coverage and for programmers to fix defects must be available on each build, or the data collected will be skewed and interpretations flawed.
  • Tools – The software or hardware tools that are used to gather and store data, compute the metric, and analyze the results. Example: the defect database, and an export of the defect database to an Excel spreadsheet.
  • Example – An example of applying the metric. A sample graph of this metric is shown below.

[You would insert a graph that displays the number of defects that are in each status used in the Defect Database across a range of Builds.]

From this graph, observations of the trends of each status can allow conclusions to be drawn about the readiness of the software for external release, and how many more Builds are required and how much development and test effort is required to reach that goal.

– Adapted from IEEE Standard for a Software Quality Metrics Methodology (IEEE Std 1061)
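A minimal version of the computation described above, assuming the defect database is exported to a CSV with “build” and “status” columns (the file and column names are illustrative, and matplotlib is used only to draw the sample graph), might look like:

```python
import csv
from collections import defaultdict

import matplotlib.pyplot as plt

def defects_by_status_per_build(csv_path):
    counts = defaultdict(lambda: defaultdict(int))   # status -> build -> count
    builds = []
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            build, status = row["build"], row["status"]
            if build not in builds:
                builds.append(build)
            counts[status][build] += 1
    return builds, counts

builds, counts = defects_by_status_per_build("defect_export.csv")
for status, per_build in counts.items():
    plt.plot(builds, [per_build.get(b, 0) for b in builds], marker="o", label=status)
plt.xlabel("Build")
plt.ylabel("Number of defects")
plt.title("Defects vs Status per Build")
plt.legend()
plt.show()
```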

Summary

With a number of well-defined metrics measured and recorded over time, the subjectivity of future estimates and software evaluations is greatly reduced. The metrics provide a firm quantitative basis for decision-making.

As just one example, if you knew that in past projects of a certain size and duration the testing effort consisted of X hours with Y test cases, this information would provide a starting point for estimates in future projects. Of course, metrics do not eliminate the need for human judgment in software evaluations; they only provide a starting point for such an estimate.
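For instance, a very rough sketch of that starting point, with invented historical figures, might be:

```python
# Historical projects: (test cases executed, testing hours spent)
history = [(800, 950), (1200, 1500), (500, 640)]

hours_per_case = sum(hours for _, hours in history) / sum(cases for cases, _ in history)

new_project_cases = 700   # estimated number of test cases for the new project
baseline_estimate = new_project_cases * hours_per_case

print(f"~{hours_per_case:.2f} hours per test case historically")
print(f"Baseline effort estimate: {baseline_estimate:.0f} hours (before judgment adjustments)")
```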

Posted in All, Planning for Quality, Test Planning & Strategy

Rapid Test Case Prioritization

It is a common theme in software projects and testing in particular that there is never enough time to do all that you need to do. Given the limited time that you have available, how can you know that you did the best job testing? You know there are always defects left unfound when the application is released. For Testing, the objective is to minimize risks by improving product quality, and this is done in part by constructing a specific set of test cases to put the application through its paces and more.


IEEE Standard 610 (1990) defines a test case as:

  1. A set of test inputs, execution conditions, and expected results developed for a particular objective, such as to exercise a particular program path or to verify compliance with a specific requirement.
  2. (IEEE Std 829-1983) Documentation specifying inputs, predicted results, and a set of execution conditions for a test item.

Of course, you will find it difficult to execute all your test cases on each build of the application during the project lifecycle. But how will you know which test cases must be executed for each build, which should be executed, and which could be executed if you have time?

Prioritize Your Test Cases

Your application doesn’t have to be perfect, but it does need to meet your intended customer’s requirements and expectations. To understand the expectations for your project you need to determine what is important about your application, what the goals are, and what the risks are.

Sue Bartlett discusses this exercise in detail in “How to Find the Level of Quality Your Sponsor Wants”. She comments in that article that: “When we do communicate quality goals ahead of the detailed planning, design or coding, we have a better chance to avoid a quality mismatch at the end. That means meeting schedules, covering costs, and making a profit will have a better chance of success.”

For the purposes of test planning, the organization and scheduling of your test cases for test execution in the context of your project’s build schedule will help achieve these goals. As part of this organization, we are concerned with the prioritization of individual test cases. Grouping your test cases by priority will help you to decide what is to be tested for each type of build and therefore how much time is needed. If you have a limited amount of time, you can see what will fit.

Ross Collard in “Use Case Testing” states that: “the top 10% to 15% of the test cases uncover 75% to 90% of the significant defects.”

Test case prioritization will help make sure that these top 10% to 15% of test cases are identified.

How To Prioritize Test Cases

How many times have you looked at your test cases and been able to easily pick out a small subset that is the most important? The answer is probably “not often”. It is really difficult to stop thinking that “all of these are equally important”.

When it comes to test cases, assigning a priority is not easy and is not necessarily static for the duration of the project. However, we can get started by constructing an example prioritization process to address the first-cut of prioritizing the test cases.

Let us assume that you have just finished creating your test cases from your functional specifications, use cases, and other sources of information on the intended behaviours and capabilities of your application. Now it is time to assign each test case a priority.

Test Case Priorities

First, you must decide what your types of priorities are and what they imply. For our purposes we will begin with an assumption that there is a parallel between the severity of a defect that we might find and the priority of the corresponding test case.

1 – Build Verification Tests (BVTs): Also known as “smoke tests”, these are the set of test cases you want to run first to determine if a given build is even testable. If you cannot access each functional area or perform essential actions that large groups of other test cases depend on, then there is no point in attempting any of those other tests before performing these first tests, as they would most certainly fail.

2 – Highs: These are the set of test cases that are executed the most often to ensure functionality is stable, intended behaviours and capabilities are working, and important error and boundary conditions are tested.

3 – Mediums: This is where the testing of a given functional area or feature gets more detailed, and the majority of aspects of the function are examined, including boundary, error, and configuration tests.

4 – Lows: This is where the least frequently executed tests are grouped. This doesn’t mean that the tests are not important but just that they are not run often in the life of the project – such as GUI, Error Messages, Usability, Stress, and Performance tests.

We have chosen to group test cases into one of four categories: BVTs, Highs, Mediums, and Lows. The trick now is to figure out which test cases belong to which priority. After all, the priority will indicate which test cases are expected to be executed more often and which are not.

How To Go About Prioritizing

1) Arbitrary Assignment: These first three steps will leave you with an arbitrary grouping of the test cases, based on the idea that if you don’t have enough time to test everything, you should at least make sure all the product requirements have been confirmed to do what they are supposed to under assumed good conditions. If you stop to think about what each test case is testing, they all start to seem equally important, so just:

  1. Label all your Functional Verification (or Happy Path) tests as High Priority.
  2. Label all your Error and Boundary or Validation tests as Medium Priority.
  3. Label all your Non-Functional Verification tests such as Performance and Usability as Low Priority.

2) Promotion and Demotion: Not all functional tests are equally important, and the same is true for boundary and non-functional tests. Think about the importance of the test and how often you would want to check this functionality relative to others of the same priority – consider the quality goals and requirements of your project.

  1. Divide the Functional Verification tests into two groups of Important and Not Quite As Important.
  2. Demote the “Not Quite As Important” Functional Verification tests to Medium Priority.
  3. Divide the Error and Boundary tests into two groups of Important and Not Quite As Important.
  4. Promote the “Important” Error and Boundary tests to High Priority.
  5. Divide the Non-Functional tests into two groups of Important and Not Quite As Important.
  6. Promote the “Important” Non-Functional tests to Medium Priority.
  7. Repeat the divide and promote/demote process for each set of High, Medium, and Low Priority test cases until you reach a point where the number of test cases being moved between priorities has become minimal.

3) Identify Build Verification Tests: Now, which tests must be checked with every build to ensure that the build is testable and ready for the rest of the team to start testing?

  1. Divide the High Priority tests into two groups of Critical and Important.
  2. Promote the “Critical” High Priority tests to BVT Priority.

Note: Do not identify BVTs first! BVTs are a selection of High priority test cases that are determined to be critical to the system and to testing.

At the end of this process, a rule of thumb is to check that the percentage distribution of the priorities is along the lines of BVTs 10-15%, Highs 20-30%, Mediums 40-60%, and Lows 10-15%.
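A quick way to sanity-check your first cut against that rule of thumb is to tally the priorities and compare the percentages to the suggested ranges. The ranges below come straight from the rule of thumb above; the sample test case counts are illustrative.

```python
from collections import Counter

EXPECTED_RANGES = {"BVT": (10, 15), "High": (20, 30), "Medium": (40, 60), "Low": (10, 15)}

def check_distribution(priorities):
    counts = Counter(priorities)
    total = len(priorities)
    for prio, (low, high) in EXPECTED_RANGES.items():
        pct = 100 * counts.get(prio, 0) / total
        flag = "ok" if low <= pct <= high else "review"
        print(f"{prio:6s}: {pct:5.1f}%  (expected {low}-{high}%)  {flag}")

# Example: priorities assigned to a small suite after the promote/demote passes.
check_distribution(["BVT"] * 12 + ["High"] * 25 + ["Medium"] * 50 + ["Low"] * 13)
```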

When promoting and demoting test cases, aspects to consider include how frequently the user will require the feature or functionality, and how critical the behaviour is to the user’s day-to-day or month-end activities. Robyn Brilliant provides a list in Test Progress Reporting using Functional Readiness that you could apply when considering test cases for promotion or demotion:

Using a scale from one to five, with one being the most severe and five the least severe, quantify the Reliability Risk as follows:

  1. Failure of this function would impact customers.
  2. Failure of this function would have a significant impact to the company.
  3. Failure of this function would cause a potential delay to customers.
  4. Failure of this function would have a minor impact to the company.
  5. Failure of this function would have no impact.

This and similar scales can aid you in arriving at your final first cut of test case priorities.
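If you use a scale like this, the mapping into the priority groups can be as mechanical as the sketch below. The cut-off points are an assumption made for illustration; they are not part of the original scale.

```python
def priority_from_reliability_risk(risk: int) -> str:
    """Map a 1 (most severe) to 5 (no impact) reliability risk onto a first-cut priority."""
    if risk <= 2:
        return "High"     # BVT candidates are later promoted from this group
    if risk == 3:
        return "Medium"
    return "Low"

for feature, risk in [("Checkout", 1), ("Monthly statement", 3), ("Theme picker", 5)]:
    print(feature, "->", priority_from_reliability_risk(risk))
```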

Summary

This is a simplified example of a test case prioritization process. However, it can serve you well as a basis for rapidly organizing your test cases and for mapping your test schedule, effort, and which test cases are executed when into the project plan.

Remember, how you prioritize your testing tasks and the test cases to be executed will depend on where you are in your project cycle. It is likely that you will re-prioritize your test cases as you move towards release and as you determine by investigation and observation where the risks and defects are manifesting. Establishing your testing objectives up front for each phase and making sure they are reflected in the individual priorities of your test cases will make your life a lot easier when it comes time to explain and execute your plan.

Finally, having prioritized test cases also gives you a good starting place for a potentially pending automation project. For example: automate the BVT priority test cases, measure the benefits, improve the automation, then automate the High priority test cases, and so on.


Posted in All, Planning for Quality, Test Planning & Strategy

Evaluating UI Without Users

User involvement is one of the major factors in designing UI for software. However, as much as we would like the users to be involved, user devotion to a project is never a free or unlimited resource. To maximize the benefit of their involvement, the design should be as free as possible of trivial bugs so that the users do not have to waste time encountering and overcoming these issues during the evaluation. [Task-Centered User Interface Design – A Practical Introduction, C. Lewis, J. Rieman, 1994]

Also, performing an evaluation with just user participants will not reveal all types of issues. For example, testing an interface used by thousands of individuals with only a few users will not uncover problems that those evaluating users and the tests they perform don’t happen to encounter. It also won’t uncover problems that users might have after they get more experience with the system.

Human-Computer Interaction (HCI)

Software is created for users. Human-Computer Interaction (HCI) is a multidisciplinary study of people, computer technology, and how they influence each other. Not only does it touch on the technology and design, it also involves the studies of cognitive psychology and sociology. Understanding HCI principles and applying them to the UI design will make the software more usable for people.

Understanding Human Factors

Human factors – human abilities, human limitations, and other human characteristics from a physical and psychological perspective that are relevant to the design, operations, and maintenance of complex systems. [Northrop Grumman]

It’s a common mistake for software projects to focus more on the technology aspects and neglect the relevant human factors and cognitive impact of the UI design.

“Humans are limited in their capacity to process information.” [Human-Computer Interaction, Dix, Finlay, Abowd and Beale, 1998] These limitations refer to the capabilities of the mental processes used when gathering knowledge. UI designers must make sure to take these limitations into consideration when performing their work.

Memory

  • Short-term memory is limited in capacity (~7 units) and it can be maintained for about 20 to 30 seconds.
  • Long-term memory is unlimited in capacity and it is permanent in duration.
  • It is faster to retrieve frequently accessed information from long-term memory than less frequently accessed information.

Perception

  • Human perception is selective. When a lot of information is presented, our brain filters what information to take in. It is easier to recognize and interpret objects that are familiar from memory.

Learning

  • Familiar, structured and organized information is easier to process.
  • Smaller units of information are easier to learn.
  • Because short term memory is limited, we select what we want to learn or store in our long term memory.

Know Your Users

Although we share the same model of information processing system (the human brain), different users have different ways of learning and form different conceptual and mental models about what they see. Therefore, it is important to know:

  • Potential users’ job and tasks.
  • Potential users’ knowledge and experience.
  • Potential users’ physical and psychological characteristics.
  • Potential users’ physical environment.

Human Factors Evaluation

The human factors in HCI can be used as the driving force behind a component-based review in producing a more ‘user friendly’ UI.

Grouping

  • Group information – A group is treated as a single unit in short-term memory.
  • Place commonly used buttons closer and make them larger in order to minimize hand and eye movement.

E.g., font style, font type, font size, and alignment controls are grouped in the Formatting toolbar.

Relationship

  • Provide visual structure and organization of objects on screen.
  • Use pattern recognition.
  • Use relationship that is familiar to the user.
  • Use metaphors and knowledge that can be transferred from the real world, but do not duplicate real-world limitations in the software.

E.g. The graphics used for Play, Stop, Pause, etc. buttons in a CD Player software are similar to the ones used on an actual CD Player or tape deck.

Clarity

  • Avoid using terms that are hard to distinguish in sound.
  • Map one control to only one piece of functionality.
  • Avoid making the user learn more than what is necessary to perform certain functions.
  • Avoid testing the user’s intelligence.
  • Minimize the need for the user to memorize information by displaying information on screen for as long as the user needs it.

E.g. On a multi screen application, buttons with the same name should perform the same functionality.

Cues

  • Provide visual cues.
  • Use pictures or icons – A picture is worth a thousand words.
  • Use sounds to get the user’s attention.
  • Make information and controls visible.

E.g. Buttons are disabled instead of invisible when they are not available for use.

Cognitive Walkthrough

In addition to the more general design evaluation using human factors, another way to evaluate the UI is by performing a cognitive walkthrough. The cognitive walkthrough is a technique for evaluating the design of a user interface, with special attention to how well the interface supports “exploratory learning,” i.e., first-time use without formal training. [Usability Evaluation with the Cognitive Walkthrough, John Rieman, Marita Franzke, and David Redmiles]. Walkthroughs should be done when the interface begins to grow and when components begin to interact with each other.

Lewis and Rieman suggested the following information is needed for a walkthrough:

  • A description or a prototype of the interface.
  • A task description.
  • A complete list of actions needed to complete the task.
  • An idea of who the users will be.

During a walkthrough, the evaluator will perform the task using the prototype given. This is a good way of imagining the user’s thoughts and actions when using the interface.

Heuristics Evaluation

In addition to the cognitive approach to evaluating the UI, another approach is heuristic evaluation. Heuristic evaluation involves having a small set of evaluators (not users) examine the interface and judge its compliance with recognized usability principles. [Heuristic Evaluation, Jakob Nielsen]

Here are some recommended heuristics:

  • Be consistent.
  • Provide a clear exit.
  • Prevent errors if possible.
  • Provide clear error messages in easy-to-understand language.
  • Use familiar language and logic so that the interface is easier to learn and understand. Avoid using technical jargon.
  • Display information on screen for as long as the user needs it.
  • Provide feedback to the user within a reasonable timeframe.
  • Account for both experienced and inexperienced users.
  • Irrelevant or rarely used information should not be displayed unless the user asks for it.
  • Provide documentation.
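A minimal way to record such an evaluation is to have each evaluator log findings against the heuristics and then aggregate them. The sketch below assumes a simple list of (evaluator, heuristic, finding) tuples and is illustrative only.

```python
from collections import defaultdict

findings = [
    ("Evaluator A", "Be consistent", "OK and Apply buttons behave differently on two screens"),
    ("Evaluator B", "Be consistent", "OK and Apply buttons behave differently on two screens"),
    ("Evaluator B", "Provide a clear exit", "No Cancel button on the import wizard's last step"),
]

by_heuristic = defaultdict(set)
for _evaluator, heuristic, finding in findings:
    by_heuristic[heuristic].add(finding)

for heuristic, issues in by_heuristic.items():
    print(f"{heuristic}: {len(issues)} unique issue(s)")
    for issue in sorted(issues):
        print(f"  - {issue}")
```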

Summary

A good non-user evaluation, or expert review, using established usability assessment principles can catch problems that an evaluation with only a few users may not reveal. If some key evaluation and design guidelines are followed, the critical problems can be detected and resolved.

Of course, performing just an evaluation without users won’t uncover all the problems either. Once the evaluation without users is complete, and appropriate design changes are made, the next step will be to get the users to participate.

Posted in All, Test Planning & Strategy