Data-Driven Testing

One risk in any application project is not being able to do enough testing in the time available. Another common risk is failing to get at the critical components behind the GUI to ensure that they are as bug-free as possible before and after integration with the rest of the system.

Black box testing can tell you whether the GUI layer is functioning as it should by taking in user input and responding with a result. But if there is an error, how do we know which layer (GUI, application, database, etc.) is causing it? What if our black box tests don’t reveal any database or application layer bugs? Does that mean there aren’t any? What if the GUI layer is masking the errors?

The tester needs to avoid the risk of automating a fluctuating GUI and at the same time provide the volume of tests needed to ensure that the application and database layers are stable and functioning.

Test Data Creation

To begin performing data-driven testing, you first need to create good test data. Sometimes this data is available from previous testing efforts or from users of previous versions of the application, or it can be obtained by purchasing test suites from a vendor. More often than not, the test data needs to be created from scratch in the specific context of the application to be tested and the test plans to be executed against it.
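
As a minimal sketch of creating test data from scratch, the following uses Python's built-in sqlite3 module. The table name, the column names (account_id, amount), and the sample rows are hypothetical and chosen only for illustration; a real schema would follow the application's own message fields.

```python
import sqlite3

# Hypothetical schema: these field names are illustrative, not from any real application.
SCHEMA = """
CREATE TABLE IF NOT EXISTS test_cases (
    id              INTEGER PRIMARY KEY,
    account_id      TEXT,
    amount          TEXT,
    expected_result TEXT NOT NULL
)
"""

def create_test_data(conn):
    """Create the table, load a few hand-written rows, and return the row count."""
    conn.execute(SCHEMA)
    rows = [
        ("ACC-001", "100.00", "OK"),     # valid input
        ("", "100.00", "ERROR"),         # missing required field
        ("ACC-001", "abc", "ERROR"),     # non-numeric amount
    ]
    conn.executemany(
        "INSERT INTO test_cases (account_id, amount, expected_result) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()
    return len(rows)
```

Each row pairs its inputs with the result the application is expected to return, so the data can later drive the comparison step without further hand editing.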

Sending the Message

Once the database is populated with your initial data, you will want to take that data and construct a message, command, or request to send to the application that you are testing.

Constructing the message can be as simple as extracting the information from the database and putting it together in the order you need. Or there could be additional, more complex, logic required to construct the message that the application needs to receive. Working with scripts and databases together can give you a very versatile and useful solution.
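
For the simple case, where the message is just the row's fields placed in order, a sketch might look like the following. The root element name "request" and the bookkeeping column names are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

def build_message(row):
    """Turn one test-data row (a dict) into an XML request string."""
    root = ET.Element("request")  # hypothetical root element name
    for field, value in row.items():
        if field in ("id", "expected_result"):
            continue  # bookkeeping columns are not part of the message
        ET.SubElement(root, field).text = "" if value is None else str(value)
    return ET.tostring(root, encoding="unicode")

message = build_message(
    {"id": 1, "account_id": "ACC-001", "amount": "100.00", "expected_result": "OK"}
)
# message == '<request><account_id>ACC-001</account_id><amount>100.00</amount></request>'
```

More complex construction logic, such as computed fields or nested elements, would replace the loop body while the database row remains the single source of input values.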

Getting Your Results

To get the results of sending your message to the application, you need to be able to collect the result codes or expect your application to send back a response. Assuming the latter is the case, your test tool would wait for that response, process it as necessary and then compare aspects of the response to the value in an Expected Results column in your tool’s database. Depending on the result, a Pass or Fail can be determined and a log entry created with as much detail as you care to capture.
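
The comparison step could be sketched as below, assuming the application replies with XML that carries its outcome in a status element. That response shape and the logger name are assumptions for illustration only.

```python
import logging
import xml.etree.ElementTree as ET

log = logging.getLogger("ddt")  # hypothetical logger name

def evaluate(case_id, response_xml, expected):
    """Compare the <status> of a response with the expected value and log the verdict."""
    try:
        status = ET.fromstring(response_xml).findtext("status")
    except ET.ParseError:
        status = None  # a malformed response counts as a failure
    verdict = "Pass" if status == expected else "Fail"
    log.info("case=%s expected=%s actual=%s verdict=%s", case_id, expected, status, verdict)
    return verdict
```

Logging the expected and actual values alongside the verdict keeps enough detail to diagnose a failure without rerunning the test.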

Example Testing Problem

To pull all of what we have talked about so far into a tangible example, let us imagine a certain application that we want to test.

The application is a multi-tier web application that passes information from its User Interface layer to its Application Layer, via XML messaging. These XML messages are made up of a number of fields of all different types. The application takes these messages, processes them, and decides what the appropriate response is – return the information requested, perform the required action, etc.

If we assume that a typical message contains 20 data elements or fields, and that each data element needs to be tested with four different values (valid, error, upper field length boundary, and lower field length boundary), then covering every combination requires over one trillion tests.

4 to the 20th power = 1,099,511,627,776
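
To put that figure in perspective, here is a small calculation. The rate of one test per second is an assumption picked purely for illustration.

```python
def exhaustive_count(values_per_field, fields):
    """Number of combinations when every field takes every value."""
    return values_per_field ** fields

total = exhaustive_count(4, 20)

# Assumed rate: one test per second, running around the clock.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60
years_at_one_per_second = total / SECONDS_PER_YEAR
```

At that assumed rate, the exhaustive run would take tens of thousands of years.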

This number of fields is not unreasonable, and the different values we mentioned above do not include every case that may need to be considered depending on the constraints for the message’s data elements. Are all the fields required? Can they be blank? What about surrounding whitespace characters, are they stripped away for each field? What about fields that take codes like ON/OFF or RED/GREEN/BLUE/BLACK/WHITE? Each field will need a certain number of tests based on its constraints.

This is where you need to make use of Equivalence Classes and other test planning techniques to determine the minimum number of tests for each field. For example, if we can assume that we only need to vary the input of 15 fields out of the 20, and that each field contains only either valid data or invalid data, the number of combinations of test data we now need is much less.

2 to the 15th power = 32,768 (still a lot of tests)
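
Generating the combinations themselves is straightforward once the equivalence classes are fixed. The sketch below uses itertools.product; the three field names are hypothetical and kept few so the output stays small.

```python
from itertools import product

def generate_cases(fields, classes=("valid", "invalid")):
    """Yield one dict per combination, assigning an equivalence class to each field."""
    for combo in product(classes, repeat=len(fields)):
        yield dict(zip(fields, combo))

# Three hypothetical fields give 2**3 = 8 rows; 15 fields would give 2**15.
cases = list(generate_cases(["account_id", "amount", "currency"]))
```

Each generated dict names a class per field, not a concrete value. A further step would map "valid" and "invalid" to representative data for that field before the row is stored.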

This is obviously a coarse example, but it makes the point. What if you had 30 potential data elements in your message? Generating this amount of data, executing the tests, and reviewing the results takes a huge amount of time.

Always make sure that you have well-designed test cases before undertaking your automated testing. A little upfront analysis and planning can save you a lot of work.

Defining Your Test Tool

Before creating or purchasing any software application or test tool, the first thing one must do is to collect and enumerate the requirements.

In general the tool needs to:

  • Create and/or populate a database based on a predefined schema, including an “Expected Results” column that holds the expected outcome for each set of data elements created
  • Create a ‘message’ or formatted string, tab-delimited or in XML, to send to a network location where the application under test is waiting for connections and incoming requests or data
  • Send the message to the network location monitored by the application being tested
  • Listen for a response from the application
  • Compare the response to the value in the appropriate row in the data table
  • Report on the success or failure of the test based on the result of the comparison of the actual response received and the corresponding value in the expected result column
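
The send, listen, compare, and report requirements above can be sketched as a small loop. This is only an illustration under stated assumptions: the test_cases table with payload and expected_result columns is hypothetical, the response is compared as a whole string, and the TCP helper assumes the application replies once the client has finished sending and then closes the connection.

```python
import socket
import sqlite3

def send_over_tcp(host, port, message, timeout=5.0):
    """Send one message and return whatever the application sends back."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(message.encode("utf-8"))
        sock.shutdown(socket.SHUT_WR)  # signal that the request is complete
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:  # the peer closed the connection
                break
            chunks.append(chunk)
    return b"".join(chunks).decode("utf-8")

def run_tests(conn, send):
    """Run every row in test_cases through `send` and record Pass or Fail."""
    results = []
    query = "SELECT id, payload, expected_result FROM test_cases ORDER BY id"
    for case_id, payload, expected in conn.execute(query):
        try:
            actual = send(payload)
        except OSError as exc:
            actual = f"<no response: {exc}>"  # network errors count as failures
        verdict = "Pass" if actual == expected else "Fail"
        results.append((case_id, verdict, actual))
    return results
```

Passing the transport in as a function keeps the loop independent of the network. A call such as run_tests(conn, lambda m: send_over_tcp("localhost", 9000, m)), with a placeholder host and port, would exercise a live application, while a stub function can stand in during development of the tool itself.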

The basic requirements are not quite complete, however. The tool would be much more useful if it could execute thousands or millions of tests. A run of that size could take a long time to complete, and the tests may need to run at times of day when usage of the application or network traffic is low. Both needs require that the tool be able to run unattended and, ideally, start at a scheduled time.
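
One simple way to start at a scheduled time is to compute how long to wait for the next low-usage window and sleep until then. The sketch below shows only that calculation; the 02:00 start time in the usage note is an arbitrary example.

```python
import datetime as dt

def seconds_until(run_at, now):
    """Seconds from `now` until the next occurrence of the clock time `run_at`."""
    target = dt.datetime.combine(now.date(), run_at)
    if target <= now:
        target += dt.timedelta(days=1)  # today's slot has passed; use tomorrow's
    return (target - now).total_seconds()
```

A tool could call time.sleep(seconds_until(dt.time(2, 0), dt.datetime.now())) before starting its run. In practice, an operating system scheduler such as cron or Windows Task Scheduler is often the simpler choice.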

But remember, whether you are building your own tool or buying one, keep it simple. After all, this isn’t the product you are trying to ship. The first priority is to make the test process work better than it does right now. You can add other bells and whistles later.


The framework required for any basic data-driven testing tool is made up of three main components: file input/output for reading configuration files and writing reports, a database for storing the test data, and an engine that extracts the data from the database and turns it into meaningful directives and requests to the system under test.
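
The file input/output component can stay very small. As a sketch, the following reads INI-style settings with configparser and writes a CSV report. The section name, keys, and report columns are hypothetical.

```python
import configparser
import csv
import io

def load_settings(text):
    """Parse INI-style settings; the section and key names are hypothetical."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    target = parser["target"]
    return target.get("host"), target.getint("port")

def write_report(results, stream):
    """Write (case_id, verdict) pairs as CSV and return the number of failures."""
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerow(["case_id", "verdict"])
    writer.writerows(results)
    return sum(1 for _, verdict in results if verdict == "Fail")

host, port = load_settings("[target]\nhost = localhost\nport = 9000\n")
report = io.StringIO()  # a real run would open a file here instead
failures = write_report([(1, "Pass"), (2, "Fail")], report)
```

Keeping the settings and the report in plain files means an unattended run can be reconfigured and reviewed without touching the tool's code.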

And that’s that. With a tool composed of these core components, you can send potentially trillions of messages to an application for testing purposes – far more than you could, or would ever want to, send manually.

About Trevor Atkins

Trevor Atkins (@thinktesting) has been involved in hundreds of software projects over the last 20+ years and has a demonstrated track record of achieving rapid ROI for his customers and their business. Experienced in all project roles, Trevor’s primary focus has been on planning and execution of projects and improvement of the same, so as to optimize quality versus constraints for the business.