Test data provisioning for software development

Test data engineers have the honorable task of giving teams access to the right test data at the right time. That task isn’t easy, because provisioning test data is often a time-consuming process.

The three main frustrations in data provisioning are 1) having to create test data manually, 2) wasting time searching for test data and 3) having too much data. Luckily, these challenges can easily be solved with the right tools.

What is your main challenge?

I have to create my test data manually

Most test data requests take too much time because refreshes are processed manually. There are too many people involved in the test data provisioning process. Long waiting times have a negative impact on your time to market.

With a self-service portal and/or an automation tool, you can refresh test data on demand. The refresh can even be integrated directly into your CI/CD pipeline. Automating your test data processes saves a lot of time.

I am wasting time searching for test data

Searching for test data can take up to 50% of your testing time. You need specific test data to cover your system under test properly, but you’re wasting valuable time searching for it.

Finding test data quickly that is aligned with your test cases will have a massive impact on the quality of your software release and the time to deliver it.
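As a rough illustration of aligning test data with test cases, the sketch below expresses a test case as a set of field predicates and filters records against them. The record layout, the field names and the find_test_data helper are all hypothetical and only serve the example; in practice the records would come from a test database rather than an in-memory list.

```python
from typing import Any, Callable

Record = dict[str, Any]
Criteria = dict[str, Callable[[Any], bool]]

# Hypothetical records; a real setup would query the test database instead.
CUSTOMERS: list[Record] = [
    {"id": 1, "country": "NL", "age": 17, "has_open_order": False},
    {"id": 2, "country": "NL", "age": 34, "has_open_order": True},
    {"id": 3, "country": "DE", "age": 52, "has_open_order": True},
    {"id": 4, "country": "NL", "age": 41, "has_open_order": False},
]


def find_test_data(records: list[Record], criteria: Criteria) -> list[Record]:
    """Return the records for which every criterion's predicate holds."""
    return [
        record
        for record in records
        if all(
            field in record and check(record[field])
            for field, check in criteria.items()
        )
    ]


# Test case: "an adult Dutch customer with an open order".
adult_nl_open_order: Criteria = {
    "country": lambda value: value == "NL",
    "age": lambda value: value >= 18,
    "has_open_order": lambda value: value is True,
}

matches = find_test_data(CUSTOMERS, adult_nl_open_order)
print([record["id"] for record in matches])  # prints [2]
```

Keeping the criteria next to the test case makes it visible which data a test depends on, so the lookup can be repeated after every refresh.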

I have too much data

Production databases are growing faster than ever before. When you use full-size copies of production, your test and development environments grow as well. As a result, it may take days or even weeks of extra time to restore these environments.

Using reduced sets (subsets) of test data shortens the time needed to create, refresh or restore these environments, so you can test and develop more in the same amount of time.
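To make the idea of a subset concrete, here is a minimal sketch using Python's built-in sqlite3 module. It assumes a hypothetical two-table schema (customers and orders) and a simple "every n-th customer" selection rule; neither comes from a specific tool. The point it illustrates is that a subset should copy the selected parent rows together with their child rows, so that foreign keys still resolve.

```python
import sqlite3

# Hypothetical schema, used for both the full source and the subset.
SCHEMA = """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id),
        total REAL
    );
"""


def build_source() -> sqlite3.Connection:
    """Create an in-memory stand-in for a full-size production copy."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(i, f"customer-{i}") for i in range(1, 101)],
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(i, (i % 100) + 1, 10.0 * i) for i in range(1, 501)],
    )
    return conn


def subset(source: sqlite3.Connection, every_nth: int) -> sqlite3.Connection:
    """Copy every n-th customer plus that customer's orders into a new database."""
    target = sqlite3.connect(":memory:")
    target.executescript(SCHEMA)
    customers = source.execute(
        "SELECT id, name FROM customers WHERE id % ? = 0", (every_nth,)
    ).fetchall()
    target.executemany("INSERT INTO customers VALUES (?, ?)", customers)
    ids = [row[0] for row in customers]
    if ids:
        # Fine for a small sketch; large id lists would need batching or a join.
        placeholders = ",".join("?" * len(ids))
        orders = source.execute(
            "SELECT id, customer_id, total FROM orders "
            f"WHERE customer_id IN ({placeholders})",
            ids,
        ).fetchall()
        target.executemany("INSERT INTO orders VALUES (?, ?, ?)", orders)
    target.commit()
    return target


small = subset(build_source(), every_nth=10)
customer_count = small.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
order_count = small.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(customer_count, order_count)  # prints 10 50
```

With this rule the subset holds a tenth of the rows of each table, and every order still points at a customer that exists in the subset.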

What is test data provisioning?

Let’s start by answering the question “What is test data provisioning?” Based on our experience, we have formulated the following answer:

“Test data provisioning is the process of making test data accessible and available to users in an orderly, secure – and preferably automated – way.”

Test data provisioning is an important part of the software development process because it allows developers to simulate real-world scenarios and identify any issues with the software before it is released to the public.


How to: test data provisioning

There are a few different approaches to test data provisioning. One approach is to use synthetic data, which is data that is generated by a computer program. This data is typically designed to mimic real-world data, but it may not be a perfect representation of it. Another approach is to use real-world data, which is data that has been collected from actual users or sources. This data is often more accurate and realistic than synthetic data, but it may be more difficult to obtain and manage.
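The synthetic approach can be sketched in a few lines. The example below is only an illustration under stated assumptions: the name lists, the field names and the generate_customers function are made up for this sketch, and a fixed seed is used so that the same data set can be regenerated for every test run.

```python
import random

# Hypothetical value pools; real generators usually draw from richer sources.
FIRST_NAMES = ["Anna", "Bram", "Chloe", "Daan", "Eva"]
LAST_NAMES = ["Jansen", "Bakker", "Visser", "Smit", "Meijer"]


def generate_customers(count: int, seed: int = 0) -> list[dict]:
    """Generate reproducible, realistic-looking customer records."""
    rng = random.Random(seed)  # a local generator keeps runs repeatable
    customers = []
    for customer_id in range(1, count + 1):
        first = rng.choice(FIRST_NAMES)
        last = rng.choice(LAST_NAMES)
        customers.append(
            {
                "id": customer_id,
                "name": f"{first} {last}",
                # The id in the address keeps emails unique across records.
                "email": f"{first}.{last}{customer_id}@example.test".lower(),
                "birth_year": rng.randint(1950, 2005),
            }
        )
    return customers


for customer in generate_customers(3):
    print(customer)
```

Because no real person is behind these records, they can be shared freely, but they only resemble production data as far as the generator's rules go.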

Test data provisioning can be a complex and time-consuming process, so many organizations use specialized tools and services to help manage it. These tools can help automate the process of creating and managing test data, and they can also provide features such as data masking, which allows developers to protect sensitive data while still being able to test the software.
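Data masking can take many forms. The sketch below shows one of them, deterministic pseudonymization with a keyed hash, and is not a description of how any particular product masks data. The secret, the field names and the mask_record helper are assumptions for the example. Because the same input always yields the same masked value, records that shared a value before masking still share one afterwards, which keeps joins between tables intact.

```python
import hashlib
import hmac


def mask_value(value: str, secret: bytes, length: int = 12) -> str:
    """Return a stable, keyed token for a sensitive value."""
    digest = hmac.new(secret, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:length]


def mask_record(record: dict, secret: bytes) -> dict:
    """Mask the (hypothetical) sensitive fields and leave the rest untouched."""
    masked = dict(record)
    masked["name"] = "name_" + mask_value(record["name"], secret)
    masked["email"] = mask_value(record["email"].lower(), secret) + "@example.test"
    return masked


SECRET = b"keep-this-out-of-version-control"  # placeholder key for the sketch

original = {"id": 7, "name": "Anna Jansen", "email": "anna@corp.example", "total": 99.5}
masked = mask_record(original, SECRET)
print(masked["id"], masked["total"])  # prints 7 99.5
```

Non-sensitive fields such as the id and the order total pass through unchanged, so the masked record still behaves like the original in tests.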

In summary, test data provisioning is the process of creating and managing data sets that are used for testing software during the development process. It is an important part of the software development process because it allows developers to identify any issues with the software before it is released. There are different approaches to test data provisioning, and specialized tools and services can help automate and manage the process.


Test data deliverability

A lot has been said and written about how to create proper test data sets, but delivery often gets too little attention. Many organizations face challenges in the area of test data delivery. Most teams don’t have the ability to refresh their (test) data sets themselves. They have to request a database refresh from the DBA, which often takes a lot of time and causes considerable frustration.

Test data distribution is an important part of Test Data Management. In our view, Test Data Management consists of various parts or steps, whose order may vary.


Easy test data

Software has to be delivered faster and more efficiently. Waiting for the DBA to deliver a database refresh is precious time lost, since a refresh can take anywhere from several days to weeks or even months. Waiting also frustrates Software (Quality) Engineers: everyone is waiting for test results, the team lacks the data to run the tests, and the pressure keeps increasing. Teams then look for quick (but not necessarily high-quality) workarounds, such as manually creating test data or test cases themselves, just to be able to test the software and give the business some results, reliable or not, about the delivered product.

Test data availability

Software development teams can only do their job properly when they can refresh test data themselves and have it easily available. One of the biggest bottlenecks is that DBAs are reluctant to hand over database control to Software (Quality) Engineers, afraid they will break something or, worse, destroy it. At the same time, database refreshes are not a DBA’s main task, so he or she doesn’t want to spend much time on them, which results in long waiting times for software development teams.

Pros of easy test data availability:

  • Manual generation of test data is a thing of the past
  • No more time wasted searching for test data
  • Savings on waiting time and storage costs

Test data portal

In summary, both software development teams and DBAs benefit from easy test data distribution. But how do we get this done? Simple: with a test data portal, a platform that both DBAs and Software (Quality) Engineers can access. Both can log in to the portal and refresh their own test data sets at the click of a button. The only waiting time is the actual (technical) data processing time. DBAs are not constantly disturbed, and Software (Quality) Engineers don’t have to wait, because they control the processes themselves.

On-demand distribution

At the backend of this test data platform, DATPROF Runtime, certain projects (also called templates) are installed. These projects can be for example data masking (DATPROF Privacy) or data subsetting (DATPROF Subset) templates. In most cases the templates are developed by a Software Quality team member with decent database knowledge, working closely together with the DBA. After the installation (mostly done by the DBA) the projects can be used by the selected teams.

The DBA can grant access and monitor the processes that are being executed. The DBA is also relieved of the non-primary tasks that used to land on their desk every once in a while, so it is a win-win situation. Depending on your Test Data Architecture, teams can access and refresh their own database (subset) at the click of a button, and the DBA can be sure that databases won’t get corrupted.

Automated distribution

To make test data delivery even more interesting, this entire self-refresh process can be automated, too. With the help of an API, the test data platform can be integrated with nearly every orchestration tool or other tool you might use for software testing. Customers use our test data platform, DATPROF Runtime, to refresh databases several times a week or even multiple times a day, instead of manually generating test data under the pressure placed on the test team. With such a test data portal you can easily obtain proper, production-like test data. This way, test data acts as an accelerator instead of a bottleneck in your CI/CD pipeline.
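A pipeline step that triggers a refresh over HTTP could look roughly like the sketch below. It is a generic illustration only: the /refreshes endpoint, the JSON fields, the bearer token and the host name are hypothetical and do not describe the actual DATPROF Runtime API, so consult the product documentation for the real calls. The send function is injectable so the step can be exercised without a network connection.

```python
import json
import urllib.request
from typing import Callable


def build_refresh_request(
    base_url: str, template: str, environment: str, token: str
) -> urllib.request.Request:
    """Build a POST request asking a (hypothetical) portal to run a template."""
    payload = json.dumps({"template": template, "environment": environment})
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/refreshes",  # hypothetical endpoint
        data=payload.encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


def trigger_refresh(
    request: urllib.request.Request,
    send: Callable = urllib.request.urlopen,
) -> int:
    """Send the request and return the HTTP status code."""
    with send(request) as response:
        return response.status


# Build the request only; no network call is made here.
request = build_refresh_request(
    base_url="https://tdm.example.test/api/",  # placeholder host
    template="subset-and-mask",                # hypothetical template name
    environment="team-a-test",                 # hypothetical target environment
    token="dummy-token",
)
print(request.get_method(), request.full_url)
# prints POST https://tdm.example.test/api/refreshes
```

In a CI/CD pipeline, a step like trigger_refresh(request) would run before the test stage, so that each test run starts from a freshly provisioned data set.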

Test data provisioning tools

Discover & Learn

Discover your data and gain insight into its quality by profiling and analyzing your application databases.


Subset & Reduce

Subset the right amount of test data and reduce the storage costs and wait times for new test environments.


Provision & Automate

Provide each team with the right test data using the self-service portal, or automate test data provisioning with the built-in API.



Virtualize & Control

Quickly and easily clone your databases without claiming extra storage. Roll back to a previous state if needed.

