Test data provisioning
for software development
Test data engineers have the honorable task of giving teams access to the right test data at the right time. But this task isn’t easy to execute, since the provisioning of test data is often a time-consuming process.
The three main frustrations in the field of data provisioning are 1) getting test data manually, 2) wasting time searching for test data and 3) having too much data. Luckily, these challenges can be solved with the right tools.
What is your main challenge?
I have to create my test data manually
Most test data requests take too much time because refreshes are processed manually. There are too many people involved in the test data provisioning process. Long waiting times have a negative impact on your time to market.
With a self-service portal and/or an automation tool, you’re able to refresh on demand. The refresh can even be integrated directly into your CI/CD pipeline. Automating your test data processes helps you save a lot of time.
I am wasting time on searching for test data
Searching for test data can take up to 50% of your testing time. You need specific test data to cover your system under test properly, but you’re wasting valuable time searching for it.
Quickly finding test data that is aligned with your test cases will have a massive impact on the quality of your software release and the time it takes to deliver it.
I have too much data
Production databases are growing faster than ever before. When you use full-size copies of production, your test and development environments grow as well. As a result, it may take days or even weeks of extra time to restore these environments.
Using reduced sets (subsets) of test data reduces the time to create, refresh or restore these environments, so you can test and develop more in the same amount of time.
What is test data provisioning?
Let’s start by answering the question “What is test data provisioning?” Based on our experience, we have formulated the following answer:
“Test data provisioning is the process of making test data accessible and available to users in an orderly, secure – and preferably automated – way.”
A lot is being said and written about how to create proper test data sets, but the delivery of that data often gets far less attention. Many organizations face challenges in the area of test data delivery. Most teams don’t have the ability to self-refresh their (test) data sets. They have to request a database refresh from the DBA, which often takes a lot of time and causes considerable frustration.
Easy test data
Software delivery has to be faster and more efficient. Waiting for the DBA to deliver a database refresh is precious time lost, since a refresh can take anywhere from several days to weeks or even months. Waiting for a test data refresh also frustrates Software (Quality) Engineers. Everyone is waiting for the test results, but the team doesn’t have the proper data to execute the tests, and the pressure keeps increasing. Teams then look for quick (but not necessarily high-quality) workarounds, such as manually generating test data or test cases themselves, just to be able to test the software and give the business some results, reliable or not, about the delivered product.
Test data availability
When software development teams can self-refresh and have test data easily available, they can do their job properly. One of the biggest bottlenecks here is that DBAs aren’t eager to hand over database control to Software (Quality) Engineers, afraid they will break something or, worse, destroy it. But since refreshing databases isn’t a DBA’s main task, he or she doesn’t want to spend too much time on it, resulting in long waiting times for software development teams.
Pros of easy test data availability:
- Manual generation of test data is a thing of the past
- Don’t waste any more time searching for test data
- Save on waiting time and storage costs
Test data portal
In summary, we find that both software development teams and DBAs would benefit from easy test data distribution. But how do we get this done? Simple: with the help of a test data portal, a platform that can be used and accessed by both DBAs and Software (Quality) Engineers. Both can log in to this portal and self-refresh their own test data sets at the click of a button. The only waiting time is the actual (technical) data processing time. DBAs are not constantly disturbed, and Software (Quality) Engineers don’t have to wait because they control the processes themselves.
At the backend of this test data platform, DATPROF Runtime, certain projects (also called templates) are installed. These projects can be, for example, data masking (DATPROF Privacy) or data subsetting (DATPROF Subset) templates. In most cases the templates are developed by a Software Quality team member with decent database knowledge, working closely together with the DBA. After the installation (mostly done by the DBA), the projects can be used by the selected teams.
The DBA can grant access and monitor the processes that are being executed. In addition, DBAs are relieved of the non-primary refresh tasks that land on their desk every once in a while. So it is a win-win situation. Depending on your test data architecture, teams can access and refresh their own database (subset) at the click of a button, and the DBA can be confident that databases won’t get corrupted.
To make test data delivery even more interesting, this entire self-refresh process can be automated, too. With the help of an API, the test data platform can be integrated with nearly every orchestration tool or other tool you might use for software testing. Customers use our test data platform DATPROF Runtime to refresh databases several times a week or even multiple times a day, instead of manually generating test data under the pressure placed on the test team. With the help of such a test data portal you can easily obtain proper, production-like test data. This way test data acts as an accelerator instead of a bottleneck in your CI/CD pipeline.
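To give an impression of what such an integration could look like, here is a minimal Python sketch of a pipeline step that starts a refresh and waits for it to finish. Everything specific in it is an assumption made for illustration: the base URL, the `/templates/.../runs` and `/runs/...` paths, the bearer token and the `id` and `status` fields are hypothetical and are not taken from the DATPROF Runtime API. Check the API documentation of your own test data platform for the real calls.

```python
import json
import time
import urllib.request
from typing import Callable

# Hypothetical base URL and paths, for illustration only.
BASE_URL = "https://testdata.example.com/api"


def http_send(method: str, url: str, token: str) -> dict:
    """Send one request with the standard library and decode the JSON reply."""
    request = urllib.request.Request(
        url, method=method, headers={"Authorization": f"Bearer {token}"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def refresh_and_wait(
    template: str,
    token: str,
    send: Callable[[str, str, str], dict] = http_send,
    poll_seconds: float = 10.0,
    max_polls: int = 60,
) -> str:
    """Start a refresh run for a template and poll until it finishes."""
    # Assumed response shape: {"id": ...} for the newly started run.
    run = send("POST", f"{BASE_URL}/templates/{template}/runs", token)
    run_url = f"{BASE_URL}/runs/{run['id']}"
    for _ in range(max_polls):
        # Assumed response shape: {"status": "running" | "succeeded" | "failed"}.
        status = send("GET", run_url, token)["status"]
        if status in ("succeeded", "failed"):
            return status
        time.sleep(poll_seconds)
    return "timeout"
```

A CI/CD job could call `refresh_and_wait("my-template", token)` before the test stage and stop the pipeline when the result is anything other than `"succeeded"`. The `send` parameter is there so the HTTP call can be swapped for a stub in tests.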
Data Provisioning Resources
The concept of data subsetting is surprisingly simple: take a consistent part of a database and transfer it to another database. That’s all. Of course, the actual data subsetting isn’t that simple.
Subsetting is copying a part of the data from one database (source) into another (target). You therefore need a source to provide the data (typically a production database). The target is typically a dev or test environment.
DevOps Software (Quality) Engineers need high-quality, up-to-date test data or specific test cases to test a newly developed application. The hunt for this test data often goes hand in hand with long waiting times.
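The idea of a "consistent part" can be illustrated with a small Python sketch that uses the built-in `sqlite3` module. The two tables, the sample rows and the `country` filter are invented for this example, and the approach is a simplification rather than how DATPROF Subset works: it selects a slice of the parent table and then copies only the child rows that belong to that slice, so every foreign key in the target still resolves.

```python
import sqlite3

# Illustrative schema: orders reference customers through a foreign key.
SCHEMA = """
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, country TEXT);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(id),
    total REAL
);
"""


def build_source() -> sqlite3.Connection:
    """Create an in-memory stand-in for the (production) source database."""
    src = sqlite3.connect(":memory:")
    src.executescript(SCHEMA)
    src.executescript("""
        INSERT INTO customers VALUES (1, 'Ann', 'NL'), (2, 'Bob', 'DE'), (3, 'Cy', 'NL');
        INSERT INTO orders VALUES (10, 1, 25.0), (11, 2, 40.0), (12, 3, 15.5), (13, 1, 9.99);
    """)
    return src


def subset(src: sqlite3.Connection, country: str) -> sqlite3.Connection:
    """Copy customers from one country, plus only their orders, to a new target."""
    tgt = sqlite3.connect(":memory:")
    tgt.execute("PRAGMA foreign_keys = ON")  # let the target reject orphaned rows
    tgt.executescript(SCHEMA)

    customers = src.execute(
        "SELECT id, name, country FROM customers WHERE country = ?", (country,)
    ).fetchall()
    # Follow the foreign key: take only orders whose customer is in the subset.
    orders = src.execute(
        "SELECT o.id, o.customer_id, o.total FROM orders o "
        "JOIN customers c ON c.id = o.customer_id WHERE c.country = ?",
        (country,),
    ).fetchall()

    # Insert parents before children so the foreign key constraint holds.
    tgt.executemany("INSERT INTO customers VALUES (?, ?, ?)", customers)
    tgt.executemany("INSERT INTO orders VALUES (?, ?, ?)", orders)
    tgt.commit()
    return tgt
```

Calling `subset(build_source(), "NL")` yields a target with two customers and their three orders, and no order pointing at a customer that was left behind. Real schemas have many more tables and relationships to follow, which is why the actual subsetting is harder than the concept.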