
Data Engineering: How to Create a Test Plan

In this article, we explain how to create a minimal test plan in data engineering. We discuss the importance of ensuring process quality with a detailed and documented test plan: what, how, and why to test.

Is It Possible to Code Without Errors? 

We know it’s impossible to write code without errors. That is why software developers must run all the necessary tests before releasing a program to users, to confirm it works properly.

Testing software to confirm it functions correctly is essential, yet this doesn’t always happen. In the best cases, the tests conducted are sufficient. However, it’s also common for developers to either not test enough or to do it poorly.

Data engineering adds another layer of complexity: beyond testing the software, the data must also be tested. The responsibility for proper testing lies with data engineering teams. However, it’s not just about testing but also providing evidence that the tests were performed.

The Importance of Documenting Evidence in Data Engineering Tests: Quality Assurance for Teams and Clients 

Documenting the evidence of tests conducted in data engineering is crucial for two main reasons:

  • For ourselves: To demonstrate that the tests were actually performed. If we don’t leave this documentation, the test effectively doesn’t exist.
  • For others: If a leader, client, or colleague has evidence of the tests and an issue arises later, they can understand which tests were conducted and which weren’t. This helps narrow down the root cause of the problem, enabling quicker and simpler resolution.

What Is the Difference Between a “Test Case,” “Testing,” and “Proving We Tested”?

These three concepts may seem similar, but they are not. Below, we explain the difference between each one:

  1. Test Case: A set of conditions or variables under which it will be determined whether an application, process, feature, or behavior is acceptable or not. In other words, a test case defines what needs to be tested.
  2. Testing: The act of verifying the proper functioning of an application or process. A testing session may cover none, one, or several predefined test cases.
  3. Proving We Tested: Providing concrete evidence of the tests performed.

The difference between these three concepts may seem subtle, but it is significant. The first tells us what to test, the second involves executing the tests, and the third ensures we can demonstrate what tests were performed and their results. Defining test cases and conducting tests is pointless if we don’t document evidence of doing so.
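The distinction can be made concrete in code. Here is a minimal sketch (the names, the JSON evidence format, and the check itself are illustrative assumptions, not a real framework): the test case is data describing what to verify, testing is the act of executing it, and proving we tested is persisting a record of the run.

```python
import json
from datetime import datetime, timezone

# 1. Test case: *what* to test -- conditions and an expected outcome.
test_case = {
    "name": "row_count_matches_source",
    "description": "Pipeline output must have the same row count as the source",
    "expected": 1000,
}

def run_test(case, actual):
    """2. Testing: the act of executing the check."""
    return actual == case["expected"]

def record_evidence(case, passed, path="evidence.json"):
    """3. Proving we tested: persist a verifiable record of the run."""
    record = {
        "test": case["name"],
        "result": "OK" if passed else "Failed",
        "executed_at": datetime.now(timezone.utc).isoformat(),
    }
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")
    return record

passed = run_test(test_case, actual=1000)
evidence = record_evidence(test_case, passed)
```

Running the test without the last step is exactly the trap described above: the check happened, but nothing proves it did.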

The phrase “It worked when I tested it” is very common in data engineering. It may well be true, but without evidence of the tests it becomes one person’s word against another’s. Once again, it is the data engineer’s responsibility to ensure the tests performed are properly documented.

How to Create a Test Plan in Data Engineering

A good test plan requires careful planning. Documenting evidence forces us to think about what we are going to test before performing the test. Below are some examples of the minimum components a test plan should include:

  1. Every dashboard (visualization) and data engineering development must have a minimal test plan executed and documented.
  2. Each test must be properly documented by taking screenshots, capturing query results, or using another method to provide verifiable evidence that the test was executed and successfully passed.
  3. If a test fails, the issue must be resolved and the entire suite re-executed (to ensure end-to-end consistency).
  4. The evidence document must be attached to the corresponding task (DevOps/Jira/Trello/etc.) to document the execution.
  5. Executing the “Minimal Test Plan” does not exempt teams from conducting additional tests to ensure the quality of the final product.
  6. The data engineer is responsible for the quality of the product delivered, including performing any additional tests they deem necessary to ensure quality.
  7. If an error is detected after delivery or changes are made to reports or processes, all tests must be executed again.
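Rules 2 and 3 above can be sketched as a small suite runner (illustrative only; the check names and the CSV evidence format are assumptions): every check is executed, every result is logged as evidence, and a single failure means the whole suite must be run again after the fix.

```python
import csv
from datetime import datetime, timezone

def run_suite(checks, evidence_path="test_evidence.csv"):
    """Run every check, log each result, and report overall status.

    `checks` maps a check name to a zero-argument callable returning bool.
    Returns True only if every check passed; otherwise the caller must
    fix the issue and re-run the *entire* suite (rule 3).
    """
    results = []
    for name, check in checks.items():
        try:
            passed = bool(check())
        except Exception:
            passed = False  # a crashing check counts as a failure
        results.append((name, "OK" if passed else "Failed",
                        datetime.now(timezone.utc).isoformat()))
    # Rule 2: persist verifiable evidence of every execution.
    with open(evidence_path, "a", newline="") as f:
        csv.writer(f).writerows(results)
    return all(status == "OK" for _, status, _ in results)

# Hypothetical checks for a sales pipeline.
checks = {
    "no_null_keys": lambda: True,
    "totals_match_source": lambda: True,
}
suite_passed = run_suite(checks)
```

The evidence file produced here is the kind of artifact rule 4 asks us to attach to the corresponding task.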

Key Elements of a Test Plan in Data Engineering

A test plan can be created using a simple tool like Excel, Google Sheets, or any similar platform accessible to all team members. Of course, using dedicated tools for this purpose is even better. The document should include the following:

  1. Pipeline to be validated: Specify which pipeline is being tested, who developed it, and the date of the tests.
  2. Description of the controls performed: For example, verifying that business definitions were reviewed/understood, checking whether metric values match the data source, identifying any out-of-range (outlier) values, etc.
  3. Test results: Indicate whether the result was “OK,” “Failed,” or “N/A.”
  4. Documentation location: Specify where the evidence of the test is stored.
  5. Executor: Identify who performed the test.
  6. Observations: Include a column for any additional remarks.
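In practice, the document described above is just a table. Here is a minimal sketch of it as a CSV file (the column names follow the list above; the pipeline name, dates, and values are hypothetical):

```python
import csv

# Columns taken from the elements listed above.
COLUMNS = ["Pipeline", "Developer", "Date", "Control description",
           "Result", "Evidence location", "Executor", "Observations"]

# Hypothetical rows for an illustrative pipeline.
rows = [
    ["sales_daily_load", "A. Developer", "2025-01-15",
     "Metric totals match the source system", "OK",
     "Jira DATA-123 / screenshot.png", "A. Developer", ""],
    ["sales_daily_load", "A. Developer", "2025-01-15",
     "No out-of-range (outlier) values in revenue", "N/A",
     "Jira DATA-123", "A. Developer", "Outlier rule not yet defined"],
]

with open("minimal_test_plan.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(COLUMNS)
    writer.writerows(rows)
```

A file like this can live in a shared drive or be attached directly to the task, so anyone on the team can see what was tested, by whom, and with what result.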

Here is an example of how to document a data engineering process:

Example of a Minimal Test Plan

Conclusion

There’s an old saying: “It’s not enough to be honest; you must also appear honest.” Over time, it was shortened to: it’s not enough to be; you must also appear to be.

Proving that we tested is how, in data engineering, we both are and appear to be. We are responsible not only for ensuring that processes work, but also for guaranteeing the accuracy of the data.

Documenting evidence of the tests we performed is essential. If there are predefined test cases, we use them; if not, we define a minimal set of tests that we deem appropriate and leave evidence that we performed them.
This set of tests can be agreed upon with the team, the leader, the data user, or even with oneself. What must not happen is failing to test or being unable to prove that we tested.

This is one of the best practices we can apply in data engineering. A solid test plan helps us avoid failures that could lead to a loss of trust in our work. Consider the central role that data plays in organizations: if the information we provide to the business is incorrect, it could result in poor decision-making.

For this reason, regardless of the client or project, whether it was requested or not, we must test and provide clear evidence that the tests were performed and passed. Always.

This article was originally written in Spanish and translated into English using ChatGPT.


* This content was originally published on  Datalytics.com. Datalytics and Lovelytics merged in January 2025.
