
How To Remove Duplicate Values in Tableau Prep

One feature of Tableau Prep is its support for data cleansing: detecting and then correcting or removing corrupt, incomplete, inaccurate, or irrelevant records in a dataset, table, or database. Sometimes you want a dataset to contain only unique values. In this example, we use Tableau Prep to create a dataset with exactly one record per customer, which we would typically use as a dimension (lookup) table in our data model.

Option 1: Aggregate 

We can use the built-in Aggregate step to remove duplicates. When you group by a set of fields, Tableau Prep returns one row for each distinct combination of those fields, so duplicate rows collapse into one.

We have connected to the Superstore dataset and removed the unnecessary columns, leaving only Customer ID and Customer Name. At this point, several customers still appear on multiple rows.

Next, we add an Aggregate step to the flow and add Customer ID and Customer Name to the Grouped Fields section. The output now has one record per customer.

To test this, I filtered on Claire Gute, and we can see only one record for this customer.
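As a rough illustration of what grouping does here, the following Python sketch collapses rows to one per distinct combination of fields. It is not how Tableau Prep is implemented; the function name `group_by_unique` and the sample rows (including the customer IDs) are made up for the example.

```python
# Illustrative rows only: IDs and the second customer are invented for this sketch.
rows = [
    {"Customer ID": "C-001", "Customer Name": "Claire Gute"},
    {"Customer ID": "C-001", "Customer Name": "Claire Gute"},
    {"Customer ID": "C-002", "Customer Name": "Sample Customer"},
    {"Customer ID": "C-001", "Customer Name": "Claire Gute"},
]


def group_by_unique(rows, fields):
    """Return one row per distinct combination of `fields`, in first-seen order."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row[f] for f in fields)
        if key not in seen:
            seen.add(key)
            unique.append({f: row[f] for f in fields})
    return unique


# Equivalent in spirit to putting both fields in the Grouped Fields section.
customers = group_by_unique(rows, ["Customer ID", "Customer Name"])

# Same check as in the article: filter on one customer and count the rows.
claire = [r for r in customers if r["Customer Name"] == "Claire Gute"]
print(len(customers), len(claire))
```

Note that grouping only removes rows that are identical across every grouped field. A customer whose name is spelled two different ways under the same ID would still produce two rows.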

Option 2: Create a unique rank and filter out results


In this example, we walk through keeping only the record with the latest order date for each customer. Our dataset contains Order Date, Customer ID, and Customer Name.

Next, we create a calculated field that assigns an ID using the partition, order by, and row number functionality. We partition by Customer ID because we want the count to restart for each new customer. We order by Order Date DESC because we want the ID to be based on the latest date (to base it on the earliest date instead, use ASC).

Each record now has a sequential ID within its customer, with 1 assigned to the most recent order.
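In Tableau Prep, a calculation of this kind is typically written along the lines of `{PARTITION [Customer ID]: {ORDERBY [Order Date] DESC: ROW_NUMBER()}}`. To make the logic concrete, here is a minimal Python sketch of the same idea. The helper `add_row_number`, the output field name `Row Number`, and the sample orders are assumptions for illustration only.

```python
from datetime import date
from itertools import groupby

# Illustrative orders only: IDs and dates are invented for this sketch.
orders = [
    {"Customer ID": "C-001", "Order Date": date(2020, 1, 5)},
    {"Customer ID": "C-002", "Order Date": date(2020, 3, 1)},
    {"Customer ID": "C-001", "Order Date": date(2021, 6, 12)},
    {"Customer ID": "C-001", "Order Date": date(2019, 11, 8)},
]


def add_row_number(rows, partition_field, order_field, descending=True):
    """Number rows 1, 2, 3, ... within each partition, ordered by `order_field`."""
    # Sort by the ordering field first, then by the partition field.
    # Python's sort is stable, so the date order survives inside each partition.
    ordered = sorted(rows, key=lambda r: r[order_field], reverse=descending)
    ordered.sort(key=lambda r: r[partition_field])

    numbered = []
    for _, group in groupby(ordered, key=lambda r: r[partition_field]):
        # The counter restarts at 1 for every new partition value.
        for n, row in enumerate(group, start=1):
            numbered.append({**row, "Row Number": n})
    return numbered


numbered = add_row_number(orders, "Customer ID", "Order Date", descending=True)
for row in numbered:
    print(row["Customer ID"], row["Order Date"], row["Row Number"])
```

Passing `descending=False` corresponds to ordering ASC, which would give number 1 to the earliest order instead.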

Next, we filter the calculated field to keep only the value 1.

Finally, we remove the Rank and Order fields from our dataset, which leaves a finished dataset with one record per customer.
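The filter and cleanup steps can be sketched the same way. This example starts from rows that already carry a per-customer number and keeps only those numbered 1, then drops the helper columns. The field names `Rank` and `Order Date`, the function `keep_latest`, and the sample rows are assumptions for illustration.

```python
# Illustrative, already-numbered rows: values are invented for this sketch.
ranked = [
    {"Customer ID": "C-001", "Customer Name": "Claire Gute",
     "Order Date": "2021-06-12", "Rank": 1},
    {"Customer ID": "C-001", "Customer Name": "Claire Gute",
     "Order Date": "2020-01-05", "Rank": 2},
    {"Customer ID": "C-002", "Customer Name": "Sample Customer",
     "Order Date": "2020-03-01", "Rank": 1},
]


def keep_latest(rows, rank_field="Rank", drop_fields=("Rank", "Order Date")):
    """Keep rows whose rank is 1, then drop the helper fields."""
    return [
        {k: v for k, v in row.items() if k not in drop_fields}
        for row in rows
        if row[rank_field] == 1
    ]


customers = keep_latest(ranked)
print(customers)
```

Because a row number is unique within each customer, this keeps exactly one row per customer even when two orders share the same date, which a plain rank with ties would not guarantee.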

Tableau Prep can be a powerful tool that can save you a lot of time in preparing your data to visualize. I love helping clients understand their data at a new level through the art and science of data visualization. To learn more about how I and Lovelytics help clients do more with their data, please visit us at www.lovelytics.com or connect with us by email at [email protected].
