Microsoft Excel: how to delete duplicates?

Microsoft Excel: how to delete duplicates?

Excel is commonly chosen by companies which are just starting to manage their contacts. The accessibility of the tool makes it a good solution when starting your business. But storing hundreds or thousands of contacts in Excel can quickly become difficult, especially when it comes to duplicates. That's why companies end up choosing CRMs, which are more suitable and more versatile for any growing business.

Reminder : what's a contact duplicate?

A duplicate emerges when the information of a contact is filed twice (or more!) in your database. Duplicates can have disastrous consequences on your prospecting, that's why you must delete them or, even better, automate their detection and their merging.

We'll tell you how to do solve this problem right below 😉

Mains causes for duplicates

Manual input of duplicates

Typing errors are very common when adding new contacts (spelling mistakes, inverting first and last names, encoded characters, etc.). These can be coupled with the addition of duplicates when not paying attention or when several team members are collaborating on the same database.

Automatically imported duplicates

You can also add contacts and enrich them automatically in an Excel file thanks to workflows set up on tools like Zapier, Make or n8n. For instance, Dropcontact can enrich your Excel or Google Sheets file online thanks to a workflow created with Zapier. Check it out 👉 here

Duplicates can also be created automatically in your file because of lead generation forms integrated in your website. This case is less common but can occur when:

  • A lead fills in the form with a typing error
  • A lead fills in the form… several times
  • A lead gives their professional email AND their personal one
  • A lead inverts their first and last names

What happens when there are duplicates in your Excel file

Having duplicates in your file can have dire consequences on your prospecting and on your credibility when dealing with prospects.

On the one hand, a team member runs the risk of contacting the same persons several times if the file is shared. This will quickly make your credibility with these contacts plumet.

On the other hand, it will have a considerable impact on the way you analyse your metrics and your KPIs, especially when compared to the number of contacts in your file. Imagine your file has 10 000 contacts, 3 000 of which are duplicates: this will thwart your perspective on the number of contacts you actually have. Same goes for KPIs, like your email opening rate. In the best case scenario, contact duplicate, or even a triplet, will only open your email once. Your opening, click and conversion rates can be highly distorted.

What's Excel ?

Do we really need to introduce Microsoft Excel ? 😉

Small reminder for industry beginners: Excel is a versatile spreadsheet software included in the Microsoft Office 365 Suite. It can create tables, lists, charts, and dynamic formulas which vary depending on the selected cell.

Excel has several features and allows you to multitask, from your accounting, your finance management, your logistics, to your contact database management.

Excel is easy to use and works for every type of user, from beginners to experts, and has a high-storage capacity for your data (more than 1 million of lines can be created).

The tool is available online or locally, thanks to its Desktop and Mobile app.

How to create your database in Excel ?

Creating a database in Excel can be done in several (very) easy steps!

To start off, you'll need to open a new spreadsheet from the Desktop app or in Excel online (if you choose the Desktop app, you'll always be able to put your file online afterwards).

Then, you'll have to define labels for each row and column which will allow you to sort out your data.

To convert your file into a table, you'll juste have to click on Insert, then on "Table". You'll need to define the columns and rows you want to add to your table (those with your labels). Don't forget to check the "My table has headers" box, so that the fields are enhanced with a conditional formatting.

A filter is then automatically added to each of your recognized field so that you can easily sort out and filter each set of data in one click.

A conditional formatting is also applied to highlight the fields of your database.

Once your table is created, you'll need to add your data. There are 2 ways to do so:

  • Entering data manually by filling in each cell. Beware of spelling and typing errors, for if this process is rather simple, it can also generate a lot of mistakes.
  • Importing data from external sources like text or .csv files. The import is seamless: click on the "Data" tab —> "from a Text or CSV file." However, make sure your data are properly formatted beforehand and are adequately imported under each of your labels.

There you go 🎬 you're all set to create several databases on Excel.

How can I delete duplicates in Excel ?

Once your database is created, you'll need to keep it updated and to make sure it doesn't hold any incorrect data or, most of all, duplicates. How can you do that?

There are two methods to delete duplicates in Excel.

Manually deleting duplicates in Excel

First, you can delete them manually, thanks to conditional formatting. In the "Home" tab, choose "Conditional formatting", then "Highlight cells". Lastly, define the following rule: if the cell contains a duplicate: apply [cell color].

This formatting will let you visualize the duplicates in your document so that you can decide whether you want to delete them or not.

Secondly, you'll need to delete the duplicates in your document. There's a simple way to delete all the duplicates in one or several columns. You'll also be able to select or unselect some columns so that you only delete the duplicates you want. Go to Data —> Delete duplicates —> then select the columns in which you want to delete duplicates.

Say goodbye to contact duplicates thanks to this easy operation.

However, keep in mind that this must be done manually and after you verified the duplicates in your document. This process can also be biased, for if contacts have the same first and last names, Excel's automatic function will delete these namesakes.

Deleting duplicates in Excel by importing your document to a third-party tool

Several tools on the market can delete duplicates from an Excel file in .csv or .xls. With a simple file import, they can detect and merge duplicates and then export the clean file. Dedupely and Datablist are among the solutions able to do so.

Dedupely

Dedupely is a duplicate management tool that can identify potential duplicates from the information available in the cells of your Excel file.

Deduply works with a credit system: from 25$ per month to verify 30 000 lines in your file. To detect and merge duplicate, you'll need to import your .csv or .xls file directly to Dedupely.

Make sure you make a copy of your contact file before importing it to Dedupely, because once the duplicates are merged, you won't be able to restore your previous data.

Datablist

Datablist is a multi-feature tool which detects and automatically merges contact duplicates, lets you define how it should process with duplicates (whether you want them merged or definitively deleted). It can also analyze the values it has defined as duplicate and choose whether to keep them or delete them. This gives you a precise overview of what Datablist is doing in your file.

Alternative to Excel : merging duplicates directly in your CRM

Merging duplicates automatically in your CRM thanks to Dropcontact's integration

Dropcontact's native integration in Pipedrive, HubSpot and Salesforce

Dropcontact is directly and natively integrated in your Pipedrive, HubSpot and (soon) Salesforce CRM.

Once connected to your CRM, Dropcontact manages and merges your duplicates, in real-time, with full autonomy and without interruptions.

Automatically deleting and merging duplicates

Dropcontact works with proprietary algorithms which instantly identify the duplicates in your CRM and automatically merge them, without your lifting a finger.

Dropcontact verifies the pieces of information in your CRM and always keeps the most up-to-date ones.

Real-time synchronization

The data in your CRM are constantly updated in real-time thanks to the Dropcontact integration. Dropcontact will even prevent you from creating new duplicates and instantly merges the two contact cards involved.

Now you can truly say farewell to duplicates thanks to this life-changing integration 💚

Share this article

Most frequently asked questions

How do duplicates usually appear in Excel?

Duplicates usually emerge after typing in contacts manually or after automatically importing contacts through contact forms, workflows or other tools.

What's a contact duplicate?

A duplicate emerges when the information of a contact is filed twice (or more!) in your database. Duplicates can have disastrous consequences on your prospecting, that's why you must delete them or, even better, automate their detection and their merging.

What are the consequences of having duplicates in my Excel file?

On the one hand, a team member runs the risk of contacting the same persons several times if the file is shared. This will quickly make your credibility with these contacts plumet.On the other hand, it will have a considerable impact on the way you analyse your metrics and your KPIs, especially when compared to the number of contacts in your file.