Pandas – Removing Duplicates

Discovering Duplicates

Duplicate rows are rows that have been registered more than once.

Looking at our test data set, we can see that rows 11 and 12 appear to be duplicates.

To discover duplicates, we can use the duplicated() method.

The duplicated() method returns a Boolean value for each row:

Example

Returns True for every row that is a duplicate, otherwise False:
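
A minimal sketch, assuming a small made-up DataFrame (the column names and values are illustrative, not the tutorial's test data set) whose last two rows are identical:

```python
import pandas as pd

# Hypothetical sample data (an assumption for illustration);
# the rows at index 2 and 3 are identical.
df = pd.DataFrame({
    "Duration": [60, 45, 60, 60],
    "Pulse": [110, 117, 100, 100],
    "Calories": [409.1, 479.0, 250.7, 250.7],
})

# duplicated() returns a Boolean Series with one value per row.
# A row is True when an identical row has already appeared above it.
dup_mask = df.duplicated()

print(dup_mask)
```

Only the second of the two identical rows is flagged, because duplicated() by default marks every occurrence except the first.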





Removing Duplicates

To remove duplicates, use the drop_duplicates() method.

Example

Remove all duplicates:
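
A minimal sketch of the call, again on a hypothetical DataFrame (names and values are assumptions) that contains one repeated row:

```python
import pandas as pd

# Hypothetical sample data (an assumption for illustration);
# the rows at index 2 and 3 are identical.
df = pd.DataFrame({
    "Duration": [60, 45, 60, 60],
    "Pulse": [110, 117, 100, 100],
    "Calories": [409.1, 479.0, 250.7, 250.7],
})

# Drop the repeated row directly in df; the first occurrence is kept.
df.drop_duplicates(inplace=True)

print(df)
```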


Remember: Passing inplace = True means the method does NOT return a new DataFrame. Instead, it removes the duplicates from the original DataFrame.
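
To illustrate the difference, the sketch below (hypothetical data and variable names, assumed for illustration) first calls drop_duplicates() without inplace, which leaves the original untouched and returns a cleaned copy, and then with inplace = True, which changes the original and returns None:

```python
import pandas as pd

# Hypothetical sample data (an assumption for illustration).
original = pd.DataFrame({
    "Duration": [60, 45, 60, 60],
    "Pulse": [110, 117, 100, 100],
})
rows_before = len(original)

# Without inplace: a new DataFrame is returned, the original is unchanged.
deduped = original.drop_duplicates()
rows_after_copy = len(original)

# With inplace=True: the original is modified and the call returns None.
result = original.drop_duplicates(inplace=True)

print(rows_before, rows_after_copy, len(deduped), len(original), result)
```

A common mistake is to write df = df.drop_duplicates(inplace = True), which replaces df with None.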

