Java 8 – Remove Duplicates from a Stream

Introduction

Java 8 introduced the Stream API, offering a powerful and efficient way to process collections of data in a functional and declarative style. One common task when working with collections is removing duplicate elements. Whether you’re dealing with a list of integers, strings, or custom objects, the Stream API provides a straightforward way to remove duplicates using the distinct() method.

In this guide, we’ll explore how to remove duplicates from a stream in Java 8, with examples demonstrating how to apply this to different types of data, including lists of integers, strings, and custom objects.

Table of Contents

  • Problem Statement
  • Solution Steps
  • Java Program
    • Removing Duplicates from a List of Integers
    • Removing Duplicates from a List of Strings
    • Removing Duplicates from a Stream of Custom Objects
  • Advanced Considerations
  • Conclusion

Problem Statement

The task is to create a Java program that:

  • Demonstrates how to use the distinct() method to remove duplicates from a stream.
  • Applies distinct() to different types of data, including lists of integers, strings, and custom objects.
  • Outputs the results with duplicates removed.

Example 1:

  • Input: List of integers [1, 2, 2, 3, 4, 4, 5]
  • Output: [1, 2, 3, 4, 5]

Example 2:

  • Input: List of strings ["apple", "banana", "apple", "cherry"]
  • Output: ["apple", "banana", "cherry"]

Solution Steps

  1. Create a Stream: Start with a stream of elements that may contain duplicates.
  2. Apply the distinct() Method: Use the distinct() method to filter out duplicate elements.
  3. Display the Result: Collect and print the elements with duplicates removed.

Java Program

Removing Duplicates from a List of Integers

The distinct() method can be used to remove duplicates from a list of integers.

import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/**
 * Java 8 - Removing Duplicates from a List of Integers Using Stream.distinct()
 * Author: https://www.rameshfadatare.com/
 */
public class RemoveDuplicatesFromIntegerList {

    public static void main(String[] args) {
        // Step 1: Create a list of integers with duplicates
        List<Integer> numbers = Arrays.asList(1, 2, 2, 3, 4, 4, 5);

        // Step 2: Remove duplicates using distinct()
        List<Integer> uniqueNumbers = numbers.stream()
            .distinct()
            .collect(Collectors.toList());

        // Step 3: Display the result
        System.out.println("Unique Numbers: " + uniqueNumbers);
    }
}

Output

Unique Numbers: [1, 2, 3, 4, 5]

Explanation

  • The numbers.stream() method creates a stream from the list of integers.
  • The distinct() method filters out duplicate elements from the stream.
  • The collect(Collectors.toList()) method collects the unique elements into a list.
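
One detail the example above does not make visible: for an ordered source such as a List, distinct() keeps the first occurrence of each element and preserves encounter order. The following is a minimal sketch of that behavior; the class name FirstOccurrenceOrder and the helper removeDuplicates are hypothetical names chosen for illustration and are not part of the original program.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class FirstOccurrenceOrder {

    // Hypothetical helper: returns the input elements with duplicates removed,
    // keeping the first occurrence of each value in its original position
    static List<Integer> removeDuplicates(List<Integer> input) {
        return input.stream()
            .distinct()
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // The input is deliberately unsorted to show that no sorting takes place
        List<Integer> numbers = Arrays.asList(3, 1, 3, 2, 1);

        System.out.println("Unique Numbers: " + removeDuplicates(numbers));
        // prints "Unique Numbers: [3, 1, 2]"
    }
}
```

The result is not sorted; add sorted() after distinct() if a sorted list is needed.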

Removing Duplicates from a List of Strings

You can also use the distinct() method to remove duplicates from a list of strings.

import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/**
 * Java 8 - Removing Duplicates from a List of Strings Using Stream.distinct()
 * Author: https://www.rameshfadatare.com/
 */
public class RemoveDuplicatesFromStringList {

    public static void main(String[] args) {
        // Step 1: Create a list of strings with duplicates
        List<String> fruits = Arrays.asList("apple", "banana", "apple", "cherry", "banana");

        // Step 2: Remove duplicates using distinct()
        List<String> uniqueFruits = fruits.stream()
            .distinct()
            .collect(Collectors.toList());

        // Step 3: Display the result
        System.out.println("Unique Fruits: " + uniqueFruits);
    }
}

Output

Unique Fruits: [apple, banana, cherry]

Explanation

  • The fruits.stream() method creates a stream from the list of strings.
  • The distinct() method removes duplicate strings from the stream.
  • The collect(Collectors.toList()) method collects the unique strings into a list.
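
Because distinct() compares strings with equals(), "Apple" and "apple" count as different elements. If duplicates should be detected regardless of case, one possible approach is to normalize the strings before calling distinct(). The sketch below assumes that losing the original casing is acceptable; the class name CaseInsensitiveDistinct and the helper uniqueIgnoringCase are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

class CaseInsensitiveDistinct {

    // Hypothetical helper: lower-cases every string, then removes duplicates.
    // Locale.ROOT keeps the result independent of the default locale.
    static List<String> uniqueIgnoringCase(List<String> input) {
        return input.stream()
            .map(s -> s.toLowerCase(Locale.ROOT))
            .distinct()
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> fruits = Arrays.asList("Apple", "banana", "apple", "BANANA", "cherry");

        System.out.println("Unique Fruits: " + uniqueIgnoringCase(fruits));
        // prints "Unique Fruits: [apple, banana, cherry]"
    }
}
```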

Removing Duplicates from a Stream of Custom Objects

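The distinct() method can also be applied to streams of custom objects. It identifies duplicates by calling equals(), so the class must override equals() and hashCode() consistently. Without these overrides, the default identity-based equals() from Object is used, and two separate objects with identical field values are treated as different elements.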

import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

/**
 * Java 8 - Removing Duplicates from a Stream of Custom Objects Using Stream.distinct()
 * Author: https://www.rameshfadatare.com/
 */
public class RemoveDuplicatesFromCustomObjects {

    public static void main(String[] args) {
        // Step 1: Create a list of products with duplicates
        List<Product> products = Arrays.asList(
            new Product("Laptop", 1500),
            new Product("Phone", 800),
            new Product("Laptop", 1500),
            new Product("Tablet", 600)
        );

        // Step 2: Remove duplicates using distinct()
        List<Product> uniqueProducts = products.stream()
            .distinct()
            .collect(Collectors.toList());

        // Step 3: Display the result
        uniqueProducts.forEach(product -> 
            System.out.println("Product: " + product.getName() + ", Price: " + product.getPrice()));
    }
}

// Custom class Product
class Product {
    private final String name;
    private final double price;

    public Product(String name, double price) {
        this.name = name;
        this.price = price;
    }

    public String getName() {
        return name;
    }

    public double getPrice() {
        return price;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        Product product = (Product) o;
        return Double.compare(product.price, price) == 0 &&
                Objects.equals(name, product.name);
    }

    @Override
    public int hashCode() {
        return Objects.hash(name, price);
    }
}

Output

Product: Laptop, Price: 1500.0
Product: Phone, Price: 800.0
Product: Tablet, Price: 600.0

Explanation

  • The products.stream() method creates a stream from the list of Product objects.
  • The distinct() method removes duplicate products from the stream, relying on the correct implementation of equals() and hashCode().
  • The unique products are collected into a list and displayed.
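
To see why the equals() and hashCode() overrides matter, the sketch below runs distinct() on a class that does not override them. PlainProduct, DistinctWithoutEquals, and countDistinct are hypothetical names introduced only for this illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class DistinctWithoutEquals {

    // Hypothetical class that does NOT override equals() or hashCode(),
    // so it inherits the identity-based versions from Object
    static class PlainProduct {
        final String name;
        final double price;

        PlainProduct(String name, double price) {
            this.name = name;
            this.price = price;
        }
    }

    // Hypothetical helper: counts the elements that remain after distinct()
    static int countDistinct(List<PlainProduct> items) {
        return items.stream()
            .distinct()
            .collect(Collectors.toList())
            .size();
    }

    public static void main(String[] args) {
        // Two separate objects with the same field values
        List<PlainProduct> products = Arrays.asList(
            new PlainProduct("Laptop", 1500),
            new PlainProduct("Laptop", 1500)
        );

        System.out.println("Distinct count: " + countDistinct(products));
        // prints "Distinct count: 2" because the two objects are not equal by identity
    }
}
```

With the overrides shown in the Product class above, the same input would be reduced to a single element.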

Advanced Considerations

  • Performance Considerations: distinct() is a stateful intermediate operation. It has to keep track of the elements it has already seen, so memory use grows with the number of unique elements. This can be expensive for large datasets, particularly in ordered parallel pipelines.

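  • Parallel Streams: distinct() removes duplicates correctly in parallel streams as well. For an ordered source such as a List, it still keeps the first occurrence of each element in encounter order, but maintaining that order in parallel is relatively expensive; if order does not matter, calling unordered() before distinct() may improve performance. Note that forEach() on a parallel stream does not respect encounter order, so use forEachOrdered() or collect into a list when the output order matters.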

  • Custom Equality: For custom objects, ensure that equals() and hashCode() are correctly implemented to allow distinct() to work as expected.
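
The parallel-stream behavior can be checked with a small sketch. It assumes a List source (which is ordered) and uses hypothetical names (ParallelDistinct, uniqueInParallel, uniqueUnordered). The second helper drops the ordering constraint with unordered(), which the Stream documentation describes as potentially more efficient for distinct() in parallel pipelines; any actual gain depends on the data and should be measured.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class ParallelDistinct {

    // Hypothetical helper: the list source is ordered, so distinct() keeps
    // first occurrences and the collected list follows encounter order
    static List<Integer> uniqueInParallel(List<Integer> input) {
        return input.parallelStream()
            .distinct()
            .collect(Collectors.toList());
    }

    // Hypothetical helper: order is explicitly given up, so the result is
    // collected into a Set, which makes no ordering promise either
    static Set<Integer> uniqueUnordered(List<Integer> input) {
        return input.parallelStream()
            .unordered()
            .distinct()
            .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(5, 3, 5, 1, 3, 2, 1);

        System.out.println("Ordered: " + uniqueInParallel(numbers));
        // prints "Ordered: [5, 3, 1, 2]"

        // Contains 1, 2, 3 and 5; the iteration order of the set is not specified
        System.out.println("Unordered: " + uniqueUnordered(numbers));
    }
}
```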

Conclusion

This guide showed how to remove duplicates from a stream in Java 8, covering lists of integers, strings, and custom objects. The distinct() method filters out duplicate elements in a single, readable step, provided that the element type implements equals() and hashCode() correctly. Keep its memory cost and its ordering behavior in mind when working with large or parallel streams.
