Union Find DFS BFS HashMap Sorting

Problem Link on LeetCode: Accounts Merge

Problem Description:

You are given a list of accounts, where each element accounts[i] is a list of strings: the first element accounts[i][0] is a name, and the remaining elements are the emails of that account.

Now, we would like to merge these accounts. Two accounts definitely belong to the same person if there is some common email to both accounts. Note that even if two accounts have the same name, they may belong to different people as people could have the same name. A person can have any number of accounts initially, but all of their accounts definitely have the same name.

After merging the accounts, return the accounts in the following format: the first element of each account is the name, and the rest of the elements are emails in sorted order. The accounts themselves can be returned in any order.

Example 1:

Input:

accounts = 

[
    ["John","johnsmith@mail.com","john_newyork@mail.com"],
    ["John","johnsmith@mail.com","john00@mail.com"],
    ["Mary","mary@mail.com"],
    ["John","johnnybravo@mail.com"]
]

Output:

[
    ["John","john00@mail.com","john_newyork@mail.com","johnsmith@mail.com"],
    ["Mary","mary@mail.com"],
    ["John","johnnybravo@mail.com"]
]

Explanation:

The first and second Johns are the same person because they share the email “johnsmith@mail.com”. The third John and Mary are different people, as none of their email addresses are used by other accounts. We could return these lists in any order; for example, the answer

[
    ['Mary', 'mary@mail.com'],
    ['John', 'johnnybravo@mail.com'], 
    ['John', 'john00@mail.com', 'john_newyork@mail.com', 'johnsmith@mail.com']
] 

would still be accepted.

Example 2:

Input:

accounts = 
    [
        ["Gabe","Gabe0@m.co","Gabe3@m.co","Gabe1@m.co"],
        ["Kevin","Kevin3@m.co","Kevin5@m.co","Kevin0@m.co"],
        ["Ethan","Ethan5@m.co","Ethan4@m.co","Ethan0@m.co"],
        ["Hanzo","Hanzo3@m.co","Hanzo1@m.co","Hanzo0@m.co"],
        ["Fern","Fern5@m.co","Fern1@m.co","Fern0@m.co"]
    ]

Output:

[
    ["Ethan","Ethan0@m.co","Ethan4@m.co","Ethan5@m.co"],
    ["Gabe","Gabe0@m.co","Gabe1@m.co","Gabe3@m.co"],
    ["Hanzo","Hanzo0@m.co","Hanzo1@m.co","Hanzo3@m.co"],
    ["Kevin","Kevin0@m.co","Kevin3@m.co","Kevin5@m.co"],
    ["Fern","Fern0@m.co","Fern1@m.co","Fern5@m.co"]
]

Constraints:

1 <= accounts.length <= 1000
2 <= accounts[i].length <= 10
1 <= accounts[i][j].length <= 30
accounts[i][0] consists of English letters.
accounts[i][j] (for j > 0) is a valid email.

Implementation:

This problem is a great candidate for Union Find, as part of the solution involves identifying accounts that belong to the same person. You can check out my detailed Union Find solution here: Union Find Approach

Union Find is a common choice for this problem, but Depth First Search (DFS) over a graph of emails has the same asymptotic cost, because sorting the merged emails dominates both approaches, and it can perform just as well on smaller inputs. In this article, we’ll use DFS, which is simpler to implement than Union Find.
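For comparison only, here is a minimal sketch of how a Union Find version could look. It is not the code from the linked Union Find article; the class name `AccountUnionFindSketch` and the method `merge` are hypothetical, and the sketch assumes accounts are unioned by index whenever they share an email.

```java
import java.util.*;

// Hypothetical sketch: union account indices that share an email, then group emails by root.
class AccountUnionFindSketch {

    // Finds the root of x, shortening the path as it goes (path halving).
    private static int find(int[] parent, int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];
            x = parent[x];
        }
        return x;
    }

    static List<List<String>> merge(List<List<String>> accounts) {
        int n = accounts.size();
        int[] parent = new int[n];
        for (int i = 0; i < n; i++) {
            parent[i] = i;
        }

        // Remember the first account that listed each email; union on repeats.
        Map<String, Integer> emailOwner = new HashMap<>();
        for (int i = 0; i < n; i++) {
            for (int j = 1; j < accounts.get(i).size(); j++) {
                Integer previous = emailOwner.putIfAbsent(accounts.get(i).get(j), i);
                if (previous != null) {
                    parent[find(parent, i)] = find(parent, previous);
                }
            }
        }

        // Group emails by the root of their owning account; TreeSet keeps them sorted.
        Map<Integer, TreeSet<String>> groups = new HashMap<>();
        for (Map.Entry<String, Integer> entry : emailOwner.entrySet()) {
            int root = find(parent, entry.getValue());
            groups.computeIfAbsent(root, r -> new TreeSet<>()).add(entry.getKey());
        }

        List<List<String>> result = new ArrayList<>();
        for (Map.Entry<Integer, TreeSet<String>> group : groups.entrySet()) {
            List<String> merged = new ArrayList<>();
            merged.add(accounts.get(group.getKey()).get(0)); // name of the root account
            merged.addAll(group.getValue());
            result.add(merged);
        }
        return result;
    }
}
```

The sorting step is still present here (inside the TreeSet), which is why both approaches end up with the same overall bound.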

Depth First Search Approach:

Example 3:

Input: List of accounts with names and emails.

accounts = 
    [
        ["JOHN", "J1", "J2"],
        ["JOHN", "J1", "J3"],
        ["JOHN", "J3", "J4"],
        ["JOHN", "J5"],
        ["MARRY", "M1"]
    ]

Let’s break down the solution into 4 steps.

Steps:

  • Step 1 - Emails to Account Map: Map each email to the name of the account that lists it

Map<String, String> emailToNameMap = new HashMap<>();

// Map emails to account names; putIfAbsent keeps the first name seen for a repeated email
for (List<String> account : accounts) {
    String accountName = account.get(0);
    for (int i = 1; i < account.size(); i++) {
        String email = account.get(i);
        emailToNameMap.putIfAbsent(email, accountName);
    }
}

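Output for Example 3 (a HashMap does not guarantee iteration order, so the entries may print in a different order):
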
{
    M1=MARRY, 
    J1=JOHN, 
    J2=JOHN, 
    J3=JOHN, 
    J4=JOHN, 
    J5=JOHN
}

  • Step 2 - Initialization: Initialize the graph (emailGraph, a Map<String, List<String>>) with an empty adjacency list for every email

emailToNameMap.keySet().forEach(email -> emailGraph.put(email, new ArrayList<>()));

  • Step 3 - Build Graph: Build a bi-directional email graph by linking the first email of each account to every other email in that account

for (List<String> account : accounts) {
    String firstEmail = account.get(1);
    for (int i = 2; i < account.size(); i++) {
        String email = account.get(i);
        emailGraph.get(firstEmail).add(email);
        emailGraph.get(email).add(firstEmail);
    }
}
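To make the shape of the graph concrete, the sketch below builds it for Example 3. It is a variation rather than the article's exact code: the class name `EmailGraphSketch` and method `build` are hypothetical, and it folds Step 2 into Step 3 by creating adjacency lists on demand with `computeIfAbsent`.

```java
import java.util.*;

// Hypothetical sketch: Steps 2 and 3 combined into one pass.
class EmailGraphSketch {

    static Map<String, List<String>> build(List<List<String>> accounts) {
        Map<String, List<String>> emailGraph = new HashMap<>();
        for (List<String> account : accounts) {
            // The first email acts as the hub for this account
            String firstEmail = account.get(1);
            emailGraph.computeIfAbsent(firstEmail, e -> new ArrayList<>());
            for (int i = 2; i < account.size(); i++) {
                String email = account.get(i);
                // Add the edge in both directions
                emailGraph.get(firstEmail).add(email);
                emailGraph.computeIfAbsent(email, e -> new ArrayList<>()).add(firstEmail);
            }
        }
        return emailGraph;
    }
}
```

For Example 3 this gives J1 -> [J2, J3], J2 -> [J1], J3 -> [J1, J4], J4 -> [J3], and empty lists for J5 and M1. J1 through J4 form one connected component even though no single account lists all four emails.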

  • Step 4 - Traverse Graph: Traverse graph using DFS to find connected components

for (String email : emailToNameMap.keySet()) {
    if (!visitedEmails.contains(email)) {
        List<String> connectedEmails = new ArrayList<>();

        dfs(email, connectedEmails);

        Collections.sort(connectedEmails);

        List<String> accountGroup = new ArrayList<>();
        accountGroup.add(emailToNameMap.get(email));
        accountGroup.addAll(connectedEmails);
        mergedAccounts.add(accountGroup);
    }
}

  • Depth First Search

public void dfs(String email, List<String> connectedEmails) {
    connectedEmails.add(email);
    visitedEmails.add(email);
    for (String neighbor : emailGraph.get(email)) {
        if (!visitedEmails.contains(neighbor)) {
            dfs(neighbor, connectedEmails);
        }
    }
}

Time Complexity:

O(nk log nk)

where n is the number of accounts and k is the maximum number of emails in any accounts[i].

In the worst case, all the emails end up belonging to a single person. The total number of emails is then n * k, and sorting them takes O(nk log(nk)) time.

The DFS traversal takes O(nk) operations, as no email is visited more than once and the graph has at most n * k edges.
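As a rough sanity check against the stated constraints, assume the worst case of 1000 accounts with 9 emails each (accounts[i].length <= 10 includes the name). The bound then works out to approximately:

```latex
nk \le 1000 \cdot 9 = 9000,
\qquad
nk \log_2(nk) \le 9000 \cdot \log_2 9000 \approx 9000 \cdot 13.1 \approx 1.2 \times 10^{5}
```

That is on the order of a hundred thousand string comparisons, which is small for these input limits.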

Space Complexity:

O(n + nk) = O(nk)

Building the adjacency list takes O(nk) space, as does the email-to-name map. By the end, the visited HashSet contains every email, so it also uses O(nk) space. The recursive call stack for DFS can likewise grow to O(nk) in the worst case.
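If the recursion depth is a concern, the same traversal can be written with an explicit stack. The following is a sketch of that alternative, not the article's implementation: the class name `IterativeDfsSketch` and method `component` are hypothetical, and the graph and visited set are passed in as parameters instead of being read from fields.

```java
import java.util.*;

// Hypothetical sketch: DFS with an explicit stack instead of recursion.
class IterativeDfsSketch {

    // Collects every email reachable from start and marks each one as visited.
    static List<String> component(String start,
                                  Map<String, List<String>> emailGraph,
                                  Set<String> visitedEmails) {
        List<String> connectedEmails = new ArrayList<>();
        Deque<String> stack = new ArrayDeque<>();
        stack.push(start);
        visitedEmails.add(start);
        while (!stack.isEmpty()) {
            String email = stack.pop();
            connectedEmails.add(email);
            for (String neighbor : emailGraph.getOrDefault(email, Collections.emptyList())) {
                // Set.add returns false when the email was already visited
                if (visitedEmails.add(neighbor)) {
                    stack.push(neighbor);
                }
            }
        }
        return connectedEmails;
    }
}
```

The space used is still O(nk) overall, but it lives on the heap in the ArrayDeque rather than on the call stack.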

Full Code:


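import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class L721AccountsMerge_DFS {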
    
    Map<String, String> emailToNameMap = new HashMap<>();
    Map<String, List<String>> emailGraph = new HashMap<>();
    Set<String> visitedEmails = new HashSet<>();

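    public List<List<String>> accountsMerge(List<List<String>> accounts) {
        // Reset shared state so the method can be called more than once on the same instance
        emailToNameMap.clear();
        emailGraph.clear();
        visitedEmails.clear();

        List<List<String>> mergedAccounts = new ArrayList<>();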

        // Map emails to account names
        for (List<String> account : accounts) {
            String accountName = account.get(0);
            for (int i = 1; i < account.size(); i++) {
                String email = account.get(i);
                emailToNameMap.putIfAbsent(email, accountName);
            }
        }

        // Initialize graph with empty adjacency lists
        emailToNameMap.keySet().forEach(email -> emailGraph.put(email, new ArrayList<>()));

        // Build bi-directional email graph
        for (List<String> account : accounts) {
            String firstEmail = account.get(1);
            for (int i = 2; i < account.size(); i++) {
                String email = account.get(i);
                emailGraph.get(firstEmail).add(email);
                emailGraph.get(email).add(firstEmail);
            }
        }

        // Traverse graph using DFS to find connected components
        for (String email : emailToNameMap.keySet()) {
            if (!visitedEmails.contains(email)) {
                List<String> connectedEmails = new ArrayList<>();

                dfs(email, connectedEmails);

                Collections.sort(connectedEmails);

                List<String> accountGroup = new ArrayList<>();
                accountGroup.add(emailToNameMap.get(email));
                accountGroup.addAll(connectedEmails);
                mergedAccounts.add(accountGroup);
            }
        }

        return mergedAccounts;
    }

    public void dfs(String email, List<String> connectedEmails) {
        connectedEmails.add(email);
        visitedEmails.add(email);
        for (String neighbor : emailGraph.get(email)) {
            if (!visitedEmails.contains(neighbor)) {
                dfs(neighbor, connectedEmails);
            }
        }
    }

}

Author: Mohammad J Iqbal
