Repeated DNA Sequences LeetCode Solution

Last updated on January 13th, 2025 at 09:36 pm

Here, we see a Repeated DNA Sequences LeetCode Solution. This Leetcode problem is solved using different approaches in many programming languages, such as C++, Java, JavaScript, Python, etc.

List of all LeetCode Solution

Topics

Depth-First Search, Tree

Companies

Microsoft

Level of Question

Medium

Repeated DNA Sequences LeetCode Solution

Repeated DNA Sequences LeetCode Solution

1. Problem Statement

The DNA sequence is composed of a series of nucleotides abbreviated as ‘A’, ‘C’, ‘G’, and ‘T’.

  • For example, “ACGAATTCCG” is a DNA sequence.

When studying DNA, it is useful to identify repeated sequences within the DNA.

Given a string s that represents a DNA sequence, return all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule. You may return the answer in any order.

Example 1:
Input: s = “AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT”
Output: [“AAAAACCCCC”,”CCCCCAAAAA”]

Example 2:
Input: s = “AAAAAAAAAAAAA”
Output: [“AAAAAAAAAA”]

2. Coding Pattern Used in Solution

The coding pattern used in all the provided implementations is Sliding Window. The sliding window technique is applied here to process substrings of length 10 (a fixed window size) as the code iterates through the input string. The window slides one character at a time, and the substrings are checked for repetitions using hash-based data structures (e.g., unordered_mapSet, or defaultdict).

3. Code Implementation in Different Languages

3.1 Repeated DNA Sequences C++

class Solution {
public:
    vector<string> findRepeatedDnaSequences(string s) {
        unordered_map<size_t,int> MP;
        hash<string> hash_fn;
        vector<string> ret;
        
        for(int i = 0; i < int(s.size()) - 9; ++i)
            if(MP[hash_fn(s.substr(i,10))]++ == 1 )
               ret.push_back(s.substr(i,10));
               
      return ret;        
    }
};

3.2 Repeated DNA Sequences Java

class Solution {
    public List<String> findRepeatedDnaSequences(String s) {
        Set seen = new HashSet(), repeated = new HashSet();
        for (int i = 0; i + 9 < s.length(); i++) {
            String ten = s.substring(i, i + 10);
            if (!seen.add(ten))
                repeated.add(ten);
        }
        return new ArrayList(repeated);
    }
}

3.3 Repeated DNA Sequences JavaScript

var findRepeatedDnaSequences = function(s) {
    let curr = s.slice(0, 10);
    const seen = new Set([curr]);
    const res = new Set();
    for(let i = 10; i < s.length; i++) {
        curr = curr.slice(1) + s[i];
        if(seen.has(curr)) res.add(curr);
        seen.add(curr);
    }
    return [...res];
};

3.4 Repeated DNA Sequences Python

class Solution(object):
    def findRepeatedDnaSequences(self, s):
        sequences = collections.defaultdict(int)
        for i in range(len(s)):
            sequences[s[i:i+10]] += 1
        return [key for key, value in sequences.iteritems() if value > 1]

4. Time and Space Complexity

Time ComplexitySpace Complexity
C++O(n)O(n)
JavaO(n)O(n)
JavaScriptO(n)O(n)
PythonO(n)O(n)
  • Sliding Window: The fixed-length substrings of size 10 are processed using a sliding window approach.
  • Hashing: Hash-based data structures (e.g., unordered_mapSet, or defaultdict) are used to efficiently track and detect repeated substrings.
  • Efficiency: The time complexity is O(n) because each substring is processed in constant time, and the space complexity is O(n) due to the storage of substrings in hash-based data structures.
Scroll to Top