Question
All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG"
. When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.
Write a function to find all the 10-letter-long sequences (substrings) that > occur more than once in a DNA molecule.
For example,
Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"
,
Return: ["AAAAACCCCC", "CCCCCAAAAA"]
.
Solution
Result: Accepted Time: 136 ms
Here should be some explanations.
class Solution {
public:
int chrs[256]={0};
int get_value(const string s)
{
int ret = 0;
for(const auto x:s)
ret = (ret<<2)|chrs[x];
return ret;
}
vector<string> findRepeatedDnaSequences(string s) {
chrs['A'] = 0;
chrs['C'] = 1;
chrs['G'] = 2;
chrs['T'] = 3;
unordered_map<int,int> mp;
vector<string> ret;
const int limit = int(s.length()) - 9;
for(int i = 0; i < limit; i++)
{
string tmp = s.substr(i,10);
int value = get_value(tmp);
mp[value]++;
if(mp[value] == 2)
ret.push_back(tmp);
}
return ret;
}
};
Complexity Analytics
- Time Complexity:
- Space Complexity: