string - What is the best algorithm to find longest substring with constraints? -


unfortunately don't know name of following problem sure known problem. want find effective algorithm solve problem.

let s - input string , k - number (1 <= k <= 26).

problem find longest substring of s, has k different characters. best algorithm solve problem?

some examples:

1) s = aaaaabcdef, k = 3, answer = aaaaabc

2) s = acaaba, k = 2, answer = acaa or aaba

3) s = abcde, k = 5, answer = abcde

i have sketch of solution of problem. seems difficult me, has quadratic complexity. so, in single linear pass can compute sequent of same characters 1 , appropriated count. next step use set contain k characters. usage similar:

std::string max_string; (int = 0; < s.size(); ++i) {    std::set<int> my_set;    std::string possible_solution;    (int j = i; j < s.size(); ++j)    {        // filling set , possible_solution    }    if (my_set.size() == k && possible_solution.size() > max_string.size())       max_string = possible_solution;  } 

notation:
s = input string, zero-based index
[start, end) = substring of input start end, including start excluding end
k-substring = substring contains at most k different characters

algorithm: linear complexity o(n)

start = 0 result = empty string find max(end): [start, end) k-substring loop:   // please note in every loop iteration, [start, end) k-substring   update result=[start, end) if (end-start) > length(result)   if end >= length(s) done! exit   increase start until [start, end) (k-1)-substring   increase end while [start, end] k-substring endloop 

to check if increasing start or end respectively decrease or increase character pool size (k property), can use count[] array, count[c] = number of occurence of c in current substring [start, end).

c++ implementation: http://ideone.com/i2jpcq


Comments

Popular posts from this blog

amazon web services - S3 Pre-signed POST validate file type? -

c# - Check Keyboard Input Winforms -