So, we might start this exercise by first writing a function to classify a string as a palindrome or not. A nice way to do this is to start with a pointer at either end of the string, and iterate them inwards, checking the lower and upper characters against each other at each stage. For example, if the first and last characters are equal, we iterate inwards. If not, we can return "false" immmediately. The following snippet shows this function.
bool is_pallindrome(string input){
int lower=0;
int upper=input.length()-1;
while (upper>lower){
if (input[upper]!=input[lower]) //If we find differing characters, not a pallindrome
return false;
upper--;
lower++;
}
return true;
}
So now, to find the longest palindrome in a string, we could just iterate over the string, looking at all substrings, and calling is_pallindrome() [notice my typo in the function name]. That sounds pretty straight-forward, but searching all substrings becomes O(N^2), kind of cumbersome.
A nice way to use the is_pallindrome() function is to think about strings that are centered at index i. So, we have a pointer that searches left from i, and a pointer that searches right from i, and we stop iterating when input[lower]!=input[upper]. This is very similar to the is_pallindrome() function.
There's another approach that's more efficient than the above. Sometimes a good way to think about such a problem is to find the desired subset that ends at index i. If I'm searching for a palindrome that ends at index i, I can store the longest palindrome that ended at index i-1. Then I simply have to look at the first index of that palindrome, and the character that precedes it in the input string.
The storage/caching can be implemented with a hashmap, so we store the longest palindrome that ends at each index, and then we can look it up at the next step. This means, each additional step only features two more comparisons. The following snippet implements this approach:
string longest_pallindrome_map(string input){
map<int, string> pal; //hashmap from index to longest pallindrome that *ends* at index
int max_length=0; int max_index=0;
for (int i=0; i<input.size(); i++){
int prev_index=i-pal[i-1].length()-1;
if (pal[i-1][0]==input[i]) //Looking for consecutive duplicate letters
pal[i]=pal[i-1]+input[i];
else if (input[i]==input[prev_index])
pal[i]=input[prev_index]+pal[i-1]+input[i]; //Looking at the character prior to previous pallindrome
else
pal[i]=string(input.begin()+i, input.begin()+i+1); //Longest pallindrome is current character
if (pal[i].length()>max_length){
max_index=i;
max_length=pal[i].length();
}
}
return pal[max_index];
}
This is actually quite a bit more efficient in the searching, more like pseudo-linear instead of quadratic search. [Again notice my mispelling of palindrome.] The only thing is, we are storing a bunch of strings in this hashmap. We don't actually need to do this.
Instead, we can just store the indices or pointers that begin the palindromes that ends at each index in the input string. The following function does this.
string longest_pallindrome(string input){
map<int, int> pal; //hashmap from current index, to the begin-index of the longest pallindrome that *ends* here
int max_length=0;
int max_ind=0;
for (int i=0; i<input.size(); i++){
int prev_index=pal[i-1];
if (input[i]==input[prev_index]){ //Looking for consecutive duplicate characters
pal[i]=prev_index;
}
else if (input[i]==input[prev_index-1]) //Looking at the index prior to the last pallindrome
pal[i]=prev_index-1;
else
pal[i]=i; //Longest pallindrome is simply the current character
if ((i-pal[i])>max_length){ //Finding the longest one along the way
max_ind=i;
max_length=i-pal[i];
}
}
return string(input.begin()+pal[max_ind], input.begin()+max_ind+1);
}
So now we have, via dynamic programming, a pseudo-linear search, in that we basically have a linear search over the input string, but we have to store O(N) indices.
Did I really need the hashmap? Not necessarily; I could have used a regular old array, and had some code to check_bounds. On the other hand, the hashmap approach is generalizable to other problems, where the key values might not be simple ordinal numbers.
No comments:
Post a Comment