Sayanan Sivaraman, PhD: November 2014

Sometimes you come across puzzles or interview questions along the lines of "Find the longest palindrome in a string." To be honest, I've never looked for palindromes in my work, but I think it's a good exercise nonetheless. For one thing, the "best practices" strategies can be applied to other problems. For another, the search for symmetric subsets in datasets of all types definitely does come up.

So, we might start this exercise by first writing a function to classify a string as a palindrome or not. A nice way to do this is to start with a pointer at either end of the string, and iterate them inwards, checking the lower and upper characters against each other at each stage. For example, if the first and last characters are equal, we iterate inwards. If not, we can return "false" immmediately. The following snippet shows this function.

 bool is_pallindrome(string input){  
      int lower=0;  
      int upper=input.length()-1;  
      while (upper>lower){  
           if (input[upper]!=input[lower]) //If we find differing characters, not a pallindrome  
                return false;  
           upper--;  
           lower++;  
      }  
      return true;  
 }

So now, to find the longest palindrome in a string, we could just iterate over the string, looking at all substrings, and calling is_pallindrome() [notice my typo in the function name]. That sounds pretty straight-forward, but searching all substrings becomes O(N^2), kind of cumbersome.

A nice way to use the is_pallindrome() function is to think about strings that are centered at index i. So, we have a pointer that searches left from i, and a pointer that searches right from i, and we stop iterating when input[lower]!=input[upper]. This is very similar to the is_pallindrome() function.

There's another approach that's more efficient than the above. Sometimes a good way to think about such a problem is to find the desired subset that ends at index i. If I'm searching for a palindrome that ends at index i, I can store the longest palindrome that ended at index i-1. Then I simply have to look at the first index of that palindrome, and the character that precedes it in the input string.

The storage/caching can be implemented with a hashmap, so we store the longest palindrome that ends at each index, and then we can look it up at the next step. This means, each additional step only features two more comparisons. The following snippet implements this approach:

 string longest_pallindrome_map(string input){  
      map<int, string> pal; //hashmap from index to longest pallindrome that *ends* at index  
      int max_length=0; int max_index=0;  
      for (int i=0; i<input.size(); i++){  
           int prev_index=i-pal[i-1].length()-1;            
           if (pal[i-1][0]==input[i]) //Looking for consecutive duplicate letters  
                pal[i]=pal[i-1]+input[i];            
           else if (input[i]==input[prev_index])  
                pal[i]=input[prev_index]+pal[i-1]+input[i]; //Looking at the character prior to previous pallindrome  
           else  
                pal[i]=string(input.begin()+i, input.begin()+i+1); //Longest pallindrome is current character  
   
           if (pal[i].length()>max_length){  
                max_index=i;  
                max_length=pal[i].length();            
           }            
      }  
      return pal[max_index];  
 }

This is actually quite a bit more efficient in the searching, more like pseudo-linear instead of quadratic search. [Again notice my mispelling of palindrome.] The only thing is, we are storing a bunch of strings in this hashmap. We don't actually need to do this.

Instead, we can just store the indices or pointers that begin the palindromes that ends at each index in the input string. The following function does this.

 string longest_pallindrome(string input){       
      map<int, int> pal; //hashmap from current index, to the begin-index of the longest pallindrome that *ends* here  
      int max_length=0;  
      int max_ind=0;  
      for (int i=0; i<input.size(); i++){                           
           int prev_index=pal[i-1];                                
           if (input[i]==input[prev_index]){ //Looking for consecutive duplicate characters  
                pal[i]=prev_index;  
           }  
           else if (input[i]==input[prev_index-1]) //Looking at the index prior to the last pallindrome  
                pal[i]=prev_index-1;  
           else  
                pal[i]=i; //Longest pallindrome is simply the current character  
           if ((i-pal[i])>max_length){ //Finding the longest one along the way  
                max_ind=i;  
                max_length=i-pal[i];  
           }                           
      }            
      return string(input.begin()+pal[max_ind], input.begin()+max_ind+1);  
 }

So now we have, via dynamic programming, a pseudo-linear search, in that we basically have a linear search over the input string, but we have to store O(N) indices.

Did I really need the hashmap? Not necessarily; I could have used a regular old array, and had some code to check_bounds. On the other hand, the hashmap approach is generalizable to other problems, where the key values might not be simple ordinal numbers.

If you live in the Bay Area, you know that the cost of living is high. By "cost of living," I don't mean food prices, fuel prices, or many other goods and services. I mainly mean the high prices for renting or purchasing property.

While many areas offer an array of low-end to high-end housing solutions, where the renter/buyer can adapt his budget to the options, San Francisco offers almost no availability at the low-end, and almost everything else at high-end prices, with premium-end a growing market share.

There is an array of historical reasons behind this strange, skewed distribution of housing options. The underlying reason is a geographically constrained area, with ample economic opportunities, which attracts a lot of people from all over the world. This underlying problem is exacerbated by Not-In-My-Backyard policies that make it difficult to develop higher-rise, higher-density housing.

Prop 13 severely reduces liquidity in the property market, because if you already own property and decide to move, you stand to increase your tax burden; all property owners stay put. Finally, rent control ensures that renters also stay put.

In 1978, people got tired of property tax increases, so they passed Prop 13 to limit/eliminate property tax increases, and consequences be damned! In 1979, San Franciscans got tired of rent increases, so they passed rent-control, and consequences be damned! Each of these policies are best-described as a "We were here first!" tax on newcomers.

Down where I live, on the Peninsula/Silicon Valley, there is no rent control, but rents are still very high. We have similarly high demand for housing here, and many high-paying employers nearby. One additional reason appears to be monopolistic ownership of rental properties. The same 2-3 companies own all the apartment complexes in town, so they alone control the going rents.

So what's wrong with high rents/home prices? Well, for the lower-income, they're a massive burden. For the middle and higher-income, they're a huge waste of resources. The excess money someone spends on rent or mortgage could go to purchasing other goods and services, creating more economic activity.

In any case, there's a great article on techcrunch that details the housing situation, and historical factors in San Francisco. It's well-researched, and a long but informative read.

Sayanan Sivaraman, PhD

Pages

Friday, November 14, 2014

Pallindromes in C++

Thursday, November 6, 2014

Housing Affordability in San Francisco [and the Bay Area]