Friday, June 12, 2015

Opposition to Affordable Housing in Marin

Marketplace, on NPR, is running a series on affordable housing in Marin County. Marin County, just north of San Francisco, is a wealthy, Democratic-voting, suburban and rural county from which "liberal lion" Barbara Boxer hails.

George Lucas, of Star Wars fame, owns a former dairy farm in Marin County. He has offered to build affordable and working-class housing on his property, out of pocket. Local residents are opposing his plan, citing weak reasons like potential traffic or aesthetics.

In reality, this comes down to "I was here first!" and "Not In My Back Yard!" [NIMBY]. To get a good flavor of the situation, it's very informative to listen to the story from NPR, especially the town hall meeting. 

Wealthy homeowners in the Bay Area are simply unwilling to live near people who are not as wealthy as they are. They can't imagine that the people who pick the crops, clean the homes of the wealthy, work in their gardens, and teach in their schools, might have to live somewhere nearby.

The standard refrain is that "those people" can just live farther away and commute. The problem is that there is no "farther away" with both affordable housing and efficient transportation infrastructure. Living more than 20 miles from your workplace is a terrible idea in the Bay Area, because the roads are chronically congested.

Saturday, May 23, 2015

El Niño Finally Arrives

El Niño climate events are associated with low atmospheric pressure, resulting in cooler, wetter weather in California. Last autumn, with climate scientists predicting an upcoming El Niño event, there was hope that it would coincide with California's winter, which is the rainy season, leading to healthy rainfall and mitigating the four-year-old drought.

Unfortunately, El Niño never arrived during California's rainy season. It has only very recently arrived, as we can verify at NOAA's website. This El Niño event is still associated with cooler, wetter weather relative to a typical California spring and summer. However, at this time of year, there aren't many major wet storms in the Northern Pacific to route to California. As such, we'll have a chilly, drizzly spring and summer, but little in the way of drought relief.

I lived in San Diego during one of the recent El Niño summers. It never really got warm. It was a chilly, unsatisfying summer. Perhaps this summer's weather will be similar.

Sunday, May 3, 2015

The Housing Crunch Continues

In the SF Bay Area, the housing crunch, complete with spikes in rent and home prices, continues. I've maintained all along that housing prices are not justified even for high-paid tech workers. Most people who work as engineers in this area cannot afford to purchase a house, and even finding reasonable rent is difficult.

The local news featured this article recently. It's anecdotal, but the premise is that even Google software engineers have a difficult time finding housing in Mountain View.

Monday, February 9, 2015

Enjoying Judo

At this point, I've been training judo for about a month, and I'm really enjoying it. It's pretty amazing how the art exploits balance and leverage. The most elegant judo throws and sweeps use the opponent's weight and balance against him. It's an art that accommodates different body types, heights, weights, etc. In fact, being short with a low center of mass can be quite an asset; shorter opponents are often harder to throw, but have an easier time throwing their taller opponents.

I enjoy the variety that judo presents as well. There are throws, sweeps, takedowns, and ground-based grappling. I've been attending trainings at Cahill Judo Academy in San Bruno. The chief instructor there, Willy Cahill, has coached judo for decades, including work with the US Olympic team.

Monday, January 26, 2015

Find the first substring in C++

Years ago, I interviewed for a summer internship, and I got a coding question along the lines of: "Implement a function that finds a substring within a string. If that substring is found, return the index of the found substring."

I fumbled through the question. Eventually I got a working solution, but it was a clumsy process getting there. Why was such a simple question so difficult to implement on the spot?

In hindsight, I think this is a good interview question for a few reasons. The problem is deceptively simple; just about anybody is capable of doing this correctly, but it's difficult to do it well and efficiently, especially on the spot. Because the problem is so simple, the focus of evaluation lies in your style and thought process. Do you clutter your code with unnecessary variables? Is your code verbose or terse? How many functions do you compose, and why? Is your style sloppy or bug-prone? I didn't get that particular internship because my code was definitely sloppy and bug-prone.

I think there's one more aspect of this question that causes applicants to freeze: strings. In C++ and Java, people tend to use the std::string/String classes, which have lots of utility functions built in. In day-to-day programming, you would rarely iterate over a string manually, and you might also avoid using pointers of type char*.

In this particular example, I think it's actually easier to use pointers than integer indices; this way you have fewer local variables to keep track of. To some extent, having fewer variables means fewer opportunities to introduce a bug.

In my version, I like splitting it into two functions. The primary function simply iterates over the outer string and checks, at each starting position, whether the sought string matches. Here's the code:

bool matched(const char* sought, const char* sent){
    //Utility function: does sent begin with sought?
    while (*sought!='\0'){
        if (*sent=='\0') //sought is longer than sent
            return false;
        if (*sent!=*sought) //The characters don't match
            return false;
        sent++; sought++; //Check next character
    }
    return true;
}

int substring(const char* sought, const char* sentence){
    //Primary function: index of first match, or -1
    const char* iter=sentence; //iterate over the string
    while (*iter!='\0'){
        if (matched(sought, iter))
            return iter-sentence; //offset of the match
        iter++;
    }
    return -1;
}

As you can see, pointers reduce clutter here: a pointer already encodes a position within the character array, so there is no separate index variable to maintain.

Thursday, January 15, 2015

Matplotlib Animations Using Data Queried from a SQL Table

In prior examples, I've provided a couple of snippets for writing videos using Matplotlib. In this example, instead of plotting data stored in files, I'm using data stored in a SQL table. In this particular instance, I'm using MySQL for convenience.

These days, your choice of database is about as controversial as discussing politics and religion; it's the third rail of...data? You can choose the best database and table structure to suit your particular needs; MySQL serves just fine as an example. In any case, SQL syntax is mostly standard across databases, and the Python database interfaces are also quite standard; they mainly consist of instantiating a connection and a cursor, and executing SQL commands.
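To illustrate that common pattern without needing a running MySQL server, here's a minimal sketch using the standard library's sqlite3 module (the table and values are invented for the example; any DB-API 2.0 driver looks much the same):

```python
import sqlite3  # MySQLdb, psycopg2, etc. follow the same connect/cursor/execute pattern

con = sqlite3.connect(":memory:")  # throwaway in-memory database, just for the sketch
cur = con.cursor()
cur.execute("CREATE TABLE frames (Frame INTEGER, Position REAL)")  # illustrative schema
cur.execute("INSERT INTO frames VALUES (0, 1.5)")
cur.execute("SELECT Position FROM frames WHERE Frame=?", (0,))  # parameterized query
rows = cur.fetchall()  # a list of row tuples
con.close()
```

The connection, cursor, execute, and fetchall steps are exactly what the MySQL code below does; only the driver and connection arguments change.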

OK, so now to the code. In prior examples, we needed to know the number of frames to plot when we initialize the Matplotlib animation. In my case, I've structured the data so that a given row in the table corresponds to one data frame, and to one plotted frame in the Matplotlib animation. So, I have a couple of utility functions: one to get the number of frames we'll plot, and the other to query a particular table entry, given the desired row [frame] and column.


import sys
import MySQLdb as mdb
from TableDef import columns #Custom definition of the table structure

def count(table, serverIP, dbUser, password, database):
  con = None
  try:
    con = mdb.connect(serverIP, dbUser, password, database)
    with con:
      cur = con.cursor()
      cur.execute("SET sql_mode='ANSI_QUOTES'")
      cur.execute('SELECT COUNT(*) FROM "%s"' % table)
      rows = cur.fetchall()
      count = rows[0][0]
  except mdb.Error, e:
    print "Error %d: %s" % (e.args[0], e.args[1])
    sys.exit(1)
  finally:
    if con:
      con.close()
  return count

def getdataframe(table, column, frame, serverIP, dbUser, password, database):
  if column not in columns: #Only query known columns
    return
  if frame > count(table, serverIP, dbUser, password, database):
    return
  con = None
  try:
    con = mdb.connect(serverIP, dbUser, password, database)
    with con:
      cur = con.cursor()
      cur.execute("SET sql_mode='ANSI_QUOTES'")
      com = 'SELECT %s FROM "%s" WHERE Frame=%s' % (column, table, frame)
      cur.execute(com)
      rows = cur.fetchall()
      data = rows[0][0]
  except mdb.Error, e:
    print "Error %d: %s" % (e.args[0], e.args[1])
    sys.exit(1)
  finally:
    if con:
      con.close()
  return data

Note that I have a custom Python module called TableDef, where I've declared various parameters relevant to the table and database. This part depends on your particular application, how you structure your data, etc. I could have wrapped these table functions in a class if I wanted a more OO feel.
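For concreteness, here is a minimal sketch of what such a TableDef module might contain. The names (columns, DatabaseCredentials) match the imports in the code above, but all the values are purely illustrative assumptions:

```python
# TableDef.py -- hypothetical contents; adapt to your own schema and server
columns = ["Frame", "Data"]  # column names, in table order (illustrative)

class DatabaseCredentials:
  # Connection parameters, bundled so callers can pass DB.serverIP, DB.user, etc.
  serverIP = "localhost"
  user = "plotter"
  password = "secret"
  name = "experiments"  # database name
```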

Now, when I build my Matplotlib animation, I can use these utility functions to get the total number of frames and to query individual data frames.


import sys
import os
import json
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as animation
from file_utils import * #table_from_folder is defined here
import table_read as Table
from TableDef import columns #Custom definition of the table structure
from TableDef import DatabaseCredentials as DB #Custom database credentials

def init():
  plotline.set_data([], [])
  dplotline.set_data([], [])
  return plotline, dplotline,

def animate(i):
  if (i % 100 == 0):
    print i #Progress indicator

  line = Table.getdataframe(table, columns[1], i, DB.serverIP, DB.user, DB.password, DB.name)
  try:
    jsondata = json.loads(line)
  except ValueError:
    print "Invalid JSON"
    print line
    sys.exit(1)

  pos = jsondata["Position"]
  vel = jsondata["Velocity"]

  pos_buf.append(pos)
  vel_buf.append(vel)

  #Keep only the most recent buffersize samples
  while len(pos_buf) > buffersize:
    pos_buf.pop(0)
  while len(vel_buf) > buffersize:
    vel_buf.pop(0)

  x = np.linspace(-len(pos_buf)/framerate, 0, len(pos_buf))
  dx = np.linspace(-len(vel_buf)/framerate, 0, len(vel_buf))

  plotline.set_data(x, pos_buf)
  dplotline.set_data(dx, vel_buf)
  return plotline, dplotline,

table = ""
framerate = 20.0
buffersize = 5*20 #5 seconds of data at 20 frames per second
vel_buf = []
pos_buf = []

fig, ax = plt.subplots()
ax.set_xlim(-5.0, 2.0)
ax.set_ylim(-180, 180)
my_dpi = 96
fig.set_size_inches(640/my_dpi, 360/my_dpi)
plotline, = ax.plot([], [], lw=2, color='b')
dplotline, = ax.plot([], [], lw=2, color='r')

if __name__ == '__main__':
  if len(sys.argv) < 2:
    print "Usage:\nplot_pos_vel.py folder"
    exit()

  folder = sys.argv[1]
  outfile = folder + "/pos_vel.avi"
  if (os.path.isfile(outfile)):
    exit()
  table = table_from_folder(folder) #Custom function that maps folders to their table representation
  line_count = Table.count(table, DB.serverIP, DB.user, DB.password, DB.name)

  Writer = animation.writers['ffmpeg']
  writer = Writer(fps=25, metadata=dict(artist='Sayanan'), extra_args=['-vcodec', 'libx264'])
  anim = animation.FuncAnimation(fig, animate, init_func=init, frames=line_count-1, blit=True, interval=1, repeat=False)
  anim.save(outfile, writer=writer)

  #plt.show()

Within the entries in the table, I've further structured my data as JSON for position and velocity. Again, this is really a design choice, and depending on your database, JSON encoding might even be native.
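As a sketch of that per-row format: the field names Position and Velocity come from the animate function above, but the stored string and its values are invented for illustration:

```python
import json

# One table entry, as it might be stored in the data column (values invented)
row = '{"Position": 12.5, "Velocity": -3.1}'

frame = json.loads(row)  # decode the JSON payload for one plotted frame
pos = frame["Position"]
vel = frame["Velocity"]
```

Each call to animate decodes one such row and appends the decoded values to the plotting buffers.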

Other than that, this is the same Matplotlib example as before, but using a database for the data.