<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
    <title>Pseudointellectual Appendification</title>
    <link rel="alternate" type="text/html" href="http://www.mischievous.org/" />
    <link rel="self" type="application/atom+xml" href="http://www.mischievous.org/atom.xml" />
    <id>tag:www.mischievous.org,2009-06-02://1</id>
    <updated>2012-01-11T07:46:26Z</updated>
    <subtitle>The official long form musings and opinions by Jason Culverhouse on programming, Silicon Valley, venture capital, startups, charity and politics.</subtitle>
    <generator uri="http://www.sixapart.com/movabletype/">Movable Type 4.34-en</generator>

<entry>
    <title>Obscuring Numbers Generated by Auto Incrementing Primary Keys</title>
    <link rel="alternate" type="text/html" href="http://www.mischievous.org/2012/01/obscuring-numbers-generated-by.html" />
    <id>tag:www.mischievous.org,2012://1.18</id>

    <published>2012-01-11T07:44:25Z</published>
    <updated>2012-01-11T07:46:26Z</updated>

    <summary>When you build a website, it is often filled with objects that use serial column types; these are usually auto-incrementing integers. Often you want to obscure these numbers since they may convey some business value, i.e. number of sales, users...</summary>
    <author>
        <name>Jason Culverhouse</name>
        <uri>http://www.mischievous.org/jason/</uri>
    </author>
    
        <category term="python" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en-US" xml:base="http://www.mischievous.org/">
        <![CDATA[<p>When you build a website, it is often filled with objects that use serial column types; these are usually auto-incrementing integers.  Often you want to obscure these numbers since they may convey some business value, i.e. number of sales, users, reviews, etc.  A better idea is to use an actual <a href="http://en.wikipedia.org/wiki/Natural_key">natural key</a>, which exposes the actual domain name of the object rather than some numeric identifier.  It's not always possible to produce a natural key for every object, and when you can't, consider obscuring the serial id.</p>

<p>This doesn't secure the numbers that convey business value; it only conceals them from the casual observer.  Here is an alternative that uses the bit mixing properties of <a href="http://en.wikipedia.org/wiki/Exclusive_or">exclusive or (XOR)</a>, some compression by conversion into "base36" (<a href="https://github.com/reddit/reddit/blob/master/r2/r2/lib/utils/_utils.pyx">via reddit</a>), and some bit shuffling so that the least significant bit is moved, which minimizes the serial appearance.  You should be able to adapt this code to alternative bit sizes and shuffling patterns with some small changes.  Just note that I am using signed integers, so it is important to keep the high bit 0 to avoid negative numbers, which cannot be converted via the "base36" algorithm.</p>

<p><a href="http://wiki.python.org/moin/BitManipulation">Twiddling bits</a> in Python isn't fun, so I used the excellent <a href="http://packages.python.org/bitstring/">bitstring module</a>.</p>

<pre class="brush: python">
    from bitstring import Bits, BitArray

    #set the mask to whatever you want, just keep the high bit 0 (or use bitstring's uint)
    XOR_MASK = Bits(int=0x71234567, length=32)

    # base36 the reddit way 
    # https://github.com/reddit/reddit/blob/master/r2/r2/lib/utils/_utils.pyx
    # happens to be easy to convert back to an int using int('foo', 36)
    # int with base conversion is case insensitive
    def to_base(q, alphabet):
        if q < 0: raise ValueError("must supply a non-negative integer")
        l = len(alphabet)
        converted = []
        while q != 0:
            q, r = divmod(q, l)
            converted.insert(0, alphabet[r])
        return "".join(converted) or '0'

    def to36(q):
        return to_base(q, '0123456789abcdefghijklmnopqrstuvwxyz')

    def shuffle(ba, start=(1,16,8), end=(16,32,16), reverse=False):
        """
        flip some bits around
        '0x10101010' -> '0x04200808'
        """
        b = BitArray(ba)
        spans = list(zip(start, end))
        if reverse:
            spans.reverse()
        # explicit loop: map() is lazy on Python 3 and would never call b.reverse
        for s, e in spans:
            b.reverse(s, e)
        return b

    def encode(num):
        """
        Encodes numbers to strings

        >>> encode(1)
        've4d47'

        >>> encode(2)
        've3b6v'
        """
        return to36((shuffle(BitArray(int=num,length=32)) ^ XOR_MASK).int)

    def decode(q):
        """
        Decodes strings to numbers (case insensitive)

        >>> decode('ve3b6v')
        2

        >>> decode('Ve3b6V')
        2

        """    
        return (shuffle(BitArray(int=int(q,36),length=32) ^ XOR_MASK, reverse=True) ).int
</pre>
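
<p>For readers who would rather avoid the extra dependency, the same shuffle-then-XOR scheme can be sketched with plain integers. This is only an illustration, assuming the same 32-bit width, mask and reversal spans as the code above; <code>reverse_bits</code> and <code>SPANS</code> are names made up for this sketch and are not part of bitstring.</p>

```python
XOR_MASK = 0x71234567  # same mask as above; the high bit stays 0
WIDTH = 32
ALPHABET = '0123456789abcdefghijklmnopqrstuvwxyz'
# (start, end) pairs counted from the most significant bit (assumed to
# mirror the start/end defaults of shuffle() above)
SPANS = ((1, 16), (16, 32), (8, 16))


def reverse_bits(value, start, end):
    """Reverse the bits in [start, end), position 0 being the high bit."""
    bits = format(value, '0%db' % WIDTH)
    return int(bits[:start] + bits[start:end][::-1] + bits[end:], 2)


def shuffle(value, reverse=False):
    # Undo the reversals in the opposite order when decoding.
    spans = SPANS[::-1] if reverse else SPANS
    for start, end in spans:
        value = reverse_bits(value, start, end)
    return value


def to36(q):
    digits = []
    while q:
        q, r = divmod(q, 36)
        digits.append(ALPHABET[r])
    return ''.join(reversed(digits)) or '0'


def encode(num):
    if not 0 <= num < 2 ** (WIDTH - 1):
        raise ValueError("num must be non-negative and fit in 31 bits")
    return to36(shuffle(num) ^ XOR_MASK)


def decode(text):
    return shuffle(int(text, 36) ^ XOR_MASK, reverse=True)
```

<p>With these definitions, <code>encode(1)</code> gives <code>'ve4d47'</code> and <code>decode('Ve3b6V')</code> gives <code>2</code>, matching the doctests above.</p>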
]]>
        

    </content>
</entry>

<entry>
    <title>Your Google+ Profile is like a Facebook Feed in Every Search</title>
    <link rel="alternate" type="text/html" href="http://www.mischievous.org/2012/01/your-google-profile-is-like-a-f.html" />
    <id>tag:www.mischievous.org,2012://1.17</id>

    <published>2012-01-10T17:20:43Z</published>
    <updated>2012-01-10T19:00:19Z</updated>

    <summary>I&apos;m not a real big Google+ user, but I may consider changing my ways. I really like the &quot;You shared this&quot; feature and its integration with Google&apos;s Author Information in Search Results. When you set everything up properly it...</summary>
    <author>
        <name>Jason Culverhouse</name>
        <uri>http://www.mischievous.org/jason/</uri>
    </author>
    
        <category term="facebook" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="google" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="facebook" label="facebook" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="google" label="google" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en-US" xml:base="http://www.mischievous.org/">
        <![CDATA[<p>I'm not a real big Google+ user, but I may consider changing my ways.  I really like the "You shared this" feature and its integration with Google's <a href="http://support.google.com/webmasters/bin/answer.py?hl=en&amp;answer=1408986">Author Information in Search Results</a>.  When you set everything up properly, it leads to "effortless sharing", at least given the latest change to Google's <a href="http://bits.blogs.nytimes.com/2012/01/10/google-adds-posts-from-its-social-network-to-search-results/">Social Posts in Search Results</a>.  If you want to be an influencer among the digerati, it might be time to reevaluate using Google+.  These results are also transitive: even if someone isn't directly in your circle, if they are in one of your friend's circles <strong>you can still influence their search results</strong> and possibly take up one of the bottom results on the first search page.</p>

<p>A sample of what search results with social posts look like given my circle of friends:</p>

<p>This is a SERP for an article that was "shared by me" because I have a Google+ author profile link on my blog pages.  I never had to share this article, but Google can identify it as "shared by me".
<img alt="jason_culverhouse_author_profile.png" src="http://www.mischievous.org/images/jason_culverhouse_author_profile.png" width="551" height="124" class="mt-image-center" style="text-align: center; display: block; margin: 0 auto 20px;" /></p>

<p>Here are some friends of mine influencing my search results for very generic search terms, for which they would generally not rank on the first page of results:</p>

<p><a href="http://www.kazabyte.com">Wayne Yamamoto</a> for the search term "social proof". At the time I took the screenshot, Wayne had not shared this via Google+, but he could still pick up the last result in my SERP.</p>

<p><img alt="wayne_yamamoto_social_proof.png" src="http://www.mischievous.org/images/wayne_yamamoto_social_proof.png" width="544" height="103" class="mt-image-center" style="text-align: center; display: block; margin: 0 auto 20px;" /></p>

<p><a href="http://www.siliconvalleybachelor.com">Kevin Leu</a> for the search term "Silicon Valley". Kevin usually shares everything on Google+ and is able to pick up two results on the front page for Silicon Valley when I am logged in to search.</p>

<p><img alt="kevin_leu_silicon_valley.png" src="http://www.mischievous.org/images/kevin_leu_silicon_valley.png" width="527" height="111" class="mt-image-center" style="text-align: center; display: block; margin: 0 auto 20px;" /></p>

<p>If I am in your circle and you repeat these searches, chances are my friends can influence your search results.</p>

<p>Invest in your Google+ profile; it's like a Facebook feed in every Google search.</p>
]]>
        

    </content>
</entry>

<entry>
    <title>Questions of Fairness in Silicon Valley Apply To Acquisitions as Well as Financing</title>
    <link rel="alternate" type="text/html" href="http://www.mischievous.org/2011/10/questions-of-fairness-in-silic.html" />
    <id>tag:www.mischievous.org,2011://1.14</id>

    <published>2011-10-04T04:02:56Z</published>
    <updated>2011-10-26T23:01:56Z</updated>

    <summary>Recently there have been a few articles published on the structure of Airbnb&apos;s latest financing round in TechCrunch, Kara Swisher as well as the excellent blog post by Felix Salmon, the premises of the articles focused on the fairness play...</summary>
    <author>
        <name>Jason Culverhouse</name>
        <uri>http://www.mischievous.org/jason/</uri>
    </author>
    
        <category term="venture capital" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="airbnb" label="Airbnb" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="bentsmithiv" label="Ben T Smith IV" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="merchantcircle" label="MerchantCircle" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="deals" label="deals" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="venturecapital" label="venture capital" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="venturecapitalist" label="venture capitalist" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en-US" xml:base="http://www.mischievous.org/">
        <![CDATA[<p>Recently there have been a few articles published on the structure of Airbnb's latest financing round: in <a href="http://techcrunch.com/2011/10/01/chamath-palihapitiya-airbnb-liquidity-everyone/">TechCrunch</a>, by <a href="http://allthingsd.com/20111001/vcs-unite-chamath-palihapitiya-decries-airbnbs-recent-112m-funding-for-excessive-founder-control-and-cashout-in-email/">Kara Swisher</a>, and in the excellent blog post by <a href="http://blogs.reuters.com/felix-salmon/2011/10/02/why-dividend-cash-outs-are-evil/">Felix Salmon</a>.  The premise of the articles focused on the fairness play between insiders who are both option holders and shareholders.  The one thing that really struck me was when Chamath Palihapitiya said:</p>

<blockquote>
  <p>In contrast, if you are viewed as self-dealing and shady, it will only hurt your long term prospects</p>
</blockquote>

<p>I see aspects of "self-dealing", the conduct of a corporate officer that consists of taking advantage of his position in a transaction and acting for his own interests rather than for the interests of the corporate shareholders, as a recent phenomenon in Silicon Valley.  It's possible that this is exacerbated by the recent lack of IPO opportunities under current market conditions, leading to more "creative" forms of achieving liquidity.</p>

<p>On May 26, 2011, my former employer MerchantCircle was <a href="http://techcrunch.com/2011/05/26/reply-com-acquires-marketing-network-for-small-businesses-merchantcircle-for-60-million/">acquired by Reply.com</a> for $60 million.  At the time of the acquisition, as an early employee, I held around 8% of the unconverted exercised common shares and 1% of the fully converted shares.  Investors received a packet detailing the framework of the deal in a 1,200-page deal document (see image) and were given a week to review the documentation.  I picked up the paperwork on June 6th and then sent this letter to the board on June 9th.  There were some details of the deal that I missed, but the paperwork itself was, in my opinion, purposefully obfuscated.  I never received any reply from the Board or the Company about my concerns, even though my questions are neither specious nor rhetorical.  The thing that stood out the most in the deal was the fact that the former CEO, Ben T. Smith IV:</p>

<blockquote>
  <p>"will have the opportunity to receive in the Merger an amount of cash per share of his Company Common Stock that is substantially larger than the amount of cash per share than the other holders of Company Common Stock"</p>
</blockquote>

<p>The definition of "fair" when related to stock has taken on a whole new meaning in Silicon Valley.</p>

<p>Under the "terms of the deal", investors were allowed to sell up to 37.5% of their shares. Employees were also allowed to sell 37.5% of <strong>total options granted</strong>, as long as this was not greater than their <strong>total vested options</strong>. Employees were thus able to "lever" unvested options to sell vested options.  The structure of this deal actually <strong>punished</strong> the investors that owned their shares, since they could only sell 37.5%.  Employees potentially got to sell 100% of their vested options as long as 37.5% or less were vested.  In this case, the <strong>CEO granted himself 2,000,000 shares of common</strong> less than 6 months before the deal closed and exploited this clause to cash out as many of his shares as possible.  This resulted in a cash dilution to investors, and the transfer's major beneficiary was Ben T. Smith IV.</p>
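
<p>To make the arithmetic concrete, here is a small sketch of the rule as I have described it. The share counts are hypothetical round numbers chosen for illustration, not figures from the deal documents, and the function names are my own.</p>

```python
CAP = 0.375  # the 37.5% limit from the terms of the deal


def investor_sellable(shares_owned):
    """Investors could sell up to 37.5% of the shares they owned."""
    return CAP * shares_owned


def employee_sellable(options_granted, options_vested):
    """Employees could sell 37.5% of everything granted, capped at what had vested."""
    return min(CAP * options_granted, options_vested)


# Hypothetical holdings of 100,000 each, for comparison only.
investor = investor_sellable(100000)          # 37.5% of the holding
employee = employee_sellable(100000, 30000)   # all 30,000 vested options
```

<p>In this made-up case the investor can sell 37,500 of 100,000 owned shares, while the employee with only 30,000 vested options can sell every one of them, because the cap is computed from the full grant.</p>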

<p>A usual rule of thumb is that the more risk you take, the more reward you receive. In this case, <strong>unvested options</strong> were worth as much as a purchased share owned by a non-employee. Zero risk returned outsized reward, especially in the case of the CEO.</p>

<p>Here is my letter to the CEO and the Board in its entirety:</p>

<blockquote>
  <p>Subject:  Open Letter To the MerchantCircle Board <br />
  Date:     June 9, 2011 11:35:31 AM PDT <br />
  To: Ben Smith IV, Members of the MerchantCircle Board  </p>
  
  <p>Board Members, <br />
  Let me start by saying that I am happy the deal with Reply.com was "done".  For the employees, many of whom are my personal friends, I feel that the terms of the deal are very generous.  Almost all of the employees who have worked at the company for more than one year are able to elect to sell 100% of their vested shares at a reasonable valuation. This is an excellent outcome.</p>
  
  <p>The following facts regarding the deal give me concerns: </p>
  
  <ul>
  <li>On December 22, 2010 the company decided to grant 2,144,000 shares to Ben Smith, under the terms of the Reply deal these shares allow him to earn an additional $924,520.  </li>
  <li>This grant was so large that it exceeded the amount of ISO options that are allowed to be granted in a single year, the ISO component alone was the <strong>single largest grant</strong> in value in the history of the company.</li>
  <li>Some members of the Board, former employees, and myself invested money into the company yet we are only able to receive 37.5% of our common holding in cash.  The impact of this transaction was to <strong>create an additional class of shares</strong>, unable to take advantage of the magical December 22, 2010 grants.  The merger documents identify an additional class of shares as the Series C-3 Preferred.</li>
  <li>The value of former non-founding employees' shares is almost equal the $920,000 that was transferred by the December 22, 2010 grant.  Our common share holder value has been leveraged to pay for that grant.</li>
  <li>The <strong>merger documents themselves</strong> identify Ben Smith as receiving a <strong>"Golden Parachute"</strong> and inform me that "there is a presumption that the options granted ... were granted in contemplation of the change of control" .</li>
  </ul>
  
  <p>I am left with these unanswered questions:</p>
  
  <ul>
  <li>Is there any concern that the the timeline in this deal may construe "Self-dealing" by a corporate officer?</li>
  <li>When the Board approved these December 22, 2010 grants, were they aware that the amounts were structured to match the deal?  </li>
  <li>Were grants to Ben Smith abnormally favorable or a "sweetheart deal"?</li>
  <li>Do these action somehow deny Common shareholders equal status?</li>
  <li>Why would the investors leave all their money "on the table"? </li>
  </ul>
  
  <p>The Valley is run on reputation.  Do we all suffered a loss of reputation by our association with Ben Smith and the terms of this deal in regards to former employee Common shareholders?</p>
  
  <p>I hope that we can work together to resolve these questions and concerns, feel free to contact me.</p>
  
  <p>Sincerely,</p>
  
  <p>Jason Culverhouse</p>
</blockquote>

<p><img src="http://www.mischievous.org/big_deal.jpg" alt="1,200 Pages Of Deal" title="1,200 Pages Of Deal" /></p>
]]>
        

    </content>
</entry>

<entry>
    <title>Removing A Django Application Completely with South</title>
    <link rel="alternate" type="text/html" href="http://www.mischievous.org/2011/03/removing-a-django-application.html" />
    <id>tag:www.mischievous.org,2011://1.13</id>

    <published>2011-03-08T19:53:14Z</published>
    <updated>2011-03-08T19:58:13Z</updated>

    <summary>Removing A Django Application Completely with South Let&apos;s pretend that the application that you want to remove contains the following model in myapp/models.py: class SomeModel(models.Model): data = models.TextField() Create the initial migration and apply it to the database: ./manage.py schemamigration --initial...</summary>
    <author>
        <name>Jason Culverhouse</name>
        <uri>http://www.mischievous.org/jason/</uri>
    </author>
    
        <category term="python" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="django" label="django" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="python" label="python" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="south" label="south" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en-US" xml:base="http://www.mischievous.org/">
        <![CDATA[<p>Removing A Django Application Completely with South</p>

<p>Let's pretend that the application that you want to remove contains the following model in myapp/models.py:</p>

<pre class="brush: python">
    from django.db import models

    class SomeModel(models.Model):
        data = models.TextField()
</pre>

<p>Create the initial migration and apply it to the database:</p>

<pre class="brush: python">
    ./manage.py schemamigration --initial myapp
    ./manage.py migrate
</pre>

<p>To remove the models, edit myapp/models.py and delete all the model definitions.</p>

<p>Create the deleting migration:</p>

<pre class="brush: python">
    ./manage.py schemamigration myapp
</pre>

<p>Edit myapp/migrations/0002_auto__del_somemodel.py to remove the related content types:</p>

<pre class="brush: python">
    from django.contrib.contenttypes.models import ContentType
    ...
    def forwards(self, orm):

        # Deleting model 'SomeModel'
        db.delete_table('myapp_somemodel')
        for content_type in ContentType.objects.filter(app_label='myapp'):
            content_type.delete()
</pre>

<p>Migrate the app to remove the table, then fake a zero migration to clean out the South tables:</p>

<pre class="brush: python">
    ./manage.py migrate
    ./manage.py migrate myapp zero --fake
</pre>

<p>Remove the app from INSTALLED_APPS in your settings.py and it should now be fully gone.</p>
]]>
        

    </content>
</entry>

<entry>
    <title>Generating a static sitemap with django.contrib.sitemaps</title>
    <link rel="alternate" type="text/html" href="http://www.mischievous.org/2010/11/generate-sitemap-django.html" />
    <id>tag:www.mischievous.org,2010://1.12</id>

    <published>2010-11-12T00:24:17Z</published>
    <updated>2010-11-12T00:39:07Z</updated>

    <summary><![CDATA[Django has a built in sitemap generation framework that uses views to build a sitemap on the fly. Sometimes your dataset is too large for this to work in a web application. &nbsp;Here is a management command that will generate...]]></summary>
    <author>
        <name>Jason Culverhouse</name>
        <uri>http://www.mischievous.org/jason/</uri>
    </author>
    
        <category term="python" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="django" label="django" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="python" label="python" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="sitemap" label="sitemap" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en-US" xml:base="http://www.mischievous.org/">
        <![CDATA[<p>Django has a built-in sitemap generation framework that uses views to build a sitemap on the fly.  Sometimes your dataset is too large for this to work in a web application.  Here is a management command that will generate a static sitemap and index for your models.  You can extend it to handle multiple models.</p>

<pre class="brush: python">
import os.path 
from django.core.management.base import BaseCommand, CommandError
from django.contrib.sitemaps import GenericSitemap
from django.contrib.sites.models import Site
from django.template import loader 
from django.utils.encoding import smart_str

from myproject.models import MyModel


class Command(BaseCommand):
    help = """Generates the sitemaps for the site, pass in an output directory
    """

    def handle(self, *args, **options):
        if len(args) != 1:
            raise CommandError('You need to specify an output directory')
        directory = args[0]
        if not os.path.isdir(directory):
            raise CommandError('directory %s does not exist' % directory)
        #modify to meet your needs
        sitemap = GenericSitemap({'queryset': MyModel.objects.order_by('id'), 'date_field':'modified' })
        current_site = Site.objects.get_current()

        index_files = []
        paginator = sitemap.paginator
        for page_num in range(1, paginator.num_pages+1):
            filename = 'sitemap_%s.xml' % page_num
            file_path = os.path.join(directory,filename)
            index_files.append("http://%s/%s" % (current_site.domain, filename))
            print("Generating sitemap %s" % file_path)
            with open(file_path, 'w') as site_mapfile:
                site_mapfile.write(smart_str(loader.render_to_string('sitemap.xml', {'urlset': sitemap.get_urls(page_num)})))
        sitemap_index = os.path.join(directory,'sitemap_index.xml')
        with open(sitemap_index, 'w') as site_index:
            print("Generating sitemap_index.xml %s" % sitemap_index)
            site_index.write(loader.render_to_string('sitemap_index.xml', {'sitemaps': index_files}))
</pre>
]]>
        

    </content>
</entry>

<entry>
    <title>Experimenting with Node.js and MongoDB and Mongoose </title>
    <link rel="alternate" type="text/html" href="http://www.mischievous.org/2010/05/experimenting-with-nodejs-and.html" />
    <id>tag:www.mischievous.org,2010://1.11</id>

    <published>2010-05-25T10:12:13Z</published>
    <updated>2010-05-25T10:22:59Z</updated>

    <summary>Experimenting with Node.js and MongoDb and Mongoose I came across Mongoose for Node.js. It looks like a promising project but I ran into a bug as soon as I started playing with a simple counter program. The problem is in...</summary>
    <author>
        <name>Jason Culverhouse</name>
        <uri>http://www.mischievous.org/jason/</uri>
    </author>
    
        <category term="bugs" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="mongodb" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="node.js" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="javascript" label="javascript" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="mongodb" label="mongodb" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="mongoose" label="mongoose" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en-US" xml:base="http://www.mischievous.org/">
        <![CDATA[<p>Experimenting with Node.js and MongoDb and Mongoose</p>

<p>I came across <a href="http://www.learnboost.com/mongoose/">Mongoose</a> for <a href="http://nodejs.org/">Node.js</a>. It looks like a promising project, but I ran into a bug as soon as I started playing with a simple counter program.  The problem is in the implementation of QueryPromise's atomic functions.  Here is a sample program that updates a counter. The three update forms below should all be equivalent, but only the first seems to work with the <a href="http://github.com/LearnBoost/mongoose/commit/cb4e171ab8aca56ca1e74f1e7cbba0e77a2eaa0e">version</a> I was playing with.</p>

<pre class="brush: python">
// Simple test program to show a problem in QueryPromise
// ['inc','set','unset','push','pushAll','addToSet','pop','pull','pullAll']

var sys = require('sys')
var mongoose = require('mongoose/').Mongoose
var db = mongoose.connect('mongodb://localhost/test');

var Simple = mongoose.noSchema('test',db);
Simple.drop(); 
//should only be one....
var m = new Simple({name:'test', x:0,y:0}).save()
// these should behave the same
Simple.update({name:'test'},{'$inc':{x:1, y:1}}).execute();
Simple.update({name:'test'}).inc({x:1, y:1}).execute();
Simple.update({name:'test'}).inc({x:1}).inc({y:1}).execute();

Simple.find({name:'test'}).each(
     function (doc) {
         sys.puts(JSON.stringify(doc));
     }
).then(
    function(){ // promise (execute after query)
        Simple.close(); // close event loop
    }
);
</pre>

<p>Here is a fixed version of QueryPromise's atomic functions that places the command and arguments in the correct place.</p>

<pre class="brush: python">
// atomic similar

  ['inc','set','unset','push','pushAll',
  'addToSet','pop','pull','pullAll'].forEach(function(cmd){
      QueryPromise.prototype[cmd] = function(modifier){
        if(this.op.name.charAt(0) != 'u') return this;
        if(!this.op.args.length) this.op.args.push({},{});
        if(this.op.args.length == 1) this.op.args.push({});
        for(var i in modifier) {
          if(!(this.op.args[1]['$'+cmd] instanceof Object)) this.op.args[1]['$'+cmd] = {};
          this.op.args[1]['$'+cmd][i] = modifier[i];
        }
        return this;
      }
  });
</pre>
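
<p>The intent of the fix is that chained modifier calls accumulate into one update document instead of clobbering each other. Here is a rough Python sketch of that merging idea; <code>UpdateBuilder</code> is a hypothetical stand-in written for this illustration, not part of Mongoose or any MongoDB driver.</p>

```python
class UpdateBuilder(object):
    """Hypothetical stand-in for QueryPromise: args = [query, update document]."""

    def __init__(self, query):
        self.args = [query, {}]

    def _modifier(self, cmd, fields):
        # Merge into any existing '$cmd' sub-document rather than replacing it,
        # so repeated calls add fields to the same atomic operation.
        self.args[1].setdefault('$' + cmd, {}).update(fields)
        return self

    def inc(self, fields):
        return self._modifier('inc', fields)

    def set(self, fields):
        return self._modifier('set', fields)


chained = UpdateBuilder({'name': 'test'}).inc({'x': 1}).inc({'y': 1})
single = UpdateBuilder({'name': 'test'}).inc({'x': 1, 'y': 1})
```

<p>Both <code>chained.args</code> and <code>single.args</code> end up with the same update document, <code>{'$inc': {'x': 1, 'y': 1}}</code>, which is the behavior the three update forms above are expected to share.</p>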
]]>
        

    </content>
</entry>

<entry>
    <title>Parsing City/States From User Input With Python NGram</title>
    <link rel="alternate" type="text/html" href="http://www.mischievous.org/2010/04/parsing-citystates-from-user-i.html" />
    <id>tag:www.mischievous.org,2010://1.10</id>

    <published>2010-04-27T05:38:20Z</published>
    <updated>2010-04-27T05:44:08Z</updated>

    <summary>A friend just asked how to do city/state lookup on input strings. I&apos;ve used metaphones and Levenshtein distance in the past but that seems like overkill. Using an n-gram is a nice and easy solution easy_install ngram build file...</summary>
    <author>
        <name>Jason Culverhouse</name>
        <uri>http://www.mischievous.org/jason/</uri>
    </author>
    
        <category term="python" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="ngram" label="ngram" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="python" label="python" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en-US" xml:base="http://www.mischievous.org/">
        <![CDATA[<p>A <a href="http://www.charityblossom.org">friend</a> just asked how to do city/state lookup on input strings. I've used <a href="http://en.wikipedia.org/wiki/Metaphone">metaphones</a> and <a href="http://en.wikipedia.org/wiki/Levenshtein_distance">Levenshtein distance</a> in the past, but that seems like overkill.  Using an <a href="http://en.wikipedia.org/wiki/N-gram">n-gram</a> is a nice and easy solution:</p>

<ol>
<li><p>easy_install <a href="http://pypi.python.org/pypi/ngram/">ngram</a></p></li>
<li><p>Build a file with all the city and state names, one per line, and save it as citystate.data (for example: Redwood City, CA; Redwood, VA; etc.)</p></li>
<li><p>Experiment (the .2 threshold is a little lax)</p></li>
</ol>

<pre class="brush: python">
import string
import ngram
cityStateParser = ngram.NGram(
  items = (line.strip() for line in open('citystate.data')) ,
  N=3, iconv=string.lower, qconv=string.lower,  threshold=.2
)
</pre>

<p>Example:</p>

<pre class="brush: python">
cityStateParser.search('redwood')
[('Redwood VA', 0.5),
('Redwood NY', 0.5),
('Redwood MS', 0.5),
('Redwood City CA', 0.36842105263157893),
...
]
</pre>
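
<p>To see where scores like these come from, here is a simplified pure-Python sketch of padded trigram matching. It is my own approximation, not the ngram package's code: it assumes lowercasing, two characters of <code>$</code> padding on each side, and a plain shared-over-total ratio on sets of trigrams. The function names are made up for the illustration.</p>

```python
def trigrams(text, n=3, pad='$'):
    """Return the set of n-grams of the lowercased, padded text."""
    padded = pad * (n - 1) + text.lower() + pad * (n - 1)
    return set(padded[i:i + n] for i in range(len(padded) - n + 1))


def similarity(a, b):
    """Shared n-grams divided by all distinct n-grams in either string."""
    grams_a, grams_b = trigrams(a), trigrams(b)
    return len(grams_a & grams_b) / float(len(grams_a | grams_b))


def search(query, items, threshold=0.2):
    """Score every item against the query and keep those at or above the threshold."""
    scored = [(item, similarity(query, item)) for item in items]
    kept = [pair for pair in scored if pair[1] >= threshold]
    return sorted(kept, key=lambda pair: -pair[1])


results = search('redwood', ['Redwood City CA', 'Redwood VA', 'Austin TX'])
```

<p>Under these assumptions "redwood" shares 7 trigrams with "Redwood VA" out of 14 distinct ones, giving 0.5, and 7 out of 19 with "Redwood City CA", giving roughly 0.368, which lines up with the output above.</p>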

<p>Notes: Because these are n-grams, you might get overmatch when the state is part of an n-gram in the city, i.e. a search for "washington" would yield "Washington IN" with a better score than "Washington OK".</p>

<p>You might also want to read <a href="http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.21.344&amp;rep=rep1&amp;type=pdf">Using Superimposed Coding Of N-Gram Lists For Efficient Inexact Matching</a> (PDF Download)</p>

<p>If this works for you, consider giving me a vote on <a href="http://stackoverflow.com/questions/2054422/get-city-state-or-zip-from-a-string-in-python/2718896#2718896">StackOverflow.com</a></p>
]]>
        

    </content>
</entry>

<entry>
    <title>Upgrading from movabletype 4 to 5 with a SQLite database</title>
    <link rel="alternate" type="text/html" href="http://www.mischievous.org/2010/01/upgrading-from-movabletype-4-t.html" />
    <id>tag:www.mischievous.org,2010://1.9</id>

    <published>2010-01-06T18:39:35Z</published>
    <updated>2010-01-30T18:41:57Z</updated>

    <summary>I was concerned to see that SQLite was deprecated in movabletype 5.0, but I went ahead and upgraded my blog. I followed the standard procedure, copy the new version over the old version then run the mt-upgrade.cgi via the browser....</summary>
    <author>
        <name>Jason Culverhouse</name>
        <uri>http://www.mischievous.org/jason/</uri>
    </author>
    
    <category term="moveabletype" label="moveabletype" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="sqllite" label="sqllite" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="upgrade" label="upgrade" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en-US" xml:base="http://www.mischievous.org/">
        <![CDATA[<span class="Apple-style-span" style="font-family: 'trebuchet ms', helvetica, hirakakupro-w3, osaka, 'ms pgothic', sans-serif; "><p style="margin-top: 0px; margin-right: 0px; margin-bottom: 0.75em; margin-left: 0px; padding-top: 0px; padding-right: 0px; padding-bottom: 0px; padding-left: 0px; ">I was concerned to see that SQLite was deprecated in Movable Type 5.0, but I went ahead and upgraded my blog. I followed the standard procedure: copy the new version over the old version, then run mt-upgrade.cgi via the browser. The upgrade script never made it to migrating the database. When this happened, I just used the "Upgrade a large database" instructions:</p><pre style="margin-top: 0px; margin-right: 0px; margin-bottom: 0.75em; margin-left: 0px; padding-top: 0px; padding-right: 0px; padding-bottom: 0px; padding-left: 0px; "><code style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; padding-top: 0px; padding-right: 0px; padding-bottom: 0px; padding-left: 0px; ">$ export MT_HOME=/var/local/mv
$ cd $MT_HOME
$ perl  ./tools/upgrade --name superuser
upgrade -- A command line tool for upgrading the schema for Movable Type.
    * Upgrading database from version 4.0070.
    * Upgrading table for Website records...
    * Upgrading table for MT::Entry::Summary records...
    * Upgrading table for entry_rev records...
    * Upgrading table for Entry records...
    * Upgrading table for Asset Placement records...
    * Upgrading table for Session records...
    * Upgrading table for MT::Author::Summary records...
    * Upgrading table for User records...
    * Upgrading table for template_rev records...
    * Upgrading table for Template records...
    * Upgrading table for Permission records...
    * Upgrading table for Comment records...
    * Rebuilding permissions...
    * Rebuilding permissions... (100%)
    * Updating existing role name...
    * Populating new role for website...
    * Migrating mtview.php to MT5 style...
    * Assigning new system privilege for system administrator...
    * Assigning to  jason...
    * Updating existing role name...
    * Populating new role for theme...
    * Upgrading Asset path informations...
    * Classifying blogs...
    * Classifying blogs... (100%)
    * Merging dashboard settings...
    * Merging dashboard settings... (100%)
    * Migrating existing 1 blog into websites and its children...
    * Generated a website http://mischievous.org/
    * Moved blog Pseudointellectual Appendification (http://www.mischievous.org/) under website mischievous.org
    * Creating new template: 'Comment Listing'.
    * Database has been upgraded to version 5.0016.
Upgrade complete!</code></pre></span>]]>
        
    </content>
</entry>

<entry>
    <title>How to Backup California Mathematics Grade 4 PDFs to Paper</title>
    <link rel="alternate" type="text/html" href="http://www.mischievous.org/2009/10/how-to-backup-california-mathe.html" />
    <id>tag:www.mischievous.org,2009://1.8</id>

    <published>2009-10-13T04:03:29Z</published>
    <updated>2009-10-13T04:13:49Z</updated>

    <summary>My school sent home a set of 3 CDs with Glencoe StudentWorks software from circa 2007. This software consists of a Flash application that launches Adobe Acrobat version 7 to display the PDF content of the California Mathematics Grade...</summary>
    <author>
        <name>Jason Culverhouse</name>
        <uri>http://www.mischievous.org/jason/</uri>
    </author>
    
        <category term="education" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="backup" label="backup" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="password" label="password" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="pdf" label="pdf" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="print" label="print" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="protected" label="protected" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en-US" xml:base="http://www.mischievous.org/">
        <![CDATA[<p>My school sent home a set of 3 CDs with Glencoe StudentWorks software from circa 2007.
This software consists of a Flash application that launches Adobe Acrobat version 7 to display
the PDF content of the California Mathematics Grade 4 workbooks.  Children are 
expected to complete homework assignments using this software.  My primary operating system
is Mac OS X v10.6 Snow Leopard; the application "Student Works OSX" is a PowerPC application 
and would require Rosetta to be installed to run.</p>

<p>Thankfully, the PDFs are on the CD 
and you do not need to run the application to see the content. Just navigate to 
<strong>/Volumes/CA Math 4/support/PDF/docs</strong> and you will find the PDF for each individual chapter. <br />
On Mac or Windows, there is no need to install or run any application from the disc. If you
are running Windows, download the latest Acrobat; don't install the antiquated version on the disc.</p>

<p>Printing is another matter: you can't.  The chapters are password-protected against printing. I'm not
a lawyer, but I read the standard shrink-wrap licensing agreement that came with the software
and here is what I found:</p>

<blockquote>
  <p>COPIES. Copies can be made only as authorized above
  in machine readable form. Print copies of Software code
  are not authorized. All copyright and trademark notices
  must remain on all copies. All copies must be faithful
  reproductions. You are solely responsible for the content,
  quality and operation of all Software copies. Certain
  Software programs may be "copy protected" by special
  encryption coding that prevents copying or 
  printing-out content</p>
</blockquote>

<p>And</p>

<blockquote>
  <p>You may also make one (1) back- up copy of the Software
  for archival purpose</p>
</blockquote>

<p>Great, I chose to back up the PDF portions of the "software" by printing them on
paper.  Here is my "backup" program; you just need Ghostscript from MacPorts.</p>

<pre class="brush: python">

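    #!/bin/bash -x
    # Rewrite each PDF with Ghostscript's pdfwrite device, print the copy, then delete it.
    # Run from a scratch directory: the copy is written to the current directory.
    for i in "$@"; do
        NAME=$(basename "$i")
        gs -q -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile="$NAME" -c .setpdfwrite -f "$i"
        # lp submits the file to the print queue (lpd is the print daemon, not a client)
        lp -d "SCX_4500" "$NAME"
        rm "$NAME"
    done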

</pre>
]]>
        

    </content>
</entry>

<entry>
    <title>How To Use Your Redwood City Public Library Card to Access Safari Books Online</title>
    <link rel="alternate" type="text/html" href="http://www.mischievous.org/2009/08/how-to-use-your-redwood-city-p.html" />
    <id>tag:www.mischievous.org,2009://1.7</id>

    <published>2009-08-07T17:24:52Z</published>
    <updated>2009-08-07T17:41:38Z</updated>

    <summary>If you live in San Mateo County you can use your Peninsula Library System Card to access Safari Books Online. If you manage to find the link on the plsinfo.org website you will find that the proxy URL is incorrect. Here...</summary>
    <author>
        <name>Jason Culverhouse</name>
        <uri>http://www.mischievous.org/jason/</uri>
    </author>
    
        <category term="library" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="library" label="Library" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="safari" label="Safari" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en-US" xml:base="http://www.mischievous.org/">
        <![CDATA[<p>If you live in San Mateo County you can use your Peninsula Library System Card to access Safari Books Online.</p>

<p>If you manage to find the link on the plsinfo.org website, you will find that the proxy URL is incorrect. <br />
Here is the correct link: <a href="http://ezproxy.plsinfo.org:2048/login?url=http://proquest.safaribooksonline.com">Safari Books Online</a>. You will need your library card.</p>

<p>Read away.</p>

<p>You can also see a list of all the <a href="http://ezproxy.plsinfo.org:2048/menu">resources you can access via your library card</a>.</p>
]]>
        

    </content>
</entry>

<entry>
    <title>SocialCurrent, Become a Cohesive Force for Change</title>
    <link rel="alternate" type="text/html" href="http://www.mischievous.org/2009/07/socialcurrent-become-a-cohesiv.html" />
    <id>tag:www.mischievous.org,2009://1.6</id>

    <published>2009-07-27T22:42:01Z</published>
    <updated>2009-07-27T22:48:00Z</updated>

    <summary>A friend just launched his website, SocialCurrent. The idea brings the Tao Te Ching to mind. All things spring up and there is not one which declines to show itself. They grow and there is no claim made for their...</summary>
    <author>
        <name>Jason Culverhouse</name>
        <uri>http://www.mischievous.org/jason/</uri>
    </author>
    
        <category term="social responsibility" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="change" label="change" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="socialcurrent" label="socialcurrent" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en-US" xml:base="http://www.mischievous.org/">
        <![CDATA[<p>A friend just launched his website, <a href="http://www.socialcurrent.org">SocialCurrent</a>.  The idea brings the Tao Te Ching to mind. </p>

<p>All things spring up and there is not one which declines to show itself. <br />
They grow and there is no claim made for their ownership. <br />
They go through their processes and there is no expectation of a reward for their results. <br />
The work is accomplished, and there is no resting in it as an achievement. <br />
The work is done, but how no one can see. 'Tis this that makes the power not cease to be.  </p>

<p>Every day we do things in our daily lives that we perceive as socially responsible, yet our actions are sometimes invisible. These invisible things add up, and the actions of one person can truly make a difference.  </p>
]]>
        

    </content>
</entry>

<entry>
    <title>mdworker Unable to use font: no glyphs present</title>
    <link rel="alternate" type="text/html" href="http://www.mischievous.org/2009/06/mdworker-unable-to-use-font-no.html" />
    <id>tag:www.mischievous.org,2009://1.4</id>

    <published>2009-06-04T21:08:24Z</published>
    <updated>2009-06-04T21:15:33Z</updated>

    <summary>Running Apple Mac OSX, and your system.log is filling up with mdworker[473]: Unable to use font: no glyphs present. /System/Library/Frameworks/ApplicationServices.framework /Frameworks/ATS.framework/Support/ATSServer[474]: Serious problems were found in font data while activating it. /System/Library/Frameworks/ApplicationServices.framework /Frameworks/ATS.framework/Support/ATSServer[474]: You may encounter drawing or printing problems....</summary>
    <author>
        <name>Jason Culverhouse</name>
        <uri>http://www.mischievous.org/jason/</uri>
    </author>
    
        <category term="bugs" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="apple" label="apple" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="hadoop" label="hadoop" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="mdworker" label="mdworker" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="osx" label="osx" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en-US" xml:base="http://www.mischievous.org/">
        <![CDATA[<p>Running Mac OS X, and your system.log is filling up with messages like these?</p>

<p>mdworker[473]: Unable to use font: no glyphs present.</p>

<p>/System/Library/Frameworks/ApplicationServices.framework /Frameworks/ATS.framework/Support/ATSServer[474]: Serious problems were found in font data while activating it.</p>

<p>/System/Library/Frameworks/ApplicationServices.framework /Frameworks/ATS.framework/Support/ATSServer[474]: You may encounter drawing or printing problems.</p>

<p>Well, it could be Spotlight trying to index a bad PDF file.  To find the offending file, run lsof with the process id of the mdworker process:</p>

<pre>
  lsof -p 473
</pre>

<p>In my case it was a PDF from the hadoop 20.0 release</p>
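<p>If the lsof listing is long, a small filter helps. What follows is only a sketch in Python 3, not something I ran at the time: it asks lsof for field output with <code>-Fn</code>, where every file name sits on a line prefixed with <code>n</code>, and keeps the names ending in <code>.pdf</code>. The function names and the sample path are made up for illustration.</p>

```python
import subprocess


def names_from_lsof(field_output, suffix=".pdf"):
    """Pick file names with the given suffix out of `lsof -Fn` output."""
    names = []
    for line in field_output.splitlines():
        # In lsof field output, a line starting with "n" carries a file name.
        if line.startswith("n") and line.lower().endswith(suffix):
            names.append(line[1:])
    return names


def open_pdfs(pid):
    """Run lsof for one process id and return the PDFs it has open."""
    result = subprocess.run(
        ["lsof", "-p", str(pid), "-Fn"],
        capture_output=True, text=True, check=False,
    )
    return names_from_lsof(result.stdout)


# Made-up sample of field output; the path is purely illustrative.
sample = "p473\nfcwd\nn/\nf4\nn/Users/me/Downloads/broken.pdf\nf5\nn/dev/null\n"
print(names_from_lsof(sample))
```

<p>Calling <code>open_pdfs(473)</code> would run the same lsof command as above against the mdworker process and return only the PDF paths.</p>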
]]>
        

    </content>
</entry>

<entry>
    <title>Cascading and Coroutines</title>
    <link rel="alternate" type="text/html" href="http://www.mischievous.org/2009/06/cascading-and-coroutines.html" />
    <id>tag:www.mischievous.org,2009://1.3</id>

    <published>2009-06-02T03:23:37Z</published>
    <updated>2009-06-02T07:07:59Z</updated>

    <summary>Cascading looks quite interesting. Here is a Python program that does something similar to the example in the Technical Overview; see main in the program. #!/usr/bin/env python # encoding: utf-8 import sys def input(theFile, pipe): &quot;&quot;&quot; pushes a file a line at...</summary>
    <author>
        <name>Jason Culverhouse</name>
        <uri>http://www.mischievous.org/jason/</uri>
    </author>
    
        <category term="python" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="cascading" label="cascading" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="coroutine" label="coroutine" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="python" label="python" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en-US" xml:base="http://www.mischievous.org/">
        <![CDATA[<p><a href="http://www.cascading.org/">Cascading</a> looks quite interesting.   Here is a Python program that does something similar to the example in the <a href="http://www.cascading.org/documentation/overview.html">Technical Overview</a>; see <strong><code>main</code></strong> in the program.  </p>

<pre class="brush: python">
    #!/usr/bin/env python
    # encoding: utf-8
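    import sys

    def coroutine(func):
        """
        decorator that advances a new coroutine to its first yield
        """
        def start(*args, **kwargs):
            cr = func(*args, **kwargs)
            next(cr)
            return cr
        return start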

    def input(theFile, pipe):
        """
        pushes a file a line at a time to a coroutine pipe
        """
        for line in theFile:
            pipe.send(line)
        pipe.close()

    @coroutine
    def extract(expression, pipe, group = 0):
        """
        extract the group from a regex
        """
        import re
        r = re.compile(expression)
        try:
            while True:
                line = (yield)
                match = r.search(line)
                if match:
                    pipe.send(match.group(group))
        except GeneratorExit:
            # pass the close downstream so the next stage can finish
            pipe.close()

    @coroutine
    def sort(pipe):
        """
        sort the input on a pipe
        """
        import heapq
        heap = []
        try:
            while True:
                line = (yield)
                heapq.heappush(heap, line)
        except GeneratorExit:
            # input is exhausted: emit the lines in sorted order
            while heap:
                pipe.send(heapq.heappop(heap))
            pipe.close()

    @coroutine
    def group(groupPipe, pipe):
        """
        sends consecutive matching lines from pipe to groupPipe
        """
        cur = None
        g = None
        try:
            while True:
                line = (yield)
                if cur is None:
                    g = groupPipe(pipe)
                elif cur != line:
                    g.close()
                    g = groupPipe(pipe)

                g.send(line)
                cur = line
        except GeneratorExit:
            # close the last group so it reports its count
            if g is not None:
                g.close()

    @coroutine
    def uniq(pipe):
        """
        implements uniq -c
        """
        lines = 0
        try:
            while True:
                line = (yield)
                lines += 1
        except GeneratorExit:
            pipe.send('%s\t%s' % (lines, line))

    @coroutine
    def output(theFile):
        while True:
            line = (yield)
            theFile.write(line + '\n')

    def main():
        input(sys.stdin,
            extract( r'^([^ ]+)',
                sort(
                    group( uniq,
                        output(sys.stdout)
                    )
                )
            )
        )

    if __name__ == '__main__':
        main()
</pre>

<p>You can achieve the same results with the unix command line:</p>

<pre><code>cat  access.log | cut -d ' ' -f 1 | sort | uniq -c
</code></pre>
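<p>For comparison, the same count without any coroutines. This is a rough Python 3 sketch using <code>collections.Counter</code>; the function name and the sample log lines are made up, and Python's string ordering may differ from what <code>sort</code> gives you under your locale.</p>

```python
from collections import Counter


def count_first_field(lines):
    """Count lines by their first space-separated field, ordered by field."""
    counts = Counter(line.rstrip("\n").split(" ", 1)[0] for line in lines)
    return [(counts[key], key) for key in sorted(counts)]


# Made-up log lines for illustration.
sample = [
    "10.0.0.2 - - GET /\n",
    "10.0.0.1 - - GET /a\n",
    "10.0.0.2 - - GET /b\n",
]
for count, key in count_first_field(sample):
    print("%d\t%s" % (count, key))
```

<p>The difference is that this version holds every key in memory at once, while the coroutine pipeline only buffers inside <code>sort</code>.</p>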
]]>
        

    </content>
</entry>

<entry>
    <title>Python Coroutines and Twitter</title>
    <link rel="alternate" type="text/html" href="http://www.mischievous.org/2009/06/python-coroutines-and-twitter.html" />
    <id>tag:www.mischievous.org,2009://1.2</id>

    <published>2009-06-01T22:54:25Z</published>
    <updated>2009-06-02T07:41:11Z</updated>

    <summary>I was reading http://www.dabeaz.com/coroutines/ and thought this was a natural fit for a Twitter client. Here is a pretty simple version that just prints the public timeline every 60 seconds. Next up: removing the time.sleep and scheduling the followStatus function as a task...</summary>
    <author>
        <name>Jason Culverhouse</name>
        <uri>http://www.mischievous.org/jason/</uri>
    </author>
    
        <category term="python" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="coroutine" label="coroutine" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="python" label="python" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="twitter" label="twitter" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en-US" xml:base="http://www.mischievous.org/">
        <![CDATA[<p>I was reading <a href="http://www.dabeaz.com/coroutines/">http://www.dabeaz.com/coroutines/</a> and thought this was a natural fit for a Twitter client. Here is a pretty simple version that just prints the public timeline every 60 seconds.  Next up: removing the time.sleep and scheduling the followStatus function as a task so I can follow more than one stream at a time.</p>

<pre class="brush: python">
    #!/usr/bin/env python
    # encoding: utf-8
    import time
    import twitter

    def coroutine(func):
        """
        A decorator function that takes care of starting a coroutine
        automatically on call.

        see: http://www.dabeaz.com/coroutines/
        """
        def start(*args,**kwargs):
            cr = func(*args,**kwargs)
            cr.next()
            return cr
        return start

    @coroutine
    def statusPrinter():
        """
        Just prints twitter status messages to the screen
        """
        while True:
             status = (yield)
             print status.id, status.user.name, status.text

    def followStatus(twitterGetter, target, timeout = 60):
        """
        Follows a twitter status message that takes a since_id
        """
        since_id = None
        while True:
            statuses = twitterGetter(since_id=since_id)
            if statuses:
                # pretty sure these are always in order
                since_id = statuses[0].id
                for status in statuses:
                    target.send(status)
            # twitter caches for 60 seconds anyway
            time.sleep(timeout)

    def main():
        api = twitter.Api()
        followStatus(api.GetPublicTimeline, statusPrinter())

    if __name__ == '__main__':
        main()
 </pre>   
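<p>One possible shape for the "scheduling as a task" idea, as a rough Python 3 sketch rather than a finished design: each follower becomes a generator that yields how long it wants to wait instead of calling time.sleep, and a tiny scheduler resumes whichever task is due next. Everything here (<code>follow</code>, <code>run</code>, the fake getters and the fake clock) is hypothetical and does not touch the twitter library.</p>

```python
import heapq
import itertools
import time
from collections import namedtuple


def follow(getter, target, timeout=60):
    """Generator task: poll getter, pass statuses to target, yield the wait."""
    since_id = None
    while True:
        statuses = getter(since_id=since_id)
        if statuses:
            since_id = statuses[0].id  # assumes the newest status comes first
            for status in statuses:
                target(status)
        yield timeout  # hand control back instead of calling time.sleep


def run(tasks, steps, clock=time.monotonic, sleep=time.sleep):
    """Resume whichever task is due next, at most `steps` times."""
    order = itertools.count()  # tie-breaker so the heap never compares generators
    queue = [(clock(), next(order), task) for task in tasks]
    heapq.heapify(queue)
    for _ in range(steps):
        if not queue:
            break
        due, _tie, task = heapq.heappop(queue)
        sleep(max(0.0, due - clock()))
        try:
            wait = next(task)
        except StopIteration:
            continue  # a finished task is dropped
        heapq.heappush(queue, (clock() + wait, next(order), task))


# Demo with stand-in getters and a fake clock, so nothing really sleeps.
Status = namedtuple("Status", "id source")


def fake_getter(source):
    ids = itertools.count(1)

    def getter(since_id=None):
        return [Status(next(ids), source)]
    return getter


now = [0.0]
seen = []
run(
    [follow(fake_getter("a"), seen.append, 60),
     follow(fake_getter("b"), seen.append, 30)],
    steps=4,
    clock=lambda: now[0],
    sleep=lambda seconds: now.__setitem__(0, now[0] + seconds),
)
print([(s.source, s.id) for s in seen])
```

<p>To plug in the coroutine from the listing above, the target would be <code>statusPrinter().send</code>. The scheduler still blocks while a getter is running, so this only interleaves the waiting, not the network calls.</p>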
]]>
        

    </content>
</entry>

</feed>

