Circus, Nginx and Websockets

Looking for a high-performance, powerful process manager for a Python project I’m working on, I stumbled on Circus via this excellent benchmark blog post. After running my own benchmark tests, I agree that the Circus + Chaussette + Meinheld stack is the way to go. High concurrency, fast response times, and socket support are the features that pulled me in. I’m switching off of Supervisord because: a) Circus integrates directly with ZeroMQ, and b) Gunicorn + Supervisord requires two levels of process management: Supervisor controls Gunicorn, and Gunicorn in turn watches its own worker processes. Circus keeps everything on the same level.

circus.readthedocs.org has excellent documentation for getting started with Circus. I did run into a couple of caveats, though:

Circus doesn’t start on boot

I started testing Circus on the command line by running circusd circus.ini. I quickly switched to running it as a service under Upstart, using this /etc/init/circus.conf file:

start on filesystem and net-device-up IFACE=lo
exec /usr/local/bin/circusd /etc/circus.ini

This script just waits for the file system and networking to become available, then it runs circusd with my config file in /etc/circus.ini. Easy enough.
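If you want Upstart to restart circusd automatically when it dies, you can extend the same file with respawn stanzas. This is a sketch; tune the respawn limit to your environment:

```
start on filesystem and net-device-up IFACE=lo
stop on shutdown

# restart circusd if it crashes, but give up after
# 5 respawns within 10 seconds
respawn
respawn limit 5 10

exec /usr/local/bin/circusd /etc/circus.ini
```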

Getting Circus workers to work with virtual environments

Although Circus workers have awesome properties like env, copy_env and copy_path (which all work great when running from a local folder), this falls apart when starting the daemon from Upstart. I looked at my $PATH variable in an activated virtual environment and copied it into the worker config:

env = PATH=/path/to/venv/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin,VIRTUAL_ENV=/path/to/venv
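For context, here is the shape of the watcher section that env line lives in. This is a sketch with made-up names and paths; check the Circus docs for the full option list:

```ini
[watcher:webapp]
cmd = chaussette --fd $(circus.sockets.webapp) --backend meinheld webapp.app
use_sockets = True
numprocesses = 3
env = PATH=/path/to/venv/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin,VIRTUAL_ENV=/path/to/venv

[socket:webapp]
host = 127.0.0.1
port = 8000
```

Pointing cmd at the Chaussette binary inside the virtualenv (or putting the venv bin directory first in PATH, as above) is what makes the workers pick up the right Python environment.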

Circus Web Console behind Nginx with Sockets

Circus has a sweet web console to manage processes and workers. By default, it runs on port 8080 and uses websockets to push stats on CPU, memory and socket reads for each running process. The web console should never be publicly available: it allows arbitrary commands to be executed on the server. The preferred way to password-protect the console is to put it behind Nginx like so:

server {
    listen 8001;

    location ~ ^/media/(.+\.(?:jpg|css|js))$ {
        alias /usr/local/lib/python2.7/dist-packages/circus/web/$1;
    }

    location / {
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header Host $http_host;
        proxy_redirect off;
        proxy_pass http://127.0.0.1:8080;
        auth_basic            "Restricted";
        auth_basic_user_file  /etc/nginx/htpasswd;
    }
}

I have the web console process listening on port 8080: it’s serving both the website and the socket connection. Notice the issue? Nginx doesn’t support websockets! So I’m running Nginx on port 8001 and the web console process on 8080. And this is where Varnish comes in: Varnish is a caching proxy, but I’ll just use it to multiplex port 8002 between two separate backends. If the connection is a websocket, route it directly to 8080; if it’s the website, switch the backend to Nginx on port 8001:

backend default {
    .host = "127.0.0.1";
    .port = "8001";
}

backend socket {
    .host = "127.0.0.1";
    .port = "8080";
    .connect_timeout = 1s;
    .first_byte_timeout = 2s;
    .between_bytes_timeout = 60s;
}

sub vcl_pipe {
    if (req.http.upgrade) {
        set bereq.http.upgrade = req.http.upgrade;
    }
}

sub vcl_recv {
    if (req.http.Upgrade ~ "(?i)websocket") {
        set req.backend = socket;
        return (pipe);
    }
}

FontPrep: The Missing CSS3 Font Generator for Mac OS X

A couple days ago, if you asked me how to add webfonts to my website, I would have immediately said Font Squirrel. One of the reasons that Font Squirrel is awesome is their @font-face generator. It is the most popular solution for generating webfonts and font CSS online.

That’s all about to change. See, there are several limitations to using an online generator for fonts. After uploading two or three fonts, it’s obvious that Font Squirrel would not be able to handle hundreds of font files. A web browser simply isn’t built to support that. Previewing generated fonts in the browser on the fly is also impossible. There’s also the licensing to worry about: Font Squirrel only accepts fonts that are “legally eligible for web embedding.”

FontPrep solves all of this. FontPrep is a native Mac OS X app that prepares your fonts for the web. At its core, it generates webfont bundles with full browser compatibility. After dragging and dropping a TTF font onto the app, the whole bundle can be downloaded, including WOFF, SVG and EOT formats, and the snippets of CSS needed to add these fonts to a website. And that’s just scratching the surface of what the app can do:

FontPrep can spin up a local server running a font testbed in the browser. After adding fonts to FontPrep, the entire webfont can be previewed in the browser: every glyph in the typeface is rendered, along with a waterfall of font sizes and a text area for testing the font with custom text. Font size is easily changed as well.
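The CSS in a bundle like this typically follows the classic “bulletproof” @font-face pattern; the family name and file paths below are placeholders, not FontPrep’s exact output:

```css
@font-face {
  font-family: 'MyWebFont';
  src: url('mywebfont.eot');                                    /* IE9 */
  src: url('mywebfont.eot?#iefix') format('embedded-opentype'), /* IE6-8 */
       url('mywebfont.woff') format('woff'),                    /* modern browsers */
       url('mywebfont.ttf') format('truetype'),                 /* Safari, Android, iOS */
       url('mywebfont.svg#MyWebFont') format('svg');            /* legacy iOS */
}
```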

According to the creators, Brian and Matthew Gonzalez, FontPrep was created with the goal of making an “easy to use, drag & drop OSX app which makes our lives as developers much easier when it comes to working with fonts.” They succeeded: not only is FontPrep highly functional, it is also beautiful. Admittedly, their app has fewer options than Expert mode on Font Squirrel, but they are working on adding more. Meanwhile, head over to FontPrep.com and pick up the app. It’s free for the next 6 hours; after that it costs $5.

iTerm + BetterTouchTool

The best terminal emulator for Mac OS X is iTerm2. I especially love the drop-down style windows. No matter what I’m doing, there’s always a command prompt a hotkey away. And it floats over my current windows, so I can read commands from Stack Overflow while I’m typing #noob.

This is the year 2012 though, and while we don’t have jet packs yet, we should at least be able to flick things around a screen with a swipe of our fingers. Enter BetterTouchTool. This is basically an awesome app that lets you define a bunch of custom gestures for your trackpad and Magic Mouse. Quicksilver for gestures, if you will.

The magic glue for all of this is the key-binding option in iTerm. Go to iTerm Preferences > Profiles > Hotkey Window. You can choose a shortcut key, in my case ⌃⌘O, to trigger the app. Since it’s a system-wide hotkey, choose something that doesn’t conflict with apps you use. ⌃⌘Z is a really bad choice if you use Photoshop ;)

Once you have the HotKey set up in iTerm, set up a Global gesture in BetterTouchTool. I’m a fan of the three-finger swipe down, it feels natural and doesn’t conflict with Spaces for me, since I’m still rocking Snow Leopard. Obviously hook it up to the same shortcut key you used in iTerm. Now, you should be able to swipe to get a command prompt.

But how about sending it back up? It feels natural to swipe back up, but there’s no way to trigger that in iTerm. So, fellow hacker, add a second gesture, the three-finger swipe up, and bind it to ⌘+Tab. This simply switches back to the original app you were using before you triggered iTerm, and effectively sends the window swishing back up.

Make Requests-Cache Play Nicely With Heroku

The Python module that has saved me the most time is, without a doubt, the Requests module by Kenneth Reitz. This guy never fails to produce awesomeness.

The tagline is “HTTP for Humans”, and the whole thing is very pythonic. It was missing, however, an equally pythonic way to deal with caching. Fortunately, Requests-Cache fills the void. It supports several different caching backends, as well as an extensible interface for adding new ones.

SQLite is the default, and probably works well in most cases. But it fails completely when deploying to Heroku, because SQLite needs a persistent writable file system for its database file: it ultimately depends on POSIX fopen() and fwrite() calls against that file. Heroku does not provide a permanent writable file system, and its stack doesn’t even include the _sqlite3 module:

ImportError: No module named _sqlite3

Even though Requests-Cache supports in-memory caching, I had to remove a couple of parts of the module to get it to even import on Heroku. I removed the backend file requests_cache/backends/sqlite.py and deleted the few sqlite-related lines from requests_cache/backends/__init__.py. After that, it imported fine, and setting up an in-memory cache is easy:

requests_cache.configure('cache_name', 'memory')

Here’s my fork of the repo with those changes.

If you know a better solution, or have another way of caching HTTP requests easily with Python, let me know.
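One lightweight alternative that avoids patching the library entirely is to memoize GET requests in a plain dict. This is a minimal sketch (no expiry, not thread-safe); the fetch callable would be requests.get in practice:

```python
class MemoizedFetcher:
    """Cache responses per URL in memory; only hit the network once per URL."""

    def __init__(self, fetch):
        self.fetch = fetch   # e.g. requests.get
        self.cache = {}
        self.misses = 0      # track cache misses, mostly for debugging

    def get(self, url):
        if url not in self.cache:
            self.misses += 1
            self.cache[url] = self.fetch(url)
        return self.cache[url]
```

Usage: fetcher = MemoizedFetcher(requests.get), and repeated fetcher.get(url) calls hit the network only on the first request per URL.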

OAuth With Twitter and Python Flask

Here are a couple of tips on getting OAuth to work with Python Flask.

First, Flask has a bunch of awesome extensions that make coding with Flask easy. The two I am using for an OAuth login system are Flask-OAuth (obviously) and Flask-Login. The database I am using is Postgres, with the great Flask-SQLAlchemy ORM.

The User class I am using just needs three properties:

  • username
  • token
  • secret

For Flask-OAuth, most of the default configuration works as described. However, I did have to change the request_token_url, access_token_url and authorize_url to use https. The authorize_url should point to “https://api.twitter.com/oauth/authenticate” for processing the login.

I had trouble getting it to recognize the callback URL: I kept getting this error: raise OAuthException('Failed to generate request token'). I gave up debugging it and just added the correct callback URL in the app settings on Twitter.

Using Flask-Login is also straightforward. I initially didn’t know how to enable a User class to use the login system; it turns out it’s just four methods that have to be included in the class:

def get_id(self):
  return self.id

def is_authenticated(self):
  return True

def is_active(self):
  return True

def is_anonymous(self):
  return False

The glue that makes the two play nicely together is all in these three functions:

@app.route('/login')
def login():
  if current_user.is_authenticated():
      return redirect('/')
  return twitter.authorize(callback=url_for('oauth_authorized',
      next=request.args.get('next') or request.referrer or None))

Before I send a Twitter OAuth request, I make sure the current_user is not authenticated; if I didn’t, the OAuth request would fail. Not sure why yet.

@app.route('/oauth-authorized')
@twitter.authorized_handler
def oauth_authorized(resp):
  next_url = request.args.get('next') or url_for('index')
  if resp is None:
      return redirect(next_url)

  this_account = Account.query.filter_by(username = resp['screen_name']).first()
  if this_account is None:
      new_account = Account(resp['screen_name'], "", resp['oauth_token'], resp['oauth_token_secret'])
      db.session.add(new_account)
      db.session.commit()
      login_user(new_account)
  else:
      login_user(this_account)

  return redirect(next_url)

What the callback handler does:

Once the authentication is complete, it looks in the database for the Twitter username. If it’s not found, it creates a new account and calls login_user(). If it is found, it uses the returned account object to log in the user.

@twitter.tokengetter
def get_twitter_token():
  if current_user.is_authenticated():
      return (current_user.token, current_user.secret)
  else:
      return None

This third function is self-explanatory.

Now, it’s as easy as adding the @login_required decorator before a secure page function.
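Conceptually, @login_required is just a wrapping decorator that bounces anonymous users before the view runs. This sketch mimics the pattern with a hypothetical is_authenticated() check; it is not Flask-Login’s actual implementation:

```python
from functools import wraps

def login_required(view):
    @wraps(view)                      # preserve the view's name and docstring
    def wrapped(*args, **kwargs):
        if not is_authenticated():
            # Flask-Login would redirect to the configured login view here
            return "redirect to /login"
        return view(*args, **kwargs)
    return wrapped

def is_authenticated():
    # hypothetical stand-in for Flask-Login's current_user check
    return False

@login_required
def settings_page():
    return "secret settings"
```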

Film: Canon AE-1 Program

Startup in… Latin America?

Paul Graham recently wrote about Why Startup Hubs Work. He cited two reasons why startups do not die in startup hubs:

“[1] being in a place where startups are the cool thing to do, and [2] chance meetings with people who can help you. And what drives them both is the number of startup people around you.”

Starting a startup is hard enough; doing so in a place where it is seen as synonymous with unemployment is a major setback. And chance meetings with mentors and people pursuing the same ideas can be miraculous for a startup. While I believe these antidotes to “startup death” hold true in developed countries like the US, developing countries like Guatemala bring new factors into play.

In the United States, the number of entrepreneurs has hovered around 10%, give or take, depending on the economy. In Latin America, on the other hand, this number averages around 20%, with highs of 30-35% in Peru and Bolivia. Starting a startup is not looked down upon as unemployment; indeed, these people are thought highly of in the community.

Other quirks of a developing country that a startup can benefit from are the scarcity of good wages and the cheap abundance of resources. More people are likely to start their own business when they can earn better wages working for themselves. Work-live spaces, an essential for startup entrepreneurs, are easier than ever to create when the cost of living is a fifth of that in the US. Latin American countries also tend to adopt new technologies rapidly: most people in Guatemala leapfrogged from no phone connection within miles to 3G cellphones in a couple of years. Argentina, Venezuela, Colombia, Chile, Mexico, Puerto Rico and Peru are 7 of the top 20 countries using social networks.

Lacking the strong community of mentors that startup hubs like Silicon Valley, Boston or Boulder offer is one of the biggest setbacks. But as connectivity and education increase, there is no question this will change. Guatemalan businesses are starting to put up .com.gt websites much like the early days of the .com bubble years ago. Given the rate of adoption of new technologies, the creativity and self-discipline people bring to new startups, and the economy of resourcefulness, Latin America will soon produce startups to rival Silicon Valley.

On Extracting an Element From a Web Page With CSS Styles

The goal: to extract an element from a web page and display it independently of its original website.

Note: I initially toyed with running this whole process server-side, and it still may be the more efficient way to do this. It is extraordinarily difficult, though, as you will soon see.

First, we fetch the complete web page from our target site. For example, news.google.com. We then dump the web page into an invisible div.

<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://news.google.com/');
// Return the response as a string instead of printing it directly
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$html = curl_exec($ch);
curl_close($ch);
?>

<div style="display:none;">
    <?php echo $html; ?>
</div>

Our target element is the Recent News box on the right of the page. The id is ‘s_BREAKING_NEWS_BOX’. Extracting the HTML is easy enough with jQuery:

$('<div>').append($('#s_BREAKING_NEWS_BOX').clone()).remove().html();

If we were to just output the HTML at this point, we would get text with default browser styles. We need to copy the computed CSS styles from the original rendering to our new element. Firefox, Chrome and Safari use getComputedStyle, IE uses currentStyle.

Here’s our JavaScript function to do just that:

function getPropValue(ele, styleProp) {
    if (ele.currentStyle) {
        var y = ele.currentStyle[styleProp];
    } else if (window.getComputedStyle) {
        var y = document.defaultView.getComputedStyle(ele, null).getPropertyValue(styleProp);
    }
    return y;
}

This function requires an element, and the CSS property that we want to grab. So we can define an array of properties we want to copy, then loop through these properties to get the entire style for the element.

var styles = ["color", "font-family", "font-size", "line-height", "white-space", "padding", "display", "float", "border", "border-top", "border-right", "border-bottom", "border-left", "border-color", "border-width", "border-style", "padding-top", "padding-right", "padding-bottom", "padding-left", "height", "font-weight", "margin-top", "margin-left", "margin-bottom", "margin-right", "text-decoration"];

function getStyles(ele, styles) {
    var values = new Array();
    for (var i=0; i < styles.length; i++) {
        values[i] = getPropValue(ele, styles[i]);
    }
    return values;
}

If we give getStyles an element reference and an array of styles, we get back an array of computed CSS values for that element. But remember, we need to do this for every single child element, not just the parent s_BREAKING_NEWS_BOX element. So we need a recursive loop that runs getStyles on the parent, its children, grandchildren, etc. We also keep a running counter, index, to give each element’s styles a position in an array, element_styles. Thanks to Adam Bratt for help getting this to work.

// index is a global counter; reset it before the first call
var index = 0;

function loopChildrenGrab(this_ele, styles, element_styles) {
    element_styles[index] = getStyles(this_ele, styles);
    index++;

    if ( $(this_ele).children().length > 0 ) {
        $(this_ele).children().each(function(){
            loopChildrenGrab(this, styles, element_styles);
        });
    }
    return element_styles;
}

Now we have a multidimensional array of every single element’s styles, ordered by each element’s position in the DOM. So we can delete the whole web page that we loaded at first, then run the reverse of the previous functions to assign each CSS property to our copied HTML:

function pushStyles(ele, styles, values) {
    for (var i=0; i < styles.length; i++) {
        $(ele).css(styles[i], values[i]);
    }
}

// count is a global counter; reset it before the first call
var count = 0;

function loopChildrenPush(this_ele, styles, element_styles) {
    pushStyles(this_ele, styles, element_styles[count]);
    count++;

    if ( $(this_ele).children().length > 0 ) {
        $(this_ele).children().each(function() {
            loopChildrenPush(this, styles, element_styles);
        });
    }
}

Ta-da! We have now successfully copied the HTML and flattened the CSS for each element. s_BREAKING_NEWS_BOX can now be loaded independently of its parent page.

What would it take to do this server-side? In short, render the HTML in some sort of web view, then extract the flattened CSS for every element. While that is possible, and most likely the way to proceed with this project, letting the browser do the heavy lifting of rendering CSS is definitely easier in the short term.
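To make the server-side idea concrete, here is the core recursion in Python. Element and the get_computed_style callable are hypothetical stand-ins for whatever a headless web view (PhantomJS was the usual choice at the time) would supply:

```python
class Element:
    """Hypothetical DOM node, as a headless renderer might expose it."""
    def __init__(self, tag, children=None):
        self.tag = tag
        self.children = children or []
        self.inline_style = None

def flatten_styles(element, get_computed_style):
    # Server-side analogue of loopChildrenGrab + loopChildrenPush:
    # bake each node's computed style into an inline style, recursively.
    element.inline_style = get_computed_style(element)
    for child in element.children:
        flatten_styles(child, get_computed_style)
```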