Archive for April, 2008


Das Rad (The Rock)

I’m working on a short story about different perspectives over a long period of time at the moment, and was bouncing around the interwebs for inspiration when I came across this little gem from the 2003 Academy Awards for short films. It didn’t win, but it clearly deserved the nomination.

[Embedded YouTube video: Das Rad]

There’s a higher resolution version (definitely recommended for some of the details in the animation) available too.

Notes From a Small Internet

Beautiful Soup is a great little Python module that will read just about any HTML page and give you back a structured parse tree. It’s awesome because you can pass it just about any mangled markup; I’ve never known it to choke on anything. For some web service consumers I’ve had to write over the years, Beautiful Soup has saved me many, many hours of slogging through crappy HTML parsing. Great software deserves appreciation.
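To give a feel for what tolerant parsing means, here is a minimal sketch that pulls links out of deliberately broken markup. To stay dependency-free it uses the standard library’s html.parser rather than Beautiful Soup itself, so treat it as an illustration of the idea and not of Beautiful Soup’s API. The names LinkCollector, extract_links and MANGLED are made up for this example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, text) pairs from anchors, tolerating broken markup."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (href, text) pairs
        self._href = None    # href of the anchor we are currently inside
        self._text = []      # text fragments seen inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._flush()    # an unclosed <a> ends when the next one starts
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_endtag(self, tag):
        if tag == "a":
            self._flush()

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def _flush(self):
        # Store the anchor collected so far, if any, and reset the state.
        if self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
        self._href = None
        self._text = []

    def close(self):
        super().close()
        self._flush()        # pick up an anchor left open at end of input


def extract_links(markup):
    parser = LinkCollector()
    parser.feed(markup)
    parser.close()
    return parser.links


# Mis-nested <b>, an <a> that is never closed, and no closing tag for it.
MANGLED = '<p>See <a href="/one">first<b> link</a> and <a href="/two">second</p>'
print(extract_links(MANGLED))
```

With Beautiful Soup installed, the modern equivalent is roughly BeautifulSoup(markup, "html.parser").find_all("a"), which hands back tag objects instead of making you track parser state by hand.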

Whilst browsing my good friend Rachel’s website I happened to notice that her brother Leonard wrote Beautiful Soup. He also wrote RESTful Web Services, which is part of my (recently pruned) dead tree collection, and which I’d heartily recommend to anyone who has to work with REST web services. The Django examples were especially useful!

Google’s AppEngine Beat Me To It

Recently I’ve been putting some time into writing a database adapter for Django that uses Amazon’s S3 and SimpleDB services as a storage layer, whilst trying to retain as much of Django’s QuerySet functional layer as possible. The general goal is to provide a storage back-end for Django that isn’t dependent on the traditional vertically-scaling database server, but can scale horizontally in the same way as the EC2 computing cloud does. My eventual goal is the ability to deploy Django in the cloud with no external dependencies. Just throw out a Django machine image, deploy your app’s code and config, and you have a scaling solution that takes minutes rather than days or weeks.

It’s a non-trivial exercise that is both stimulating and frustrating in equal measure, and progress has been steady, if not exactly rapid. It’s worth it to me though, as the ability to roll out scaling infrastructure is dramatically hampered by the database layer.

Imagine my delight then to find that Google have launched AppEngine, their own cloud-based web application system. It’s Python without any messy machine-based libraries, uses WSGI so you can use pretty much any Python web app, and comes with GFS for distributed file storage and BigTable as a data persistence layer. Google even throws in Django 0.96.1 with instructions on how to use their storage layers by doing away with Django’s own model layer (more on this later).
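For anyone who hasn’t met WSGI, the whole interface is a single callable, which is why a WSGI host can run pretty much any Python web app. The sketch below is plain WSGI in modern Python 3 syntax, not anything AppEngine-specific. The names application and call_app are made up for this example, and call_app simply plays the part of the host.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A complete WSGI app: one callable, no server-specific code."""
    name = environ.get("PATH_INFO", "/").strip("/") or "world"
    body = f"Hello, {name}!".encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_app(app, path="/"):
    """Invoke a WSGI app in-process, much as any WSGI host would."""
    environ = {}
    setup_testing_defaults(environ)   # fill in the required WSGI keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body


print(call_app(application, "/seb"))
```

Because the app only ever talks to environ and start_response, nothing in it cares which server or platform is doing the calling.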

There’s a lot of whining about how Google’s solution cripples Python (which is crazy when you look at how trivial it is to refactor code to use Google’s supplied alternatives), and locks you into their solution. I suspect that this is mostly from people who have never even contemplated building an application that needs to really scale, and are therefore still thinking in terms of services provided by the underlying OS. That’s a big problem for scaling, because disk, IO, threads, sockets, etc. are finite resources that are hardware-bound. Abstracting access to these things is tough. Most scaling solutions these days are about providing multiple hardware instances, but unfortunately that only solves the hardware problem. Building an app that scales transparently over multiple hardware instances is a huge challenge in comparison to procuring more servers.

Google’s approach is to do away with the concept of hardware entirely. That means a change of mindset towards treating every request as an atomic operation. Persistence occurs (correctly) in your persistence layer and not in transient storage available to an instance of your application. Google have provided extensive Python libraries and API calls to enable applications to take advantage of this, but it seems that a fairly vocal group aren’t interested unless their applications work on AppEngine without any additional effort. Considering the paradigm shift that AppEngine represents (from machine-centric programming to distributed programming), it’s not unreasonable to expect some small effort to be required, especially when you take into account that AppEngine is currently in a very limited trial phase.
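A rough sketch of that mindset, with nothing AppEngine-specific in it: the application instance keeps no state of its own, and everything that must survive a request goes through a shared store. DictStore and CounterApp are hypothetical names, and the in-memory dict merely stands in for a real persistence layer.

```python
class DictStore:
    """Hypothetical stand-in for a shared persistence layer."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def put(self, key, value):
        self._data[key] = value


class CounterApp:
    """One application instance; it holds no request state of its own."""

    def __init__(self, store):
        self.store = store

    def handle(self, page):
        # Read, update and write back within a single request.
        hits = self.store.get(page, 0) + 1
        self.store.put(page, hits)
        return hits


# Two instances, as if running on different machines, sharing one store.
store = DictStore()
instance_a = CounterApp(store)
instance_b = CounterApp(store)

print(instance_a.handle("/home"))
print(instance_b.handle("/home"))
```

Whichever instance picks up the next request sees the same count, because neither one keeps it locally. A real datastore would also need a transaction around the read and write to stay correct under concurrent requests, which this sketch ignores.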

I’m extremely optimistic that Google’s approach will work well for a number of reasons. As an application programmer I spend huge amounts of time working around hardware and platform limitations that I should be spending on core functional areas. If Google can provide a solution that means I never have to worry about specific hardware problems ever again, I doubt I’ll look back.

View from my Garden



View from my Garden, originally uploaded by iamseb.

You wouldn’t think this is April, rather than January.

View from my Bedroom



View from my Bedroom, originally uploaded by iamseb.

Finally, snow in London!
