Force is inelegant.
 
March 16th, 2010

Is Canvas the End of Flash?

Loose notes from the SXSW 2010 panel discussion Is Canvas the End of Flash? This debate is really heating up as more browsers gain Canvas support and sentiment seems to be rapidly turning against Flash. But how feasible is it to consider the canvas element a real Flash replacement? Five panelists hashed it out, with excellent points on all sides. Very useful session.
(more…)

March 16th, 2010

Why Your Baby is Ugly – Effective Dashboard Design

Loose notes from SXSW 2010 session Why Your Baby is Ugly – Effective Dashboard Design, with Aaron Hursman of Hitachi Design. Though I’ve only ever worked on one dashboard system, I am interested in data visualization, and this was an excellent crossover session for both dataviz and information design concepts.
(more…)

March 16th, 2010

Prototyping Web Apps – Nobody Loves a Wireframe

Loose notes from SXSW 2010 session Prototyping Web Apps – Nobody Loves a Wireframe, with Darren Delaye and Michael Leggett of Google. I’m more of a back-end guy than a designer, but with an increasing interest in design considerations and usability. This became one of the most useful sessions of the conference for me.
(more…)

March 16th, 2010

RIP Content Management System

Loose notes from SXSW 2010 session RIP Content Management System by Drupal creator Dries Buytaert.

Unfortunately, the “R.I.P.” part of the session title was never addressed, nor were any of Drupal’s core shortcomings or architectural annoyances. This was just a 30-minute infomercial for Drupal.

Would really have preferred to hear Dries talk about plans to address Drupal’s deep architectural problems, like the lack of object orientation, lack of an ORM, lack of MVC, and the annoying templating system. Took notes anyway.
(more…)

March 15th, 2010

Wow, That’s Cool… Fun With HTML5 Video

Loose notes from the SXSW 2010 session Wow, That’s Cool… Fun With HTML5 Video, with Michael Dale of Wikimedia and Christopher Blizzard of Mozilla.
(more…)

March 15th, 2010

HTML5: Tales from the Development Trenches

Loose notes from the SXSW 2010 session HTML5: Tales from the Development Trenches, in two parts (history lesson and examples). With Bruce Lawson of Opera and Martin Kliehm of namics.
(more…)

March 14th, 2010

Coding for Pleasure: Developing Killer Spare-Time Apps

Loose notes from the SXSW 2010 session Coding for Pleasure: Developing Killer Spare-Time Apps, hosted by:

Gina Trapani of Lifehacker, now author of a Google Wave book. Also made BetterGmail and ThinkTank.
Matt Haughey – Fuelly (public social miles-per-gallon site), also creator of MetaFilter (now a 4-employee corporation).
Adam Pash – MixTape.me (playlist/music sharing site). Also Belvedere and Texter.
(more…)

March 14th, 2010

Server-Side Javascript

Loose notes from SXSW 2010 session Javascript: The Front and the Back of It, on using server-side Javascript to reduce the pain points of the few non-DRY areas left in MVC stacks.

(more…)

March 13th, 2010

Is WordPress Killing Web Design?

Loose notes from SXSW 2010 session: Is WordPress Killing Web Design

Good question – I’ve been asking myself this lately. Unfortunately the session quickly devolved into a lot of platitudes and stating of the obvious. Yes, design has been commoditized and is no longer an “elite” activity. Yes, your site is as creative as you make it, it has nothing to do with the CMS you use. All pretty much goes without saying. Took notes for half an hour, then headed to the HTML5 discussion… which was full and not allowing more people in.
(more…)

March 13th, 2010

Web Fonts: The Time Has Come

Loose notes from SXSW 2010 session Web Fonts: The Time Has Come

(more…)

November 17th, 2009

What a Traffic Spike Looks Like

This blog has been chugging along at around 300 visitors per day for the past few months (it was much better back before Twitter nearly killed my urge to blog at all). But the recent Drupal or Django article went a little viral, and things have been nutty over the past 48 hours:

[Image: Spike]

WP-SuperCache has held up admirably, scarcely a performance blip felt on the Birdhouse VPS.

November 11th, 2009

Drupal or Django? A Guide for Decision Makers

Target Audience

There’s a large body of technical information out there about content management systems and frameworks, but not much written specifically for decision-makers. Programmers will always have preferences, but it’s the product managers and supervisors of the world who often make the final decision about which platform to deploy a sophisticated site on. That’s tricky, because web platform decisions are more-or-less final: it’s very, very hard to change out the platform once the wheels are in motion. Meanwhile, the decision will ultimately be based on highly technical factors, while managers are often not highly technical people.

This document aims to lay out what I see as being the pros and cons of two popular web publishing platforms: The PHP-based Drupal content management system (CMS) and the Python-based Django framework. It’s impossible to discuss systems like these in a non-technical way. However, I’ve tried to lay out the main points in straightforward language, with an eye toward helping supervisors make an informed choice.

This document could have covered any of the 600+ systems listed at cmsmatrix.org. We cover only Drupal and Django because those systems are highest on the radar at our organization. It simply would not be possible to cover every system out there. In a sense, this document is as much about choosing between a framework and a content management system as it is about choosing between specific platforms, and the discussion of Drupal and Django below can be seen as a stand-in for that larger discussion.

Disclosure: The author is a Django developer, not a Drupal developer. I’ve tried to provide as even-handed an assessment as possible, though bias may show through. I will update this document with additional information from the Drupal community as it becomes available.

(more…)

November 8th, 2009

django-treedata: DataSF Contest Winner

Recently I was invited to participate in the California Data Camp and DataSF App Contest hosted by California Watch and spot.us. The unconference would feature lots of discussion about making use of publicly available data sets to improve quality of life. The App Contest challenged developers to choose one of the many data sets available at DataSF.org and build something cool with it in a relatively short period of time.

Long story short — my contest entry, which explores San Francisco’s database of publicly maintained trees and plants, won the competition! Full details, and downloadable source code, available at my Scripts and Utilities site.

Thanks so much to David Cohn of Spot.us and all of the conference organizers and supporters. Thanks also to J-School webmaster Chuck Harris for his contributions to the project. It was a great day, and winning the competition was a total surprise. Now I just need a city to take the source code and run with it.

spot.us has covered the event live throughout the day.

Huffington Post mentioned django-treedata in Sophisticated Tree Hugging: the Pure Joy of Public Data

November 5th, 2009

Birdhouse 960

Blog look different? At first glance, not by much, but I’ve just completed a massive cleanup of the back-end, replacing the old HTML/CSS with the 960 Grid System, starting with the 960bc (blank canvas) WordPress theme. While I was at it, took the opportunity to search/replace out a bunch of old non-semantic code buried in the posts, updated or replaced a lot of plugins, and killed off a few old features that had out-lived their usefulness.

The biggest news: After years of preaching the HTML validation gospel to students, I still hadn’t gotten around to trying to make my own platform validate… but the Foobar Blog finally does! Well, almost. There will always be 3rd party code outside your control that can’t be hammered into shape. The biggest offenders here are embedded Flickr slideshows and WordPress’ own embedded Gallery feature. Ugh. But aside from that, we’re pretty darn close to clean. Everything I can control validates at least.

The old design had accreted slowly over the years, from a patchwork of parts built and gathered. Original intention was to go for a clean break and adopt a modern 3rd-party theme, but the more I searched, the more I felt like I loved the “Cheap Thrills” design that’s evolved here (not available for download, sorry). So I decided to port Cheap Thrills to 960. It wasn’t all roses, since the divs in this theme hug each other so tightly, while 960 assumes margins everywhere. A lot of fiddling with negative margins, and I haven’t solved the equal-height divs problem quite yet. Will do soon.

New in this implementation:

  • Much wider content area. Goal is to be able to show full-width video and slideshows, plus code samples that don’t fold to the next line or stretch out of the content space.
  • Syntax highlighting for code samples (example)
  • Tag cloud (see sidebar) – I’ve been tagging random articles for a long time but didn’t want to display a cloud until there were enough of them to warrant it. Still haven’t gone through and tagged the entire site history, but the cloud is picking up steam.
  • General cleanup. Cruft removal. So. Much. Cruft.
  • Somewhat wider sidebar – more room for Image from Nowhere and Recent Comments. Some of the old Images from Nowhere look a bit stretched but future ones will be generated larger.
  • Replaced my old handmade RSS-based Twitter integration with Twitter for WordPress. Super clean – much better for DIY theme builders than the usual TwitterTools.
  • The old Democracy plugin for polling appears to have been abandoned. Replaced it with the much cleaner WP-Polls, which also meant manually copying all of the old Poll data into the new system (ugh!). See the Pollster section.
  • Replaced the old contact form in the shacker contacter with the much simpler Contact Form 7.
  • Nips and tucks galore.

Process took way longer than expected of course – everything does – but these things had been gnawing away at me for a long time now. Feels great to have it all done. Haven’t done any cross-browser testing yet – let me know if something doesn’t look right for you.

Can’t say enough good things about 960 Grid. We’ve standardized on it at work, and it really does make life easier. Not without its warts, but much more pleasant than the YUI grid it replaces.

October 20th, 2009

Generating RSS Mashups from Django

I recently got to work on an interesting Django side project: the Bay News Network – a directory of Bay Area bloggers and hyperlocal news sites. The goal of the site was three-fold:

  1. To create a many-to-many directory of local sites that matched our editorial criteria
  2. To let site owners log in and edit their own listings
  3. To both consume and produce RSS feeds from the listed sites

The first two were pretty standard Django approaches – develop data models and editing interfaces using Django forms and re-usable apps like django-profiles and django-registration. The third goal turned out to be more interesting. We not only had to gather RSS feeds from more than 100 external sites several times per day, we needed to re-mix them (e.g. provide an integrated feed representing all blogs that cover Food, or all blogs that cover Oakland).

“Consuming” RSS feeds meant we needed to integrate feeds from the external sites into our own site. At the most basic level, this was pretty straightforward using Mark Pilgrim’s excellent Universal Feed Parser, which turns the real world’s tag soup of disparate, incompatible RSS formats into a reliable data format you can step through in your code or templates. This worked well enough until I realized that grabbing and parsing external feeds in real time was just not going to scale, performance-wise. Plus, we still had the RSS mashups to build, and would clearly need to store feed entries in our own database in order to sort them by category, etc.

Thus began the hunt for good feed aggregation systems for Django. Most roads pointed to django-planet, planet planet, and FeedJack, which are systems for gathering content from external sites and importing it into a single aggregated site. These were close to what I wanted, but weren’t great on the re-usability side. Since I already had existing models to define the sites, their owners, and their feeds, I didn’t want to rewrite all my models to work with another system’s conception of how things should be laid out. I also didn’t feel like plowing through their source code to chop out and rewrite just the bits I wanted. Eventually realized that I was looking for a few lines of code to work with my system, not a whole external system.

The surprising solution came from the Community section of the official Django project web site. The Django developers keep the code that drives djangoproject.com in subversion along with the source code to Django itself. And the code that drives that section of the site is really lightweight. So I did a subversion checkout of the Aggregator app, and found that all I really needed from it was its update_feeds.py script, which itself is a wrapper around Universal Feed Parser, tweaked to talk to my own models.

Two gotchas to be aware of:

  1. The app includes a bundled templatetags directory with a file called aggregator.py. But the name of the app itself is “aggregator.” I was getting strange import errors in various places before I discovered on the django-users mailing list that Django doesn’t like it when an app name matches a templatetag name. Easily fixed by renaming the templatetag.
  2. My first runs of update_feeds.py went fine, but later started erroring out with database integrity errors. The GUID field on the FeedItem model is set to unique=True, which prevents your database from storing any one FeedItem more than once. That’s great, but it was dishing up integrity errors for some reason. I fixed this by changing this line in update_feeds.py:
feed.feeditem_set.get(guid=guid)

to:

FeedItem.objects.get(guid=guid)
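
Why scoping the lookup to one feed can trip a unique constraint is easiest to see outside Django. The sketch below is only a guess at the cause, not a diagnosis of the original bug: it assumes the same GUID can show up under two different feeds (say, a post syndicated to two blogs). The table and column names are hypothetical stand-ins for the FeedItem model, and sqlite3 stands in for the ORM.

```python
import sqlite3

# Hypothetical stand-in for the FeedItem table: guid is unique across ALL feeds.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE feeditem (id INTEGER PRIMARY KEY, feed_id INTEGER, guid TEXT UNIQUE)"
)

def exists_in_feed(feed_id, guid):
    """Per-feed lookup, analogous to feed.feeditem_set.get(guid=guid)."""
    row = conn.execute(
        "SELECT 1 FROM feeditem WHERE feed_id = ? AND guid = ?", (feed_id, guid)
    ).fetchone()
    return row is not None

def exists_anywhere(feed_id, guid):
    """Global lookup, analogous to FeedItem.objects.get(guid=guid)."""
    row = conn.execute("SELECT 1 FROM feeditem WHERE guid = ?", (guid,)).fetchone()
    return row is not None

def store(feed_id, guid, lookup):
    """Insert the item unless the chosen lookup says we already have it."""
    if lookup(feed_id, guid):
        return "skipped"
    try:
        conn.execute(
            "INSERT INTO feeditem (feed_id, guid) VALUES (?, ?)", (feed_id, guid)
        )
        return "inserted"
    except sqlite3.IntegrityError:
        return "integrity error"

guid = "http://example.com/post/1"  # made-up GUID
first = store(1, guid, exists_in_feed)
# Same GUID arrives via a second feed: the per-feed check misses the existing row.
second = store(2, guid, exists_in_feed)
# The global check sees the existing row and skips the insert.
third = store(2, guid, exists_anywhere)
```

Under that assumption the per-feed check lets a duplicate through to the insert, where the unique constraint rejects it, while the global check catches it first.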

Once I was able to get the updater to run consistently without error, I needed to get it running via cron. The trick to running a Python script that talks to the Django ORM from a crontab is that you must supply the full Python paths in the environment to cron – it doesn’t pick them up automatically from the environment of the user that runs the cron job. This worked for me:

PYTHONPATH=/home/bnn/projects:/home/bnn/projects/bnn
DJANGO_SETTINGS_MODULE=bnn.settings
20 15 * * * python /home/bnn/projects/bnn/scripts/update_feeds.py 2>&1
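
An alternative to setting those variables in the crontab is to set them at the top of the script, before any Django import. This is a sketch of that idea, not what update_feeds.py actually does; it reuses the paths and settings module from the crontab above, and the function name is made up.

```python
import os
import sys

# Paths copied from the crontab example; adjust for your own layout.
PROJECT_PATHS = ["/home/bnn/projects", "/home/bnn/projects/bnn"]

def prepare_django_environment(paths=PROJECT_PATHS, settings_module="bnn.settings"):
    """Mimic PYTHONPATH and DJANGO_SETTINGS_MODULE from inside the script."""
    for path in reversed(paths):
        if path not in sys.path:
            # Insert at the front so project modules win over same-named ones.
            sys.path.insert(0, path)
    # setdefault leaves alone any value already supplied by cron or the shell.
    os.environ.setdefault("DJANGO_SETTINGS_MODULE", settings_module)

prepare_django_environment()
# Django imports (models, ORM queries) would follow from this point on.
```

The trade-off is that the paths are now baked into the script rather than the crontab, so pick whichever place you are more likely to remember to update.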

Producing Feeds

With the harvesting system up and running, and all content coming into the database associated with blogs that were in turn categorized by “beat” and geographical area, outputting aggregated RSS feeds was a simple matter of using Django’s native syndication framework as documented. This went into urls.py:

feeds = {
    'all': AllFeeds,
    'cat': CategoryFeeds,
    'area': BeatFeeds,
}

# Feeds
url(r'^feeds/(?P<url>.*)/$', 'django.contrib.syndication.views.feed', {'feed_dict': feeds}),

… and I created a file feedgenerator.py to contain the three corresponding classes and their querysets, using Holovaty’s sample code from chicagocrime.org as a starting point.
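
The querysets behind those classes all do much the same job: pick the stored entries whose blog matches a category or area, then return the newest first. A framework-free sketch of that selection logic follows. The entry fields, sample data, and function name are all hypothetical, and plain tuples stand in for model instances; the real classes used Django querysets instead.

```python
from collections import namedtuple
from datetime import datetime

# Hypothetical stored feed entry; on the real site these were database rows.
Entry = namedtuple("Entry", "title blog categories published")

ENTRIES = [
    Entry("Taco trucks", "Eat Oakland", {"food", "oakland"}, datetime(2009, 10, 1)),
    Entry("Council vote", "Oakland Local", {"oakland"}, datetime(2009, 10, 3)),
    Entry("Best pho", "SF Bites", {"food"}, datetime(2009, 10, 2)),
]

def mashup(entries, category=None, limit=10):
    """Return the newest entries, optionally restricted to one category."""
    if category is not None:
        entries = [e for e in entries if category in e.categories]
    newest_first = sorted(entries, key=lambda e: e.published, reverse=True)
    return newest_first[:limit]

food_titles = [e.title for e in mashup(ENTRIES, "food")]
all_titles = [e.title for e in mashup(ENTRIES)]
```

Because entries are already stored locally, each mashup is just a filter and a sort, with no network fetch at request time.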

September 27th, 2009

Python-MySQL Connections and Snow Leopard

Apparently I’m not the only one having trouble getting MySQL and Python to play nice under OS X — last February’s post on getting the two to cooperate under OS X has generated a ton of traffic. Now I’ve upgraded to Snow Leopard and faced a handful of new challenges (but eventually got it working). Rather than scatter my notes, I’ve updated the original post with Snow Leopard instructions.

September 7th, 2009

Populate Mailman Lists from Django Projects

I spent much of the summer building an intranet in Django for Miles’ school. Since the school is a co-op, we need to keep track of a lot of stuff – charges, credits, and obligations, parents, students, teachers, family jobs, committee membership, the board, etc. etc. I’m happy with how the site came out, but unfortunately can’t share it here, since it’s a private site.

One of the goals of the rebuild was to put an end to the laborious manual process of maintaining the school’s multiple overlapping mailing lists. Since all of those relationships, people types, and groups were already stored in the intranet’s database, I figured it should be possible to run various queries and populate Mailman mailing lists from them directly. Due to the messy nature of the real world, the process was a lot trickier than it sounds on paper, but I eventually did get a smoothly working list generation system up and running, talking to our Django system and working with virtually no manual intervention. Members can update their own profiles and find that their mailing list subscription address has changed automatically a few hours later. Administrators can give someone a new family job or board position and that person will find themselves subscribed to the right mailing list for it later that day.

Since there isn’t much published out there on making these two systems (Django and Mailman) play nicely together, I decided to publish the scripts and document the recipe I used to get it all working. Hope someone finds the system useful.

June 27th, 2009

django-profiles: The Missing Manual

The User model in Django is intentionally basic, defining only the username, first and last name, password and email address. It’s intended more for authentication than for handling user profiles. To create an extended user model you’ll need to define a custom class with a ForeignKey to User, then tell your project which model defines the Profile class. In your settings, use something like:

AUTH_PROFILE_MODULE = 'accounts.UserProfile'

To make it easier to let users create and edit their own Profile data, James Bennett (aka ubernostrum), who is the author of Practical Django Projects and the excellent b-list blog, created the reusable app django-profiles. It’s a companion to django-registration, which provides pluggable functionality to let users register and validate their own accounts on Django-based sites.
(more…)

May 20th, 2009

Webcasting with Django

The Knight Digital Media Center, which runs on Django, hosts week-long workshops for working journalists who come from around the country to learn multimedia and internet technology skills. We fill many of our lunch and dinner sessions with talks by journalism industry experts and pundits, and webcast their presentations live. After workshops are over, we post the archived video for posterity. There’s more to handling multi-day, multi-part live and archived video with Django and a genuine streaming server than meets the eye, so thought I’d break it down.

An “event” can last any number of days, and can include any number of presentations, each of which may or may not include a webcast. While the event is in progress, you want the ability to advertise a single URL, where all of the live webcasts will happen. But for the archives, which is where the vast majority of viewing happens over the course of time, you want a separate page/URL for each presentation. Presentation pages include details on that speaker, summaries of what was presented, and optional downloads of PowerPoint or Keynote presentations. Our Presentation model is foreign-keyed to a master Event model (or, in our case, the Workshop model).

Because they’re time-based, synchronous events, webcasts are different from typical web pages. There are five possible “states” a webcast page can be in at any given time, all of which require different things to be inserted into the view:

Upcoming: The event is announced but there’s nothing yet to show. Tell user that webcast will be live at posted time (along with schedule).

In progress: The event is occurring. Insert appropriate object code to embed live QuickTime stream.

Concluded: The live webcast has ended, but the archives haven’t yet been prepared and posted (this can take us a few days). Tell user to come back soon.

Archive: The archived video is prepared and available on the streaming server for posterity. Insert appropriate object code to display streamed archive file from QuickTime Streaming Server.

External: We sometimes host events at other locations on campus, in which case UC Berkeley handles the webcasting rather than us. If so, we need to link from our events database to theirs. Insert appropriate message and link.

In Django, we represent these choices with the typical CHOICES construct:

webcast_state = models.CharField(max_length=4, choices=WEBCAST_STATE_CHOICES)

… which ends up looking like this in the Django admin:

[Image: webcast_state]

Depending on the current state, different content (text or object/embed code) is inserted into the page in real time (using simple conditionals in Django templates). The Django admin thus becomes a handy tool our student helpers can use to make the master workshop page embed the right thing in the right place at the right time without requiring tech skills. Remember, during the course of a workshop week, all video is happening in the master Workshop page – later, streaming video archives will go into separate Presentation pages and be automatically linked to from the parent Workshop page.
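
A sketch of how the five states might be declared and mapped to page content. The four-letter codes and the messages are invented for illustration (the original choices tuple isn’t shown); the only constraint taken from the field definition is that each code fits in max_length=4. On the actual site the branching lived in Django templates; a plain function stands in for that here.

```python
# Hypothetical codes; each must fit the field's max_length of 4.
WEBCAST_STATE_CHOICES = (
    ("upco", "Upcoming"),
    ("prog", "In progress"),
    ("conc", "Concluded"),
    ("arch", "Archive"),
    ("extl", "External"),
)

def webcast_content(state, live_embed="", archive_embed="", external_url=""):
    """Return what the page should show for a given webcast state."""
    if state == "upco":
        return "The webcast will be live at the posted time."
    if state == "prog":
        return live_embed
    if state == "conc":
        return "The webcast has ended. Archived video will be posted soon."
    if state == "arch":
        return archive_embed
    if state == "extl":
        return "This event is webcast by UC Berkeley: " + external_url
    raise ValueError("unknown webcast state: %r" % (state,))

codes = [code for code, label in WEBCAST_STATE_CHOICES]
```

Raising on an unknown state is a choice made for this sketch; it surfaces a mistyped code immediately instead of rendering an empty page.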

Stream Handles

At the J-School, we use QuickTime Streaming Server, in part because it’s free, and in part because all of our workstations and most of our servers are Macs. We’ve contemplated switching to Flash streaming, but the simplicity of keeping everything Mac-native keeps us on QTSS for now.

Embedding a stream from an external QTSS server is not quite as straightforward as embedding a typical QuickTime movie. Video comes from QTSS over the rtsp:// protocol, rather than http://. And there’s the catch: You can’t embed an rtsp stream directly into a web page — instead, you need to embed a fake QuickTime movie (a “reference movie”), which is actually a text file with the .mov extension. That text file simply references the full URL of the rtsp stream coming from QTSS. The contents of a reference movie file might look like this:

<?xml version="1.0"?>
<?quicktime type="application/x-quicktime-media-link"?>
<embed src="rtsp://streamer.domain.edu/events/131.humanity_2.0.mov" />

Here’s where things get interesting as far as Django is concerned. We don’t want to have to create a physical reference movie for every single stream we serve. And yet, at the HTML level, we have to embed something that looks like a reference to a physically external movie file, e.g.:

<object classid="clsid:02BF25D5-8C17-4B23-BC80-D3488ABDDC6B"
 width="480" height="376"
  codebase="http://www.apple.com/qtactivex/qtplugin.cab">
 <param name="SRC" value="/presentations/webcast-archive.227.ref.mov">
 <param name="AUTOPLAY" value="true">
 <param name="CONTROLLER" value="true">
 <embed src="/presentations/webcast-archive.227.ref.mov"
  width="480" height="376" autoplay="true" controller="true"
  pluginspage="http://www.apple.com/quicktime/download/">
</object>

So how can we make Django think that /presentations/webcast-archive.227.ref.mov is an actual file on the server, which in turn contains the correct reference to the rtsp stream coming from the streaming server? In effect, it’s a “view within a view.”

[Image: webcast_setup]

Displaying the presentation page is straightforward Django – I won’t get into that here. But here’s how the “view within a view” stuff works. In the object section of the presentation page template there is a reference to:

<param name="SRC" value="/presentations/webcast-archive.{{object.id}}.ref.mov">

which resolves to something like:

<param name="SRC" value="/presentations/webcast-archive.267.ref.mov">

When the browser hits that line, it requests /presentations/webcast-archive.267.ref.mov from the server, which in turn triggers this entry in urls.py:

# Dots are escaped so they match only a literal "." in the file name.
url(r'^presentations/webcast-archive\.(?P<pres_id>\d+)\.ref\.mov$',
    'workshops.views.presentation_webcast_archive',
    name='workshops_presentation_webcast_archive'),
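
To see what that pattern captures, it can be tried with Python’s re module, which uses the same named-group syntax as Django’s regex URL patterns. The dots are escaped in this sketch so they match only literal dots (an unescaped . matches any character). The helper name is made up for illustration.

```python
import re

# Same pattern as the urls.py entry, with literal dots escaped.
PATTERN = re.compile(r'^presentations/webcast-archive\.(?P<pres_id>\d+)\.ref\.mov$')

def extract_pres_id(path):
    """Return the presentation id as a string, or None if the path doesn't match."""
    # Django matched URL patterns against the path without its leading slash.
    match = PATTERN.match(path.lstrip("/"))
    return match.group("pres_id") if match else None

matched = extract_pres_id("/presentations/webcast-archive.267.ref.mov")
unmatched = extract_pres_id("/presentations/webcast-archive.abc.ref.mov")
```

The captured value arrives in the view as a string, which is why the view can hand pres_id straight to a lookup by id.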

So after the presentation page has been rendered by Django and sent to the browser, a second (very simple) view, presentation_webcast_archive, is called, which is simply:

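from django.shortcuts import get_object_or_404, render_to_response
from django.template import RequestContext

from workshops.models import Presentation  # assumed location of the model


def presentation_webcast_archive(request, pres_id):
    """
    Generate a virtual QuickTime reference movie on the fly,
    to be embedded in presentation webcast pages.
    """

    # 404 if no presentation has this id.
    pres = get_object_or_404(Presentation, id=pres_id)

    return render_to_response('workshops/presentation_webcast_archive.txt',
        {
            'p': pres,
        }, context_instance=RequestContext(request),
    )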

That view spits out the same presentation object to a different template, presentation_webcast_archive.txt, which consists of:

<?xml version="1.0"?>
<?quicktime type="application/x-quicktime-media-link"?>
<embed src="rtsp://domain.edu/events/{{p.webcast_path}}/{{p.webcast_filename}}" />

Where webcast_path and webcast_filename are fields on the model representing the physical location of the QuickTime media on the streaming server (not the web server). After a workshop week is over, staff only need to hint the saved archive files, upload them to a directory and filename on the streaming server, enter those paths in the Django admin, and check the “Has Webcast” box. The rest is automatic.
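
Stripped of the template machinery, the reference movie is just three lines of text with a URL filled in. The sketch below builds the same text with a plain function; the function name and the sample path and filename are made up, and the slash-stripping is a convenience added here, not something the original template does.

```python
REFERENCE_MOVIE = (
    '<?xml version="1.0"?>\n'
    '<?quicktime type="application/x-quicktime-media-link"?>\n'
    '<embed src="rtsp://{host}/events/{path}/{filename}" />\n'
)

def build_reference_movie(host, webcast_path, webcast_filename):
    """Return reference-movie text pointing at a stream on the streaming server."""
    return REFERENCE_MOVIE.format(
        host=host,
        # Strip stray slashes so the joined URL has exactly one separator.
        path=webcast_path.strip("/"),
        filename=webcast_filename,
    )

text = build_reference_movie("domain.edu", "/2009/workshop/", "keynote.mov")
```

Whatever staff type into the path and filename fields in the admin ends up in that one rtsp URL.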

In a previous, PHP-based version of this system, we had to prepare an actual reference movie for every archive stream we hosted. By using this “view within a view” technique, Django has let us remove that part of the workflow.

February 21st, 2009

Python-MySQL Connections on Mac OS

Update: This entry has been updated for Snow Leopard.

In all of Mac-dom, there are few experiences more painful than trying to get Python tools to talk to a MySQL database. Installing MySQL itself is easy enough – Sun provides a binary package installer. Python 2.5 comes with Mac OS X. If you enable Apache and PHP, your PHP scripts will talk to your installed MySQL databases just fine, since PHP comes bundled with a MySQL database connector. But try to get up and running with Django, TurboGears, or any other Python package where MySQL database access could be useful (or needed), and you’re in for a world of hurt.

Update: I finally did manage to get Python and MySQL playing nice together, but it took a few more contortions beyond what’s described in the recipes found scattered around the interwebs. I’ve added my solution at the end of this post.

(more…)

January 30th, 2009

Who Owns Your RSS?

In a case with far-reaching implications for the widespread practice of automated aggregation of headlines and ledes via RSS, GateHouse Media has, for the most part, won its case against the New York Times, which owns Boston.com, which in turn runs a handful of community web sites. Those community sites were providing added value to their readers in the form of linked headlines, pointing to resources at community publications run by GateHouse. The practice of linked headline exchange is healthy for the web, useful for readers, and helpful for resource-starved community publications. However, for reasons that are still not clear (to me), GateHouse felt that the practice amounted to theft, even though the Boston.com sites were publishing the RSS feeds to begin with.

Trouble is, RSS feeds don’t come with Terms of Use. Is a publicly available feed meant purely for consumption by an individual, and not by other sites? After all, the web site you’re reading now is publicly available, but that doesn’t mean you’re free to reproduce it elsewhere. The common assumption is that a site wouldn’t publish an RSS feed if it didn’t want that feed to be re-used elsewhere. And that’s the assumption GateHouse is challenging.

Let’s be clear – this is not a scraping case (scraping is the process of writing tools to grab content from web pages automatically when an RSS feed is not available). Boston.com was simply utilizing the content GateHouse provided as a feed. I would agree that scraping is “theft-like” in a way that RSS is not, but that’s not relevant here.

In a weird footnote to all of this, GateHouse initially claimed that Boston.com was trying to work around technical measures it had put in place to prevent copying of its material. Those “technical measures” amounted to JavaScript in its web pages, but Boston.com was of course not scraping the site; it was merely taking advantage of the RSS feeds freely provided by GateHouse. In other words, GateHouse was putting its “technical measures” in its web pages, not in its feed distribution mechanism, missing the point entirely.

GateHouse seems primarily concerned with the distinction between automated insertion of headlines and ledes (e.g. via RSS embeds) vs. the “human effort” required to quote a few grafs in a story body. Personally, I don’t see how the two are materially different, or how one method would affect GateHouse publications more negatively or positively than the other. If anything, now that GateHouse has gotten its way, they’re sure to receive less traffic.

The result is that Boston.com has been forced to stop using GateHouse RSS feeds to automatically populate community sites with local content. If cases like this hold sway, there will soon be a burden on every site interested in embedding external RSS feeds to find out whether it’s OK with each publisher first.

PlagiarismToday sums up the case:

It was a compromise settlement, as most are, but one can not help but feel that GateHouse just managed to bully one of the largest and most prestigious new organizations in the world.

Also:

The frustrating thing about settlements, such as this one, is that they do not become case law and have no bearing on future cases. If and when this kind of dispute arises again, we will be starting over from square one.

I’m trying to figure out who benefits from this decision… and I honestly can’t. GateHouse loses. Boston.com loses. Community web sites with limited resources lose. And readers lose. Something’s rotten in the state of Denmark.

December 23rd, 2008

Django and graphviz

I’ve been watching the django-command-extensions project out of the corner of my eye for a while, promising to give it a shot. With the extensions added to your INSTALLED_APPS, manage.py grows a bunch of additional functionality, such as the ability to empty entire databases, run periodic maintenance jobs, generate a URL map, get user/session data… and to generate graphical visualizations from models.

A recent post by John Tynan on the power of command extensions finally kicked my butt enough to give it a spin. Essential stuff for debug and development work.

Getting visual graphing to work takes a bit of extra elbow grease, since it depends on a working installation of the open graphviz utilities as well as a Python adapter for graphviz, PyGraphviz. graphviz itself has both command-line utilities (which I got via macports) and a GUI app for opening and manipulating the .dot files that graphviz generates.

Took some wringing of hands and gnashing of teeth to get macports to happily install all of the pieces, but finally ended up with this:

python manage.py graph_models beverages > beverages1.dot

[Image: Beverages-Model]

The key to getting decent resolution output, I found, is to output a graphviz .dot file rather than PNG. You can’t control the relatively low resolution of the latter, but .dot files are vector, and can be exported from the GUI Graphviz app to any format, including PDF (infinite resolution!).

Amazing to be able to visualize your models like this, but it’s not perfect. What you don’t see reflected here is the fact that Wine, Beer, etc. are actually subclassed from the Beverage model. And the arrows don’t even try to point to the actual fields that form table relations, which would be nice. graph_models has a way to go, but it’s still a terrific visualization tool for sharing back-end work with clients in a way that makes immediate sense.
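To give a feel for what the missing inheritance arrows could look like, here is a rough sketch that emits .dot edges from a class hierarchy. It uses plain Python classes as stand-ins for the real models and has nothing to do with how graph_models works internally; the function names are hypothetical.

```python
class Beverage:
    """Stand-in for the base model (a plain class, not a real Django model)."""


class Wine(Beverage):
    pass


class Beer(Beverage):
    pass


def inheritance_edges(classes):
    """Return DOT edge lines pointing from each class to its direct parents."""
    edges = []
    for cls in classes:
        for base in cls.__bases__:
            if base is not object:
                # arrowhead=empty draws the hollow triangle used for inheritance in UML.
                edges.append('"%s" -> "%s" [arrowhead=empty];' % (cls.__name__, base.__name__))
    return edges


def to_dot(classes):
    """Wrap the edges in a digraph that Graphviz can open."""
    body = "\n".join("  " + edge for edge in inheritance_edges(classes))
    return "digraph inheritance {\n%s\n}" % body
```

Writing to_dot([Beverage, Wine, Beer]) to a file gives a tiny graph with Wine and Beer each pointing up at Beverage.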

November 19th, 2008

Notes on a Django Migration

Earlier this year, I inherited responsibility for the website of the Knight Digital Media Center at UC Berkeley’s Graduate School of Journalism. The site is built with Django, a web application framework written in Python. The J-School has primarily been a PHP shop, using a mixture of open-source apps – lots of WordPress, Smarty templates and piles of home-brew code. Because it’s grown organically over time with no clear underlying architecture and a constantly changing array of publications to support, the organization sits on top of dozens of unrelated databases.

These are my notes and observations on how the J-School got into this mess, why we’ve fallen in love with Django, and how we plan to dig ourselves out.

(more…)

September 3rd, 2008

Podcast Diet

Podcasting changed my life.

There, I said it. Melodramatic, but true. When free time is whittled down to razor-thin margins, something’s gotta give, and media consumption is often the first luxury to go. And, speaking for myself, when I’m tired at the end of the day and give myself an hour of couch time, I’m not exactly predisposed to turn to the news. “Man vs. Wild” is more like it.

The one chunk of time I get all to myself every day is the daily commute (by bike or walk+train), which amounts to just over an hour a day. A few years ago, commute time was music time, but podcasting changed all that.

With a weekly quota of five hours consumption time, didn’t take long to subscribe to more podcasts than I could possibly digest before the next week rolled around. But I continue to hone the subscription list. Here are some of the podcasts I’ve come to call friends:

Links are to related sites – search iTunes for these if podcast links aren’t obvious.

- This Week in Tech: Tech maven Leo Laporte used to do great shows at ZDTV, now runs his own tech news & info podcasting network. I appeared on his TV show a few times back in the BeOS days; now I’m just a faceless audience member. Show gets rambly and too conversational at times, but they do a good job of traversing the landscape, and there are plenty of hidden gems. Frequent co-host John Dvorak drives me crazy, despite his smarts.

- Podcacher: All about geocaching, with “Sonny and Sandy from sunny San Diego, CA.” Great production values. Love it when the adventures are huge, but get bored with all the geocoin talk (unfortunately fast-forwarding through casts and bicycling don’t go well together, especially since losing tactile control after moving to the iPhone). Still, lots of tips, excellent anecdotes, and occasional hardware reviews.

- Radiolab: I’ll go with their own description: “On Radio Lab, science meets culture and information sounds like music. Each episode of Radio Lab is an investigation – a patchwork of people, sounds, stories and experiences centered around One Big Idea.” I love what they do with sonic landscapes. I can’t think of a better example of utilizing the podcasting medium’s unique characteristics. The shows are mesmerizing, and welcome relief from my tech-heavy audio diet.

- This American Life: Everyone’s favorite NPR show. Excruciatingly wonderful overload of detail on the bizarre lives of ordinary Americans. Your soul needs this show.

- Slate Magazine Daily Podcast: They say it would be a waste of the medium’s potential to just have someone read stories into a microphone. I beg to differ. I don’t have time to read Slate, but love their journalism. I’m more than stoked to receive a digest version of the site through my ear-holes.

- FLOSS Weekly: Another Leo Laporte show, but in this one he gets out of the way and lets his guests do the talking. All open source, all the time. Usually interviews with leaders / founders / spokespeople for various major OSS initiatives. Great interviews recently with players from the Drizzle and Django camps.

- Stack Overflow: Who woulda thunk a pair of Windows-centric web developers would have captured my attention? But great insight here into the innards of web application construction. Geeks only.

- NPR: All Songs Considered: If you’re old-and-in-the-way like me, feeling like your musical soul isn’t getting fed the way it should, you could do a lot worse than subscribe to All Songs Considered – annotated rundown of recent (and sometimes not-so-recent) discoveries that remind you why music is Still Worth Paying Attention To.

- This Week in Django: Part of the reason I’ve been so quiet lately is that I’m deeply immersed in Django training, having inherited a fairly complex Django site at work (more on that another day). This podcast is pretty hardcore stuff, for Django developers only. Can’t pretend to understand it all, but right now it’s part of the immersion process, and is helping me gain scope on the Django landscape.

- The Wordpress Podcast: I spend more of my time (both at work and at home) tweaking on WordPress publication sites than anything else, and this is a great way to stay abreast of new plugins, security issues, techniques, etc. Wish it was more technical and had a faster pace, but it’s the best of the WordPress podcasts.

- Between the Lines: Back in my Ziff days, I worked for the amazing Dan Farber, who’s still going strong at ZD. This is my “check in with the veteran tech journalists” podcast, and is a serious distillation of goings-on in the tech world. Always a good listen.

Obviously there’s no way to fit all of these into the weekly commute hours, but I try. No time to digest more, but dying to know what podcasts have you gripped. Let me know.

Music: Minutemen :: Storm In My House

August 24th, 2008

Notes on Open APIs

Readers following this blog have seen my occasional references to geocaching – a sport/hobby/pastime that Miles and I do quite a bit of, which involves using a hand-held GPS to place and find hidden treasures – either in the woods or in the city.

One of the many unusual aspects of geocaching is the fact that it relies completely on the existence of a single web-based database, represented by the site geocaching.com. As web-based database applications go, the site is a modern marvel. The database represents hides, finds, people and their discovery logs, travel bugs (ID’d items that travel the world, hopping from container to container), and more, all sliced and diced a million ways to Sunday. The site is deeply geo-enabled, letting users home in on hides near them, along a route, or near arbitrary destination locations. It’s also one of the best examples I’ve seen of useful Google Maps mashups, relying heavily on the open APIs provided by Google to integrate its cache database with Google’s map database. This is what map mashups are all about, and geocaching.com has done an amazing job with them.

As the popularity of personal GPSs rises, so does the game’s popularity. But when geocaching.com goes down (or slows down), so does the game, which involves more than half a million hides world-wide, and many millions of players. The site, which is, sadly, based on Microsoft database technology and ASP, does go down from time to time (big surprise); it’s a “single point of failure” in bit-space for the entire meat-space game – a precarious position. (more…)