Irony, the opposite of wrinkly.
 
June 6th, 2010

Building a Bucketlist Site with Django

Half a year ago, I got this crazy idea to build a site where people could log and record all the things they wanted to accomplish before they died. But more than just simple list-making, I wanted to make it easy for people to tell stories about their goals, and to add images and video. I wanted to let people “follow” other people’s lists, to receive email when their friends accomplished their goals, to start discussions about getting the most out of life. I wanted it to be a place where people could get inspired by the goals of others, and to easily make copies of those goals in their own bucketlists.

The result is bucketlist.org.

I had a pre-existing love affair with the Python-based Django framework – there was never a question of what platform to build on. But no matter how good the platform, the devil’s in the details.

May 12th, 2010

Allowing Secure User Input with Django

Building a site that needs to accept formatted user input? There’s no way you’re going to let random users input any old HTML – you’d open the door to all kinds of cross-site-scripting attacks and other nastiness. Nor can you just filter out the tags you consider dangerous – that road is fraught with peril. The only solution is to white-list a small subset of tags and unceremoniously drop the rest.

There are two layers to the problem – how to support formatted text on the front-end, and how to process submitted text on the back-end.

For the front-end, some developers are drawn to the Markdown syntax – a supposedly user-friendly wiki-like syntax that can be re-rendered as safe HTML. But while Markdown may look friendly to developers, it doesn’t to normal users – trust me on this. Even for tech-savvy users, Markdown requires that you place syntax instructions on your site (inelegant). A better solution is to use a rich text editor for the web, like TinyMCE or WYMEditor.

Ever notice that you often see rich text editors in content management systems run by trusted users, but seldom on public-facing web pages? That’s because it’s tricky to do securely, and without giving users enough rope to hang themselves formatting-wise.

With a bit of configuration though, you can deploy public-facing rich textareas securely, allowing only the input of tags you specify. But you can’t stop there – all the user has to do is disable Javascript in the browser to bypass your rich text editor. You must process submitted text on the back-end with the same set of rules in your view logic.
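
As a rough illustration of the back-end half, here is a minimal white-list sketch using only Python's standard library. The tag list, attribute list, and the `sanitize` helper are hypothetical choices for this example, not code from the site, and the sketch makes no attempt to re-balance unclosed tags. On a production site, a maintained sanitizing library is likely the safer choice.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical white-list: match it to the tags your rich text editor may emit.
ALLOWED_TAGS = {"p", "br", "strong", "em", "ul", "ol", "li", "a"}
ALLOWED_ATTRS = {"a": {"href"}}
SAFE_SCHEMES = ("http://", "https://", "mailto:")
# Tags whose text content is dropped along with the tag itself.
DROP_CONTENT = {"script", "style"}
VOID_TAGS = {"br"}


class WhitelistSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return  # unknown tag: drop the tag, keep any text inside it
        kept = []
        for name, value in attrs:
            if name not in ALLOWED_ATTRS.get(tag, set()) or value is None:
                continue
            if name == "href" and not value.strip().lower().startswith(SAFE_SCHEMES):
                continue  # rejects javascript: and other unexpected schemes
            kept.append(' %s="%s"' % (name, escape(value, quote=True)))
        self.out.append("<%s%s>" % (tag, "".join(kept)))

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if self.skip_depth or tag not in ALLOWED_TAGS or tag in VOID_TAGS:
            return
        self.out.append("</%s>" % tag)

    def handle_data(self, data):
        if not self.skip_depth:
            # Re-escape text so stray angle brackets cannot become markup.
            self.out.append(escape(data, quote=False))


def sanitize(html_text):
    """Return html_text with everything outside the white-list removed."""
    parser = WhitelistSanitizer()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

Calling `sanitize()` on the submitted text in the view, before saving, applies the same rules whether or not the browser-side editor was bypassed.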


April 21st, 2010

Reading .rst doc files with sphinx

Quite a few re-usable Django apps and Python modules come with documentation in text files ending with a .rst extension. The formatting in them is odd, but they’re more-or-less readable.

To this day, I haven’t encountered a single package that explained why docs were formatted this way. I knew there had to be an explanation, but hadn’t gotten around to looking it up, and basically just waded through. I finally went looking for an answer today. Turns out .rst files use a simple markup syntax called reStructuredText, and you can generate nicely formatted HTML (and other documentation formats) out of them if you have Python’s Sphinx package installed. For the benefit of future googlers, here’s how to get up and running quickly:

$ pip install sphinx
$ cd docs
$ mkdir out
$ sphinx-build . out

Now take a look in the “out” directory and you’ll find the same set of files as a collection of handsomely formatted HTML docs.

Sphinx goes pretty deep, and I’m looking forward to exploring it for future documentation projects. For now I’m just happy to have an alternative to squinting.

November 8th, 2009

django-treedata: DataSF Contest Winner

Recently I was invited to participate in the California Data Camp and DataSF App Contest hosted by California Watch and spot.us. The unconference would feature lots of discussion about making use of publicly available data sets to improve quality of life. The App Contest challenged developers to choose one of the many data sets available at DataSF.org and build something cool with it in a relatively short period of time.

Long story short — my contest entry, which explores San Francisco’s database of publicly maintained trees and plants, won the competition! Full details, and downloadable source code, available at my Scripts and Utilities site.

Thanks so much to David Cohn of Spot.us and all of the conference organizers and supporters. Thanks also to J-School webmaster Chuck Harris for his contributions to the project. It was a great day, and winning the competition was a total surprise. Now I just need a city to take the source code and run with it.

spot.us covered the event live throughout the day.

Huffington Post mentioned django-treedata in “Sophisticated Tree Hugging: the Pure Joy of Public Data.”

October 20th, 2009

Generating RSS Mashups from Django

I recently got to work on an interesting Django side project: the Bay News Network – a directory of Bay Area bloggers and hyperlocal news sites. The goal of the site was three-fold:

  1. To create a many-to-many directory of local sites that matched our editorial criteria
  2. To let site owners log in and edit their own listings
  3. To both consume and produce RSS feeds from the listed sites

The first two were pretty standard Django approaches – develop data models and editing interfaces using Django forms and re-usable apps like django-profiles and django-registration. The third goal turned out to be more interesting. We not only had to gather RSS feeds from more than 100 external sites several times per day, we needed to re-mix them (e.g. provide an integrated feed representing all blogs that cover Food, or all blogs that cover Oakland).

“Consuming” RSS feeds meant we needed to integrate feeds from the external sites into our own site. At the most basic level, this was pretty straightforward using Mark Pilgrim’s excellent Universal Feed Parser, which turns the real world’s tag soup of disparate, incompatible RSS formats into a reliable data format you can step through in your code or templates. This worked well enough until I realized that grabbing and parsing external feeds in real-time was just not going to scale, performance-wise. Plus, we still had the RSS mashups to build, and would clearly need to be storing feed entries in our own database in order to sort them by category, etc.

Thus began the hunt for good feed aggregation systems for Django. Most roads pointed to django-planet, planet planet, and FeedJack, which are systems for gathering content from external sites and importing it into a single aggregated site. These were close to what I wanted, but weren’t great on the re-usability side. Since I already had existing models to define the sites, their owners, and their feeds, I didn’t want to rewrite all my models to work with another system’s conception of how things should be laid out. I also didn’t feel like plowing through their source code to chop out and rewrite just the bits I wanted. Eventually I realized that I was looking for a few lines of code to work with my system, not a whole external system.

The surprising solution came from the Community section of the official Django project web site. The Django developers keep the code that drives djangoproject.com in subversion along with the source code to Django itself. And the code that drives that section of the site is really lightweight. So I did a subversion checkout of the Aggregator app, and found that all I really needed from it was its update_feeds.py script, which itself is a wrapper around Universal Feed Parser, tweaked to talk to my own models.

Two gotchas to be aware of:

  1. The app includes a bundled templatetags directory with a file called aggregator.py. But the name of the app itself is “aggregator.” I was getting strange import errors in various places before I discovered on the django-users mailing list that Django doesn’t like it when an app name matches a templatetag name. Easily fixed by renaming the templatetag.
  2. My first runs of update_feeds.py went fine, but later started erroring out with database integrity errors. The GUID field on the FeedItem model is set to unique=True, which prevents your database from storing any one FeedItem more than once. That’s great, but it was dishing up integrity errors for some reason. I fixed this by changing this line in update_feeds.py:
feed.feeditem_set.get(guid=guid)

to:

FeedItem.objects.get(guid=guid)
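
One plausible explanation for those integrity errors (my assumption here; it isn't confirmed anywhere) is that the original lookup only searches the current feed's items. An entry whose GUID was already stored via a different feed is then treated as new, and the insert trips the unique constraint. The sketch below reproduces that situation with the standard library's sqlite3 module; the table layout and helper names are hypothetical stand-ins for the FeedItem model.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE feeditem (id INTEGER PRIMARY KEY, feed_id INTEGER, guid TEXT UNIQUE)"
)
# The same post was already stored when feed 1 was harvested.
conn.execute("INSERT INTO feeditem (feed_id, guid) VALUES (1, 'post-42')")


def exists_in_feed(feed_id, guid):
    """Roughly what feed.feeditem_set.get(guid=guid) checks: this feed only."""
    row = conn.execute(
        "SELECT 1 FROM feeditem WHERE feed_id = ? AND guid = ?", (feed_id, guid)
    ).fetchone()
    return row is not None


def exists_anywhere(guid):
    """Roughly what FeedItem.objects.get(guid=guid) checks: the whole table."""
    row = conn.execute("SELECT 1 FROM feeditem WHERE guid = ?", (guid,)).fetchone()
    return row is not None


def store_scoped(feed_id, guid):
    """Insert unless this feed already has the GUID (the original behaviour)."""
    if exists_in_feed(feed_id, guid):
        return False
    conn.execute("INSERT INTO feeditem (feed_id, guid) VALUES (?, ?)", (feed_id, guid))
    return True


def store_global(feed_id, guid):
    """Insert unless any feed already has the GUID (the changed behaviour)."""
    if exists_anywhere(guid):
        return False
    conn.execute("INSERT INTO feeditem (feed_id, guid) VALUES (?, ?)", (feed_id, guid))
    return True
```

With this setup, `store_scoped(2, 'post-42')` raises `sqlite3.IntegrityError`, while `store_global(2, 'post-42')` quietly skips the duplicate.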

Once I was able to get the updater to run consistently without error, I needed to get it running via cron. The trick to running a Python script that talks to the Django ORM from a crontab is that you must supply the full Python paths in the environment to cron – it doesn’t pick them up automatically from the environment of the user that runs the cron job. This worked for me:

PYTHONPATH=/home/bnn/projects:/home/bnn/projects/bnn
DJANGO_SETTINGS_MODULE=bnn.settings
20 15 * * * python /home/bnn/projects/bnn/scripts/update_feeds.py 2>&1
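
An alternative, sketched here rather than taken from the project, is to set the same two values at the top of the script itself so the crontab entry needs no environment lines. The paths and settings module are copied from the crontab above; `configure_environment` is a hypothetical helper name.

```python
import os
import sys

# Paths copied from the crontab above; adjust for your own layout.
PROJECT_PATHS = ["/home/bnn/projects", "/home/bnn/projects/bnn"]


def configure_environment(paths=PROJECT_PATHS, settings_module="bnn.settings"):
    """Mimic the PYTHONPATH and DJANGO_SETTINGS_MODULE lines from the crontab."""
    # Insert in reverse so the first path ends up first on sys.path.
    for path in reversed(paths):
        if path not in sys.path:
            sys.path.insert(0, path)
    # setdefault leaves an explicitly exported value untouched.
    os.environ.setdefault("DJANGO_SETTINGS_MODULE", settings_module)


# Must run before anything imports the project's models.
configure_environment()
```

Newer Django releases additionally require calling `django.setup()` after this and before importing models.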

Producing Feeds

With the harvesting system up and running, and all content coming into the database associated with blogs that were in turn categorized by “beat” and geographical area, outputting aggregated RSS feeds was a simple matter of using Django’s native syndication framework as documented. This went into urls.py:

feeds = {
    'all': AllFeeds,
    'cat': CategoryFeeds,
    'area': BeatFeeds,
}
 
# Feeds
url(r'^feeds/(?P<url>.*)/$', 'django.contrib.syndication.views.feed', {'feed_dict': feeds}),

… and I created a file feedgenerator.py to contain the three corresponding classes and their querysets, using Holovaty’s sample code from chicagocrime.org as a starting point.
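
The selection logic behind each mashup feed boils down to "filter the stored entries by category or area, newest first." Here is a framework-free sketch of that idea; the `Entry` tuple, the `mashup` function, and the sample data are all made up for illustration, and the real project expressed this as querysets on Django models.

```python
from collections import namedtuple
from datetime import datetime

# Hypothetical stand-in for a stored feed entry.
Entry = namedtuple("Entry", "title blog categories areas published")


def mashup(entries, category=None, area=None, limit=20):
    """Return the newest entries matching the requested category and/or area."""
    selected = [
        e for e in entries
        if (category is None or category in e.categories)
        and (area is None or area in e.areas)
    ]
    selected.sort(key=lambda e: e.published, reverse=True)
    return selected[:limit]


# Made-up sample data.
entries = [
    Entry("Taco trucks", "Blog A", {"Food"}, {"Oakland"}, datetime(2009, 10, 1)),
    Entry("Council vote", "Blog B", {"Politics"}, {"Oakland"}, datetime(2009, 10, 3)),
    Entry("Farmers market", "Blog C", {"Food"}, {"San Francisco"}, datetime(2009, 10, 2)),
]
food_titles = [e.title for e in mashup(entries, category="Food")]
oakland_titles = [e.title for e in mashup(entries, area="Oakland")]
```

Each feed class then only has to hand the matching, already-sorted entries to the syndication framework.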

September 27th, 2009

Python-MySQL Connections and Snow Leopard

Apparently I’m not the only one having trouble getting MySQL and Python to play nice under OS X — last February’s post on getting the two to cooperate under OS X has generated a ton of traffic. Now I’ve upgraded to Snow Leopard and faced a handful of new challenges (but eventually got it working). Rather than scatter my notes, I’ve updated the original post with Snow Leopard instructions.

September 7th, 2009

Populate Mailman Lists from Django Projects

I spent much of the summer building an intranet in Django for Miles’ school. Since the school is a co-op, we need to keep track of a lot of stuff – charges, credits, and obligations, parents, students, teachers, family jobs, committee membership, the board, etc. etc. I’m happy with how the site came out, but unfortunately can’t share it here, since it’s a private site.

One of the goals of the rebuild was to put an end to the laborious manual process of maintaining the school’s multiple overlapping mailing lists. Since all of those relationships, people types, and groups were already stored in the intranet’s database, I figured it should be possible to run various queries and populate Mailman mailing lists from them directly. Due to the messy nature of the real world, the process was a lot trickier than it sounds on paper, but I eventually did get a smoothly working list generation system up and running, talking to our Django system and working with virtually no manual intervention. Members can update their own profiles and find that their mailing list subscription address has changed automatically a few hours later. Administrators can give someone a new family job or board position and that person will find themselves subscribed to the right mailing list for it later that day.
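
The core of any such sync is a comparison between the addresses the database says should be on a list and the addresses currently subscribed. The sketch below is not the published recipe, just a minimal illustration of that comparison; `plan_sync` is a hypothetical helper, and treating addresses as case-insensitive is an assumption.

```python
def plan_sync(desired, current):
    """Return (to_add, to_remove) as sorted lists of normalized addresses.

    desired: addresses produced by a database query for this list.
    current: addresses currently subscribed to the mailing list.
    """
    # Assumption: addresses are compared case-insensitively, ignoring blanks.
    desired_norm = {a.strip().lower() for a in desired if a and a.strip()}
    current_norm = {a.strip().lower() for a in current if a and a.strip()}
    to_add = sorted(desired_norm - current_norm)
    to_remove = sorted(current_norm - desired_norm)
    return to_add, to_remove
```

Run periodically, a plan like this is what lets a profile change or a new board position show up as a subscription change a few hours later.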

Since there isn’t much published out there on making these two systems (Django and Mailman) play nicely together, I decided to publish the scripts and document the recipe I used to get it all working. Hope someone finds the system useful.

June 27th, 2009

django-profiles: The Missing Manual

The User model in Django is intentionally basic, defining only the username, first and last name, password and email address. It’s intended more for authentication than for handling user profiles. To create an extended user model you’ll need to define a custom class with a ForeignKey to User, then tell your project which model defines the Profile class. In your settings, use something like:

AUTH_PROFILE_MODULE = 'accounts.UserProfile'

To make it easier to let users create and edit their own Profile data, James Bennett (aka ubernostrum), who is the author of Practical Django Projects and the excellent b-list blog, created the reusable app django-profiles. It’s a companion to django-registration, which provides pluggable functionality to let users register and validate their own accounts on Django-based sites.

June 1st, 2009

GMail vs. Mail.app

Confession: I’ve never liked webmail – I was a hardcore Eudora user for ages, then spent five years with BeOS desktop mail clients, then a year with Entourage on the Mac before finally switching four years ago to Apple’s Mail.app, with its flawless IMAP implementation. Every time I’ve tried the “next generation” of webmail clients, they’ve felt anemic to me, and I’ve felt like my workflow slowed way down – not because they were slow per se, but because of the dozens of small niceties you get with desktop clients that you don’t get with webmail. I’ve relegated webmail to something you use when you’re not at your own machine for some reason and/or aren’t able to take the two minutes it takes to configure IMAP at a foreign machine.

That’s why I’ve always been amazed to see how many developers and gear-heads use GMail. These are tech-savvy people, who I’d think would have the same frustrations with webmail that I do. What are they seeing that I’m not seeing? I totally get the convenience factor of being able to access my mail through any web browser, anywhere. I wouldn’t mind having that, but so far it hasn’t seemed worth the sacrifices. I know GMail keeps getting better, so thought it was finally time to give myself over to GMail for a week and see how it goes. Here are some notes on that experience.

n.b.: I’m using Google’s official list of keyboard shortcuts. I used the 3rd party tool A to G to convert Apple’s Address Book to CSV, then imported 1200 contacts into GMail’s contact system.

My list of GMail gripes, with a few faint praises in the mix:

- No way to change the default reading font. Really??? The default reading/writing font is just too small to be comfortable (for me), and it’s ridiculous that something this straightforward and ubiquitous in desktop clients would not be there. How hard can it be to give the user a choice of common font faces and sizes? Does not compute.

- No way to quote previous text before replying. Every desktop mail client I’ve used lets you select a block of text in a message, then hit Reply. Only the selected text appears in the reply. This is so core to netiquette and to my every day workflow that it seems like a non-negotiable feature. And yet no webmail client I’ve tried supports it. Not even GMail. No wonder over-quoting is such a problem these days. Later… OK, I discovered that this “feature” is actually available under Settings | Labs. When I enabled it, it complained that it could “not be loaded,” and continues to complain every time I exit the Settings menu, though it did work correctly in my first test. Cool, but why is it in Labs, as if it’s some kind of optional convenience that only a few people might want? How can this not be part of the default package? Core functionality.

- Inline photos. A family member sent 10 photos as attachments. When viewed in Mail.app they’re displayed inline, nice and large; GMail only shows thumbnails inline, though you can click “View all images” to see them full size on a separate page. There is of course no option to “Save all to iPhoto” in GMail. Since they were family photos, that’s exactly what I wanted to do.

- No preview pane. For realsies? I know of at least two webmail clients that have one (RoundCube, which is available on Birdhouse, and Apple’s mac.com (errr, me.com)). If they can do it, why can’t one of the most popular webmail clients of them all?

- More clicks to view the next message. When done viewing one message, if you click Delete or Archive, you’re taken back to the full message list, which lacks a preview pane. So you then need to click again to view the next message. This kind of “more clicks/keystrokes to accomplish common tasks” is all over the place in GMail.

- No way to turn external mail checking on/off. I now have GMail configured to work as a POP client to two external accounts (would have configured it as IMAP, but GMail doesn’t support that, even though you can use external clients to talk IMAP to GMail – weird). Now I’d like to have GMail stop checking those two external accounts for a while, without removing all the config info. Too bad – the only way to make it stop is apparently to delete the account completely. Grrr…

- Poor conversation threading. GMail does an OK job at this – better than other webmail clients, but nowhere near as clean visually or as easy to navigate as threaded discussions in desktop mail apps. And because GMail shows a thread all on one page (thanks again to no preview pane), deleting individual messages out of the thread takes a lot more scrolling and clicking than it does in a desktop client. GMail’s threading is a pale imitation of technology we’ve had on the desktop for years. However, I really do like being able to see my own replies automatically in the context of the thread, even without having explicitly cc:’d myself, and without having to dig through the Sent folder. But the ease of expanding and collapsing a thread, of jumping to the next unread message in a thread, of deleting individual messages from a thread… all vastly superior in Mail.app.

In Mail.app, a thread is indicated by the presence of an arrow in the left column.

Cmd-RightArrow expands the thread; spacebar jumps you to the next unread message in the thread. The actual conversation is shown in the Preview pane. It’s easy to delete individual messages from the thread.

- Keyboard shortcuts. Yes, there are some. Yes, they work for the most part. But they’re not as ubiquitous or as clean to use as the keyboard shortcuts in a desktop client. I found myself doing a lot more mousing in GMail than I’m accustomed to doing in email.

- Adding contacts. I get a message from someone who’s not in my Contacts list. If there’s a way to add this person to my Contacts list on the fly, I’m not seeing it (yes, I looked). Mail.app makes this common process trivial and intuitive.

- Moving messages between accounts. One of the ways I rely heavily on IMAP is the ability to drag and drop messages between various mailboxes and servers. If I receive a message at work that I want to handle at home, I drag it from calmail to birdhouse, and vice versa. If I want to pull something out of cold storage (e.g. from a local mail store and put it back on a live mail server for handling), I can do that. GMail can be configured to talk to multiple accounts, but since it itself does not work like an IMAP client to foreign mail servers, it can’t do any kind of inter-server message moving. I guess the idea is that its model makes this kind of thing irrelevant, but it feels like a big missing piece of the modern mail experience.

- Integrated chat. Both GMail and Mail.app have this, but GMail clearly wins here when you’re at someone else’s computer since you don’t have to set up both the mail and chat clients (thanks @jrue for this point).

- “Send Again” feature. Not something you use a lot, but when you do, it’s a real time saver. Use this after sending a message to someone whose address has died and you want to try again to the right address, or when you left someone off the original cc: list. Mail.app and other desktop clients have it. GMail doesn’t.

- Breaks quoting. Let’s say you’ve got a paragraph of quoted text in an incoming message and you want to reply to it in two parts. In a desktop client, you put the cursor where you want to break the graf and hit Return. A new quote mark is automatically added to the beginning of the new line. Not in GMail – you end up with the first line that should be quoted suddenly unquoted. Later… turns out this does work properly in rich text mode in GMail, but not in plain text mode. But I prefer to stay in plain text mode, only switching to rich text mode when necessary.

While replying in plain text mode in GMail, insert cursor in the middle of a paragraph and hit Return to start your reply. The new line lacks a starting mail quote mark, breaking netiquette and readability for the recipient.

- No Data Detectors. OK, this is only available in Mail.app, not all desktop mail clients, but it really is a killer feature. Roll over any date or time in any format, or any person’s name or email address, even in a plain text message, and you get a little drop-down menu that lets you quickly add that item to your calendar or address book.


Data detectors do an amazing job of figuring out all the right fields — almost magic (try it with messages referencing “tomorrow” or “next Tuesday.”) GMail does have an “Add Event” option but it’s nowhere near as intelligent or as slick, and it works for the whole message, not for individual text snippets within the message. Big win for Mail.app.

- Partial word searches. The search feature in GMail is nice, but is not better than the one in Mail.app. Yes, Google is a bit faster at returning results, but not by much (yes, Apple’s Spotlight is *that* fast). But here’s the kicker – Google and GMail can’t do partial-word searches. So if I’m looking for an email that I know includes the word “question” but I just type “quest” [Return] into GMail search, it turns up nothing! Wildcard searches don’t work either. Very frustrating. Even on their native search turf, Google loses to Apple. Update: There are also types of searches Mail.app can’t do, such as combined OR statements. So let’s call this one a draw.

- End-of-line key combo. On the Mac, the standard keyboard shortcuts to jump the cursor to the start/end of the current line are Cmd-LeftArrow and Cmd-RightArrow. These don’t work in GMail. In fact, as far as I can tell there’s no keyboard shortcut to do this on the Mac in GMail. Which amounts to one more reason GMail is a lot more mouse dependent than using Mail.app or another desktop client. Can’t blame this on rich text editors either – WordPress uses a TinyMCE variant, and Cmd-RightArrow works there just fine. GMail is just broken in this respect.

- Ads in my email. They just bug me. I totally understand that that’s how I pay for the service. I get that. I still don’t like looking at them. Irritating. In fact, I found the whole GMail experience more cluttered and just… less elegant than working with a desktop client.

ADDED LATER

- Multiple windows. Sometimes I like to have two or more messages open at once, plus a compose window, so I can copy/paste bits around and between messages, or for reference while writing something new. Easy to do in a desktop client. Assumed I could do similar in GMail by cmd-clicking messages to open them in various tabs, but nope – GMail doesn’t allow that – forces you to only be looking at one thing at a time. Is that a feature they haven’t implemented yet, or an intentional limitation? Feels like the latter.

Upshot: I didn’t follow through on my promise to try GMail for a week. The frustration was too much to deal with, and I quit after four days. I’m back on Mail.app now. I probably missed out on some of GMail’s goodness, but overall, I left feeling exactly like I did going in. GMail has its advantages, but to me, it seems like they’re vastly outweighed by the absence of basic functionality and elegance present in all desktop mail clients (and by additional features in Mail.app) that I just missed too much. Feels good to be home.