Notes on Open APIs

Readers following this blog have seen my occasional references to geocaching – a sport/hobby/pastime that Miles and I do quite a bit of, which involves using a hand-held GPS to place and find hidden treasures – either in the woods or in the city.

One of the many unusual aspects of geocaching is the fact that it relies completely on the existence of a single web-based database, represented by the site geocaching.com. As web-based database applications go, the site is a modern marvel. The database represents hides, finds, people and their discovery logs, travel bugs (ID’d items that travel the world, hopping from container to container), and more, all sliced and diced a million ways to Sunday. The site is deeply geo-enabled, letting users home in on hides near them, along a route, or near arbitrary destination locations. It’s also one of the best examples I’ve seen of useful Google Maps mashups, relying heavily on the open APIs provided by Google to integrate its cache database with Google’s map database. This is what map mashups are all about, and geocaching.com has done an amazing job with them.

As the popularity of personal GPSs rises, so does the game’s popularity. But when geocaching.com goes down (or slows down), so does the game, which involves more than half a million hides world-wide, and many millions of players. The site, which is, sadly, based on Microsoft database technology and ASP, does go down from time to time (big surprise); it’s a “single point of failure” in bit-space for the entire meat-space game – a precarious position.

We could – and probably should – have a separate discussion about ways to distribute the load and eliminate that single point of failure, either by replicating / load-balancing to other servers elsewhere in the world, or by coming up with a protocol and distributed architecture so the game isn’t in the hands of a single group to begin with. Discard the dependency on a single organization and open source the whole concept.

Lots of difficult problems to solve there, but save that thought for another day. These notes are about Web 1.0 vs. Web 2.0 cultures. Yes, I know those terms are vague and scattered, but for these purposes I’m thinking about one key ingredient of Web 2.0: Open-ness, manifested in technology as interoperability between servers and clients via published APIs.

The ability for people to do cool things with data living on someone else’s server is what has enabled the rapid growth of cottage industries surrounding the most popular web 2.0 sites. There are dozens of external web sites and desktop/phone clients doing amazing stuff with the data living on Twitter. Facebook’s API is credited with the huge ramp-up in that site’s popularity over the past couple of years, as thousands of developers wrote applications to interoperate with the site. RSS/Atom have enabled countless opportunities for interoperability between sites. XML-RPC lets us create excellent desktop publishing tools for posting to blogs of all kinds, and to get our data into and out of web 2.0 sites. Google’s maps API has opened up a universe of possibilities for creative developers working on other sites. Flickr’s open API has created a vast cottage industry of external sites grabbing, slicing, and dicing data on Flickr in creative ways. When a site is built on top of structured data, that data should be available in programmatic ways. Open that gate and let the building begin. It’s not just about technology – it’s a mindset that opens doors.

Compare: In 2008, I can’t even get an RSS feed of my recent finds from geocaching.com. In fact, even though I can see my recent finds through the site, I can’t even create a distinct URL for that view of their data to give you here. Nor can I get an RSS feed of caches recently published in my area. In fact, geocaching.com doesn’t even seem to know that RSS exists – one of the most fundamental technologies on the web in the past eight years, completely missing.
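To show how little is being asked for here, below is a rough sketch of what generating a "recent finds" feed could look like, using only the Python standard library. Everything in it is an assumption for illustration: the function name `build_finds_feed`, the shape of the `finds` records, and the `example.org` URLs are all made up, and none of it reflects how geocaching.com actually stores or serves its data.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_finds_feed(username, finds):
    """Return an RSS 2.0 document (as a string) listing a user's finds.

    `finds` is a list of dicts with hypothetical keys:
    cache_id (str), name (str), found_at (timezone-aware datetime).
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = f"Recent finds for {username}"
    # example.org is a placeholder host, not a real geocaching endpoint.
    ET.SubElement(channel, "link").text = f"https://example.org/users/{username}/finds/"
    ET.SubElement(channel, "description").text = "Most recent cache finds"

    for find in finds:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = f"{find['cache_id']}: {find['name']}"
        # A non-permalink GUID lets feed readers de-duplicate entries.
        guid = ET.SubElement(item, "guid", isPermaLink="false")
        guid.text = f"{username}-{find['cache_id']}"
        # RSS expects RFC 822 style dates; format_datetime produces them.
        ET.SubElement(item, "pubDate").text = format_datetime(find["found_at"])

    return ET.tostring(rss, encoding="unicode")


if __name__ == "__main__":
    sample = [{
        "cache_id": "GCK6F2",
        "name": "Example cache",
        "found_at": datetime(2008, 7, 4, 12, 0, tzinfo=timezone.utc),
    }]
    print(build_finds_feed("example_user", sample))
```

The point of the sketch is only that once the finds exist as structured records, a feed is a few dozen lines of serialization, not a major engineering project.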

Similarly, there is apparently no way for external sites or clients to programmatically retrieve data from the site. Since the day we first heard that 2nd-generation iPhones would come with a built-in GPS, many of us thought the iPhone would become the ultimate geocaching device, allowing us to go “paperless” from anywhere in the world, without loading up our GPSs with waypoint data before leaving the house. Instead, what we ended up with was a well-intentioned but anemic client called Geopher Lite – a noble attempt to create a geocaching application for the iPhone, but which fails spectacularly for one simple reason: While Geopher can easily determine your current location, it can’t pass that information to geocaching.com and get back a list of nearby caches. And if you select a cache with its built-in browser, it can’t get that cache’s coordinates into its own dataset. Geocaching.com is so closed down that even the most basic level of interoperability is impossible. It’s just sad.
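For a sense of what the missing "caches near me" lookup involves, here is a minimal sketch of a radius search using the haversine great-circle formula. It assumes the caches are already available as a list of dicts with `lat` and `lon` keys, which is exactly the data a client can't get today; the function names and record shape are hypothetical, and a real service would use a spatial index rather than a linear scan.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points (degrees)."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def nearby_caches(caches, lat, lon, radius_km=5.0):
    """Return (distance_km, cache) pairs within radius_km, nearest first.

    `caches` is a list of dicts with hypothetical keys: id, lat, lon.
    """
    hits = []
    for cache in caches:
        distance = haversine_km(lat, lon, cache["lat"], cache["lon"])
        if distance <= radius_km:
            hits.append((distance, cache))
    hits.sort(key=lambda pair: pair[0])
    return hits


if __name__ == "__main__":
    # Made-up coordinates, purely for illustration.
    sample = [
        {"id": "A", "lat": 37.81, "lon": -122.27},
        {"id": "B", "lat": 37.80, "lon": -122.27},
        {"id": "C", "lat": 38.50, "lon": -122.27},
    ]
    for distance, cache in nearby_caches(sample, 37.80, -122.27):
        print(f"{cache['id']}: {distance:.2f} km")
```

A phone client would call something like `nearby_caches` with the position reported by its GPS; the hard part is not the math but getting the cache list in the first place.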

I’m currently in the process of building a geocaching satellite site in Django (more details on that in the future). Not having an open API at geocaching.com is a major pain in the butt, and has put the kibosh on many of my plans. Shortly after getting started, I realized that if I had something as simple as a cache ID, such as GCK6F2, there was no way for me to construct an automated link to that cache’s page at geocaching.com – the cache ID isn’t even present in the unreadable hairball URL (geocaching.com apparently never got the memo that “URLs are architecture, and should be readable / elegant / meaningful”). So I asked in the forums whether there was some kind of shortcut URL I could use to redirect from a known cache ID to a cache’s page. I did get a useful answer, but I also had not one, but two very experienced community members insinuate that I was a bad guy, probably intending to scrape the entire database for my nefarious purposes.
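As an illustration of the "URLs are architecture" idea, here is a small sketch of a readable, reversible URL scheme keyed on the cache ID. The `/cache/<ID>/` path is purely hypothetical, not a geocaching.com URL, and the pattern for what counts as a valid ID is an assumption based only on the GCK6F2 example above.

```python
import re

# Assumption: cache IDs are "GC" followed by letters and digits, as in GCK6F2.
CACHE_ID_RE = re.compile(r"GC[0-9A-Z]+")
CACHE_PATH_RE = re.compile(r"/cache/(GC[0-9A-Z]+)/")


def cache_path(cache_id):
    """Build a readable site-relative path for a cache ID (hypothetical scheme)."""
    normalized = cache_id.strip().upper()
    if not CACHE_ID_RE.fullmatch(normalized):
        raise ValueError(f"not a recognizable cache ID: {cache_id!r}")
    return f"/cache/{normalized}/"


def cache_id_from_path(path):
    """Recover the cache ID from a path built by cache_path(), or None."""
    match = CACHE_PATH_RE.fullmatch(path)
    return match.group(1) if match else None


if __name__ == "__main__":
    path = cache_path("gck6f2")
    print(path, cache_id_from_path(path))
```

With a scheme like this, anyone holding a cache ID can build the link by hand, and anyone holding the link can read the ID straight out of it.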

This blew my mind. The culture of the site is so web 1.0 that a basic question about interoperability was met with distrust. Not only is geocaching.com lacking the technology it needs to enter the web 2.0 world, it’s lacking the culture needed to support it. In 2008, interoperability between sites needs to be encouraged, not discouraged. Sad that geocaching.com’s traditional closed-ness has created this kind of culture.

There are many things I’d like to do with my project that I won’t be able to do as a result. But I do plan to respect the geocaching.com terms of service, even if I don’t agree with all of them.

The irony is that geocaching.com relies so heavily on the open APIs provided by Google and other mapping services, but provides no open-ness back to the web in return. Imagine using geocaching.com without the map mashups integration – it would be nearly impossible. One would think that the folks at geocaching.com would see their own mashups as an example of the great ideas that bloom when datasets and APIs are open and shared.

I don’t want to give the wrong impression – again, geocaching.com is an absolute marvel, and one of my favorite web database applications in the world. Hats off to everyone who’s labored on the site over the years; you’ve built something really incredible. I really do appreciate your work. But it’s time for change.

In a perfect world, geocaching.com would ditch the Microsoft technologies it’s sitting on and re-write the entire system in Django, being sure to build open, published APIs into every imaginable corner of the system. Then, to solve the reliability problems of the site, move it all into Google App Engine, solving the scaling problems for good (App Engine happens to love Django, but that’s coincidental). Finally, sit down with all of the geocaching.com employees and explain to them that it’s time for a culture shift — that it’s time to enter the world of open-ness and interoperability that transforms sites from walled gardens to thriving platforms. Then just sit back and watch as hundreds or thousands of add-on sites and services bloom, possibly leading to entirely new modes of geocaching.

I know, pie-in-the-sky stuff, not likely to happen. And I don’t like to come off as an armchair analyst, pretending to know what’s best for a site I don’t own or control. Re-building / rethinking geocaching.com would be a massive undertaking. I don’t want people telling me how I should throw away my labors of love and start over, so I’m loath to suggest the same for someone else’s project. On the other hand, geocaching.com is a resource for the web community, and it’s not keeping up with the technologies that drive modern web communities forward. I’m just dreaming aloud here – take it as such.

Music: Sun Ra and His Arkestra :: Next Stop Mars

21 Replies to “Notes on Open APIs”

  1. Hi, Scot!

    This is a very insightful blog post. My family and I enjoy geocaching as well, so I feel like I know where you’re coming from.

    Until I read your article, I had no negative feelings about Geocaching.com, but now I’m frustrated by their unwillingness to share their data. I also downloaded the Geopher Lite application on my iPhone and was shocked to see that I had to manually enter coordinates. Can you imagine how beautiful it would be if it were automated? One button for find nearest geocache and so on?

    Anyway, I stumbled on this post as I was researching resources for building a customized geo-coded database. Do you happen to know if there is anything open-source that gives similar functionality to the Geocaching database?

    All the best,

    Izzy

  2. Hey Izzy – Yeah, the lack of openness at geocaching.com sucks, and so does Geopher Lite. But … very good news: An official geocaching app for iPhone is on the way. They claim they’re still waiting for approval from Apple on getting it released… it’s been a while now. Hope it won’t be much longer.

    Are you referring to building a custom geo-coded db for the iPhone, or for a server?

  3. I guess I’m referring to both. I need a server side app that will be highly searchable by geo-codes, and then I want to develop it into an iPhone app later. The iPhone app will cost money because I’ll need to hire someone to develop it, but I’m hoping there’s already something that I could customize in the server-side world.

    Although it won’t be exactly like Geocaching.com and it won’t have anything to do with geocaching, the functionality will need to be almost identical when it comes to searching and so on.

    Thanks!

    Izzy

  4. Would the proposed app/database store actual geocache IDs and their coordinates, or are you thinking about storing generalized geocoded data that’s not specifically geocaching related? I just ask because you could easily bump up against geocaching.com’s proprietary walls (same ones described above).

    I’m asking all these questions because I actually have started work on a semi-secret server-side project that sort of aligns with what you’re describing…

  5. No, it wouldn’t need to store actual geocache IDs because it won’t have anything to do with geocaching. I guess what I’m looking for is a way to have a database of locations with GPS data, and a way to search for things like “Locations near me”. It’s sort of like Points of Interest except not restaurants, gas stations, or anything like that.

    If the project you’re working on has this kind of functionality, we should definitely talk!

    Right now I’m trying to see if there’s a way to use Pligg with custom fields for GPS coordinates. Since I’m not a programmer, that might be easier than starting from scratch or hiring someone to create it…

  6. Izzy – Very interesting. Yes, there will be some similar functionality in the app I’m building. Part of me wants to jump up and say “Let’s work on this together!” The other part – the realistic part – knows how time-constrained I am, and that committing to a major project on any kind of deadline might be fool-hardy.

    What I’d like to do is try and move this project a little further along than where it is now – get it to where I have something worth showing – then contact you again.

    I should note that I’m building this app in Django – a Python web framework. I’m committed to that platform, so let me know now if that’s a show-stopper for you in any way.

  7. Hello. Great article. I’ve been thinking exactly the same thing – I wanted to roll my own Symbian app for my N95. Geocache Navigator is good, but I’d like to see multicaches also, etc. I was also thinking of an associated site to keep track of routes, geocoded photos, planning caches, etc. I’ve been seriously disappointed about just how locked down the geocaching.com data is. I have however come across the following (though I haven’t had time to test it yet):
    http://code.google.com/p/geotoad/

  8. Patrick – I’ve actually got plans in the back of my head for a somewhat similar site. Thanks for the geotoad link – sounds super useful!

  9. Completely agree with this article. It’s really sad that gc.com guards the data so tightly. In fact, the data is created by users, so they can’t talk about copyright.

  10. This is all so true. Being a part-time server-systems programmer and website designer, the first thing I wanted to do when I started exploring geocaching was expand its horizons. But all I got was a Fort Knox of a system.

    It’s ironic that a website that relies on Google’s API is so archaic that it won’t implement an API of its own. But are we surprised? It is running on a single server! And did you know that one of the core developers who helped create the GPS systems that geocaching is founded on used to work for Microsoft? That’s probably why the website is based on a Microsoft server.

    I for one think, if anything, the geocaching web development team is just a bit lazy, and a bit ‘Microsoft’ in its approach to being open – especially being open source.

  11. I fully agree that it would be very nice for GC to have an open API that developers could use to build huge applications to enhance the love of geocaching. The problem is, the ultimate thing that folks want out of the site is the cache listings and by opening an API to let users access that data, GSP would immediately open themselves up to a huge financial loss.

    I say that because currently, the only way to get a good amount of data on caches in their system is to use pocket queries, which are available only to premium members. Those members pay a yearly fee to get at that data in small chunks. If GSP opened the API allowing folks to hit their system with other apps, that would negate the premium membership.

    Since most people get a premium membership so they can do PQs, why would they continue to pay that fee if some other app will get that data for them for free? You can say all you want about “supporting the site” but most folks would rather get something for free than pay for it.

    This is why I don’t ever expect to see GSP open their API up unless they change their fee structure. The premium memberships pay a chunk of their costs so unless they can offer up something better than what the API gives (which perhaps they could), I doubt this will ever happen.

  12. @Kirgy – Yes, well put. Yes, it’s very ironic that they thrive on the back of open APIs but run such a closed system. And I’m sick of the site running on a single server. I recently had to wait several days for a PQ to be generated. The performance issues just never seem to go away.

    @Zor – I think you’re kind of missing the point. This is our data they’re making money from. Yes, opening it up would result in them losing premium memberships. Instead there would be dozens of sites competing to find ways to make money off the same data set. Competition is good. Maybe the whole thing could turn into a more wikipedia-like venture – an open source model with open data controlled by the public and shepherded by a foundation that lives off contributions. Kirgy is right that the whole thing exists in a very Microsoft-like culture right now. It needs to be transformed into something that better fits the contours of the modern world.

  13. Shacker,

    See, I have mixed feelings about the whole idea of other people starting to use the data set to make coin. If User A uses an open API to snag the entire GC database, then User A can go out, start his own listing service, and completely put GSP out of business. Now, if User A does well with the data and provides a FAR better service than what GSP has, hey I’m all for it. But in all likelihood, User A, B, C, & D would come out with competing services and then people would be at a loss as to which one is the “right” database. Oh I can use my Android phone with User A’s service but not User B’s, but User B has 2000 caches in their pocket queries. What do I do?

    The idea that everything should be open and available to everyone is a great idea but the implementation of it needs to be done right otherwise you end up with a real mess.

    At the end of the day, GSP is a for-profit business. Whether they run Microsoft, Linux, Mac, Nintendo DS, or whatever as their base is irrelevant. To them, they have a business model that currently works and obviously works very well otherwise they would have folded long ago. The ramifications of them simply “opening up” their database, whether the data belongs to them or not, means a DRASTIC change in their business practices which for them could be disastrous.

    Let’s say they opened it up, and plenty of people got a copy of the DB, and then within 6 months, GSP went under. What would happen? You’d have a bunch of other people trying to recreate what had already been done. Doesn’t that seem a little ridiculous?

    I think that unless GSP has a way to continue to guarantee its income, a fully open API will never happen. Turning it into something like Wikipedia is all well and good, and saying that it can survive on donations is fine too. But if you were the owner of a business who charges people for a service and you’d been active doing so for many years, there’s no way you can tell me that it would make sense to just start giving away that service and ask for donations to keep it active. From a business owner’s perspective, it’s suicide.

    I’d love to have an API to do all kinds of cool stuff with my listings. But I just don’t ever see it happening.

  14. You make some great points Zor. I think we can agree that this is a case where it is essential that there be a single canonical source of information. For example, if a cache gets archived, that information needs to become available everywhere immediately. Otherwise people end up wasting time and energy on wild goose chases. It benefits users (not just GSP) to make sure there’s only one source for the data (or that all databases are in constant sync). That’s why I advocate for a more Wikipedia-like approach.

    It sounds like you’re saying you’d like that, but that GSP would never go for it because they won’t willingly give up their profits. And that’s probably true. But that amounts to defending their “right” to an effective monopoly, their “right” to profit from our hard work while not allowing anyone else to.

    Yes, this is kind of a hard problem. At the same time, when you look at the huge flourishing of add-on products and services that have been added to the Twitter experience on account of their openness, it’s really hard to not want that for all of us.

    I almost wonder what a lawyer would have to say about the legality of GSP having an effective monopoly over a game that belongs to everyone, even in other countries, and profiting from data that they could not have gathered without us. IANAL and have no idea, but suspect that something in this scenario is probably illegal. Not sure.

    I’m not saying GSP have ill intent – far from it. But when we say they’re “Microsoft-like” we’re complaining about their attitude toward people and business, not about the specific platform they’re on (though I would LOVE to see the site rewritten in Django).

    Remember that with page scrapers, we could already make copies of the entire database. It can only be so locked down.

    A lot of the features we’re wanting here don’t amount to asking for access to all of the data. For example, a 3rd party geocaching client for the iPhone should be able to submit logs to the site through an API. I should be able to get an RSS feed of my own finds. Etc. etc. It is very conceivable that these kinds of things could be done with the existing ownership and the existing technology, with no loss whatsoever to GSP.

  15. Ahhh, just ran into this this very morning…

    I’m working on the beginning fooling around stages of a gps based game that will act as an “overlay” on existing caches. So I went digging for the api…

    Instead I was met by zero information and only hostile comments on message boards. Seriously, have people never heard of rate limiting? It’s not as if you open up an api and then have absolutely no control over the resource usage.

    One of the most attractive parts of geocaching to me is the lateral thinking involved. I’ve seen the most amazing caches, seriously things I never would have dreamed, and it’s all because when you have a simple framework to work within and just let your imagination loose, amazing things will happen.

    The parallel to this would be geocaching data of course. There are so many ideas out there, so many *new* additions to the sport as well as many things no one has even thought of yet, simply because this data is locked down.

    The perfect example of this is, as you mentioned above, the Google Maps API. How many entire businesses and apps are based off of that?!?

    I’m on the verge of scrapping the whole game idea and starting a truly Open geocaching site from the ground up! :-)

  16. Zach – Agreed on all fronts. It’s a shame to see all of our data locked up like this. Still, best of luck getting your game idea off the ground.

  17. Great post that still applies now. I’m a developer who was just getting into geocaching and hit the same wall / hostility as others have mentioned here.

    I did find this site: http://www.opencaching.us/. I know there are several dozen small cache sites out there, but I think these guys have the mission and polish right. They just started a couple months ago.

    Try checking them out. Maybe post a cache or two. I’m not proposing a switch (you probably don’t have many caches listed in your area anyway), but throwing a few caches at it might help get the ball rolling. There’s a chance that the kind of people who value an open infrastructure post higher quality caches anyways. That’s my 2 cents.

  18. Easter – Is it OK to list caches there that are already listed at geocaching.com? (can they be duplicative?)

    Out of curiosity, why do you think there aren’t many caches listed in my area? We’re crawling with them!

  19. OK, just re-posted our first cache, Jewel of the Nile, on opencaching.us. Site’s coming along well! So happy to see them doing this, but so sad to see them doing it in PHP. Ugh. I would have enjoyed contributing code to the site if they’d built it with a modern framework like Django. What a missed opportunity. In any case, I’ll get the rest of my caches in there soon.
