
There were long delays, petition drives, and some final technical hiccups, but WMATA has finally released its schedule data in the Google Transit Feed Specification format. What does that mean? Well, most obviously it means that Google Transit will soon be adding D.C. to its list of supported cities (UPDATE: or perhaps not — see below for a comment from Michael Perkins of GGW, who explains that there are lingering complications surrounding WMATA’s legal relationship with Google). But far more exciting is the opportunity this dataset represents to third-party developers. You can bet that geeks across the region were feverishly importing schedule data into databases last night (I certainly was).
So what’s in a GTFS file, anyway? You can read the full spec here if you’d like, but the short version is actually pretty simple: a bunch of comma-separated text files are zipped up into a single archive, which can be downloaded from the transit agency’s website. In WMATA’s case, the file clocks in around 20 megabytes. The files have names like routes.txt, stops.txt and stop_times.txt, and they can be opened in a text editor or spreadsheet program. The setup is easy to follow: stops.txt, for example, lists bus and rail stops, each with a name, latitude and longitude, and assigns each one an ID. stop_times.txt, on the other hand, holds the arrival and departure times for each stop along each individual trip (a single scheduled run of a route), linking back to the stop information via each stop’s ID.
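To make that ID linkage concrete, here is a minimal Python sketch of how one might read the two files straight out of the zip and group the scheduled times under each stop's name. The column names (stop_id, stop_name, trip_id, arrival_time, departure_time) come from the GTFS spec; the function names and the tiny sample feed are my own inventions for illustration, and none of the sample values are real WMATA data.

```python
import csv
import io
import zipfile
from collections import defaultdict


def read_table(archive, name):
    """Return the rows of one GTFS text file as a list of dicts keyed by column name."""
    with archive.open(name) as raw:
        # utf-8-sig strips the byte-order mark some feeds put at the start of a file
        text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
        return list(csv.DictReader(text))


def times_by_stop(feed):
    """Group (arrival, departure, trip_id) tuples under each stop's name.

    `feed` is a path to a GTFS zip or a file-like object containing one.
    """
    with zipfile.ZipFile(feed) as archive:
        # Index stops.txt by stop_id so each stop_times row can be resolved quickly
        stops = {row["stop_id"]: row for row in read_table(archive, "stops.txt")}
        grouped = defaultdict(list)
        for row in read_table(archive, "stop_times.txt"):
            stop = stops[row["stop_id"]]  # follow the ID back to stops.txt
            grouped[stop["stop_name"]].append(
                (row["arrival_time"], row["departure_time"], row["trip_id"])
            )
    return dict(grouped)


def sample_feed():
    """Build a tiny in-memory feed; every value here is made up for illustration."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(
            "stops.txt",
            "stop_id,stop_name,stop_lat,stop_lon\n"
            "S1,Example Station,38.9000,-77.0300\n"
            "S2,Sample Avenue,38.9100,-77.0400\n",
        )
        archive.writestr(
            "stop_times.txt",
            "trip_id,arrival_time,departure_time,stop_id,stop_sequence\n"
            "T1,08:00:00,08:00:30,S1,1\n"
            "T1,08:05:00,08:05:30,S2,2\n"
            "T2,08:10:00,08:10:30,S1,1\n",
        )
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for name, times in times_by_stop(sample_feed()).items():
        print(name, times)
```

To try it on a real feed, pass the path of the downloaded zip to times_by_stop instead of the sample. A full-size stop_times.txt can run to a great many rows, so for anything beyond poking around, loading the files into a database is probably the better route.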