Submitted by David Hobbs on 15 January 2008 - 11:56pm
I've been thinking about and researching how an institution can share its data, documents, and other content. Obviously your data and content are already exposed via the web, but providing the data in a more structured way allows more users (both internal and external) to manipulate the data in interesting ways, for example in mashups. There seem to be a few ways to share data from an enterprise with a lot of content:
Straight RSS/Atom. Although straight RSS/Atom (with no custom extensions / namespaces) may not be that interesting, it's obviously a useful way to get your content out there. Typically straight RSS/Atom is fairly time-based and might in effect show some history (news items like "John goes to work" and then "John goes home") rather than some state (like "John is now home").
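A feed like this takes nothing beyond the standard library to produce; here is a minimal sketch (the feed title, items, and URLs are made-up placeholders):

```python
# Sketch: exposing time-based content as a minimal RSS 2.0 feed,
# using only Python's standard library.
import xml.etree.ElementTree as ET

def build_rss(title, link, items):
    """Build an RSS 2.0 document; each item is a (title, link, pubDate) tuple."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    for item_title, item_link, pub_date in items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = item_title
        ET.SubElement(item, "link").text = item_link
        ET.SubElement(item, "pubDate").text = pub_date
    return ET.tostring(rss, encoding="unicode")

feed = build_rss(
    "Institution News",
    "http://www.example.com/",
    [("John goes to work", "http://www.example.com/1", "Tue, 15 Jan 2008 09:00:00 GMT"),
     ("John goes home", "http://www.example.com/2", "Tue, 15 Jan 2008 17:00:00 GMT")],
)
print(feed)
```

Note how the items read as a history of events rather than a current state, which is exactly the time-based character described above.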
Common repositories / services such as Swivel and StrikeIron. Rather than exposing your data/content directly to the outside world from your site/servers, you can use an intermediary. Swivel allows users to create their own graphs on data from either official sources or any user-supplied data. StrikeIron is built into mashup editors like QEDwiki, and has also built an Excel extension for calling its services. You probably would want to provide data to these services through an API of your own, but you could get started with Swivel, for example, by directly uploading the data.
Specialized XML formats for particular types of content. Examples include OpenSearch for search results and SDMX for statistical data. These specific XML formats allow a level of sophistication for people specializing in your type of content and allow tools built for this type of data to consume it. This fits in with the following item, which, for historical reasons, may or may not be XML-based.
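For instance, an OpenSearch 1.1 response is an ordinary Atom (or RSS) feed with a few extra elements in the OpenSearch namespace; a sketch of reading them (the feed below is a toy example, not a real service's output):

```python
# Sketch: pulling the OpenSearch 1.1 result-count elements out of an
# Atom search response, using the published OpenSearch namespace URI.
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
OS_NS = "http://a9.com/-/spec/opensearch/1.1/"

# A toy response; a real one would come back from an HTTP search request.
doc = """<feed xmlns="http://www.w3.org/2005/Atom"
              xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/">
  <title>Search results for "Chad"</title>
  <opensearch:totalResults>42</opensearch:totalResults>
  <opensearch:itemsPerPage>10</opensearch:itemsPerPage>
</feed>"""

root = ET.fromstring(doc)
title = root.find(f"{{{ATOM_NS}}}title").text
total = int(root.find(f"{{{OS_NS}}}totalResults").text)
per_page = int(root.find(f"{{{OS_NS}}}itemsPerPage").text)
print(title, total, per_page)
```

The point is that any tool aware of the OpenSearch namespace can paginate your results without knowing anything else about your institution.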
Institution-to-institution services. Sometimes you need to provide a point-to-point interface with another institution. In that case, you may need to support all sorts of unusual formats and delivery mechanisms. Hopefully you could leverage your various systems' web services to just transform the data into the formats you need.
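As a toy sketch of that transformation step, here is one internal record set rendered into a partner-required format (CSV here; the field names and values are invented, and a real exchange might equally demand fixed-width files or a custom XML schema):

```python
# Sketch: keeping one internal record format and transforming it on
# demand into whatever format a partner institution requires.
import csv
import io

# Invented internal records, as they might come back from a web service.
RECORDS = [
    {"country": "td", "title": "Water survey", "year": 2007},
    {"country": "td", "title": "Census summary", "year": 2006},
]

def to_csv(records):
    """Render the internal records as CSV for a partner that wants flat files."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["country", "title", "year"])
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()

print(to_csv(RECORDS))
```

Keeping the transformation separate from the internal format means each new partner costs you one small exporter rather than a parallel system.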
A common API that your institution follows across all types of content. This one is the most interesting to me and one that I alluded to in my previous post on interaction publishing. Especially if your institution has various repositories, one possible approach would be to slap up a page that has links to the different instructions for referencing each. But, to make access as easy as possible, a common API with consistent parameters that can be queried against all systems would be preferable (for instance, queries such as "give me all your documents and data on Chad" via url requests like http://xml.example-domain.com/apis/type=docs,data&country=td). Potentially the returned XML could be in a simple format such as RSS extended with a custom namespace (so that other tools such as Yahoo Pipes, and even feedreaders, could easily consume the data).
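A small sketch of how a client might build such consistent query URLs (the endpoint and the type/country parameter names are the hypothetical ones from the example above, not a real service):

```python
# Sketch: building query URLs with consistent parameters so the same
# client code can query every repository behind a common API.
from urllib.parse import urlencode

BASE = "http://xml.example-domain.com/apis/"

def query_url(**params):
    """Encode parameters identically for every repository behind the API."""
    # Sort the parameters for a stable, cache-friendly ordering.
    return BASE + "?" + urlencode(sorted(params.items()))

# "Give me all your documents and data on Chad."
url = query_url(type="docs,data", country="td")
print(url)
```

The value is in the consistency: once every repository honors the same parameter names, a tool like Yahoo Pipes only has to be taught the scheme once.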
Microformats. Probably most useful to future browsers or other tools like the Firefox Operator extension (or to services that crawl sites, such as Google), microformats let you change your existing HTML a bit to expose very common types of data like addresses and calendar events. For example, instead of your HTML containing "100 Main Street, Anytown, USA", it would be marked up as "<div class="adr"><span class="street-address">100 Main Street</span>, <span class="locality">Anytown</span>, <span class="country-name">USA</span></div>", and you would then define the CSS to show it as you wish. For example (with sloppy CSS):
100 Main Street
See how this page appears in Firefox Operator (also notice the tagspaces):
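To show the consuming side, here is a rough Python sketch that pulls the adr fields out of flat markup like the example above using only the standard library (a real consumer such as Operator uses a full microformats parser; this handles only the simple, non-nested case):

```python
# Sketch: extracting adr microformat fields from existing HTML with
# Python's standard-library HTML parser.
from html.parser import HTMLParser

ADR_FIELDS = {"street-address", "locality", "region", "postal-code", "country-name"}

class AdrParser(HTMLParser):
    """Collect the text of elements whose class is an adr field name."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._current = None  # adr field we are inside, if any

    def handle_starttag(self, tag, attrs):
        classes = dict(attrs).get("class", "").split()
        hits = ADR_FIELDS.intersection(classes)
        if hits:
            self._current = hits.pop()

    def handle_data(self, data):
        if self._current:
            self.fields[self._current] = data.strip()
            self._current = None

html = ('<div class="adr"><span class="street-address">100 Main Street</span>, '
        '<span class="locality">Anytown</span>, '
        '<span class="country-name">USA</span></div>')
parser = AdrParser()
parser.feed(html)
print(parser.fields)
```

Because the data lives in your existing pages, any crawler that understands the vocabulary gets structured data for free, with no separate feed to maintain.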
Drupal. Drupal drives this site, and I have been especially impressed by its clean architecture for adding new features/modules and by the strong community supporting it (hence an out-of-the-box experience that is both clean and powerful). I wasn't as impressed when I briefly played with Drupal a year or two ago, so I sense that Drupal now really has a critical mass behind it. References: my post about my first month using Drupal, the Drupal development book, and drupal.org.
Sophisticated analytics for the masses. Although tools like Omniture SiteCatalyst are still more sophisticated and customizable, Google Analytics is really amazing, especially for a free tool: very nice user interface, sensible defaults, campaign tracking, user-defined dashboards, good reverse DNS lookup, and fast. See the Analytics Talk blog for more on Google Analytics.
I also see a lot of opportunities for improvement in 2008:
More sophisticated offshoring models. The naive view of offshoring goes something like this: if someone costs $X per hour in your country and $X/3 per hour in another country, then it would seem obvious to give the work to the offshore resource. Sometimes this works. Highly repeatable tasks are the most obvious case (for example, call centers). It also often works when you can hand off a specifications document and then wait for the implementation, although this Wall Street Journal article (subscription required) on the outsourcing problems of the 787 points out interesting issues there too: your outsourced suppliers outsourcing to their own suppliers, quality control/process issues, and taking for granted the expertise/background built up inside Boeing when handing off to suppliers. If the task isn't highly repeatable or very tightly specified, then the overhead of communication/management is very high. I would also expect that places currently considered "offshore" will be developing innovative products themselves (see this blog post: State of Innovation in India).
Improvements in single sign-on and passwords. If I go to Amazon and then B&H, I have to log on twice. Worse, if I go to very small sites I have to create a separate username/password (it's one thing to trust Amazon with my password, but why should I trust a very small site with that information?). I plan on adding OpenID for accounts on this site, and I would encourage others to add it to theirs (many platforms such as Drupal now support it). OpenID lets users decide whom they trust to keep/authorize their account information (notably the password), and you choose what information to give to different sites. After logging in once, you don't need to provide your password again when you go to another site that uses OpenID. Hopefully at least smaller sites will start adopting OpenID, but it would be great if larger players adopted it as well. I'm still hoping for a replacement of passwords entirely, perhaps by graphical methods (how archaic is it to remember a bunch of passwords? And worse, if you force users to use "strong" passwords and change them frequently, they'll just write them down), but at least reducing the number of accounts you have would help.
Mashup building for the masses. Although APIs and mashups have taken a big stride forward, I hope to see some standardization in APIs and enhanced mashup editors that allow less technical people to create their own interesting mashups (not only with maps!). See my Enabling the Interaction Publisher post.