Blogs

Content Re-Use: Roles, Consistency, and Standards

Content re-use is important for many sites, but implementing it on large sites is more subtle than it may initially appear (see Automatic Pull of Content: Some Issues). To implement content re-use successfully, you must consider:

  • The exact functionality of the automatic pulls (it isn't just flipping a switch)
  • Metadata quality (not just defining the taxonomy and hoping for the best)
  • Who can do what

In this blog post, I'll explore the third item: roles in content re-use for large sites. These are some of the roles relevant to content re-use:

  • Content Contributor
  • Metator
  • Page Owner
  • Template / Block Designer
  • Developers, including DB, Platform, and Site Developers

Why Roles Matter

For the end user's sake, you want your site to be consistent. Site owners, for reasons of ego, may specifically want inconsistency. By consistency, I mean look and feel as well as the way content is aggregated. For example, if one section of your site lists Current Events that are two years old, another only lists content in the future, and the RSS feed lists events that aren't even published, users will be confused (this sounds "out there", but I've seen it occur!). By enforcing what different groups (from database developer to content contributor) can do, you can control these types of issues. Ideally, you enforce as many of your site standards as far to the left of the diagram below as possible.

Content Contributor

You could have the most beautiful design and CSS in the world, but the content contributor could still set text to a font that's not in your standard (can your users set fonts, or just CSS styles?). This is an easy example, but not as crucial as other standards such as the length of titles (longer titles may gum up content appearing in smaller blocks) and tagging. The implication is that you need to set the relevant standards, train contributors on them, and enforce them as much as possible in the content entry interface.
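As a rough illustration of enforcing standards in the content entry interface, a form handler could check each submission before it is saved. This is only a minimal sketch: the 60-character title limit, the Submission fields, and the validate_submission function are assumptions made up for this example, not features of any particular CMS.

```python
from dataclasses import dataclass, field

MAX_TITLE_LENGTH = 60  # assumed site standard, chosen only for illustration


@dataclass
class Submission:
    """Hypothetical shape of what a content contributor submits."""
    title: str
    body: str
    tags: list = field(default_factory=list)


def validate_submission(item, allowed_tags):
    """Return a list of standards violations (an empty list means it passes)."""
    problems = []
    if len(item.title) > MAX_TITLE_LENGTH:
        problems.append(f"Title is longer than {MAX_TITLE_LENGTH} characters")
    if not item.tags:
        problems.append("At least one tag is required")
    unknown = sorted(set(item.tags) - set(allowed_tags))
    if unknown:
        problems.append("Unknown tags: " + ", ".join(unknown))
    return problems
```

The entry form would refuse to save (or at least warn) whenever the returned list is non-empty, so the standard is applied at the moment of entry rather than cleaned up afterwards.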

Metator

Who will be responsible for the overall quality of the tagging, key to content re-use? You could have someone dedicated to tagging (a metator), train the content contributors to do it, or have a librarian defining rules for automatic tagging. At any rate, who gets to do what tagging is a key decision. There is no free lunch here, so whichever way you go will require resources. For example, if you do automatic tagging then you do need someone to effectively train the tool (and then decide whether you remove the right of the content contributor to tag).
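To make the automatic tagging option a little more concrete, here is a very small sketch of rule-based tagging. The rules, tag names, and the suggest_tags function are all hypothetical; a real tagging tool would be far more sophisticated, and someone would still need to maintain the rules.

```python
import re

# Hypothetical rules a librarian might maintain: tag -> trigger phrases.
TAG_RULES = {
    "Cycling": ["bicycle", "bike lane", "cycling"],
    "Transit": ["bus", "metro", "streetcar"],
}


def suggest_tags(text, rules=TAG_RULES):
    """Suggest tags whose trigger phrases appear as whole words in the text."""
    found = []
    for tag, phrases in rules.items():
        for phrase in phrases:
            # \b keeps "bus" from matching inside "business".
            pattern = r"\b" + re.escape(phrase) + r"\b"
            if re.search(pattern, text, re.IGNORECASE):
                found.append(tag)
                break
    return sorted(found)
```

Whether these suggestions are applied automatically or only offered to the content contributor is exactly the kind of "who gets to do what" decision discussed above.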

Page Owner

Since we're talking about content re-use, the difference between a page and a piece of content is important. A good example would be this article: you might be reading this in a feed reader, on the HobbsOnTech home page, or on its own dedicated page (its permalink). And you could be seeing a link to this piece of content in a variety of places (for example, the See All Articles page, which is automated). So the content is re-used on a variety of pages.

The content contributor is considered above, but there may be page owners as well. For example, there could be an owner of the home page who determines what content shows in the top block of the page; in this case, the owner would be playing the role of an editor. Again, this is the editor of a page pulling in content, not of the content itself. You need to define what the page owner can and cannot do. For example, can the page owner choose what appears in all blocks? If they can, is it pre-filtered for them (for example, if the topic of their page is Cycling, can they even choose something that's not tagged to Cycling)?
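The pre-filtering idea can be sketched in a few lines. This is an illustrative sketch only: ContentItem, choices_for_page, and set_top_block are hypothetical names, and a real CMS would do this inside its own editing interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContentItem:
    """Hypothetical piece of content with its tags."""
    item_id: int
    title: str
    tags: frozenset


def choices_for_page(page_topic, items):
    """Only content tagged to the page's topic is offered to the page owner."""
    return [item for item in items if page_topic in item.tags]


def set_top_block(page_topic, chosen_ids, items):
    """Accept the owner's picks, rejecting anything outside the page topic."""
    allowed = {item.item_id for item in choices_for_page(page_topic, items)}
    rejected = [i for i in chosen_ids if i not in allowed]
    if rejected:
        raise ValueError(f"Not tagged to {page_topic}: {rejected}")
    return list(chosen_ids)
```

With this arrangement the owner of a Cycling page never even sees content that is not tagged to Cycling, and an attempt to select it anyway is refused.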

Developers

For a large site, you will have developers of different skill levels and experience with your environment. Instead of lumping all developers into one group, I would propose considering database / platform developers separately from site developers. This distinction probably only makes sense for large sites, but in that case it's important to at least consider.

Database Developer

For starters, there usually is not a separate role for the database developer. Perhaps only one person can change the schema, but anyone can write whatever query they want to get to the data. For small sites this is probably desirable, but for large sites (especially with high turnover amongst developers), this can result in fairly major issues. For example, if a "published" flag needs to be checked by anyone querying the database, a new developer could easily create an RSS feed that exposes draft content. A more robust solution would be a layer/API, implemented by a database developer, that is the only way a site developer can get at the data. This would mean, for example, that a new site developer can't even get at draft content.
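Here is a minimal sketch of such a layer, using an in-memory SQLite database. The table layout, the ContentStore class, and the sample rows are assumptions for illustration; in a real system the restriction would also need to be backed by database permissions, since Python alone cannot stop someone from writing their own query.

```python
import sqlite3


class ContentStore:
    """Hypothetical access layer: site developers call this, never raw SQL."""

    def __init__(self, connection):
        self._conn = connection

    def recent_items(self, limit=3):
        # The published check lives here once, so callers cannot forget it.
        rows = self._conn.execute(
            "SELECT id, title FROM content "
            "WHERE published = 1 ORDER BY id DESC LIMIT ?",
            (limit,),
        )
        return rows.fetchall()


# Sample data, invented for this example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE content (id INTEGER PRIMARY KEY, title TEXT, published INTEGER)"
)
conn.executemany(
    "INSERT INTO content (title, published) VALUES (?, ?)",
    [("Street closings", 1), ("Draft menu", 0), ("New bike lanes", 1)],
)
store = ContentStore(conn)
```

A site developer building an RSS feed would call store.recent_items() and would get only the two published rows; the draft never leaves the layer.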

Platform Developer

Much of the platform will already be built into the CMS that you use, but inevitably you will make changes to the platform. By platform, I mean the core driving code of the site. For instance, I would consider the basic site-wide page template to be part of the platform. Ideally, much of the functionality of content re-use will be built into the platform, so that it's not done over by the page template or block developers (leading to inefficiency and possibly inconsistency).

Site Developer

The site developer implements the various page templates and re-usable blocks of pages (this is assuming you have multiple sites running off the same platform). A lot of the rubber-hitting-the-road happens here, but hopefully the platform and database access have been defined in a way that makes it easier to develop consistently. As with all the developers, the site developer needs to embed as many of the site standards as possible right into the page templates/blocks. The site developer will also determine which components the page owner can modify.

Layers and Enforcement

One way of looking at these roles is through the lens of the layers of people involved in creating a page on a web site:

  • The DB and Platform developers set up the system for the site developers and content contributors
  • The site developer defines the templates that page owners then use for their particular pages
  • The content contributor publishes the content, possibly with a metator to ensure the tagging is correct
  • All this together renders a successful page

[Diagram: roles in content re-use]

A key point is that ideally you would have your site standards enforced as deep in the system as possible (as far to the left of the top track in the diagram as possible). Some examples:

  • The DB developer has hopefully created a layer such that no one else can even get at draft content.
  • The platform developer sets up a platform-wide basic template such that key elements cannot be overridden by a particular site developer.
  • The site developer creates a page such that only appropriate parameters can be set by the page owner (a topic page owner, for example, cannot include pages that are not tagged to the topic).

It may be that for some sites different roles make more sense, but the general point is that for content re-use to work, you need to carefully consider the roles of who can do what. Also, you want to implement standards as deep in the system as possible.

Compelling Vision for a Large CMS Migration

Large site migrations are messy, and it's easy to lose one's way in the muck. One way to help keep everyone moving in the same (productive) direction, and to set things up for better success in the first place, is to define a compelling vision of what you want your site to be.

What is a Compelling Vision?

A compelling vision is a simple statement, in terms that all stakeholders can understand, of how the migration will result in a substantially improved site. This vision must be concrete enough to prioritize tasks / functionality / content during the migration, and also motivating enough to mobilize everyone (and tolerate the inevitable hiccups).

How do you know you have a compelling vision?

  • Most stakeholders will say it's compelling. In other words:
    • It can't just benefit a small group of internal users (although potentially it could just satisfy a small group of customers!)
    • It will be understandable to all levels and functions within the organization (which may mean it's written slightly differently for different groups). This may be what differentiates the compelling vision from the business case, which may just be interesting to upper-level folks.
  • The vision is for a substantial improvement from the current site.
  • Translates to a prioritization for moving forward (so that you're always moving toward your priorities and not meaningless busy work).
  • Justifies doing the migration.
  • Short (a sentence to a page max).

Some stakeholders will not find it compelling, at least for large and complex migrations. Some will lose out. An obvious (although usually not honestly confronted) issue is that some may lose control that they currently have. For instance, a group may have been publishing their own sub-site when it does not make sense institutionally. Or someone who's used to working in Dreamweaver and uploading the site may feel that they are severely limited (they may even declare that the system is lame, and their boss, who may not know any better, may believe them).

Using a Compelling Vision

Having a compelling vision that no one knows about or uses is not especially helpful. So what do you do with this inspirational statement once you have it?

  • Broadcast the vision widely. Everyone should have the same understanding. Of course, hopefully folks were directly or indirectly involved in creating the vision, so it should not be a total surprise. That said, this vision should be repeated frequently (at the beginning of status reports, presentations, etc) to remind everyone what the migration is attempting to accomplish.
  • As the migration moves forward, refer to the compelling vision. This isn't just about communicating it (the previous point), but about anchoring the ongoing work. For instance, when working on functionality or content, make sure that it moves toward the compelling vision. Or when deciding how to phase the migration, consider ensuring that each iteration inches forward toward satisfying the compelling vision.
  • Prioritize based on the vision. When weighing how to proceed during the migration, try to justify based on the goals.
  • Clearly articulate the downsides (see above).
  • Set metrics to evaluate the migration based on the vision.

Some migrations' "compelling" vision goes something like: "You must move in or else, mandated by management". Obviously this doesn't deeply meet any of the checks of a good compelling vision, and those that it partially meets it distorts. For instance, if the only goal is to get content into the new system, then you'll wind up pushing for maximum progress on the percent of content items that have been migrated (which may mean moving in the 90% of easy but less essential content and setting up a train wreck for the 10% that requires more complex functionality and potential re-tagging of the initial 90% of content).

A more compelling vision would be something like "Authoritative source on large repository of information available via a variety of channels (web, RSS, API), different types (raw data, analysis), and selectable by topic pulling from a variety of back-end systems. Topic pages should be meaningful, consistent, and relevant to the public." This vision drives another type of migration, where you might start with those topic pages, iterating on the content and rules to improve the quality.

In summary, a compelling vision clearly articulates an end state for your site after your migration is complete, to help focus the team and project to success.

Have you had success in creating a compelling vision for your large site migration? I'd love to hear about it.

Automatic Pull of Content: Some Issues

[Image: Atlas District news block]

It seems so simple. You've got press releases that are clearly tagged to neighborhood (let's say the two possible neighborhoods are Capitol Hill and Atlas District). The Atlas District page should obviously only have Atlas District news, so you create a section on the Atlas District page that lists the most recent three press releases there. Your web developer whips something like this up quickly (examples from the excellent local blog Frozen Tropics):

Possible Issues

Seems easy enough, right? Sometimes the straightforward approach may be fine (especially for small sites), but you could wind up with something more like this if you're not careful:

[Image: Atlas District news block with problems]

Here are some of the potential issues with larger sites:

Drafts and embargoed material

"this should not appear anywhere, in any channel, until published"

Let's say you're about to post a press release containing the menu for a new restaurant in the Atlas District, and you've agreed to post it after 7pm tonight. You'll be working on a draft beforehand so that it's ready to go at 7:00. Obviously, the press release shouldn't appear until after the approved time. This is a more significant issue than it appears: if you start exposing APIs and other means of sharing your content, the same rules should apply there (rather than developers recreating the rules, and potentially introducing errors, every time).
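One way to keep the rule in a single place is a shared visibility check that every channel calls. The sketch below is illustrative only; the PressRelease fields and the is_visible and visible_items functions are assumed names, not part of any specific CMS.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class PressRelease:
    """Hypothetical press release record."""
    title: str
    approved: bool
    publish_after: Optional[datetime] = None  # embargo time, if any


def is_visible(item, now):
    """True only for approved items whose embargo (if any) has passed."""
    if not item.approved:
        return False
    return item.publish_after is None or now >= item.publish_after


def visible_items(items, now):
    # Every channel (page block, RSS feed, API) would call this same function.
    return [item for item in items if is_visible(item, now)]
```

With the restaurant menu example, an approved release with publish_after set to 7:00pm stays hidden at 6:59pm in every channel and appears everywhere at 7:00pm, without each developer re-implementing the rule.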

Editorial decisions

"yeah, but I don't want it on my page"

A press release is published that is related to both the Atlas District as well as Capitol Hill. Perhaps it's about a bicycle race that will result in street closings in Capitol Hill but only parking in the Atlas District. The owner of the Atlas District page doesn't think it's significant enough to appear on the Atlas District page. This would be a case where the tagging to Atlas District is correct, but there is a valid editorial decision to not include it on the Atlas District page (perhaps there's another separate event there that should be in the top three). In this case, the press release should not be retagged to remove Atlas District, since for some purposes (such as enterprise search) you will want the correct tag.
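A sketch of how this might work: the page keeps its own exclusion list, and the tags on the press release are left alone. The block_items function and the dictionary layout are assumptions for illustration.

```python
def block_items(items, page_tag, excluded_ids, limit=3):
    """Pick items for a page block without touching their tags.

    items: list of dicts with 'id' and 'tags', assumed to be newest first.
    excluded_ids: ids the page owner has chosen to leave off this page.
    """
    picked = [
        item for item in items
        if page_tag in item["tags"] and item["id"] not in excluded_ids
    ]
    return picked[:limit]
```

The bicycle race release stays tagged to both neighborhoods, so enterprise search still finds it under Atlas District, while the Atlas District page simply skips it.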

Bad Tagging

"this tag is just wrong"

This one is virtually impossible to avoid when dealing with a large group of people submitting content (although see a related metator discussion about ways to improve this). Let's say that a new person who does not know DC very well arrives, and mistakenly tags something to Capitol Hill instead of the Atlas District (perhaps mixing up 401 H St NE and 401 H St SE). Note that this is very different than the editorial decision issue, although at first blush they seem similar. In this case, the tagging is wrong and should be corrected (or, in the case of automated tagging, the rules should be changed).

Multilingual Issues

"don't show me partial results in another language"

A variety of issues can occur when pulling content in many languages, especially when, as is usually the case, different pieces of content are in different languages. You can end up with too little new content (if you are displaying a page with too little content in that language), or with unnecessary duplicate content (see Interleaving Languages).

Broadcasted content

"I need this important information on all pages of the site"

If you have a lot of publishers and content, you may sometimes have content that should appear on all pages (broadcasts), regardless of what neighborhood the news is about (let's say a press release about Washington, DC overall and not specific to a neighborhood). What you *don't* want to do (but may indeed do in a crisis if this wasn't planned for) is tag the content to every neighborhood just to make it appear there, even though that tagging is not correct.
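If broadcasts are planned for, they can be a separate flag rather than a tagging trick. This is a minimal sketch; the "broadcast" field and the items_for_page function are hypothetical.

```python
def items_for_page(items, page_tag):
    """Include items tagged to the page plus anything flagged as a broadcast."""
    return [
        item for item in items
        if item.get("broadcast", False) or page_tag in item["tags"]
    ]
```

A city-wide press release would carry broadcast=True and no neighborhood tags, so it shows up on every neighborhood page while its tagging stays accurate.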

Appearance of Timeliness

"a year old press release isn't 'current news'"

If you end up with a lot of automated pages (for instance, if you cover 30 different neighborhoods), it's easy to wind up with a block that says "Current News" but has very old content. In addition, if you are displaying events, those far in the future could crowd out an event happening tomorrow.
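Both problems can be approached with simple date rules applied consistently across blocks. The following is a sketch under assumptions: the 90-day window, the dictionary layout, and the function names are invented for illustration.

```python
from datetime import date, timedelta


def current_news(items, today, max_age_days=90, limit=3):
    """Newest first, dropping anything older than the assumed age window."""
    cutoff = today - timedelta(days=max_age_days)
    fresh = [i for i in items if cutoff <= i["date"] <= today]
    return sorted(fresh, key=lambda i: i["date"], reverse=True)[:limit]


def upcoming_events(events, today, limit=3):
    """Soonest first, so tomorrow's event is not buried by distant ones."""
    future = [e for e in events if e["date"] >= today]
    return sorted(future, key=lambda e: e["date"])[:limit]
```

If a neighborhood has nothing inside the window, current_news returns an empty list, which lets the template hide the block or relabel it rather than present a year-old press release as current.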

What to do about it?

In future blog posts, I hope to cover approaches to avoid these issues, but in closing I thought it would be helpful to list some high-level pointers:

  • Clearly articulate how your automatic pulls should work, as early in your process as possible.
  • Don't think of each block in isolation, but try to implement things in a consistent manner (for instance, by only having page blocks behave in a few different ways).
  • Similarly, consider whether developers should have control over all aspects of each block, or whether much of the aggregation should only be available through a consistent API.
  • Be mindful of the issues above when designing your page/block behavior and training of those that will be tagging.

As always, please provide any comments at HobbsOnTech or on Twitter at @jdavidhobbs.

A Recap of Two Years of Blogging: CMS Migration, Large Site Issues, and more

After a long hiatus, I plan on focusing my blogging energies back on the Hobbs On Tech blog. Looking at my posts over the last couple years on the Hobbs On Tech and WelchmanPierpoint sites (see full list of articles), I see some themes: CMS Migration, Internal CMS Product Management, and Content Re-use / Large Site Issues. As I reflect on future blog entries, I'm stepping back and thinking about what blog posts have been the most successful in my opinion.

Client choice via a la carte menu (using spreadsheet)

Sometimes you want to give your client a variety of options to choose from for the next phase of a project. For instance, if you are developing a web site for a client, you may wish to discuss a variety of possible features to implement. Since it can often get confusing for both the client and yourself when negotiating different features, one approach is to give the client a spreadsheet where they can try out different combinations of features to figure out what they want to pay for (this is especially useful when both you and the client have far more ideas than will fit in the budget). By also providing the option of tentatively planning for future phases (as well as tentatively indicating items that the client will probably never want), the client can better envision what they will get in the future as well.

One useful way to present this is in a spreadsheet where the rows represent the features and the columns represent different possible phases -- a particularly straightforward method for the client is to allow them to mark an "X" in columns to indicate what features should be delivered when (see example spreadsheet below).  This can be accomplished using the DSUM function in Excel and other spreadsheet programs (the example below uses EditGrid, although it can be downloaded to Excel if you wish). 
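The arithmetic behind such a sheet is just a conditional sum per phase column. As a rough Python analogue (not the spreadsheet formula itself), with feature names, costs, and phase labels all invented for illustration:

```python
PHASES = ["Phase 1", "Phase 2", "Never"]

# Hypothetical rows: one per feature, with an "X" in the chosen phase column.
features = [
    {"name": "Photo gallery", "cost": 1500, "marks": {"Phase 1": "X"}},
    {"name": "Site search", "cost": 2500, "marks": {"Phase 2": "X"}},
    {"name": "Discussion forum", "cost": 4000, "marks": {"Never": "X"}},
    {"name": "Events calendar", "cost": 1000, "marks": {"Phase 1": "X"}},
]


def phase_totals(rows, phases=PHASES):
    """Sum the cost of every feature marked with an X in each phase column."""
    return {
        phase: sum(
            row["cost"]
            for row in rows
            if row["marks"].get(phase, "").strip().upper() == "X"
        )
        for phase in phases
    }
```

Moving an "X" from one column to another changes the totals immediately, which is what lets the client experiment with combinations until the first phase fits the budget.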

Here is a non-interactive example sheet (you can also play with an interactive version of this spreadsheet, including moving around the selections of features in each column/phase -- WARNING for Firefox users: if you are using Firebug disable it before going to the interactive version due to a bug in Firebug):