Blogs

Why are HTML editors in CMSes?

This may be a sensitive issue to many, especially those that like getting their hands deep into HTML, but should we even have HTML editors in CMSes?

Most users don't care about the HTML details -- they just want to make their content look nice. For instance, my last blog post included a table. Since I didn't have a quick and easy way of embedding a nice-looking table, I created a table in InDesign, took a screenshot, and then included that screenshot in the blog post. Yuck. Obviously not a good thing to do, but I didn't want to spend a few hours getting an HTML table looking half-decent. A couple of days later the table was still bothering me enough that I did spend an hour or two (using help from Smashing Magazine) putting a pretty nice-looking table into the blog post. But it was a one-time workaround, and I don't have an especially clean way of doing this the next time I need a table. People do these sorts of workarounds in CMSes all the time. When this occurs on a big site, the extra work, inconsistency, quality problems, and frustration multiply quickly.

Just taking a table as an example, here's what a great process for creating and using a table would look like:

  1. User indicates where they'd like to insert their table.
  2. User creates the table content, giving headings to the table, perhaps merging some cells, adding columns and rows, highlighting some particular cells, and inputting content. Then publish.
  3. The table then appears on the site with a nice-looking, consistent format -- let's say it has rounded corners.
  4. User adds a column and row, and the rounded corners still appear.
  5. Globally the CSS is changed in one place in one file and the highlighted cells are no longer yellow but blue. A similar change could be made for the rounded corners, cell padding, globally dropping Arial for Verdana, etc.

Notice a few key elements in this:

  1. User doesn't have to know anything about HTML (hiding for instance the fact that the first and last column and first and last row have to be classed differently)
  2. The table looks nice and will be consistent across the entire site without effort by the user
  3. Styling changes can be made globally to all tables
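
As a rough illustration of the process above, the CMS could take the structured table the user entered and emit the markup itself. The sketch below is written in Python and is only an assumption about how such a renderer might look: the function name `render_table` and the class names (`site-table`, `first-row`, `last-col`, `highlight`, and so on) are hypothetical, not taken from any particular CMS.

```python
from html import escape


def render_table(headers, rows, highlighted=frozenset()):
    """Render a table as HTML, adding positional classes so the user never has to.

    `highlighted` is a set of (body_row_index, column_index) pairs.
    All class names used here are hypothetical examples.
    """
    all_rows = [headers] + rows
    n_rows, n_cols = len(all_rows), len(headers)
    lines = ['<table class="site-table">']
    for r, row in enumerate(all_rows):
        tag = "th" if r == 0 else "td"
        cells = []
        for c, value in enumerate(row):
            classes = []
            if r == 0:
                classes.append("first-row")
            if r == n_rows - 1:
                classes.append("last-row")
            if c == 0:
                classes.append("first-col")
            if c == n_cols - 1:
                classes.append("last-col")
            # Highlights are indexed against body rows, so the header row is skipped.
            if r > 0 and (r - 1, c) in highlighted:
                classes.append("highlight")
            joined = " ".join(classes)
            attr = f' class="{joined}"' if classes else ""
            cells.append(f"<{tag}{attr}>{escape(str(value))}</{tag}>")
        lines.append("  <tr>" + "".join(cells) + "</tr>")
    lines.append("</table>")
    return "\n".join(lines)


html = render_table(
    ["Activity", "Automated?"],
    [["Valid XHTML", "Yes"], ["Training", "No"]],
    highlighted={(1, 1)},
)
print(html)
```

Because the positional classes are computed at render time, adding a row or column moves the "last" classes automatically, and the look of every table is still controlled from the site stylesheet.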

When putting together requirements for selecting CMSes, I often hear about the need for the editor to completely control the HTML. But I think this is based on a broken model: most users drop into HTML only because the system isn't set up to easily let them do what they need. Another reason this model is broken: it shifts the burden of consistency across content to training, documentation, and the content publisher, when the system could be providing both consistency for the site visitor and easier use for the content publisher.

When looking at improving your system or selecting a new CMS, consider how common elements are published and whether deep HTML editing is needed for your content publishers. Obviously this will depend on your industry (for instance newspapers would want very streamlined publishing without HTML control) and how distributed your publishing is (this is more important for a large number of publishers), but make sure to think about the tradeoffs of deep control and ease of publishing.

 

Content Migration: What Can Be Automated and What Must Be Manual

Understandably, teams often focus on what tools can provide for migration. On the flip side, many teams are too knee-jerk about getting a legion of people to migrate content by hand. The decision whether to automate is probably more subtle than it first appears (it's not an either/or, and there are degrees of involvement needed by people), and it should be considered carefully. That said, let's assume you have a team and the tools to do a lot of automation in your migration. To have a successful migration, make sure to consider the things that cannot be automated. Here's a quick table listing some items that could be automated and those that cannot:

This is a revised HTML version of the table. Click here for image of original table.

| Migration Activity                   | Can Be Automated? | Note
|--------------------------------------|-------------------|---------------------------------------------------------
| Turning into valid XHTML             | Yes               | Fairly trivial, but the focus of some projects
| Moving into new DB schema            | Yes               | Focus of many Tool X to Tool Y discussions
| Placing content based on rules       | Yes               |
| Stripping out extraneous page info   | Yes               |
| Transforming to use new CSS          | Yes               |
| Scraping out structure from HTML     | Yes               | Systems integrators often avoid this
| Dealing with links between content   | Yes               |
| Tracking progress                    | Yes               |
| Automatic tagging                    | Yes               | Depends on domain and training
| Training/designing/configuring tools | No                | Automation isn't entirely free
| QAing automation                     | No                | Probably not 100%, but checking is required
| Dealing with unusual cases           | No                |
| Defining new site vision             | No                | No unifying vision might mean loss of focus
| Editorial changes                    | No                | A lot of content will require editorial work
| Internal communications              | No                | How will changes be communicated?
| Product management                   | No                | When issues are raised, what will be fixed?
| Training                             | No                | How will people be trained for the new system?
| Defining team                        | No                |
| Defining site behavior               | No                | Content won't be moved into a vacuum
| Content strategy                     | No                |
| ROT cleanup                          | No                | Analysis is aided by tools; decisions are made by people
| Site governance                      | No                |
| Defining taxonomy, IA, etc.          | No                | Directly affects automation efforts

Content Migration is Interesting, Really!

One of the underlying themes of content migration is that it's boring, or just a necessary evil. Worse, many just look at it as something that doesn't really need to be worried about, since at worst you could just hire a bunch of interns to cut and paste.

Perhaps you won't find content migration fun, but looking at it through a different lens might make it interesting enough for you to increase your chances of website implementation success. Here are some of the reasons you might think content migration is boring, along with another way to look at each that makes it more interesting:

Cutting and pasting -> 1) Searching for patterns and 2) Improving your content

Sure, if you are the one that's stuck with cutting and pasting into a new CMS then you've got a boring task ahead of you. But hopefully the migration can be flipped a bit so that it's not structured as a cutting and pasting exercise.

First, consider looking for patterns in your content. For instance, perhaps all press releases entered between 2002 and 2006 were entered in a consistent manner and can therefore be dealt with in the same way (hopefully automatically). At any rate, by looking for patterns you can consider the migration at large rather than immediately devolving into looking at one piece of content after another. See more reasons for using rules.

Second, consider your broader content strategy, or, perhaps more likely, use your content migration as an opportunity to define your content strategy. Your overall content strategy can help you determine what to do with your content, including what content is most important and requires careful editing rather than blind cutting and pasting.

By looking for patterns and considering your overall content strategy you can: a) look for automation opportunities and b) turn the exercise into one of Editing rather than simply copying and pasting (see a comparison between editing and QAing).

One-time exercise -> Setting up long term program

Migration is necessarily an event. You will probably iterate on the migration, but there is a definitive beginning and end. That contributes to the boredom factor, but, similar to using the migration as an opportunity for content strategy, you can also use this as an opportunity to set up a longer-term program. You will need to set up a team, training, processes, product management, and other factors in your migration. If you look at this as setting up your overall program rather than just what is needed for a single migration, the task takes on a more interesting color.

Unending -> Develop tracking metrics

Especially for large migrations, the entire undertaking can seem almost unending. Looking at it monolithically encourages the "well, we better just get started" approach. But I would encourage you to develop useful tracking metrics that will help make it feel a bit less insurmountable. Looking for patterns is related, since you may track progress against those patterns, and patterns can also help you determine what level of quality is needed for different content.
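
One way to make tracking concrete is to report progress per content pattern rather than for the site as a whole. The Python sketch below assumes each item in your content audit carries a `pattern` label and a `status` field; those field names and the status values (`pending`, `migrated`, `qa_passed`) are hypothetical, so adapt them to whatever your audit actually records.

```python
from collections import Counter


def migration_progress(items):
    """Summarize progress per content pattern.

    Each item is a dict with the hypothetical keys "pattern" and "status".
    An item counts as done only once its status is "qa_passed".
    """
    totals = Counter(item["pattern"] for item in items)
    done = Counter(
        item["pattern"] for item in items if item["status"] == "qa_passed"
    )
    return {
        pattern: {
            "total": total,
            "done": done[pattern],
            "percent": round(100 * done[pattern] / total, 1),
        }
        for pattern, total in totals.items()
    }


audit = [
    {"pattern": "press releases 2002-2006", "status": "qa_passed"},
    {"pattern": "press releases 2002-2006", "status": "qa_passed"},
    {"pattern": "press releases 2002-2006", "status": "migrated"},
    {"pattern": "press releases 2002-2006", "status": "pending"},
    {"pattern": "working papers", "status": "pending"},
]
for pattern, stats in migration_progress(audit).items():
    print(pattern, stats)
```

A report like this shows which chunks of content are finished and which have not started, which tends to be more useful than a single site-wide percentage.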

Unimportant -> Critical to success

Most of the macho talk around a web site might be about things that lots of people have opinions on, like design or technical buzzwords. Many of us can hold our own in these discussions, and much of the time these are important and interesting discussions. But the migration task is often viewed as something that can be dealt with later, or, as mentioned above, dealt with by interns at the last moment. Why is migration planning important? Here are three quick issues:

  • Migration can take much longer than expected, which means that from the project perspective there's a sudden slip at the end. Careful planning including a pilot can help reduce this risk.
  • Related to the above point, if you don't plan your migration carefully, you may discover systemic problems with your implementation late. For instance, you may discover functionality problems with your site, and dealing with them as a surprise at the end is more problematic.
  • Not planning effectively can mean problems persisting in your site that are difficult to correct later.

I'd like to make sure that as an industry we're planning and executing migrations better (one of the reasons for writing the Web Site Migration Handbook). Did this post help convince you that migrations are a bit more interesting? I'd love your comments, either on this blog or on Twitter at @jdavidhobbs.

Touchpoints During Content Migration: QA and Edit

Content migration isn't just a technical undertaking. For example, you will have staffing impacts. In particular, people will need to touch at least some of the content being migrated. Regardless of the quality of any content migration automation, someone will need to QA the most critical pages of your site to ensure no unintended issues arose during the migration. Furthermore, you may want to editorially change some of your content to hone the messaging of your site if you are also restructuring the site.

Broadly speaking, these are the steps in which someone touches content during a migration: Sort -> Place -> Edit -> QA. Sorting content is the task at the beginning of content migration planning where the disposition of the content is determined, which hopefully is done by defining rules/patterns rather than looking individually at content. So at this point team members need to decide what gets dropped, archived, moved over without change, or edited (see the website migration handbook for ideas on making that determination). Hopefully in the same pass as sorting, you can determine where content will be placed on the new site. Again, ideally this is done with rules rather than on a one-off basis.

Actual changes to the content occur at the Edit and QA stages. Since different resources need to do each, the distinction is important.

Edit

Editing is where you are substantively changing the content. In other words, regardless of any technical issues, how does the content need to change? Is the focus of the site changing, requiring a different angle or style? Editing requires either someone with a writing / editorial background or a subject matter expert. For instance, if your site is about photography, then you will need a photography subject matter expert to write the introduction to an entirely new section on the latest technology. Note also that editing could occur either in the existing system or the system you are moving to.

QA

If editing is what the subject matter expert or writer needs to do, actually QAing the content is what someone with more traditional webmaster skills does. The purpose of the QA process is to ensure that the technical migration was successful. In other words, this is more of an HTML/CSS flavor of issues that require web knowledge to both catch and fix. QAing can only occur on the new system. The types of issues here are:

  • Special characters appearing unexpectedly
  • Aspects of the new template "blown out" (where content, images, or tables extend beyond the area they are meant to be within)
  • Strange wrapping of text (for instance in titles)
  • Graphical elements from the old site that do not appear correctly in the new site
  • Elements that have disappeared (it's useful to look at both side-by-side)
  • Elements that were prominent in the old content that have lost their prominence (for instance headers)

See "Ensuring Quality During Site Migration" for more on the iterative aspect of QAing.
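
Some of the checks in the list above can be partly scripted to flag pages for a person to look at. The Python sketch below is a heuristic under stated assumptions, not a complete QA tool: the names `qa_page` and `count_headings` are hypothetical, the pattern only catches a few common signs of mis-decoded UTF-8 (and may flag legitimate text in some languages), and counting headings is just one crude proxy for elements that have disappeared or lost prominence.

```python
import re
from html.parser import HTMLParser

# Sequences that commonly appear when UTF-8 text is decoded as Windows-1252
# or Latin-1 (e.g. "Ã©" for "é"), plus the Unicode replacement character.
MOJIBAKE = re.compile("\u00c3[\u00a0-\u00bf]|\u00e2\u20ac|\ufffd")

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingCounter(HTMLParser):
    """Counts heading start tags in an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self.count += 1


def count_headings(html):
    parser = HeadingCounter()
    parser.feed(html)
    return parser.count


def qa_page(old_html, new_html):
    """Return a list of suspected issues for one migrated page."""
    issues = []
    if MOJIBAKE.search(new_html):
        issues.append("special characters appearing unexpectedly")
    if count_headings(new_html) < count_headings(old_html):
        issues.append("headings missing or demoted")
    return issues


old = "<h2>Caf\u00e9 hours</h2><p>Open daily</p>"
new = "<p>Caf\u00c3\u00a9 hours</p><p>Open daily</p>"
print(qa_page(old, new))
```

Checks like these only narrow down where to look. Layout problems such as blown-out templates or strange wrapping still need someone to view the old and new pages side by side.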

When defining your content migration plan, you can attempt to isolate what needs to be Edited and what needs to be QA'd. Then this information can be used to figure out the staffing impact so you can make intelligent decisions.

Rules for Content Migration: Panning for Gold

Moving content into a new CMS comes with many caveats (for example, is it even a good idea? And migrating to a CMS is far more than just moving content). However you slice it, moving to a new system is an important time to think about your content. Specifically, what content should move? Kristina Halvorson's Content Strategy for the Web lists two reasons to have content: a) it supports a key business objective and/or b) it supports a user (or customer) in completing a task. So the Compelling Vision is important here (as always!). It's not just about what can be moved in, but about filtering out the gold nuggets from the muck (see The Web Diet for more overall ideas on shedding bulk from your site).

OK, so how do you decide what to move? The most straightforward method is to have a spreadsheet of content, basically an audit of your current content, and then decide line by line what moves over. This can work for smaller pools of content, but what if you have 10,000, 100,000, or more pieces of content? Even with hundreds of pieces of content this can start being a bit unwieldy.

Having rules (like different types of sieves to leave behind the useful tidbits) for content migration is more helpful than making one-off decisions, for the following reasons:

Apply in the system on ongoing basis

Why go through the effort of your migration only to have the content go stale immediately? If you decide that press releases more than four years old should be dropped during the migration, then do you really want seven-year-old releases kicking around your system two years after launch? Similarly, if you decide that pages that are regulation-driven and over one year old should be reviewed, then don't you want that in your new system? Of course, there's a whole other area of governance around actually being able to enforce these rules on an ongoing basis, but migration planning may be a good time to discuss this.

Better justify drops

In a simple (no bureaucracy) environment, justification is not really relevant. For instance, if I want to remove a page from this blog, I don't have to argue with anyone about it. But if you are operating in a large organization, you could wind up in endless discussions going nowhere by reviewing content items on a case-by-case basis (or, perhaps worse, just decide to throw everything into the new CMS again). That said, if you are dropping a piece of content because of a rule (any content of any type that has not been viewed more than ten times in the last year will be archived), the decision is much harder to argue with (of course, you would probably also need to come up with some sort of very tight exception policy).

More easily agree first, then everyone works separately

Related to the above comment, if you agree on the rules first, then you can apply those rules and everyone can get cracking on whatever they have to do on the content. For example, if a rule states that a page needs to be updated because it mentions a particular highly negative incident, then someone can start updating it rather than spending time talking about whether it needs to be updated. Related to this, having some rules in place would better fit into a web operations management framework, so that existing teams could make high-impact decisions rather than getting mired in infinite details.

More easily identify disposition

Let's say you have a spreadsheet of 50,000 pieces of content. How do you start? You could just start at the top and work your way down, but how efficient will that be? If you have a rule like any content not updated in the last 8 years gets put into an archive site regardless of any other factors, then you can (assuming you have reliable dates) apply that rule and start working through the content in chunks like that. Note that the process of defining the rules probably means that you need to deep dive into your audit, but the point is that with rules in mind you will quickly see patterns that you can apply to quickly identify the disposition for different content.
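
To sketch how rules like these could be applied to an audit spreadsheet, the Python example below runs each row through an ordered list of rules and records the first one that matches. The thresholds come from the examples in this post, but everything else is an assumption for illustration: the field names (`type`, `updated`, `views_last_year`), the use of the last-updated date as a stand-in for age, the rule order, and the `review` fallback.

```python
from datetime import date


def years_since(day, today):
    """Approximate age in years between two dates."""
    return (today - day).days / 365.25


# Each rule is (disposition, reason, predicate). Order matters: first match wins.
RULES = [
    ("archive", "not updated in the last 8 years",
     lambda item, today: years_since(item["updated"], today) > 8),
    ("drop", "press release older than 4 years",
     lambda item, today: item["type"] == "press release"
     and years_since(item["updated"], today) > 4),
    ("archive", "viewed 10 times or fewer in the last year",
     lambda item, today: item["views_last_year"] <= 10),
]


def disposition(item, today, rules=RULES):
    """Return (disposition, reason) for one audit row."""
    for outcome, reason, applies in rules:
        if applies(item, today):
            return outcome, reason
    # Nothing matched, so a person still has to look at this item.
    return "review", "no rule matched"


today = date(2012, 1, 1)
audit = [
    {"type": "page", "updated": date(2001, 5, 1), "views_last_year": 900},
    {"type": "press release", "updated": date(2006, 3, 1), "views_last_year": 900},
    {"type": "page", "updated": date(2011, 3, 1), "views_last_year": 500},
]
for row in audit:
    print(row["type"], row["updated"], disposition(row, today))
```

Recording the reason alongside the disposition keeps the justification for each decision attached to the content, which helps when someone asks why a page was dropped.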

Better move content

By looking for patterns, some commonalities may be relevant to deciding how to move content. For example, you may notice that content over a certain age used an old HTML template you forgot about, but that could be scraped easily. Or you may realize that your old working papers can just be moved in as-is rather than needing any manipulation.

Explain to end users if appropriate

If you have these rules in place, then it may in some cases be relevant to your end users. For example, if certain types of content go into an archive after a certain period, you can indicate this to your users. This also gives you a means for public feedback on your decisions.

Reapply rules once you realize you don't have the resources

While going through this process, hopefully you can also take some wild guesses at the effort it will take to deal with different types of content. If working in your content audit spreadsheet, you could always keep a running tally of the total editing effort. If the effort is high, then you can re-evaluate assumptions, change quality levels, or pursue many other options. One option is to migrate less. If you have applied rules, then you can tweak the rules and see the impact on the projected effort. If you did your analysis in a piecemeal manner, then you would need to go through all those negotiations again about what's moving and what's not.
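
As a small illustration of tweaking a rule and watching the projected effort change, the Python sketch below totals rough editing hours for the content that would still migrate under different archive thresholds. The field names `age_years` and `edit_hours`, and the idea of a single age-based archive rule, are assumptions made for the example. The hour figures are made up.

```python
def projected_effort(items, archive_after_years):
    """Total editing hours for content that still migrates under the rule.

    Items older than `archive_after_years` are archived and need no editing.
    """
    return sum(
        item["edit_hours"]
        for item in items
        if item["age_years"] <= archive_after_years
    )


def compare_thresholds(items, thresholds):
    """Projected effort for each candidate archive threshold."""
    return {t: projected_effort(items, t) for t in thresholds}


audit = [
    {"age_years": 1, "edit_hours": 2.0},
    {"age_years": 5, "edit_hours": 1.0},
    {"age_years": 9, "edit_hours": 0.5},
]
print(compare_thresholds(audit, [8, 4]))
```

With the rule expressed as a parameter, changing the threshold is a one-line change and the new effort total follows immediately, instead of reopening item-by-item negotiations.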

Hopefully this has convinced you to at least consider defining meaningful rules for your content migration. Please share your thoughts or experiences in the comments below.