Content Migration Burden: It's Not Just Automated or Manual
How much burden will each team face?
Content migration is more subtle than it looks on the surface. The decision isn't just whether to automate, but how much burden each team will face. Sure, a fifty-page site might be migrated entirely by hand, but anything much larger will need a blend of automation and manual intervention. Often the technical and non-technical teams don't really understand each other (or the migration process), so the technical team just says it will automate what it can and then hand the result to the less technical teams to clean up. The technical team might do what amounts to an automated copy and paste with a twist of trivial cleanup using a tool like HTML Tidy. That's a bit like kicking a bunch of rocks down the mountain for the people living below to deal with.

Problems with the simple two-step automation approach
Some problems with this technical-team-hands-off-to-other-teams approach:
- Unplanned cost for non-technical teams. Shifting the burden to the non-technical teams (also typically known as The Client) leaves them with more work than they were probably anticipating. This approach gives no chance for effective estimation (see Why estimate? I'm not getting more resources for this site migration.).
- Lost opportunities. Often a lot more can be automated than is initially discussed (see Content Migration: What Can Be Automated and What Must Be Manual), and by implementing in this two-phase handoff approach these possibilities are easy to miss.
- Lower quality. If all of this manual burden happens at the end of the process, then quality will probably start being tossed out the window.
- Project slips. When unplanned manual cleanup piles up at the end of the process, the schedule slips along with it.
Dangerous Phrases
Here are some dangerous phrases to listen out for:
- "We'll automate what we can". This is a signal that the technical team isn't engaging in a discussion about the migration but will bang out whatever they can quickly to ram the content into the new system.
- "We looked at the content and it isn't regular enough to automate". If only ten content items are in question, then move on and migrate them manually. If you have a large batch of content, you might want to dig in further. I once had a system integrator say that, just because the label "Description" was used on some pages and "Overview" on others, it couldn't scrape out the content components. Wrong: regular doesn't mean identical, and it is trivial to deal with these types of issues once identified.
- "We'll turn it into XHTML and then you take it from there". Run. Fast. Turning content into XHTML is trivial, and you probably need transformation as well. This is a sign that the technical team is not thinking creatively about migration.
- "Since you'll have to edit the content anyway, you might as well clean up the HTML as well". Remember that editing and QA are different.
- Anything ending in "and you clean up after that". The process should be iterative, not resulting in the editorial teams cleaning up unnecessarily.
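To illustrate the "Description" versus "Overview" point above, here is a minimal sketch of how a script might treat both labels as the same content component. The sample pages, the SUMMARY_LABELS list, and the extract_summary function are all hypothetical, and the sketch assumes the label sits in a heading directly followed by a paragraph; real pages would need their own patterns, and probably a proper HTML parser rather than regular expressions.

```python
import re

# Hypothetical label variants found while analyzing the content.
# Extend this tuple as new variants turn up.
SUMMARY_LABELS = ("Description", "Overview")

# Matches a heading whose text is one of the labels, then captures
# the paragraph that immediately follows it.
SUMMARY_PATTERN = re.compile(
    r"<h[1-6][^>]*>\s*(?:%s)\s*</h[1-6]>\s*<p[^>]*>(.*?)</p>"
    % "|".join(re.escape(label) for label in SUMMARY_LABELS),
    re.IGNORECASE | re.DOTALL,
)


def extract_summary(html):
    """Return the text under a known summary heading, or None if absent."""
    match = SUMMARY_PATTERN.search(html)
    if match is None:
        # No known label: flag the page for manual review instead of guessing.
        return None
    # Strip inline tags and collapse whitespace in the captured paragraph.
    text = re.sub(r"<[^>]+>", "", match.group(1))
    return " ".join(text.split())


# Made-up sample pages showing the two label variants and a miss.
page_a = "<h2>Description</h2><p>A sturdy <b>blue</b> widget.</p>"
page_b = "<h3>Overview</h3>\n<p>A small red widget.</p>"
page_c = "<h2>Specs</h2><p>10 cm wide.</p>"

for page in (page_a, page_b, page_c):
    print(extract_summary(page))
```

Pages that return None are the ones worth discussing in the next iteration: either a new label variant gets added to the list, or those items are handed over for manual migration with the reason stated.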
A Better Planned Migration Approach
Instead of just stumbling into the migration process, I would recommend a more planned approach. Some ways of doing this (also see Web Site Migration Handbook):
- Do a proof of concept and pilot
- Iterate on the migration (so it isn't just two steps of "automating what we can" and then leaving it to the content owners to clean up all sorts of problems)
- Keep estimating the cost of the migration, including both technical and non-technical resources, to keep the dialog going about where the burden will fall
- Ensure that all of these types of issues are talked about when initially working with your development shop, system integrator, or internal technical team to help determine an approach to migration instead of just hoping for the best later
- Define what acceptable quality is
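As a rough illustration of estimating where the burden will fall, the sketch below splits the effort for one batch of content into technical and non-technical hours. The ContentBatch fields, the estimate_hours function, and every number in the example are placeholders invented for illustration, not figures from a real migration; the point is only that the split can be recomputed each iteration as the automated share changes.

```python
from dataclasses import dataclass


@dataclass
class ContentBatch:
    """One group of similar content items (all fields are hypothetical inputs)."""
    name: str
    items: int
    automated_share: float          # fraction of items migrated by script, 0.0 to 1.0
    script_hours: float             # technical effort to build and tune the script
    manual_minutes_per_item: float  # effort to migrate one item by hand
    qa_minutes_per_item: float      # QA effort for every item, automated or not


def estimate_hours(batch):
    """Return estimated hours for the technical and non-technical teams."""
    manual_items = batch.items * (1 - batch.automated_share)
    manual_minutes = manual_items * batch.manual_minutes_per_item
    qa_minutes = batch.items * batch.qa_minutes_per_item
    return {
        "technical": batch.script_hours,
        "non_technical": (manual_minutes + qa_minutes) / 60,
    }


# Placeholder numbers only: 1200 items, three quarters of them automated.
products = ContentBatch(
    name="product pages",
    items=1200,
    automated_share=0.75,
    script_hours=40,
    manual_minutes_per_item=20,
    qa_minutes_per_item=2,
)

print(estimate_hours(products))
```

Re-running the estimate with a different automated_share or script_hours makes the trade-off explicit, which helps keep the conversation about burden going between the teams.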
-----------------
Need help defining your migration process so it will be a reasonable level of effort for everyone involved? Contact David Hobbs Consulting




Comments
Thanks for your comment. I'm an advocate for holistic migration planning, and not for particular tools. So for example the points in the above blog post hold whether you are using a particular tool or custom scripts: you still need to iterate as you go forward rather than one team just shoveling the results of migration to another team. Also, there are a *lot* of tasks that cannot be automated (see http://hobbsontech.com/content/content-migration-what-can-be-automated-a...). Also, any tool or custom scripts still would need to be configured to handle irregularities in the data. As to budgeting, I would argue that by planning more thoroughly in advance these costs aren't hidden, but discussed very early in the process. And, in some cases, that might mean extremely limited automation -- but at least that should be discussed. As to success rate, my main goal is a high quality end result website, and not whether someone automated or not. Unfortunately, clients do not always accept automation, but at least we discuss the pros/cons of different approaches.
Hi David,
I'm curious. Your messaging sounds a lot like the pitch of content migration tool vendors. How are you handling automated content migration? Are you using an off-the-shelf tool or service? How are you defraying the (not insignificant) vendor costs? How do you get your customers to recognize and budget for the (many) hidden costs of content migration?
What arguments are you hearing from your customers against a carefully considered approach to content migration? How do you counter them? What's your success rate?