In previous installments of this Rethinking the Content Inventory series we've covered inventorying content and sites. As many sites become more topic-driven, another crucial slice is by topic: an inventory of every topic along with relevant metrics. Take this example of a site that is organized by type of bird:
| Topic | Content Count (Supply Side) | Has Description? (Supply Side) | Pageviews last month (Demand Side) |
| --- | --- | --- | --- |
| Cardinals | 0 | Yes | 1 |
| Robins | 1,000 | No | 100 |
| Seagulls | 2,000 | Yes | 5,000 |
| Starlings | 1 | Yes | 1,000 |
| Woodpeckers | 10,000 | No | 20,000 |
| Wrens | 500 | Yes | 20 |
The columns in this example are the topic name, the number of content items tagged to that topic, whether or not the topic has a contextual description for visitors, and how many pageviews there have been on the topic page in the last month. In this example, we could use this topic inventory to make a variety of useful observations:
A lot of content is published (and categorized) to the woodpecker subject, and has relatively high pageviews to show it. That said, we are missing an opportunity with such an important topic (for this site) in not providing the context of a description for the topic page.
Cardinals as a topic should probably be dropped.
More should be published on Starlings, since there are high page views based on only one piece of content.
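The observations above can be sketched as a few simple rules run over the inventory. This is only a minimal illustration in Python: the `Topic` class, the `flag_topic` function, and every threshold below are assumptions made up for this example, not part of any particular tool, and real cutoffs would depend on your site.

```python
from dataclasses import dataclass


@dataclass
class Topic:
    name: str
    content_count: int          # supply side
    has_description: bool       # supply side
    pageviews_last_month: int   # demand side


# The example bird-site inventory from the table above.
TOPICS = [
    Topic("Cardinals", 0, True, 1),
    Topic("Robins", 1000, False, 100),
    Topic("Seagulls", 2000, True, 5000),
    Topic("Starlings", 1, True, 1000),
    Topic("Woodpeckers", 10000, False, 20000),
    Topic("Wrens", 500, True, 20),
]

# Hypothetical thresholds, chosen only to illustrate the idea.
HIGH_TRAFFIC = 10_000          # pageviews that make a topic "important"
LOW_TRAFFIC = 10               # pageviews below which a topic is nearly unused
THIN_CONTENT = 5               # content items at or below which a topic is thin
HIGH_DEMAND_PER_ITEM = 100     # pageviews per content item suggesting unmet demand


def flag_topic(topic):
    """Return a list of suggested actions for one topic."""
    flags = []
    if topic.content_count == 0 and topic.pageviews_last_month < LOW_TRAFFIC:
        flags.append("consider dropping")
    if topic.pageviews_last_month >= HIGH_TRAFFIC and not topic.has_description:
        flags.append("add description")
    if (0 < topic.content_count <= THIN_CONTENT
            and topic.pageviews_last_month / topic.content_count >= HIGH_DEMAND_PER_ITEM):
        flags.append("publish more")
    return flags


report = {topic.name: flag_topic(topic) for topic in TOPICS}

for name, flags in report.items():
    print(name, "->", ", ".join(flags) if flags else "no action")
```

With these assumed thresholds, the rules flag the same three topics called out above: Cardinals, Starlings, and Woodpeckers.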
Studying the topic inventory is interesting because:
This can affect publishing schedules, pushing editorial teams to publish to keep quality high.
It provides feedback on which topics are most interesting to readers (other, more sophisticated measures can be brought to bear, such as how "evergreen" the content on that topic is).
Topics pages in particular can erode quickly in quality, and this gives a mechanism to monitor them.
In a migration, topics pages can sometimes be like "ghost" pages, assumed out of migration discussions since they will be "automatic." That said, a migration offers opportunities to cut underperforming topics, just like unneeded content.
As discussed in Sources of Data, one key to an inventory is choosing the data sources that supply the information it needs. In this case, we'll cover information from the Origin (the CMS) and Usage (how content is used by visitors).
Submitted by David Hobbs on 8 February 2012 - 1:42pm
Bill Trevor was the Project Manager who led the Mass.Gov effort to replace the Web Content Management System, visual design and navigation. He is a consultant specializing in Information Architecture, Website Migration and Optimization, Project Management and Social Media Marketing. His views are his own and in no way represent the views of Mass.Gov or the Commonwealth.
You migrated 26 sites and around 700,000 content items into your new CMS, and some of the sites were previously on other platforms. How did you coordinate with all of the stakeholders? What were the initial discussions like, and how did you stay engaged throughout the process?
With any project (especially one of this size) communication is key. We held frequent stakeholder meetings at both the migration coordinator and senior management levels for each of the 26 sites. We formed a "migration liaison team" that included one representative from each site to ensure information was broadly communicated. We also leveraged a wiki so stakeholders could receive notifications if there were any updates related to the migrations.
A primary goal of the project was to maintain one platform that easily presents a single view for web users looking for information on Massachusetts government. Was that goal met? Why was a new platform needed to make that happen? How did you convince people to move to a single platform, and how much variance did you allow between websites?
Since its inception, Mass.Gov has been focused on maintaining a single face of government. The goal of this mantra is to create a state website that constituents feel comfortable navigating around, no matter the agency/department providing the information they seek. Too many state websites have a "single facing" homepage only to disperse into as many different websites as there are agencies. Mass.Gov takes the opposite approach and as far as I can tell, is doing it the best. The new Content Management System (CMS) will enable Mass.Gov to keep that guiding principle true while allowing some flexibility to content authors to slightly alter their web pages without losing that single face. Constituents want to be confident that the site they are visiting is an official state sanctioned site and Mass.Gov makes that happen. The Commonwealth of Massachusetts continues to pursue the goal of IT consolidation. Why not offer a single enterprise level CMS instead of having 160+ different systems, visual designs and site navigation structures? This website consolidation has been in the works for many years (termed Portalization) and the new CMS will allow Mass.Gov to continue the push to bring more sites onboard.
How different is the experience now for web users looking for information from Massachusetts government?
While it may have been an ambitious project, Mass.Gov saw the replacement of the CMS as an opportunity to also update the visual design (4+ years old) and the navigation schema (7+ years old). The prior versions of the Mass.Gov site templates were very rigid, narrow and leveraged the old Yahoo category navigation schema. You know, the one where you saw topics, clicked them, saw sub-topics, clicked them and so on and so forth. Mass.Gov has introduced a modern mega-drop down, in the same style of ESPN or Target, quickly exposing level 2 and 3 topics from the banner on every page. Another addition is a similarly fashioned left navigation that allows users to expand a drop down menu from the left column to get a peek at what content lies beneath that category. These features reduce "number of clicks" and help to flatten the information architecture. We put a new "modern minimalist" design on top which offers more space for agencies to showcase content along with an easy to maintain slideshow template and a wider page layout.
How did you develop content inventories, and what did you discover when you did them?
We used everything we could find. Some agencies kept good inventories of their own and we leveraged those. We also used free tools like Xenu to spider sites and export the findings into Excel to obtain a comprehensive view. For better or worse, the old CMS was a very linear tool and so it was a lot of work but attainable to export the navigation structure from the old site and dump that into the new tool to build the underlying folder structure. One thing to note: most agencies saw simply moving their existing content/navigation into the new CMS as such a large project that few took the opportunity to redo their IA. We did, however, try to get agencies to see the value in a pre-migration ROT (redundant, outdated and trivial) analysis because the less content you migrate, the less you have to QA/tweak.
What simplifications did you make to make the project a success?
Keeping the scope in focus and using it as the barometer for any requested change. We used our daily scrum (15 minutes) to update each team member on what we accomplished yesterday, tasks for the current day and to discuss any issues blocking progress. These meetings kept team members honest and ensured everyone was on the same page. I also cannot stress enough the importance of a parking lot page and Executive Sponsorship. We had great support from the upper management circles, who really listened when something came up that might derail us and helped to determine an effective resolution without busting scope.
How much of the migration was automated and how much was manual?
We had really great partners from the CMS vendor (Percussion) and a rocket scientist (well he was considering becoming one) who did the automated migration. Again, we did a lot of research and heard the horror stories about how poorly content migrations ended up but I can say that the automated portion of the project *fairly* cleanly migrated 85% of the agency content. This included documents and images. There was some content that may have gotten lost in translation but the vendor did a great job translating the old to the new. It was no small feat and only caused minor delays. We knew the cleaner the content was after it migrated, the less work the agencies would need to do prior to their site going live.
How much was dropped from the old sites?
Because the old tool had a separate navigation component, Mass.Gov knew exactly what was live on the old website when we took our final content snapshot. Mass.Gov migrated every web page that was live at the time of the snapshot and the number of pages dropped due to issues was probably less than 2% of 400,000 pages. This was most likely due to malformation in the page code.
How did you track progress as you went forward?
We used every resource we had at our disposal! Scrum was a tremendous help. Our main tracking methods were MS Project, our Wiki and Sharepoint. We leveraged MS Project to track milestones, resources, critical path and high level project points to senior management. Our Wiki was the main communication mechanism for our stakeholders as they were able to comment, ask questions and view schedules/timelines in real time as the project evolved. Internally, we leveraged Sharepoint as a means to track issues / bugs / fixes during the configuration and implementation of the CMS.
The old saying goes something like "You have your choice of schedule, scope, and cost -- pick any two." How did you do against schedule, scope, and cost?
Scope was always a challenge but we kept focus via daily scrum meetings and constant sessions with our stakeholders. The schedule did take some hits to ensure the product was scaled appropriately and that the content was migrated as cleanly as possible. This did not result in a significant delay and by the time we launched the first websites on the new platform, we were within the margin of error for the original project plan we had drafted some two years prior.
If you could do it over again, what would you change?
If time was not a factor (we were dealing with fiscal year funding deadlines at times) I would have preferred to have spent more time analyzing website Information Architecture. Due to the enormity of migrating to a new toolset (fear of the learning curve) we recommended that agency staff focus on QA'ing the content that migrated and getting comfortable with the CMS. I still believe that this was the right decision but some sites are now looking to overhaul their IA.
What were the biggest constraints you had, and how did you overcome them?
I think the biggest challenge is that Mass.Gov is the top level website and maintains the CMS tool but does not control the governance model used by the 26 sites. While not a bad thing, it was challenging to get agreement and consensus on some aspects. In the end, we had a very strong communication model that served us and our stakeholders well. Absent this, we might still be migrating websites.
How is Mass.gov set up for ongoing management of the site? How many sites are still on the horizon to be moved to the new platform?
Mass.Gov took great care to develop authoring documentation and posted it to the Wiki. This way, it is available 24/7 and is continually updated by the team. Gone are the days of printing out a training manual as the minute it is printed, some aspect is out-of-date. Mass.Gov is also working to make short video tutorials that authors can watch to see click by click how different content components are created. Now that the CMS is in production and the original sites have launched, the next phase has commenced and "portalization" of state entities not on the CMS platform has begun. There are several large sites outside the Mass.Gov platform and the hope is to migrate as many as are willing to leverage this enterprise CMS and join their fellow agencies in expanding the single face of government.
--------------------------
Are you a website owner with a migration success (or failure)? HobbsOnTech will be featuring interviews like this one to help provide a repository of experiences that others can draw upon. Please contact us if you would like to participate.
Submitted by David Hobbs on 6 February 2012 - 12:50pm
Any website needs focus to achieve ongoing quality, and keeping this focus is a way of avoiding the redesign-forget-redesign cycle. Once you have a sizeable website, you have many voices (for example the owners of different subsites or sections) competing for changes to the website and underlying CMS. You need a way of product managing the implementation, so that you have a productive way of getting feedback. Without this, you could wind up with an unsustainable website, catering to the whims of the loudest stakeholders.
Organizations are tempted to take one of two approaches that get them trapped in a maze:
1. Don't engage (otherwise known as "they first have to give us their requirements") — NOT recommended
The CYA method of the central web team saying "first they have to give us their requirements" is an almost sure sign that there isn't engagement or focus. More sophisticated stakeholders may "win" in this environment, but this will also mean that items without an organization-wide priority will probably be implemented. On the face of it, this does seem like a reasonable approach because after all the central team needs to know the requirements, and who better to provide them than the team needing the requirement? But this overlooks the fact that individual teams probably have many non-web responsibilities, different groups don't know what other groups are up to, and also they do not understand the technical impacts of different requirements (see Developers, Don't Miss The Opportunity). Note that this is related to the approach of doing what stakeholders can pay for, which probably is not good for the organization overall (see You're Willing to Pay for the Feature? So What?).
2. Aimless engagement — NOT recommended
So if we're going to have a more active relationship between teams, then of course you want to go out and talk to stakeholders. But one thing to avoid is having this become aimless engagement, without specific goals and context. This problem can be made even worse if this is a big-bang, one-time engagement rather than setting things up for ongoing engagement. Some problems with aimless engagement are:
Wasted energy on details that will no longer be a problem in the future. Stakeholders naturally talk about the thorns they bump against in their day-to-day use of their systems. But assuming you are trying to fundamentally change the way the system works, many of these issues may be irrelevant in the new system.
Poor expectations-setting about the future. Unless the context is specifically set, stakeholders will expect that the issues they raise will be resolved. And, as mentioned in the above bullet point, some issues may not even be relevant in the new system. Beyond that, you may wind up with the anchor of a long laundry list of issues, rather than a focused exploration of underlying required capabilities.
Lost educational opportunity. Engagement should always be a two-way street, and one area in particular that we should focus on is educating the stakeholders (after all, they are educating the core team about their needs). For instance, if you are talking about moving away from Dreamweaver to a CMS, then the current site owners will need to be educated about a CMS.
Instead, I would propose the following approach:
3. Focused engagement — do this!
For focused engagement to work, the following must be in place:
Overall objectives set and understood by all.
Iterative / responsive / ongoing.
Process understood (this does not mean heavyweight).
Everyone has input, and can see the status of their requests.
--------------------
I'll be writing more about defining requirements and also improving engagement with internal teams. In the meantime, you can download this new report: Better Engagement for Ongoing Website Improvements.
Content inventories are often considered just long lists of content. In fact, the top Google.com search result on "content inventory" is still the 2002 Adaptive Path article calling them "a mind-numbingly detailed odyssey". But sites now are often getting way too complex (and big) to plod through every entry.
Many web presences have multiple sites or perhaps subsites or major sections. A multinational consumer product company has sites per country and / or product. Advocacy organizations may have a site per initiative, and news sites will also be broken down into primary sections like Sports.
So instead of a mind-numbing list, your content inventory could be grouped to come up with site inventories like this:
| Subsite | Pages | New Template (percentage of pages using most recent corporate-approved Dreamweaver template) | Popularity (percentage of pages that have received less than five pageviews in the last month) |
| --- | --- | --- | --- |
| Sports | 5000 | 100% | 50% |
| Celebrity | 1000 | 90% | 50% |
| Europe | 1000 | 90% | 90% |
| World | 2000 | 10% | 90% |
| Politics | 3000 | 50% | 50% |
| Weather | 6000 | 100% | 10% |
You'll notice that this type of report tells you a lot of information that probably would not be obvious when poking around a laundry list content inventory. For example, you see:
Which sites are small, on an old template, and have a high percentage of unpopular content -- potentially the entire subsite could be dropped, for example
Which sites are completely using your newest template -- these could potentially be the first to migrate in a migration project
What sites are large and in the new template but with a lot of unpopular content -- these may be ripe for a new publishing strategy
Part of the point of the site inventory is that it is combining information from multiple sources (the above example table lists some possibilities but there are many more). Obviously, you could look at your analytics to just see the total page views for a site, or even the % of pages on a site that have under a threshold of pageviews per month. But it gets much more interesting when you combine the information, especially when you are considering phasing changes to your site. For example, you could migrate all sites that are 100% in the new template first, then in the next phase move those that have 90%+ in the new template.
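As a rough sketch of that kind of phasing, the Python below groups the subsites from the example table by the share of their pages already on the new template. The function names are hypothetical, and lumping everything under 90% into a third phase is my assumption for illustration; a real plan would weigh popularity, size, and other factors too.

```python
# Percentage of pages on the new template, taken from the example table.
SITE_INVENTORY = {
    "Sports": 100,
    "Celebrity": 90,
    "Europe": 90,
    "World": 10,
    "Politics": 50,
    "Weather": 100,
}


def migration_phase(pct_new_template):
    """Assign a phase: 100% first, 90%+ second, everything else later (assumed)."""
    if pct_new_template == 100:
        return 1
    if pct_new_template >= 90:
        return 2
    return 3


def plan_phases(inventory):
    """Group subsite names by phase, with phases and names in sorted order."""
    phases = {}
    for subsite, pct in inventory.items():
        phases.setdefault(migration_phase(pct), []).append(subsite)
    return {phase: sorted(names) for phase, names in sorted(phases.items())}


plan = plan_phases(SITE_INVENTORY)

for phase, subsites in plan.items():
    print(f"Phase {phase}: {', '.join(subsites)}")
```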
A site inventory view into your content inventory of course isn't the only view you need to take. You need to look specifically at high-value content, and may need to slice and dice the results in different ways (such as by content type). But for complex rollout planning or other broad analysis, especially for very large sites which an organization doesn't have a solid handle on, site inventories can help drive decisions.
Note that the site inventory can just be derived from a larger content inventory. For example, if your content inventory has fewer than a million items, then you can use an Excel pivot table to aggregate the information (and other tools could be used for larger sets).
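For sets that outgrow a spreadsheet, the same roll-up can be scripted. Here is a minimal sketch in Python, assuming each content inventory row already records its subsite, whether it uses the new template, and its pageviews for the last month. The field names, the sample rows, and the `build_site_inventory` function are all invented for illustration.

```python
from collections import defaultdict

# Hypothetical content inventory rows; a real one would come from the CMS
# and analytics exports rather than being typed in by hand.
CONTENT_INVENTORY = [
    {"url": "/sports/a", "subsite": "Sports", "new_template": True, "pageviews": 120},
    {"url": "/sports/b", "subsite": "Sports", "new_template": True, "pageviews": 2},
    {"url": "/world/a", "subsite": "World", "new_template": False, "pageviews": 0},
    {"url": "/world/b", "subsite": "World", "new_template": True, "pageviews": 3},
    {"url": "/world/c", "subsite": "World", "new_template": False, "pageviews": 40},
    {"url": "/world/d", "subsite": "World", "new_template": False, "pageviews": 1},
]

# "Unpopular" means fewer than five pageviews in the last month, as in the table.
LOW_PAGEVIEWS = 5


def build_site_inventory(rows):
    """Aggregate page-level rows into one summary record per subsite."""
    totals = defaultdict(lambda: {"pages": 0, "new_template": 0, "unpopular": 0})
    for row in rows:
        site = totals[row["subsite"]]
        site["pages"] += 1
        site["new_template"] += 1 if row["new_template"] else 0
        site["unpopular"] += 1 if row["pageviews"] < LOW_PAGEVIEWS else 0
    return {
        name: {
            "pages": t["pages"],
            "pct_new_template": round(100 * t["new_template"] / t["pages"]),
            "pct_unpopular": round(100 * t["unpopular"] / t["pages"]),
        }
        for name, t in totals.items()
    }


site_inventory = build_site_inventory(CONTENT_INVENTORY)

for name, summary in site_inventory.items():
    print(name, summary)
```

Each summary record corresponds to one row of a site inventory like the table above, so adding another metric means adding one more counter to the loop.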
Any CMS implementation with a large number of stakeholders will generate far more requests than could ever be implemented. So how do you determine what gets onto the work program and then implemented?
There are several streams where potential features could get added to the system:
User requests (either internal or external)
Stability / security
Performance / scalability
Hardware / infrastructure
Long range requirements
Even an important user request may not make it onto your near-term work program if, for example, there is a severe stability issue that needs to be addressed quickly. So there is something of an air gap between the requests coming in and what winds up on the work program. Managing that gap is one of the most important roles of the product manager, who needs to consider the following factors.
Backend needs
As mentioned above, key security, scalability, performance, or manageability issues may trump any other requests. Notably, these may be items that your key stakeholders may not be aware of, much less actually requesting them.
Batching
Your near-term work program needs to be consistent, meaningfully grouping changes. This is both for communications and technical reasons. From a communications perspective, if the items being addressed are consistent then everyone can quickly understand what's happening. From a technical perspective, if you need to completely rewrite a subsystem to address an issue then it might make sense to address many of the issues with that subsystem at once.
Resource constraints
Obviously you need to have the resources to implement your work program. This applies both to the raw number of hours and to balancing key resources. For instance, if your DBA is required for many requests then you may only be able to take on a few of those DBA-dependent requests at once.
Popularity
Of course, the popularity of a request is very important (but, again, not the only factor). All other factors being equal, a more popular item should be placed on the work program before others. In practice, it may make sense to delay your top-requested feature in order to address the next three (for instance if the next three could be implemented in the same effort as the first).
the length of time that an issue has been discussed
In the end, the product manager must be able to justify the work program to the stakeholders. Although clearly more of an art than a science, the factors (and non-factors) should help in forming that solid work program.