Doing a Content Inventory
(Or, A Mind-Numbingly Detailed Odyssey Through Your Web Site)

by Jeffrey Veen (slightly adapted by Gisle Hannemyr)
June 18, 2002

I've spent the last year working with clients on a variety of information architecture and redesign problems. One of the most strikingly consistent issues has been how many of these companies still haven't developed content management systems. I've spoken with enterprises in the Fortune 100 who find themselves sitting on top of six years' worth of Web content trapped in static HTML files. They know they need to get this stuff into a database and redesign their site into a template-driven system. But their first question is inevitably, “So, uh, where do we start?”

If you're in a similar situation, your first step is to take stock of what you've got. This process, known as a content inventory, is relatively straightforward: you click through your Web site and record what you find. We've developed a simple Excel spreadsheet to help you structure your findings, and some tips on how to get through it.

Start at your home page. Identify the major sections of your site. For example, at adaptivepath.com, we've divided our site into these sections: team, services, workshops, publications, and contact. If I were doing an inventory of this site, I'd start with one of those sections, click in, and see what's linked from it. For each page that I visit, I'd record the information specified in the columns of the spreadsheet. I'd follow every link and navigate as far as I could through the site, making sure to gather data about every possible page on the site.

[Image: the content inventory spreadsheet in Excel]

Here's a description of the things I put in the inventory:

After you've filled in a couple hundred lines of the spreadsheet, you'll inevitably start to wonder if there is something (anything!) that can speed this process up. Surely technology can come to the rescue. Sorry. The best we've been able to do is enlist the help of a programmer to write us a script that will crawl a Web site and spit out the URLs it finds. If you want something similar, take a look at Arthur de Jong's WebCheck, Tilman Hausherr's Xenu's Link Sleuth, or Leandro H. Fernández' DRKSpider. But a crawler merely ensures that we don't miss any pages. Even with this head start, we always go through the pages by hand.

A content inventory is a decidedly human task. In fact, we find that the process can often be as valuable as the final spreadsheet. If you invest the time in scouring your Web site and deconstructing every page (or at least a good selection of pages), you will end up as the uncontested expert in how it all goes together. And that's invaluable knowledge to possess when redesigning your site.
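If you'd rather see what such a URL-listing script might look like than install one of those tools, here is a minimal sketch in Python. It is not the script our programmer wrote, just an illustration under some assumptions: it uses only the standard library, follows only `<a href>` links, stays on the host of the starting URL, and ignores robots.txt, redirects, and non-HTML files. The names `LinkParser`, `extract_links`, `fetch_page`, and `crawl`, and the `max_pages` cap, are all made up for this example.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(html, base_url):
    """Return absolute URLs linked from one page, with #fragments removed."""
    parser = LinkParser()
    parser.feed(html)
    links = []
    for href in parser.hrefs:
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        links.append(absolute)
    return links


def fetch_page(url):
    """Download a page as text (illustrative default: no retries, no robots.txt)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch=fetch_page, max_pages=500):
    """Breadth-first crawl of one host; returns the URLs that loaded, in visit order."""
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    found = []
    while queue and len(found) < max_pages:
        url = queue.popleft()
        try:
            html = fetch(url)
        except Exception:
            continue  # skip pages that fail to load; a real tool would report them
        found.append(url)
        for link in extract_links(html, url):
            # Stay on the same host and never queue a URL twice.
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return found
```

Calling `crawl("http://adaptivepath.com/")` and printing each returned URL on its own line would give you a list to paste into the first column of the spreadsheet. The `fetch` parameter is there so the crawl logic can be tried against a handful of saved pages before it is pointed at a live site.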

Download content inventory template: MS Excel (42 Kbyte Excel file), OpenOffice.org (17 Kbyte Calc file).

Jeff Veen is the Director of Product Design and a founding partner at Adaptive Path. He specializes in innovative Web design techniques. You can learn more about Jeff at his personal site, Veen.com.

This article is part of an occasional series about techniques for doing user experience work. The previous essay in this series was “Setting Priorities” by Janice Fraser.


Adapted from Doing a Content Inventory by Jeffrey Veen, available (until 2007-04-18) under a Creative Commons BY 1.0 License. This adapted version is available under a Creative Commons BY-SA 3.0 License.
Adaptation: Changed terminology to match M&R (2007), added Calc spreadsheet, replaced dead link, added links to site crawlers.


Original published by Adaptive Path | 363 Brannan St. | San Francisco, CA 9410 | http://adaptivepath.com/