A web crawler creates an inventory of all pages on a site by systematically visiting every link. Web crawlers are often referred to as "spiders" because they, you know, crawl webs.
Spiders use a recursive algorithm that looks something like this:
step 1) crawl page
step 2) record page info
step 3) get all links to other pages
step 4) start at step 1 for each linked page that was found and hasn't already been crawled
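To make those steps concrete, here is a minimal sketch in Python. It is not how any particular tool works; it just mirrors the four steps. The `fetch` argument is a hypothetical function you supply that takes a URL and returns the page's HTML (or None if the page couldn't be loaded), and the sketch assumes you only want to follow links on the same host.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkParser(HTMLParser):
    """Collects the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def crawl(url, fetch, inventory=None):
    """Recursively visit url and every same-host page it links to.

    fetch is a hypothetical callable: it takes a URL and returns the
    page's HTML as a string, or None if the page could not be loaded.
    Returns a dict mapping each URL to True (loaded) or False (broken).
    """
    if inventory is None:
        inventory = {}
    if url in inventory:                      # already crawled; prevents loops
        return inventory

    html = fetch(url)                         # step 1: crawl page
    inventory[url] = html is not None         # step 2: record page info
    if html is None:
        return inventory

    parser = LinkParser()
    parser.feed(html)                         # step 3: get all links
    for href in parser.links:
        # Resolve relative links and drop any #fragment.
        link = urldefrag(urljoin(url, href)).url
        if urlparse(link).netloc == urlparse(url).netloc:
            crawl(link, fetch, inventory)     # step 4: repeat for each link
    return inventory
```

Any URL that ends up mapped to False is a candidate broken link. On a big site a truly recursive version like this can hit Python's recursion limit, so a real spider would more likely keep a queue of pages still to visit.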
The most famous spiders are the ones used by Google and other search engines to index the web. In offices like ours, spiders are most commonly used to check for broken links and to generate XML site maps so that Google can better find your stuff.
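If you're curious what an XML site map actually looks like, here is a rough sketch that builds one from a list of page URLs using Python's standard library. The function name `build_sitemap` is made up for illustration, and the sketch only writes the required `loc` entry for each page, leaving out optional fields such as the last-modified date.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in sorted(urls):
        entry = ET.SubElement(urlset, "url")
        # ElementTree escapes characters like & in the URL for us.
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    print(build_sitemap(["http://example.com/", "http://example.com/about"]))
```

You would save the result as something like sitemap.xml at the root of your site and point the search engines at it.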
The de facto standard spidering tool is Xenu's Link Sleuth. It's free and has a ton of features, but it doesn't work on OS X or Windows 7. (Edit: Windows 7 is now supported. See the comments below.)
Dan recently upgraded to Windows 7, so he now uses Visual Web Spider. Visual Web Spider has a few extra features over Xenu that make it a bit more useful for SEO purposes: namely the ability to pull out the content found in header, page title, and image tags. Visual Web Spider costs $75.
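To give a feel for what pulling out that content involves, here is a small sketch using Python's built-in HTML parser. This is not how Visual Web Spider does it; the class name `SeoParser` is invented for the example, and it assumes "header" means the h1 through h6 heading tags.

```python
from html.parser import HTMLParser

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class SeoParser(HTMLParser):
    """Collects the page title, heading text, and image src/alt pairs."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.headings = []    # list of (tag, text) tuples
        self.images = []      # list of (src, alt) tuples
        self._current = None  # title or heading tag we are inside, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" or tag in HEADINGS:
            self._current, self._text = tag, []
        elif tag == "img":
            attrs = dict(attrs)
            self.images.append((attrs.get("src"), attrs.get("alt")))

    def handle_data(self, data):
        if self._current:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == self._current:
            # Collapse runs of whitespace into single spaces.
            text = " ".join("".join(self._text).split())
            if tag == "title":
                self.title = text
            else:
                self.headings.append((tag, text))
            self._current = None


if __name__ == "__main__":
    parser = SeoParser()
    parser.feed("<title>Home</title><h1>Welcome</h1><img src='logo.png'>")
    parser.close()
    print(parser.title, parser.headings, parser.images)
```

An image whose alt comes back as None is one with no alt attribute at all, which is the kind of thing worth flagging in an SEO review.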
Dan also recommends GSiteCrawler, which is geared specifically towards creating site maps. He says "It's also free and is a very useful tool, but some find it a little quirky to get used to." GSiteCrawler also only works on Windows machines.
For Mac users, the closest thing to Xenu is Integrity. It's not as full-featured as Xenu, but it will get you 80% of the way there.
Make a habit of spidering your site every month to ensure that you don't have links to nowhere. You may be surprised what else you discover.
Published by: Greg Baugues in Business