Wednesday, February 9, 2011

Do I need an image sitemap?

Google's sitemap extensions for images allow you to include information about important images on your site. You can include a title, caption, geographic location (where the image was taken) and license information. This provides Google with much-needed information about the content of your images (sometimes called "meta data"). But will your site really benefit from an image sitemap?

The answer is: it depends. If images are integral to your website then yes, there is a benefit to having an image sitemap. However, if your website uses images and graphics for style and navigation only, then you don't need an image sitemap.


How can you tell if images make up an integral part of your website? Ask yourself these questions:
  • If I removed the images from my website and replaced them with text, would my visitors lose information?
  • Does my website include unique images not found anywhere else on the Internet?
  • Does my website have an image gallery of any kind?
  • Do the images on my website show a subject that is often searched for? (Such as a geographic location, a diagram, or images of a product.)
If you answered yes to any of those questions, then chances are you could use an image sitemap. If you answered no to all of them, then a standard XML sitemap is probably all you need.
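
For readers who want to see what an image sitemap looks like, here is a minimal sketch in Python. It assumes the Google image extension namespace and the image:loc, image:caption, image:geo_location, image:title and image:license tags as they were documented when this was written. The function name, the example URLs and the sample values are made up for illustration.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
# Assumption: namespace of Google's image sitemap extension.
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"

# Optional and required image tags, in the order the extension lists them.
IMAGE_TAGS = ("loc", "caption", "geo_location", "title", "license")


def build_image_sitemap(pages):
    """Build sitemap XML from a list of (page_url, [image dicts])."""
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("image", IMAGE_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page_url, images in pages:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page_url
        for img in images:
            node = ET.SubElement(url, f"{{{IMAGE_NS}}}image")
            # Only write the metadata fields that were actually supplied.
            for tag in IMAGE_TAGS:
                if tag in img:
                    ET.SubElement(node, f"{{{IMAGE_NS}}}{tag}").text = img[tag]
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical example data.
example = build_image_sitemap([
    ("http://www.example.com/gallery.html", [
        {
            "loc": "http://www.example.com/photos/harbour.jpg",
            "title": "Harbour at dusk",
            "caption": "Fishing boats in the harbour",
            "geo_location": "Halifax, Nova Scotia",
        },
    ]),
])
print(example)
```

Each image is listed under the page it appears on, which is how the extra information gets attached to the image rather than to a bare image URL.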

Thursday, January 27, 2011

Should I include all content types in my sitemap?

What content should you include in your sitemap? Good question. When the sitemap protocol was first introduced the prevailing wisdom was that you should include all content in your sitemap and let Google sort it out. This meant that everything from your website would be included in your sitemap file (including CSS, images and scripts). But has this ever really made sense? Let's think about it for a second:
  • Google has never indexed CSS and script files (and never will), because they don't contain human-readable content
  • Image URLs on their own provide the images, but no meta data that helps rank the image
  • There are now extensions to the sitemap protocol that support additional content types
After considering these three points it becomes clear that you probably shouldn't include everything in your basic sitemap. Instead, only include URLs that point to HTML and PDF documents (the two document types included in Google's main index).
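
As a rough illustration of that rule, the sketch below filters a list of crawled URLs by their Content-Type header and keeps only HTML and PDF documents. The function name and the sample crawl results are hypothetical, and a real crawler may need to recognise more media types than the two named here.

```python
# Media types kept in the basic sitemap, per the rule above.
INDEXABLE_TYPES = {"text/html", "application/pdf"}


def filter_for_sitemap(crawled):
    """Keep URLs whose Content-Type is HTML or PDF.

    crawled: iterable of (url, content_type_header) pairs.
    """
    keep = []
    for url, content_type in crawled:
        # Drop parameters such as "; charset=utf-8" before comparing.
        media_type = content_type.split(";", 1)[0].strip().lower()
        if media_type in INDEXABLE_TYPES:
            keep.append(url)
    return keep


# Hypothetical crawl results.
crawl = [
    ("http://www.example.com/", "text/html; charset=utf-8"),
    ("http://www.example.com/style.css", "text/css"),
    ("http://www.example.com/app.js", "application/javascript"),
    ("http://www.example.com/logo.png", "image/png"),
    ("http://www.example.com/manual.pdf", "application/pdf"),
]
print(filter_for_sitemap(crawl))
```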

If your website has important images you want indexed, consider creating a Google Image Sitemap. This extension to the basic sitemap protocol allows you to include additional meta data about your website's images (such as a description and caption). Some automated sitemap tools, such as Inspyder Sitemap Creator, can automatically capture this information and include it in your sitemap.

If your website has videos or mobile content, consider creating a Google Video or Google Mobile sitemap too. These additional sitemap types can help get your website ranked in Google's alternate indexes and will provide better results than including all your content in the basic sitemap format.

Friday, January 21, 2011

What are Canonical URLs and do I need them?

In early 2009 Google announced that their crawler would start reading a new "canonical" link tag. This was done to allow webmasters to tell Google the preferred URL to use when indexing a particular page. The idea is that if a single page is accessible through multiple URLs, then only one of those URLs should be included in the Google index. Since Google cannot reliably determine the preferred URL on its own, the "canonical" tag allows the webmaster to tell Google.

The syntax of the tag looks like this:

<link rel="canonical" href="http://www.example.com/product.php?item=swedish-fish" />

The tag tells Google that the URL in the "href" attribute is the preferred address for this page. Google should index the page under that URL rather than under any other URL that happens to reach it.
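
To make this concrete, here is a small sketch of how a crawler could read the tag, using only Python's standard library. The class and function names are our own for illustration, and this is not how any particular crawler or sitemap tool is implemented.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        # rel may hold several space-separated values.
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values and attr.get("href"):
            self.canonical = attr["href"]


def find_canonical(html):
    """Return the canonical URL declared in the page, or None."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


def is_canonical_url(page_url, html):
    """True if the page has no canonical tag or it points at page_url."""
    canonical = find_canonical(html)
    if canonical is None:
        return True
    # Resolve relative hrefs against the page's own URL.
    return urljoin(page_url, canonical) == page_url


page = (
    '<html><head><link rel="canonical" '
    'href="http://www.example.com/product.php?item=swedish-fish" />'
    "</head><body></body></html>"
)
print(find_canonical(page))
```

A page fetched through some other URL (for example with an extra tracking parameter) would fail the is_canonical_url check, which is the signal a sitemap tool could use to leave that URL out.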

This tag creates a lot of confusion among webmasters. Here are some tips to help you determine whether you need it:
  • If your website uses a large number of dynamically generated URL parameters, you may need to use canonical URL tags
  • If your website uses multiple domain names, inconsistent casing in URLs, or has a lot of URL rewriting, you may need to use canonical URL tags
  • If in doubt, your website probably doesn't need the canonical URL tag
The last point is the most important: The reality is that most sites do not need to use the canonical URL tag. It's only for websites that have a large number of URLs which may be used to access the same content. If your website is one of these you will see duplicate content warnings in the Google Webmaster interface.

One last note: If you are using canonical URLs on your website, Inspyder Sitemap Creator is the only sitemap generator that fully supports this tag. It will only include your canonical URLs (if present) in your sitemap. Sitemap Creator also logs when it encounters canonical URLs, making it easy to troubleshoot crawling issues if you've made a mistake.

Monday, January 10, 2011

What does it mean when I "Ping" my Sitemap?

Hundreds of thousands of websites use sitemap files, so how can Google tell when yours changes? Google can periodically check your sitemap file, but wouldn't it be nice if there were a faster, easier way for Google to find out? It turns out there is: Pinging.

A Sitemap "ping" is a special request you send to Google (or Yahoo or Bing) to let them know that you've made a change to your sitemap. Accepting "Ping" messages is faster and easier for Google than repeatedly checking every sitemap for changes, because it relies on responsible webmasters to report when their sitemaps have changed.


In our experience you'll need to create your webmaster account before trying to Ping. Your account lets you make your first sitemap submission manually. After that the ping interface will handle it. To make your first manual submission, create an account with each search engine.

Once you're verified and your sitemap has been successfully submitted and accepted the first time, you're ready to use the Ping interface. To ping, all you need to do is load the following URLs in your browser:
  • http://www.bing.com/webmaster/ping.aspx?siteMap=http%3A//www.example.com/sitemap.xml
  • http://search.yahooapis.com/SiteExplorerService/V1/updateNotification?url=http%3A//www.example.com/sitemap.xml
  • http://www.google.com/webmasters/sitemaps/ping?sitemap=http%3A//www.example.com/sitemap.xml
(Remember to replace www.example.com/sitemap.xml with your server name and path to your sitemap file.)
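
If you'd rather build those URLs in a script than by hand, the sketch below shows one way to do it in Python. The endpoint addresses are copied from the list above and have not been checked for availability. The function name is our own, and the code only builds the URLs; it does not send any request.

```python
from urllib.parse import quote

# Ping endpoints exactly as listed above.
PING_ENDPOINTS = {
    "bing": "http://www.bing.com/webmaster/ping.aspx?siteMap=",
    "yahoo": "http://search.yahooapis.com/SiteExplorerService/V1/updateNotification?url=",
    "google": "http://www.google.com/webmasters/sitemaps/ping?sitemap=",
}


def build_ping_urls(sitemap_url):
    """Return a dict of search engine name -> ping URL for sitemap_url."""
    # Percent-encode the sitemap address so it is safe inside a query
    # string. Slashes are left as-is, matching the examples above.
    encoded = quote(sitemap_url, safe="/")
    return {name: base + encoded for name, base in PING_ENDPOINTS.items()}


urls = build_ping_urls("http://www.example.com/sitemap.xml")
print(urls["google"])
# prints http://www.google.com/webmasters/sitemaps/ping?sitemap=http%3A//www.example.com/sitemap.xml
```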

You can automate this whole process using sitemap software, including Inspyder Sitemap Creator. Inspyder Sitemap Creator can automatically crawl, upload and ping Google for you, making the entire process fast and easy.

Pinging won't force Google to read your sitemap file. It still may take a few hours for them to come around and check for changes. Remember, a lot of webmasters are submitting sitemaps all the time so you might need to wait for a bit.

Properly used, pinging can save you a lot of time and help get your website changes indexed by Google just a little bit faster.

Sunday, January 2, 2011

How Often Should I Update My Sitemap?

You only need to update your sitemap when you make material changes to the content on your website. These types of changes include adding or removing pages, or making significant content changes to existing pages.

You do not need to update your sitemap if you haven't made any changes to your site, or if you change non-content parts of your site (such as navigational elements).
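
One way to apply this rule automatically is to keep a fingerprint of each page's main content and compare it with the previous crawl. The sketch below is only an illustration of that idea. It assumes you have already stripped navigation and other non-content parts from each page's text, and all function names and sample pages are hypothetical.

```python
import hashlib


def snapshot(pages):
    """Map each URL to a hash of its main content text.

    pages: dict of url -> content text (navigation already removed).
    """
    return {
        url: hashlib.sha256(text.encode("utf-8")).hexdigest()
        for url, text in pages.items()
    }


def diff_snapshots(previous, current):
    """Return (added, removed, changed) sets of URLs."""
    added = set(current) - set(previous)
    removed = set(previous) - set(current)
    changed = {
        url for url in set(current) & set(previous)
        if current[url] != previous[url]
    }
    return added, removed, changed


def sitemap_needs_update(previous, current):
    """True if any page was added, removed, or had its content changed."""
    return any(diff_snapshots(previous, current))


last_week = snapshot({"/": "Welcome", "/about": "About us"})
this_week = snapshot({"/": "Welcome", "/about": "About us", "/news": "New page"})
print(sitemap_needs_update(last_week, this_week))
```

Note that a hash flags any change to the text, however small, so a real tool would probably want a looser test for what counts as a significant change.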

While there is no penalty for submitting your sitemap multiple times with no changes, there is also no benefit. If you know you make several changes to your website throughout the course of a week, then updating and resubmitting your sitemap weekly will be adequate. It's not necessary to regenerate your sitemap every single day (in our experience Google just doesn't index most sites that quickly).

If in doubt, remember that your sitemap assists Google in indexing your site. Your sitemap doesn't force Google to index your website, and it doesn't prevent Google from indexing new pages.

Tuesday, December 21, 2010

When do I need a Sitemap Index?

Many sitemap tools (including our Inspyder Sitemap Creator) allow you to create a sitemap index file, but when do you actually need one? The answer is simple: If your sitemap contains over 50,000 pages then you must split your sitemap and use a sitemap index file.

The sitemap protocol specification states that each individual sitemap file can contain at most 50,000 URLs. If your website is made up of more than 50,000 pages, then you must create multiple sitemap files. The "sitemap index" is another file that contains links to the individual sitemaps.

For example, if your website contains 125,000 URLs and you're creating a sitemap, you will need at least 4 files:

  • sitemap1.xml - Contains URLs 1 to 50,000
  • sitemap2.xml - Contains URLs 50,001 to 100,000
  • sitemap3.xml - Contains URLs 100,001 to 125,000
  • sitemap.xml - This will be your sitemap index, and contains the URLs of sitemap1.xml, sitemap2.xml and sitemap3.xml.
Using an automated tool to create these files is the easiest approach. Inspyder Sitemap Creator can automatically split your sitemap into multiple files and generate your sitemap index when you check "Create Sitemap Index". You can also create smaller individual sitemap files by adjusting the "Sitemap Size Limit" value.

If your website has fewer than 50,000 URLs, then there is no benefit to creating a sitemap index. A single sitemap file is all you need.
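
For the curious, the splitting arithmetic from the 125,000 URL example can be sketched in a few lines of Python. The file naming (sitemap1.xml, sitemap2.xml and so on) follows the example above, while the function names and the base URL are assumptions made for illustration.

```python
import math
from xml.sax.saxutils import escape

MAX_URLS_PER_SITEMAP = 50_000  # limit from the sitemap protocol


def plan_sitemap_files(url_count, limit=MAX_URLS_PER_SITEMAP):
    """Return (filename, first_url, last_url) per file, 1-based inclusive."""
    files = []
    for i in range(math.ceil(url_count / limit)):
        first = i * limit + 1
        last = min((i + 1) * limit, url_count)
        files.append((f"sitemap{i + 1}.xml", first, last))
    return files


def build_sitemap_index(base_url, filenames):
    """Return sitemap index XML that links to each sitemap file."""
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ]
    for name in filenames:
        lines.append(f"  <sitemap><loc>{escape(base_url + name)}</loc></sitemap>")
    lines.append("</sitemapindex>")
    return "\n".join(lines)


plan = plan_sitemap_files(125_000)
for filename, first, last in plan:
    print(f"{filename} - Contains URLs {first:,} to {last:,}")

# Hypothetical base URL; the result would be saved as sitemap.xml.
print(build_sitemap_index("http://www.example.com/", [name for name, _, _ in plan]))
```

With fewer than 50,000 URLs the plan comes back with a single file, which matches the advice above: no index needed.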

Monday, December 13, 2010

What is an "ROR" sitemap? Do I need one?

An "ROR" feed is a "Resource of Resources". This is a container format that was originally proposed to hold a variety of website data, including a sitemap format. It appears to have been proposed by http://www.rorweb.com/, but this site now appears defunct and the format never caught on as a web standard.

Today, no search engines officially (or unofficially as far as we can tell) accept or use ROR sitemaps. So there is no reason to have this sitemap file on your website. Having it will make no difference from an SEO or website ranking perspective.