I previously posted how to create a Sitemap file for a single website; now we are going to learn how to make a Sitemap index file that ties multiple Sitemaps together. If you are not an advanced user or don't know how to make a Sitemap.xml file, I recommend reading my Sitemap XML post before starting this one.
You don't have to handle each sitemap file on its own: we can make just one Sitemap index file that lists multiple sitemaps. A sitemap can't include more than 50,000 URLs, and its file size can't be larger than 10 MB when uncompressed. So if you have more than 50,000 URLs, or you expect the file to be larger than 10 MB, you can split the URLs across several sitemaps and create a Sitemap index file that points to all of them. The only limitation is that a Sitemap index file may not list more than 1,000 Sitemaps. (The current sitemaps.org protocol has since raised these limits to 50 MB uncompressed per file and 50,000 Sitemaps per index.)
The XML format of a Sitemap index file is very similar to the XML format of a Sitemap file. The Sitemap index file uses only four XML tags: <sitemapindex> (required, the parent tag that wraps the whole file), <sitemap> (required, one for each sitemap listed), <loc> (required, the location of the sitemap) and <lastmod> (optional, the last time the corresponding sitemap was modified; the value should be in W3C Datetime format, for example YYYY-MM-DD or a full timestamp with a time zone offset).
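For comparison, here is roughly what one of the regular Sitemap files listed in an index might look like. This is only a sketch: the file name and the page URLs are made up for illustration. Notice that a Sitemap file wraps <url> entries in <urlset>, while an index wraps <sitemap> entries in <sitemapindex>; <loc> and <lastmod> work the same way in both.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical contents of sitemap1.xml; the page URLs are assumptions -->
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <!-- required: the address of one page -->
    <loc>http://www.frihost.com/</loc>
    <!-- optional: when this page was last modified -->
    <lastmod>2008-03-12</lastmod>
  </url>
  <url>
    <loc>http://www.frihost.com/forums/</loc>
  </url>
</urlset>
```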
A Sitemap index file can only specify Sitemaps that are found on the same site as the Sitemap index file. For example, http://www.frihost.com/sitemap_index.xml can include Sitemaps on http://www.frihost.com but not on http://www.bullshit.com or http://bullshit.forlife.com. As with Sitemaps, your Sitemap index file must be UTF-8 encoded.
Here is a Sitemap index file in XML format that lists two sitemaps:
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<sitemap>
<loc>http://www.frihost.com/sitemap1.xml.gz</loc>
<lastmod>2008-03-12T18:23:17+00:00</lastmod>
</sitemap>
<sitemap>
<loc>http://www.frihost.com/sitemap2.xml.gz</loc>
<lastmod>2008-03-12</lastmod>
</sitemap>
</sitemapindex>
In the example, <sitemapindex> is the parent tag, followed by one <sitemap> tag for each sitemap. Here we have just 2 sitemaps; if we had 999 sitemaps, we would use 999 <sitemap> tags. Inside each <sitemap> tag, the <loc> tag shows the crawling bots where the sitemap is located, and the optional <lastmod> tag shows when the corresponding sitemap file was last modified. That way the crawler doesn't have to re-read sitemaps that haven't changed and can notice new URLs easily, so it wastes less time on the old ones and our new URLs can be picked up sooner by Google and other search engines.
Note: Sitemap URLs, like all values in your XML files, must be entity escaped.
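To make the escaping rule concrete, here is a small sketch. The sitemap.php address and its query parameters are hypothetical, made up only to show a URL that contains an ampersand. In XML the characters &, ', ", > and < are written as &amp;, &apos;, &quot;, &gt; and &lt;.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <!-- Hypothetical raw URL: http://www.frihost.com/sitemap.php?page=1 followed by an ampersand and type=forum -->
    <!-- The ampersand between the two parameters is written as &amp; -->
    <loc>http://www.frihost.com/sitemap.php?page=1&amp;type=forum</loc>
  </sitemap>
</sitemapindex>
```

A crawler that parses the file turns &amp; back into a plain & before requesting the URL, so the address it fetches is the original one.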
Google uses an XML schema to define the elements and attributes that can appear in your Sitemap index file. You can get the Sitemap index schema from http://www.sitemaps.org/schemas/sitemap/0.9/siteindex.xsd
To validate your Sitemap index file against the schema, the XML file needs additional headers. The header in the XML file should look like this:
<?xml version='1.0' encoding='UTF-8'?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9
http://www.sitemaps.org/schemas/sitemap/0.9/siteindex.xsd">
<sitemap>
...
</sitemap>
</sitemapindex>
