Sitemaps for the Adobe CQ Site

Sitemaps are an easy way for webmasters to inform search engines about pages on their sites that are available for crawling. In its simplest form, a Sitemap is an XML file that lists the URLs for a site along with additional metadata about each URL. When a site has several Sitemaps, you don't need to submit each one individually: submit the Sitemap index file and you're good to go.
Creating and submitting a Sitemap helps make sure that search engines know about all the pages on your site, including URLs that may not be discoverable through their normal crawling process.
The Sitemaps protocol lets you tell search engines what content you would like indexed. To tell search engines what content you don't want indexed, use a robots.txt file or a robots meta tag.


Create sitemap index category page:


Create a page “sitemap-index-categories” in CQ. Under this page, create a category node with a urls property for each category. The name of each category node should match the name of the corresponding page directly under the root of your site.




Note: A trailing “/…” on a URL means that all pages and subdirectories in that directory should be included in the referenced sitemap.
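
The “/…” convention can be sketched in plain Java, independent of any CQ API. The class name CategoryUrlEntry, the parse method, and the example path are hypothetical and are not part of the original setup. The sketch also assumes the marker may be typed either as the single ellipsis character or as three dots.

```java
/** Hypothetical helper: one entry of a category node's urls property. */
final class CategoryUrlEntry {
    // Assumption: the marker may be "/…" (single ellipsis character) or "/...".
    private static final String[] RECURSIVE_SUFFIXES = {"/\u2026", "/..."};

    final String path;        // page path without the trailing marker
    final boolean recursive;  // true if everything below the path is included

    CategoryUrlEntry(String path, boolean recursive) {
        this.path = path;
        this.recursive = recursive;
    }

    /** Splits a raw urls value into its path and a "recursive" flag. */
    static CategoryUrlEntry parse(String raw) {
        String value = raw.trim();
        for (String suffix : RECURSIVE_SUFFIXES) {
            if (value.endsWith(suffix)) {
                String base = value.substring(0, value.length() - suffix.length());
                return new CategoryUrlEntry(base, true);
            }
        }
        return new CategoryUrlEntry(value, false);
    }
}
```

For example, CategoryUrlEntry.parse("/about_us/…") yields the path "/about_us" with recursive set to true, while "/about_us" alone is treated as a single page.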


Sitemap index file generation:


Create a sitemap-index template and page component, then create xml.jsp under the sitemap index page component. In xml.jsp, write logic that iterates over the category nodes created under the page “sitemap-index-categories” and generates an XML file with the structure below.

Sample sitemap index file

<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<sitemap><loc>http://<host>/<category-node-name>.xml</loc></sitemap>
<sitemap><loc>http://<host>/<category-node-name>.xml</loc></sitemap>

</sitemapindex>
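
The index generation can be sketched as plain Java that could be called from xml.jsp. This is only a sketch under stated assumptions: the class SitemapIndexSketch and its methods are hypothetical, the category node names are passed in as a list (reading them from the repository is left out), and the host is supplied by the caller.

```java
import java.util.List;

/** Hypothetical helper that renders the sitemap index XML. */
final class SitemapIndexSketch {

    /** Escapes the characters that are not allowed raw in XML text. */
    static String escapeXml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;");
    }

    /** Builds one <sitemap> entry per category node name. */
    static String buildIndex(String host, List<String> categoryNodeNames) {
        StringBuilder xml = new StringBuilder();
        xml.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        xml.append("<sitemapindex xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">\n");
        for (String name : categoryNodeNames) {
            String loc = "http://" + host + "/" + name + ".xml";
            xml.append("<sitemap><loc>")
               .append(escapeXml(loc))
               .append("</loc></sitemap>\n");
        }
        xml.append("</sitemapindex>\n");
        return xml.toString();
    }
}
```

In the JSP, the list would come from the child nodes of “sitemap-index-categories”, and the response content type would typically be set to an XML type before the string is written out.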



Sitemap Generation


Create xml.jsp under the page component that is the super type of all your templates, so that it is available on every page of the site.
In xml.jsp, write logic that reads the request URL to get the “category-node-name”, reads the urls property of that category node, and then generates the XML structure below. If a URL ends with “/…”, include all pages and subdirectories in that directory in the XML file.

Sample sitemap file:

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url><loc>http://<host>/about_us.html</loc></url>
<url><loc>http://<host>/about_us/<page-path>.html</loc></url>

</urlset>
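
The logic described for this xml.jsp can be sketched in plain Java as follows. Everything named here is hypothetical: the class CategorySitemapSketch, its methods, and the in-memory map of child pages that stands in for the CQ repository. The sketch assumes site-relative page paths such as "/about_us", and it assumes that a “/…” entry adds only the pages below the directory, so the directory page itself is listed as a separate entry.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

/** Hypothetical helper that renders the sitemap of one category. */
final class CategorySitemapSketch {

    /** "/about_us.xml" becomes "about_us". */
    static String categoryFromRequestPath(String requestPath) {
        String name = requestPath.substring(requestPath.lastIndexOf('/') + 1);
        return name.endsWith(".xml") ? name.substring(0, name.length() - 4) : name;
    }

    /** Resolves the urls property, expanding entries that end with "/…" or "/...". */
    static List<String> expand(List<String> urls, Map<String, List<String>> childPages) {
        List<String> pages = new ArrayList<>();
        for (String raw : urls) {
            String url = raw.trim();
            if (url.endsWith("/\u2026")) {
                collectDescendants(url.substring(0, url.length() - 2), childPages, pages);
            } else if (url.endsWith("/...")) {
                collectDescendants(url.substring(0, url.length() - 4), childPages, pages);
            } else {
                pages.add(url);
            }
        }
        return pages;
    }

    /** Depth-first walk over the stand-in page tree. */
    private static void collectDescendants(String path, Map<String, List<String>> childPages,
                                           List<String> out) {
        for (String child : childPages.getOrDefault(path, Collections.<String>emptyList())) {
            out.add(child);
            collectDescendants(child, childPages, out);
        }
    }

    /** Builds one <url> entry per page path. */
    static String buildUrlset(String host, List<String> pagePaths) {
        StringBuilder xml = new StringBuilder();
        xml.append("<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">\n");
        for (String path : pagePaths) {
            String loc = ("http://" + host + path + ".html").replace("&", "&amp;");
            xml.append("<url><loc>").append(loc).append("</loc></url>\n");
        }
        xml.append("</urlset>\n");
        return xml.toString();
    }
}
```

In a real component, the child-page map would be replaced by a traversal of the page tree, and the category node would be looked up under “sitemap-index-categories” using the name returned by categoryFromRequestPath.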



Note: Create a sitemap_index page under the root of your site using the sitemap-index template. Request sitemap_index.xml in the browser to generate the sitemap index.
