8 Web Design tips to Help Google News Better Crawl Your Site
17 February 2009
Web publishers will benefit from eight tips that Google News is offering to help it better crawl and index their sites:
1. Keep the article body clean
For various reasons, when crawling an article, Google News checks the
overall page design to make sure it can find the article body. If your
article body is broken up by tags, ads, sidebars or other non-article
content, we may not be able to detect the actual article body, and may
reject your article as a result. In addition, if you place the
beginning of your article's body near the title in the HTML, we'll be
more likely to extract the correct title and snippet.
2. Make sure article URLs are permanent and unique
If you reuse article URLs on your website, our system may have
difficulty crawling and categorizing your stories. In addition, make
sure your article URLs contain a number of at least three digits that
doesn't resemble a year (for example, 5232 is OK, but 2008 is not). You
can get around this requirement by submitting your articles in News
Sitemaps. Also, please note that session IDs can confuse our crawler,
and we may not realize that two distinct URLs actually point to the
same page. You can learn more about some of these requirements here.
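As a rough illustration of the three-digit rule above, the following Python sketch checks whether a URL path contains a run of at least three digits that does not look like a year. The function name `has_article_number` and the 1900-2099 range used to decide what "resembles a year" are assumptions made for this example; they are not Google's actual logic.

```python
import re
from urllib.parse import urlparse


def has_article_number(url: str) -> bool:
    """Return True if the URL path has a 3+ digit number that is not year-like."""
    path = urlparse(url).path
    for run in re.findall(r"\d+", path):
        if len(run) < 3:
            continue  # too short to count
        # Assumption: a four-digit number from 1900 to 2099 looks like a year.
        looks_like_year = len(run) == 4 and 1900 <= int(run) <= 2099
        if not looks_like_year:
            return True
    return False


print(has_article_number("http://example.com/news/article5232.html"))  # True
print(has_article_number("http://example.com/news/2008/story.html"))   # False
```

A check like this could run when articles are published, so that URLs likely to be rejected are caught before the crawler sees them.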
3. Take advantage of stock tickers in News Sitemaps
Google News Sitemaps allow publishers to specify stock ticker symbols
for companies mentioned in individual articles. Using these symbols
helps us better identify the subjects of your articles. You can read
more about the format we use for this data here.
4. Check your webpage encoding
We occasionally see articles that declare themselves to be encoded in
one character encoding (say, UTF-8) but are actually encoded in another
(say, ISO 8859-1). Don't do this. It hurts us.
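One simple way to catch the mismatch described above is to try decoding the raw page bytes with the declared encoding before publishing. The sketch below is a minimal example; the helper name `decodes_as` is hypothetical.

```python
def decodes_as(raw: bytes, declared: str) -> bool:
    """Return True if the raw bytes are valid in the declared encoding."""
    try:
        raw.decode(declared)
    except (UnicodeDecodeError, LookupError):
        # Invalid byte sequence, or an encoding name Python does not know.
        return False
    return True


latin1_bytes = "café".encode("iso-8859-1")
print(decodes_as(latin1_bytes, "utf-8"))       # False: declared UTF-8, actually ISO 8859-1
print(decodes_as(latin1_bytes, "iso-8859-1"))  # True
```

Note that this only catches one direction of the problem: any byte sequence decodes without error as ISO 8859-1, so UTF-8 content mislabeled as ISO 8859-1 will pass this check and simply display as garbled characters.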
5. Make your article publication dates explicit
In order to help our crawler determine the correct date, please make
the actual publication date of your articles explicit. You can do this
by placing the article date and time in the HTML, between the title and
the body. Also, you can remove other dates from the HTML of the article
page, and add the required publication date tag to articles in your
News Sitemap. Dates on article pages can be in most common formats, but
for sitemaps, we ask that you use the W3C format, e.g.
2008-12-29T06:30:00Z.
Note that the article times and dates displayed on Google News reflect
the time at which we originally crawled the articles, and may not be
the same as the publication date.
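For the sitemap date format mentioned above, here is a minimal Python sketch that renders a publication time in the W3C style shown in the example (UTC with a trailing Z). The function name `w3c_datetime` is an assumption for illustration.

```python
from datetime import datetime, timedelta, timezone


def w3c_datetime(dt: datetime) -> str:
    """Format a timezone-aware datetime as a W3C UTC timestamp."""
    if dt.tzinfo is None:
        # A naive datetime has no known offset, so refuse to guess.
        raise ValueError("expected a timezone-aware datetime")
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# 1:30 a.m. at UTC-5 is 6:30 a.m. UTC.
published = datetime(2008, 12, 29, 1, 30, tzinfo=timezone(timedelta(hours=-5)))
print(w3c_datetime(published))  # 2008-12-29T06:30:00Z
```

Converting to UTC before formatting avoids ambiguity when a newsroom's servers and editors sit in different time zones.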
6. Keep original content separate from press releases
If your site produces original content and also distributes press
releases that you'd like us to crawl, make sure to separate your
original news content from your press releases by creating two
different sections on your site. As you may know, Google News labels
press releases distinctly in order to alert our users that the article
they're about to read is a press release. If your original news
sections have links to press releases, adding the rel="nofollow"
attribute to all links that point to your press release articles will
ensure that they're labeled correctly. You can learn more about this
attribute here.
7. Format your images properly
To help Google News identify your images and crawl them along with your
articles, use fairly large images with reasonable aspect ratios and
descriptive captions. Make sure to place them near their respective
article titles on the page and make the images inline and
non-clickable. Images in the JPEG format are more likely to be crawled
correctly.
8. Article Titles in Google News
In order for Google News to crawl the correct titles for your articles,
make sure the title you want appears in both the title tag and as the
headline on the article page. In addition, don't hyperlink the headline
on the article page - after all, your reader is already there! And it's
always a good idea to have links that point to your articles use the
article title as anchor text.
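To check the title advice above on your own pages, a small script can compare the title tag with the on-page headline. The sketch below uses Python's standard html.parser module and assumes the headline is marked up as an h1 element, which may not match your templates. The names `TitleAndHeadline` and `headline_in_title` are hypothetical.

```python
from html.parser import HTMLParser


class TitleAndHeadline(HTMLParser):
    """Collect the text inside the <title> and <h1> elements of a page."""

    def __init__(self):
        super().__init__()
        self._current = None
        self.text = {"title": "", "h1": ""}

    def handle_starttag(self, tag, attrs):
        if tag in self.text:
            self._current = tag

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current:
            self.text[self._current] += data


def headline_in_title(html: str) -> bool:
    """Return True if the h1 headline text also appears in the title tag."""
    parser = TitleAndHeadline()
    parser.feed(html)
    parser.close()
    # Collapse whitespace so line breaks in the markup do not matter.
    title = " ".join(parser.text["title"].split())
    headline = " ".join(parser.text["h1"].split())
    return bool(headline) and headline in title


page = (
    "<html><head><title>Storm hits coast - Example News</title></head>"
    "<body><h1>Storm hits coast</h1></body></html>"
)
print(headline_in_title(page))  # True
```

The substring comparison allows for a site name appended to the title tag; a stricter check could require an exact match.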
If you found these suggestions helpful, Google News has also published more general Webmaster Guidelines to help make your site Google
News-friendly.