- 1.4 How to Deal With Crawler (SEO Guide)?
Posted by : Blogger Wits
Friday, May 16, 2014
A crawler is an internet bot that visits websites and reads their pages and content in order to index them for a search engine. When an author publishes an article on his website, a web crawler (also called a "web spider" or "ant") reads the content and indexes the article according to its quality. The crawler starts from your site's URL, identifies all the hyperlinks on your site, and adds them to a "crawl frontier"; the URLs in this frontier are then revisited according to the search engine's policies. It is therefore better to optimize your blog/website URLs for Google's crawlers.
First,
Make effective use of robots.txt
What is robots.txt? It is a file that tells search engines whether you want to give them access to crawl the pages of your site. Google Webmaster Tools has a friendly robots.txt generator to help you. If you don't want search engines to crawl your site or your sub-domains, then visit the Webmaster Help Guide on using robots.txt files.
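As an illustration, here is a minimal robots.txt sketch, placed at the root of your domain (e.g. example.com/robots.txt). The /search/ path and the sitemap URL are placeholders; replace them with your own site's paths:

```
# Rules below apply to all crawlers
User-agent: *

# Block internal search-result pages from being crawled
Disallow: /search/

# Everything else remains crawlable by default

# Point crawlers to the sitemap
Sitemap: https://www.example.com/sitemap.xml
```

Note that robots.txt is advisory: well-behaved crawlers honor it, but it is not an access control, so sensitive content still needs real protection (for example, a login).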
Best Practices
- Use more secure methods (such as authentication) for sensitive content
- Avoid letting search-result-like pages be crawled
- Don't allow URLs created by proxy services to be crawled
- Carefully submit URLs to directories
Secondly,
You should be aware of rel="nofollow" for links
Spammy sites can affect the reputation of your site, so use the Webmaster Help Center tips, such as adding CAPTCHAs to your commenting
section.
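For an individual link, such as one left by a visitor in a comment, nofollow is set on the anchor tag itself. A minimal sketch (the URL is a placeholder):

```html
<!-- Tell crawlers not to pass your site's reputation to this link -->
<a href="https://example.com/untrusted-page" rel="nofollow">visitor-submitted link</a>
```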
If you want to nofollow all of the links on a page, you can use "nofollow" in the robots meta tag inside the <head> tag of the HTML; you can learn more about "nofollow" in the robots meta tag article.
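The page-wide approach described above can be sketched as a minimal HTML head (the title is a placeholder):

```html
<head>
  <title>Example page</title>
  <!-- Ask crawlers not to follow any links on this page -->
  <meta name="robots" content="nofollow">
</head>
```

This is broader than per-link rel="nofollow": it applies to every link on the page, so use it only when you genuinely don't want any link on the page followed.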