Efficient Crawling Through Dynamic Priority of Web Page in Sitemap

Abstract

A web crawler, or automatic indexer, downloads updated information from the World Wide Web (WWW) for a search engine. The current size of the Google index is estimated at approximately 8 × 10^9 pages, and a full crawl could cost around 4 million dollars in network costs alone. Thus a crawler needs to download only the most important pages. To that end, we propose “Efficient crawling through dynamic page priority of web pages in Sitemap”, a query-based approach that tells the web crawler which pages are most important by setting dynamic page priorities through the sitemap protocol. Using the page priority, a web crawler can find the most important pages on any website and download only those. Experimental results show that our approach performs better than the existing approach.
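The abstract does not spell out how the priorities are computed, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that each page's priority is its query-hit count scaled against the most-queried page, with a floor of 0.1. The function names, the floor, the crawl threshold, and the example URLs are all hypothetical. The `urlset`, `url`, `loc`, and `priority` elements and the 0.0 to 1.0 priority range come from the sitemap protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def dynamic_priorities(query_hits):
    """Map per-URL query-hit counts to sitemap priorities in [0.1, 1.0].

    Assumption: priority is proportional to how often a page was returned
    for user queries, relative to the most-queried page.
    """
    top = max(query_hits.values(), default=0)
    if top == 0:
        # No query data yet: fall back to the protocol's default priority.
        return {url: 0.5 for url in query_hits}
    return {url: round(max(0.1, hits / top), 1) for url, hits in query_hits.items()}


def build_sitemap(query_hits):
    """Serialize a sitemap whose <priority> values reflect query activity."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    ranked = sorted(dynamic_priorities(query_hits).items(), key=lambda kv: -kv[1])
    for url, priority in ranked:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        ET.SubElement(entry, "priority").text = f"{priority:.1f}"
    return ET.tostring(urlset, encoding="unicode")


def pages_to_crawl(sitemap_xml, threshold=0.5):
    """Crawler side: keep only URLs whose priority meets the threshold."""
    ns = {"sm": SITEMAP_NS}
    root = ET.fromstring(sitemap_xml)
    return [
        u.findtext("sm:loc", namespaces=ns)
        for u in root.findall("sm:url", ns)
        if float(u.findtext("sm:priority", namespaces=ns)) >= threshold
    ]
```

Under these assumptions, a site would regenerate its sitemap as query counts change, and a crawler would call `pages_to_crawl` on the fetched XML to decide which pages to download.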

BibTeX key: noauthororeditor
entry type: article
year: 2014
month: jun
journal: Informatics Engineering, an International Journal (IEIJ)
number: 02
pages: 01-11
volume: 02
language: eng
issn: 2349-2198
Document: http://airccse.org/journal/ieij/papers/2214ieij01.pdf
