Affiliation:
1. University of Valladolid, Spain
Abstract
The performance evaluation of an information retrieval system is a decisive aspect of the measure of the improvements in search technology. The Google search engine, as a tool for retrieving information on the Web, is used by almost 92% of Iranian users. The purpose of this paper is to study Google’s performance in retrieving relevant information from Persian documents. The information retrieval effectiveness is based on the precision measures of the search results done to a website that we have built with the documents of a TREC standard corpus. We asked Google for 100 topics available on the corpus and we compared the retrieved webpages with the relevant documents. The obtained results indicated that the morphological analysis of the Persian language is not fully taken into account by the Google search engine. The incorrect text tokenisation, considering the stop words as the content keywords of a document and the wrong ‘variants encountered’ of words found by Google are the main reasons that affect the relevance of the Persian information retrieval on the Web for this search engine.
Subject
Library and Information Sciences,Information Systems
Cited by
2 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献