How well does Google work with Persian documents?

Published: 2016-03-01 Issue: 3 Volume: 43 Page: 316-327
ISSN: 0165-5515
Container-title: Journal of Information Science
Language: en
Short-container-title: Journal of Information Science

Author:

Sadeghi Mohammad¹, Vegas Jesús¹

Affiliation:

1. University of Valladolid, Spain

Abstract

The performance evaluation of an information retrieval system is a decisive aspect of the measure of the improvements in search technology. The Google search engine, as a tool for retrieving information on the Web, is used by almost 92% of Iranian users. The purpose of this paper is to study Google’s performance in retrieving relevant information from Persian documents. The information retrieval effectiveness is based on the precision measures of the search results done to a website that we have built with the documents of a TREC standard corpus. We asked Google for 100 topics available on the corpus and we compared the retrieved webpages with the relevant documents. The obtained results indicated that the morphological analysis of the Persian language is not fully taken into account by the Google search engine. The incorrect text tokenisation, considering the stop words as the content keywords of a document and the wrong ‘variants encountered’ of words found by Google are the main reasons that affect the relevance of the Persian information retrieval on the Web for this search engine.

Publisher

SAGE Publications

Subject

Library and Information Sciences, Information Systems

Link

http://journals.sagepub.com/doi/pdf/10.1177/0165551516640437

References: 22 articles.

1. Information Retrieval on the Web and its Evaluation

2. Performances of the Most Popular Search Engines in Arabic Language

3. Challenges in information retrieval and language modeling

Cited by 2 articles.

1. Investigating the Challenges and Opportunities in Persian Language Information Retrieval through Standardized Data Collections and Deep Learning;Computers;2024-08-21

2. Integrating word status for joint detection of sentiment and aspect in reviews;Journal of Information Science;2018-11-19