Author:
Can Fazli, Kocberber Seyit, Balcik Erman, Kaynak Cihan, Ocalan H. Cagdas, Vursavas Onur M.
Abstract
In this study, we investigate information retrieval (IR) on Turkish texts using a large-scale test collection that contains 408,305 documents and 72 ad hoc queries. We examine the effects of several stemming options and query-document matching functions on retrieval performance. We show that a simple word truncation approach, a word truncation approach that uses language-dependent corpus statistics, and an elaborate lemmatizer-based stemmer provide similar retrieval effectiveness in Turkish IR. We also investigate the effects of a range of search conditions on retrieval performance; these include scalability issues, query and document length effects, and the use of a stopword list in indexing.
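As a rough illustration of what a "simple word truncation approach" could look like, the sketch below keeps only the first few characters of each word as its index term. The function names and the default prefix length of 5 are assumptions made for illustration; the abstract does not state the prefix length or any implementation details. Case folding is left out on purpose, since Turkish dotted and dotless "i" need locale-aware handling.

```python
def truncate_stem(word: str, prefix_len: int = 5) -> str:
    """Return the first `prefix_len` characters of `word`.

    `prefix_len` is a hypothetical parameter; words shorter than the
    prefix length are returned unchanged.
    """
    return word[:prefix_len]


def index_terms(text: str, prefix_len: int = 5) -> list[str]:
    """Split `text` on whitespace and truncate every token.

    Assumes the text is already normalized (case, punctuation).
    """
    return [truncate_stem(token, prefix_len) for token in text.split()]


if __name__ == "__main__":
    # Two inflected forms of "kitap" (book) share the same truncated term.
    print(index_terms("kitaplar kitaplardan"))
```

With this kind of truncation, different inflected forms of a word can map to the same index term without any dictionary or morphological analysis, which is what makes the approach cheap compared with a lemmatizer-based stemmer.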
Cited by
74 articles.