Information retrieval on Turkish texts

Published:2007-12-04 Issue:3 Volume:59 Page:407-421
ISSN:1532-2882
Container-title:Journal of the American Society for Information Science and Technology
language:en
Short-container-title:J. Am. Soc. Inf. Sci.

Author:

Can Fazli,Kocberber Seyit,Balcik Erman,Kaynak Cihan,Ocalan H. Cagdas,Vursavas Onur M.

Abstract

In this study, we investigate information retrieval (IR) on Turkish texts using a large‐scale test collection that contains 408,305 documents and 72 ad hoc queries. We examine the effects of several stemming options and query‐document matching functions on retrieval performance. We show that a simple word truncation approach, a word truncation approach that uses language‐dependent corpus statistics, and an elaborate lemmatizer‐based stemmer provide similar retrieval effectiveness in Turkish IR. We investigate the effects of a range of search conditions on the retrieval performance; these include scalability issues, query and document length effects, and the use of stopword list in indexing.
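The "simple word truncation approach" named in the abstract can be illustrated with a minimal sketch: each word is reduced to its first n characters, so inflected forms of a Turkish word tend to collapse to a shared prefix. The function names and the default prefix length of 5 are assumptions chosen for illustration; the abstract does not state which length the authors used, and this is not their implementation.

```python
from typing import List


def truncate_stem(word: str, n: int = 5) -> str:
    """Fixed-prefix truncation: keep only the first n characters.

    n = 5 is an illustrative assumption, not a value taken from the abstract.
    Case folding is left to the caller, since Turkish dotted/dotless i
    needs locale-aware handling that str.lower() does not provide.
    """
    return word[:n]


def index_terms(text: str, n: int = 5) -> List[str]:
    """Split on whitespace and truncate every token to n characters."""
    return [truncate_stem(token, n) for token in text.split()]


if __name__ == "__main__":
    # "kitaplar" (books) and "kitaplarımızdan" (from our books)
    # share the prefix "kitap" (book).
    print(index_terms("kitaplar kitaplarımızdan ev"))
```

Words shorter than the prefix length pass through unchanged, so short function words are not altered by this step.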

Publisher

Wiley

Link

https://onlinelibrary.wiley.com/doi/pdf/10.1002/asi.20750

Cited by 74 articles.

1. From Foundations to GPT in Text Classification: A Comprehensive Survey on Current Approaches and Future Trends;Foundations and Trends® in Information Retrieval;2025

2. Categorization of Turkish Text Documents Using Extreme Learning Machine;2024 8th International Artificial Intelligence and Data Processing Symposium (IDAP);2024-09-21

3. Is text preprocessing still worth the time? A comparative survey on the influence of popular preprocessing methods on Transformers and traditional classifiers;Information Systems;2024-03

4. Impact of Tokenization on Language Models: An Analysis for Turkish;ACM Transactions on Asian and Low-Resource Language Information Processing;2023-03-25

5. Social media analytical CRM: a case study in a bank;Journal of Intelligent & Fuzzy Systems;2023-01-30