Bioinfo-Bench: A Simple Benchmark Framework for LLM Bioinformatics Skills Evaluation

Published: 2023-10-21

Author:

Chen Qiyuan, Deng Cheng

Abstract

Large Language Models (LLMs) have garnered significant recognition in the life sciences for their capacity to comprehend and utilize knowledge. The contemporary expectation in diverse industries extends beyond employing LLMs merely as chatbots; instead, there is a growing emphasis on harnessing their potential as adept analysts proficient in dissecting intricate issues within these sectors. The realm of bioinformatics is no exception to this trend. In this paper, we introduce Bioinfo-Bench, a novel yet straightforward benchmark framework suite crafted to assess the academic knowledge and data mining capabilities of foundational models in bioinformatics. Bioinfo-Bench systematically gathered data from three distinct perspectives: knowledge acquisition, knowledge analysis, and knowledge application, facilitating a comprehensive examination of LLMs. Our evaluation encompassed prominent models ChatGPT, Llama, and Galactica. The findings revealed that these LLMs excel in knowledge acquisition, drawing heavily upon their training data for retention. However, their proficiency in addressing practical professional queries and conducting nuanced knowledge inference remains constrained. Given these insights, we are poised to delve deeper into this domain, engaging in further extensive research and discourse. It is pertinent to note that project Bioinfo-Bench is currently in progress, and all associated materials will be made publicly accessible.

Publisher

Cold Spring Harbor Laboratory

References (27 articles)

1. Bioinformatics - instructions to authors. https://academic.oup.com/bioinformatics/pages/instructions_for_authors, 2023.

2. ChatGPT plugins. https://openai.com/blog/chatgpt-plugins, 2023.

3. Bakhshandeh, S. Benchmarking medical large language models. Nature Reviews Bioengineering (2023), 1–1.

4. Science, medicine, and the future: Bioinformatics. BMJ: British Medical Journal, 2002.

5. OceanGPT: A large language model for ocean science tasks. arXiv, 2023.

Cited by 4 articles.

1. reguloGPT: Harnessing GPT for Knowledge Graph Construction of Molecular Regulatory Pathways (2024-01-30)

2. A Comprehensive Evaluation of Large Language Models in Mining Gene Interactions and Pathway Knowledge (2024-01-24)

3. Robodoc: a conversational-AI based app for medical conversations (2024-01-02)

4. Leveraging large language models for data analysis automation (2023-12-12)