Performance of SciLitLLM on scientific literature understanding benchmarks. SciLitLLM outperforms Llama3.1 and Qwen2.5 models with similar scales.
Scientific literature understanding involves the systematic evaluation and interpretation of scientific texts and publications to identify trends and extract targeted information.
Below is an example from SciRIFF, in which the LLM is asked to understand the content of a biomedical research paper and then extract the targeted information. On such tasks, LLMs' potential may be hindered by two major barriers:
SciLitLLM adopts a two-stage pipeline, continual pre-training (CPT) followed by supervised fine-tuning (SFT), to enhance domain-specific knowledge in LLMs.
High-quality scientific textbooks and research papers provide a wealth of scientific knowledge. However, we face some practical challenges:
An example of formatting and grammar correction.
To address these challenges, we implemented the following modules:
Examples of high and low-quality CPT text and SFT instructions.
To address the scarcity of scientific instructions and the high cost of manual annotation, we developed a novel instruction generation and quality control process. We first built a probability table of domain-specific keywords and a list of scientific task descriptions, then sampled keywords and tasks to generate a diverse dataset of domain-specific instructions with GPT-4o.
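The sampling step above can be sketched as follows. This is a minimal illustration, not the released pipeline: the keyword probability table, the task descriptions, and the prompt template are all hypothetical placeholders, and the resulting prompt would be sent to GPT-4o to produce one instruction sample.

```python
import random

# Hypothetical inputs (illustrative only): a probability table of
# domain-specific keywords and a list of scientific task descriptions.
keyword_probs = {"electrolyte": 0.4, "catalyst": 0.35, "polymer": 0.25}
task_descriptions = [
    "Extract all chemical compounds mentioned in the passage.",
    "Summarize the key experimental findings in two sentences.",
]

def sample_prompt(rng: random.Random) -> str:
    """Sample weighted keywords and a task, then assemble a generation prompt."""
    keywords = rng.choices(
        population=list(keyword_probs),
        weights=list(keyword_probs.values()),
        k=3,
    )
    task = rng.choice(task_descriptions)
    return (
        f"Write a scientific-literature instruction involving {', '.join(keywords)}. "
        f"Task: {task}"
    )

prompt = sample_prompt(random.Random(0))
# `prompt` is then used to query GPT-4o for one domain-specific instruction.
```

Sampling keywords by probability (rather than uniformly) keeps the generated instructions aligned with the actual frequency of topics in the scientific corpus.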
An example from SciLitIns.
The quality assessment of SciLitIns.
Key Observations:
To assess the contribution of each component in the SciLitLLM pipeline, we conducted a comprehensive ablation study.
Please consider citing our paper if you find our work helpful!
@misc{li2024scilitllmadaptllmsscientific,
      title={SciLitLLM: How to Adapt LLMs for Scientific Literature Understanding},
      author={Sihang Li and Jin Huang and Jiaxi Zhuang and Yaorui Shi and Xiaochen Cai and Mingjun Xu and Xiang Wang and Linfeng Zhang and Guolin Ke and Hengxing Cai},
      year={2024},
      eprint={2408.15545},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2408.15545},
}