ZeroEA: A Zero-Training Entity Alignment Framework via Pre-Trained Language Model

Authors:

Nan Huo¹, Reynold Cheng¹, Ben Kao¹, Wentao Ning¹, Nur Al Hasan Haldar², Xiaodong Li¹, Jinyang Li¹, Mohammad Matin Najafi¹, Tian Li³, Ge Qu¹

Affiliation:

1. The University of Hong Kong, Hong Kong, China

2. The University of Western Australia, Perth, Australia

3. TCL Research, Hong Kong, China

Abstract

Entity alignment (EA), a crucial task in knowledge graph (KG) research, aims to identify equivalent entities across different KGs to support downstream tasks such as KG integration, text-to-SQL, and question answering. Given the rich semantic information within KGs, pre-trained language models (PLMs) have shown promise for EA thanks to their exceptional context-aware encoding capabilities. However, current PLM-based solutions face obstacles such as the need for extensive training, expensive data annotation, and inadequate incorporation of structural information. In this study, we introduce ZeroEA, a novel zero-training EA framework that effectively captures both semantic and structural information for PLMs. Specifically, its Graph2Prompt module serves as the bridge between graph structure and plain text by converting KG topology into textual context suitable for PLM input. In addition, to provide PLMs with concise and clear input text of reasonable length, we design a motif-based neighborhood filter that eliminates noisy neighbors. Comprehensive experiments and analyses on five benchmark datasets demonstrate the effectiveness of ZeroEA: it outperforms all leading competitors and achieves state-of-the-art performance on entity alignment. Notably, our study also highlights the considerable potential of EA techniques for improving downstream tasks, thereby benefiting the broader research field.
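To make the abstract's first idea concrete, the sketch below illustrates a Graph2Prompt-style zero-training pipeline: each entity's KG neighborhood is verbalized into a plain-text prompt, encoded with a frozen PLM, and candidate entity pairs are scored by embedding similarity. This is a minimal illustration of the general technique, not the authors' implementation; the model choice (`bert-base-uncased`), the verbalization template, and the mean-pooling step are assumptions for illustration only.

```python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()  # zero-training: the PLM is used as a frozen encoder

def graph_to_prompt(entity, triples):
    """Verbalize an entity's (head, relation, tail) neighborhood into plain text."""
    facts = [f"{h} {r} {t}." for h, r, t in triples]
    return f"{entity}. " + " ".join(facts)

@torch.no_grad()
def encode(texts):
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    hidden = model(**batch).last_hidden_state            # (batch, seq_len, hidden)
    mask = batch["attention_mask"].unsqueeze(-1).float()  # (batch, seq_len, 1)
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)   # mean pooling over tokens

# Toy example: the same city described in two KGs with different surface forms.
p1 = graph_to_prompt("Paris", [("Paris", "capital of", "France")])
p2 = graph_to_prompt("Paris (city)", [("Paris (city)", "is capital of", "France")])
e1, e2 = encode([p1, p2])
print(torch.cosine_similarity(e1, e2, dim=0).item())  # high score suggests a match
```

The motif-based neighborhood filter is described only at a high level in the abstract. One plausible reading, sketched below under an assumed triangle (3-clique) motif heuristic, keeps a neighbor only if it closes at least one triangle with the target entity; the threshold and the triangle criterion are hypothetical, not the paper's exact rule.

```python
import networkx as nx

def motif_filter(graph, entity, min_triangles=1):
    """Keep neighbors that close at least `min_triangles` triangles with
    `entity` (a hypothetical motif-based denoising heuristic)."""
    ent_nbrs = set(graph.neighbors(entity))
    kept = []
    for nbr in ent_nbrs:
        closed = len(ent_nbrs & set(graph.neighbors(nbr)))  # triangles through nbr
        if closed >= min_triangles:
            kept.append(nbr)
    return kept

# 'NoisyNode' shares no other neighbor with 'Paris', so it is filtered out.
g = nx.Graph([("Paris", "France"), ("Paris", "Seine"),
              ("France", "Seine"), ("Paris", "NoisyNode")])
print(sorted(motif_filter(g, "Paris")))  # ['France', 'Seine']
```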

Publisher

Association for Computing Machinery (ACM)


Cited by 1 article.
