Research Article

STSG: A Short Text Semantic Graph Model for Similarity Computing Based on Dependency Parsing and Pre-trained Language Models

Article: 2321552 | Received 13 Jun 2023, Accepted 07 Feb 2024, Published online: 04 Mar 2024
 

ABSTRACT

Short text semantic similarity, the task of predicting how similar two sentences are, is a crucial research area in natural language processing. Because short texts are sparse, words appear in relative isolation and their correlations are easily ignored, making it difficult to capture the global semantic information of a sentence. To address this, a short text semantic graph (STSG) model based on dependency parsing and pre-trained language models is proposed in this paper. It uses syntactic information to obtain word dependency relationships and incorporates them into pre-trained language models to enrich the global semantic information of sentences, thereby addressing semantic sparsity more effectively. A text semantic graph layer based on the graph attention network (GAT) is also realized, which treats word vectors as node features and word dependencies as edge features. The attention mechanism of GAT identifies the importance of different word correlations and models word dependencies effectively. On the challenging short text semantic benchmark dataset MRPC, the STSG model achieves an F1-score of .946, an improvement of 2.16% over previous state-of-the-art (SOTA) approaches. At the time of writing, STSG achieves a new SOTA result on the MRPC dataset.
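To make the graph-layer idea concrete, the sketch below shows a minimal single-head graph attention layer over a dependency graph, with node features standing in for pre-trained word vectors and dependency arcs supplying the edges. This is not the authors' implementation; the class name, the toy edge list, and the random embeddings are illustrative assumptions only.

```python
# Minimal sketch (assumed names, not the paper's code): a GAT-style layer
# where nodes carry word vectors and edges come from a dependency parse.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleGAT(nn.Module):
    """Single-head graph attention layer in the style described in the abstract."""
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim, bias=False)   # shared node projection
        self.a = nn.Linear(2 * out_dim, 1, bias=False)    # attention scorer

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # x: (N, in_dim) word vectors; adj: (N, N) 0/1 dependency adjacency
        h = self.W(x)                                      # (N, out_dim)
        N = h.size(0)
        hi = h.unsqueeze(1).expand(N, N, -1)               # pairwise concat for e_ij
        hj = h.unsqueeze(0).expand(N, N, -1)
        e = F.leaky_relu(self.a(torch.cat([hi, hj], dim=-1)).squeeze(-1))
        e = e.masked_fill(adj == 0, float("-inf"))         # attend only along arcs
        alpha = torch.softmax(e, dim=-1)                   # importance of each edge
        return F.elu(alpha @ h)                            # aggregated node states

# Toy usage: 4 tokens with 8-dim placeholder "word vectors" (in practice,
# embeddings from a pre-trained language model), a hypothetical dependency
# parse made symmetric, plus self-loops.
x = torch.randn(4, 8)
edges = [(1, 0), (1, 2), (2, 3)]
adj = torch.eye(4)
for i, j in edges:
    adj[i, j] = adj[j, i] = 1.0
out = SimpleGAT(8, 16)(x, adj)                             # (4, 16) node features
```

In this sketch the softmax over masked edge scores plays the role the abstract assigns to GAT attention: it weights each dependency arc by its estimated importance before aggregating neighbor representations.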

Disclosure Statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

This work was supported by the Open Project of Sichuan Provincial Key Laboratory of Philosophy and Social Science for Language Intelligence in Special Education under Grant No. YYZN-2023-4 and the Ph.D. Fund of Chengdu Technological University under Grant No. 2020RC002.