Recent Advances in Computer Science and Communications

ISSN (Print): 2666-2558
ISSN (Online): 2666-2566

Research Article

Keyphrase Extraction by Improving TextRank with an Integration of Word Embedding and Syntactic Information

Author(s): Sheng Zhang, Qi Luo, Yukun Feng, Ke Ding, Daniela Gifu, Silan Zhang*, Xiaohang Ma and Jingbo Xia

Volume 14, Issue 9, 2021

Published on: 20 August, 2020

Page: [2969 - 2975] Pages: 7

DOI: 10.2174/2666255813999200820155846

Abstract

Background: TextRank, a well-known keyphrase extraction algorithm, is an analog of the PageRank algorithm and relies heavily on term-frequency statistics gathered through co-occurrence analysis.
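
To make the co-occurrence idea concrete, the following is a minimal sketch of a baseline TextRank-style ranker, not the implementation evaluated in this article. The function names, the window size, the damping factor of 0.85, and the fixed iteration count are illustrative assumptions.

```python
from collections import defaultdict


def cooccurrence_graph(tokens, window=2):
    """Build an undirected graph whose edge weight counts how often
    two distinct words appear within `window` tokens of each other."""
    graph = defaultdict(lambda: defaultdict(float))
    for i, word in enumerate(tokens):
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            other = tokens[j]
            if other != word:
                graph[word][other] += 1.0
                graph[other][word] += 1.0
    return graph


def textrank(graph, damping=0.85, iterations=50):
    """Weighted PageRank-style iteration over the co-occurrence graph.
    The damping factor and iteration count are assumed defaults."""
    nodes = list(graph)
    scores = {n: 1.0 for n in nodes}
    # Total edge weight leaving each node, used to normalise contributions.
    out_weight = {n: sum(graph[n].values()) for n in nodes}
    for _ in range(iterations):
        updated = {}
        for n in nodes:
            incoming = sum(
                graph[m][n] / out_weight[m] * scores[m] for m in graph[n]
            )
            updated[n] = (1 - damping) + damping * incoming
        scores = updated
    return scores


if __name__ == "__main__":
    # Toy token sequence; real input would be filtered, tagged text.
    tokens = ["a", "hub", "b", "hub", "c", "hub", "d"]
    ranks = textrank(cooccurrence_graph(tokens, window=1))
    for word, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
        print(f"{word}\t{score:.3f}")
```

Because the edge weights here are raw co-occurrence counts, a word's score depends only on how often it appears near other words, which is the frequency-based characteristic described above.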

Objective: This frequency-based characteristic has become a bottleneck for performance enhancement, and various improved TextRank algorithms have been proposed in recent years. Most of these improvements incorporated semantic information into the keyphrase extraction algorithm and achieved better results.

Method: In this research, taking both syntactic and semantic information into consideration, we integrated a syntactic tree algorithm with word embedding and put forward the Word Embedding and Syntactic Information Algorithm (WESIA), which improved the accuracy of the TextRank algorithm.
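
The abstract does not state how WESIA combines the two signals, so the following is only a hypothetical illustration of the general idea: an edge weight for a graph-ranking step that mixes embedding similarity with closeness in a syntactic tree. The linear mix, the `alpha` parameter, the `1 / (1 + distance)` transform, and all function names are assumptions and are not taken from the article.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def tree_distance(parents, a, b):
    """Number of edges between nodes a and b in a tree given as a
    child -> parent mapping (the root has no entry)."""
    def path_to_root(node):
        path = [node]
        while node in parents:
            node = parents[node]
            path.append(node)
        return path

    path_a, path_b = path_to_root(a), path_to_root(b)
    depth_in_a = {node: i for i, node in enumerate(path_a)}
    for j, node in enumerate(path_b):
        if node in depth_in_a:
            # First shared ancestor: add the steps taken from each side.
            return depth_in_a[node] + j
    raise ValueError("nodes are not in the same tree")


def edge_weight(word_a, word_b, vectors, parents, alpha=0.5):
    """Hypothetical combined weight: `alpha` times semantic similarity plus
    (1 - alpha) times syntactic closeness. Not the article's formula."""
    semantic = max(cosine(vectors[word_a], vectors[word_b]), 0.0)
    syntactic = 1.0 / (1 + tree_distance(parents, word_a, word_b))
    return alpha * semantic + (1 - alpha) * syntactic


if __name__ == "__main__":
    # Toy dependency-style tree and toy 2-d "embeddings", made up for the demo.
    parents = {
        "extraction": "improves",
        "keyphrase": "extraction",
        "algorithm": "improves",
    }
    vectors = {
        "keyphrase": [1.0, 0.0],
        "extraction": [1.0, 0.0],
        "algorithm": [0.0, 1.0],
    }
    print(edge_weight("keyphrase", "extraction", vectors, parents))
    print(edge_weight("keyphrase", "algorithm", vectors, parents))
```

Weights of this kind could stand in for raw co-occurrence counts in a PageRank-style iteration, so that word pairs that are semantically similar or syntactically close reinforce each other more strongly.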

Results: When our method was applied to a self-made test set and a public test set, the results suggested that the proposed unsupervised keyphrase extraction algorithm outperformed the other algorithms to some extent.

Keywords: Key phrases, Syntactic distance, Word embedding, Algorithm, TextRank, Word Embedding and Syntactic Information Algorithm (WESIA).

© 2022 Bentham Science Publishers