Keyphrase Extraction by Improving TextRank with an Integration of Word Embedding and Syntactic Information

Author(s): Sheng Zhang, Qi Luo, Yukun Feng, Ke Ding, Daniela Gifu, Silan Zhang*, Xiaohang Ma, Jingbo Xia

Journal Name: Recent Advances in Computer Science and Communications
Formerly Recent Patents on Computer Science

Volume 14, Issue 9, 2021



Abstract:

Background: TextRank, a well-known keyphrase extraction algorithm, is an analog of the PageRank algorithm that relies heavily on term-frequency statistics gathered through co-occurrence analysis.
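To make the PageRank analogy concrete, the following is a minimal sketch of a baseline TextRank, assuming an unweighted, undirected co-occurrence graph and the conventional damping factor of 0.85. The function names, the window size, and the toy token list are illustrative assumptions and are not taken from the article.

```python
from collections import defaultdict


def build_cooccurrence_graph(tokens, window=2):
    """Link every pair of distinct words that occur within `window` tokens."""
    graph = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + window]:
            if other != word:
                graph[word].add(other)
                graph[other].add(word)
    return graph


def textrank(graph, damping=0.85, iterations=50):
    """PageRank-style power iteration over the co-occurrence graph."""
    scores = {node: 1.0 for node in graph}
    for _ in range(iterations):
        new_scores = {}
        for node in graph:
            # Each neighbour shares its score equally among its own neighbours.
            rank = sum(scores[nb] / len(graph[nb]) for nb in graph[node])
            new_scores[node] = (1 - damping) + damping * rank
        scores = new_scores
    return scores


# Toy input (an assumption): a pre-tokenized, pre-filtered word sequence.
tokens = ["graph", "ranking", "graph", "model", "graph", "keyphrase"]
scores = textrank(build_cooccurrence_graph(tokens))
for word, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{word}: {score:.3f}")
```

Because a word's score here depends only on how often and with what it co-occurs, frequent words dominate the ranking regardless of their meaning or grammatical role.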

Objective: This frequency-based characteristic has become a bottleneck for performance enhancement, and various improved TextRank algorithms have been proposed in recent years. Most of these improvements incorporate semantic information into the keyphrase extraction algorithm and report gains.

Method: In this research, taking both syntactic and semantic information into consideration, we integrated a syntactic tree algorithm with word embedding and proposed the Word Embedding and Syntactic Information Algorithm (WESIA), which improved the accuracy of the TextRank algorithm.
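The abstract does not state how WESIA combines the two signals, so the following is only an illustrative sketch of one possible edge weight for a TextRank-style graph, not the authors' formula. It assumes that semantic closeness is measured by cosine similarity between word vectors, that syntactic distance is the path length between two tokens in a single-rooted parse tree, and that the two are combined by division. The names `cosine_similarity`, `tree_distance`, and `edge_weight`, and the toy parse and vectors, are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def tree_distance(parents, i, j):
    """Number of edges on the path between tokens i and j.

    `parents[k]` is the index of token k's head, or -1 for the root.
    Assumes a single-rooted tree.
    """
    steps_from_i = {}
    node, steps = i, 0
    while node != -1:
        steps_from_i[node] = steps
        node, steps = parents[node], steps + 1
    # Climb from j until reaching an ancestor shared with i.
    node, steps = j, 0
    while node not in steps_from_i:
        node, steps = parents[node], steps + 1
    return steps + steps_from_i[node]


def edge_weight(vec_i, vec_j, parents, i, j):
    """Hypothetical weight: semantic similarity damped by syntactic distance."""
    if i == j:
        raise ValueError("edge weight is only defined for distinct tokens")
    similarity = max(cosine_similarity(vec_i, vec_j), 0.0)
    return similarity / tree_distance(parents, i, j)


# Toy parse (an assumption): token 1 is the root; 0 and 2 attach to 1; 3 attaches to 2.
parents = [1, -1, 1, 2]
print(tree_distance(parents, 0, 3))
print(edge_weight([1.0, 0.0], [1.0, 0.0], parents, 0, 3))
```

Under this assumed scheme, word pairs that are both semantically similar and syntactically close receive the strongest edges, so the ranking no longer depends on co-occurrence counts alone.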

Results: Results on a self-made test set and a public test set implied that the proposed unsupervised keyphrase extraction algorithm outperformed the other algorithms to some extent.

Keywords: Key phrases, Syntactic distance, Word embedding, Algorithm, TextRank, Word Embedding and Syntactic Information Algorithm (WESIA).


Article Details

VOLUME: 14
ISSUE: 9
Year: 2021
Page: [2987 - 2993]
Pages: 7
DOI: 10.2174/2666255813999200820155846
