IETE Journal of Research Volume 69, 2023 - Issue 12

Computers and computing

Keyphrase Extraction Using Enhanced Word and Document Embedding

Fahd Saleh Alotaibi, Faculty of Computing and Information Technology, King Abdulaziz University, 21589 Jeddah, Saudi Arabia

Saurabh Sharma (corresponding author), Electronics and Computer Engineering Department, Thapar Institute of Engineering and Technology, 147 001 Patiala, India; Faculty of Engineering and Technology, University Institute of Engineering & Technology, Panjab University, 160 014 Chandigarh, India

Vishal Gupta, Faculty of Engineering and Technology, University Institute of Engineering & Technology, Panjab University, 160 014 Chandigarh, India

Savita Gupta, Faculty of Engineering and Technology, University Institute of Engineering & Technology, Panjab University, 160 014 Chandigarh, India

Pages 8876-8888 | Published online: 07 Aug 2022

https://doi.org/10.1080/03772063.2022.2103036

References

A. Onan, “Bidirectional convolutional recurrent neural network architecture with group-wise enhancement mechanism for text sentiment classification,” J. King Saud Univ.-Comput. Inf. Sci., Vol. 34, no. 5, pp. 2098–2117, 2022.
A. Onan, S. Korukoğlu, and H. Bulut, “A hybrid ensemble pruning approach based on consensus clustering and multi-objective evolutionary algorithm for sentiment classification,” Inf. Process. Manag., Vol. 53, no. 4, pp. 814–33, 2017. DOI:10.1016/j.ipm.2017.02.008
A. Onan, “An ensemble scheme based on language function analysis and feature engineering for text genre classification,” J. Inf. Sci., Vol. 44, no. 1, pp. 28–47, 2018. DOI:10.1177/0165551516677911
A. Onan, “Two-stage topic extraction model for bibliometric data analysis based on word embeddings and clustering,” IEEE Access, Vol. 7, pp. 145614–33, 2019. DOI:10.1109/ACCESS.2019.2945911
A. Onan, “Sentiment analysis on product reviews based on weighted word embeddings and deep neural networks,” Concurrency and Computation: Practice and Experience, Vol. 33, no. 23, pp. e5909, 2021. DOI:10.1002/cpe.5909
A. Onan, “Sentiment analysis on massive open online course evaluations: a text mining and deep learning approach,” Comput. Appl. Eng. Educ., Vol. 29, no. 3, pp. 572–89, 2021. DOI:10.1002/cae.22253
A. Onan, “Biomedical text categorization based on ensemble pruning and optimized topic modelling,” Comput. Math. Methods. Med., Vol. 2018, pp. 1–22, 2018. DOI:10.1155/2018/2497471
A. Onan, S. Korukoğlu, and H. Bulut, “Ensemble of keyword extraction methods and classifiers in text classification,” Expert. Syst. Appl., Vol. 57, pp. 232–47, 2016. DOI:10.1016/j.eswa.2016.03.045
O. Medelyan, and I. H. Witten, “Domain-independent automatic key-phrase indexing with small training sets,” J. Am. Soc. Inf. Sci. Technol., Vol. 59, no. 7, pp. 1026–40, 2008. DOI:10.1002/asi.20790
M. Litvak, and M. Last, “Graph-based keyword extraction for single document summarization,” in Proceedings of the Workshop on Multi-Source Multilingual Information Extraction and Summarization, 2008, pp. 17–24.
K. S. Hasan, and V. Ng, “Automatic keyphrase extraction: a survey of the state of the art,” in Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, 2014, pp. 1262–73.
J. M. Kleinberg, “Authoritative sources in a hyperlinked environment,” Journal of the ACM (JACM), Vol. 46, no. 5, pp. 604–32, 1999. DOI:10.1145/324133.324140
R. Mihalcea, and P. Tarau, “Textrank: bringing order into text,” in Proceedings of the Conference on Empirical Methods in Natural Language Processing, 2004, pp. 404–411.
L. Zhiyuan, H. Wenyi, Z. Yabin, and S. Maosong, “Automatic keyphrase extraction via topic decomposition,” in Proceedings of the Conference on Empirical Methods in Natural Language Processing, 2010, pp. 366–76.
S. Brin, and L. Page, “The anatomy of a large-scale hypertextual web search engine,” Comput. Netw. ISDN Syst., Vol. 30, no. 1, pp. 107–117, 1998. DOI:10.1016/S0169-7552(98)00110-X
T. Mikolov, K. Chen, G. Corrado, and J. Dean, “Efficient estimation of word representations in vector space,” in Proceedings of the International Conference on Learning Representations, 2013.
J. H. Lau, and T. Baldwin, “An empirical evaluation of doc2vec with practical insights into document embedding generation,” in Proceedings of the 1st Workshop on Representation Learning for NLP, 2016, pp. 78–86.
J. Pennington, R. Socher, and C. Manning, “Glove: global vectors for word representation,” in Proceedings of the Conference on Empirical Methods in Natural Language Processing, 2014, pp. 1532–43.
T. Mikolov, I. Sutskever, K. Chen, G. Corrado, and J. Dean, “Distributed representations of words and phrases and their compositionality,” in Proceedings of the 26th International Conference on Neural Information Processing Systems, 2013, pp. 3111–9.
G. Salton, A. Wong, and C. Yang, “A vector space model for automatic indexing,” Commun. ACM, Vol. 18, no. 11, pp. 613–20, 1975. DOI:10.1145/361219.361220
S. Rose, D. Engel, N. Cramer, and W. Cowley, “Automatic keyword extraction from individual documents,” Text Mining: Theory and Applications, Vol. 1, pp. 1–20, 2010.
I. H. Witten, G. W. Paynter, E. Frank, C. Gutwin, and C. G. Nevill-Manning, “KEA: Practical automatic keyword extraction,” in Proceedings of the 4th ACM Conference on Digital Libraries, 1999, pp. 254–5.
X. Wan, and J. Xiao, “Single document keyword extraction using neighborhood knowledge,” in Proceedings of the 23rd National Conference on Artificial Intelligence, 2008, pp. 855–60.
R. Wang, W. Liu, and C. McDonald, “Corpus-independent generic key-phrase extraction using word embedding vectors,” in Proceedings of the Software Engineering Research Conference, 2014, pp. 39–46.
J. Rafiei-Asl, and A. Nickabadi, “TSAKE: A topical and structural automatic key-phrase extractor,” Appl. Soft. Comput., Vol. 58, pp. 620–30, 2017. DOI:10.1016/j.asoc.2017.05.014
S. Danesh, T. Sumner, and J. H. Martin, “SGrank: combining statistical and graphical methods to improve the state of the art in unsupervised key-phrase extraction,” in Proceedings of the Fourth Joint Conference on Lexical and Computational Semantics, 2015, pp. 117–26.
Y. Sun, H. Qiu, Y. Zheng, Z. Wang, and C. Zhang, “SIFRank: A new baseline for unsupervised keyphrase extraction based on pre-trained language model,” IEEE Access, Vol. 8, pp. 10896–906, 2020. DOI:10.1109/ACCESS.2020.2965087
K. Bennani-Smires, C. Musat, A. Hossmann, M. Baeriswyl, and M. Jaggi, “Simple unsupervised keyphrase extraction using sentence embeddings,” in Proceedings of the 22nd Conference on Computational Natural Language Learning, 2018, pp. 221–9.
M. Baroni, G. Dinu, and G. Kruszewski, “Don’t count, predict! A systematic comparison of context-counting vs. context-predicting semantic vectors,” in Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, 2014, pp. 238–47.
A. Joulin, E. Grave, P. Bojanowski, M. Douze, H. Jegou, and T. Mikolov, “Compressing text classification models,” arXiv preprint arXiv:1612.03651, 2016.
S. Deerwester, S. T. Dumais, G. W. Furnas, T. K. Landauer, and R. Harshman, “Indexing by latent semantic analysis,” J. Am. Soc. Inf. Sci., Vol. 41, no. 6, pp. 391–407, 1990. DOI:10.1002/(SICI)1097-4571(199009)41:6<391::AID-ASI1>3.0.CO;2-9
P. S. Dhillon, D. P. Foster, and L. H. Ungar, “Eigenwords: spectral word embeddings,” J. Mach. Learn. Res., Vol. 16, pp. 3035–78, 2015.
H. Hotelling, “The most predictable criterion,” J. Educ. Psychol., Vol. 26, no. 2, pp. 139–42, 1935. DOI:10.1037/h0058165
N. Kalchbrenner, and P. Blunsom, “Recurrent continuous translation models,” in Proceedings of the Conference on Empirical Methods in Natural Language Processing, 2013, pp. 1700–1709.
P. Bojanowski, E. Grave, A. Joulin, and T. Mikolov, “Enriching word vectors with subword information,” Transactions of the Association for Computational Linguistics, Vol. 5, pp. 135–146, 2017. DOI:10.1162/tacl_a_00051
G. Lev, B. Klein, and L. Wolf, “In defense of word embedding for generic text representation,” in Proceedings of the Natural Language Processing and Information Systems of Lecture Notes in Computer Science, 2015, pp. 35–50.
P. Zeng, Q. Tan, Y. Yan, Q. Xie, J. Xu, and W. Cao, “Automatic keyword extraction using word embedding and clustering,” in Proceedings of the International Conference on Computer Systems, Electronics and Control, 2017, pp. 1402–8.
D. Mahata, J. Kuriakose, R. R. Shah, R. Zimmermann, and J. R. Talburt, “Theme-weighted ranking of keywords from text documents using phrase embeddings,” arXiv preprint arXiv:1807.05962, 2018.
Q. Qinjun, X. Zhong, W. Liang, and L. Wenjia, “Geoscience key-phrase extraction algorithm using enhanced word embedding,” Expert. Syst. Appl., Vol. 125, no. 1, pp. 157–69, 2019.
E. Papagiannopoulou, and G. Tsoumakas, “Local word vectors guiding keyphrase extraction,” Inf. Process. Manag., Vol. 54, no. 6, pp. 888–902, 2018. DOI:10.1016/j.ipm.2018.06.004
J. Li, G. Huang, C. Fan, Z. Sun, and H. Zhu, “Key word extraction for short text via word2vec, doc2vec, and textrank,” Turkish Journal of Electrical Engineering & Computer Sciences, Vol. 27, no. 3, pp. 1794–805, 2019. DOI:10.3906/elk-1806-38
T. Newman, and P. E. Anderson, “Alignment-based topic extraction using word embedding,” in IAL@PKDD/ECML, 2018, pp. 60–72.
F. S. Alotaibi, and S. Sharma, “Distributed feature sets for document specific key-phrase extraction,” Cybernetics and Systems, 2022. DOI:10.1080/01969722.2022.2055990
T. Kiss, and J. Strunk, “Unsupervised multilingual sentence boundary detection,” Comput. Linguist., Vol. 32, no. 4, pp. 485–525, 2006. DOI:10.1162/coli.2006.32.4.485
W. H. E. Day, and H. Edelsbrunner, “Efficient algorithms for agglomerative hierarchical clustering methods,” J. Classif., Vol. 1, no. 1, pp. 7–24, 1984. DOI:10.1007/BF01890115
A. Hulth, “Improved automatic keyword extraction given more linguistic knowledge,” in Proceedings of the ACM Conference on Empirical Methods in Natural Language Processing, 2003, pp. 216–23.
S. N. Kim, O. Medelyan, M. Y. Kan, and T. Baldwin, “SemEval-2010 task 5: automatic keyword extraction from scientific articles,” in Proceedings of the 5th International Workshop on Semantic Evaluation, 2010, pp. 21–6.
C. Caragea, F. A. Bulgarov, A. Godea, and S. D. Gollapalli, “Citation-enhanced keyword extraction from research papers: A supervised approach,” in Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2014, pp. 1435–46.
H. Yeom, Y. Ko, and J. Seo, “Unsupervised-learning-based keyphrase extraction from a single document by the effective combination of the graph-based model and the modified C-value method,” Comput. Speech. Lang., Vol. 58, pp. 304–18, 2019. DOI:10.1016/j.csl.2019.04.008
