Keyphrase Extraction Using Enhanced Word and Document Embedding
