Arun, Rajkumar, Venkatasubramaniyan Suresh, CE Veni Madhavan, and MN Narasimha Murthy. 2010. “On Finding the Natural Number of Topics with Latent Dirichlet Allocation: Some Observations.” In Pacific-Asia Conference on Knowledge Discovery and Data Mining, 391–402.
Baccianella, Stefano, Andrea Esuli, and Fabrizio Sebastiani. 2010. “SentiWordNet 3.0: An Enhanced Lexical Resource for Sentiment Analysis and Opinion Mining.” In LREC, 2200–2204.
Bär, Daniel, Torsten Zesch, and Iryna Gurevych. 2011. “A Reflective View on Text Similarity.” Proceedings of Recent Advances in Natural Language Processing: 515–20.
Benoit, Kenneth et al. 2018. “Quanteda: An R Package for the Quantitative Analysis of Textual Data.” Journal of Open Source Software 3(30): 774.
Blei, David M, Andrew Y Ng, and Michael I Jordan. 2003. “Latent Dirichlet Allocation.” Journal of Machine Learning Research 3(Jan): 993–1022.
Brady, Henry E. 2019. “The Challenge of Big Data and Data Science.” Annual Review of Political Science 22(1): 297–323.
Burtejin, Zorgit. 2016. “Csoportosítás (Klaszterezés).” In Kvantitatív Szövegelemzés És Szövegbányászat a Politikatudományban, ed. Miklós Sebők. Budapest: L’Harmattan, 85–101.
Cao, Juan et al. 2009. “A Density-Based Method for Adaptive LDA Model Selection.” Neurocomputing 72(7-9): 1775–81.
Deveaud, Romain, Eric SanJuan, and Patrice Bellot. 2014. “Accurate and Effective Latent Concept Modeling for Ad Hoc Information Retrieval.” Document numérique 17(1): 61–84.
Griffiths, T. L., and M. Steyvers. 2004. “Finding Scientific Topics.” Proceedings of the National Academy of Sciences 101(Supplement 1): 5228–35.
Grimmer, Justin, and Brandon M Stewart. 2013. “Text as Data: The Promise and Pitfalls of Automatic Content Analysis Methods for Political Texts.” Political Analysis 21(3): 267–97.
Hjorth, Frederik et al. 2015. “Computers, Coders, and Voters: Comparing Automated Methods for Estimating Party Positions.” Research & Politics 2(2): 2053168015580476.
Jacobi, Carina, Wouter Van Atteveldt, and Kasper Welbers. 2016. “Quantitative Analysis of Large Amounts of Journalistic Texts Using Topic Modelling.” Digital Journalism 4(1): 89–106.
Kusner, Matt J, Yu Sun, Nicholas I Kolkin, and Kilian Q Weinberger. 2015. “From Word Embeddings To Document Distances.” Proceedings of the 32nd International Conference on Machine Learning.
Kwartler, Ted. 2017. Text Mining in Practice with R. John Wiley & Sons.
Ladd, John R. 2020. “Understanding and Using Common Similarity Measures for Text Analysis.” The Programming Historian 9.
Laver, Michael, Kenneth Benoit, and John Garry. 2003. “Extracting Policy Positions from Political Texts Using Words as Data.” American Political Science Review: 311–31.
Laver, Michael, and John Garry. 2000. “Estimating Policy Positions from Political Texts.” American Journal of Political Science: 619–34.
Liu, Bing. 2010. “Sentiment Analysis and Subjectivity.” Handbook of Natural Language Processing 2(2010): 627–66.
Loughran, Tim, and Bill McDonald. 2011. “When Is a Liability Not a Liability? Textual Analysis, Dictionaries, and 10-Ks.” The Journal of Finance 66(1): 35–65.
Máté, Ákos, Miklós Sebők, and Tamás Barczikay. 2021. “The Effect of Central Bank Communication on Sovereign Bond Yields: The Case of Hungary.” Ed. Hiranya K. Nath. PLOS ONE 16(2): e0245515.
Mikolov, Tomas et al. 2018. “Advances in Pre-Training Distributed Word Representations.” In Proceedings of the International Conference on Language Resources and Evaluation (LREC 2018).
Mikolov, Tomas, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. “Efficient Estimation of Word Representations in Vector Space.” arXiv preprint arXiv:1301.3781.
Niwattanakul, Suphakit, Jatsada Singthongchai, Ekkachai Naenudorn, and Supachanun Wanapu. 2013. “Using of Jaccard Coefficient for Keywords Similarity.” Hong Kong: 5.
Pennington, Jeffrey, Richard Socher, and Christopher D Manning. 2014. “GloVe: Global Vectors for Word Representation.” In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 1532–43.
Phan, Xuan-Hieu, Le-Minh Nguyen, and Susumu Horiguchi. 2008. “Learning to Classify Short and Sparse Text & Web with Hidden Topics from Large-Scale Data Collections.” In, 91–100.
Roberts, Margaret E et al. 2014. “Structural Topic Models for Open-Ended Survey Responses.” American Journal of Political Science 58(4): 1064–82.
Russell, Stuart, and Peter Norvig. 2005. Mesterséges Intelligencia. Panem Kft.
Schütze, Hinrich, Christopher D Manning, and Prabhakar Raghavan. 2008. Introduction to Information Retrieval. Cambridge: Cambridge University Press.
Sebők, Miklós. 2016. Kvantitatív Szövegelemzés És Szövegbányászat a Politikatudományban. L’Harmattan Kiadó.
Sebők, Miklós et al. 2021. “Viscosity Revisited: The Power of Legislatures in New and Old Democracies - A Comparative Text Reuse Analysis.”
Selivanov, Dmitriy, Manuel Bickel, and Qing Wang. 2020. Text2vec: Modern Text Mining Framework for R.
Sieg, Adrien. 2018. “Text Similarities: Estimate the Degree of Similarity Between Two Texts.” Medium.
Silge, Julia, and David Robinson. 2017. Text Mining with R: A Tidy Approach. O’Reilly Media, Inc.
Slapin, Jonathan B, and Sven‐Oliver Proksch. 2008. “A Scaling Model for Estimating Time‐series Party Positions from Texts.” American Journal of Political Science 52(3): 705–22.
Spirling, Arthur, and Pedro L Rodriguez. 2021. “Word Embeddings.” Journal of Politics.
Straka, Milan, and Jana Straková. 2017. “Tokenizing, POS Tagging, Lemmatizing and Parsing UD 2.0 with UDPipe.” In Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, 88–99.
Szarvas, György, Richárd Farkas, and András Kocsor. 2006. “A Multilingual Named Entity Recognition System Using Boosting and C4.5 Decision Tree Learning Algorithms.” In International Conference on Discovery Science, Springer, 267–78.
Tan, Pang-Ning, Michael Steinbach, and Vipin Kumar. 2011. Bevezetés Az Adatbányászatba. Panem Kft.
Tikk, Domonkos. 2007. Szövegbányászat. Budapest: Typotext.
Üveges, István. 2019. “Named Entity Recognition in the Miskolc Legal Corpus.”
Vincze, Veronika. 2019. “Beszéd És Nyelvelemző Szoftverek.” In Beszéd És Nyelvelemző Szoftverek a Versenyképességért És Az Esélyegyenlőségért HunCLARIN Korpuszok És Nyelvtechnológiai Eszközök a Bölcsészet És Társadalomtudományokban, Szeged, 7–22.
Wang, Jiapeng, and Yihong Dong. 2020. “Measurement of Text Similarity: A Survey.” Information 11(9): 421.
Welbers, Kasper, Wouter Van Atteveldt, and Kenneth Benoit. 2017. “Text Analysis in R.” Communication Methods and Measures 11(4): 245–65.
Wickham, Hadley, and Garrett Grolemund. 2016. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. O’Reilly Media, Inc.
Young, Lori, and Stuart Soroka. 2012. “Affective News: The Automated Coding of Sentiment in Political Texts.” Political Communication 29(2): 205–31.
Zsibrita, János, Veronika Vincze, and Richárd Farkas. 2013. “Magyarlanc: A Tool for Morphological and Dependency Parsing of Hungarian.” In Proceedings of the International Conference Recent Advances in Natural Language Processing RANLP 2013, 763–71.