References
Arun, Rajkumar, Venkatasubramaniyan Suresh, CE Veni Madhavan, and MN Narasimha Murthy. 2010. “On Finding the Natural Number of Topics with Latent Dirichlet Allocation: Some Observations.” In 14th Pacific-Asia Conference, PAKDD 2010, Hyderabad, India, June 21-24, 2010, Proceedings, 391–402.
Baccianella, Stefano, Andrea Esuli, and Fabrizio Sebastiani. 2010. “SENTIWORDNET 3.0: An Enhanced Lexical Resource for Sentiment Analysis and Opinion Mining.” In Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC’10), 2200–2204. http://www.lrec-conf.org/proceedings/lrec2010/pdf/769_Paper.pdf.
Bär, Daniel, Torsten Zesch, and Iryna Gurevych. 2011. “A Reflective View on Text Similarity.” In Proceedings of the International Conference Recent Advances in Natural Language Processing 2011, 515–20. https://aclanthology.org/R11-1071.
Benoit, Kenneth et al. 2018. “Quanteda: An R Package for the Quantitative Analysis of Textual Data.” Journal of Open Source Software 3(30): 774. https://quanteda.io.
Blei, David M, Andrew Y Ng, and Michael I Jordan. 2003. “Latent Dirichlet Allocation.” Journal of Machine Learning Research 3(Jan): 993–1022. https://www.jmlr.org/papers/v3/blei03a.html.
Brady, Henry E. 2019. “The Challenge of Big Data and Data Science.” Annual Review of Political Science 22(1): 297–323. https://www.annualreviews.org/doi/10.1146/annurev-polisci-090216-023229 (April 15, 2021).
Burtejin, Zorgit. 2016. “Csoportosítás (Klaszterezés).” In Kvantitatív Szövegelemzés És Szövegbányászat a Politikatudományban, ed. Miklós Sebők. Budapest: L’Harmattan, 85–101.
Cao, Juan et al. 2009. “A Density-Based Method for Adaptive LDA Model Selection.” Neurocomputing 72(7-9): 1775–81.
Deveaud, Romain, Eric SanJuan, and Patrice Bellot. 2014. “Accurate and Effective Latent Concept Modeling for Ad Hoc Information Retrieval.” Document numérique 17(1): 61–84.
Griffiths, T. L., and M. Steyvers. 2004. “Finding Scientific Topics.” Proceedings of the National Academy of Sciences 101(Supplement 1): 5228–35. http://www.pnas.org/cgi/doi/10.1073/pnas.0307752101 (February 23, 2021).
Grimmer, Justin, and Brandon M Stewart. 2013. “Text as Data: The Promise and Pitfalls of Automatic Content Analysis Methods for Political Texts.” Political Analysis 21(3): 267–97.
Hjorth, Frederik et al. 2015. “Computers, Coders, and Voters: Comparing Automated Methods for Estimating Party Positions.” Research & Politics 2(2): 1–9.
Jacobi, Carina, Wouter Van Atteveldt, and Kasper Welbers. 2015. “Quantitative Analysis of Large Amounts of Journalistic Texts Using Topic Modelling.” Digital Journalism 4(1): 89–106.
Kusner, Matt J, Yu Sun, Nicholas I Kolkin, and Kilian Q Weinberger. 2015. “From Word Embeddings To Document Distances.” In Proceedings of the 32nd International Conference on Machine Learning, 957–66. https://proceedings.mlr.press/v37/kusnerb15.html.
Kwartler, Ted. 2017. Text Mining in Practice with R. Hoboken, NJ: John Wiley & Sons.
Ladd, John R. 2020. “Understanding and Using Common Similarity Measures for Text Analysis.” Programming Historian (9). https://programminghistorian.org/en/lessons/common-similarity-measures.
Laver, Michael, Kenneth Benoit, and John Garry. 2003. “Extracting Policy Positions from Political Texts Using Words as Data.” American Political Science Review 97(2): 311–31.
Laver, Michael, and John Garry. 2000. “Estimating Policy Positions from Political Texts.” American Journal of Political Science 44(3): 619–34.
Liu, Bing. 2010. “Sentiment Analysis and Subjectivity.” Handbook of Natural Language Processing 2: 627–66. https://www.researchgate.net/profile/Bing-Liu-120/publication/228667268_Sentiment_analysis_and_subjectivity/links/5472bbea0cf24bc8ea199f7c/Sentiment-analysis-and-subjectivity.pdf.
Loughran, Tim, and Bill McDonald. 2011. “When Is a Liability Not a Liability? Textual Analysis, Dictionaries, and 10-Ks.” The Journal of Finance 66(1): 35–65. https://onlinelibrary.wiley.com/doi/abs/10.1111/j.1540-6261.2010.01625.x.
Máté, Ákos, Miklós Sebők, and Tamás Barczikay. 2021. “The Effect of Central Bank Communication on Sovereign Bond Yields: The Case of Hungary” ed. Hiranya K. Nath. PLOS ONE 16(2): e0245515. https://dx.plos.org/10.1371/journal.pone.0245515 (February 15, 2021).
Mikolov, Tomas et al. 2018. “Advances in Pre-Training Distributed Word Representations.” In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), 52–55. https://aclanthology.org/L18-1008.
Mikolov, Tomas, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. “Efficient Estimation of Word Representations in Vector Space.” arXiv preprint arXiv:1301.3781.
Niwattanakul, Suphakit, Jatsada Singthongchai, Ekkachai Naenudorn, and Supachanun Wanapu. 2013. “Using of Jaccard Coefficient for Keywords Similarity.” In Proceedings of the International MultiConference of Engineers and Computer Scientists 2013 Vol I, IMECS 2013, 1–5. https://www.iaeng.org/publication/IMECS2013/IMECS2013_pp380-384.pdf.
Pennington, Jeffrey, Richard Socher, and Christopher D Manning. 2014. “GloVe: Global Vectors for Word Representation.” In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 1532–43.
Phan, Xuan-Hieu, Le-Minh Nguyen, and Susumu Horiguchi. 2008. “Learning to Classify Short and Sparse Text & Web with Hidden Topics from Large-Scale Data Collections.” In WWW ’08: Proceedings of the 17th International Conference on World Wide Web, 91–100.
Roberts, Margaret E et al. 2014. “Structural Topic Models for Open-Ended Survey Responses.” American Journal of Political Science 58(4): 1064–82.
Russell, Stuart, and Peter Norvig. 2005. Mesterséges Intelligencia Modern Megközelítésben. Budapest: Panem.
Schütze, Hinrich, Christopher D Manning, and Prabhakar Raghavan. 2008. Introduction to Information Retrieval. New York: Cambridge University Press.
Sebők, Miklós, ed. 2016. Kvantitatív Szövegelemzés És Szövegbányászat a Politikatudományban. Budapest: L’Harmattan.
Sebők, Miklós, Tamás Berki, and Flóra Bolonyai. 2021. “Viscosity Revisited: The Power of Legislatures in New and Old Democracies - A Comparative Text Reuse Analysis.”
Selivanov, Dmitriy, Manuel Bickel, and Qing Wang. 2020. “Text2vec: Modern Text Mining Framework for R.” https://cran.r-project.org/web/packages/text2vec/index.html.
Sieg, Adrien. 2018. “Text Similarities : Estimate the Degree of Similarity Between Two Texts.” Medium. https://medium.com/@adriensieg/text-similarities-da019229c894.
Silge, Julia, and David Robinson. 2017. Text Mining with R: A Tidy Approach. Beijing; Boston: O’Reilly Media, Inc.
Slapin, Jonathan B, and Sven-Oliver Proksch. 2008. “A Scaling Model for Estimating Time-Series Party Positions from Texts.” American Journal of Political Science 52(3): 705–22.
Spirling, Arthur, and Pedro L Rodriguez. 2021. “Word Embeddings: What Works, What Doesn’t, and How to Tell the Difference for Applied Research.” The Journal of Politics 84(1). https://polmeth.mit.edu/sites/default/files/documents/Pedro_Rodriguez.pdf.
Tan, Pang-Ning, Michael Steinbach, and Vipin Kumar. 2011. Bevezetés Az Adatbányászatba. Panem. https://gyires.inf.unideb.hu/KMITT/a04/.
Tikk, Domonkos, ed. 2007. Szövegbányászat. Budapest: Typotex Kiadó.
Üveges, István. 2019. “Named Entity Recognition in the Miskolc Legal Corpus.” In, 113–22. https://publicatio.bibl.u-szeged.hu/19233/7/31140948.pdf.
Vincze, Veronika. 2019. “Bevezetés a Korpuszok És Nyelvi Adatbázisok Világába.” In Beszéd És Nyelvelemző Szoftverek a Versenyképességért És Az Esélyegyenlőségért: HunCLARIN Korpuszok És Nyelvtechnológiai Eszközök a Bölcsészet- És Társadalomtudományokban, eds. Hedvig Sulyok, Valéria Juhász, and Tamás Erdei. Szeged: SZTE JGYPK Magyar és Alkalmazott Nyelvészeti Tanszék, 5–20.
Wang, Jiapeng, and Yihong Dong. 2020. “Measurement of Text Similarity: A Survey.” Information 11(9): 421. https://www.mdpi.com/2078-2489/11/9/421 (March 20, 2021).
Welbers, Kasper, Wouter Van Atteveldt, and Kenneth Benoit. 2017. “Text Analysis in R.” Communication Methods and Measures 11(4): 245–65.
Wickham, Hadley, and Garrett Grolemund. 2016. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. Sebastopol, CA: O’Reilly Media, Inc.
Young, Lori, and Stuart Soroka. 2012. “Affective News: The Automated Coding of Sentiment in Political Texts.” Political Communication 29(2): 205–31.