Dissertations / Theses on the topic 'Authorship attribution'
Consult the top 50 dissertations / theses for your research on the topic 'Authorship attribution.'
Calarota, Gabriele. "On Authorship Attribution." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2021. http://amslaurea.unibo.it/22809/.
Honaker, Randale J. "Novel topic authorship attribution." Thesis, Monterey, California. Naval Postgraduate School, 2011. http://hdl.handle.net/10945/5761.
The practice of using statistical models in predicting authorship (so-called author-attribution models) is long established. Several recent authorship attribution studies have indicated that topic-specific cues impact author-attribution machine learning models. The arrival of new topics should be anticipated rather than ignored in an author attribution evaluation methodology; a model that relies heavily on topic cues will be problematic in deployment settings where novel topics are common. In order to effectively deal with novel topics, we create author and topic vectors and attempt to project out the topic influences from each document. Although our experiments did not validate our assumptions, they do point out a possible problem with a common assumption in authorship attribution research.
Lalla, Himal. "E-mail forensic authorship attribution." Thesis, University of Fort Hare, 2010. http://hdl.handle.net/10353/360.
Gerritsen, Corey M. (Corey Metcalf) 1979. "Authorship attribution using lexical attraction." Thesis, Massachusetts Institute of Technology, 2003. http://hdl.handle.net/1721.1/87414.
Includes bibliographical references (p. 56-57).
by Corey M. Gerritsen.
M.Eng. and S.B.
Tennyson, Matthew Francis. "Authorship Attribution of Source Code." NSUWorks, 2013. http://nsuworks.nova.edu/gscis_etd/322.
Grant, T. D. "Authorship attribution in a forensic context." Thesis, University of Birmingham, 2005. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.529439.
Pires, David Laranjo. "Authorship attribution using co-occurrence networks." Master's thesis, Universidade de Évora, 2021. http://hdl.handle.net/10174/30831.
Gopalakrishnan, Sridharan. "Authorship Attribution based on Grammar Signatures." University of Cincinnati / OhioLINK, 2013. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1368026620.
Caver, Johnnie F. "Novel topic impact on authorship attribution." Thesis, Monterey, California: Naval Postgraduate School, 2009. http://edocs.nps.edu/npspubs/scholarly/theses/2009/Dec/09Dec%5FCaver.pdf.
Thesis Advisor(s): Schein, Andrew I.; Martell, Craig H. "December 2009." Description based on title screen as viewed on February 01, 2010. Author(s) subject terms: Authorship detection, topic detection, author-topic correlation, topic-author correlation, maximum entropy, New York Times Annotated Corpus. Includes bibliographical references (p. 61-63). Also available in print.
Zhao, Ying. "Effective Authorship Attribution in Large Document Collections." RMIT University. Computer Science and Information Technology, 2008. http://adt.lib.rmit.edu.au/adt/public/adt-VIT20080730.162501.
Johnson, Russell Clark. "Authorship Attribution with Function Word N-Grams." NSUWorks, 2013. http://nsuworks.nova.edu/gscis_etd/188.
Teixeira, Filipe. "Boosting compression-based classifiers for authorship attribution." Master's thesis, Universidade de Aveiro, 2016. http://hdl.handle.net/10773/18375.
Authorship attribution is the task of assigning an author to an anonymous document. Although the task was traditionally performed by expert linguists, many new techniques have been suggested since the appearance of computers, in the middle of the XX century, some of them using compressors to find repeating patterns in the data. This work will present the results that can be achieved by a collaboration of more than one compressor using a meta-algorithm known as Boosting.
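The idea of using a compressor to find repeating patterns can be illustrated with a minimal sketch. This is not the boosting method of the thesis: it assumes a single general-purpose compressor (Python's `zlib`) and a hypothetical scoring rule, namely how many extra bytes the anonymous text costs once an author's known writing has already been compressed. The author names and texts are made up for illustration.

```python
import zlib


def compressed_size(text):
    """Return the zlib-compressed size of the text in bytes."""
    return len(zlib.compress(text.encode("utf-8"), 9))


def extra_bytes(author_text, unknown):
    """Bytes added by `unknown` when it is compressed after the author's text."""
    return compressed_size(author_text + unknown) - compressed_size(author_text)


def attribute(unknown, corpora):
    """Pick the author whose corpus makes the unknown text cheapest to encode."""
    return min(corpora, key=lambda author: extra_bytes(corpora[author], unknown))


# Toy corpora (illustrative only): one string of known writing per author.
corpora = {
    "author_a": "the quick brown fox jumps over the lazy dog " * 20,
    "author_b": "lorem ipsum dolor sit amet consectetur adipiscing elit " * 20,
}
unknown = "the quick brown fox jumps over the lazy dog"
guess = attribute(unknown, corpora)
```

On real data the known texts would be much longer and more varied, and several compressors could be combined, which is where a meta-algorithm such as boosting would come in.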
Boutwell, Sarah R. "Authorship attribution of short messages using multimodal features." Thesis, Monterey, California. Naval Postgraduate School, 2011. http://hdl.handle.net/10945/5813.
In this thesis, we develop a multimodal classifier for authorship attribution of short messages. Standard natural language processing authorship attribution techniques are applied to a Twitter text corpus. Using character n-gram features and a Naïve Bayes classifier, we build statistical models of the set of authors. The social network of the selected Twitter users is analyzed using the screen names referenced in their messages. The timestamps of the messages are used to generate a pattern-of-life model. We analyze the physical layer of a network by measuring modulation characteristics of GSM cell phones. A statistical model of each cell phone is created using a Naïve Bayes classifier. Each phone is assigned to a Twitter user, and the probability outputs of the individual classifiers are combined to show that the combination of natural-language and network-feature classifiers identifies a user to phone binding better than when the individual classifiers are used independently.
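The text-only part of such a pipeline, character n-gram features fed to a Naïve Bayes classifier, can be sketched as follows. This is an illustrative assumption rather than the thesis's implementation: it uses character trigrams, a multinomial model with add-one smoothing, and made-up author names and messages.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Return the overlapping character n-grams of the text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def train(samples, n=3):
    """Build per-author n-gram counts from (author, text) pairs."""
    counts, docs = {}, Counter()
    for author, text in samples:
        counts.setdefault(author, Counter()).update(char_ngrams(text, n))
        docs[author] += 1
    vocab = set().union(*counts.values())
    return counts, docs, vocab


def predict(text, model, n=3):
    """Return the author with the highest log-posterior for the text."""
    counts, docs, vocab = model
    total_docs = sum(docs.values())
    best_author, best_score = None, -math.inf
    for author, grams in counts.items():
        total = sum(grams.values())
        score = math.log(docs[author] / total_docs)  # class prior
        for gram in char_ngrams(text, n):
            if gram in vocab:  # n-grams never seen in training are skipped
                score += math.log((grams[gram] + 1) / (total + len(vocab)))
        if score > best_score:
            best_author, best_score = author, score
    return best_author


# Toy training set (illustrative only).
model = train([
    ("alice", "hello there my friend, hello again"),
    ("bob", "zzz qqq xxx zzz qqq"),
])
first = predict("hello friend", model)
second = predict("zzz qqq", model)
```

Working in log space avoids numeric underflow when a message contains many n-grams, and the smoothing keeps an n-gram that one author never used from driving that author's probability to zero.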
Sari, Yunita. "Neural and non-neural approaches to authorship attribution." Thesis, University of Sheffield, 2018. http://etheses.whiterose.ac.uk/21415/.
Sapkota, Upendra. "Improving the performance of cross-domain authorship attribution." Thesis, The University of Alabama at Birmingham, 2015. http://pqdtopen.proquest.com/#viewpdf?dispub=3739881.
Most previous research on authorship attribution (AA) assumes that the training and test data are drawn from the same distribution. But in real scenarios, this assumption is too strong. Because of domain mismatches, the AA approaches that perform well on same domain scenarios will degrade performance in cross-domain settings. The goal of this research is to improve the prediction results in cross-domain AA (CDAA), where there is no training data available from the target domain. We propose three different CDAA frameworks to overcome the lack of training samples from the target domain. Our first framework is driven by the hypothesis that a simple model built from all available out-of-domain data effectively discriminates among authors for a new domain. In addition to improving the performance of CDAA, we also study the effectiveness of the three most commonly used feature types in AA. In the second framework, we explore character n-grams by separating them into ten distinct categories based on the linguistic aspect they represent. Finally, the third framework tries to represent each instance with a common feature representation that is meaningful across domains. Based on the findings of our first and second framework, we propose to use and compare two formulations of features for CDAA.
We use prediction accuracy as the performance metric. We compare the performance of proposed frameworks with state-of-the-art approaches, whenever possible. We first demonstrate that addition of training data even if it comes from out-of-topic improves the performance of cross-topic AA. Also we find that character n-grams are the most effective author discriminator for both single as well as cross-domain AA. Once we demonstrate the efficacy of character n-grams in CDAA, we then propose to categorize them to further understand their predictive value. We then demonstrate the discriminative power of each n-gram category, and propose to discard some of the worst performing categories. In the third framework, we demonstrate that structural correspondence learning can induce feature correspondences for AA, and these feature correspondences combine with our character n-gram categorization to yield superior performance on cross-domain AA.
Shaker, Kareem. "Investigating features and techniques for Arabic authorship attribution." Thesis, Heriot-Watt University, 2012. http://hdl.handle.net/10399/2576.
Balla, Stefano. "On code stylometry: authorship attribution of source codesnippets." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2021. http://amslaurea.unibo.it/24264/.
Lindh, Morén Jonas. "The Application of Closed Frequent Subtrees to Authorship Attribution." Thesis, Umeå universitet, Institutionen för datavetenskap, 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-86458.
Grieve, Jack William. "Quantitative authorship attribution: a history and an evaluation of techniques." Burnaby B.C.: Simon Fraser University, 2005. http://ir.lib.sfu.ca/handle/1892/2055.
Cavalcante, Thiago 1989. "Authorship attribution on micro-messages = Atribuição de autoria em micro-mensagens." [s.n.], 2014. http://repositorio.unicamp.br/jspui/handle/REPOSIP/275539.
Dissertation (master's) - Universidade Estadual de Campinas, Instituto de Matemática, Estatística e Computação Científica
Abstract: With the ever-growing use of social media, authorship attribution plays an important role in avoiding cybercrime and in helping the analysis of online trails left behind by cyber pranks, stalkers, bullies, identity thieves and the like. In this dissertation, we propose a method for authorship attribution in micro blogs with efficiency one hundred to a thousand times faster than state-of-the-art counterparts. We also achieved an accuracy of 65% when classifying texts from 50 authors. The method relies on a powerful and scalable feature representation approach taking advantage of user patterns on micro-blog messages, and also on a custom-tailored pattern classifier adapted to deal with big data and high-dimensional data. Finally, we discuss search space reduction when analysing hundreds of online suspects and millions of online micro messages, which makes this approach invaluable for digital forensics and law enforcement.
Master's
Computer Science
Master in Computer Science
Hendrikse, Steven. "The Effect of Code Obfuscation on Authorship Attribution of Binary Computer Files." NSUWorks, 2017. http://nsuworks.nova.edu/gscis_etd/1009.
Corney, Malcolm W. "Analysing e-mail text authorship for forensic purposes." Thesis, Queensland University of Technology, 2003. https://eprints.qut.edu.au/16069/1/Malcolm_Corney_Thesis.pdf.
Corney, Malcolm W. "Analysing E-mail Text Authorship for Forensic Purposes." Queensland University of Technology, 2003. http://eprints.qut.edu.au/16069/.
Lindholm, Lars. ""Art Made Tongue-tied By Authority?" : The Shakespeare Authorship Question." Thesis, Stockholms universitet, Engelska institutionen, 2012. http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-78261.
Literary Degree Project
Kimler, Marco. "Using Style Markers for Detecting Plagiarism in Natural Language Documents." Thesis, University of Skövde, Department of Computer Science, 2003. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-824.
Most of the existing plagiarism detection systems compare a text to a database of other texts. These external approaches, however, are vulnerable because texts not contained in the database cannot be detected as source texts. This paper examines an internal plagiarism detection method that uses style markers from authorship attribution studies in order to find stylistic changes in a text. These changes might pinpoint plagiarized passages. Additionally, a new style marker called specific words is introduced. A pre-study tests if the style markers can fingerprint an author's style and if they are constant with sample size. It is shown that vocabulary richness measures do not fulfil these prerequisites. The other style markers - simple ratio measures, readability scores, frequency lists, and entropy measures - have these characteristics and are, together with the new specific words measure, used in a main study with an unsupervised approach for detecting stylistic changes in plagiarized texts at sentence and paragraph levels. It is shown that at these small levels the style markers generally cannot detect plagiarized sections because of intra-authorial stylistic variations (i.e. noise), and that at bigger levels the results are strongly affected by the sliding window approach. The specific words measure, however, can pinpoint single sentences written by another author.
Bugo, Laura. "authorship analysis: studio delle metodologie e sviluppo di un sistema di riconoscimento." Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2018.
Marinho, Vanessa Queiroz. "Development of new models for authorship recognition using complex networks." Universidade de São Paulo, 2017. http://www.teses.usp.br/teses/disponiveis/55/55134/tde-14112017-091805/.
Complex networks have been applied successfully in different domains and are studied in distinct areas including, for example, physics and computer science. The discovery that complex network methods can be used to analyze texts at their distinct levels of complexity has brought advances in natural language processing (NLP) tasks. Examples of applications analyzed with complex network methods are keyword detection, the creation of automatic summarizers, and authorship recognition. This last task has been studied with some success through word co-occurrence (or adjacency) networks that connect only the nearest words in the text. Despite this success, few works have tried to extend these networks or to use different representations. Moreover, many approaches use a similar set of complex network measurements and do not combine their techniques with those traditionally used in the authorship recognition task. This master's research proposes extensions to the traditional co-occurrence model and investigates the suitability of new attributes and other models (such as mesoscopic and named-entity networks) for the task. Connectivity information of function words is used to complement the characterization of the authors' writing, since these words are relevant to the task. Finally, the main contribution of this work is the development of hybrid classifiers, called labelled motifs, which combine traditional factors with the properties provided by the topological analysis of complex networks. The relevance of these classifiers is verified in the context of authorship recognition and translationese identification. With this hybrid approach, it is shown that the performance of network-based techniques can be improved by combining them with traditional NLP techniques.
By adapting, combining and improving the model, not only was the performance of authorship recognition systems improved, but it was also possible to better understand which quantitative textual factors (measured via networks) can be used in stylometry. The advances obtained during this project can be used to study related applications, such as the analysis of stylistic inconsistencies and plagiarism, and the analysis of textual complexity. In addition, many of the methods proposed in this work can easily be applied to various natural languages.
Schneider, Michael J. "A Study on the Efficacy of Sentiment Analysis in Author Attribution." Digital Commons @ East Tennessee State University, 2015. https://dc.etsu.edu/etd/2538.
Taromi, Kurosh. "Authorship Attribution in Modern Persian Prose: An Innovative Method to Find Style Discriminators Between Any Set of Authors." Saarbrücken: VDM Verlag Dr. Müller, 2010. http://www.vdm-verlag.de.
Levy-Minzie, Kori. "Authorship attribution in the e-mail domain: a study of the effect of size of author corpus and topic on accuracy of identification." Thesis, Monterey, California. Naval Postgraduate School, 2011. http://hdl.handle.net/10945/5780.
We determined that it is possible to achieve authorship attribution in the e-mail domain when training on "personal" e-mails and testing on "work" e-mails and vice versa. These results are unique since they simulate two different e-mail addresses belonging to the same person where the topic of the e-mails from the two different addresses do not intersect. As we only used one classification technique, these results are preliminary and may serve as a baseline for future work in this area. The corpus of data was the entirety of the Enron corpus as well as a subsection of hand-annotated work and personal e-mails. We discovered that there is enough author signal in each class to identify an author in a sea of noise. We included suggestions for future work in the areas of expanding feature selection, increasing corpus size, and including more classification methods. Advancement in this area will contribute to increasing cyber security by identifying the senders of anonymous derogatory e-mails and reducing cyber bullying.
Funai, Tomohiko. "Extensions of Nearest Shrunken Centroid Method for Classification." BYU ScholarsArchive, 2010. https://scholarsarchive.byu.edu/etd/2402.
Dubois, François-Ronan. "L'Appropriation de l'œuvre : Instances et visées de l'attribution des œuvres à leur auteur dans la France de l'Ancien Régime (1645-1777)." Thesis, Université Grenoble Alpes (ComUE), 2017. http://www.theses.fr/2017GREAL038/document.
Literary property rights in early modern France are often understood through the prism of the contemporary droit d’auteur. Many studies see the early modern period as a laboratory for an on-going experiment in law and ideology, still ill-fitted to the literary practices of the authors. This thesis offers a fresh start in the examination of the question of literary property, taking the whole library system from the 1650s to the 1780s to be an effective articulation of agents, tools, and discursive practices. Through the study of institutional policies in the domain of literary property as well as judicial responsibility, through a careful reading of the bibliographical discourse with dictionaries, anas, and periodicals, and through the description of editorial endeavors undertaken by authors themselves, it shows the dynamics of the early modern library and literary world. With roots in literary history, history of law, and book history, this dissertation seeks to understand how the concept of literary property is aggregated, against the very interests of the authors, to consolidate a commercial book-trade where the State slowly delegates its regulatory powers. Through the study of literary attribution, this work follows its demonstrations with an acute interest in a close-reading of literary paratexts.
Java, James. "Characterization of Prose by Rhetorical Structure for Machine Learning Classification." NSUWorks, 2015. http://nsuworks.nova.edu/gscis_etd/347.
Valencia, Camilo Akimushkin. "Propriedades de redes aplicadas à atribuição de autoria." Universidade de São Paulo, 2017. http://www.teses.usp.br/teses/disponiveis/76/76131/tde-12092017-081937/.
Authorship attribution is an active research area with many applications, including detection of plagiarism, analysis of historical texts, terrorist message identification or document falsification. Theoretical models of complex networks are already used for authorship attribution, but some issues have been ignored. In this thesis, we explore the dynamics of co-occurrence networks and the role of words, and found that they are both clear signatures of authorship. Using optimized descriptors for the network topology and machine learning algorithms, it has been possible to achieve accuracy rates above 85%, with a rate of 98.75% being reached in a particular case, for collections of 80 books produced by 8 English-speaking writers with 10 books per author. It is also shown that there are still many unexplored aspects of co-occurrence networks of texts, which seems promising for near future developments.
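A word co-occurrence network of the kind mentioned above can be sketched in a few lines. The construction below is an assumption chosen for brevity, not the thesis's pipeline: each word is linked to the word that immediately follows it, edges are undirected, and the only topological descriptor computed is node degree. The sample sentence is made up.

```python
import re
from collections import defaultdict


def cooccurrence_network(text):
    """Link each word to its immediate successor (undirected adjacency network)."""
    words = re.findall(r"[a-z']+", text.lower())
    neighbours = defaultdict(set)
    for left, right in zip(words, words[1:]):
        if left != right:  # ignore self-loops from repeated words
            neighbours[left].add(right)
            neighbours[right].add(left)
    return neighbours


def degree_profile(text, top=5):
    """Return the `top` words with the most distinct neighbours, ties broken alphabetically."""
    net = cooccurrence_network(text)
    ranked = sorted(net, key=lambda word: (-len(net[word]), word))
    return [(word, len(net[word])) for word in ranked[:top]]


profile = degree_profile("the cat saw the dog and the dog saw the bird")
```

In an attribution setting, descriptors such as these degrees (or clustering coefficients and path lengths) would be computed per book and passed to a classifier as features.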
Belvisi, Nicole Mariah Sharon. "Document Forensics Through Textual Analysis." Thesis, Högskolan i Halmstad, Akademin för informationsteknologi, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:hh:diva-40157.
Žalkauskaitė, Gintarė. "Idiolekto požymiai elektroniniuose laiškuose." Doctoral thesis, Lithuanian Academic Libraries Network (LABT), 2012. http://vddb.laba.lt/obj/LT-eLABa-0001:E.02~2012~D_20120118_131320-23362.
The current study aims to establish whether an author's idiolect can be recognized in the language of e-mails and to determine the lexical and graphical features that can be linked to idiolect. The data were derived from a corpus of 65,000 words consisting of e-mails written in Lithuanian by six persons. The WordSmith Tools software was used to generate frequency lists for six subcorpora, each representing one person's language. Using the contrastive method, the frequency data for the six persons' language were compared. The lexical and graphical elements that one person used more often or more rarely than the others, and that were not determined by the topic, were linked to the author's idiolect. As a result of the analysis, a classification of lexical and graphical elements that can help in recognizing idiolect is given. The study shows that on the lexical level the main differences between idiolects lie in the usage of words expressing modality and stance, and in the words and abbreviations chosen differently from the possible variants. On the graphical level, idiolects can be recognized from punctuation marks, emoticons and graphic symbols used at different frequencies. Based on the research results, recommendations for authorship attribution examinations are given.
Žalkauskaitė, Gintarė. "Features of Idiolect in E-mails." Doctoral thesis, Lithuanian Academic Libraries Network (LABT), 2012. http://vddb.laba.lt/obj/LT-eLABa-0001:E.02~2012~D_20120118_131329-68478.
This work aimed to determine whether the author's idiolect is revealed in the language of personal e-mails and through which lexical and graphical features it manifests itself. For the study, a corpus of personal, informal e-mails by six authors was collected. The corpus data were processed with the WordSmith Tools program and a contrastive text analysis was carried out: the frequencies of linguistic units in the studied authors' e-mails were compared, and it was established that linguistic units used more or less often by some authors than by others distinguish the authors' idiolects. From the features identified, groups of linguistic expression units associated with idiolect were generalized. It was established that at the lexical level idiolects are most clearly distinguished by words conveying the author's evaluation and stance and expressing modality, as well as by words and abbreviations chosen from among possible lexical competitors. Idiolects are also marked by the use of punctuation and graphic signs that different authors choose with unequal frequency. Based on the results of the study, the dissertation gives recommendations for experts conducting forensic linguistic authorship examinations.
Chen, Beichen. "Stylometric Embeddings for Book Similarities." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-303125.
Stylometry, or stylistic statistics, is a research field concerned with defining features for the quantitative study of stylistic variation among authors. Stylometry has mostly focused on authorship attribution, where the task is to determine who wrote a given text whose author is unknown, given earlier texts by known authors. In this study, a number of lexical and syntactic stylistic features were selected and used to determine authorship. Experimental results are reported for two collections of literary works: a smaller one with 27 books written by 25 authors and a larger one with 11,063 books written by 316 authors. Neural networks were used to encode the selected features as vectors, after which the nearest neighbours of the unknown texts in the vector space were used to determine the authors. For the smaller collection, an accuracy of 91.25% was achieved using the 50 most common function words, syntactic dependency relations, and part-of-speech information. For the larger collection, an accuracy of 69.18% was achieved with similar features. A user test shows that, beyond determining authorship, the model has the potential to represent similarity between authors' styles. This could be applied to recommend books to readers based on style.
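The nearest-neighbour step over function-word features can be sketched as follows. This is a deliberate simplification and not the thesis's model: there is no neural encoder, the vectors are plain relative frequencies, similarity is cosine, and the ten-word list stands in for the 50 most common function words. All names and texts are hypothetical.

```python
import math
import re
from collections import Counter

# Illustrative short list; a real study would use a much longer one.
FUNCTION_WORDS = ["the", "of", "and", "to", "a", "in", "that", "it", "is", "was"]


def style_vector(text):
    """Relative frequency of each function word in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words) or 1
    return [counts[word] / total for word in FUNCTION_WORDS]


def cosine(u, v):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_author(unknown, known):
    """Return the author whose known text is most similar in function-word usage."""
    target = style_vector(unknown)
    return max(known, key=lambda author: cosine(target, style_vector(known[author])))


known = {
    "author_x": "the end of the road is the start of the journey",
    "author_y": "it was a day and it was a night and it was a dream",
}
guess = nearest_author("the edge of the map is the limit of the world", known)
```

Because the same similarity score ranks every known text against the query, the ranking could also be used to suggest stylistically similar books, which is the recommendation use the abstract points to.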
Koh, Kok Chuan. "Modeling Alcohol Consumption Using Blog Data." Thesis, University of North Texas, 2013. https://digital.library.unt.edu/ark:/67531/metadc271843/.
Consalvi, Andrea. "Attribuzione di testi letterari: una proposta metodologica." Doctoral thesis, Università Cattolica del Sacro Cuore, 2022. http://hdl.handle.net/10280/122312.
The dissertation provides some theoretical and practical resources for scientific investigations into the field of authorship attribution studies of literary works. The first chapter provides a glossary of the correct terminology to be used when addressing various degrees of declared and elaborated intertextuality. Chapter Two follows with an excursus from the Hellenistic age to the present day aimed at retracing the main stages of the history of authorship attribution studies to gain greater awareness of how the field has changed, especially in terms of the disciplines involved and potential analyses. The third chapter offers a methodological proposal, with related IT tools, on how to conduct a study of this nature, from the initial reading to the interpretation of data and the conclusions. The present research has made it possible to trace and better define the field of authorship attribution studies, both from a historical and a practical-methodological perspective. Additionally, the contribution of digital humanities, and consequently of its interactions with other disciplines, have proved to be fundamental.
Bertocchini, Pietro. "Il dilemma dell'autenticità del «Clitofonte»: studio del dialogo e ipotesi di attribuzione." Doctoral thesis, Università degli studi di Padova, 2019. http://hdl.handle.net/11577/3424836.
The dissertation approaches the dilemma of «Clitophon»'s authenticity from different perspectives. It offers an introduction, a translation and a thorough and updated analysis of the text and of the many issues that it elicits. Some stylometric tools were deployed along with the traditional research methods. If, as it seems, the author of the dialogue is not Plato, it may have been written by a member of the Academy of his time.
Хомицька, Ірина Юріївна. "Методи та засоби диференціації фоностатистичних структур функціональних стилів англійської мови." Diss., Національний університет «Львівська політехніка», 2021. https://ena.lpnu.ua/handle/ntb/56676.
Viverit, Guido. "Problemi di attribuzione conflittuale nella musica strumentale veneta del Settecento." Doctoral thesis, Università degli studi di Padova, 2015. http://hdl.handle.net/11577/3423996.
The thesis addresses the phenomenon of conflicting attributions, a problem that arises when a composition is attributed to different authors in the sources in which it appears. The aim of the research was to examine the phenomenon in depth in order to understand its causes, taking eighteenth-century Venetian instrumental music as the field of investigation and paying particular attention to both the historical-musicological and the conceptual aspects. To investigate the phenomenon more thoroughly, three case studies were examined, carefully selected as representative of the wide range of cases that the repertoire presents: the Oboe Concerto in D minor attributed to Alessandro and Benedetto Marcello, Antonio Vivaldi and Johann Sebastian Bach; the collection of Trio Sonatas attributed to Domenico Gallo and Giovanni Battista Pergolesi; and the collection of Concerti a cinque op. 1, book three, attributed to Giuseppe Tartini and Gasparo Visconti. The investigation of the individual case studies led to the identification of new witnesses and of new information about the persons involved in the attributions. The detailed reconstruction of the attribution history and the examination of the sources made it possible to advance some hypotheses about the origin of the various attributions considered. More generally, the thesis attempts to investigate in depth all aspects relating to the context in which a work was created and transmitted; the economic interests that surrounded the dissemination of a work; the ways in which musical sources were produced; and the tools available to composers to protect their work and their authorial status. Ultimately, it questions the concept of author and of intellectual property in mid-eighteenth-century instrumental music.
Nobre, Neto Francisco Dantas. "Atribuição automática de autoria de obras da literatura brasileira." Universidade Federal da Paraíba, 2010. http://tede.biblioteca.ufpb.br:8080/handle/tede/6121.
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior
Authorship attribution consists of categorizing an unknown document among a set of previously selected author classes. Knowing the authorship of a text can be useful when plagiarism must be detected in a literary document or when a book's author must be properly credited. The most intuitive way for a human to analyze a text is to select some of its characteristics. The study of selecting attributes in a written document, such as average word length and vocabulary richness, is known as stylometry. For a human analyst, discovering the authorship of an unknown text can take months and is a tiring activity. Some computational tools extract such characteristics from the text, leaving the subjective analysis to the researcher. However, there are computational methods that, in addition to extracting attributes, perform the authorship attribution based on the characteristics gathered from the text. Techniques such as neural networks, decision trees and classification methods have been applied in this context and presented results that make them relevant to the problem. This work presents a data compression method, Prediction by Partial Matching (PPM), as a solution to the authorship attribution problem for Brazilian literary works. The writers and works that compose the author database were selected mainly for their representativeness in the national literature. The availability of the books in electronic format was also considered. PPM performs authorship identification without any subjective interference in the text analysis. Unlike other methods, it also does not make use of attributes present in the text. The correct classification rate obtained with PPM in this work was approximately 93%, while related works report rates between 72% and 89%. Authorship attribution was also performed with an SVM approach.
For that, text attributes were selected in two groups, one based on words and the other on function-word frequencies, obtaining correct classification rates of 36.6% and 88.4%, respectively.
Atribuição de autoria consiste em categorizar um documento desconhecido dentre algumas classes de autores previamente selecionadas. Saber a autoria de um texto pode ser útil quando é necessário detectar plágio em alguma obra literária ou dar os devidos créditos ao autor de um livro. A forma mais intuitiva ao ser humano para se analisar um texto é selecionando algumas características que ele possui. O estudo de selecionar atributos em um documento escrito, como tamanho médio das palavras e riqueza vocabular, é conhecido como estilometria. Para análise humana de um texto desconhecido, descobrir a autoria pode demandar meses, além de se tornar uma tarefa cansativa. Algumas ferramentas computacionais têm a funcionalidade de extrair tais características do texto, deixando a análise subjetiva para o pesquisador. No entanto, existem métodos computacionais que, além de extrair atributos, atribuem a autoria baseado nas características colhidas ao longo do texto. Técnicas como redes neurais, árvores de decisão e métodos de classificação já foram aplicados neste contexto e apresentaram resultados que os tornam relevantes para tal questão. Este trabalho apresenta um método de compressão de dados, o Prediction by Partial Matching (PPM), para solução do problema de atribuição de autoria de obras da literatura brasileira. Os escritores e obras selecionados para compor o banco de autores se deram, principalmente, pela representatividade que possuem na literatura nacional. Além disso, a disponibilidade dos livros em formato eletrônico também foi considerada. O PPM realiza a identificação de autoria sem ter qualquer interferência subjetiva na análise do texto. Este método, também, não faz uso de atributos presentes ao longo do texto, diferentemente de outros métodos. A taxa de classificação correta alcançada com o PPM, neste trabalho, foi de aproximadamente 93%, enquanto que trabalhos relacionados mostram uma taxa de acerto entre 72% e 89%. 
Neste trabalho, também foi realizado atribuição de autoria com a abordagem SVM. Para isso, foram selecionados atributos no texto dividido em dois tipos, sendo um baseado em palavras e o outro na contagem de palavrasfunção, obtendo uma taxa de acerto de 36,6% e 88,4%, respectivamente.
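The compression-based idea in this abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: PPM is not part of the Python standard library, so zlib's DEFLATE compressor is swapped in here as a stand-in, and the function names are hypothetical. The sketch assumes that an unknown text compresses better when appended to writing by the same author, so it picks the candidate whose corpus needs the fewest extra compressed bytes.

```python
import zlib


def compressed_size(text: str) -> int:
    """Size in bytes of the text after zlib compression (stand-in for PPM)."""
    return len(zlib.compress(text.encode("utf-8"), 9))


def attribute(unknown: str, corpora: dict[str, str]) -> str:
    """Return the candidate author whose known corpus best 'predicts' the unknown text.

    For each author, measure how many extra compressed bytes the unknown
    text costs when appended to that author's known writing; the smallest
    cost wins. No stylometric attributes are extracted by hand.
    """
    def extra_bytes(author: str) -> int:
        known = corpora[author]
        return compressed_size(known + " " + unknown) - compressed_size(known)

    return min(corpora, key=extra_bytes)
```

To use it, `attribute` would be called with the disputed text and a dictionary mapping each candidate author to a concatenation of that author's known works. A real experiment would need a stronger compressor and much longer texts than a toy example.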
Queralt, Sheila 1987. "Estudio piloto para la evaluación de evidencias lingüísticas en la comparación forense de textos mediante distribuciones poblacionales y relaciones de verosimilitudes." Doctoral thesis, Universitat Pompeu Fabra, 2015. http://hdl.handle.net/10803/318374.
This thesis proposes implementing statistical techniques in the analysis of linguistic variables in order to create a population distribution model that is useful for the forensic comparison of written texts. In a final phase, it aims to apply the theoretical and methodological framework of the likelihood ratio. The objective is to improve results in the task of attributing or determining authorship, so as to advise the various judicial actors more objectively and to protect everyone involved in judicial proceedings from a possible miscarriage of justice.
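The likelihood-ratio framework mentioned in this abstract compares how probable the observed linguistic evidence E is under two hypotheses: LR = p(E | same author) / p(E | different author), where the denominator comes from a population distribution. The following is a rough sketch only. It assumes a single linguistic variable modeled with normal distributions, and the function and parameter names are hypothetical rather than taken from the thesis.

```python
from statistics import NormalDist


def likelihood_ratio(x: float,
                     author_mean: float, author_sd: float,
                     pop_mean: float, pop_sd: float) -> float:
    """Likelihood ratio for one measured linguistic variable x.

    Numerator: density of x under the known author's distribution.
    Denominator: density of x under the reference population distribution.
    Values above 1 support the same-author hypothesis; values below 1
    support the different-author hypothesis.
    """
    numerator = NormalDist(author_mean, author_sd).pdf(x)
    denominator = NormalDist(pop_mean, pop_sd).pdf(x)
    return numerator / denominator
```

The normality assumption is made purely for illustration; the point of estimating population distributions is that the denominator reflects how the variable actually behaves across many writers.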
Jacovino, Julia Maureen. "Authorship Attribution Through Words Surrounding Named Entities." 2013. http://digital.library.duq.edu/u?/etd,162270.
McAnulty College and Graduate School of Liberal Arts; Computational Mathematics; MS; Thesis.
Xie, Cheng-En, and 謝承恩. "Quantitative Authorship Attribution in Early Chinese Buddhist Translations." Thesis, 2012. http://ndltd.ncl.edu.tw/handle/76439061447388436507.
法鼓佛教學院 (Dharma Drum Buddhist College); 佛教學系 (Department of Buddhist Studies); 100
The Taishō edition of the Chinese Buddhist canon (1924-1932) collects ca. 1000 Indian texts that were translated into Chinese between the 2nd and the 11th century CE. Of these, 153 texts are marked as 失譯, indicating that the name(s) of the translator(s) are unknown. For the texts translated between the 2nd and the late 6th century, however, we have to confront the dilemma that many attributions are uncertain, problematic, or simply wrong. Over the years, Buddhist scholars have used traditional text-critical methods to corroborate or dispute traditional attributions. Although these methods can produce high-quality results, they often rely heavily on the intuition of a single scholar honed over many years of research. Information technology offers an alternative line of inquiry that aims to complement rather than supersede more traditional approaches. To this end, we adopt statistical, quantitative methods and artificial intelligence algorithms to analyze ancient Buddhist texts translated into Chinese, in order to discover new evidence bearing on translator attribution. The major advantage of stylometrics and quantitative authorship attribution is the ability to discover hidden patterns that cannot be discerned by traditional approaches. In the past four decades, considerable attention has been paid to quantitative authorship attribution of literature in Western languages; however, there have been only a few attempts focusing on texts written in classical Chinese, and fewer still on the particular form of ‘Indian Buddhist Chinese’ found in early translated texts. In this paper, our main focus is on grammatical particles (xuci 虛詞), which are widely used in classical Chinese to express grammatical relations. After measuring their occurrence in Indian Buddhist Chinese, we use Principal Component Analysis (PCA) to discuss how their use reflects on the authorship of selected sutras. Our analysis explores different scenarios that have to be accounted for, such as a translator changing his style in the course of his career, how to understand commonalities between contemporaneous translations, and how to quantify the difference between different translations of the same sutra. In the latter part of this thesis, we apply our analysis model to three translations of the Gandhavyūha from different dynasties. We conducted a series of experiments with different parameter values. Through these experiments, we demonstrate that T.278 was translated three to four hundred years earlier than T.279 and T.293, and we show which of its features identify it as an earlier text.
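The pipeline described in this abstract (measure particle frequencies, then apply PCA) can be sketched roughly as follows. This is an illustration under stated assumptions, not the thesis's code. The particle list is a hypothetical selection of common classical Chinese particles, the frequencies are normalized per 1,000 characters by assumption, and only the first principal component is computed, using power iteration on the covariance matrix so that the example needs no third-party library.

```python
import math
from collections import Counter

# Hypothetical selection of grammatical particles; the thesis's actual list is not given here.
PARTICLES = ["之", "而", "於", "也", "者"]


def particle_profile(text: str, particles=PARTICLES) -> list[float]:
    """Occurrences of each particle per 1,000 non-whitespace characters."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return [0.0] * len(particles)
    counts = Counter(chars)
    return [1000 * counts[p] / len(chars) for p in particles]


def first_principal_component(rows: list[list[float]], iterations: int = 100):
    """Return (direction, scores) of the first principal component.

    rows holds one particle profile per text and needs at least two rows.
    The direction is found by power iteration on the sample covariance
    matrix; its sign is arbitrary, as with any PCA implementation.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0 / (j + 1) for j in range(d)]  # fixed, non-uniform starting vector
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # all profiles identical: no variance to explain
            break
        v = [x / norm for x in w]
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores
```

Each text would first be turned into a profile with `particle_profile`, and the list of profiles passed to `first_principal_component`. Texts by the same translator would then be expected to receive similar scores, which is the kind of clustering the abstract discusses.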
Hu, Yifei. "A Study of Media Polarization with Authorship Attribution Methods." Thesis, 2020.
Tani Raffaelli, Giulio. "Generative models for inference: an application to authorship attribution." Doctoral thesis, 2022. http://hdl.handle.net/11573/1637529.
King, Edmund (Edmund George Coghill). "In the character of Shakespeare: canon, authorship and attribution in eighteenth-century England." 2008. http://hdl.handle.net/2292/2615.