Academic literature on the topic 'Allocation de Dirichlet'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Allocation de Dirichlet.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Journal articles on the topic "Allocation de Dirichlet"

1

Du, Lan, Wray Buntine, Huidong Jin, and Changyou Chen. "Sequential latent Dirichlet allocation." Knowledge and Information Systems 31, no. 3 (June 10, 2011): 475–503. http://dx.doi.org/10.1007/s10115-011-0425-1.

2

Schwarz, Carlo. "Ldagibbs: A Command for Topic Modeling in Stata Using Latent Dirichlet Allocation." Stata Journal: Promoting communications on statistics and Stata 18, no. 1 (March 2018): 101–17. http://dx.doi.org/10.1177/1536867x1801800107.

Abstract:
In this article, I introduce the ldagibbs command, which implements latent Dirichlet allocation in Stata. Latent Dirichlet allocation is the most popular machine-learning topic model. Topic models automatically cluster text documents into a user-chosen number of topics. Latent Dirichlet allocation represents each document as a probability distribution over topics and represents each topic as a probability distribution over words. Therefore, latent Dirichlet allocation provides a way to analyze the content of large unclassified text data and an alternative to predefined document classifications.
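
The model described above has a compact generative story. In standard textbook notation (a sketch for orientation, not anything specific to the ldagibbs implementation), LDA assumes each topic k and document d are generated as:

    \begin{align*}
    \phi_k   &\sim \mathrm{Dirichlet}(\beta)
      && \text{word distribution of topic } k, \; k = 1, \dots, K \\
    \theta_d &\sim \mathrm{Dirichlet}(\alpha)
      && \text{topic distribution of document } d \\
    z_{dn}   &\sim \mathrm{Multinomial}(\theta_d)
      && \text{topic assignment of the } n\text{th token} \\
    w_{dn}   &\sim \mathrm{Multinomial}(\phi_{z_{dn}})
      && \text{observed word}
    \end{align*}

Gibbs sampling, which the command's name refers to, is one way to infer the hidden document-topic and topic-word distributions from the observed words.
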
3

Yoshida, Takahiro, Ryohei Hisano, and Takaaki Ohnishi. "Gaussian hierarchical latent Dirichlet allocation: Bringing polysemy back." PLOS ONE 18, no. 7 (July 12, 2023): e0288274. http://dx.doi.org/10.1371/journal.pone.0288274.

Abstract:
Topic models are widely used to discover the latent representation of a set of documents. The two canonical models are latent Dirichlet allocation and Gaussian latent Dirichlet allocation, where the former uses multinomial distributions over words and the latter uses multivariate Gaussian distributions over pre-trained word embedding vectors as the latent topic representations. Compared with latent Dirichlet allocation, Gaussian latent Dirichlet allocation is limited in that it does not capture the polysemy of a word such as “bank.” In this paper, we show that Gaussian latent Dirichlet allocation can recover the ability to capture polysemy by introducing a hierarchical structure in the set of topics that the model can use to represent a given document. Our Gaussian hierarchical latent Dirichlet allocation significantly improves polysemy detection compared with Gaussian-based models and provides more parsimonious topic representations than hierarchical latent Dirichlet allocation. Our extensive quantitative experiments show that our model also achieves better topic coherence and held-out document predictive accuracy over a wide range of corpora and word embedding vectors, significantly improving the capture of polysemy compared with GLDA and CGTM. Our model learns the underlying topic distribution and the hierarchical structure among topics simultaneously, which can be further used to understand the correlations among topics. Moreover, the added flexibility of our model does not necessarily increase the time complexity compared with GLDA and CGTM, which makes our model a good competitor to GLDA.
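
The contrast the abstract draws comes down to how a topic emits a token. In the standard formulations (a sketch for orientation, not notation taken from the paper):

    \[
    \text{LDA: } w_{dn} \sim \mathrm{Multinomial}(\phi_{z_{dn}}),
    \qquad
    \text{Gaussian LDA: } v_{dn} \sim \mathcal{N}(\mu_{z_{dn}}, \Sigma_{z_{dn}}),
    \]

where $w_{dn}$ is a discrete word and $v_{dn}$ is its pre-trained embedding vector. Since each word has a single embedding, a polysemous word such as "bank" occupies one fixed point in embedding space, which is the limitation the hierarchical topic structure proposed here works around.
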
4

Archambeau, Cedric, Balaji Lakshminarayanan, and Guillaume Bouchard. "Latent IBP Compound Dirichlet Allocation." IEEE Transactions on Pattern Analysis and Machine Intelligence 37, no. 2 (February 2015): 321–33. http://dx.doi.org/10.1109/tpami.2014.2313122.

5

Pion-Tonachini, Luca, Scott Makeig, and Ken Kreutz-Delgado. "Crowd labeling latent Dirichlet allocation." Knowledge and Information Systems 53, no. 3 (April 19, 2017): 749–65. http://dx.doi.org/10.1007/s10115-017-1053-1.

6

S.S., Ramyadharshni, and P. Pabitha. "Topic Categorization on Social Network Using Latent Dirichlet Allocation." Bonfring International Journal of Software Engineering and Soft Computing 8, no. 2 (April 30, 2018): 16–20. http://dx.doi.org/10.9756/bijsesc.8390.

7

Syed, Shaheen, and Marco Spruit. "Exploring Symmetrical and Asymmetrical Dirichlet Priors for Latent Dirichlet Allocation." International Journal of Semantic Computing 12, no. 03 (September 2018): 399–423. http://dx.doi.org/10.1142/s1793351x18400184.

Abstract:
Latent Dirichlet Allocation (LDA) has gained much attention from researchers and is increasingly being applied to uncover underlying semantic structures from a variety of corpora. However, nearly all researchers use symmetrical Dirichlet priors, often unaware of the practical implications that they bear. This research is the first to explore the effect of symmetrical and asymmetrical Dirichlet priors on topic coherence and human topic ranking when uncovering latent semantic structures from scientific research articles. More specifically, we examine the practical effects of several classes of Dirichlet priors on 2000 LDA models created from abstract and full-text research articles. Our results show that symmetrical or asymmetrical priors on the document–topic distribution or the topic–word distribution for full-text data have little effect on topic coherence scores and human topic ranking. In contrast, asymmetrical priors on the document–topic distribution for abstract data show a significant increase in topic coherence scores and improved human topic ranking compared to a symmetrical prior. Symmetrical or asymmetrical priors on the topic–word distribution show no real benefit for either abstract or full-text data.
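
In common LDA implementations, the prior choice studied here is a one-argument switch. A minimal sketch with gensim (assuming gensim is installed; `texts` is a stand-in for a real tokenized corpus):

    # Compare a symmetric and an asymmetric document-topic prior.
    from gensim import corpora
    from gensim.models import LdaModel

    texts = [["latent", "dirichlet", "allocation"],
             ["topic", "model", "prior"],
             ["dirichlet", "prior", "topic"]]
    dictionary = corpora.Dictionary(texts)
    corpus = [dictionary.doc2bow(t) for t in texts]

    lda_sym = LdaModel(corpus, id2word=dictionary, num_topics=2,
                       alpha="symmetric")    # equal prior weight on every topic
    lda_asym = LdaModel(corpus, id2word=dictionary, num_topics=2,
                        alpha="asymmetric")  # decreasing prior weight over topics
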
8

Li, Gen, and Hazri Jamil. "Teacher professional learning community and interdisciplinary collaborative teaching path under the informationization basic education model." Yugoslav Journal of Operations Research, no. 00 (2024): 29. http://dx.doi.org/10.2298/yjor2403029l.

Abstract:
The construction of a learning community cannot be separated from the participation of information technology. Current teacher learning communities have problems of low interaction efficiency and insufficient enthusiasm for group cooperative teaching. This study adopts the latent Dirichlet allocation method to process the text data generated by teacher interaction, starting from the evolution of knowledge topics in the learning community's network space. At the same time, the interaction data of the network community learning space is used to extract interaction characteristics between teachers, and collaborative teaching groups are formed using the K-means clustering algorithm. The study verifies the management effect of the latent Dirichlet allocation and K-means algorithms in the learning community space through experiments. The experiments showed that the latent Dirichlet allocation algorithm reached its highest F1 value, 0.88, at a K value of 12, and it compared favorably with a collaborative filtering algorithm on the overall F1 value. At the same time, latent Dirichlet allocation produced a total of 4 incorrectly judged samples, for an accuracy of 86.7%, higher than the other algorithm models. The results indicate that the proposed latent Dirichlet allocation combined with the K-means algorithm has superior performance in the management of teacher professional learning communities and can effectively improve the service level of teachers' work.
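
The pipeline the abstract describes, LDA topic mixtures followed by K-means grouping, can be sketched with scikit-learn (hypothetical toy documents; the study's own data and parameter settings are not reproduced here):

    # Documents -> LDA topic mixtures -> K-means groups.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.decomposition import LatentDirichletAllocation
    from sklearn.cluster import KMeans

    docs = ["lesson planning discussion", "collaborative teaching groups",
            "classroom technology tools", "peer feedback on lessons"]
    X = CountVectorizer().fit_transform(docs)

    lda = LatentDirichletAllocation(n_components=3, random_state=0)
    doc_topic = lda.fit_transform(X)     # one topic-mixture row per document

    groups = KMeans(n_clusters=2, n_init=10,
                    random_state=0).fit_predict(doc_topic)
    print(groups)                        # candidate collaborative-teaching groups
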
9

Garg, Mohit, and Priya Rangra. "Bibliometric Analysis of Latent Dirichlet Allocation." DESIDOC Journal of Library & Information Technology 42, no. 2 (February 28, 2022): 105–13. http://dx.doi.org/10.14429/djlit.42.2.17307.

Abstract:
Latent Dirichlet Allocation (LDA) has emerged as an important algorithm in big data analysis that finds groups of topics in text data. It posits that each text document consists of a group of topics, and each topic is a mixture of words related to it. With the emergence of a plethora of text data, LDA has become a popular algorithm for topic modeling among researchers from different domains. Therefore, it is essential to understand the trends of LDA research. Bibliometric techniques are established methods for studying the research progress of a topic. In this study, bibliographic data of 18,715 publications that cited LDA were extracted from the Scopus database. The R software and VOSviewer were used to carry out the analysis. The analysis revealed that research interest in LDA has grown exponentially. The results showed that most authors preferred "Book Series" followed by "Conference Proceedings" as the publication venue. The majority of the institutions and authors were from the USA, followed by China. The co-occurrence analysis of keywords indicated that text mining and machine learning are dominant topics in LDA research, with significant interest in social media. This study attempts to provide a more comprehensive analysis of the intellectual structure of LDA research than previous studies.
10

Chauhan, Uttam, and Apurva Shah. "Topic Modeling Using Latent Dirichlet allocation." ACM Computing Surveys 54, no. 7 (September 30, 2022): 1–35. http://dx.doi.org/10.1145/3462478.

Abstract:
We are not able to deal with a mammoth text corpus without summarizing it into a relatively small subset; a computational tool is sorely needed to understand such a gigantic pool of text. Probabilistic topic modeling discovers and explains an enormous collection of documents by reducing it to a topical subspace. In this work, we study the background and advancement of topic modeling techniques. We first introduce the preliminaries of topic modeling and review its extensions and variations, such as topic modeling over various domains, hierarchical topic modeling, word-embedded topic models, and topic models from multilingual perspectives. In addition, research on topic modeling in distributed environments and topic visualization approaches is explored. We also cover implementation and evaluation techniques for topic models in brief. Comparison matrices are shown over the experimental results of the various categories of topic modeling, and diverse technical challenges and future directions are discussed.

Dissertations / Theses on the topic "Allocation de Dirichlet"

1

Ponweiser, Martin. "Latent Dirichlet Allocation in R." WU Vienna University of Economics and Business, 2012. http://epub.wu.ac.at/3558/1/main.pdf.

Abstract:
Topic models are a new research field within the computer science areas of information retrieval and text mining. They are generative probabilistic models of text corpora inferred by machine learning, and they can be used for retrieval and text mining tasks. The most prominent topic model is latent Dirichlet allocation (LDA), which was introduced in 2003 by Blei et al. and has since sparked the development of other topic models for domain-specific purposes. This thesis focuses on LDA's practical application. Its main goal is the replication of the data analyses from the 2004 LDA paper "Finding scientific topics" by Thomas Griffiths and Mark Steyvers within the framework of the R statistical programming language and the R package topicmodels by Bettina Grün and Kurt Hornik. The complete process, including extraction of a text corpus from the PNAS journal's website, data preprocessing, transformation into a document-term matrix, model selection, model estimation, and presentation of the results, is fully documented and commented. The outcome closely matches the analyses of the original paper, so the research by Griffiths and Steyvers can be reproduced. Furthermore, this thesis demonstrates the suitability of the R environment for text mining with LDA. (author's abstract)
Series: Theses / Institute for Statistics and Mathematics
2

Arnekvist, Isac, and Ludvig Ericson. "Finding competitors using Latent Dirichlet Allocation." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-186386.

Abstract:
Identifying business competitors is of interest to many, but is becoming increasingly hard in an expanding global market. The aim of this report is to investigate whether Latent Dirichlet Allocation (LDA) can be used to identify and rank competitors based on distances between LDA representations of company descriptions. The performance of the LDA model was compared to that of bag-of-words and random ordering by evaluating and comparing them on a handful of common information retrieval metrics. Several distance metrics were evaluated to determine which had the best correspondence between representation distance and companies being competitors; cosine similarity was found to outperform the others. While both LDA and bag-of-words representations were significantly better than random ordering, LDA performed worse than bag-of-words. However, computation of distance metrics was considerably faster for LDA representations. The LDA representations capture features that are not helpful for identifying competitors, and it is suggested that they could be used together with some other data source or heuristic.
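
The core ranking step, cosine similarity between LDA representations of company descriptions, looks roughly like this with scikit-learn (illustrative data, not the thesis corpus):

    # Rank candidate competitors of company 0 by cosine similarity of LDA vectors.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.decomposition import LatentDirichletAllocation
    from sklearn.metrics.pairwise import cosine_similarity

    descriptions = ["cloud storage for enterprises",
                    "enterprise file hosting platform",
                    "organic coffee roastery"]
    X = CountVectorizer().fit_transform(descriptions)
    doc_topic = LatentDirichletAllocation(
        n_components=2, random_state=0).fit_transform(X)

    sims = cosine_similarity(doc_topic[:1], doc_topic[1:])[0]
    print(sims.argsort()[::-1])   # candidates ordered from closest to farthest
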
3

Choubey, Rahul. "Tag recommendation using Latent Dirichlet Allocation." Thesis, Kansas State University, 2011. http://hdl.handle.net/2097/9785.

Abstract:
Master of Science
Department of Computing and Information Sciences
Doina Caragea
The vast amount of data present on the internet calls for ways to label and organize this data according to specific categories, in order to facilitate search and browsing activities. This can be easily accomplished by making use of folksonomies and user-provided tags. However, it can be difficult for users to provide meaningful tags. Tag recommendation systems can guide users towards informative tags for online resources such as websites, pictures, etc. The aim of this thesis is to build a system for recommending tags to URLs available through a bookmark sharing service called BibSonomy. We assume that the URLs for which we recommend tags do not have any prior tags assigned to them. Two approaches are proposed to address the tagging problem, both based on Latent Dirichlet Allocation (LDA) (Blei et al. [2003]). LDA is a generative, probabilistic topic model which aims to infer the hidden topical structure in a collection of documents. According to LDA, documents can be seen as mixtures of topics, while topics can be seen as mixtures of words (in our case, tags). The first approach, called the topic words based approach, recommends the top words in the top topics representing a resource as tags for that resource. The second approach, called the topic distance based approach, uses the tags of the most similar training resources (identified using the KL-divergence; Kullback and Leibler [1951]) to recommend tags for an untagged test resource. The dataset used in this work was made available through the ECML/PKDD Discovery Challenge 2009. We construct the documents that are provided as input to LDA in two ways, thus producing two different datasets. In the first dataset, we use only the description and the tags (when available) corresponding to a URL. In the second dataset, we crawl the URL content and use it to construct the document. Experimental results show that the LDA approach is not very effective at recommending tags for new untagged resources. However, using the resource content gives better results than using the description only. Furthermore, the topic distance based approach is better than the topic words based approach when only the descriptions are used to construct documents, while the topic words based approach works better when the contents are used to construct documents.
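
The topic distance based approach hinges on KL-divergence between topic distributions. A minimal sketch (made-up topic mixtures; SciPy's entropy computes the KL divergence when given two distributions):

    # Find the nearest tagged training resource for an untagged test resource.
    import numpy as np
    from scipy.stats import entropy   # entropy(p, q) = KL divergence D(p || q)

    train_topics = np.array([[0.7, 0.2, 0.1],    # topic mixtures of tagged
                             [0.1, 0.8, 0.1]])   # training resources
    test_topics = np.array([0.6, 0.3, 0.1])      # new, untagged resource

    kl = [entropy(test_topics, q) for q in train_topics]
    nearest = int(np.argmin(kl))
    print(nearest)   # recommend the tags of this training resource
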
4

Risch, Johan. "Detecting Twitter topics using Latent Dirichlet Allocation." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-277260.

Abstract:
Latent Dirichlet Allocation is evaluated for its suitability for detecting topics in a stream of short messages limited to 140 characters. This is done by assessing its ability to model the incoming messages and to classify previously unseen messages with known topics. The evaluation shows that the model can be suitable for certain topic-detection applications when the stream size is small enough. Furthermore, suggestions on how to handle larger streams are outlined.
5

Liu, Zelong. "High performance latent dirichlet allocation for text mining." Thesis, Brunel University, 2013. http://bura.brunel.ac.uk/handle/2438/7726.

Abstract:
Latent Dirichlet Allocation (LDA), a generative probabilistic model, is a three-tier Bayesian model. LDA computes the latent topic structure of the data and obtains the significant information of documents. However, traditional LDA has several limitations in practical applications. LDA cannot be used directly for classification because it is an unsupervised learning model; it needs to be embedded into appropriate classification algorithms. As a generative model, LDA may generate latent topics in categories to which the target documents do not belong, producing deviations in computation and reducing classification accuracy. The number of topics in LDA greatly influences the learning of model parameters. Noise samples in the training data also affect the final text classification result, and the quality of LDA-based classifiers depends to a great extent on the quality of the training samples. Although parallel LDA algorithms have been proposed to deal with huge amounts of data, balancing computing loads in a computer cluster poses another challenge. This thesis presents a text classification method which combines the LDA model and the Support Vector Machine (SVM) classification algorithm for improved classification accuracy while reducing the dimension of datasets. Based on Density-Based Spatial Clustering of Applications with Noise (DBSCAN), the algorithm automatically optimizes the number of topics to be selected, which reduces the number of iterations in computation. Furthermore, this thesis presents a noise data reduction scheme to process noisy data; even when the noise ratio is large in the training data set, the scheme can still produce a high level of classification accuracy. Finally, the thesis parallelizes LDA using the MapReduce model, the de facto computing standard for supporting data-intensive applications. A genetic algorithm based load balancing algorithm is designed to balance the workloads among computers in a heterogeneous MapReduce cluster where the computers have a variety of computing resources in terms of CPU speed, memory space and hard disk space.
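
The combination at the heart of the thesis, LDA for dimension reduction feeding an SVM classifier, can be sketched as a scikit-learn pipeline (toy documents and labels; the DBSCAN-based topic selection and MapReduce parallelization are not shown):

    # LDA reduces documents to topic mixtures; the SVM classifies the mixtures.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.decomposition import LatentDirichletAllocation
    from sklearn.pipeline import Pipeline
    from sklearn.svm import SVC

    docs = ["stock market rally", "team wins the final",
            "election results announced", "match highlights tonight"]
    labels = ["finance", "sports", "politics", "sports"]

    clf = Pipeline([
        ("bow", CountVectorizer()),
        ("lda", LatentDirichletAllocation(n_components=2, random_state=0)),
        ("svm", SVC(kernel="linear")),
    ])
    clf.fit(docs, labels)
    print(clf.predict(["cup final tonight"]))
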
6

Kulhanek, Raymond Daniel. "A Latent Dirichlet Allocation/N-gram Composite Language Model." Wright State University / OhioLINK, 2013. http://rave.ohiolink.edu/etdc/view?acc_num=wright1379520876.

7

Anaya, Leticia H. "Comparing Latent Dirichlet Allocation and Latent Semantic Analysis as Classifiers." Thesis, University of North Texas, 2011. https://digital.library.unt.edu/ark:/67531/metadc103284/.

Abstract:
In the Information Age, a proliferation of unstructured electronic text documents exists. Processing these documents by humans is a daunting task, as humans have limited cognitive abilities for processing large volumes of documents that can often be extremely lengthy. To address this problem, text data computer algorithms are being developed. Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (LDA) are two such algorithms that have received much attention individually in the text data literature for topic extraction studies, but not for document classification or for comparison studies. Since classification is considered an important human function and has been studied in the areas of cognitive science and information science, this dissertation presents a research study comparing LDA, LSA and humans as document classifiers. The research questions posed in this study are: R1: How accurate are LDA and LSA in classifying documents in a corpus of textual data over a known set of topics? R2: How accurate are humans in performing the same classification task? R3: How does LDA classification performance compare to LSA classification performance? To address these questions, a classification study involving human subjects was designed in which humans were asked to generate and classify documents (customer comments) at two levels of abstraction for a quality assurance setting. Then the two computer algorithms, LSA and LDA, were used to classify these documents. The results indicate that humans outperformed both computer algorithms, with an accuracy rate of 94% at the higher level of abstraction and 76% at the lower level. At the higher level of abstraction, the accuracy rates were 84% for both LSA and LDA; at the lower level, the accuracy rates were 67% for LSA and 64% for LDA. The findings of this research have strong implications for the improvement of information systems that process unstructured text. Document classifiers have many potential applications in fields such as fraud detection, information retrieval, national security, and customer management. Development and refinement of algorithms that classify text is a fruitful area of ongoing research, and this dissertation contributes to this area.
8

Jaradat, Shatha. "OLLDA: Dynamic and Scalable Topic Modelling for Twitter : AN ONLINE SUPERVISED LATENT DIRICHLET ALLOCATION ALGORITHM." Thesis, KTH, Skolan för informations- och kommunikationsteknik (ICT), 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-177535.

Abstract:
Providing high-quality topic inference in today's large and dynamic corpora, such as Twitter, is a challenging task. This is especially so considering that the content in this environment consists of short texts and many abbreviations. This project proposes an improvement of a popular online topic modelling algorithm for Latent Dirichlet Allocation (LDA), incorporating supervision to make it suitable for the Twitter context. This improvement is motivated by the need for a single algorithm that achieves both objectives: analyzing huge amounts of documents, including new documents arriving in a stream, while at the same time achieving high-quality topic detection in special-case environments such as Twitter. The proposed algorithm is a combination of an online algorithm for LDA and a supervised variant of LDA, labeled LDA. The performance and quality of the proposed algorithm are compared with those of the two base algorithms. The results demonstrate that the proposed algorithm shows better performance and quality when compared to the supervised variant of LDA, and it achieves better results in terms of quality in comparison to the online algorithm. These improvements make our algorithm an attractive option when applied to dynamic environments like Twitter. An environment for analyzing and labelling data was designed to prepare the dataset before executing the experiments. Possible application areas for the proposed algorithm are tweet recommendation and trend detection.
9

Yalamanchili, Hima Bindu. "A Novel Approach For Cancer Characterization Using Latent Dirichlet Allocation and Disease-Specific Genomic Analysis." Wright State University / OhioLINK, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=wright1527600876174758.

10

Sheikha, Hassan. "Text mining Twitter social media for Covid-19 : Comparing latent semantic analysis and latent Dirichlet allocation." Thesis, Högskolan i Gävle, Avdelningen för datavetenskap och samhällsbyggnad, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:hig:diva-32567.

Abstract:
In this thesis, Twitter social media data is mined for information about the Covid-19 outbreak during the month of March, starting from the 3rd and ending on the 31st. 100,000 tweets were collected from Harvard's open-source data and recreated using Hydrate. This data is analyzed further using different Natural Language Processing (NLP) methodologies, such as term frequency-inverse document frequency (TF-IDF), lemmatizing, tokenizing, Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (LDA). Furthermore, the results of the LSA and LDA algorithms are dimensionally reduced data that are clustered using the clustering algorithms HDBSCAN and K-Means for later comparison. Different methodologies are used to determine the optimal parameters for the algorithms. This is all done in the Python programming language, as there are libraries supporting this research, the most important being scikit-learn. The frequent words of each cluster are then displayed and compared with factual data regarding the outbreak to discover whether there are any correlations. The factual data is collected by the World Health Organization (WHO) and visualized in graphs on ourworldindata.org. Correlations with the results are also looked for in news articles to find any significant moments and to see whether they affected the top words in the clustered data. The news articles with good timelines used for correlating incidents are those of NBC News and the New York Times. The results show no direct correlations with the data reported by WHO; however, looking into the timelines reported by news sources, some correlation can be seen with the clustered data. Also, the combination of LDA and HDBSCAN yielded the most desirable results in comparison to the other combinations of dimension reduction and clustering. This was largely due to the use of GridSearchCV on LDA to determine the ideal parameters for the LDA models on each dataset, as well as how well HDBSCAN clusters its data in comparison to K-Means.
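
The model-selection step mentioned above, GridSearchCV over LDA parameters, follows a standard scikit-learn pattern (a sketch with a toy corpus; scoring falls back on LatentDirichletAllocation's built-in approximate log-likelihood):

    # Choose the number of topics by cross-validated grid search.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.decomposition import LatentDirichletAllocation
    from sklearn.model_selection import GridSearchCV

    docs = ["covid outbreak in march", "lockdown news update",
            "vaccine trial results", "cases rise during march"]
    X = CountVectorizer().fit_transform(docs)

    search = GridSearchCV(LatentDirichletAllocation(random_state=0),
                          {"n_components": [2, 3, 4]}, cv=2)
    search.fit(X)
    print(search.best_params_)
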

Books on the topic "Allocation de Dirichlet"

1

Shi, Feng. Learn About Latent Dirichlet Allocation in R With Data From the News Articles Dataset (2016). London: SAGE Publications, Ltd., 2019. http://dx.doi.org/10.4135/9781526495693.

2

Shi, Feng. Learn About Latent Dirichlet Allocation in Python With Data From the News Articles Dataset (2016). London: SAGE Publications, Ltd., 2019. http://dx.doi.org/10.4135/9781526497727.

3

Augmenting Latent Dirichlet Allocation and Rank Threshold Detection with Ontologies. CreateSpace Independent Publishing Platform, 2014.

4

Jockers, Matthew L. Theme. University of Illinois Press, 2017. http://dx.doi.org/10.5406/illinois/9780252037528.003.0008.

Abstract:
This chapter demonstrates how big data and computation can be used to identify and track recurrent themes as the products of external influence. It first considers the limitations of the Google Ngram Viewer as a tool for tracing thematic trends over time before turning to Douglas Biber's Corpus Linguistics: Investigating Language Structure and Use, a primer on various factors complicating word-focused text analysis and the conclusions one might draw regarding word meanings. It then discusses the results of the author's application of latent Dirichlet allocation (LDA) to a corpus of 3,346 nineteenth-century novels using the open-source MALLET (MAchine Learning for LanguagE Toolkit), a software package for topic modeling. It also explains the different types of analysis performed by the author, including text segmentation, word chunking, and analyses relating themes to author nationality, author gender, and time. The thematic data from the LDA model reveal the degree to which author nationality, author gender, and date of publication can be predicted from the thematic signals expressed in the nineteenth-century novels corpus.

Book chapters on the topic "Allocation de Dirichlet"

1

Li, Hang. "Latent Dirichlet Allocation." In Machine Learning Methods, 439–71. Singapore: Springer Nature Singapore, 2023. http://dx.doi.org/10.1007/978-981-99-3917-6_20.

2

Tang, Yi-Kun, Xian-Ling Mao, and Heyan Huang. "Labeled Phrase Latent Dirichlet Allocation." In Web Information Systems Engineering – WISE 2016, 525–36. Cham: Springer International Publishing, 2016. http://dx.doi.org/10.1007/978-3-319-48740-3_39.

3

Moon, Gordon E., Israt Nisa, Aravind Sukumaran-Rajam, Bortik Bandyopadhyay, Srinivasan Parthasarathy, and P. Sadayappan. "Parallel Latent Dirichlet Allocation on GPUs." In Lecture Notes in Computer Science, 259–72. Cham: Springer International Publishing, 2018. http://dx.doi.org/10.1007/978-3-319-93701-4_20.

4

Calvo, Hiram, Ángel Hernández-Castañeda, and Jorge García-Flores. "Author Identification Using Latent Dirichlet Allocation." In Computational Linguistics and Intelligent Text Processing, 303–12. Cham: Springer International Publishing, 2018. http://dx.doi.org/10.1007/978-3-319-77116-8_22.

5

Hao, Jing, and Hongxi Wei. "Latent Dirichlet Allocation Based Image Retrieval." In Lecture Notes in Computer Science, 211–21. Cham: Springer International Publishing, 2017. http://dx.doi.org/10.1007/978-3-319-68699-8_17.

6

Maanicshah, Kamal, Manar Amayri, and Nizar Bouguila. "Interactive Generalized Dirichlet Mixture Allocation Model." In Lecture Notes in Computer Science, 33–42. Cham: Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-031-23028-8_4.

7

Wheeler, Jordan M., Shiyu Wang, and Allan S. Cohen. "Latent Dirichlet Allocation of Constructed Responses." In The Routledge International Handbook of Automated Essay Evaluation, 535–55. New York: Routledge, 2024. http://dx.doi.org/10.4324/9781003397618-31.

8

Rus, Vasile, Nobal Niraula, and Rajendra Banjade. "Similarity Measures Based on Latent Dirichlet Allocation." In Computational Linguistics and Intelligent Text Processing, 459–70. Berlin, Heidelberg: Springer Berlin Heidelberg, 2013. http://dx.doi.org/10.1007/978-3-642-37247-6_37.

9

Bíró, István, and Jácint Szabó. "Latent Dirichlet Allocation for Automatic Document Categorization." In Machine Learning and Knowledge Discovery in Databases, 430–41. Berlin, Heidelberg: Springer Berlin Heidelberg, 2009. http://dx.doi.org/10.1007/978-3-642-04174-7_28.

10

Lovato, Pietro, Manuele Bicego, Vittorio Murino, and Alessandro Perina. "Robust Initialization for Learning Latent Dirichlet Allocation." In Similarity-Based Pattern Recognition, 117–32. Cham: Springer International Publishing, 2015. http://dx.doi.org/10.1007/978-3-319-24261-3_10.


Conference papers on the topic "Allocation de Dirichlet"

1

Tahsin, Faiza, Hafsa Ennajari, and Nizar Bouguila. "Author Dirichlet Multinomial Allocation Model with Generalized Distribution (ADMAGD)." In 2024 International Symposium on Networks, Computers and Communications (ISNCC), 1–7. IEEE, 2024. http://dx.doi.org/10.1109/isncc62547.2024.10758998.

2

Koltcov, Sergei, Olessia Koltsova, and Sergey Nikolenko. "Latent dirichlet allocation." In the 2014 ACM conference. New York, New York, USA: ACM Press, 2014. http://dx.doi.org/10.1145/2615569.2615680.

3

Chien, Jen-Tzung, Chao-Hsi Lee, and Zheng-Hua Tan. "Dirichlet mixture allocation." In 2016 IEEE 26th International Workshop on Machine Learning for Signal Processing (MLSP). IEEE, 2016. http://dx.doi.org/10.1109/mlsp.2016.7738866.

4

Shen, Zhi-Yong, Jun Sun, and Yi-Dong Shen. "Collective Latent Dirichlet Allocation." In 2008 Eighth IEEE International Conference on Data Mining (ICDM). IEEE, 2008. http://dx.doi.org/10.1109/icdm.2008.75.

5

Li, Shuangyin, Guan Huang, Ruiyang Tan, and Rong Pan. "Tag-Weighted Dirichlet Allocation." In 2013 IEEE International Conference on Data Mining (ICDM). IEEE, 2013. http://dx.doi.org/10.1109/icdm.2013.11.

6

Hsin, Wei-Cheng, and Jen-Wei Huang. "Multi-dependent Latent Dirichlet Allocation." In 2017 Conference on Technologies and Applications of Artificial Intelligence (TAAI). IEEE, 2017. http://dx.doi.org/10.1109/taai.2017.51.

7

Krestel, Ralf, Peter Fankhauser, and Wolfgang Nejdl. "Latent dirichlet allocation for tag recommendation." In the third ACM conference. New York, New York, USA: ACM Press, 2009. http://dx.doi.org/10.1145/1639714.1639726.

8

Tan, Yimin, and Zhijian Ou. "Topic-weak-correlated Latent Dirichlet allocation." In 2010 7th International Symposium on Chinese Spoken Language Processing (ISCSLP). IEEE, 2010. http://dx.doi.org/10.1109/iscslp.2010.5684906.

9

Xiang, Yingzhuo, Dongmei Yang, and Jikun Yan. "The Auto Annotation Latent Dirichlet Allocation." In First International Conference on Information Sciences, Machinery, Materials and Energy. Paris, France: Atlantis Press, 2015. http://dx.doi.org/10.2991/icismme-15.2015.387.

10

Bhutada, Sunil, V. V. S. S. S. Balaram, and Vishnu Vardhan Bulusu. "Latent Dirichlet Allocation based multilevel classification." In 2014 International Conference on Control, Instrumentation, Communication and Computational Technologies (ICCICCT). IEEE, 2014. http://dx.doi.org/10.1109/iccicct.2014.6993109.


Reports on the topic "Allocation de Dirichlet"

1

Teh, Yee W., David Newman, and Max Welling. A Collapsed Variational Bayesian Inference Algorithm for Latent Dirichlet Allocation. Fort Belvoir, VA: Defense Technical Information Center, September 2007. http://dx.doi.org/10.21236/ada629956.

2

Antón Sarabia, Arturo, Santiago Bazdresch, and Alejandra Lelo-de-Larrea. The Influence of Central Bank's Projections and Economic Narrative on Professional Forecasters' Expectations: Evidence from Mexico. Banco de México, December 2023. http://dx.doi.org/10.36095/banxico/di.2023.21.

Abstract:
This paper evaluates the influence of the central bank's projections and the narrative signals provided in the summaries of its Inflation Report on professional forecasters' expectations for inflation and GDP growth in the case of Mexico. We use the Latent Dirichlet Allocation model, a text-mining technique, to identify narrative signals. We show that both quantitative and qualitative information have an influence on inflation and GDP growth expectations. We also find that narrative signals related to monetary policy, observed inflation, aggregate demand, and inflation and employment projections stand out as the most relevant in accounting for changes in analysts' expectations. Even if the period of the COVID-19 pandemic is excluded, we still find that forecasters consider both types of information in forming their inflation expectations.
3

Moreno Pérez, Carlos, and Marco Minozzo. “Making Text Talk”: The Minutes of the Central Bank of Brazil and the Real Economy. Madrid: Banco de España, November 2022. http://dx.doi.org/10.53479/23646.

Abstract:
This paper investigates the relationship between the views expressed in the minutes of the meetings of the Central Bank of Brazil’s Monetary Policy Committee (COPOM) and the real economy. It applies various computational linguistic machine learning algorithms to construct measures of the minutes of the COPOM. First, we create measures of the content of the paragraphs of the minutes using Latent Dirichlet Allocation (LDA). Second, we build an uncertainty index for the minutes using Word Embedding and K-Means. Then, we combine these indices to create two topic-uncertainty indices. The first one is constructed from paragraphs with a higher probability of topics related to “general economic conditions”. The second topic-uncertainty index is constructed from paragraphs that have a higher probability of topics related to “inflation” and the “monetary policy discussion”. Finally, we employ a structural VAR model to explore the lasting effects of these uncertainty indices on certain Brazilian macroeconomic variables. Our results show that greater uncertainty leads to a decline in inflation, the exchange rate, industrial production and retail trade in the period from January 2000 to July 2019.
4

Alonso-Robisco, Andrés, and José Manuel Carbó. Machine Learning methods in climate finance: a systematic review. Madrid: Banco de España, February 2023. http://dx.doi.org/10.53479/29594.

Abstract:
Preventing the materialization of climate change is one of the main challenges of our time. The involvement of the financial sector is a fundamental pillar in this task, which has led to the emergence of a new field in the literature, climate finance. In turn, the use of machine learning (ML) as a tool to analyze climate finance is on the rise, owing to the need to use big data to collect new climate-related information and to model complex non-linear relationships. Considering the proliferation of articles in this field, and the potential of ML, we propose a review of the academic literature to assess how ML is enabling climate finance to scale up. The main contribution of this paper is to provide a structure of application domains in a highly fragmented research field, aiming to spur further innovative work from ML experts. To pursue this objective, we first perform a systematic search of three scientific databases to assemble a corpus of relevant studies. Using topic modeling (Latent Dirichlet Allocation), we uncover representative thematic clusters. This allows us to statistically identify seven granular areas where ML is playing a significant role in the climate finance literature: natural hazards, biodiversity, agricultural risk, carbon markets, energy economics, ESG factors and investing, and climate data. Second, we analyze publication trends; and third, we present a breakdown of the ML methods applied in each research area.