
Journal articles on the topic 'Dirichlet modeling'


Consult the top 50 journal articles for your research on the topic 'Dirichlet modeling.'


1

Makgai, Seitebaleng, Andriette Bekker, and Mohammad Arashi. "Compositional Data Modeling through Dirichlet Innovations." Mathematics 9, no. 19 (October 3, 2021): 2477. http://dx.doi.org/10.3390/math9192477.

Abstract:
The Dirichlet distribution is a well-known candidate for modeling compositional data sets. However, in the presence of outliers, the Dirichlet distribution fails to model such data sets, making model extensions necessary. In this paper, the Kummer–Dirichlet distribution and the gamma distribution are coupled using the beta-generating technique. This development results in the proposal of the Kummer–Dirichlet gamma distribution, which offers greater flexibility in modeling compositional data sets. Some general properties, such as the probability density functions and the moments, are presented for this new candidate. The method of maximum likelihood is applied to estimate the parameters. The usefulness of this model is demonstrated through application to synthetic and real data sets in which outliers are present.
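As background for this and several later entries, the Dirichlet log-density that such extensions generalize can be sketched in a few lines of pure Python (the helper name is illustrative, not from the paper):

```python
import math

def dirichlet_logpdf(x, alpha):
    """Log-density of a composition x (non-negative, summing to 1) under
    Dirichlet(alpha). Compositions near the simplex boundary receive very
    low density when all alpha_i > 1 -- which is why outliers strain the model."""
    if abs(sum(x) - 1.0) > 1e-9:
        raise ValueError("x must lie on the probability simplex")
    log_norm = math.lgamma(sum(alpha)) - sum(math.lgamma(a) for a in alpha)
    return log_norm + sum((a - 1.0) * math.log(xi) for a, xi in zip(alpha, x))

# Under the uniform Dirichlet(1, 1, 1) the density is constant (= 2) on the 2-simplex.
print(dirichlet_logpdf([0.2, 0.3, 0.5], [1.0, 1.0, 1.0]))  # log(2) ≈ 0.6931
```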
2

Chauhan, Uttam, and Apurva Shah. "Topic Modeling Using Latent Dirichlet Allocation." ACM Computing Surveys 54, no. 7 (September 30, 2022): 1–35. http://dx.doi.org/10.1145/3462478.

Abstract:
A mammoth text corpus cannot be digested without summarizing it into a relatively small subset, so computational tools are needed to understand such a gigantic pool of text. Probabilistic topic modeling discovers and explains an enormous collection of documents by reducing it to a topical subspace. In this work, we study the background and advancement of topic modeling techniques. We first introduce the preliminaries of topic modeling and review its extensions and variations, such as topic modeling over various domains, hierarchical topic modeling, word-embedded topic models, and topic models from multilingual perspectives. Research on topic modeling in distributed environments and on topic visualization approaches is also explored. We further cover implementation and evaluation techniques for topic models in brief. Comparison matrices are shown over the experimental results of the various categories of topic modeling, and diverse technical challenges and future directions are discussed.
3

Navarro, Daniel J., Thomas L. Griffiths, Mark Steyvers, and Michael D. Lee. "Modeling individual differences using Dirichlet processes." Journal of Mathematical Psychology 50, no. 2 (April 2006): 101–22. http://dx.doi.org/10.1016/j.jmp.2005.11.006.

4

Lingwall, Jeff W., William F. Christensen, and C. Shane Reese. "Dirichlet based Bayesian multivariate receptor modeling." Environmetrics 19, no. 6 (September 2008): 618–29. http://dx.doi.org/10.1002/env.902.

5

Schwarz, Carlo. "Ldagibbs: A Command for Topic Modeling in Stata Using Latent Dirichlet Allocation." Stata Journal: Promoting communications on statistics and Stata 18, no. 1 (March 2018): 101–17. http://dx.doi.org/10.1177/1536867x1801800107.

Abstract:
In this article, I introduce the ldagibbs command, which implements latent Dirichlet allocation in Stata. Latent Dirichlet allocation is the most popular machine-learning topic model. Topic models automatically cluster text documents into a user-chosen number of topics. Latent Dirichlet allocation represents each document as a probability distribution over topics and represents each topic as a probability distribution over words. Therefore, latent Dirichlet allocation provides a way to analyze the content of large unclassified text data and an alternative to predefined document classifications.
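The two distributions this abstract describes — each document as a distribution over topics, each topic as a distribution over words — are exactly LDA's generative story, which can be sketched in pure Python (the topic tables and names below are invented for illustration):

```python
import random

rng = random.Random(42)

def sample_dirichlet(alpha):
    """Draw from Dirichlet(alpha) by normalizing independent Gamma variates."""
    g = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(g)
    return [x / total for x in g]

def generate_document(alpha, topics, n_words):
    """LDA's generative process: theta ~ Dirichlet(alpha); for each word,
    draw a topic z ~ Categorical(theta), then a word w ~ Categorical(topics[z])."""
    theta = sample_dirichlet(alpha)
    words = []
    for _ in range(n_words):
        z = rng.choices(range(len(topics)), weights=theta)[0]
        vocab, probs = zip(*topics[z].items())
        words.append(rng.choices(vocab, weights=probs)[0])
    return theta, words

# Two hypothetical topics, each a probability distribution over words.
topics = [{"gene": 0.6, "cell": 0.4}, {"stock": 0.7, "market": 0.3}]
theta, doc = generate_document([0.5, 0.5], topics, 10)
```

Normalizing Gamma variates is the standard way to sample a Dirichlet; fitting LDA inverts this story to recover theta and the topic tables from observed words.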
6

Şahin, Büşra, Atıf Evren, Elif Tuna, Zehra Zeynep Şahinbaşoğlu, and Erhan Ustaoğlu. "Parameter Estimation of the Dirichlet Distribution Based on Entropy." Axioms 12, no. 10 (October 5, 2023): 947. http://dx.doi.org/10.3390/axioms12100947.

Abstract:
The Dirichlet distribution as a multivariate generalization of the beta distribution is especially important for modeling categorical distributions. Hence, its applications vary within a wide range from modeling cell probabilities of contingency tables to modeling income inequalities. Thus, it is commonly used as the conjugate prior of the multinomial distribution in Bayesian statistics. In this study, the parameters of a bivariate Dirichlet distribution are estimated by entropy formalism. As an alternative to maximum likelihood and the method of moments, two methods based on the principle of maximum entropy are used, namely the ordinary entropy method and the parameter space expansion method. It is shown that in estimating the parameters of the bivariate Dirichlet distribution, the ordinary entropy method and the parameter space expansion method give the same results as the method of maximum likelihood. Thus, we emphasize that these two methods can be used alternatively in modeling bivariate and multinomial Dirichlet distributions.
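For comparison with the entropy-based estimators studied above, the classical method-of-moments fit for a Dirichlet follows directly from the mean vector and the variance of one component, since m_i = a_i / a_0 and Var(X_1) = m_1(1 − m_1)/(a_0 + 1). A sketch (the function name is illustrative):

```python
def dirichlet_mom(means, var_first):
    """Method-of-moments Dirichlet fit: solve Var(X_1) = m_1 (1 - m_1) / (a_0 + 1)
    for the concentration a_0, then recover each a_i = m_i * a_0."""
    m1 = means[0]
    a0 = m1 * (1.0 - m1) / var_first - 1.0
    return [m * a0 for m in means]

# Exact moments of Dirichlet(2, 3, 5): means (0.2, 0.3, 0.5), Var(X_1) = 2*8/(100*11).
alpha_hat = dirichlet_mom([0.2, 0.3, 0.5], 16.0 / 1100.0)
print(alpha_hat)  # ≈ [2.0, 3.0, 5.0]
```

In practice the means and variance would be sample estimates, so the recovered parameters are only approximate.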
7

Bouguila, N., and D. Ziou. "A Dirichlet Process Mixture of Generalized Dirichlet Distributions for Proportional Data Modeling." IEEE Transactions on Neural Networks 21, no. 1 (January 2010): 107–22. http://dx.doi.org/10.1109/tnn.2009.2034851.

8

Christy, A., Anto Praveena, and Jany Shabu. "A Hybrid Model for Topic Modeling Using Latent Dirichlet Allocation and Feature Selection Method." Journal of Computational and Theoretical Nanoscience 16, no. 8 (August 1, 2019): 3367–71. http://dx.doi.org/10.1166/jctn.2019.8234.

Abstract:
In this information age, knowledge discovery and pattern matching play a significant role. Topic modeling, an area of text mining, is used to detect hidden patterns in a document collection. Topic modeling and document clustering are two key terms that are similar in concept and functionality. In this paper, topic modeling is carried out using Latent Dirichlet Allocation–Brute Force (LDA-BF), Latent Dirichlet Allocation–Back Tracking (LDA-BT), Latent Semantic Indexing (LSI), and Non-negative Matrix Factorization (NMF). A hybrid model is proposed that uses Latent Dirichlet Allocation (LDA) for extracting feature terms and a Feature Selection (FS) method for feature reduction. The efficiency of document clustering depends upon the selection of good features, and topic modeling is performed by enriching the good features obtained through feature selection. The proposed hybrid model produces better accuracy than the K-Means clustering method.
9

Elliott, Lloyd T., Maria De Iorio, Stefano Favaro, Kaustubh Adhikari, and Yee Whye Teh. "Modeling Population Structure Under Hierarchical Dirichlet Processes." Bayesian Analysis 14, no. 2 (June 2019): 313–39. http://dx.doi.org/10.1214/17-ba1093.

10

Li, Yuelin, Elizabeth Schofield, and Mithat Gönen. "A tutorial on Dirichlet process mixture modeling." Journal of Mathematical Psychology 91 (August 2019): 128–44. http://dx.doi.org/10.1016/j.jmp.2019.04.004.

11

Kim, Anastasiia, Sanna Sevanto, Eric R. Moore, and Nicholas Lubbers. "Latent Dirichlet Allocation modeling of environmental microbiomes." PLOS Computational Biology 19, no. 6 (June 8, 2023): e1011075. http://dx.doi.org/10.1371/journal.pcbi.1011075.

Abstract:
Interactions between stressed organisms and their microbiome environments may provide new routes for understanding and controlling biological systems. However, microbiomes are a form of high-dimensional data, with thousands of taxa present in any given sample, which makes untangling the interaction between an organism and its microbial environment a challenge. Here we apply Latent Dirichlet Allocation (LDA), a technique for language modeling, which decomposes the microbial communities into a set of topics (non-mutually-exclusive sub-communities) that compactly represent the distribution of full communities. LDA provides a lens into the microbiome at broad and fine-grained taxonomic levels, which we show on two datasets. In the first dataset, from the literature, we show how LDA topics succinctly recapitulate many results from a previous study on diseased coral species. We then apply LDA to a new dataset of maize soil microbiomes under drought, and find a large number of significant associations between the microbiome topics and plant traits as well as associations between the microbiome and the experimental factors, e.g., watering level. This yields new information on the plant-microbial interactions in maize and shows that the LDA technique is useful for studying the coupling between microbiomes and stressed organisms.
12

Cao, Shunhua, and Stewart Greenhalgh. "Attenuating boundary conditions for numerical modeling of acoustic wave propagation." GEOPHYSICS 63, no. 1 (January 1998): 231–43. http://dx.doi.org/10.1190/1.1444317.

Abstract:
Four types of boundary conditions: Dirichlet, Neumann, transmitting, and modified transmitting, are derived by combining the damped wave equation with corresponding boundary conditions. The Dirichlet attenuating boundary condition is the easiest to implement. For an appropriate choice of attenuation parameter, it can achieve a boundary reflection coefficient of a few percent in a one‐wavelength wide zone. The Neumann‐attenuating boundary condition has characteristics similar to the Dirichlet attenuating boundary condition, but it is numerically more difficult to implement. Both the transmitting boundary condition and the modified transmitting boundary condition need an absorbing boundary condition at the termination of the attenuating region. The modified transmitting boundary condition is the most effective in the suppression of boundary reflections. For multidimensional modeling, there is no perfect absorbing boundary condition, and an approximate absorbing boundary condition is used.
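A one-dimensional sketch of the Dirichlet attenuating boundary described above: the damped wave equation u_tt = c²u_xx − 2σ(x)u_t, with σ > 0 only inside a taper zone next to the fixed (u = 0) ends. All parameters below are illustrative, not taken from the paper:

```python
import math

# 1D damped wave: u_tt = c^2 u_xx - 2*sigma(x)*u_t, u = 0 at both ends (Dirichlet).
nx, c, dx, dt, zone = 200, 1.0, 1.0, 0.5, 20
C2 = (c * dt / dx) ** 2                      # Courant number squared (stable: <= 1)

def sigma(i):
    """Linear damping ramp inside a `zone`-cell strip at each end of the grid."""
    edge = min(i, nx - 1 - i)
    return 0.3 * (1.0 - edge / zone) if edge < zone else 0.0

# Gaussian pulse in the middle of the grid, initially at rest.
u_prev = [math.exp(-((i - nx // 2) / 5.0) ** 2) for i in range(nx)]
u = u_prev[:]

for _ in range(800):
    u_next = [0.0] * nx                      # endpoints stay 0: Dirichlet condition
    for i in range(1, nx - 1):
        s = sigma(i) * dt
        lap = u[i + 1] - 2.0 * u[i] + u[i - 1]
        # Centered-in-time update of the damped wave equation.
        u_next[i] = (2.0 * u[i] - (1.0 - s) * u_prev[i] + C2 * lap) / (1.0 + s)
    u_prev, u = u, u_next

residual = max(abs(v) for v in u)            # what survives after the edges absorb the pulse
```

After the split pulse crosses the taper zones, only a small fraction of the incident amplitude reflects back into the interior, consistent with the few-percent reflection coefficients the abstract reports.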
13

Muhaimin, Amri, Tresna Maulana Fahrudin, Syifa Syarifah Alamiyah, Heidy Arviani, Ade Kusuma, Allan Ruhui Fatmah Sari, and Angela Lisanthoni. "Social Media Analysis and Topic Modeling: Case Study of Stunting in Indonesia." Telematika 20, no. 3 (November 15, 2023): 406. http://dx.doi.org/10.31315/telematika.v20i3.10797.

Abstract:
Purpose: Stunting is a problem that currently requires special attention in Indonesia. The stunting rate dropped to 21.6% in 2022, and the government has set a target of 14% by 2024. Rapid technological developments and freedom of expression on the internet produce review text data that can be analyzed for evaluation. This study analyzes the text of Twitter users' reviews on stunting. The method used is a text-mining approach and topic modeling based on Latent Dirichlet Allocation. Design/methodology/approach: The methodology used in this study is Latent Dirichlet Allocation. The data were collected from Twitter with the keyword 'stunting', then cleaned and modeled using Latent Dirichlet Allocation. Findings/results: The results show that negative sentiment dominates at 60.6%, positive sentiment at 31.5%, and neutral at 7.9%. In addition, this research shows that 'children', 'decrease', 'number', 'prevention', and 'nutrition' are among the words that often appear on stunting. Originality/value/state of the art: This study uses the keyword stunting and analyzes it. Social media analytics show that the people of Indonesia are largely aware of stunting, and Latent Dirichlet Allocation can be used to create the model.
14

Altarturi, Hamza H. M., Muntadher Saadoon, and Nor Badrul Anuar. "Web content topic modeling using LDA and HTML tags." PeerJ Computer Science 9 (July 11, 2023): e1459. http://dx.doi.org/10.7717/peerj-cs.1459.

Abstract:
An immense volume of digital documents exists online and offline with content that can offer useful information and insights. Utilizing topic modeling enhances the analysis and understanding of digital documents. Topic modeling discovers latent semantic structures, or topics, within a set of digital textual documents. Internet of Things, blockchain, recommender system, and search engine optimization applications use topic modeling to handle data mining tasks such as classification and clustering. The usefulness of topic models depends on the quality of the resulting term patterns and topics. Topic coherence is the standard metric for measuring the quality of topic models. Previous studies built topic models to work on conventional documents; these models are insufficient and underperform when applied to web content data due to differences in structure between conventional and HTML documents. Neglecting the unique structure of web content leads to missing otherwise coherent topics and, therefore, low topic quality. This study proposes an innovative topic model to learn coherent topics in web content data. We present the HTML Topic Model (HTM), a web content topic model that takes HTML tags into consideration to understand the structure of web pages. We conducted two series of experiments to demonstrate the limitations of existing topic models and to examine the topic coherence of the HTM against the widely used Latent Dirichlet Allocation (LDA) model and its variants, namely the Correlated Topic Model, Dirichlet Multinomial Regression, the Hierarchical Dirichlet Process, Hierarchical Latent Dirichlet Allocation, the pseudo-document-based Topic Model, and Supervised Latent Dirichlet Allocation. The first experiment demonstrates the limitations of existing topic models when applied to web content data and, therefore, the essential need for a web content topic model: when applied to web data, overall performance dropped an average of five times and, in some cases, up to approximately 20 times lower than on conventional data. The second experiment evaluates the effectiveness of the HTM model in discovering topics and term patterns of web content data; the HTM model achieved an overall 35% improvement in topic coherence compared to LDA.
15

Ding, Yi-qun, Shan-ping Li, Zhen Zhang, and Bin Shen. "Hierarchical topic modeling with nested hierarchical Dirichlet process." Journal of Zhejiang University-SCIENCE A 10, no. 6 (June 2009): 858–67. http://dx.doi.org/10.1631/jzus.a0820796.

16

Gelfand, Alan E., Athanasios Kottas, and Steven N. MacEachern. "Bayesian Nonparametric Spatial Modeling With Dirichlet Process Mixing." Journal of the American Statistical Association 100, no. 471 (September 2005): 1021–35. http://dx.doi.org/10.1198/016214504000002078.

17

Finegold, Michael, and Mathias Drton. "Robust Bayesian Graphical Modeling Using Dirichlet $t$ -Distributions." Bayesian Analysis 9, no. 3 (September 2014): 521–50. http://dx.doi.org/10.1214/13-ba856.

18

Ferrari, Alberto. "Modeling Information Content Via Dirichlet-Multinomial Regression Analysis." Multivariate Behavioral Research 52, no. 2 (February 16, 2017): 259–70. http://dx.doi.org/10.1080/00273171.2017.1279957.

19

Obiorah, Philip, Friday Onuodu, and Batholowmeo Eke. "Topic Modeling Using Latent Dirichlet Allocation & Multinomial Logistic Regression." Advances in Multidisciplinary and scientific Research Journal Publication 10, no. 4 (December 30, 2022): 99–112. http://dx.doi.org/10.22624/aims/digital/v10n4p11a.

Abstract:
Unsupervised categorization of datasets has benefits, but not without a few difficulties. Unsupervised algorithms cluster groups of documents and often output findings as vectors containing distributions of words clustered according to their probability of occurring together. This technique requires human or domain-expert interpretation to correctly identify clusters of words as belonging to a certain topic. We propose combining Latent Dirichlet Allocation (LDA) with multi-class Logistic Regression for topic modelling as a multi-step classification process, in order to extract and classify topics from unseen texts without relying on human labelling or domain-expert interpretation. The findings suggest that the two procedures were complementary in identifying textual subjects and in overcoming the difficulty of comprehending the array of topics from the output of LDA.
Keywords: Natural Language Processing; Topic Modeling; Latent Dirichlet Allocation; Logistic Regression
20

Calistus, Ugorji C., Rapheal O. Okonkwo, Nwankwo Chekwube, and Godspower I. Akawuku. "Advancing Message Board Topic Modeling Through Stack Ensemble Techniques." IDOSR JOURNAL OF SCIENTIFIC RESEARCH 9, no. 1 (March 11, 2024): 81–90. http://dx.doi.org/10.59298/idosrjsr/2024/9.1.8190.100.

Abstract:
In the digital era, message boards serve as vital hubs for diverse discussions, knowledge dissemination, and community interaction. However, navigating the vast and varied content on these platforms presents a formidable challenge. This research pioneers the utilization of stack ensemble techniques to revolutionize topic modeling on message board data. Integrating Latent Dirichlet Allocation (LDA), Non-negative Matrix Factorization (NMF), and Latent Semantic Analysis (LSA) within a sophisticated ensemble framework, this study introduces a paradigm shift in extracting nuanced insights. Incorporating domain-specific features, sentiment analysis, and temporal patterns enriches contextual understanding. Rigorous evaluation across diverse message board datasets underscores the ensemble method's unparalleled accuracy, stability, and interpretability, setting a new standard for discourse analysis in online communities.
Keywords: Topic Modeling, Latent Dirichlet Allocation, Stack Ensemble Techniques, Natural Language Processing, Message Boards, Ensemble Learning
21

Tervonen, Tommi, Francesco Pignatti, and Douwe Postmus. "From Individual to Population Preferences: Comparison of Discrete Choice and Dirichlet Models for Treatment Benefit-Risk Tradeoffs." Medical Decision Making 39, no. 7 (September 9, 2019): 879–85. http://dx.doi.org/10.1177/0272989x19873630.

Abstract:
Introduction. The Dirichlet distribution has been proposed for representing preference heterogeneity, but there is limited evidence on its suitability for modeling population preferences on treatment benefits and risks. Methods. We conducted a simulation study to compare how the Dirichlet and standard discrete choice models (multinomial logit [MNL] and mixed logit [MXL]) differ in their convergence to stable estimates of population benefit-risk preferences. The source data consisted of individual-level tradeoffs from an existing 3-attribute patient preference study ( N = 560). The Dirichlet population model was fit directly to the attribute weights in the source data. The MNL and MXL population models were fit to the outcomes of a simulated discrete choice experiment in the same sample of 560 patients. Convergence to the parameter values of the Dirichlet and MNL population models was assessed with sample sizes ranging from 20 to 500 (100 simulations per sample size). Model variability was also assessed with coefficient P values. Results. Population preference estimates of all models were very close to the sample mean, and the MNL and MXL models had good fit (McFadden’s adjusted R2 = 0.12 and 0.13). The Dirichlet model converged reliably to within 0.05 distance of the population preference estimates with a sample size of 100, where the MNL model required a sample size of 240 for this. The MNL model produced consistently significant coefficient estimates with sample sizes of 100 and higher. Conclusion. The Dirichlet model is likely to have smaller sample size requirements than standard discrete choice models in modeling population preferences for treatment benefit-risk tradeoffs and is a useful addition to health preference analyst’s toolbox.
22

Ferrari, Diogo. "Modeling Context-Dependent Latent Effect Heterogeneity." Political Analysis 28, no. 1 (May 20, 2019): 20–46. http://dx.doi.org/10.1017/pan.2019.13.

Abstract:
Classical generalized linear models assume that marginal effects are homogeneous in the population given the observed covariates. Researchers can never be sure a priori if that assumption is adequate. Recent literature in statistics and political science has proposed models that use Dirichlet process priors to deal with the possibility of latent heterogeneity in the covariate effects. In this paper, we extend and generalize those approaches and propose a hierarchical Dirichlet process of generalized linear models in which the latent heterogeneity can depend on context-level features. Such a model is important in comparative analyses when the data come from different countries and the latent heterogeneity can be a function of country-level features. We provide a Gibbs sampler for the general model, a special Gibbs sampler for Gaussian outcome variables, and a Hamiltonian Monte Carlo within Gibbs to handle discrete outcome variables. We demonstrate the importance of accounting for latent heterogeneity with a Monte Carlo exercise and with two applications that replicate recent scholarly work. We show how Simpson's paradox can emerge in the empirical analysis if latent heterogeneity is ignored and how the proposed model can be used to estimate heterogeneity in the effect of covariates.
23

Sofo, Anthony. "EVALUATING LOG-TANGENT INTEGRALS VIA EULER SUMS." Mathematical Modelling and Analysis 27, no. 1 (February 7, 2022): 1–18. http://dx.doi.org/10.3846/mma.2022.13100.

Abstract:
An investigation into the representation of integrals involving the product of the logarithm and the arctan functions, reducing to log-tangent integrals, will be undertaken in this paper. We will show that in many cases these integrals take an explicit form involving the Riemann zeta function, the Dirichlet eta function, Dirichlet lambda function and many other special functions. Some examples illustrating the theorems will be detailed.
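For reference, the Dirichlet eta and lambda functions named in this abstract are the alternating and odd-index relatives of the Riemann zeta function:

```latex
\eta(s) = \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n^{s}} = \left(1 - 2^{1-s}\right)\zeta(s),
\qquad
\lambda(s) = \sum_{n=0}^{\infty} \frac{1}{(2n+1)^{s}} = \left(1 - 2^{-s}\right)\zeta(s).
```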
24

Fahlevvi, Mohammad Rezza, and Azhari SN. "Topic Modeling on Online News Portal Using Latent Dirichlet Allocation (LDA)." IJCCS (Indonesian Journal of Computing and Cybernetics Systems) 16, no. 4 (October 31, 2022): 335. http://dx.doi.org/10.22146/ijccs.74383.

Abstract:
The amount of news displayed on online news portals often does not indicate the topic being discussed, but the news can be read and analyzed to find the main issues and trends under discussion. A quick and efficient way to find trending topics in the news is therefore needed, and one method that can be used to solve this problem is topic modeling. Topic modeling allows users to easily and quickly understand the development of current topics. One of the algorithms in topic modeling is Latent Dirichlet Allocation (LDA). This research begins with data collection, preprocessing, n-gram formation, dictionary representation, weighting, topic model validation, topic model formation, and topic modeling results. Based on the topic evaluation, the best coherence value of the topic model was related to the number of passes: the modeling produced 20 keywords over five topics with a coherence value of 0.53, which can be considered relatively stable against standard coherence values.
25

XUE, Jianfei, and Koji EGUCHI. "Video Data Modeling Using Sequential Correspondence Hierarchical Dirichlet Processes." IEICE Transactions on Information and Systems E100.D, no. 1 (2017): 33–41. http://dx.doi.org/10.1587/transinf.2016mup0007.

26

Taylor-Rodríguez, Daniel, Kimberly Kaufeld, Erin M. Schliep, James S. Clark, and Alan E. Gelfand. "Joint Species Distribution Modeling: Dimension Reduction Using Dirichlet Processes." Bayesian Analysis 12, no. 4 (December 2017): 939–67. http://dx.doi.org/10.1214/16-ba1031.

27

Mazzuchi, T. A., E. S. Soofi, and R. Soyer. "Computation of maximum entropy Dirichlet for modeling lifetime data." Computational Statistics & Data Analysis 32, no. 3-4 (January 2000): 361–78. http://dx.doi.org/10.1016/s0167-9473(99)00090-0.

28

Lu, Hsin-Min, Chih-Ping Wei, and Fei-Yuan Hsiao. "Modeling healthcare data using multiple-channel latent Dirichlet allocation." Journal of Biomedical Informatics 60 (April 2016): 210–23. http://dx.doi.org/10.1016/j.jbi.2016.02.003.

29

Wiranto, Wiranto, and Mila Rosyida Uswatunnisa. "Topic Modeling for Support Ticket using Latent Dirichlet Allocation." Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) 6, no. 6 (December 29, 2022): 998–1005. http://dx.doi.org/10.29207/resti.v6i6.4542.

Abstract:
In the business world, communication with customers must be managed properly to make it easier for companies to find out what customers want. The support ticket is one business instrument for communication between customers and companies: through a support ticket, customers can respond, complain, or ask questions about products with a support team. As a company's business processes grow, so does the support ticket volume that the support team must handle, and these tickets also have value for analysis toward business intelligence decisions. An efficient data processing method is therefore needed to find the topics being discussed by customers, and one way to solve this problem is topic modeling. This research tunes several parameters: the number of topics, the alpha value, the beta value, the number of iterations, and the random seed. With this combination of parameters, the best results based on human judgement and topic coherence were obtained with 5 topics, an alpha value of 50, a beta value of 0.01, 100 iterations, and a random seed of 50. The five topics are interpreted as hosting migration, error problems in WordPress, domain email settings and domain transfer, ticketing, and transaction processing, with a coherence value of 0.507897.
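Topic coherence, used above to select the winning parameter combination, can be illustrated with the UMass variant, which rewards topic words that co-occur in documents (the toy corpus below is illustrative):

```python
import math

def umass_coherence(topic_words, docs):
    """UMass coherence: sum over ordered word pairs of
    log((D(w_i, w_j) + 1) / D(w_j)), where D counts the documents
    containing the given word(s)."""
    def d(*words):
        return sum(1 for doc in docs if all(w in doc for w in words))
    score = 0.0
    for i in range(1, len(topic_words)):
        for j in range(i):
            score += math.log((d(topic_words[i], topic_words[j]) + 1) / d(topic_words[j]))
    return score

docs = [{"cat", "dog", "pet"}, {"cat", "dog"}, {"cat", "pet"},
        {"stock", "market"}, {"stock", "price"}]
coherent = umass_coherence(["cat", "dog", "pet"], docs)      # words co-occur
incoherent = umass_coherence(["cat", "dog", "stock"], docs)  # mixed domains
```

A topic whose words frequently appear together scores higher, so coherence gives an automatic, label-free proxy for the human judgement used in the study.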
30

Momtazi, Saeedeh, and Felix Naumann. "Topic modeling for expert finding using latent Dirichlet allocation." Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery 3, no. 5 (August 20, 2013): 346–53. http://dx.doi.org/10.1002/widm.1102.

31

Wubneh, Kahsay Godifey. "Solving Fundamental Solution of Non-Homogeneous Heat Equation with Dirichlet Boundary Conditions." Bulletin of Mathematical Sciences and Applications 22 (October 2020): 1–9. http://dx.doi.org/10.18052/www.scipress.com/bmsa.22.1.

Abstract:
In this study, we developed a solution of the non-homogeneous heat equation with Dirichlet boundary conditions, specifically the non-homogeneous heat equation with constant coefficients. Since the heat equation has a simple form, we start from it to find the exact solution of the partial differential equation with constant coefficients. To emphasize our main results, we also consider some important ways of solving partial differential equations, especially the heat equation with Dirichlet boundary conditions. The main results of our paper are quite general in nature, yield some interesting solutions of the non-homogeneous heat equation with Dirichlet boundary conditions, and are useful for problems of mathematical modeling and mathematical physics.
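A quick numerical check of such a problem: for u_t = u_xx + f(x) on [0, 1] with homogeneous Dirichlet boundaries and f ≡ 1, the steady state is u(x) = x(1 − x)/2. A minimal explicit finite-difference sketch (grid sizes are illustrative):

```python
# Explicit finite differences for u_t = u_xx + f(x), u(0) = u(1) = 0 (Dirichlet).
nx = 11                      # grid points on [0, 1]
dx = 1.0 / (nx - 1)
dt = 0.4 * dx * dx           # stable because dt / dx^2 <= 0.5
f = [1.0] * nx               # constant source term

u = [0.0] * nx               # start from zero; Dirichlet endpoints stay fixed at 0
for _ in range(5000):
    u = [0.0] + [
        u[i] + dt * ((u[i + 1] - 2 * u[i] + u[i - 1]) / (dx * dx) + f[i])
        for i in range(1, nx - 1)
    ] + [0.0]

mid = u[nx // 2]             # steady state: u(1/2) = (1/2)(1/2)/2 = 0.125
```

Because the steady solution is quadratic, the discrete Laplacian reproduces it exactly, so the marched solution converges to the analytic value at the grid nodes.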
32

Nguyen, Tran Diem Hanh. "Topic based document modeling for information filtering." CTU Journal of Innovation and Sustainable Development 15, ISDS (October 16, 2023): 102–9. http://dx.doi.org/10.22144/ctujoisd.2023.040.

Abstract:
Information Filtering (IF), which has been widely studied in recent years, is one of the areas that applies document retrieval techniques to deal with the huge amount of information. In IF systems, modelling the user's interest and filtering relevant documents are the major components, and various approaches have been proposed for modelling the first. In this study, we utilized a topic-modelling technique, Latent Dirichlet Topic Modelling, to model the user's interest for IF. In particular, an extended model named Latent Dirichlet Topic Modelling with high Frequency Occurrences, abbreviated LDA_HF, was proposed with the intention of enhancing the retrieval performance of IF systems. The new model was compared to existing user-interest modelling methods such as BM25, pLSA, and LDA_IF over the big benchmark datasets RCV1 and R8. The results of extensive experiments showed that the proposed model outperformed all the state-of-the-art baseline user-modelling methods (BM25, pLSA, and LDA_IF) according to four major measurement metrics: Top20, B/P, MAP, and F1. Hence, the LDA_HF model promises to be a reliable method for enhancing the performance of IF systems.
33

Muhajir, Muhammad, Dedi Rosadi, and Danardono Danardono. "Improving the term weighting log entropy of latent dirichlet allocation." Indonesian Journal of Electrical Engineering and Computer Science 34, no. 1 (April 1, 2024): 455. http://dx.doi.org/10.11591/ijeecs.v34.i1.pp455-462.

Abstract:
The process of analyzing textual data involves the utilization of topic modeling techniques to uncover latent subjects within documents. The presence of numerous short texts in the Indonesian language poses additional challenges in the field of topic modeling. This study presents a substantial enhancement to the term weighting log entropy (TWLE) approach within the latent Dirichlet allocation (LDA) framework, specifically tailored for topic modeling of Indonesian short texts. This work places significant emphasis on the utilization of LDA for word weighting. The research endeavor aimed to enhance the coherence and interpretability of an Indonesian topic model through the integration of local and global weights: the local weight focuses on the distinct characteristics of each document, whereas the global weight examines the broader perspective of the entire corpus. The objective was to enhance the effectiveness of LDA topics through this amalgamation. The TWLE model of LDA was found to be more informative and effective than TF-IDF LDA when compared on short Indonesian texts. This work thus improves topic modeling for short Indonesian compositions; transfer learning for NLP and Indonesian language adaptation helps improve the knowledge and precision of subject analysis, which could boost NLP and topic modeling in Indonesian.
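The local/global decomposition described above matches the standard log-entropy weighting from the LSA literature: local weight log(1 + tf) per document, and global weight g = 1 + Σ_j p_j log p_j / log n over the corpus. A compact sketch (the function name is illustrative):

```python
import math

def log_entropy_weights(term_doc_counts):
    """term_doc_counts: per-document counts of one term across n documents.
    Returns (global_weight, per-document local*global weights), where the
    global weight is g = 1 + sum_j p_j log p_j / log n with p_j = tf_j / gf."""
    n = len(term_doc_counts)
    gf = sum(term_doc_counts)
    entropy = sum((tf / gf) * math.log(tf / gf) for tf in term_doc_counts if tf > 0)
    g = 1.0 + entropy / math.log(n)
    local = [math.log(1.0 + tf) for tf in term_doc_counts]
    return g, [l * g for l in local]

g_spread, _ = log_entropy_weights([1, 1, 1, 1])   # evenly spread term -> g = 0
g_focused, _ = log_entropy_weights([4, 0, 0, 0])  # concentrated term  -> g = 1
```

Terms spread evenly across the corpus carry little topical signal and are weighted toward zero, while terms concentrated in few documents keep full weight.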
APA, Harvard, Vancouver, ISO, and other styles
34

Arianto, Bagus Wicaksono, and Gangga Anuraga. "Topic Modeling for Twitter Users Regarding the "Ruanggguru" Application." Jurnal ILMU DASAR 21, no. 2 (July 7, 2020): 149. http://dx.doi.org/10.19184/jid.v21i2.17112.

Full text
Abstract:
PT Ruang Raya Indonesia ("Ruangguru") is the largest and most comprehensive technology company in Indonesia that focuses on education-based services. In 2019 there were 15 million Ruangguru users and 300,000 teachers who had joined, present in 32 provinces in Indonesia. It prepared a number of expansion strategies to become a company valued at more than US$1 billion in the next year or two. The purpose of this research is to classify the opinions of Ruangguru users about the services provided, so that it can serve as evaluation material for improving those services, using the latent Dirichlet allocation method. The data come from a collection of tweets by Twitter users in Indonesia, gathered via the Twitter API. The Twitter account used in this study is @ruangguru. The analysis showed that the perceptions of Twitter users, modeled using latent Dirichlet allocation, formed 28 topics. Keywords: latent dirichlet allocation, ruangguru, twitter.
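The latent Dirichlet allocation model applied in studies like this one can be illustrated with a minimal collapsed Gibbs sampler. The sketch below is a generic textbook-style implementation on a toy corpus, not the authors' pipeline or data:

```python
import random
from collections import defaultdict

def lda_gibbs(docs, n_topics, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Minimal collapsed Gibbs sampler for LDA (illustrative, unoptimized)."""
    rng = random.Random(seed)
    vocab_size = len({w for d in docs for w in d})
    # Random initial topic for every token, plus running count tables.
    z = [[rng.randrange(n_topics) for _ in d] for d in docs]
    ndk = [[0] * n_topics for _ in docs]          # doc-topic counts
    nkw = [defaultdict(int) for _ in range(n_topics)]  # topic-word counts
    nk = [0] * n_topics                            # topic totals
    for di, d in enumerate(docs):
        for wi, w in enumerate(d):
            t = z[di][wi]
            ndk[di][t] += 1; nkw[t][w] += 1; nk[t] += 1
    for _ in range(iters):
        for di, d in enumerate(docs):
            for wi, w in enumerate(d):
                t = z[di][wi]
                ndk[di][t] -= 1; nkw[t][w] -= 1; nk[t] -= 1
                # Resample proportional to (doc-topic) * (topic-word).
                probs = [(ndk[di][k] + alpha) * (nkw[k][w] + beta)
                         / (nk[k] + vocab_size * beta)
                         for k in range(n_topics)]
                r, acc, new_t = rng.uniform(0, sum(probs)), 0.0, n_topics - 1
                for k, p in enumerate(probs):
                    acc += p
                    if r <= acc:
                        new_t = k
                        break
                z[di][wi] = new_t
                ndk[di][new_t] += 1; nkw[new_t][w] += 1; nk[new_t] += 1
    return z, nkw

# Toy corpus with two obvious themes.
docs = [["goal", "match", "team"]] * 4 + [["vote", "party", "ballot"]] * 4
z, nkw = lda_gibbs(docs, n_topics=2)
```

In practice, library implementations (e.g. Gensim or scikit-learn) would be used on real tweet data; this sketch only shows the sampling loop that underlies such tools.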
APA, Harvard, Vancouver, ISO, and other styles
35

A. Al-Sultany, Ghaidaa, and Hiba J. Aleqabie. "Events Tagging in Twitter Using Twitter Latent Dirichlet Allocation." International Journal of Engineering & Technology 7, no. 4.19 (November 27, 2018): 884–88. http://dx.doi.org/10.14419/ijet.v7i4.19.28065.

Full text
Abstract:
Twitter has become a major platform for publishing and carrying news, advertisements, events, topics, and even the daily events of our lives. Twitter posts are limited in length and noisy, which makes them unsuitable for standard topic modeling due to sparsity. In this paper, the Twitter Latent Dirichlet Allocation (TLDA) method for topic modeling was applied to overcome the sparsity problem of tweet modeling. Several steps were implemented for event tagging on Twitter. First, a dataset was constructed with a hashtag pooling technique, and preprocessing was performed to extract the features. Second, a suitable number of topics was found using the perplexity criterion, and the topics were labeled with the WordNet lexicon. Finally, events were tagged using the Pointwise Mutual Information (PMI) criterion. The dataset covers various topics, including the American elections, the 2018 Football World Cup, natural phenomena, and many others; the number of tweets is 63,458. This study shows good results on the training tweet dataset.
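The Pointwise Mutual Information criterion used for tagging scores how strongly two words co-occur relative to chance. A minimal sketch over document-level co-occurrence follows; the toy documents are illustrative, not from the study's dataset:

```python
import math

def pmi(word_a, word_b, docs):
    """Pointwise mutual information of two words, estimated from
    document-level co-occurrence: log p(a, b) / (p(a) * p(b))."""
    n = len(docs)
    p_a = sum(word_a in d for d in docs) / n
    p_b = sum(word_b in d for d in docs) / n
    p_ab = sum(word_a in d and word_b in d for d in docs) / n
    if p_ab == 0:
        return float("-inf")  # never co-occur
    return math.log(p_ab / (p_a * p_b))

docs = [
    {"election", "vote"},
    {"election", "vote"},
    {"goal", "match"},
    {"goal", "vote"},
]
```

Word pairs that co-occur more often than independence would predict score above zero, while pairs that never co-occur score negative infinity, which is why smoothing is usually added in practice.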
APA, Harvard, Vancouver, ISO, and other styles
37

Shahbazi, Zeinab, and Yung-Cheol Byun. "Topic prediction and knowledge discovery based on integrated topic modeling and deep neural networks approaches." Journal of Intelligent & Fuzzy Systems 41, no. 1 (August 11, 2021): 2441–57. http://dx.doi.org/10.3233/jifs-202545.

Full text
Abstract:
Understanding real-world short texts has become an essential task in recent research. Document deduction analysis and coherent latent topics are important aspects of this process. Latent Dirichlet Allocation (LDA) and Probabilistic Latent Semantic Analysis (PLSA) have been suggested for modeling huge collections of information and documents. The main problems in this type of context are limited information, word relationships, sparsity, and knowledge extraction. Knowledge discovery and machine learning techniques integrated with topic modeling were proposed to overcome these issues. Knowledge discovery was applied based on hidden-information extraction to build a dataset suitable for further analysis. Machine learning techniques, an Artificial Neural Network (ANN) and a Long Short-Term Memory (LSTM) network, are applied to anticipate topic movements. The LSTM layers are fed with the latent topic distribution learned from a pre-trained Latent Dirichlet Allocation (LDA) model. We present general information on the different techniques applied in short-text topic modeling. We propose three categories based on the Dirichlet multinomial mixture, global word co-occurrences, and self-aggregation, using a representative design, and analyze the performance of all categories on different tasks. Finally, the proposed system is evaluated against state-of-the-art methods on real-world datasets, compared with long-document topic modeling algorithms, and used to create a classification framework that incorporates further knowledge and represents it in the machine learning pipeline.
APA, Harvard, Vancouver, ISO, and other styles
38

Ma, Zhanyu, Yuping Lai, W. Bastiaan Kleijn, Yi-Zhe Song, Liang Wang, and Jun Guo. "Variational Bayesian Learning for Dirichlet Process Mixture of Inverted Dirichlet Distributions in Non-Gaussian Image Feature Modeling." IEEE Transactions on Neural Networks and Learning Systems 30, no. 2 (February 2019): 449–63. http://dx.doi.org/10.1109/tnnls.2018.2844399.

Full text
APA, Harvard, Vancouver, ISO, and other styles
39

Obiora, Philip, Batholowmeo Eke, and Friday Onuodu. "mlChatApp: Topic Modeling in Online Chat Groups." Advances in Multidisciplinary and scientific Research Journal Publication 10, no. 3 (September 30, 2022): 100–110. http://dx.doi.org/10.22624/aims/maths/v10n3p8.

Full text
Abstract:
Most of the time, online messaging group users must scroll through and read a large number of irrelevant posts in order to gain a clear understanding of what is being discussed in the group to which they belong. Messaging groups can get congested with unnecessary messages, causing members to miss important issues and information. There is a need to assist users of multi-user chat systems in understanding what a group discussion is about at any particular time without having to read all of the posted messages. This paper describes an approach to discovering topics in online chat groups. In order to extract and categorize subjects from unseen texts in online group discussions, we developed a new multi-user chat system (ML-CHAT-APP) that automatically identifies and categorizes topics within posts/messages as they appear. We implemented a combination of a Latent Dirichlet Allocation (LDA)-based model with multinomial logistic regression. The resulting model was integrated into the ML-CHAT-APP, built with Python and the Tkinter framework for the graphical user interface. The results show that the application was helpful in identifying topics in text conversations and adding identified topics as labels to message posts in real time. Keywords: NLP, Topic Modeling, Latent Dirichlet Allocation, Logistic Regression
APA, Harvard, Vancouver, ISO, and other styles
40

Guo, Yunyan, and Jianzhong Li. "Distributed Latent Dirichlet Allocation on Streams." ACM Transactions on Knowledge Discovery from Data 16, no. 1 (July 3, 2021): 1–20. http://dx.doi.org/10.1145/3451528.

Full text
Abstract:
Latent Dirichlet Allocation (LDA) has been widely used for topic modeling, with applications spanning various areas such as natural language processing and information retrieval. While LDA on small and static datasets has been extensively studied, several real-world challenges are posed in practical scenarios where datasets are often huge and are gathered in a streaming fashion. As the state-of-the-art LDA algorithm on streams, Streaming Variational Bayes (SVB) introduced Bayesian updating to provide a streaming procedure. However, the utility of SVB is limited in applications since it ignored three challenges of processing real-world streams: topic evolution, data turbulence, and real-time inference. In this article, we propose a novel distributed LDA algorithm, referred to as StreamFed-LDA, to deal with these challenges on streams. For topic modeling of streaming data, the ability to capture evolving topics is essential for practical online inference. To achieve this goal, StreamFed-LDA is based on a specialized framework that supports lifelong (continual) learning of evolving topics. On the other hand, data turbulence is commonly present in streams due to real-life events. In that case, the design of StreamFed-LDA allows the model to learn new characteristics from the most recent data while maintaining the historical information. On massive streaming data, it is difficult and crucial to provide real-time inference results. To increase the throughput and reduce the latency, StreamFed-LDA introduces additional techniques that substantially reduce both computation and communication costs in distributed systems. Experiments on four real-world datasets show that the proposed framework achieves significantly better online inference performance compared with the baselines. At the same time, StreamFed-LDA also reduces the latency by orders of magnitude on real-world datasets.
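The Bayesian updating that streaming variational approaches build on treats the posterior after one mini-batch as the prior for the next. For the conjugate Dirichlet-multinomial case this reduces to accumulating counts into the concentration parameters, as the simplified illustration below shows; it is not the StreamFed-LDA algorithm itself, and the batch values are made up:

```python
def streaming_dirichlet_update(alpha, batches):
    """Sequentially update Dirichlet concentration parameters over a
    stream of count vectors: the posterior of batch t becomes the
    prior of batch t+1 (conjugate Dirichlet-multinomial updating)."""
    for counts in batches:
        alpha = [a + c for a, c in zip(alpha, counts)]
    return alpha

def posterior_mean(alpha):
    """Expected category probabilities under Dirichlet(alpha)."""
    s = sum(alpha)
    return [a / s for a in alpha]

# Start from a symmetric prior and stream in two mini-batches of
# per-topic counts.
alpha = streaming_dirichlet_update([1.0, 1.0, 1.0], [[5, 0, 0], [3, 1, 0]])
```

The final posterior is identical to the one obtained from processing all the data at once, which is the property that makes this kind of sequential updating attractive for streams.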
APA, Harvard, Vancouver, ISO, and other styles
41

Peterson, Daniel, Susan Brown, and Martha Palmer. "Verb Class Induction with Partial Supervision." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 05 (April 3, 2020): 8616–23. http://dx.doi.org/10.1609/aaai.v34i05.6385.

Full text
Abstract:
Dirichlet-multinomial (D-M) mixtures like latent Dirichlet allocation (LDA) are widely used for both topic modeling and clustering. Prior work on constructing Levin-style semantic verb clusters achieves state-of-the-art results using D-M mixtures for verb sense induction and clustering. We add a bias toward known clusters by explicitly labeling a small number of observations with their correct VerbNet class. We demonstrate that this partial supervision guides the resulting clusters effectively, improving the recovery of both labeled and unlabeled classes by 16%, for a joint 12% absolute improvement in F1 score compared to clustering without supervision. The resulting clusters are also more semantically coherent. Although the technical change is minor, it produces a large effect, with important practical consequences for supervised topic modeling in general.
APA, Harvard, Vancouver, ISO, and other styles
42

Jasas, Mindaugas, Antanas Laurinčikas, Mindaugas Stoncelis, and Darius Šiaučiūnas. "DISCRETE UNIVERSALITY OF ABSOLUTELY CONVERGENT DIRICHLET SERIES." Mathematical Modelling and Analysis 27, no. 1 (February 7, 2022): 78–87. http://dx.doi.org/10.3846/mma.2022.15069.

Full text
Abstract:
In the paper, a universality theorem of discrete type on the approximation of analytic functions by shifts of a special absolutely convergent Dirichlet series is obtained. This series is close in a certain sense to the periodic zeta-function and depends on a parameter.
APA, Harvard, Vancouver, ISO, and other styles
43

Tong, Xin. "Semiparametric Bayesian Methods in Growth Curve Modeling for Nonnormal Data Analysis." Journal of Behavioral Data Science 1, no. 1 (May 2021): 53–84. http://dx.doi.org/10.35566/jbds/v1n1/p4.

Full text
Abstract:
Semiparametric Bayesian methods have been proposed in the literature for growth curve modeling to reduce the adverse effect of having nonnormal data. The normality assumption of measurement errors in traditional growth curve models was replaced by a random distribution with Dirichlet process mixture priors. However, both the random effects and measurement errors are equally likely to be nonnormal. Therefore, in this study, three types of robust distributional growth curve models are proposed from a semiparametric Bayesian perspective, in which random coefficients or measurement errors follow either normal distributions or unknown random distributions with Dirichlet process mixture priors. Based on a Monte Carlo simulation study, we evaluate the performance of the robust models and demonstrate that selecting an appropriate model for practical data analyses is very important, by comparing the three types of robust distributional models as well as the traditional growth curve models with the normality assumption. We also provide a straightforward strategy to select the appropriate model.
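A Dirichlet process mixture prior of the kind used here is often represented through its truncated stick-breaking construction, in which mixture weights are carved from a unit stick by Beta draws. A small sketch follows; the concentration value and truncation level are arbitrary choices for illustration:

```python
import random

def stick_breaking(concentration, n_atoms, seed=0):
    """Truncated stick-breaking weights for a Dirichlet process prior:
    each weight is a Beta(1, concentration) fraction of the stick
    remaining after the previous breaks."""
    rng = random.Random(seed)
    weights, remaining = [], 1.0
    for _ in range(n_atoms):
        v = rng.betavariate(1.0, concentration)
        weights.append(v * remaining)
        remaining *= 1.0 - v
    return weights

w = stick_breaking(concentration=2.0, n_atoms=10)
```

Smaller concentration values put most of the mass on the first few atoms (few effective mixture components), while larger values spread it out; the truncated weights always sum to slightly less than one, with the remainder attributed to the discarded tail.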
APA, Harvard, Vancouver, ISO, and other styles
44

Anisi, Mohammad, and Morteza Analoui. "Multinomial Agent's Trust Modeling Using Entropy of the Dirichlet Distribution." International Journal of Artificial Intelligence & Applications 2, no. 3 (July 31, 2011): 1–11. http://dx.doi.org/10.5121/ijaia.2011.2301.

Full text
APA, Harvard, Vancouver, ISO, and other styles
45

Kwak, Minho, and HyunSuk Han. "Topic Modeling of Psychometric Journals Based on Latent Dirichlet Allocation." Journal of Research Methodology 5, no. 3 (November 30, 2020): 29–63. http://dx.doi.org/10.21487/jrm.2020.11.5.3.29.

Full text
APA, Harvard, Vancouver, ISO, and other styles
46

Sutherland, Ian, Youngseok Sim, Seul Ki Lee, Jaemun Byun, and Kiattipoom Kiatkawsin. "Topic Modeling of Online Accommodation Reviews via Latent Dirichlet Allocation." Sustainability 12, no. 5 (February 28, 2020): 1821. http://dx.doi.org/10.3390/su12051821.

Full text
Abstract:
There is a lot of attention given to the determinants of guest satisfaction and consumer behavior in the tourism literature. While much extant literature uses a deductive approach for identifying guest satisfaction dimensions, we apply an inductive approach by utilizing large unstructured text data of 104,161 online reviews of Korean accommodation customers to frame which topics of interest guests find important. Using latent Dirichlet allocation, a generative, Bayesian, hierarchical statistical model, we extract and validate topics of interest in the dataset. The results corroborate extant literature in that dimensions, such as location and service quality, are important. However, we extend existing dimensions of importance by more precisely distinguishing aspects of location and service quality. Furthermore, by comparing the characteristics of the accommodations in terms of metropolitan versus rural and the type of accommodation, we reveal differences in topics of importance between different characteristics of the accommodations. Specifically, we find a higher importance for points of competition and points of uniqueness among the accommodation characteristics. This has implications for how managers can improve customer satisfaction and how researchers can more precisely measure customer satisfaction in the hospitality industry.
APA, Harvard, Vancouver, ISO, and other styles
47

Heck, Michael, Sakriani Sakti, and Satoshi Nakamura. "Dirichlet Process Mixture of Mixtures Model for Unsupervised Subword Modeling." IEEE/ACM Transactions on Audio, Speech, and Language Processing 26, no. 11 (November 2018): 2027–42. http://dx.doi.org/10.1109/taslp.2018.2852500.

Full text
APA, Harvard, Vancouver, ISO, and other styles
48

Artiles, William, and André Nachbin. "Asymptotic nonlinear wave modeling through the Dirichlet-to-Neumann operator." Methods and Applications of Analysis 11, no. 4 (2004): 475–92. http://dx.doi.org/10.4310/maa.2004.v11.n4.a3.

Full text
APA, Harvard, Vancouver, ISO, and other styles
49

Traunmüller, Richard, Andreas Murr, and Jeff Gill. "Modeling Latent Information in Voting Data with Dirichlet Process Priors." Political Analysis 23, no. 1 (2015): 1–20. http://dx.doi.org/10.1093/pan/mpu018.

Full text
Abstract:
We apply a specialized Bayesian method that helps us deal with the methodological challenge of unobserved heterogeneity among immigrant voters. Our approach is based on generalized linear mixed Dirichlet models (GLMDMs), where random effects are specified semiparametrically using a Dirichlet process mixture prior that has been shown to account for unobserved grouping in the data. Such models are drawn from Bayesian nonparametrics to help overcome objections to handling latent effects with strongly informed prior distributions. Using 2009 German voting data of immigrants, we show that for difficult problems of missing key covariates and unexplained heterogeneity this approach provides (1) overall improved model fit, (2) smaller standard errors on average, and (3) less bias from omitted variables. As a result, the GLMDM changed our substantive understanding of the factors affecting immigrants' turnout and vote choice. Once we account for unobserved heterogeneity among immigrant voters, whether a voter belongs to the first immigrant generation or not is much less important than the extant literature suggests. When looking at vote choice, we also found that an immigrant's degree of structural integration does not affect the vote in favor of the CDU/CSU, a party that is traditionally associated with restrictive immigration policy.
APA, Harvard, Vancouver, ISO, and other styles
50

Ladouceur, Martin, Elham Rahme, Patrick Bélisle, Allison N. Scott, Kevin Schwartzman, and Lawrence Joseph. "Modeling continuous diagnostic test data using approximate Dirichlet process distributions." Statistics in Medicine 30, no. 21 (July 22, 2011): 2648–62. http://dx.doi.org/10.1002/sim.4320.

Full text
APA, Harvard, Vancouver, ISO, and other styles