Theses / dissertations: "Data mining – social aspects"

1

Chen, Weidong. "Discovering communities by information diffusion and link density propagation". HKBU Institutional Repository, 2012. https://repository.hkbu.edu.hk/etd_ra/1422.

2

Nguyen, Ngoc Buu Cat. "Data Mining in Knowledge Management Processes: Developing an Implementing Framework". Thesis, Umeå universitet, Institutionen för informatik, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-149668.

Abstract:

Analyzing huge amounts of data has become both a tricky challenge and an opportunity for data miners and business people today. Knowledge management processes can draw on large knowledge sources to find tacit intelligence, making businesses more agile and effective. Data mining is a powerful tool that works with big data to create forecasting and analysis capabilities. Yet there is a lack of research on where and how data mining can add value to knowledge management processes in organizations in order to maximize valuable knowledge for innovation and business management. The knowledge management processes of a psychiatry section in a Swedish hospital were used as a case study for this thesis. Interviews were conducted with a manager, a psychiatrist, an auxiliary nurse and data scientists. The collected data were analyzed to identify the value data mining can create, using a value creation framework applied across the knowledge management processes of the psychiatry section. This analysis exposed the limitations and strengths of those processes, from which a data mining implementation framework was formulated, and potential uses of data mining were suggested to support all employees of the psychiatry section in decision making and patient care.

3

Yang, Shuang-Hong. "Predictive models for online human activities". Diss., Georgia Institute of Technology, 2012. http://hdl.handle.net/1853/43689.

Abstract:

The availability and scale of user-generated data in online systems raise tremendous challenges and opportunities for the analytic study of human activities. Effective modeling of online human activities is not only fundamental to the understanding of human behavior, but also important to the online industry. This thesis focuses on developing models and algorithms to predict human activities in online systems and to improve the algorithmic design of personalized/socialized systems (e.g., recommendation, advertising, Web search systems). We are particularly interested in three types of online user activities, i.e., decision making, social interactions and user-generated contents. Centered around these activities, the thesis focuses on three challenging topics: 1. Behavior prediction, i.e., predicting users' online decisions. We present Collaborative-Competitive Filtering, a novel game-theoretic framework for predicting users' online decision-making behavior, and leverage that knowledge to optimize the design of online systems (e.g., recommendation systems) with respect to certain strategic goals (e.g., sales revenue, consumption diversity). 2. Social contagion, i.e., modeling the interplay between social interactions and individual decision-making behavior. We establish the joint Friendship-Interest Propagation model and the Behavior-Relation Interplay model, a series of statistical approaches to characterize the behavior of an individual user's decision making, the interactions among socially connected users, and the interplay between these two activities. These techniques are demonstrated by applications to social behavior targeting. 3. Content mining, i.e., understanding user-generated contents. We propose the Topic-Adapted Latent Dirichlet Allocation model, a probabilistic model for identifying a user's hidden cognitive aspects (e.g., knowledgeability) from the texts created by the user. The model is successfully applied to address the challenge of the "language gap" in medical information retrieval.

4

Cai, Zhongming. "Technical aspects of data mining". Thesis, Cardiff University, 2001. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.395784.

5

Eriksson, Jesper, and Samuel Björeqvist. "Datadriven Innovation : En komparativ studie om dataanalysmetoder och verktyg för små företag". Thesis, Umeå universitet, Institutionen för informatik, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-149865.

Abstract:

Businesses today often operate in a highly competitive environment where information is a notably valuable asset. Businesses therefore need powerful tools for extracting actionable business knowledge. Research shows that SMEs lag behind large companies in the use of data analytics, even though they know the potential benefits. We want to study and compare different tools for data analytics and how they can be used by small companies. Our research questions are therefore: what analytical tools are available on the market today, and what are their possibilities and challenges for small companies? And: how can these analytical tools aid in the development of a business, product or service? We conclude that there are several data analytics tools available for small businesses, that their different uses can be applied successfully and without great cost, and that their relevance, both in business development and innovation, depends on the business objectives and goals of their utilization.

6

Wang, Guan. "Graph-Based Approach on Social Data Mining". Thesis, University of Illinois at Chicago, 2015. http://pqdtopen.proquest.com/#viewpdf?dispub=3668648.

Abstract:

Powered by big data infrastructures, social network platforms are gathering data on many aspects of our daily lives. The online social world reflects our physical world in an increasingly detailed way by collecting people's individual biographies and their various relationships with other people. Although a massive amount of social data has been gathered, an urgent challenge remains unsolved: discovering meaningful knowledge that can empower social platforms to really understand their users from different perspectives.

Motivated by this trend, my research addresses the reasoning and mathematical modeling behind interesting phenomena on social networks. The major goal of my research is to propose a graph-based data mining framework for heterogeneous data sources. The algorithms, by design, use graph structure with heterogeneous link and node features to represent the basic structures of social networks and the phenomena on top of them.

The graph-based heterogeneous mining methodology proves effective on a series of knowledge discovery topics, including network structure and macro social pattern mining such as magnet community detection (87), social influence propagation and social similarity mining (85), and spam detection (86). Future work will consider dynamic relations in social data mining and how graph-based approaches adapt to new situations.

7

Ip, Lai Cheng. "Mining on social network community for marketing". Thesis, University of Macau, 2018. http://umaclib3.umac.mo/record=b3950661.

8

Costa, Alceu Ferraz. "Mining User Activity Data in Social Media Services". Universidade de São Paulo, 2017. http://www.teses.usp.br/teses/disponiveis/55/55134/tde-11092017-151000/.

Abstract:

Social media services have a growing impact on our society. Individuals often rely on social media to get their news, decide which products to buy or communicate with their friends. As a consequence of the widespread adoption of social media, a large volume of data on how users behave is created every day and stored in large databases. Learning how to analyze and extract useful knowledge from this data has a number of potential applications. For instance, a deeper understanding of how legitimate users interact with social media services could be used to design more accurate spam and fraud detection methods. This PhD research is based on the following hypothesis: data generated by social media users present patterns that can be exploited to improve the effectiveness of tasks such as prediction, forecasting and modeling in the domain of social media. To validate our hypothesis, we focus on designing data mining methods tailored to social media data. The main contributions of this PhD can be divided into three parts. First, we propose Act-M, a mathematical model that describes the timing of users' actions. We also show that Act-M can be used to automatically detect bots among social media users based only on timing (i.e., timestamp) data. Our second contribution is VnC (Vote-and-Comment), a model that explains how the volume of different types of user interactions evolves over time when a piece of content is submitted to a social media service. In addition to accurately matching real data, VnC is useful, as it can be employed to forecast the number of interactions received by social media content. Finally, our third contribution is the MFS-Map method. MFS-Map automatically provides textual annotations for social media images by efficiently combining visual and metadata features. Our contributions were validated using real data from several social media services. Our experiments show that the Act-M and VnC models provided a more accurate fit to the data than existing models for communication dynamics and information diffusion, respectively. MFS-Map obtained both superior precision and faster speed when compared to other widely employed image annotation methods.

9

Meneghello, James. "A scalable framework for integrated social data mining". PhD thesis, Murdoch University, 2017. https://researchrepository.murdoch.edu.au/id/eprint/36690/.

Abstract:

Social Networking Sites (SNS) are ubiquitous within modern society, forming communications networks that span cultural and geographical boundaries. The information posted to these sites provides useful insights into individuals, but can also provide a wealth of information for further analysis of the surrounding environment. Three main challenges limit the use of this information in applications: the quantity of data is often unmanageable, a significant amount of data is unavailable for use due to a lack of generic interfaces for access, and it is difficult to integrate multiple disparate social data sources. The overall aim of the research described in this thesis is to advance the field of data science and improve the accessibility of social data in analytical applications, in both academic and commercial settings. This aim has been addressed with three primary contributions: new algorithms to efficiently locate and collect relevant social data, new methods of performing unsupervised data extraction from generic social sites, and the development and subsequent empirical evaluation of a framework to facilitate the collection, integration, storage and presentation of social data for use in applications. The first contribution was a search query optimisation algorithm designed to reduce the amount of noise resulting from social data collection by learning from collected content and iteratively building new query keyword sets. The algorithm was empirically evaluated, and the results indicated that it provides significantly more data than existing search tools while improving the signal-to-noise ratio. The second contribution aimed to improve access to social data that is available on Web 2.0 sites but has no existing interface for access. The algorithm is designed to extract social data from sites without any a priori knowledge of design or page layout. Its efficacy was empirically evaluated against a testbed consisting of popular news and current affairs websites. Results indicated that the algorithm was very effective at unsupervised retrieval of social data. The third major contribution was a framework that integrates the previous two contributions and is designed to streamline the use of social data in academic and commercial applications. The generic, component-based design was evaluated in real-world scenarios and determined to provide a full social collection and analytics workflow in an extensible and scalable manner. This research has theoretical and practical implications for the use of social data in analytical research and commercial use. It extends the data extraction field to include user-generated content, while providing new avenues for performing semi-intelligent social data sourcing, and significantly improves the accessibility of social data.

10

Alsaleh, Slah. "Recommending people in social networks using data mining". Thesis, Queensland University of Technology, 2013. https://eprints.qut.edu.au/61736/1/Slah_Alsaleh_Thesis.pdf.

Abstract:

This thesis improves the process of recommending people to people in social networks using new clustering algorithms and ranking methods. The proposed system and methods are evaluated on data collected from a real-life social network. The empirical analysis confirms that the proposed system and methods improve the accuracy and efficiency of matching and recommending people, and overcome some of the problems from which social matching systems usually suffer.

11

Isah, Haruna. "Social Data Mining for Crime Intelligence: Contributions to Social Data Quality Assessment and Prediction Methods". Thesis, University of Bradford, 2017. http://hdl.handle.net/10454/16066.

Abstract:

With the advancement of the Internet and related technologies, many traditional crimes have made the leap to digital environments. The successes of data mining in a wide variety of disciplines have given birth to crime analysis. Traditional crime analysis is mainly focused on understanding crime patterns; however, it is unsuitable for identifying and monitoring emerging crimes. The true nature of crime remains buried in unstructured content that represents the hidden story behind the data. User feedback leaves valuable traces that can be used to measure the quality of various aspects of products or services and can also be used to detect, infer, or predict crimes. As in any application of data mining, the data must be of a high quality standard in order to avoid erroneous conclusions. This thesis presents a methodology and practical experiments towards discovering whether (i) user feedback can be harnessed and processed for crime intelligence, (ii) criminal associations, structures, and roles can be inferred among entities involved in a crime, and (iii) methods and standards can be developed for measuring, predicting, and comparing the quality level of social data instances and samples. It contributes to the theory, design and development of a novel framework for crime intelligence and an algorithm for the estimation of social data quality by innovatively adapting the methods of monitoring water contaminants. Several experiments were conducted, and the results revealed the significance of this study in mining social data for crime intelligence and in developing social data quality filters and decision support systems.

12

Liu, Lian. "PRIVACY PRESERVING DATA MINING FOR NUMERICAL MATRICES, SOCIAL NETWORKS, AND BIG DATA". UKnowledge, 2015. http://uknowledge.uky.edu/cs_etds/31.

Abstract:

Motivated by increasing public awareness of possible abuse of confidential information, which is considered as a significant hindrance to the development of e-society, medical and financial markets, a privacy preserving data mining framework is presented so that data owners can carefully process data in order to preserve confidential information and guarantee information functionality within an acceptable boundary. First, among many privacy-preserving methodologies, as a group of popular techniques for achieving a balance between data utility and information privacy, a class of data perturbation methods add a noise signal, following a statistical distribution, to an original numerical matrix. With the help of analysis in eigenspace of perturbed data, the potential privacy vulnerability of a popular data perturbation is analyzed in the presence of very little information leakage in privacy-preserving databases. The vulnerability to very little data leakage is theoretically proved and experimentally illustrated. Second, in addition to numerical matrices, social networks have played a critical role in modern e-society. Security and privacy in social networks receive a lot of attention because of recent security scandals among some popular social network service providers. So, the need to protect confidential information from being disclosed motivates us to develop multiple privacy-preserving techniques for social networks. Affinities (or weights) attached to edges are private and can lead to personal security leakage. To protect privacy of social networks, several algorithms are proposed, including Gaussian perturbation, greedy algorithm, and probability random walking algorithm. They can quickly modify original data in a large-scale situation, to satisfy different privacy requirements. Third, the era of big data is approaching on the horizon in the industrial arena and academia, as the quantity of collected data is increasing in an exponential fashion. 
Three issues in privacy preservation are studied in the age of big data: obtaining high confidence in the accuracy of any specific differentially private query, speedily and accurately updating a private summary of a binary stream with I/O-awareness, and launching a mutual private information retrieval for big data. All three issues are handled by two core backbones, differential privacy and the Chernoff bound.
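As a rough illustration of the Gaussian perturbation idea mentioned in the abstract above, the sketch below adds zero-mean Gaussian noise to the edge weights of a small graph. It is not the thesis's algorithm, whose details the abstract does not give; the function name `perturb_edge_weights`, the dictionary representation of edges, and the `sigma` and `seed` parameters are all assumptions made for this example.

```python
import random


def perturb_edge_weights(edges, sigma, seed=None):
    """Return a copy of `edges` with zero-mean Gaussian noise added to each weight.

    edges: dict mapping (u, v) node pairs to a numeric weight (hypothetical layout).
    sigma: standard deviation of the noise; larger values hide the true
           weights better but distort the graph more.
    seed:  optional seed so that a perturbation can be reproduced.
    """
    rng = random.Random(seed)
    return {edge: weight + rng.gauss(0.0, sigma) for edge, weight in edges.items()}


if __name__ == "__main__":
    graph = {("alice", "bob"): 4.0, ("bob", "carol"): 1.5}
    print(perturb_edge_weights(graph, sigma=0.5, seed=7))
```

The graph structure (which edges exist) is left untouched here; only the weights change, so any trade-off between privacy and utility is controlled by the single `sigma` parameter.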

13

Wang, Yan. "Student Modeling From Different Aspects". Digital WPI, 2016. https://digitalcommons.wpi.edu/etd-theses/205.

Abstract:

With the wide usage of online tutoring systems, researchers have become interested in mining data from the log files of these systems in order to understand students better. A variety of aspects of students' learning have become the focus of studies, such as modeling students' mastery status and affect. On the other hand, the Randomized Controlled Trial (RCT), which is an unbiased method for gaining insights into education, is finding its way into Intelligent Tutoring Systems. Firstly, people are curious about what kind of settings would work better. Secondly, such a tutoring system, with many students and teachers using it, provides an opportunity for building an RCT infrastructure underlying the system. Given the increasing interest in data mining and RCTs, the thesis focuses on these two aspects. In the first part, we focus on analyzing and mining data from ASSISTments, an online tutoring system run by a team at Worcester Polytechnic Institute. Through the data, we try to answer several questions about different aspects of students' learning. The first question is what matters more to student modeling, skill information or student information. The second question is whether it is necessary to model students' learning at different opportunity counts. The third question is about the benefits of using partial credit, rather than binary credit, as a measurement of students' learning in RCTs. The fourth question focuses on the amount of time that students spent Wheel Spinning in the tutoring system. The fifth question studies the tradeoff between the mastery threshold and the time spent in the tutoring system. By answering the five questions, we both propose machine learning methodology that can be applied in educational data mining and present findings from analyzing and mining the data. In the second part, we focus on RCTs within ASSISTments. Firstly, we looked at a pilot study of reassessment and relearning, which suggested a better system setting to improve students' robust learning. Secondly, we proposed the idea of building an infrastructure of learning within ASSISTments, which provides opportunities to improve the whole educational environment.

14

Bergami, Giacomo. "Hypergraph Mining for Social Networks". Master's thesis, Alma Mater Studiorum - Università di Bologna, 2014. http://amslaurea.unibo.it/7106/.

Abstract:

Nowadays, more and more data is collected in large amounts, so that the need to study it both efficiently and profitably is arising; we want to obtain new and significant information that was not known before the analysis. Many graph mining algorithms have been developed so far, but an algebra that could systematically define how to generalize such operations is missing. In order to propel the development of such an algebra for automatic analysis, we propose for the first time (to the best of my knowledge) some primitive operators that may be the prelude to the systematic definition of a hypergraph algebra.

15

Nhlabano, Valentine Velaphi. "Fast Data Analysis Methods For Social Media Data". Diss., University of Pretoria, 2018. http://hdl.handle.net/2263/72546.

Abstract:

The advent of Web 2.0 technologies, which support the creation and publishing of various social media content in a collaborative and participatory way by all users in the form of user-generated content and social networks, has led to the creation of vast amounts of structured, semi-structured and unstructured data. The sudden rise of social media has led to their wide adoption by organisations of various sizes worldwide in order to take advantage of this new way of communicating and engaging with their stakeholders in ways that were unimaginable before. Data generated from social media is highly unstructured, which makes it challenging for most organisations, which are normally used to handling and analysing structured data from business transactions. The research reported in this dissertation was carried out to investigate fast and efficient methods available for retrieving, storing and analysing unstructured data from social media in order to make crucial and informed business decisions on time. Sentiment analysis was conducted on Twitter data called tweets. Twitter, which is one of the most widely adopted social network services, provides an API (Application Programming Interface) for researchers and software developers to connect to and collect public data sets of Twitter data from the Twitter database. A Twitter application was created and used to collect streams of real-time public data via a Twitter source provided by Apache Flume and to store this data efficiently in the Hadoop Distributed File System (HDFS). Apache Flume is a distributed, reliable, and available system which is used to efficiently collect, aggregate and move large amounts of log data from many different sources to a centralized data store such as HDFS. Apache Hadoop is an open source software library that runs on low-cost commodity hardware and has the ability to store, manage and analyse large amounts of both structured and unstructured data quickly, reliably, and flexibly at low cost. A lexicon-based sentiment analysis approach was taken, and the AFINN-111 lexicon was used for scoring. The Twitter data was analysed from HDFS using a Java MapReduce implementation. MapReduce is a programming model and an associated implementation for processing and generating big data sets with a parallel, distributed algorithm on a cluster. The results demonstrate that it is fast, efficient and economical to use this approach to analyse unstructured data from social media in real time.
Dissertation (MSc)--University of Pretoria, 2019.
National Research Foundation (NRF) - Scarce skills
Computer Science
MSc
Unrestricted
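The lexicon-based scoring described in the abstract above can be sketched in a few lines. This is only an in-process illustration of the map and reduce steps, written in Python rather than the Java MapReduce implementation the dissertation reports; the function names are hypothetical, and the small `LEXICON` dictionary is a hand-picked stand-in for the much larger AFINN-111 word list.

```python
from collections import defaultdict

# Hypothetical mini-lexicon in the AFINN style (word -> integer valence).
# The real AFINN-111 list is far larger; these entries are illustrative only.
LEXICON = {"good": 3, "great": 3, "happy": 3, "bad": -3, "terrible": -3, "sad": -2}


def map_tweet(tweet_id, text):
    """Map step: emit (tweet_id, valence) for every lexicon word in the text."""
    for token in text.lower().split():
        word = token.strip(".,!?;:\"'()#@")
        if word in LEXICON:
            yield tweet_id, LEXICON[word]


def reduce_scores(pairs):
    """Reduce step: sum the valences emitted for each tweet."""
    totals = defaultdict(int)
    for tweet_id, score in pairs:
        totals[tweet_id] += score
    return dict(totals)


def score_tweets(tweets):
    """Run map then reduce in-process over an iterable of (tweet_id, text)."""
    pairs = (pair for tweet_id, text in tweets for pair in map_tweet(tweet_id, text))
    return reduce_scores(pairs)


if __name__ == "__main__":
    sample = [("t1", "Great day, so happy!"), ("t2", "Terrible service. Bad!")]
    print(score_tweets(sample))
```

Tweets that contain no lexicon words emit nothing in the map step and therefore do not appear in the result. On a real cluster the map and reduce functions would run on separate workers, with the framework grouping the emitted pairs by key between the two steps.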

16

Li, Jingxuan. "Mining the Online Social Network Data: Influence, Summarization, and Organization". FIU Digital Commons, 2014. http://digitalcommons.fiu.edu/etd/1241.

Abstract:

Online Social Network (OSN) services provided by Internet companies bring people together to chat and to share and enjoy information. Meanwhile, huge amounts of data are generated by those services (which can be regarded as social media) every day, every hour, even every minute and every second. Currently, researchers are interested in analyzing OSN data, extracting interesting patterns from it, and applying those patterns to real-world applications. However, due to the large scale of OSN data, it is difficult to analyze effectively. This dissertation focuses on applying data mining and information retrieval techniques to mine two key components of social media data: users and user-generated contents. Specifically, it aims at addressing three problems related to social media users and contents: (1) how does one organize the users and the contents? (2) how does one summarize the textual contents so that users do not have to go over every post to capture the general idea? (3) how does one identify the influential users in social media to benefit other applications, e.g., marketing campaigns? The contributions of this dissertation are briefly summarized as follows. (1) It provides a comprehensive and versatile data mining framework to analyze the users and user-generated contents of social media. (2) It designs a hierarchical co-clustering algorithm to organize the users and contents. (3) It proposes multi-document summarization methods to extract core information from social network contents. (4) It introduces three important dimensions of social influence, and a dynamic influence model for identifying influential users.

17

Jiang, Fan. "Efficient frequent pattern mining from big data and its applications". Springer, 2014. http://hdl.handle.net/1993/32083.

Abstract:

Frequent pattern mining is an important research area in data mining. Since its introduction, it has drawn the attention of many researchers. Consequently, many algorithms have been proposed. Popular algorithms include level-wise Apriori-based algorithms, tree-based algorithms, and hyperlinked array structure-based algorithms. While these algorithms are popular and beneficial due to some nice properties, they also suffer from some drawbacks such as multiple database scans, recursive tree constructions, or multiple hyperlink adjustments. In the current era of big data, high volumes of a wide variety of valuable data of different veracities can be easily collected or generated at high velocity in various real-life applications. Among these 5 V's of big data, I focus on handling high volumes of big data in my Ph.D. thesis. Specifically, I design and implement a new efficient frequent pattern mining algorithmic technique called B-mine, which overcomes some of the aforementioned drawbacks and achieves better performance when compared with existing algorithms. I also extend my B-mine algorithm into a family of algorithms that can perform big data mining efficiently. Moreover, I design four different frameworks that apply this family of algorithms to the real-life application of social network mining. Evaluation results show the efficiency and practicality of all these algorithms.
February 2017
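To make the "multiple database scans" drawback named in the abstract above concrete, here is a minimal sketch of the classic level-wise Apriori approach. It is a textbook-style illustration, not the B-mine technique the thesis proposes (which the abstract does not describe); the function name `apriori` and the absolute `min_support` count are choices made for this example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support_count} for all frequent itemsets.

    transactions: iterable of item collections.
    min_support:  minimum number of transactions that must contain an itemset.
    """
    transactions = [frozenset(t) for t in transactions]
    # Level 1 candidates: every single item seen in the data.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # One full pass over the database per level: this repeated scan is
        # the cost that level-wise algorithms are known for.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Join frequent k-itemsets into (k+1)-candidates, then prune any
        # candidate with an infrequent k-subset (the Apriori property).
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


if __name__ == "__main__":
    baskets = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    for itemset, count in sorted(apriori(baskets, 3).items(), key=lambda kv: sorted(kv[0])):
        print(sorted(itemset), count)
```

Each iteration of the `while` loop rescans every transaction, so a database whose longest frequent itemset has k items is scanned at least k times.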

18

Chen, Nai Chun. "Urban data mining : social media data analysis as a complementary tool for urban design". Thesis, Massachusetts Institute of Technology, 2016. http://hdl.handle.net/1721.1/106414.

Abstract:

Thesis: S.M., Massachusetts Institute of Technology, Department of Architecture, 2016.
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 70-71).
The emergence of "big data" has resulted in a large amount of information documenting the daily events, perceptions, thoughts, and emotions of citizens, all annotated with the location and time at which they were recorded. This data presents an unprecedented opportunity to help identify and solve urban problems. This thesis aimed to explore the potential of machine learning and data mining in finding patterns in "big" urban data. We explored several different types of user-generated urban data, including Call Detail Records (CDR) data and social media (Crunchbase, Yelp, Twitter, Flickr, and TripAdvisor) data, on two primary urban issues. First, we aimed to explore an important 21st-century urban problem: how to create a successful "innovative district". Using data mining, we discovered several important characteristics of "innovative districts". Second, we aimed to see whether big data can help diagnose and alleviate existing problems in cities. For this, we focused on the city of Andorra and discovered potential reasons for recent declines in tourism in the city. We also discovered that we can learn the travel patterns of tourists to Andorra from their past behavior. In this way, we can predict their future travel plans and assist their travels, showing the power of mining urban data to help solve future urban problems as well as to diagnose and improve existing ones.
by Nai Chun Chen.
S.M.

19

Akay, Altug. "A Novel Method to Intelligently Mine Social Media to Assess Consumer Sentiment of Pharmaceutical Drugs". Doctoral thesis, KTH, Systemsäkerhet och organisation, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-203119.

Abstract:

This thesis focuses on the development of novel data mining techniques that convert user interactions in social media networks into readable data that would benefit users, companies, and governments. The readable data can either warn of dangerous side effects of pharmaceutical drugs or improve intervention strategies. A weighted model enabled us to represent user activity in the network, which allowed us to reflect user sentiment of a pharmaceutical drug and/or service. The result is an accurate representation of user sentiment. This approach, when modified for specific diseases, drugs, and services, can enable rapid user feedback that can be converted into rapid responses from consumers to industry and government to withdraw possibly dangerous drugs and services from the market or improve said drugs and services. Our approach monitors social media networks in real time, enabling government and industry to rapidly respond to consumer sentiment of pharmaceutical drugs and services.

20

Goyal, Amit. "Social influence and its applications : an algorithmic and data mining study". Thesis, University of British Columbia, 2013. http://hdl.handle.net/2429/43935.

Abstract:

Social influence occurs when one's actions are affected by others. If leveraged carefully, social influence can be exploited in many applications like viral marketing (or targeted advertising in general), recommender systems, social network analysis, event detection, expert finding, link prediction, ranking of feeds, etc. One of the fundamental problems in this fascinating field is the problem of influence maximization, primarily motivated by the application of viral marketing. The objective is to identify a small set of users in a social network who, when convinced to adopt a product, will influence others in the network, leading to a large number of adoptions. The vision of our work is to take the algorithmic and data mining aspects of viral marketing out of the lab. We organize our goals and contributions into four categories: (i) With the ever-increasing scale of online social networks, it is extremely important to develop efficient algorithms for influence maximization. We propose two algorithms -- CELF++ and SIMPATH -- that significantly improve the scalability. (ii) We remark that previous studies often make unrealistic assumptions and rely on simulations, instead of validating models against real-world data. For instance, they assume an arbitrary assignment of influence probabilities in their studies, which focused more on algorithms than on validity with respect to real data. We attack the problem of learning influence probabilities. In another work, we propose a novel data-driven approach to influence models and show that it predicts influence diffusion with much better accuracy. (iii) Next, we propose alternative problem formulations -- MINTSS and MINTIME -- and show interesting theoretical results. These problem formulations capture the problem of deploying viral campaigns under budget and time constraints. In an additional work, we take a fresh perspective on identifying community leaders using a pattern mining approach.
(iv) Finally, we examine applications of social influence. We begin with the application of viral marketing. We show that product adoption is not exactly influence. Given this, we develop a product adoption model and study the problem of maximizing product adoption. Furthermore, we propose and investigate a novel problem in recommender systems, for targeted advertising -- RECMAX.

21

Pochet, Gilberto Flores. "Analysis of online virtual environments using Data Mining and social networks". Repositório Institucional da UFABC, 2015.

22

Corley, Courtney D. Mikler Armin. "Social network simulation and mining social media to advance epidemiology". [Denton, Tex.] : University of North Texas, 2009. http://digital.library.unt.edu/permalink/meta-dc-11053.

23

GRIMAUDO, LUIGI. "Data Mining Algorithms for Internet Data: from Transport to Application Layer". Doctoral thesis, Politecnico di Torino, 2014. http://hdl.handle.net/11583/2537089.

Abstract:

Nowadays we live in a data-driven world. Advances in data generation, collection and storage technology have enabled organizations to gather data sets of massive size. Data mining is a discipline that blends traditional data analysis methods with sophisticated algorithms to handle the challenges posed by these new types of data sets. The Internet is a complex and dynamic system with new protocols and applications that arise at a constant pace. All these characteristics designate the Internet a valuable and challenging data source and application domain for a research activity, both looking at Transport layer, analyzing network traffic flows, and going up to Application layer, focusing on the ever-growing next generation web services: blogs, micro-blogs, on-line social networks, photo sharing services and many other applications (e.g., Twitter, Facebook, Flickr, etc.). In this thesis work we focus on the study, design and development of novel algorithms and frameworks to support large scale data mining activities over huge and heterogeneous data volumes, with a particular focus on Internet data as data source and targeting network traffic classification, on-line social network analysis, recommendation systems and cloud services and Big data.

24

Mengwe, Moses Seargent. "Towards social impact assessment of copper-nickel mining in Botswana". Thesis, Nelson Mandela Metropolitan University, 2010. http://hdl.handle.net/10948/1443.

Abstract:

This research study is more of an initiative towards Social Impact Assessment of copper-nickel mining in Botswana. The specific objectives of the study were centred on the assessment of the social impacts of copper-nickel mining in Botswana from the initial mining stage of exploration, surveying and mine site development to mine closure. The study was carried out under the broad hypotheses that mining influences population movement that impact on areas of mining; mining activities have both economic benefits and deleterious social impacts on the local communities found in the areas where mining is taking place; and mine closure has far reaching socio-economic, investment and developmental implications over and above the obvious interests of project owners. To achieve the broad aim as summarised above, the research study used a multi-disciplinary methodology and approach that required several kinds of expertise and sources of information. Hence it used both primary and secondary sources centred on interactive informative interviews, site visits and observations, questionnaires, census data records, mining companies’ publications, published textbooks and journal articles. The research study comprised of three different mines operated by three different mining companies in three varied socio-cultural and ethnic regions of Botswana. First was a detailed Social Impact Assessment of the initial phase of exploration, surveying and mine site development represented by Mowana mine project operated by African Copper in the rural areas of Dugwi and Mosetse. This case study yielded results showing that the social impacts of mining in the area are diverse and extensive. The findings suggest that the impacts relate not only to the possible economic benefits of foreign exchange, employment, the optimal use of available mineral resources and the possible development of Dugwi and Mosetse villages, but extends to the deleterious social impacts. 
The results also indicated that the social impacts have just begun in the two communities. Hence they point towards a possible disruption within the socio-cultural system of the local people if serious mitigation measures are not put in place; thus suggesting that the early stages of exploration and mine site development result in the most conflict between the mine and the local people. Second was a comprehensive Social Impact Assessment of Tati-Nickel Phoenix mining project in the peri-urban areas of Matshelagabedi and Matsiloje areas representing the mining stage of mine production and expansion. The results from this case study suggest that during mine production and expansion, many people were relocated. However, the overriding impression gained from the case study was Tati-Nickel Mining Company’s elaborate corporate policies that suggested good corporate governance and best practices that promote sustainable development. A notable milestone on good corporate governance and best practice that the other two case studies (mining companies) could benchmark on is Tati-Nickel’s corporate social responsibility programme that has been designed to ensure that the communities within a fifty kilometre mine radius benefit from the mine. The results from the case study also distinguished the mining stage of production and expansion from the other two because it is associated with the deep entrenchment of the social impacts into the communities near to mining areas. Third was a detailed Social Impact Assessment on Bamangwato Concession Limited mine in the industrial town of Selebi-Phikwe. The case study represented the stage of mine closure. Through the findings of this case study, it became apparent that the economic dependence of Selebi-Phikwe on mining has seen the town developing into a mining town, increasing its vulnerability at mine closure.
The results from the case study further suggest that mine closure will degrade the socio-economic sector of the town with ever far reaching socio-economic implications as many people lose their gainful employment, hence suggesting that a possible complete mine closure will be the most traumatic phase leading to major social conflict within the area. Thus the results suggest that at mine closure, the deleterious social impacts will overspill to other areas in Botswana with disastrous effects for the economy of the country. The results yielded through this study established in clear and passionate language that copper-nickel mining in Botswana influences population movements that lead to positive and negative impacts on the communities found in mining areas. Another major finding of the study is that copper-nickel mining activities have both economic benefits and deleterious social impacts on the local communities, hence the recommendation that the copper-nickel mining companies should embrace the concept of sustainable mining for sustainable development to avoid most of the negative impacts of their operations on the local communities.

25

ATTANASIO, ANTONIO. "Mining Heterogeneous Urban Data at Multiple Granularity Layers". Doctoral thesis, Politecnico di Torino, 2018. http://hdl.handle.net/11583/2709888.

Abstract:

The recent development of urban areas and of the new advanced services supported by digital technologies has generated big challenges for people and city administrators, like air pollution, high energy consumption, traffic congestion, management of public events. Moreover, understanding the perception of citizens about the provided services and other relevant topics can help devising targeted actions in the management. With the large diffusion of sensing technologies and user devices, the capability to generate data of public interest within the urban area has rapidly grown. For instance, different sensors networks deployed in the urban area allow collecting a variety of data useful to characterize several aspects of the urban environment. The huge amount of data produced by different types of devices and applications brings a rich knowledge about the urban context. Mining big urban data can provide decision makers with knowledge useful to tackle the aforementioned challenges for a smart and sustainable administration of urban spaces. However, the high volume and heterogeneity of data increase the complexity of the analysis. Moreover, different sources provide data with different spatial and temporal references. The extraction of significant information from such diverse kinds of data depends also on how they are integrated, hence alternative data representations and efficient processing technologies are required. The PhD research activity presented in this thesis was aimed at tackling these issues. Indeed, the thesis deals with the analysis of big heterogeneous data in smart city scenarios, by means of new data mining techniques and algorithms, to study the nature of urban related processes. The problem is addressed focusing on both infrastructural and algorithmic layers. In the first layer, the thesis proposes the enhancement of the current leading techniques for the storage and elaboration of Big Data. 
The integration with novel computing platforms is also considered to support parallelization of tasks, tackling the issue of automatic scaling of resources. At algorithmic layer, the research activity aimed at innovating current data mining algorithms, by adapting them to novel Big Data architectures and to Cloud computing environments. Such algorithms have been applied to various classes of urban data, in order to discover hidden but important information to support the optimization of the related processes. This research activity focused on the development of a distributed framework to automatically aggregate heterogeneous data at multiple temporal and spatial granularities and to apply different data mining techniques. Parallel computations are performed according to the MapReduce paradigm and exploiting in-memory computing to reach near-linear computational scalability. By exploring manifold data resolutions in a relatively short time, several additional patterns of data can be discovered, allowing to further enrich the description of urban processes. Such framework is suitably applied to different use cases, where many types of data are used to provide insightful descriptive and predictive analyses. In particular, the PhD activity addressed two main issues in the context of urban data mining: the evaluation of buildings energy efficiency from different energy-related data and the characterization of people's perception and interest about different topics from user-generated content on social networks. For each use case within the considered applications, a specific architectural solution was designed to obtain meaningful and actionable results and to optimize the computational performance and scalability of algorithms, which were extensively validated through experimental tests.

26

Degnen, Cathrine. "Mining experience : the ageing self, narrative, and social memory in Dodworth, England". Thesis, McGill University, 2003. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=19487.

Abstract:

In response to the anthropological literature on old age and ageing that remains largely isolated from more contemporary anthropological theory, this thesis re-focuses anthropological attention on the experiences of ageing. Towards this end, I examine the way macro- (history, politics, economics) and micro-level processes (social relations, intergenerational relations, local contexts, individual histories) intersect to frame the cultural construction of old age, personal experiences of "being old", and the self. A central point of intersection between these processes comes from the recent history of social transformation in my fieldsite, Dodworth, a former coal-mining village. Since the late 1980s, this is an area that has been grappling with the rupturing effects of the closure of the coal-mining industry. Attending to these conditions and how they inform the everyday reality and the experiences of ageing and of the self are critical concerns in this thesis. My approach to the ageing self is one that privileges narrativity and temporality as key constitutive elements and which considers the potentially different position of older people in relation to time and to the self. Growing older is a complicated mixture of bodily and social change, and negotiating these shifts has crucial implications for one's sense of self and subjectivity. While "old age" is a category which is readily used in daily discourse and living, what old age is and who is old nevertheless resists anchoring. What old age, being old and ageing meant to my research participants are key questions in order to understand the experience of growing older in Dodworth. Throughout the thesis, I focus on the dialectics of interpersonal interactions in order to speak meaningfully about how the experience of old age is organised and constructed. Emerging in tandem with these issues is another major topic of this thesis: social memory. 
Talk in Dodworth about places, absences, and relations continually brought the past and present together and was involved in how a sense of self is created. What emerged was a three-dimensionality of memory, an individual and collective way of placing oneself and others in relation to spatial aspects of the villagescape.

27

Fan, Xiaoguang, and 樊晓光. "Study of social-network-based information propagation". Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2013. http://hub.hku.hk/bib/B50899600.

Abstract:

Information propagation has attracted increasing attention from sociologists, marketing researchers and Information Technology entrepreneurs. With the rapid developments in online and mobile social applications like Facebook, Twitter, and LinkedIn, large-scale, high-speed and instantaneous information dissemination becomes possible, spawning tremendous opportunities for electronic commerce. It is non-trivial to make an accurate analysis on how information is propagated due to the uncertainty of human behavior and the complexity of the social environment. This dissertation is concerned with exploring models, formulations, and heuristics for the social-network-based information propagation process. It consists of three major parts: information diffusion through online social network, modeling social influence propagation, and social-network-based information spreading in opportunistic mobile networks. Firstly, I consider the problem of maximizing the influence propagation through online social networks. To solve it, I introduce a probabilistic maximum coverage problem, and propose a cluster-based heuristic and a neighbor-removal heuristic for two basic diffusion models, namely, the Linear Threshold Model and the Independent Cascade Model, respectively. Realizing that the selection of influential nodes is mainly based on the accuracy and efficiency in estimating the social influence, I build a framework of up-to-2-hop hierarchical network to approximate the spreading of social influence, and further propose a hierarchy-based algorithm to solve the influence maximization problem. Our heuristic is proved to be efficient and robust with competitive performance, low computation cost, and high scalability. The second part explores the modeling on social influence propagation. I develop an analytical model for the influence propagation process based on discrete-time Markov chains, and deduce a close-form equation to express the n-step transition probability matrix. 
We show that, given any initial state, the probability distribution of the converged network state could be easily obtained by calculating a matrix product. Finally, I study the social-network-based information spreading in opportunistic mobile networks by analyzing the opportunistic routing process. I propose three social-network-based communication pattern models and utilize them to evaluate the performance of different social-network-based routing protocols based on several human mobility traces. Moreover, I discuss the fairness evaluation in opportunistic routing, and propose a fair packet forwarding strategy which operates as a plugin for traditional social-network-based routing protocols. My strategy improves the imbalance of success rates among users while maintaining approximately the same system throughput.
Electrical and Electronic Engineering
Doctoral
Doctor of Philosophy
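
As an illustration of the discrete-time Markov chain idea mentioned in the abstract above, the following is a minimal Python sketch, not the author's model. It assumes a hypothetical two-state chain (the states, the matrix `P`, and all function names are invented for this example) and shows how an n-step transition matrix, and from it the state distribution after n steps, follows from repeated matrix products.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def n_step_matrix(p, n):
    """Return the n-step transition matrix P**n using repeated squaring."""
    size = len(p)
    # Start from the identity matrix (zero steps leave every state unchanged).
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    base = [row[:] for row in p]
    while n > 0:
        if n & 1:
            result = mat_mul(result, base)
        base = mat_mul(base, base)
        n >>= 1
    return result


def distribution_after(initial, p, n):
    """Propagate an initial state distribution through n steps of the chain."""
    pn = n_step_matrix(p, n)
    return [sum(initial[i] * pn[i][j] for i in range(len(initial)))
            for j in range(len(pn))]


# Hypothetical chain (assumption): state 0 = "not yet influenced",
# state 1 = "influenced", which is absorbing.
P = [[0.7, 0.3],
     [0.0, 1.0]]

two_steps = distribution_after([1.0, 0.0], P, 2)
```

Here `two_steps` holds the probabilities of being in each state after two steps when starting in state 0; for larger `n` the mass drifts toward the absorbing state, which is the kind of converged distribution the abstract refers to.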

28

Jonathan, Joan. "Prediction of Factors Influencing Rats Tuberculosis Detection Performance Using Data Mining Techniques". Thesis, Uppsala universitet, Institutionen för informatik och media, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-385471.

Abstract:

This thesis aimed to predict the factors that influence rats TB detection performance using data mining techniques. A rats TB detection performance dataset was given from APOPO TB training and research center in Morogoro, Tanzania. After data preprocessing, the size of the dataset was 471,133 rats TB detection performance observations and a sample size of 4 female rats. However, in the analysis, only 200,000 data observations were used. Based on the CRISP-DM methodology, this thesis used R language as a data mining tool to analyze the given data. To build the predictive model the classification technique was used to predict the influencing factors and classify rats using a decision tree, random forest, and naive Bayes algorithms. The built predictive models were validated with the same test data to check their classification prediction accuracy and to find the best. The results pinpoint that the random forest is the best predictive model with an accuracy of 78.82%. However, the accuracy differences are negligible. When considering the predictive model accuracy (78.78%) and speed (3 seconds) of the decision tree, it is the best predictive model since it has less building time compared to the random forest (154 seconds). Moreover, the results manifest that age is the most significant influencing factor, and rats of ages between 3.1 to 6 years portrayed potentiality in detection performance. The other predicted factors are Session_Completion_Time, Session_Start_Time, and Av_Weight_Per_Year. These results are useful as a reference to rats TB trainers and researchers in rats TB and Information Systems. Further research using other data mining techniques and tools is valuable.

29

Epstein, Greg. "Harnessing User Data to Improve Facebook Features". Thesis, Boston College, 2010. http://hdl.handle.net/2345/1211.

Abstract:

Thesis advisor: Sergio Alvarez
The recent explosion of online social networking through sites like Twitter, MySpace, and Facebook has millions of users spending hours a day sorting through information on their friends, coworkers and other contacts. These networks also house massive amounts of user activity information that is often used for advertising purposes but can be utilized for other activities as well. Facebook, now the most popular in terms of registered users, active users and page rank, has a sparse offering of built-in filtering and predictive tools such as "suggesting a friend" or the "Top News" feed filter. However, these basic tools seem to underutilize the information that Facebook stores on all of its users. This paper explores how to better use available Facebook data to create more useful tools to assist users in sorting through their activities on Facebook.
Thesis (BS) — Boston College, 2010
Submitted to: Boston College. College of Arts and Sciences
Discipline: Computer Science Honors Program
Discipline: College Honors Program
Discipline: Computer Science

30

Hassanzadeh, Reza. "Anomaly detection in online social networks : using data-mining techniques and fuzzy logic". Thesis, Queensland University of Technology, 2014. https://eprints.qut.edu.au/78679/1/Reza_Hassanzadeh_Thesis.pdf.

Abstract:

This research is a step forward in improving the accuracy of detecting anomaly in a data graph representing connectivity between people in an online social network. The proposed hybrid methods are based on fuzzy machine learning techniques utilising different types of structural input features. The methods are presented within a multi-layered framework which provides the full requirements needed for finding anomalies in data graphs generated from online social networks, including data modelling and analysis, labelling, and evaluation.

31

Malliaros, Fragkiskos. "Mining Social and Information Networks: Dynamics and Applications". Palaiseau, Ecole polytechnique, 2015. https://theses.hal.science/tel-01245134/document.

Abstract:

Networks (or graphs) have become ubiquitous as data from diverse disciplines can naturally be mapped to graph structures. The problem of extracting meaningful information from large scale graph data in an efficient and effective way has become crucial and challenging with several important applications and towards this end, graph mining and analysis methods constitute prominent tools. This dissertation contributes models, tools and observations to problems that arise in the area of mining social and information networks. We build upon computationally efficient graph mining methods in order to: (i) design models for analyzing the structure and dynamics of real-world networks towards unraveling properties that can further be used in practical applications; (ii) develop algorithmic tools for large-scale analytics on data with inherent (e.g., social networks) or without inherent (e.g., text) graph structure. In particular, for the former point we show how to model the engagement dynamics of large social networks and how to assess their vulnerability with respect to user departures from the network. In both cases, by unraveling the dynamics of real social networks, regularities and patterns about their structure and formation can be identified; such knowledge can further be used in various applications including churn prediction, anomaly detection and building robust social networking systems. For the latter, we examine how to identify influential users in complex networks, having direct applications to epidemic control and viral marketing and how to utilize graph mining techniques in order to enhance text analytics tasks and in particular the one of text categorization.

32

Casas, Roma Jordi. "Privacy-preserving and data utility in graph mining". Doctoral thesis, Universitat Autònoma de Barcelona, 2014. http://hdl.handle.net/10803/285566.

Abstract:

In recent years, a rapidly growing volume of graph-formatted data has been made publicly available. Embedded within this data is private information about the users who appear in it. Therefore, data owners must respect the privacy of users before releasing datasets to third parties. In this scenario, anonymization processes become an important concern. However, anonymization processes usually introduce some kind of noise into the anonymized data, altering the data and also the results of graph mining processes run on it. Generally, the higher the privacy, the larger the noise. Thus, data utility is an important factor to consider in anonymization processes. The necessary trade-off between data privacy and data utility can be reached by using measures and metrics that guide the anonymization process to minimize information loss and, therefore, to maximize data utility. In this thesis we have covered the fields of user privacy preservation in social networks and the utility and quality of the released data. A trade-off between both fields is critical to achieving good anonymization methods for subsequent graph mining processes. Part of this thesis has focused on data utility and information loss. Firstly, we have studied the relation between generic information loss measures and clustering-specific ones, in order to evaluate whether the generic information loss measures are indicative of the usefulness of the data for subsequent data mining processes. We have found strong correlation between some generic information loss measures (average distance, betweenness centrality, closeness centrality, edge intersection, clustering coefficient and transitivity) and the precision index over the results of several clustering algorithms, demonstrating that these measures are able to predict the perturbation introduced in anonymous data. Secondly, two measures to reduce information loss in graph modification processes have been presented. The first one, edge neighbourhood centrality, is based on information flow through the 1-neighbourhood of a specific edge in the graph. The second one is based on the core number sequence; it better preserves the underlying graph structure, retaining more data utility. Through an extensive experimental setup, we have demonstrated that both methods are able to preserve the most important edges in the network, keeping the basic structural and spectral properties close to the original ones. The other important topic of this thesis has been privacy-preserving methods. We have presented our random-based algorithm, which uses the concept of edge neighbourhood centrality to drive the edge modification process so that it better preserves the most important edges in the graph, achieving lower information loss and higher data utility in the released data. Our method obtains a better trade-off between data utility and data privacy than other methods. Finally, two different approaches to k-degree anonymity on graphs have been developed. First, an algorithm based on evolutionary computing has been presented and tested on different small and medium real networks. Although this method allows us to fulfil the desired privacy level, it presents two main drawbacks: the information loss is quite large for some graph structural properties, and it is not fast enough to work with large networks. Therefore, a second algorithm has been presented, which uses univariate micro-aggregation to anonymize the degree sequence and reduce the distance from the original one. This method is quasi-optimal and results in lower information loss and better data utility.
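The idea of anonymizing a degree sequence by univariate micro-aggregation can be sketched as follows. This is only an illustrative fixed-size grouping heuristic under our own assumptions, not the thesis's quasi-optimal algorithm; the function names `is_k_degree_anonymous` and `microaggregate_degrees` are hypothetical, and the sketch does not check that the resulting sequence is realizable as a graph.

```python
def is_k_degree_anonymous(degrees, k):
    """Return True if every degree value occurs at least k times."""
    counts = {}
    for d in degrees:
        counts[d] = counts.get(d, 0) + 1
    return all(c >= k for c in counts.values())


def microaggregate_degrees(degrees, k):
    """Illustrative heuristic (an assumption, not the thesis's method):
    sort the degrees, cut them into consecutive blocks of at least k values,
    and replace every value in a block by the block's rounded mean."""
    if k < 1 or len(degrees) < k:
        raise ValueError("need 1 <= k <= len(degrees)")
    # Indices of the nodes ordered by increasing degree.
    order = sorted(range(len(degrees)), key=lambda i: degrees[i])
    result = list(degrees)
    n = len(order)
    start = 0
    while start < n:
        end = start + k
        if n - end < k:  # fewer than k values left over: absorb them
            end = n
        block = order[start:end]
        mean = round(sum(degrees[i] for i in block) / len(block))
        for i in block:
            result[i] = mean
        start = end
    return result


original = [1, 2, 2, 3, 5, 6, 7]
anonymized = microaggregate_degrees(original, 3)
```

Because every block holds at least k nodes that share one value, the output sequence is k-degree anonymous by construction; the edge modifications needed to realize it in an actual graph are outside the scope of this sketch.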


33

Straub, Kayla Marie. "Data Mining Academic Emails to Model Employee Behaviors and Analyze Organizational Structure". Thesis, Virginia Tech, 2016. http://hdl.handle.net/10919/71320.


Abstract:

Email correspondence has become the predominant method of communication for businesses. If not for the inherent privacy concerns, this electronically searchable data could be used to better understand how employees interact. After the Enron dataset was made available, researchers were able to provide great insight into employee behaviors based on the available data, despite the many challenges with that dataset. The work in this thesis applies a suite of methods to an appropriately anonymized academic email dataset created from volunteers' email metadata. This new dataset, from an internal email server, is first used to validate feature extraction and machine learning algorithms in order to generate insight into the interactions within the center. Based solely on email metadata, a random forest approach models behavior patterns and predicts employee job titles with 96% accuracy. This result represents classifier performance not only on participants in the study but also on other members of the center who were connected to participants through email. Furthermore, the data revealed relationships not present in the center's formal operating structure. The culmination of this work is an organic organizational chart, which gives a fuller understanding of the center's internal structure than can be found in the official organizational chart.
Master of Science


34

Velichety, Srikar. "Essays on Data Driven Insights from Crowd Sourcing, Social Media and Social Networks". Diss., The University of Arizona, 2016. http://hdl.handle.net/10150/620677.


Abstract:

The beginning of this decade has seen a phenomenal rise in the amount of data generated in the world. While this increase provides us with opportunities to understand various aspects of human behavior and the mechanisms behind new phenomena, the technologies, statistical techniques and theories required to gain an in-depth and comprehensive understanding haven't progressed at an equal pace. As little as five years ago, we dealt with problems where there was insufficient prior social science or economic theory and the interest was only in predicting the outcome, or where there was an appropriate social science or economic theory and the interest was in explaining a given phenomenon. Today, we deal with problems where there is insufficient social science or economic theory but the interest is in explaining a given phenomenon. This creates a big challenge, the solution to which is of equal interest to both academics and practitioners. In my research, I contribute towards addressing these challenges by building exploratory frameworks that leverage a variety of techniques including social network analysis, text and data mining, econometrics, statistical computing and visualization. My three-essay dissertation focuses on understanding the antecedents to the quality of user-generated content and on the subscription and un-subscription behavior of users with respect to lists on social media. Using a data science approach on population-sized samples from Wikipedia and Twitter, I demonstrate the power of customized exploratory analyses in uncovering facts that social science or economic theory doesn't dictate and show how metrics from these analyses can be used to build prediction models with higher accuracy. I also demonstrate a method for combining exploration, prediction and explanatory modeling and propose to extend this methodology to provide causal inference. This dissertation has general implications for building better predictive and explanatory models and for mining text efficiently in social media.


35

Spomer, Judith E. "Latent semantic analysis and classification modeling in applications for social movement theory /". Abstract Full Text (HTML) Full Text (PDF), 2008. http://eprints.ccsu.edu/archive/00000552/02/1996FT.htm.


Abstract:

Thesis (M.S.) -- Central Connecticut State University, 2008.
Thesis advisor: Roger Bilisoly. "... in partial fulfillment of the requirements for the degree of Master of Science in Data Mining." Includes bibliographical references (leaves 122-127). Also available via the World Wide Web.


36

Rossi, Maria. "Graph Mining for Influence Maximization in Social Networks". Thesis, Université Paris-Saclay (ComUE), 2017. http://www.theses.fr/2017SACLX083/document.


Abstract:

Modern graph science has emerged in the last few years as a field of interest and has brought significant advances to our knowledge of networks. Until recently, existing data mining algorithms were designed for structured/relational data, while many datasets require a graph representation, such as social networks, networks generated from textual data, 3D protein structures and chemical compounds. It has therefore become crucially important to be able to extract meaningful information from this kind of data, and to this end graph mining and analysis methods have proven essential. The goal of this thesis is to study problems in the area of graph mining, focusing especially on designing new algorithms and tools related to information spreading, and specifically on how to locate influential entities in real-world networks. This task is crucial in many applications such as information diffusion, epidemic control and viral marketing. In the first part of the thesis, we have studied spreading processes in social networks, focusing on finding topological characteristics that rank entities in the network by their influential capabilities. We have specifically focused on the K-truss decomposition, which is an extension of the core decomposition of the graph. Extensive experimental analysis showed that the nodes belonging to the maximal K-truss subgraph show better spreading behavior than baseline criteria. Such spreaders not only influence a greater part of the network during the first steps of a spreading process, but also lead to a greater total fraction of influenced nodes at the end of the epidemic. We have also observed that the members of such dense subgraphs are the nodes that achieve optimal spreading in the network. In the second part of the thesis, we focused on identifying a group of nodes that, by acting together, maximize the expected number of influenced nodes at the end of the spreading process, a problem formally called Influence Maximization (IM). The IM problem is NP-hard, though there exist efficient algorithms with approximation guarantees that obtain a solution within 63% of optimal for classes of models. Because those guarantees rely on a greedy approximation that is computationally expensive, especially for large graphs, we proposed the MATI algorithm, which succeeds in locating the group of users that maximizes influence while also being scalable. The algorithm takes advantage of the possible paths created in each node's neighborhood to precalculate each node's potential influence, and produces results of competitive quality compared to baseline algorithms such as Greedy, LDAG and SimPath. In the last part of the thesis, we study the privacy aspects of sharing metrics that are good indicators of influence in a social network. We have focused on designing an efficient, correct, secure and privacy-preserving algorithm that computes the k-core metric, which measures the influence of each node of the network. We have specifically adopted a decentralized approach in which the social network is treated as a peer-to-peer (P2P) system. The algorithm is built under the constraint that it should not be possible for a node to reconstruct the graph, partially or entirely, using the information it obtains during execution. While a distributed algorithm that computes the nodes' coreness has already been proposed, dynamic networks are not taken into account. Our main contribution is an incremental algorithm that efficiently solves the core maintenance problem in P2P while limiting the number of messages exchanged and the computations performed. We provide a security and privacy analysis of the solution regarding network de-anonymization, show how it relates to previously defined attack models, and discuss countermeasures.
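For readers unfamiliar with the k-core metric mentioned in this abstract, the following is a minimal sketch of the standard centralized peeling procedure for computing core numbers on a static graph. It is not the thesis's incremental, privacy-preserving P2P algorithm; the function name `core_numbers` and the adjacency-dictionary input format are assumptions made for illustration.

```python
def core_numbers(adj):
    """Compute the core number of every node by repeated peeling.

    adj: dict mapping each node to the set of its neighbours
         (undirected graph, no self-loops; format assumed for this sketch).
    """
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # Pick a node of minimum remaining degree (linear scan, fine for a sketch).
        v = min(remaining, key=degree.get)
        # The core level never decreases as peeling proceeds.
        k = max(k, degree[v])
        core[v] = k
        remaining.remove(v)
        # Removing v lowers the remaining degree of its surviving neighbours.
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1
    return core


# A triangle a-b-c with one pendant node d attached to a.
example = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
cores = core_numbers(example)
```

In this example the triangle forms the 2-core while the pendant node only belongs to the 1-core, which matches the intuition that nodes in denser subgraphs receive higher coreness.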


37

Soni, Swapnil. "Domain-Specific Document Retrieval Framework for Near Real-time Social Health Data". Wright State University / OhioLINK, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=wright1440954750.



38

MY, DO TRA. "Apply data mining to segment retail market based on purchasing portfolios". Thesis, Högskolan i Borås, Institutionen Handels- och IT-högskolan, 2011. http://urn.kb.se/resolve?urn=urn:nbn:se:hb:diva-20774.


Abstract:

Market segmentation is becoming familiar and essential to every marketer in the process of designing and implementing an effective target-marketing strategy. The importance of appropriate market segmentation is well established in the grocery retail industry. In this industry, customer purchasing behavior needs to be understood not only for specific products, but also in terms of the interaction among the whole range of products. Therefore, the motivation for this thesis is to discover a segmentation based on purchasing behavior across the whole range of products, which is called the purchasing pattern. The purchasing pattern is interpreted through purchasing portfolios, which include the list of categories that a given customer purchases as well as the consumption behavior in these categories. This thesis draws on related theories to design a theoretical model of market segmentation based on purchasing portfolios. Then, data mining techniques are applied to a practical database in order to test the theory's hypotheses and to illustrate the model. As a result, the availability of the segmentation is verified from a technical view and its practical significance is confirmed from a marketing view. The data mining results show four segments from the analysis of purchasing portfolios. These four segments cover most of the market and persist over time. The segmentation is assessed from a marketing view to be appropriate for practical application. Furthermore, three segments are selected for further analysis. They represent three distinct purchasing behaviors. Three specific purchasing portfolios are built for these segments, which can be used to guide marketing strategy.


39

Zulfiqar, Omer. "Detecting Public Transit Service Disruptions Using Social Media Mining and Graph Convolution". Thesis, Virginia Tech, 2021. http://hdl.handle.net/10919/103745.


Abstract:

In recent years we have seen an increase in the number of public transit service disruptions due to aging infrastructure, system failures and the regular need for maintenance. With the rapid growth in the usage of these transit networks, there has been an increase in the need for timely detection of such disruptions. Any type of disruption in these transit networks can lead to delays, which can have major implications for daily passengers. Most current disruption detection systems either do not operate in real time or lack transit network coverage. The theme of this thesis was to leverage Twitter data to help detect service disruptions earlier. This work involves developing a pure data mining approach and a couple of different approaches that use graph neural networks to identify transit-disruption-related information in Tweets from a live Twitter stream related to the Washington Metropolitan Area Transit Authority (WMATA) metro system. After developing three different models (a Dynamic Query Expansion model, a Tweet-GCN and a Tweet-Level GCN) to represent the data corpus, we performed various experiments and benchmark evaluations against other existing baseline models to justify the efficacy of our approaches. Given the strong results of both the Tweet-GCN and the Tweet-Level GCN, with average accuracies of approximately 87.3% and 89.9% respectively, we conclude that not only are these two graph neural models superior for basic NLP text classification, but they also outperform other models in identifying transit disruptions.
Master of Science
Millions of people worldwide rely on public transit networks for their daily commutes and day-to-day movements. With the growth in the number of people using the service, there has been an increase in the number of daily passengers affected by service disruptions. This thesis proposes and develops three different approaches to aid in the timely detection of these disruptions. In this work we have developed a pure data mining approach along with two deep learning models using neural networks and live data from Twitter to identify these disruptions. The data mining approach uses a set of disruption-related input keywords to identify similar keywords within the live Twitter data. By collecting historical data we were able to create deep learning models that represent the vocabulary from the disruption-related Tweets in the form of a graph. A graph is a collection of data values where the data points are connected to one another based on their relationships. A longer chain of connections between two words defines a weaker relationship; a shorter chain defines a stronger relationship. In our graph, words with similar contextual meanings are connected to each other over shorter distances than words with different meanings. Finally, we use a neural network as a classifier to scan this graph and learn the semantic relationships within our data. Afterwards, this learned information can be used to accurately classify the disruption-related Tweets within a pool of random Tweets. Once all the proposed approaches have been developed, a benchmark evaluation is performed against other existing text classification techniques to justify the effectiveness of the approaches. The final results indicate that the proposed graph-based models achieved higher accuracy than the data mining model and also outperformed all the other baseline models. Our Tweet-Level GCN had the highest accuracy, at 89.9%.


40

Michel, Pablo Anaxágoras. "Análise explorátoria de dados sócio-econômicos de vestibulandos". Florianópolis, SC, 2002. http://repositorio.ufsc.br/xmlui/handle/123456789/83690.


Abstract:

Dissertation (master's) - Universidade Federal de Santa Catarina, Centro Tecnológico. Programa de Pós-Graduação em Engenharia de Produção.
Over the years, a wide variety of organizations have accumulated thousands of pieces of information that helped companies evolve and gain market share, allowing managers to make decisions based on them in different ways. New and advanced automated decision-support tools have been developed to help managers decide the direction of their business, making it possible to discover, within the mass of data, what really matters. Data mining is the process of extracting information from large databases. This process can be seen as a new discipline at the interface of statistics, machine learning, pattern recognition and database technology. The objective of this work is to propose a model using the data mining methodology and to apply the best-known techniques (Association and Clustering) to the database of applicants to the entrance examination (vestibular) of a higher education institution, aiming to obtain in-depth knowledge and previously unknown information about the candidates. The data analyzed come from the socio-economic form and, in order to run all three techniques mentioned, cleaning routines first had to be applied, removing missing (blank) data and converting dirty data (values outside the valid limits) to a default value. The presentation of the results will suggest some steps or actions to be taken in order to improve the quality of higher education and student satisfaction.
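The two cleaning steps named in this abstract (dropping records with blank values and replacing out-of-range values with a default) might look roughly like the sketch below. The field names, valid ranges and default values are hypothetical and chosen only for illustration; the abstract does not specify them.

```python
def clean_records(records, valid_ranges, defaults):
    """Drop records with missing fields and reset out-of-range values.

    records:      list of dicts, one per applicant (format assumed here)
    valid_ranges: dict field -> (lowest valid value, highest valid value)
    defaults:     dict field -> default value used for out-of-range data
    """
    cleaned = []
    for rec in records:
        # Step 1: remove records in which any checked field is blank.
        if any(rec.get(f) is None or rec.get(f) == "" for f in valid_ranges):
            continue
        fixed = dict(rec)  # copy so the input list is left untouched
        # Step 2: convert values outside the valid limits to the default.
        for field, (low, high) in valid_ranges.items():
            if not low <= fixed[field] <= high:
                fixed[field] = defaults[field]
        cleaned.append(fixed)
    return cleaned


# Hypothetical socio-economic form fields.
raw = [
    {"age": 17, "income_band": 3},
    {"age": None, "income_band": 2},   # blank value: record is dropped
    {"age": 250, "income_band": 1},    # impossible age: reset to default
]
ranges = {"age": (14, 99), "income_band": (1, 7)}
fallbacks = {"age": 0, "income_band": 0}
result = clean_records(raw, ranges, fallbacks)
```

Keeping a sentinel default rather than deleting out-of-range rows preserves the record for association and clustering while still flagging the value as unreliable.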


41

Corley, Courtney David. "Social Network Simulation and Mining Social Media to Advance Epidemiology". Thesis, University of North Texas, 2009. https://digital.library.unt.edu/ark:/67531/metadc11053/.


Abstract:

Traditional public health decision support can benefit from the Web and social media revolution. This dissertation presents approaches to mining social media that benefit public health epidemiology. Through discovery and analysis of trends in influenza-related blogs, a correlation with Centers for Disease Control and Prevention (CDC) influenza-like-illness patient reporting at sentinel health-care providers is verified. A second approach considers personal beliefs about vaccination expressed in social media. A vaccine for human papillomavirus (HPV) was approved by the Food and Drug Administration in May 2006. The virus is present in nearly all cervical cancers and implicated in many throat and oral cancers. Results from automatic sentiment classification of HPV vaccination beliefs are presented, which will enable more accurate prediction of the vaccine's population-level impact. Two epidemic models are introduced that embody the intimate social networks related to HPV transmission. Ultimately, aggregating these methodologies with epidemic and social network modeling facilitates effective development of strategies for targeted interventions.


42

Torres, Alvarez Hernán. "Mineral exploration, junior mining companies and aspects to be considered for its promotion". IUS ET VERITAS, 2016. http://repositorio.pucp.edu.pe/index/handle/123456789/122605.


Abstract:

The author analyzes the measures to be taken into account to promote mining activities, with special emphasis on exploration as the main activity of the mining industry. The article therefore focuses on everything that makes up this activity, from its main actors to the considerations to take into account in its regulation and the effectiveness of that regulation. Finally, the author presents his conclusions, focusing on the importance of investment and therefore of promptly implementing the best mechanisms in the mining sector.


43

Deller, Yannick. "Raw Data for Peace and Security - The Extraction and Mining of People's Behaviour". Thesis, Malmö universitet, Fakulteten för kultur och samhälle (KS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:mau:diva-22599.


Abstract:

In 2015, the United Nations Global Pulse launched an experimentation process assessing the viability of big data and artificial intelligence analysis to support peace and security. The proposition of using such analysis, and thereby creating early warning systems based on real-time monitoring, warrants a critical assessment. This thesis engages in an explanatory critique of the discursive (re-)definitions of peace and security as well as big data and artificial intelligence in the United Nations Global Pulse Lab Kampala report Experimenting with Big Data and Artificial Intelligence to Support Peace and Security. The paper follows a qualitative design and utilises critical discourse analysis as its methodology while using instrumentarian violence as a theoretical lens. The study argues that the use of big data and artificial intelligence analysis, in conjunction with data mining on social media and radio broadcasts for the purposes of early warning systems, creates and manifests social relations marked by asymmetric power and knowledge dynamics. The analysis suggests that the report’s discursive and social practices indicate a conceptualisation of peace and security rooted in the notion of social control through prediction. The study reflects on the consequences for social identities, social relations, and the social world itself and suggests potential areas for future research.


44

Kumar, Arun. "Ground control ramifications and economic impact of retreat mining on room and pillar coal mines". Diss., Virginia Polytechnic Institute and State University, 1986. http://hdl.handle.net/10919/49815.


Abstract:

As the coal reserves at shallow depths become exhausted companies have to develop deeper deposits and increase percentage extraction to maintain production levels. Total extraction for room and pillar mines can only be achieved by pillar extraction. The unsupported roof increases during pillar extraction and hence the cost of ground control also increases. Nevertheless, pillar extraction where possible has many potential advantages such as decreased operating cost, increased utilization of reserves, and extended life of the mine. There are several variables such as depth, mining height, rock strength, mining geometry, roof and floor conditions, and retreat mining methods, which affect pillar extraction cost. Cost components of pillar extraction are classified as direct, indirect, fixed, and subsidence compensation costs. A discounted cash flow pillar extraction cost simulator has been developed and used to compute total pillar extraction cost for a variety of conditions and to explore the possibilities of optimizing ground control and retreat mining techniques to maximize extraction ratio. The computer program computes the safe and optimum pillar dimensions and determines the suitable pillar extraction method for the computed pillar width. Pillar extraction cost components are generated and totalled using the net present value method by the simulator. The total extraction cost simulator evaluates the potential advantages of pillar extraction and tests individual variables for sensitivity to changes in other variables attributable to ground control and pillar extraction techniques. Cost of pillar extraction per ton of coal versus depth is presented in the form of a simple nomogram by the simulator. The simulator can be used to determine the economic feasibility of pillar extraction at a particular depth, geologic and mining environment when the market price of mined coal is known.
Ph. D.
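The net present value calculation that a discounted cash flow simulator of this kind relies on can be illustrated with a short sketch. The function name `net_present_value`, the yearly period convention and the sample figures are assumptions for illustration only; they are not taken from the dissertation's simulator.

```python
def net_present_value(cash_flows, discount_rate):
    """Discount a series of periodic cash flows back to period 0.

    cash_flows:    list where cash_flows[t] is the net cash flow in period t
                   (negative for costs, positive for revenue)
    discount_rate: per-period rate as a fraction, e.g. 0.10 for 10%
    """
    return sum(cf / (1.0 + discount_rate) ** t for t, cf in enumerate(cash_flows))


# Hypothetical example: spend 100 now, receive 110 one period later, at 10%.
npv = net_present_value([-100.0, 110.0], 0.10)
```

With these made-up figures the discounted inflow exactly offsets the initial outlay, so the net present value is approximately zero; a positive value would indicate that extraction pays off at the chosen discount rate.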


45

Franzke, Maximilian [author], and Matthias [academic supervisor] Renz. "Querying and mining heterogeneous spatial, social, and temporal data / Maximilian Franzke ; supervisor: Matthias Renz". München : Universitätsbibliothek der Ludwig-Maximilians-Universität, 2019. http://d-nb.info/1190563630/34.



46

"Algorithmic aspects of social network mining". 2013. http://library.cuhk.edu.hk/record=b5884351.


Abstract:

Li, Ronghua = 社会网络挖掘的算法问题研究 / 李荣华.
Thesis (Ph.D.)--Chinese University of Hong Kong, 2013.
Includes bibliographical references (leaves 157-171).
Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web.
Abstract also in Chinese.
Li, Ronghua = She hui wang luo wa jue de suan fa wen ti yan jiu / Li Ronghua.


47

Miskelly, Kenna Jill. "Exploring ethical issues in data mining: the role of collective privacy". Thesis, 2006. http://hdl.handle.net/1828/2302.


Abstract:

Data mining and other information technologies cause ethical concerns when they are used to categorize and discriminate. Even though there is an intuitive connection between privacy and personal information, it is hard to conceptualize the troubling issues raised by certain data mining applications in terms of privacy. This is largely due to the emphasis that traditional privacy definitions place on the value and protection that privacy provides to the individual. A notion of "collective privacy" emphasizes the broader social importance of privacy and provides philosophical clarity to the privacy issues raised by data mining. The policy suggestions that result from acknowledging the social value of privacy could benefit many in our society and work to fortify our privacy in this information age.


48

Nkosi, Lolah. "Social impact of mining". Thesis, 2015. http://hdl.handle.net/10210/13886.


Abstract:

LL.M. (International Law)
Mining is an activity which contributes greatly and positively to a country's economic development by creating job opportunities and by developing roads, health care centres and educational facilities. However, mining in certain instances can also have a long-lasting negative environmental and social impact on communities. The focus of this dissertation is to address those instances where mining has a negative social impact on the communities where such mining projects take place. The negative social impact of mining is in certain cases regarded as a universal phenomenon. Citizens of many countries where mining activities take place, i.e. "mining countries", especially under-developed and developing countries and countries with economies in transition, such as Ghana, Mali, South Africa and Tanzania on the African continent, are confronted with an array of negative consequences associated with the social impact of mining activities. However, this does not mean that other continents are immune. Countries such as Papua New Guinea, India and China also face the negative social impact of mining.


49

Sigodi, Mzontsundu Gugulethu. "Corporate social investment by mining companies". Thesis, 2014. http://hdl.handle.net/10210/11865.

Abstract:

M.Com. (Business Management)
Corporate social investment (CSI) does not have a universal definition, but corporations tend to interpret it according to the extent of their activity in community social programmes of development. It is of particular importance in South Africa given the fact that South Africa is still a developing country that struggles with high unemployment and inequality. This dissertation explores this concept of CSI in research that was conducted in the community of Letswaleng (Embalenhle), in Mpumalanga, in order to establish whether there is a relationship between the mining company that operates in the community and the community within which it operates. Mining corporations continue to assume little responsibility for the health, education or housing of the families of their black employees while operating in monopolistic conditions and making exorbitant profits. A wide variety of these mining opportunities have attracted multinational enterprises and local firms to invest in the region of Mpumalanga. The purpose of the research was to explore the relationship between the community and the mining company in terms of CSI initiatives. It was also to establish if there are any community structures to ensure that the mining company does consult with the community in making sure that they are kept informed concerning the plans of the mining house within the community. The nature of this research was exploratory, qualitative research and, for this reason, structured interviews were conducted and these were face-to-face. Corporate social investment is an issue that the government needs to take seriously by setting up audit committees to monitor the implementation of these ventures. Government structures such as the Department of Trade and Industry need to fund community structures in order for them to be more effective.


50

LIN, YUNG-JIE, and 林勇傑. "Data Mining For Social Media Marketing Application". Thesis, 2019. http://ndltd.ncl.edu.tw/handle/8hyrvq.

Abstract:

Master's
National Chin-Yi University of Technology
Department of Computer Science and Information Engineering
107
With the development of the Internet, there are platforms where consumers share opinions about products, but fewer platforms for discussing food. To understand consumer satisfaction, traditional catering companies can only ask customers to fill in paper forms; this method adds printing costs and cannot provide immediate feedback. In this thesis, we use web crawler technology to capture comments from social media webpages and understand consumers’ degree of satisfaction through text analysis. By researching consumer reviews, we can improve the restaurant's business model and enhance its performance. The restaurant can keep abreast of this information and learn more about its customers' needs.
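
The crawl-then-analyze pipeline described in this abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the page layout (`<p class="comment">`), the word lists, and the function names `extract_comments` and `satisfaction_score` are all hypothetical, and a simple word-count score stands in for whatever text analysis the author actually used. The sample page is embedded as a string so that no network access is assumed.

```python
from html.parser import HTMLParser

# Hypothetical word lists; a real study would use a curated lexicon or a trained model.
POSITIVE = {"delicious", "friendly", "great", "fresh", "tasty"}
NEGATIVE = {"slow", "cold", "rude", "bland", "dirty"}


class CommentExtractor(HTMLParser):
    """Collect text inside <p class="comment"> elements (an assumed page layout)."""

    def __init__(self):
        super().__init__()
        self.in_comment = False
        self.comments = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) tuples
        if tag == "p" and ("class", "comment") in attrs:
            self.in_comment = True
            self.comments.append("")

    def handle_endtag(self, tag):
        if tag == "p":
            self.in_comment = False

    def handle_data(self, data):
        if self.in_comment:
            self.comments[-1] += data


def extract_comments(html):
    """Return the stripped text of every comment paragraph in the page."""
    parser = CommentExtractor()
    parser.feed(html)
    parser.close()
    return [c.strip() for c in parser.comments]


def satisfaction_score(comment):
    """Return (number of positive words) minus (number of negative words)."""
    words = [w.strip(".,!?;:").lower() for w in comment.split()]
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


# Stand-in for a downloaded review page.
SAMPLE_PAGE = """
<div>
  <p class="comment">Great noodles and friendly staff!</p>
  <p class="comment">Service was slow and the soup was cold.</p>
  <p>Unrelated footer text</p>
</div>
"""

if __name__ == "__main__":
    for comment in extract_comments(SAMPLE_PAGE):
        print(satisfaction_score(comment), comment)
```

A positive score suggests a satisfied reviewer and a negative score a dissatisfied one; averaging the scores over many comments would give a rough satisfaction indicator for a restaurant.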

