Dissertations / Theses on the topic 'Sensitive data'
Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles
Consult the top 50 dissertations / theses for your research on the topic 'Sensitive data.'
Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the academic publication as a PDF and read its abstract online, whenever these are available in the metadata.
Browse dissertations / theses in a wide variety of disciplines and organise your bibliography correctly.
Ema, Ismat. "Sensitive Data Migration to the Cloud." Thesis, Luleå tekniska universitet, Institutionen för system- och rymdteknik, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:ltu:diva-64736.
Folkesson, Carl. "Anonymization of directory-structured sensitive data." Thesis, Linköpings universitet, Institutionen för datavetenskap, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-160952.
Subbiah, Arun. "Efficient Proactive Security for Sensitive Data Storage." Diss., Georgia Institute of Technology, 2007. http://hdl.handle.net/1853/19719.
Bakri, Mustafa al. "Uncertainty-Sensitive Reasoning over the Web of Data." Thesis, Grenoble, 2014. http://www.theses.fr/2014GRENM073.
In this thesis we investigate several approaches that help users to find useful and trustworthy information in the Web of Data using Semantic Web technologies. For this purpose, we tackle two research issues: Data Linkage in Linked Data and Trust in Semantic P2P Networks. We model the problem of data linkage in Linked Data as a reasoning problem on possibly decentralized data. We describe a novel Import-by-Query algorithm that alternates steps of subquery rewriting and of tailored querying of the Linked Data cloud in order to import data as specific as possible for inferring or contradicting given target same-as facts. Experiments conducted on real-world datasets have demonstrated the feasibility of this approach and its usefulness in practice for data linkage and disambiguation. Furthermore, we propose an adaptation of this approach to take into account possibly uncertain data and knowledge, resulting in the inference of same-as and different-from links associated with probabilistic weights. In this adaptation we model uncertainty as probability values. Our experiments have shown that our adapted approach scales to large data sets and produces meaningful probabilistic weights. Concerning trust, we introduce a trust mechanism for guiding the query-answering process in Semantic P2P Networks. Peers in Semantic P2P Networks organize their information using separate ontologies and rely on alignments between their ontologies for translating queries. Trust in such a setting is subjective and estimates the probability that a peer will provide satisfactory answers for specific queries in future interactions. In order to compute trust, the mechanism exploits the information provided by alignments, along with the information that comes from peers' experiences. The calculated trust values are refined over time using Bayesian inference as more queries are sent and answers received. For the evaluation of our mechanism, we have built a semantic P2P bookmarking system (TrustMe) in which we can vary different quantitative and qualitative parameters. The experimental results show the convergence of trust, and highlight the gain in the quality of peers' answers — measured with precision and recall — when the process of query answering is guided by our trust mechanism.
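The Bayesian trust refinement the abstract mentions can be illustrated with a conjugate Beta-Bernoulli update. The sketch below is a minimal illustration under that assumption, not the mechanism actually used in TrustMe; the class and variable names are invented:

```python
# Minimal sketch (assumed model, not the thesis's actual mechanism):
# model each peer's probability of answering satisfactorily as a
# Beta-Bernoulli posterior, updated after every interaction.

class PeerTrust:
    def __init__(self, alpha: float = 1.0, beta: float = 1.0):
        # Beta(1, 1) prior: no opinion about the peer yet.
        self.alpha = alpha  # pseudo-count of satisfactory answers
        self.beta = beta    # pseudo-count of unsatisfactory answers

    def update(self, satisfactory: bool) -> None:
        # Conjugate Bayesian update for a Bernoulli observation.
        if satisfactory:
            self.alpha += 1
        else:
            self.beta += 1

    @property
    def trust(self) -> float:
        # Posterior mean = expected probability of a satisfactory answer.
        return self.alpha / (self.alpha + self.beta)

trust = PeerTrust()
for outcome in [True, True, False, True]:     # simulated interactions
    trust.update(outcome)
print(f"estimated trust: {trust.trust:.2f}")  # 4/6 ~ 0.67
```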
Ljus, Simon. "Purging Sensitive Data in Logs Using Machine Learning." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-411610.
This thesis investigates whether it is possible to create a program that automatically identifies and removes personal data from data logs using machine learning. Understanding the meaning of certain words also requires context: the Swedish word "banan", for example, can refer to a banana that you can eat or to a track that you can run on. Can a machine learning model exploit the preceding and following words in a sequence of words to judge more accurately whether a word is sensitive or not? The types of data occurring in the logs include names, personal identity numbers, usernames and email addresses. For the model to learn to recognize the data, pre-annotated data with a known ground truth is required. Phone numbers, personal identity numbers and email addresses can only take certain forms and do not necessarily need machine learning to be singled out. Can a general model be created that works on several types of data logs without using rule-based algorithms? The results show that the annotated data used for training may have differed too much from the logs that were tested on (unseen data), which means that the model is not good at generalizing.
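The abstract's point that fixed-format identifiers need no machine learning can be made concrete with a rule-based pre-pass. This is an illustrative sketch, not the thesis's code; the regular expressions are simplified examples and would need hardening for real logs:

```python
import re

# Illustrative rule-based pre-pass: identifiers with a fixed surface form
# (emails, Swedish personal identity numbers, phone numbers) are masked
# with regexes before any ML model sees the log line. Patterns are
# simplified examples, not production-grade validators.
PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "PERSONNUMMER": re.compile(r"\b\d{6,8}[-+]?\d{4}\b"),      # YYMMDD-XXXX etc.
    "PHONE": re.compile(r"(?<!\w)(?:\+46|0)(?:[\s-]?\d){7,10}\b"),
}

def scrub(line: str) -> str:
    """Replace pattern-matched sensitive tokens with type placeholders."""
    for label, pattern in PATTERNS.items():
        line = pattern.sub(f"<{label}>", line)
    return line

print(scrub("User anna.svensson@example.com (19850412-1234) called 070-123 45 67"))
# -> "User <EMAIL> (<PERSONNUMMER>) called <PHONE>"
```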
Oshima, Sonoko. "Neuromelanin‐Sensitive Magnetic Resonance Imaging Using DANTE Pulse." Doctoral thesis, Kyoto University, 2021. http://hdl.handle.net/2433/263531.
El-Khoury, Hiba. "Introduction of New Products in the Supply Chain : Optimization and Management of Risks." Thesis, Jouy-en Josas, HEC, 2012. http://www.theses.fr/2012EHEC0001/document.
Shorter product life cycles and rapid product obsolescence provide increasing incentives to introduce new products to markets more quickly. As a consequence of rapidly changing market conditions, firms focus on improving their new product development processes to reap the benefits of early market entry. Researchers have analyzed market entry, but have seldom provided quantitative approaches for the product rollover problem. This research builds upon the literature by using established optimization methods to examine how firms can minimize their net loss during the rollover process. Specifically, our work explicitly optimizes the timing of removal of old products and introduction of new products, the optimal strategy, and the magnitude of net losses when the market entry approval date of a new product is unknown. In the first paper, we use the conditional value at risk to optimize the net loss and investigate the effect of the manager's risk perception on the rollover process. We compare it to the minimization of the classical expected net loss. We derive conditions for optimality and unique closed-form solutions for single and dual rollover cases. In the second paper, we investigate the rollover problem, but for a time-dependent demand rate for the second product, trying to approximate the Bass Model. Finally, in the third paper, we apply the data-driven optimization approach to the product rollover problem where the probability distribution of the approval date is unknown. We rather have historical observations of approval dates. We develop the optimal times of rollover and show the superiority of the data-driven method over the conditional value at risk in the case where it is difficult to guess the real probability distribution.
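For context, the risk measure the first paper optimizes can be illustrated numerically: conditional value at risk (CVaR) at level alpha is the expected loss in the worst (1 - alpha) tail. A minimal sketch on simulated losses, assuming numpy is available (the distribution and numbers are invented):

```python
import numpy as np

# Illustrative CVaR computation (not the thesis's model): sort the
# sampled losses and average the worst (1 - alpha) fraction.
def cvar(losses, alpha=0.9):
    losses = np.sort(np.asarray(losses))
    tail = losses[int(np.ceil(alpha * len(losses))):]  # worst 10% for alpha=0.9
    return tail.mean()

losses = np.random.default_rng(0).normal(100, 20, size=10_000)  # simulated net losses
print(f"E[loss] = {losses.mean():.1f}, CVaR_0.9 = {cvar(losses, 0.9):.1f}")
```

A risk-neutral manager minimizes the plain expectation; a risk-averse one minimizes CVaR, which the abstract contrasts as "risk perception of the manager".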
Gholami, Ali. "Security and Privacy of Sensitive Data in Cloud Computing." Doctoral thesis, KTH, Parallelldatorcentrum, PDC, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-186141.
"Cloud computing", or "molntjänster" as it is usually rendered in Swedish, has great potential. Cloud services can provide exactly the computing power that is demanded, almost regardless of how much is needed; that is, cloud services enable what is usually called "elastic computing". The effects of cloud services are revolutionizing many areas of computing. Compared with earlier approaches to data processing, cloud services offer many advantages, for example the availability of automated tools to assemble, connect, configure and re-configure virtual resources "on-demand". In other words, cloud services make it much easier for organizations to meet their objectives. But the paradigm shift that the introduction of cloud services entails also creates security problems and calls for careful privacy assessments. How are mutual trust and accountability maintained when control is reduced as a consequence of shared information? Consequently, cloud platforms must be designed to handle sensitive information, and technical and organizational safeguards are required to minimize the risk of data breaches, breaches that can result in enormously costly damage, economically as well as in policy terms. Cloud services can contain sensitive information from many different areas and domains. Health data is a typical example: most people obviously want data related to their health to be protected. The increased use of cloud services in recent years has therefore led to stricter privacy and data protection requirements to protect individuals against surveillance and data breaches. Examples of protective legislation are the EU Data Protection Directive (DPD) and the US Health Insurance Portability and Accountability Act (HIPAA), both of which require the protection of privacy and the preservation of integrity when handling information that can identify individuals. Great efforts have been made to develop further mechanisms that increase data integrity and thereby make cloud services more secure, for example encryption, trusted platform modules, secure multi-party computation, homomorphic encryption, anonymization, and container and sandboxing techniques. But how to properly build usable, privacy-preserving cloud services for fully secure processing of sensitive data remains in essential respects an unsolved problem, owing to two major research challenges. First, existing privacy and data protection laws demand transparency and careful scrutiny of how data is used. Second, knowledge of a range of emerging and existing security solutions for building effective cloud services is lacking. This thesis focuses on the design and development of systems and methods for handling sensitive data in cloud services in the most appropriate way. The goal of the presented solutions is to meet the privacy requirements laid down in existing legislation, whose stated aim is to protect the privacy of individuals when cloud services are used. We begin by giving an overview of the most important concepts in cloud computing, and then identify problems that need to be solved for secure data processing in cloud services.
The thesis then continues with a description of background material and a summary of existing security and privacy solutions for cloud services. Our main contribution is a new method for simulating privacy threats in the use of cloud services, a method that can be used to identify privacy requirements that comply with applicable data protection laws. Our method is then used to propose a framework that meets the privacy requirements for handling data in the field of genomics, that is, health data concerning the genome (DNA) of individual persons. Our second major contribution is a system for preserving privacy when publishing biological sample data, which has the advantage of being able to link several different data sets. The thesis goes on to propose and describe ScaBIA, a privacy-preserving system for brain image analyses processed via cloud services. The final chapter of the thesis describes a new approach for quantifying and minimizing the risk of kernel exploitation, an approach that is also a contribution to the development of a new system (a call interposition reference monitor for Lind, the dual-layer sandbox).
Mathew, George. "A Perturbative Decision Making Framework for Distributed Sensitive Data." Diss., Temple University Libraries, 2014. http://cdm16002.contentdm.oclc.org/cdm/ref/collection/p245801coll10/id/269109.
Ph.D.
In various business domains, intelligence garnered from data owned by peer institutions can provide useful information. But, due to regulations, privacy concerns and legal ramifications, peer institutions are reluctant to share raw data. For example, in the medical domain, HIPAA regulations, Personally Identifiable Information and privacy issues are impediments to data sharing. However, intelligence can be learned from distributed data sets if their key characteristics are shared among the desired parties. In scenarios where samples are rare locally, but adequately available collectively from other sites, sharing key statistics about the data may be sufficient to make proper decisions. The objective of this research is to provide a framework in a distributed environment that helps decision-making using statistics of data from participating sites, thereby eliminating the need for raw data to be shared outside the institution. Distributed ID3-based Decision Tree (DIDT) model building is proposed for effectively building a Decision Support System based on labeled data from distributed sites. The framework includes a query mechanism, a global schema generation process brokered by a clearing-house (CH), crosstable matrices generation by participating sites, and entropy calculation (for tests) by the CH using aggregate information from the crosstable matrices. Empirical evaluations were done using synthetic and real data sets. Due to local data policies, participating sites may place restrictions on attribute release. The concept of "constraint graphs" is introduced as an out-of-band, high-level filter for data in transit. Constraint graphs can be used to implement various data transformations, including attribute exclusions. Experiments conducted using constraint graphs yielded results consistent with baseline results. In the case of medical data, it was shown that communication costs for DIDT can be contained by auto-reduction of features with predefined thresholds for near-constant attributes. In another study, it was shown that hospitals with insufficient data to build local prediction models were able to collaboratively build a common prediction model with better accuracy using DIDT. This prediction model also reduced the number of incorrectly classified patients. A natural follow-up question is: can a hospital with a sufficiently large number of instances provide a prediction model to a hospital with insufficient data? This was investigated, and the signature of a large hospital dataset that can provide such a model is presented. It is also shown that the error rates of such a model are not statistically significantly different from those of the collaboratively built model. When rare instances of data occur in a local database, it is quite valuable to draw conclusions collectively from such occurrences at other sites. However, in such situations, there will be a huge imbalance in classes within the relevant base population. We present a system that can collectively build a distributed classification model, without the need for raw data from each site, in the case of imbalanced data. The system uses a voting ensemble of experts for the decision model, where each expert is built using DIDT on selective data generated by oversampling of minority-class and undersampling of majority-class data. The imbalance condition can be detected, and the number of experts needed for the ensemble can be determined by the system.
Temple University--Theses
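A minimal sketch of the aggregation idea behind DIDT described in the abstract above, assuming a hypothetical data layout: each site reports only (attribute value, class label) counts, and the clearing-house computes the entropy-based split score from the summed counts, so raw records never leave a site. Function names and the example numbers are invented:

```python
import math
from collections import Counter

def entropy(class_counts):
    # Shannon entropy of a class-count table.
    total = sum(class_counts.values())
    return -sum((n / total) * math.log2(n / total)
                for n in class_counts.values() if n > 0)

def information_gain(site_crosstables):
    # Merge per-site crosstables: {attr_value: Counter(class -> count)}.
    merged = {}
    for table in site_crosstables:
        for value, counts in table.items():
            merged.setdefault(value, Counter()).update(counts)

    overall = sum(merged.values(), Counter())
    total = sum(overall.values())
    remainder = sum((sum(c.values()) / total) * entropy(c)
                    for c in merged.values())
    return entropy(overall) - remainder

# Two hospitals report counts for attribute "smoker" vs. class "disease";
# only these aggregates, never patient rows, reach the clearing-house.
site_a = {"yes": Counter({"pos": 30, "neg": 10}), "no": Counter({"pos": 5, "neg": 55})}
site_b = {"yes": Counter({"pos": 20, "neg": 10}), "no": Counter({"pos": 5, "neg": 65})}
print(f"gain(smoker) = {information_gain([site_a, site_b]):.3f}")
```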
Ansell, Peter. "A context sensitive model for querying linked scientific data." Thesis, Queensland University of Technology, 2011. https://eprints.qut.edu.au/49777/1/Peter_Ansell_Thesis.pdf.
Sobel, Louis (Louis A.). "Secure Input Overlays : increasing security for sensitive data on Android." Thesis, Massachusetts Institute of Technology, 2015. http://hdl.handle.net/1721.1/100624.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 44-47).
Mobile devices and the applications that run on them are an important part of people's lives. Often, an untrusted mobile application will need to obtain sensitive inputs, such as credit card information or passwords, from the user. The application needs these sensitive inputs in order to send them to a trusted service provider that enables the application to implement some useful functionality such as authentication or payment. In order for the inputs to be secure, there needs to be a trusted path from the user, through a trusted base on the mobile device, and to the remote service provider. In addition, remote attestation is necessary to convince the service provider that the inputs it receives traveled through the trusted path. There are two orthogonal parts to establishing the trusted path: local attestation and data protection. Local attestation is the user being convinced that they are interacting with the trusted base. Data protection is ensuring that inputs remain isolated from untrusted applications until they reach the trusted service provider. This paper categorizes previous research solutions to these two components of a trusted path. I then introduce a new solution addressing data protection: Secure Input Overlays. They keep input safe from untrusted applications by completely isolating the inputs from the untrusted mobile application. However, the untrusted application is still able to perform a limited set of queries for validation purposes. These queries are logged. When the application wants to send the inputs to a remote service provider, it declaratively describes the request. The trusted base sends the contents and the log of queries. An attestation generated by trusted hardware verifies that the request is coming from an Android device. The remote service provider can use this attestation to verify the request, then check the log of queries against a whitelist to make a trust decision about the supplied data.
by Louis Sobel.
M. Eng.
Landegren, Nils. "How Sensitive Are Cross-Lingual Mappings to Data-Specific Factors?" Thesis, Stockholms universitet, Institutionen för lingvistik, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-185069.
Lindblad, Christopher John. "A programming system for the dynamic manipulation of temporally sensitive data." Thesis, Massachusetts Institute of Technology, 1994. http://hdl.handle.net/1721.1/37744.
Includes bibliographical references (p. 255-277).
by Christopher John Lindblad.
Ph.D.
Le, Tallec Yann. "Robust, risk-sensitive, and data-driven control of Markov Decision Processes." Thesis, Massachusetts Institute of Technology, 2007. http://hdl.handle.net/1721.1/38598.
Includes bibliographical references (p. 201-211).
Markov Decision Processes (MDPs) model problems of sequential decision-making under uncertainty. They have been studied and applied extensively. Nonetheless, there are two major barriers that still hinder the applicability of MDPs to many more practical decision-making problems:
* The decision maker often lacks a reliable MDP model. Since the results obtained by dynamic programming are sensitive to the assumed MDP model, their relevance is challenged by model uncertainty.
* The structural and computational results of dynamic programming (which deals with expected performance) have been extended with only limited success to accommodate risk-sensitive decision makers.
In this thesis, we investigate two ways of dealing with uncertain MDPs and we develop a new connection between robust control of uncertain MDPs and risk-sensitive control of dynamical systems. The first approach assumes a model of model uncertainty and formulates the control of uncertain MDPs as a problem of decision-making under (model) uncertainty. We establish that most formulations are at least NP-hard and thus suffer from the "curse of uncertainty." The worst-case control of MDPs with rectangular uncertainty sets is equivalent to a zero-sum game between the controller and nature.
The structural and computational results for such games make this formulation appealing. By adding a penalty for unlikely parameters, we extend the formulation of worst-case control of uncertain MDPs and mitigate its conservativeness. We show a duality between the penalized worst-case control of uncertain MDPs with rectangular uncertainty and the minimization of a Markovian dynamically consistent convex risk measure of the sample cost. This notion of risk has desirable properties for multi-period decision making, including a new Markovian property that we introduce and motivate. This Markovian property is critical in establishing the equivalence between minimizing some risk measure of the sample cost and solving a certain zero-sum Markov game between the decision maker and nature, and to tackling infinite-horizon problems. An alternative approach to dealing with uncertain MDPs, which avoids the curse of uncertainty, is to exploit directly observational data. Specifically, we estimate the expected performance of any given policy (and its gradient with respect to certain policy parameters) from a training set comprising observed trajectories sampled under a known policy.
We propose new value (and value gradient) estimators that are unbiased and have low training-set-to-training-set variance. We expect our approach to outperform competing approaches when there are few system observations compared to the underlying MDP size, as indicated by numerical experiments.
by Yann Le Tallec.
Ph.D.
Torabian, Hajaralsadat. "Protecting sensitive data using differential privacy and role-based access control." Master's thesis, Université Laval, 2016. http://hdl.handle.net/20.500.11794/26580.
In today's world, where most aspects of modern life are handled and managed by computer systems, privacy has increasingly become a big concern. In addition, data has been generated and processed massively, especially over the last two years. The rate at which data is generated on the one hand, and the need to efficiently store and analyze it on the other, lead people and organizations to outsource their massive amounts of data (namely Big Data) to cloud environments supported by cloud service providers (CSPs). Such environments can perfectly undertake the tasks of storing and analyzing big data, since they mainly rely on the Hadoop MapReduce framework, which is designed to handle big data efficiently in parallel. Although outsourcing big data into the cloud facilitates data processing and reduces the maintenance cost of local data storage, it raises new problems concerning privacy protection. The question is how one can perform computations on sensitive big data while still preserving privacy. Building secure systems for handling and processing such private massive data is therefore crucial. We need mechanisms to protect private data even when the running computation is untrusted. There has been a good deal of research focused on finding solutions to the privacy and security issues of data analytics in cloud environments. In this dissertation, we study existing work on protecting the privacy of any individual in a data set, specifically the notion of privacy known as differential privacy. Differential privacy has been proposed to better protect the privacy of data mining over sensitive data, ensuring that the released aggregate result reveals almost nothing about whether or not any given individual has contributed to the data set. Finally, we propose an idea for combining differential privacy with another available privacy-preserving method.
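For context, the standard Laplace mechanism illustrates the kind of guarantee differential privacy gives for a counting query; the thesis's exact construction may differ. A minimal sketch, assuming numpy is available:

```python
import numpy as np

# Laplace mechanism for a counting query: adding or removing one
# individual changes a count by at most 1 (sensitivity = 1), so noise
# drawn from Laplace(scale = 1/epsilon) yields an
# epsilon-differentially-private answer.
def laplace_count(true_count: int, epsilon: float) -> float:
    scale = 1.0 / epsilon
    return true_count + np.random.laplace(loc=0.0, scale=scale)

# Repeated queries consume privacy budget: answering twice at epsilon/2
# each gives the same total guarantee as once at epsilon (basic composition).
print(laplace_count(412, epsilon=0.5))
```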
Hedlin, Johan, and Joakim Kahlström. "Detecting access to sensitive data in software extensions through static analysis." Thesis, Linköpings universitet, Programvara och system, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-162281.
Barreau, Emilie. "Accès aux droits sociaux et numérique : les enjeux de la digitalisation dans l’accès aux aides sociales départementales." Electronic Thesis or Diss., Angers, 2024. http://www.theses.fr/2024ANGE0012.
The dematerialization of administrative procedures is a general phenomenon that has a specific scope where social rights are concerned. In the case of social assistance, these rights are aimed at a vulnerable public that can combine several factors of difficulty. The dematerialization of administrative procedures, which results in the absence of physical offices and interlocutors, is deployed without the particularity of social rights or the vulnerability of the persons concerned being taken into account. Consequently, the desired objective of strengthening access to social rights through the potential of digital technology quickly gives way to uncertainty about the effectiveness of social rights. This is particularly the case with the platforms that constitute interfaces between the applicant for, or beneficiary of, social assistance and the authority that must provide and monitor it, such as the departmental councils. The innovative character of these tools must not, however, obscure their original social function. While a more inclusive framework of practices is developing, the current legal framework seems to be mobilized in favor of digital technology (dematerialization, open data, algorithms, etc.). In this respect, the relationship between access to social rights and digital technology reveals tensions such as the local organization of departmental councils, the sensitivity of personal data, the consequences of automating individual administrative decisions, and the economic value of data. The position adopted in this research is therefore to highlight the conditions that make it possible to ensure respect for social rights in the face of these changes.
Krishnaswamy, Vijaykumar. "Shared state management for time-sensitive distributed applications." Diss., Georgia Institute of Technology, 2001. http://hdl.handle.net/1853/8197.
El Khoury, Hiba. "Introduction of New Products in the Supply Chain : Optimization and Management of Risks." Phd thesis, HEC, 2012. http://pastel.archives-ouvertes.fr/pastel-00708801.
Khire, Sourabh Mohan. "Time-sensitive communication of digital images, with applications in telepathology." Thesis, Atlanta, Ga. : Georgia Institute of Technology, 2009. http://hdl.handle.net/1853/29761.
Committee Chair: Jayant, Nikil; Committee Member: Anderson, David; Committee Member: Lee, Chin-Hui. Part of the SMARTech Electronic Thesis and Dissertation Collection.
Murphy, Brian R. "Order-sensitive XML query processing over relational sources." Link to electronic thesis, 2003. http://www.wpi.edu/Pubs/ETD/Available/etd-0505103-123753.
Keywords: computation pushdown; XML; order-based XQuery processing; relational database; ordered SQL queries; data model mapping; XQuery; XML data mapping; SQL; XML algebra rewrite rules; XML document order. Includes bibliographical references (p. 64-67).
McCullagh, Karen. "The social, cultural, epistemological and technical basis of the concept of 'private' data." Thesis, University of Manchester, 2012. https://www.research.manchester.ac.uk/portal/en/theses/the-social-cultural-epistemological-and-technical-basis-of-the-concept-of-private-data(e2ea538a-8e5b-43e3-8dc2-4cdf602a19d3).html.
Raber, Frederic Christian [Verfasser]. "Supporting lay users in privacy decisions when sharing sensitive data / Frederic Christian Raber." Saarbrücken : Saarländische Universitäts- und Landesbibliothek, 2020. http://d-nb.info/1220691127/34.
Ailem, Melissa. "Sparsity-sensitive diagonal co-clustering algorithms for the effective handling of text data." Thesis, Sorbonne Paris Cité, 2016. http://www.theses.fr/2016USPCB087.
In the current context, there is a clear need for Text Mining techniques to analyse the huge quantity of unstructured text documents available on the Internet. These textual data are often represented by sparse high-dimensional matrices where rows and columns represent documents and terms respectively. Thus, it would be worthwhile to simultaneously group these terms and documents into meaningful clusters, making this substantial amount of data easier to handle and interpret. Co-clustering techniques serve just this purpose. Although many existing co-clustering approaches have been successful in revealing homogeneous blocks in several domains, these techniques are still challenged by the high dimensionality and sparsity characteristics exhibited by document-term matrices. Due to this sparsity, several co-clusters are primarily composed of zeros. While homogeneous, these co-clusters are irrelevant and must be filtered out in a post-processing step to keep only the most significant ones. The objective of this thesis is to propose new co-clustering algorithms tailored to take into account these sparsity-related issues. The proposed algorithms seek a block-diagonal structure and allow the most useful co-clusters to be identified straightaway, which makes them especially effective for the text co-clustering task. Our contributions can be summarized as follows: First, we introduce and demonstrate the effectiveness of a novel co-clustering algorithm based on a direct maximization of graph modularity. While existing graph-based co-clustering algorithms rely on spectral relaxation, the proposed algorithm uses an iterative alternating optimization procedure to reveal the most meaningful co-clusters in a document-term matrix. Moreover, the proposed optimization has the advantage of avoiding the computation of eigenvectors, a task which is prohibitive when considering high-dimensional data. This is an improvement over spectral approaches, where the eigenvector computation is necessary to perform the co-clustering. Second, we use an even more powerful approach to discover block-diagonal structures in document-term matrices. We rely on mixture models, which offer strong theoretical foundations and considerable flexibility that makes it possible to uncover various specific cluster structures. More precisely, we propose a rigorous probabilistic model based on the Poisson distribution and the well-known Latent Block Model. Interestingly, this model includes the sparsity in its formulation, which makes it particularly effective for text data. Estimating this model's parameters under the Maximum Likelihood (ML) and the Classification Maximum Likelihood (CML) approaches, we propose four co-clustering algorithms, including a hard, a soft, a stochastic, and a fourth algorithm which simultaneously leverages the benefits of both the soft and stochastic variants. As a last contribution of this thesis, we propose a new biomedical text mining framework that includes some of the above-mentioned co-clustering algorithms. This work shows the contribution of co-clustering to a real biomedical text mining problem. The proposed framework is able to provide new clues about the results of genome-wide association studies (GWAS) by mining PUBMED abstracts. This framework has been tested on asthma and allowed us to assess the strength of associations between asthma genes reported in previous GWAS, as well as to discover new candidate genes likely associated with asthma.
In a nutshell, while several text co-clustering algorithms already exist, their performance can be substantially increased if more appropriate models and algorithms are available. According to the extensive experiments done on several challenging real-world text data sets, we believe that this thesis has served this objective well.
Jarvis, Ryan D. "Protecting Sensitive Credential Content during Trust Negotiation." Diss., Brigham Young University, 2003. http://contentdm.lib.byu.edu/ETD/image/etd192.pdf.
Becerra Bonache, Leonor. "On the learnability of Mildly Context-Sensitive languages using positive data and correction queries." Doctoral thesis, Universitat Rovira i Virgili, 2006. http://hdl.handle.net/10803/8780.
Full textNuestras tres principales aportaciones son:
1. Introducción de una nueva clase de lenguajes llamada Simple p-dimensional external contextual (SEC). A pesar de que las investigaciones en inferencia gramatical se han centrado en lenguajes regulares o independientes del contexto, en nuestra tesis proponemos centrar esos estudios en clases de lenguajes más relevantes desde un punto de vista lingüístico (familias de lenguajes que ocupan una posición ortogonal en la jerarquía de Chomsky y que son suavemente dependientes del contexto, por ejemplo, SEC).
2. Presentación de un nuevo paradigma de aprendizaje basado en preguntas de corrección. Uno de los principales resultados positivos dentro de la teoría del aprendizaje formal es el hecho de que los autómatas finitos deterministas (DFA) se pueden aprender de manera eficiente utilizando preguntas de pertinencia y preguntas de equivalencia. Teniendo en cuenta que en el aprendizaje de primeras lenguas la corrección de errores puede jugar un papel relevante, en nuestra tesis doctoral hemos introducido un nuevo modelo de aprendizaje que reemplaza las preguntas de pertinencia por preguntas de corrección.
3. Presentación de resultados basados en las dos previas aportaciones. En primer lugar, demostramos que los SEC se pueden aprender a partir de datos positivos. En segundo lugar, demostramos que los DFA se pueden aprender a partir de correcciones y que el número de preguntas se reduce considerablemente.
Los resultados obtenidos con esta tesis doctoral suponen una aportación importante para los estudios en inferencia gramatical (hasta el momento las investigaciones en este ámbito se habían centrado principalmente en los aspectos matemáticos de los modelos). Además, estos resultados se podrían extender a diversos campos de aplicación que gozan de plena actualidad, tales como el aprendizaje automático, la robótica, el procesamiento del lenguaje natural y la bioinformática.
With this dissertation, we bring together the Theory of the Grammatical Inference and Studies of language acquisition, in pursuit of our final goal: to go deeper in the understanding of the process of language acquisition by using the theory of inference of formal grammars.
Our three main contributions are:
1. Introduction of a new class of languages called Simple p-dimensional external contextual (SEC). Despite the fact that the field of Grammatical Inference has focused its research on learning regular or context-free languages, we propose in our dissertation to focus these studies in classes of languages more relevant from a linguistic point of view (families of languages that occupy an orthogonal position in the Chomsky Hierarchy and are Mildly Context-Sensitive, for example SEC).
2. Presentation of a new learning paradigm based on correction queries. One of the main results in the theory of formal learning is that deterministic finite automata (DFA) are efficiently learnable from membership query and equivalence query. Taken into account that in first language acquisition the correction of errors can play an important role, we have introduced in our dissertation a novel learning model by replacing membership queries with correction queries.
3. Presentation of results based on the two previous contributions. First, we prove that SEC is learnable from only positive data. Second, we prove that it is possible to learn DFA from corrections and that the number of queries is reduced considerably.
The results obtained with this dissertation represent an important contribution to studies of Grammatical Inference (current research in Grammatical Inference has focused mainly on the mathematical aspects of the models). Moreover, these results could be extended to studies directly related to machine translation, robotics, natural language processing, and bioinformatics.
Vilsmaier, Christian. "Contextualized access to distributed and heterogeneous multimedia data sources." Thesis, Lyon, INSA, 2014. http://www.theses.fr/2014ISAL0094/document.
Making multimedia data available online becomes less expensive and more convenient on a daily basis. This development promotes web phenomena such as Facebook, Twitter, and Flickr, whose increased acceptance in society in turn leads to a multiplication of the amount of images available online. This vast amount of frequently public, and therefore searchable, images already exceeds the zettabyte bound. Executing a similarity search over the magnitude of images that are publicly available, and receiving a top-quality result, is a challenge that the scientific community has recently attempted to rise to. One approach to cope with this problem assumes the use of distributed heterogeneous Content Based Image Retrieval systems (CBIRs). Following from this, the problems that emerge from a distributed query scenario must be dealt with: for example, the involved CBIRs' usage of distinct metadata formats for describing their content, as well as their unequal technical and structural information. An additional issue is the individual metrics that the CBIRs use to calculate the similarity between pictures, as well as their specific ways of being combined. Overall, obtaining good results in this environment is a very labor-intensive task which has been explored scientifically, but not yet comprehensively. The problem primarily addressed in this work is the collection of pictures from CBIRs that are similar to a given picture, as a response to a distributed multimedia query. The main contribution of this thesis is the construction of a network of Content Based Image Retrieval systems that are able to extract and exploit information about an input image's semantic concept. This so-called semantic CBIRn is mainly composed of CBIRs that are configured by the semantic CBIRn itself. Complementarily, there is a possibility to integrate specialized external sources. The semantic CBIRn is able to collect and merge results from all of these attached CBIRs. In order to integrate external sources that are willing to join the network, but are not willing to disclose their configuration, an algorithm was developed that approximates these configurations. By categorizing existing as well as external CBIRs and analyzing incoming queries, image queries are forwarded exclusively to the most suitable CBIRs. In this way, images that are not of any use for the user can be omitted beforehand. The images returned thereafter are rendered comparable in order to merge them into one single result list of images similar to the input image. The feasibility of the approach, and the improvement of the search process it brings, are demonstrated by a prototypical implementation. Using this prototypical implementation, the number of returned images that are of the same semantic concept as the input image is increased by a factor of 4.75 with respect to a predefined non-semantic CBIRn.
Darshana, Dipika. "DELAY SENSITIVE ROUTING FOR REAL TIME TRAFFIC OVER AD-HOC NETWORKS." Master's thesis, University of Central Florida, 2008. http://digital.library.ucf.edu/cdm/ref/collection/ETD/id/2802.
M.S.
School of Electrical Engineering and Computer Science
Engineering and Computer Science
Computer Engineering MSCpE
Winandy, Marcel [Verfasser]. "Security and Trust Architectures for Protecting Sensitive Data on Commodity Computing Platforms / Marcel Winandy." Aachen : Shaker, 2012. http://d-nb.info/106773497X/34.
Vaskovich, Daria. "Cloud Computing and Sensitive Data : A Case of Beneficial Co-Existence or Mutual Exclusiveness?" Thesis, KTH, Hållbarhet och industriell dynamik, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-169597.
Cloud computing is today a hot topic that has changed how IT is delivered and created new business models to pursue. The main listed benefits of cloud computing are, among others, flexibility and scalability. It is widely adopted by individuals in services such as Google Drive and Dropbox. However, there exists a certain degree of caution towards cloud computing at organizations which possess sensitive data, and this may decelerate adoption. Hence, this master thesis aims to investigate the topic of cloud computing in combination with sensitive data, in order to support organizations considering a transition into the cloud with a base of knowledge for their decision making. Sensitive data is defined as information protected by the Swedish Personal Data Act. Previous studies show that organizations value a high degree of security when making a transition into cloud computing, and request that several measures be implemented by the cloud computing service provider. Legislative conformance of a cloud computing service is another important aspect. The data gathering activities consisted of a survey, directed at 101 Swedish organizations, to map their usage of cloud computing services and to identify aspects which may decelerate adoption, together with interviews with three experts in the fields of law and cloud computing. The results were analyzed and discussed, leading to the conclusions that a hybrid cloud is a well-chosen alternative for a cautious organization, that the SLA between the organizations should be thoroughly negotiated, and that primarily providers well established on the Swedish market should be chosen, in order to minimize the risk of a legally non-compliant solution. Finally, each organization should decide whether the security provided by the cloud computing provider is sufficient for the organization's purposes.
Peng, Zhen. "Novel Data Analytics for Developing Sensitive and Reliable Damage Indicators in Structural Health Monitoring." Thesis, Curtin University, 2022. http://hdl.handle.net/20.500.11937/89064.
Aljandal, Waleed A. "Itemset size-sensitive interestingness measures for association rule mining and link prediction." Diss., Manhattan, Kan. : Kansas State University, 2009. http://hdl.handle.net/2097/1119.
Flory, Long Mrs. "A WEB PERSONALIZATION ARTIFACT FOR UTILITY-SENSITIVE REVIEW ANALYSIS." VCU Scholars Compass, 2015. http://scholarscompass.vcu.edu/etd/3739.
Ording, Marcus. "Context-Sensitive Code Completion : Improving Predictions with Genetic Algorithms." Thesis, KTH, Skolan för informations- och kommunikationsteknik (ICT), 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-205334.
Within the area of context-sensitive code completion there is a need for precise predictive models in order to suggest useful code completions. The traditional way to optimize the performance of code completion systems is to empirically evaluate the effect of each system parameter individually and fine-tune the parameters one at a time. This work presents a genetic algorithm that can optimize the system parameters with a degree of freedom equal to the number of parameters to optimize. The study evaluates the effect of the optimized parameters on the predictive quality of the studied code completion system. The earlier evaluation of the reference system was extended to also include model size and inference time. The results of the study show that the genetic algorithm is able to improve the predictive quality of the studied code completion system. Compared with the reference system, the improved system correctly recognizes 1 out of 10 additional code patterns that were previously unseen. The improvement in predictive quality does not have a significant impact on the system, as the inference time remains below 1 ms for both systems.
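A minimal sketch of the joint parameter optimization idea, assuming invented parameter ranges and a stand-in fitness function (the thesis's encoding and evaluation are not reproduced here): each individual is a full parameter vector, so all parameters evolve together instead of being tuned one at a time:

```python
import random

BOUNDS = [(1, 10), (0.0, 1.0), (0.0, 1.0)]   # hypothetical parameter ranges

def fitness(params):
    # Stand-in for "predictive quality of the code-completion system
    # evaluated with these parameters"; replace with a real evaluation.
    a, b, c = params
    return -((a - 6) ** 2 + (b - 0.3) ** 2 + (c - 0.8) ** 2)

def random_individual():
    return [random.uniform(lo, hi) for lo, hi in BOUNDS]

def mutate(ind, rate=0.2):
    # Resample each gene with probability `rate`.
    return [random.uniform(lo, hi) if random.random() < rate else v
            for v, (lo, hi) in zip(ind, BOUNDS)]

def crossover(a, b):
    # Uniform crossover: pick each gene from either parent.
    return [x if random.random() < 0.5 else y for x, y in zip(a, b)]

population = [random_individual() for _ in range(30)]
for generation in range(50):
    population.sort(key=fitness, reverse=True)
    parents = population[:10]                 # truncation selection
    population = parents + [
        mutate(crossover(random.choice(parents), random.choice(parents)))
        for _ in range(20)
    ]

print("best parameters:", [round(v, 2) for v in max(population, key=fitness)])
```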
Li, Xinfeng. "Time-sensitive Information Communication, Sensing, and Computing in Cyber-Physical Systems." The Ohio State University, 2014. http://rave.ohiolink.edu/etdc/view?acc_num=osu1397731767.
Engin, Melih. "Text Classification In Turkish Marketing Domain And Context-sensitive Ad Distribution." Master's thesis, METU, 2009. http://etd.lib.metu.edu.tr/upload/12610457/index.pdf.
Forde, Edward Steven. "Security Strategies for Hosting Sensitive Information in the Commercial Cloud." ScholarWorks, 2017. https://scholarworks.waldenu.edu/dissertations/3604.
He, Yuting. "RVD2: An ultra-sensitive variant detection model for low-depth heterogeneous next-generation sequencing data." Digital WPI, 2014. https://digitalcommons.wpi.edu/etd-theses/499.
Hsu, William. "Using knowledge encoded in graphical disease models to support context-sensitive visualization of medical data." Diss., Restricted to subscribing institutions, 2009. http://proquest.umi.com/pqdweb?did=1925776141&sid=13&Fmt=2&clientId=1564&RQT=309&VName=PQD.
Hoang, Van-Hoan. "Securing data access and exchanges in a heterogeneous ecosystem : An adaptive and context-sensitive approach." Thesis, La Rochelle, 2022. http://www.theses.fr/2022LAROS009.
Cloud-based data storage and sharing services have proven successful over the last decades. The underlying model helps users avoid spending heavily on hardware to store data, while still being able to access and share data anywhere and whenever they desire. In this context, security is vital to protecting users and their resources. Regarding users, they need to be securely authenticated to prove their eligibility to access resources. As for user privacy, showing credentials enables the service provider to detect the people with whom a user shares data, or to build a profile for each user. Regarding outsourced data, due to the complexity of deploying effective key management in such services, data is often encrypted not by users but by service providers, which enables them to read users' data. In this thesis, we make a set of contributions which address these issues. First, we design a password-based authenticated key exchange protocol to establish a secure channel between users and service providers over an insecure environment. Second, we construct a privacy-enhancing decentralized public key infrastructure which allows building secure authentication protocols while preserving user privacy. Third, we design two revocable ciphertext-policy attribute-based encryption schemes. These provide effective key management systems that help a data owner to encrypt data before outsourcing it, while still retaining the capacity to securely share it with others. Fourth, we build a decentralized data sharing platform by leveraging blockchain technology and the IPFS network. The platform aims at providing high data availability, data confidentiality, secure access control, and user privacy.
Sankara, Krishnan Shivaranjani. "Delay sensitive delivery of rich images over WLAN in telemedicine applications." Thesis, Atlanta, Ga. : Georgia Institute of Technology, 2009. http://hdl.handle.net/1853/29673.
Committee Chair: Jayant, Nikil; Committee Member: Altunbasak, Yucel; Committee Member: Sivakumar, Raghupathy. Part of the SMARTech Electronic Thesis and Dissertation Collection.
Bhattacharya, Arindam. "Gradient Dependent Reconstruction from Scalar Data." The Ohio State University, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=osu1449181983.
Ljungberg, Lucas. "Using unsupervised classification with multiple LDA derived models for text generation based on noisy and sensitive data." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-255010.
Creating models that generate contextual answers to questions is a hard problem to begin with, one that becomes even harder when the available data contains both noise and sensitive information. It is both important and of great interest to find models and methods that can handle these difficulties, so that even problematic data can be used productively. This thesis proposes a model based on a pair of cooperating topic models with separate responsibilities (LDA and GSDMM) to mitigate the problematic properties of the data. The model is tested on a real dataset with these difficulties as well as on a dataset without them. The goal is to 1) inspect the behavior of both topic models to see whether they can represent the data in such a way that other models can use it as input or output, and 2) understand which of these difficulties can be handled as a result. The results show that the topic models can represent the semantics and meaning of documents well enough to produce well-formed input for other models. This representation can also handle large vocabularies and noise in the text. The results also show that the topic grouping of the response data is well-behaved enough to be used as a target for classification models, such that correct sentences can be generated in response.
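The representation idea can be sketched with scikit-learn's LDA on a toy corpus (GSDMM, the short-text model the thesis pairs with LDA, has no scikit-learn implementation, so this sketch covers only the LDA half; the documents are invented):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# LDA compresses noisy, high-vocabulary text into dense topic vectors
# that downstream models can consume as input or predict as output.
docs = [
    "invoice payment overdue reminder",
    "payment card declined retry",
    "router offline restart modem",
    "wifi modem connection dropped",
]

counts = CountVectorizer().fit_transform(docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(counts)

# Each document becomes a 2-dimensional topic distribution instead of a
# sparse bag-of-words over the full vocabulary.
print(lda.transform(counts).round(2))
```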
Qi, Hao. "Computing resources sensitive parallelization of neural networks for large scale diabetes data modelling, diagnosis and prediction." Thesis, Brunel University, 2011. http://bura.brunel.ac.uk/handle/2438/6346.
Koop, Martin [Verfasser], and Stefan [Akademischer Betreuer] Katzenbeisser. "Preventing the Leakage of Privacy Sensitive User Data on the Web / Martin Koop ; Betreuer: Stefan Katzenbeisser." Passau : Universität Passau, 2021. http://d-nb.info/1226425577/34.
Oguchi, Chizoba. "A Comparison of Sensitive Splice Aware Aligners in RNA Sequence Data Analysis in Leaping towards Benchmarking." Thesis, Högskolan i Skövde, Institutionen för biovetenskap, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-18513.
Olorunnimbe, Muhammed. "Intelligent Adaptation of Ensemble Size in Data Streams Using Online Bagging." Thesis, Université d'Ottawa / University of Ottawa, 2015. http://hdl.handle.net/10393/32340.
Jun, Mi Kyung. "Effects of survey mode, gender, and perceived sensitivity on the quality of data regarding sensitive health behaviors." [Bloomington, Ind.] : Indiana University, 2005. http://wwwlib.umi.com/dissertations/fullcit/3167794.
Source: Dissertation Abstracts International, Volume: 66-04, Section: B, page: 2011. Adviser: Nathan W. Shier. "Title of dissertation home page (viewed Nov. 22, 2006)."
Hellman, Hanna. "Data Aggregation in Time Sensitive Multi-Sensor Systems : Study and Implementation of Wheel Data Aggregation for Slip Detection in an Autonomous Vehicle Convoy." Thesis, KTH, Mekatronik, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-217857.
With an impending shift to more advanced safety systems and driver assistance (ADAS) in the vehicles we drive, and also increased autonomy, comes an increased amount of data on the internal vehicle data bus. There is a need to lessen the amount of data and at the same time increase its value. Data aggregation, often applied in the field of environmental sensing or small wheeled mobile robots (WMRs), could be a partial solution. This thesis investigates an aggregation strategy applied to a use case regarding slip detection in a vehicle convoy. The approach was implemented in a physical demonstrator in the shape of a small autonomous vehicle convoy to produce quantitative data. The results imply that a weighted adaptive average can be used for vehicle velocity estimation based on the input of four individual wheel velocities. Thereafter a slip ratio can be calculated, which is used to decide whether slip exists or not. A limitation of the proposed approach is, however, the number of velocity references that is needed, since the results currently apply to one-wheel slip on a four-wheel vehicle. A proposed future direction related to the use case of convoy driving is to include platooning vehicles as extra velocity references for the vehicles in the convoy, thus increasing the accuracy of the slip detection and merging the areas of CO-CPS and data aggregation.
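A minimal sketch of the aggregation-and-detection step described above, with invented weights, threshold and numbers: wheels that deviate from the previous velocity estimate are down-weighted in the adaptive average, and the slip ratio is then computed against the aggregated estimate:

```python
def estimate_vehicle_speed(wheel_speeds, prev_estimate, alpha=4.0):
    # Weighted adaptive average: weight each wheel by how close it is
    # to the previous vehicle-speed estimate, so a spinning wheel
    # contributes little to the consensus.
    weights = [1.0 / (1.0 + alpha * abs(v - prev_estimate)) for v in wheel_speeds]
    return sum(w * v for w, v in zip(weights, wheel_speeds)) / sum(weights)

def slip_ratio(vehicle_speed, wheel_speed):
    # Positive ratio: wheel turning faster than the vehicle moves (spin).
    return (wheel_speed - vehicle_speed) / max(vehicle_speed, 1e-6)

wheels = [2.02, 1.98, 2.01, 2.60]            # m/s; rear-right wheel spinning
v_est = estimate_vehicle_speed(wheels, prev_estimate=2.0)
slipping = [i for i, w in enumerate(wheels) if abs(slip_ratio(v_est, w)) > 0.15]
print(f"v = {v_est:.2f} m/s, slipping wheels: {slipping}")  # -> wheel index 3
```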