Dissertations / Theses on the topic 'Classification des logiciels malveillants' (malware classification)
Consult the top 47 dissertations / theses for your research on the topic 'Classification des logiciels malveillants.'
Puodzius, Cassius. "Data-driven malware classification assisted by machine learning methods." Electronic Thesis or Diss., Rennes 1, 2022. https://ged.univ-rennes1.fr/nuxeo/site/esupversions/3dabb48c-b635-46a5-bcbe-23992a2512ec.
Historically, malware (MW) analysis has relied heavily on human expertise for manual signature creation to detect and classify MW. This procedure is very costly and time-consuming, and thus unable to cope with the modern cyber threat landscape. The solution is to widely automate MW analysis. Toward this goal, MW classification allows optimizing the handling of large MW corpora by identifying resemblances across similar instances. Consequently, MW classification is a key activity in MW analysis, which is paramount in the operation of computer security as a whole. This thesis addresses the problem of MW classification with an approach in which human intervention is spared as much as possible. Furthermore, we steer clear of the subjectivity inherent to human analysis by basing MW classification solely on data directly extracted from MW analysis, thus taking a data-driven approach. Our objective is to improve the automation of malware analysis and to combine it with machine learning methods that are able to autonomously spot and reveal unsuspected commonalities within data. We phased our work in three stages. Initially we focused on improving MW analysis and its automation, studying new ways of leveraging symbolic execution in MW analysis and developing a distributed framework to scale up our computational power. Then we concentrated on the representation of MW behavior, with painstaking attention to its accuracy and robustness. Finally, we turned to MW clustering, devising a methodology that places no restriction on the combination of syntactical and behavioral features and remains scalable in practice.
As for our main contributions, we revamp the use of symbolic execution for MW analysis with special attention to the optimal use of SMT solver tactics and hyperparameter settings; we conceive a new evaluation paradigm for MW analysis systems; we formulate a compact graph representation of behavior, along with a corresponding function for pairwise similarity computation, which is accurate and robust; and we elaborate a new MW clustering strategy based on ensemble clustering that is flexible with respect to the combination of syntactical and behavioral features.
Calvet, Joan. "Analyse Dynamique de Logiciels Malveillants." Phd thesis, Université de Lorraine, 2013. http://tel.archives-ouvertes.fr/tel-00922384.
Thierry, Aurélien. "Désassemblage et détection de logiciels malveillants auto-modifiants." Thesis, Université de Lorraine, 2015. http://www.theses.fr/2015LORR0011/document.
This dissertation explores tactics for the analysis and disassembly of malware that uses obfuscation techniques such as self-modification and code overlapping. Most malware found in the wild uses self-modification in order to hide its payload from an analyst. We propose a hybrid analysis which uses an execution trace derived from a dynamic analysis. This analysis cuts the self-modifying binary into several non-self-modifying parts that we can examine through a static analysis using the trace as a guide. This second analysis circumvents further protection techniques, such as code overlapping, in order to recover the control flow graph of the studied binary. Moreover, we review a morphological malware detector which compares the control flow graph of the studied binary against those of known malware. We provide a formalization of this graph comparison problem along with efficient algorithms that solve it and a use case in the software similarity field.
Pektaş, Abdurrahman. "Behavior based malware classification using online machine learning." Thesis, Université Grenoble Alpes (ComUE), 2015. http://www.theses.fr/2015GREAM065/document.
Recently, malware (short for malicious software) has greatly evolved and become a major threat to home users, enterprises, and even governments. Despite the extensive use and availability of various anti-malware tools such as anti-viruses, intrusion detection systems, firewalls, etc., malware authors can readily evade these precautions by using obfuscation techniques. To mitigate this problem, malware researchers have proposed various data mining and machine learning approaches for detecting and classifying malware samples according to their static or dynamic feature sets. Although the proposed methods are effective on small sample sets, their scalability to large datasets is in question. Moreover, it is a well-known fact that the majority of malware consists of variants of previously known samples. Consequently, the volume of new variants created far outpaces the current capacity of malware analysis. Developing malware classification that can cope with the increasing number of malware samples is therefore essential for the security community. The key challenge in identifying the family of malware is to achieve a balance between the increasing number of samples and classification accuracy. To overcome this limitation, unlike existing classification schemes which apply machine learning algorithms to stored data (i.e., they are off-line), we propose a new malware classification system employing online machine learning algorithms that can provide an instantaneous update about a new malware sample following its introduction to the classification scheme. To achieve our goal, we first developed a portable, scalable and transparent malware analysis system called VirMon for dynamic analysis of malware targeting Windows OS. VirMon collects the behavioral activities of analyzed samples at the low kernel level through the mini-filter driver developed for it. Secondly, we set up a cluster of five machines for our online learning framework module (i.e., Jubatus), which makes it possible to handle large-scale data. This configuration allows each analysis machine to perform its tasks and deliver the obtained results to the cluster manager. Essentially, the proposed framework consists of three major stages. The first stage consists in extracting the behavior of the sample file under scrutiny and observing its interactions with the OS resources. At this stage, the sample file is run in a sandboxed environment. Our framework supports two sandbox environments: VirMon and Cuckoo. During the second stage, we apply feature extraction to the analysis report. The label of each sample is determined by using Virustotal, an online multiple anti-virus scanner framework consisting of 46 engines. Then, at the final stage, the malware dataset is partitioned into training and testing sets. The training set is used to obtain a classification model and the testing set is used for evaluation purposes. To validate the effectiveness and scalability of our method, we evaluated it on 18,000 recent malicious files including viruses, trojans, backdoors, worms, etc., obtained from VirusShare, and our experimental results show that our method performs malware classification with 92% accuracy.
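The online-learning idea in the abstract above (updating the classifier as each new sample arrives rather than retraining on stored data) can be illustrated with a minimal sketch. This is not the thesis's implementation and does not use the Jubatus API: it swaps in a plain multi-class perceptron, and the class name OnlinePerceptron, the feature names and the family labels are all hypothetical.

```python
class OnlinePerceptron:
    """Multi-class perceptron updated one sample at a time (illustrative only)."""

    def __init__(self):
        # label -> {feature name -> weight}
        self.weights = {}

    def score(self, label, features):
        w = self.weights[label]
        return sum(w.get(name, 0.0) * value for name, value in features.items())

    def predict(self, features):
        if not self.weights:
            return None  # nothing learned yet
        # sorted() makes tie-breaking deterministic
        return max(sorted(self.weights), key=lambda lbl: self.score(lbl, features))

    def update(self, features, label):
        """Predict first, then correct the weights if the prediction was wrong."""
        predicted = self.predict(features)
        self.weights.setdefault(label, {})
        if predicted != label:
            for name, value in features.items():
                self.weights[label][name] = self.weights[label].get(name, 0.0) + value
                if predicted is not None:
                    wrong = self.weights[predicted]
                    wrong[name] = wrong.get(name, 0.0) - value
        return predicted


# Hypothetical behavioral features (API-call counts) and family labels.
stream = [
    ({"CreateRemoteThread": 1, "WriteProcessMemory": 1}, "trojan"),
    ({"CryptEncrypt": 1, "DeleteFile": 1}, "ransomware"),
]
model = OnlinePerceptron()
for features, family in stream:
    model.update(features, family)
```

Each call to update touches only the weights of the features present in the incoming sample, which is what allows the model to absorb new samples without revisiting earlier ones.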
Lemay, Frédérick. "Instrumentation optimisée de code pour prévenir l'exécution de code malicieux." Thesis, Université Laval, 2012. http://www.theses.ulaval.ca/2012/29030/29030.pdf.
Khoury, Raphaël. "Détection du code malicieux : système de type à effets et instrumentation du code." Thesis, Université Laval, 2005. http://www.theses.ulaval.ca/2005/23250/23250.pdf.
The purpose of this thesis is twofold. First, it presents a comparative study of the advantages and drawbacks of several approaches to ensuring software safety and security. It then focuses more particularly on combining static analyses and dynamic monitoring in order to produce a more powerful security architecture. The first chapters of the thesis present an analytical review of the various static, dynamic and hybrid approaches that can be used to secure potentially malicious code. The advantages and drawbacks of each approach are analyzed and the range of security properties that it can enforce is identified. The thesis then focuses on the possibility of combining static and dynamic analysis through a new hybrid approach. This approach consists in a code instrumentation that alters only those parts of a program where it is necessary to do so to ensure compliance with a user-defined security policy expressed as a set of modal μ-calculus properties. This instrumentation is guided by a static analysis based on a type and effect system. The effects represent the accesses made to pretested system resources.
Palisse, Aurélien. "Analyse et détection de logiciels de rançon." Thesis, Rennes 1, 2019. http://www.theses.fr/2019REN1S003/document.
This PhD thesis takes a look at ransomware, presents an autonomous malware analysis platform and proposes countermeasures against these types of attacks. Our countermeasures operate in real time and are deployed on end hosts. In 2013, ransomware became a hot subject of discussion again, before becoming one of the biggest cyberthreats at the beginning of 2015. A detailed state of the art of existing countermeasures is included in this thesis. This state of the art helps evaluate the contribution of this thesis with regard to existing publications. We also present an autonomous malware analysis platform composed of bare-metal machines. Our aim is to avoid altering the behaviour of analysed samples. A first countermeasure based on the use of a cryptographic library is proposed; however, it can easily be bypassed. This is why we propose a second, generic and agnostic countermeasure. This time, indicators of compromise are used to analyse the behaviour of processes on the file system. We explain how we configured this countermeasure empirically to make it usable and effective. One of the challenges of this thesis is to reconcile performance, detection rate and a small number of false positives. Finally, results from a user experiment are presented. This experiment analyses the user's behaviour when faced with a threat. In the final part, I propose ways to enhance our contributions as well as other avenues that could be explored.
Lespérance, Pierre-Luc. "Détection des variations d'attaques à l'aide d'une logique temporelle." Thesis, Université Laval, 2006. http://www.theses.ulaval.ca/2006/23481/23481.pdf.
Beaucamps, Philippe. "Analyse de Programmes Malveillants par Abstraction de Comportements." Phd thesis, Institut National Polytechnique de Lorraine - INPL, 2011. http://tel.archives-ouvertes.fr/tel-00646395.
Ta, Thanh Dinh. "Modèle de protection contre les codes malveillants dans un environnement distribué." Thesis, Université de Lorraine, 2015. http://www.theses.fr/2015LORR0040/document.
The thesis consists of two principal parts: the first discusses message format extraction and the second discusses the behavioral obfuscation of malware and its detection. In the first part, we study the problems of “binary code coverage” and “input message format extraction”. For the first problem, we propose a new technique based on “smart” dynamic tainting analysis and reverse execution. For the second, we propose a new method that classifies input message values by the execution traces obtained by executing the program with these input values. In the second part, we propose an abstract model for system call interactions between malware and the operating system at a host. We show that, in many cases, the behaviors of a malicious program can imitate those of a benign program, and in these cases a behavioral detector cannot distinguish between the two programs.
Lacasse, Alexandre. "Approche algébrique pour la prévention d'intrusions." Thesis, Université Laval, 2006. http://www.theses.ulaval.ca/2006/23379/23379.pdf.
Lebel, Bernard. "Analyse de maliciels sur Android par l'analyse de la mémoire vive." Master's thesis, Université Laval, 2018. http://hdl.handle.net/20.500.11794/29851.
Mobile devices are at the core of modern society. Their versatility has allowed third-party developers to generate a rich experience for the user through mobile apps of all types (e.g. productivity, games, communications). As mobile platforms have become connected devices that gather nearly all of our personal and professional information, they are seen as a lucrative market by malware developers. Android is an open-source operating system from Google specifically targeting the mobile market, and it has been targeted by malicious activity due to its widespread adoption by consumers. As Android malware threatens many consumers, it is essential that research in malware analysis specifically address this mobile platform. The work conducted during this Master’s focuses on the analysis of malware on the Android platform. This was achieved through a literature review of current malware trends and the static and dynamic analysis approaches that exist to mitigate them. It was also proposed to explore live memory forensics applied to the analysis of malware as a complement to existing methods. To demonstrate the applicability of the approach and its relevance to Android malware, a case study was proposed in which an experimental malware was designed to exhibit malicious behaviours that are difficult to detect through current methods. The approach explored is called differential live memory analysis. It consists of analyzing the difference in the content of the live memory before and after the deployment of the malware. The results of the study have shown that this approach is promising and should be explored in future studies as a complement to current approaches.
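The before/after comparison that the abstract above calls differential live memory analysis can be sketched in a few lines. This is only an illustration under strong assumptions, not the thesis's tooling: each snapshot is assumed to have already been reduced to a dictionary mapping a process name to the set of artifacts observed in its memory, and the function name diff_snapshots and all sample data are hypothetical.

```python
def diff_snapshots(before, after):
    """Compare two memory snapshots given as {process name: set of artifacts}.

    Returns three dictionaries: processes that appeared, processes that
    disappeared, and processes whose artifact sets changed (with the
    artifacts gained and lost).
    """
    added = {name: after[name] for name in after.keys() - before.keys()}
    removed = {name: before[name] for name in before.keys() - after.keys()}
    changed = {}
    for name in before.keys() & after.keys():
        gained = after[name] - before[name]
        lost = before[name] - after[name]
        if gained or lost:
            changed[name] = {"gained": gained, "lost": lost}
    return added, removed, changed


# Hypothetical snapshots taken before and after installing a sample.
snapshot_before = {
    "system_server": {"libandroid.so"},
    "com.example.notes": {"libsqlite.so"},
}
snapshot_after = {
    "system_server": {"libandroid.so", "libhook.so"},
    "com.example.notes": {"libsqlite.so"},
    "com.example.dropper": {"payload.dex"},
}
added, removed, changed = diff_snapshots(snapshot_before, snapshot_after)
```

In this toy data the difference isolates one new process and one library injected into an existing process, while the unchanged process drops out of the report.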
RAMES, ERIC. "Sur la reutilisation de composants logiciels : classification et recherche." Toulouse 3, 1991. http://www.theses.fr/1991TOU30098.
Nisi, Dario. "Unveiling and mitigating common pitfalls in malware analysis." Electronic Thesis or Diss., Sorbonne université, 2021. http://www.theses.fr/2021SORUS528.
As the importance of computer systems in modern-day societies grows, so does the damage that malicious software causes. The security industry and malware authors are engaged in an arms race, in which the former creates better detection systems while the latter try to evade them. In fact, any wrong assumption (no matter how subtle) in the design of an anti-malware tool may create new avenues for evading detection. This thesis focuses on two often overlooked aspects of modern malware analysis techniques: the use of API-level information to encode malicious behavior and the reimplementation of parsing routines for executable file formats in security-oriented tools. We show that taking advantage of these practices is possible on a large and automated scale. Moreover, we study the feasibility of fixing these problems at their roots, measuring the difficulties that anti-malware architects may encounter and providing strategies to solve them.
El, Hatib Souad. "Une approche sémantique de détection de maliciel Android basée sur la vérification de modèles et l'apprentissage automatique." Master's thesis, Université Laval, 2020. http://hdl.handle.net/20.500.11794/66322.
The ever-increasing number of Android malware samples is accompanied by a deep concern about security issues in the mobile ecosystem. Unquestionably, Android malware detection has received much attention in the research community and has therefore become a crucial aspect of software security. Malware proliferation goes hand in hand with the sophistication and complexity of malware. To illustrate, more elaborate malware, such as polymorphic and metamorphic malware, makes use of code obfuscation techniques to build new variants that preserve the semantics of the original code but modify its syntax and thus escape the usual detection methods. In the present work, we propose a model-checking-based approach that combines static analysis and machine learning. From a given Android application we extract an abstract model expressed in LNT, a process algebra language. Afterwards, security-related Android behaviours specified by temporal logic formulas are checked against this model, and the satisfaction of a specific formula is considered a feature. Finally, machine learning algorithms are used to classify the application as malicious or not.
Rogouschi, Nicoleta. "Classification à base de modèles de mélanges topologiques des données catégorielles et continues." Paris 13, 2009. http://www.theses.fr/2009PA132015.
The research presented in this thesis concerns the development of self-organising map approaches based on mixture models which deal with different kinds of data: qualitative, mixed and sequential. For each type of data we propose an adapted unsupervised learning model. The first model described in this work is BeSOM (Bernoulli Self-Organizing Map), a new topological map learning algorithm dedicated to binary data. Each map cell is associated with a Bernoulli distribution. In this model, learning aims to estimate the density function, presented as a mixture of densities. Each density is itself a mixture of Bernoulli distributions defined on a neighbourhood. The second model addresses the problem of probabilistic approaches to the clustering of mixed data (quantitative and qualitative). The model is inspired by previous works which define a distribution by a mixture of Bernoulli and Gaussian distributions. This approach gives a different dimension to the topological map: it allows a probabilistic interpretation of the map and offers the possibility of taking advantage of the local distributions associated with continuous and categorical variables. The third model presented in this thesis is a new Markov mixture model applied to the treatment of data structured in sequences. The approach that we propose is a generalisation of traditional Markov chains. There are two versions: the global approach, where topology is used implicitly, and the local approach, where topology is used explicitly. The results obtained from the validation of all the methods are encouraging and promising, both for classification and modelling.
Grozavu, Nistor. "Classification topologique pondérée : approches modulaires, hybrides et collaboratives." Paris 13, 2009. http://www.theses.fr/2009PA132022.
This thesis focuses, on the one hand, on the study of clustering analysis approaches in unsupervised topological learning and, on the other hand, on modular, hybrid and collaborative topological clustering. The study mainly addresses two problems: (1) cluster characterization using weighting and selection of relevant variables, and the use of the memory concept during the unsupervised topological learning process; and (2) ensemble clustering techniques: modularization, hybridization and collaboration. We are particularly interested in Kohonen's self-organizing maps, which have been widely used for unsupervised classification and visualization of multidimensional datasets. We offer several weighting approaches and a new strategy which consists in introducing a memory process into the competition phase by calculating a voting matrix at each learning iteration. Using a statistical test for selecting relevant variables, we address both the problem of dimensionality reduction and the problem of cluster characterization. For the second problem, we use the relational analysis (RA) approach to combine multiple topological clustering results.
Fortuner, Renaud. "Variabilité et identification des espèces chez les nématodes du genre Helicotylenchus." Lyon 1, 1986. http://www.theses.fr/1986LYO19023.
Piegay, Emmanuel. "Groupement, multirésolution, prétopologie : analogies entre la segmentation d'images et la classification automatique." Lyon, INSA, 1997. http://www.theses.fr/1997ISAL0119.
Clustering (or unsupervised learning) and picture segmentation are usually confined to different application contexts and handle data of a different nature. Although they have historically developed their own methodologies and tools, the commonly accepted idea that similarities exist between these research areas suggests that establishing strong ties between them can contribute to their mutual enrichment. To begin with, the following two steps contribute to this advance. The first is a methodological transfer: we introduce the concept of multiresolution to the clustering field. The second leads us toward a unified vision of the two fields and proposes a "generic" method of grouping by spreading. These two ties between picture segmentation and clustering are built on pretopology, a mathematical model with a weak axiomatization which allows adaptation to each of these fields. We then exploit these ties. This leads us to propose, on the one hand, a picture segmentation method based on catchment basin detection which has been made robust against noise and is intrinsically parallelized and, on the other hand, an original and powerful hierarchical clustering method, based on the association of hierarchy level and resolution level, completed by a primal sketch approach from which we define a notion of cluster significance.
Cellier, Peggy Ducassé Mireille Ridoux Olivier. "DeLLIS débogage de programmes par localisation de fautes avec un système d'information logique /." Rennes : [s.n.], 2008. ftp://ftp.irisa.fr/techreports/theses/2008/cellier.pdf.
Cellier, Peggy. "DeLLIS : débogage de programmes par localisation de fautes avec un système d’information logique." Rennes 1, 2008. ftp://ftp.irisa.fr/techreports/theses/2008/cellier.pdf.
When testing a program, some executions can fail. Fault localization gives clues to locate the faults that cause those failures. The first contribution of this thesis is a new data structure for fault localization: a lattice that contains information from execution traces. The lattice is computed thanks to the combination of association rules and formal concept analysis, two data mining techniques. The lattice computes all differences between execution traces and, at the same time, gives a partial ordering on those differences. Unlike existing work, the method takes into account the dependencies between elements of the traces thanks to the lattice. The second contribution of this thesis is an algorithm that traverses the lattice in order to locate several faults in one pass of a test suite of the program. Experiments show that while the method takes into account multiple faults, it is not penalized, compared to existing work, when the program contains only one fault (in terms of number of lines to inspect). In addition, the study of the impact on the method of the dependencies between faults shows that in three out of the four identified cases of dependency the faults can be located. The third contribution is an algorithm to compute association rules. The particularity of that algorithm is that it can take into account taxonomies, such as the hierarchy of the abstract syntax tree, without redundancy. It is used to generate association rules to build the lattice for fault localization.
Contat, Marc. "Etude de stratégies d'allocation de ressources et de fusion de données dans un système multi-capteurs pour la classification et la reconnaissance de cibles aériennes." Paris 11, 2002. http://www.theses.fr/2002PA112310.
The need to obtain a faithful and increasingly precise operational picture in military or civil applications has led to the rapid development of multisensor systems. The multiplication of measurement means and the increase in their performance have reinforced the need to improve information acquisition strategies. Collaboration between the sensors of the same system would make it possible to increase overall effectiveness by refining decision-making mechanisms according to the characteristics of the sensors and the environment. In addition, resource allocation handles the problem of matching the available physical resources, which have limited capacity, to the volume of information to be processed. The object of this thesis is to study strategies of resource allocation and data fusion for the simultaneous tracking and identification of several targets. Accordingly, the classification and recognition of objects must be carried out as soon as possible, without waiting for all the data that the sensors can provide. A mechanism is proposed to rank the attributes to be determined first and thus to select the sensor and its mode. It manages the requests addressed to the sensors according to a priori information and to the data received from preceding measurements. Initially modular in form, the algorithm was then improved to take contextual information into account when choosing the sensor modes to be used, so that they are used optimally. The results were validated with software developed for a non-cooperative target recognition application in air surveillance.
Jaziri, Rakia. "Modèles de mélanges topologiques pour la classification de données structurées en séquences." Paris 13, 2013. http://scbd-sto.univ-paris13.fr/secure/edgalilee_th_2013_jaziri.pdf.
Recent years have seen the development of data mining techniques in various application areas, with the purpose of analyzing sequential, large and complex data. In this work, the problem of clustering, visualizing and structuring data is tackled by a three-stage proposal. The first proposal presents a generative approach to learn a new probabilistic Self-Organizing Map (PrSOMS) for non-independent and non-identically distributed data sets. Our model defines a low-dimensional manifold allowing friendly visualizations. To yield topology-preserving maps, our model exhibits SOM-like learning behavior with the advantages of probabilistic models. This new paradigm uses the HMM (Hidden Markov Model) formalism and introduces relationships between the states. This allows us to take advantage of all the known classical views associated with topographic maps. The second proposal concerns a hierarchical extension of the PrSOMS approach. This approach deals with the complex aspect of the data in the classification process. We find that the resulting model, "H-PrSOMS", provides good interpretability of the classes built. The third proposal concerns an alternative statistical topological approach, MGTM-TT, which is based on the same paradigm as HMM. It is a generative topographic model of observation density mixtures, which is similar to a hierarchical extension of the temporal GTM model. These proposals were then applied to test data and to real data from the INA (National Audiovisual Institute). As a first step, this work provides a finer classification of audiovisual broadcast segments. As a second step, we sought to define a typology of the chaining of segments (multiple scattering of the same program, one of two inter-program) to provide statistically the characteristics of broadcast segments. The overall framework provides a tool for the classification and structuring of audiovisual programs.
Denoue, Laurent. "De la création à la capitalisation des annotations dans une espace personnel d'informations." Chambéry, 2000. http://www.theses.fr/2000CHAMS017.
Gros, Damien. "Protection obligatoire répartie : usage pour le calcul intensif et les postes de travail." Thesis, Orléans, 2014. http://www.theses.fr/2014ORLE2017/document.
This thesis deals with two major issues in the computer security field. The first is enhancing the security of Linux systems for scientific computation; the second is the protection of Windows workstations. In order to strengthen security and measure performance, we offer a common method for the distributed observation of system calls. It relies on reference monitors to ensure confidentiality and integrity. Our solution uses specific high performance computing technologies to lower the communication latencies between the SELinux and PIGA monitors. Benchmarks study the integration of these distributed monitors into scientific computation. Regarding workstation security, we propose a new reference monitor implementing state-of-the-art protection models from Linux and simplifying administration. We present how to use our monitor to analyze the behavior of malware. This analysis enables advanced protection that prevents attack scenarios in an optimistic manner. Thus, security is enforced while allowing legitimate activities.
Mahé, Serge-André. "La programmation typologique." Montpellier 2, 1992. http://www.theses.fr/1992MON20024.
Chzhen, Evgenii. "Plug-in methods in classification." Thesis, Paris Est, 2019. http://www.theses.fr/2019PESC2027/document.
This manuscript studies several problems of constrained classification. In this framework, our goal is to construct an algorithm which performs as well as the best classifier that obeys some desired property. Plug-in type classifiers are well suited to achieve this goal. Interestingly, it is shown that in several setups these classifiers can leverage unlabeled data, that is, they are constructed in a semi-supervised manner. Chapter 2 describes two particular settings of binary classification: classification with F-score and classification of equal opportunity. For both problems semi-supervised procedures are proposed and their theoretical properties are established. In the case of the F-score, the proposed procedure is shown to be optimal in the minimax sense over a standard non-parametric class of distributions. In the case of the classification of equal opportunity, the proposed algorithm is shown to be consistent in terms of the misclassification risk and its asymptotic fairness is established. Moreover, for this problem, the proposed procedure outperforms state-of-the-art algorithms in the field. Chapter 3 describes the setup of confidence set multi-class classification. Again, a semi-supervised procedure is proposed and its nearly minimax optimality is established. It is additionally shown that no supervised algorithm can achieve a so-called fast rate of convergence. In contrast, the proposed semi-supervised procedure can achieve fast rates provided that the size of the unlabeled data is sufficiently large. Chapter 4 describes a setup of multi-label classification where one aims at minimizing false negative error subject to almost sure type constraints. In this part two specific constraints are considered: sparse predictions and predictions with control over false negative errors. For the former, a supervised algorithm is provided and it is shown that this algorithm can achieve fast rates of convergence. For the latter, it is shown that extra assumptions are necessary in order to obtain theoretical guarantees.
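To make the plug-in idea for the F-score concrete, here is a minimal sketch, which is a simplification and not the procedure analysed in the thesis. It assumes that an estimate of P(Y=1|x) has already been fitted on labeled data and evaluated on unlabeled points, then picks the threshold that maximizes a plug-in estimate of F1 = 2·TP / (actual positives + predicted positives), where the expected counts are computed from the estimated probabilities alone. The function name plugin_f1_threshold and the sample values are hypothetical.

```python
def plugin_f1_threshold(eta_hat):
    """Choose a threshold on estimated probabilities using unlabeled data only.

    eta_hat: estimated P(Y=1 | x) for each unlabeled point.
    Returns (threshold, plug-in F1 estimate); a point is predicted positive
    when its estimated probability is >= threshold.
    """
    expected_positives = sum(eta_hat)  # estimate of the number of actual positives
    best_theta, best_f1 = 0.0, 0.0
    for theta in sorted(set(eta_hat)):  # candidate thresholds: the observed values
        selected = [p for p in eta_hat if p >= theta]
        expected_true_positives = sum(selected)
        f1 = 2 * expected_true_positives / (expected_positives + len(selected))
        if f1 > best_f1:
            best_theta, best_f1 = theta, f1
    return best_theta, best_f1


# Hypothetical estimated probabilities on four unlabeled points.
theta, f1_estimate = plugin_f1_threshold([0.9, 0.8, 0.1, 0.1])
```

No labels are needed at this stage, which is the sense in which unlabeled data can help: the threshold depends only on the distribution of the estimated probabilities.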
Guilment, Thomas. "Classification de vocalises de mammifères marins en environnement sismique." Thesis, Ecole nationale supérieure Mines-Télécom Atlantique Bretagne Pays de la Loire, 2018. http://www.theses.fr/2018IMTA0080/document.
In partnership with Sercel, the thesis concerns the implementation of algorithms for recognizing the sounds emitted by mysticetes (baleen whales). These sounds can be studied using passive acoustic monitoring systems. Sercel, through its seismic activities related to oil exploration, has its own software to detect and locate underwater sound energy sources. The thesis work therefore consists in adding a recognition module to identify whether the detected and localized energy corresponds to a possible mysticete. Since seismic shooting campaigns are expensive, the method used must be able to reduce the probability of false alarms, as recognition can invalidate detection. The proposed method is based on dictionary learning. It is dynamic, modular, depends on few parameters and is robust to false alarms. An experiment on five types of vocalizations is presented. We obtain an average recall of 92.1% while rejecting 97.3% of the noises (persistent and transient). In addition, a confidence coefficient is associated with each recognition and allows semi-supervised incremental learning to be achieved. Finally, we propose a method capable of managing detection and recognition together. This "multiclass detector" best respects the constraints of false alarm management and allows several types of vocalizations to be identified at the same time. This method is well adapted to the industrial context for which it is intended. It also opens up very promising prospects in the bioacoustic context.
Risch, Jean-Charles. "Enrichissement des Modèles de Classification de Textes Représentés par des Concepts." Thesis, Reims, 2017. http://www.theses.fr/2017REIMS012/document.
Most text-classification methods use the "bag of words" paradigm to represent texts. However, Bloehdorn and Hotho have identified four limits to this representation: (1) some words are polysemous, (2) others can be synonyms and yet are treated as distinct in the analysis, (3) some words are strongly semantically linked without this link being taken into account in the representation, and (4) certain words lose their meaning if they are extracted from their nominal group. To overcome these problems, some methods no longer represent texts with words but with concepts extracted from a domain ontology (bag of concepts), integrating the notion of meaning into the model. Models integrating the bag of concepts remain less widely used because of their unsatisfactory results; several methods have therefore been proposed to enrich text features using new concepts extracted from knowledge bases. My work follows these approaches by proposing a model-enrichment step using a domain ontology. I proposed two measures to estimate how strongly these new concepts belong to the categories. Using the naive Bayes classifier algorithm, I tested and compared my contributions on the Ohsumed corpus using the domain ontology "Disease Ontology". The satisfactory results led me to analyse more precisely the role of semantic relations in the enrichment step. This further work was the subject of a second experiment in which we evaluate the contributions of the hierarchical relations of hypernymy and hyponymy.
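To make the difference between the two representations concrete, here is a minimal sketch that contrasts a bag of words with a bag of concepts. The lookup table `TOY_ONTOLOGY` and both function names are hypothetical; a real system would query a domain ontology such as the one named in the abstract rather than a hand-written dictionary.

```python
from collections import Counter

# Hypothetical word-to-concept lookup; a real system would query a domain ontology.
TOY_ONTOLOGY = {
    "tumor": "neoplasm",
    "tumour": "neoplasm",
    "neoplasm": "neoplasm",
    "heart": "cardiovascular_system",
    "cardiac": "cardiovascular_system",
}


def bag_of_words(text):
    """Count lowercase whitespace-separated tokens."""
    return Counter(text.lower().split())


def bag_of_concepts(text, ontology=TOY_ONTOLOGY):
    """Count ontology concepts; tokens that map to no concept are dropped."""
    concepts = (ontology.get(token) for token in text.lower().split())
    return Counter(c for c in concepts if c is not None)
```

With this toy mapping, synonyms such as "tumor" and "neoplasm" fall into the same feature, which addresses limit (2) in the list above. Polysemy and multi-word terms would need more than a word-level lookup.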
Wacquet, Guillaume. "Classification spectrale semi-supervisée : Application à la supervision de l'écosystème marin." Thesis, Littoral, 2011. http://www.theses.fr/2011DUNK0389/document.
In decision support systems, there is often a large amount of digital data and possibly some contextual knowledge, available a priori or provided a posteriori by feedback. The performance of classification approaches, particularly spectral ones, depends on the integration of domain knowledge in their design. Spectral classification algorithms address the problem of classification in terms of graph cuts. They classify the data in the eigenspace of the graph Laplacian matrix. The generated eigenspace may better reveal the presence of linearly separable data clusters. In this work, we are particularly interested in algorithms integrating pairwise constraints: constrained spectral clustering. The eigenspace may reveal the data structure while respecting the constraints. We present the state of the art of approaches to constrained spectral clustering. We propose a new algorithm, which generates a subspace projection by optimizing a criterion integrating both the normalized multicut and penalties due to the constraints. The performance of the algorithms is demonstrated on different databases in comparison with other algorithms in the literature. As part of the monitoring of the marine ecosystem, we developed a phytoplankton classification system based on flow cytometric analysis. For this purpose, we proposed to characterize the phytoplanktonic cells by similarity measures using elastic comparison between their cytogram signals.
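The kind of criterion described above can be sketched as a function that scores a given labeling: the normalized multicut of the partition plus a penalty for each violated pairwise constraint. The exact form of the penalty used in the thesis is not given in the abstract, so the additive count weighted by `gamma` is an assumption, as are the function and parameter names. The sketch only evaluates the criterion; it does not perform the eigendecomposition or the subspace projection.

```python
def constrained_multicut(weights, labels, must_link=(), cannot_link=(), gamma=1.0):
    """Normalized multicut of a labeling plus a penalty per violated constraint.

    weights: symmetric adjacency matrix (list of lists), non-negative entries.
    labels: cluster index for each node.
    gamma: penalty weight (hypothetical parameter, not taken from the thesis).
    """
    n = len(weights)
    degrees = [sum(row) for row in weights]
    score = 0.0
    for cluster in set(labels):
        members = [i for i in range(n) if labels[i] == cluster]
        volume = sum(degrees[i] for i in members)
        # Total weight of edges leaving the cluster
        cut = sum(weights[i][j] for i in members
                  for j in range(n) if labels[j] != cluster)
        if volume > 0:
            score += cut / volume
    violated = sum(1 for i, j in must_link if labels[i] != labels[j])
    violated += sum(1 for i, j in cannot_link if labels[i] == labels[j])
    return score + gamma * violated
```

A labeling that cuts only weak edges and respects every must-link and cannot-link pair gets a low score, so minimizing this quantity favors partitions that agree with both the graph structure and the prior knowledge.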
Grosser, David. "Construction itérative de bases de connaissances descriptives et classificatoires avec la plate-forme à objets IKBS : application à la systèmatique des coraux des Mascareignes." La Réunion, 2002. http://tel.archives-ouvertes.fr/tel-00003415/fr/.
Douar, Brahim. "Fouille de sous-graphes fréquents à base d'arc consistance." Thesis, Montpellier 2, 2012. http://www.theses.fr/2012MON20108/document.
With the growing need to analyze large amounts of structured data such as chemical compounds, protein structures and social networks, to cite but a few, graph mining has become an attractive track and a real challenge in the data mining field. Because of the NP-completeness of the subgraph isomorphism test as well as the huge search space, frequent subgraph miners are exponential in runtime and/or memory use. In order to alleviate the complexity issue, existing subgraph miners have explored techniques based on the minimal support threshold, the description language of the examples (only supporting paths, trees, etc.) or hypotheses (search for shared trees or common paths, etc.). In this thesis, we use a new projection operator, named AC-projection, which exhibits nice complexity properties as opposed to the graph isomorphism operator. This operator comes from the constraint programming field and has the advantage of polynomial complexity. We propose two frequent subgraph mining algorithms based on this operator. The first one, named FGMAC, follows a breadth-first order to find frequent subgraphs and takes advantage of the well-known Apriori levelwise strategy. The second, AC-miner, is a pattern-growth approach that follows a depth-first search space exploration strategy and uses powerful pruning techniques in order to considerably reduce this search space. These two approaches extract a set of particular subgraphs named AC-reduced frequent subgraphs. As a first step, we studied the search space for discovering such frequent subgraphs and proved that it is smaller than the search space of frequent isomorphic subgraphs. Then, we carried out experiments to show that FGMAC and AC-miner are more efficient than the state-of-the-art algorithms. At the same time, we studied the relevance for classification of frequent AC-reduced subgraphs, which are much fewer than isomorphic ones, and we conclude that an important performance gain can be achieved with no loss, or no significant loss, in the quality of the discovered patterns.
Chebaro, Omar. "Classification de menaces d’erreurs par analyse statique, simplification syntaxique et test structurel de programmes." Thesis, Besançon, 2011. http://www.theses.fr/2011BESA2021/document.
Software validation remains a crucial part of the software development process. Two major techniques have improved in recent years: dynamic and static analysis. They have complementary strengths and weaknesses. We present in this thesis an original combination of these methods to make the search for runtime errors more accurate and automatic and to reduce the number of false alarms. We also prove the correctness of the method. In this combination, static analysis reports alarms of runtime errors, some of which may be false alarms, and test generation is used to confirm or reject these alarms. When applied to large programs, test generation may run out of time or space before confirming certain alarms as real bugs or finding that some alarms are unreachable. To overcome this problem, we propose to reduce the source code by program slicing before running test generation. Program slicing transforms a program into another, simpler program, which is equivalent to the original program with respect to a given criterion. Four usages of program slicing were studied. The first usage is called all. It applies the slicing only once; the simplification criterion is the set of all alarms in the program. The disadvantage of this usage is that test generation may run out of time or space, and alarms that are easier to classify are penalized by the analysis of other, more complex alarms. In the second usage, called each, program slicing is performed with respect to each alarm separately. However, test generation is executed for each sliced program and there is a risk of redundancy if some alarms are included in many slices. To overcome these drawbacks, we studied dependencies between alarms, on which we base two advanced usages of program slicing: min and smart. In the min usage, the slicing is performed with respect to subsets of alarms. These subsets are selected based on dependencies between alarms, and the union of these subsets covers the whole set of alarms. With this usage, we analyze fewer slices than with each, and simpler slices than with all. However, the dynamic analysis of some slices may run out of time or space before classifying some alarms, while the dynamic analysis of a simpler slice could possibly classify some. The smart usage applies the previous usage iteratively, reducing the size of the subsets when necessary. When an alarm cannot be classified by the dynamic analysis of a slice, simpler slices are calculated. This work is implemented in sante, our tool that combines the test generation tool PathCrawler and the static analysis platform Frama-C. Experiments have shown, firstly, that our combination is more effective than each technique used separately and, secondly, that the verification is faster after reducing the code with program slicing. Simplifying the program by program slicing also makes the detected errors and the remaining alarms easier to analyze.
Lefrère, Laurent. "Contribution au développement d'outils pour l'analyse automatique de documents cartographiques." Rouen, 1993. http://www.theses.fr/1993ROUES045.
Petitjean, François. "Dynamic time warping : apports théoriques pour l'analyse de données temporelles : application à la classification de séries temporelles d'images satellites." Thesis, Strasbourg, 2012. http://www.theses.fr/2012STRAD023.
Satellite Image Time Series are becoming increasingly available and will continue to do so in the coming years thanks to the launch of space missions that aim at providing coverage of the Earth every few days with high spatial resolution (ESA’s Sentinel program). In the case of optical imagery, it will be possible to produce land use and cover change maps with detailed nomenclatures. However, due to meteorological phenomena such as clouds, these time series will become irregular in terms of temporal sampling. In order to consistently handle the huge amount of information that will be produced (for instance, Sentinel-2 will cover the entire Earth’s surface every five days, with 10m to 60m spatial resolution and 13 spectral bands), new methods have to be developed. This Ph.D. thesis focuses on the “Dynamic Time Warping” similarity measure, which is able to make the most of the temporal structure of the data, in order to provide an efficient and relevant analysis of the remotely observed phenomena.
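For readers unfamiliar with Dynamic Time Warping, the textbook dynamic-programming formulation can be sketched as follows. This is the classic algorithm on scalar sequences, not the thesis's own contribution; the function name `dtw_distance` and the use of the absolute difference as local cost are assumptions made for the sketch.

```python
def dtw_distance(a, b):
    """Classic dynamic-programming DTW between two numeric sequences.

    Uses the absolute difference as local cost (an assumption for this sketch).
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("sequences must be non-empty")
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments
            cost[i][j] = local + min(cost[i - 1][j],
                                     cost[i][j - 1],
                                     cost[i - 1][j - 1])
    return cost[n][m]
```

Because one element may be aligned with several elements of the other sequence, the two inputs need not have the same length, which is why the measure suits series with irregular temporal sampling such as cloud-affected acquisitions.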
Picarougne, Fabien. "Recherche d'information sur Internet par algorithmes évolutionnaires." Phd thesis, Tours, 2004. http://tel.archives-ouvertes.fr/tel-00008013.
Njomgue, Sado Wilfried. "Indexation des documents dans un référentiel métier avec approche ontologique : Le système MAID au sein de l'Intranet de Suez-Environnement." Compiègne, 2005. http://www.theses.fr/2005COMP1572.
This work presents an automatic method of indexing documents based on semantic, linguistic and finally statistical approaches. The MAID system (Multi-Approach for the Indexing of Documents) applies these approaches successively: a semantic analysis that annotates the document using a water-domain ontology, a linguistic analysis that extracts significant terms, and a statistical analysis by singular value decomposition of the words composing the document. Here, term weights are set to take advantage of both their position relative to other terms (co-occurrence) and their local and global context. We also highlight the contribution of semantics compared to the linguistic-statistical approach. MAID was developed in order to suggest topic assignments for documents within a reference repository. Finally, we present experimental results (with or without semantic treatment) and an evaluation carried out on documents of Suez-Environnement.
Achouri, Anouar. "Contribution à l'évaluation des technologies CPL bas débit dans l'environnement domestique." Thesis, Tours, 2015. http://www.theses.fr/2015TOUR4013/document.
The Smart Grid is an important part of the third technological revolution. The end customer is now able to improve the efficiency of their energy consumption by controlling domestic appliances. Narrowband power line communication (PLC) protocols have been adopted by many international utilities and DSOs to control the power distribution grid. In this thesis, we propose to use these protocols for domestic electrical grid management. To assess the performance of narrowband PLC systems in the domestic environment, we carried out two measurement campaigns in many houses. The first campaign is dedicated to the domestic PLC channel response in the band [9kHz-500kHz]. The measurements are classified into 5 classes according to their transmission capacities. To model the channel measurements, a modeling approach based on FIR filters is adopted. The second measurement campaign aims to characterize and reproduce the domestic PLC noise in the band [9kHz-500kHz]. The measurements are classified into stationary noise, periodic noise and aperiodic noise. Some examples of noise generation are proposed for each form of noise.
Boughanem, Mohand. "Les systèmes de recherche d'informations d'un modèle classique à un modèle connexioniste." Toulouse 3, 1992. http://www.theses.fr/1992TOU30222.
Dilmahomed, Bocus Sadeck. "Test sans contact des circuits intégrés CMOS : observabilité et contrôlabilité du Latchup par microscopie électronique à balayage et microscopie à émission." Montpellier 2, 1992. http://www.theses.fr/1992MON20079.
Hamdan, Hussam. "Sentiment analysis in social media." Thesis, Aix-Marseille, 2015. http://www.theses.fr/2015AIXM4356.
In this thesis, we address the problem of sentiment analysis. More specifically, we are interested in analyzing the sentiment expressed in social media texts such as tweets, customer reviews of restaurants, laptops and hotels, or scholarly book reviews written by experts. We focus on two main tasks: sentiment polarity detection, in which we aim to determine the polarity (positive, negative or neutral) of a given text, and opinion target extraction, in which we aim to extract the targets toward which people express their opinions (e.g. for restaurants we may extract targets such as food, pizza, service). Our main objective is to construct state-of-the-art systems that can perform both tasks. Therefore, we have proposed different supervised systems following three research directions: improving the system performance by term weighting, by enriching the document representation, and by proposing a new model for sentiment classification. For evaluation purposes, we participated in the International Workshop on Semantic Evaluation (SemEval), choosing two tasks: sentiment analysis in Twitter, in which we determine the polarity of a tweet, and aspect-based sentiment analysis, in which we extract the opinion targets in restaurant reviews and then determine the polarity of each target. Our systems were among the top three systems in all subtasks. We also applied our systems to a French book review corpus constructed by the OpenEdition team, extracting the opinion targets and their polarities.
Lebboss, Georges. "Contribution à l’analyse sémantique des textes arabes." Thesis, Paris 8, 2016. http://www.theses.fr/2016PA080046/document.
The Arabic language is poor in electronic semantic resources. Among those resources is Arabic WordNet, which is itself poor in words and relationships. This thesis focuses on enriching Arabic WordNet with synsets (a synset is a set of synonymous words) taken from a large general corpus. This type of corpus did not exist in Arabic, so we had to build it before subjecting it to a number of preprocessing steps. Together with Gilles Bernard, I developed a method of word vectorization called GraPaVec, which can be used here. I built a system that includes an Add2Corpus module, preprocessing, and word vectorization using automatically generated frequency patterns, which yields a data matrix whose rows are the words and whose columns are the patterns, each component representing the frequency of a word in a pattern. The word vectors are fed to the Self Organizing Map (SOM) neural model; the classification produced constructs synsets. In order to validate the method, we had to create a gold standard corpus (there are none in Arabic for this area) from Arabic WordNet, and then compare the GraPaVec method with Word2Vec and GloVe. The results show that, for this problem, GraPaVec gives the best results, with an F-measure 25% higher than the others. The generated classes will be used to create new synsets to be included in Arabic WordNet.
Maquin, Didier. "Observabilité, diagnostic et validation de données des procédés industriels." Nancy 1, 1987. http://www.theses.fr/1987NAN10347.
Zaidi, Abdelhalim. "Recherche et détection des patterns d'attaques dans les réseaux IP à hauts débits." Phd thesis, Université d'Evry-Val d'Essonne, 2011. http://tel.archives-ouvertes.fr/tel-00878783.
Pugeault, Florence. "Extraction dans les textes de connaissances structurées : une méthode fondée sur la sémantique lexicale linguistique." Toulouse 3, 1995. http://www.theses.fr/1995TOU30164.
Puget, Dominique. "Aspects sémantiques dans les Systèmes de Recherche d'Informations." Toulouse 3, 1993. http://www.theses.fr/1993TOU30139.
Teboul, Bruno. "Le développement du neuromarketing aux Etats-Unis et en France. Acteurs-réseaux, traces et controverses." Thesis, Paris Sciences et Lettres (ComUE), 2016. http://www.theses.fr/2016PSLED036/document.
Our research explores the comparative development of neuromarketing in the United States and France. We start by analyzing the literature on neuromarketing. As our theoretical and methodological framework we use Actor-Network Theory (ANT), in the wake of the work of Bruno Latour and Michel Callon. We show how “human and non-human” entities (“actants”), namely actor-networks, traces (publications) and controversies, form the pillars of a new discipline such as neuromarketing. Our hybrid “qualitative-quantitative” approach allows us to build an applied methodology of ANT: bibliometric analysis (Publish Or Perish), text mining, clustering and semantic analysis of the scientific literature and web content on neuromarketing. From these results, we build data visualizations and network graph maps (Gephi) that reveal the interrelations and associations between actors, traces and controversies about neuromarketing.