Contents
A selection of scholarly literature on the topic "GAN, Réseaux antagonistes génératifs"
Format your source in APA, MLA, Chicago, Harvard, and other citation styles
Browse lists of current articles, books, dissertations, abstracts, and other scholarly sources on the topic "GAN, Réseaux antagonistes génératifs".
Next to each work in the list there is an "Add to bibliography" button. Click it, and we will automatically generate a bibliographic reference to the selected work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the publication as a ".pdf" file and read its abstract online, when this information is available in the metadata.
Journal articles on the topic "GAN, Réseaux antagonistes génératifs"
Renucci, Franck, Benoît Le Blanc, and David Galli. "Communication et études de l’IA : mémoire du futur." Le design de l’« intelligence artificielle » à l’épreuve du vivant 9, no. 1 (June 2, 2020). http://dx.doi.org/10.25965/interfaces-numeriques.4088.
Dissertations and theses on the topic "GAN, Réseaux antagonistes génératifs"
Rana, Aakanksha. "Analyse d'images haute gamme dynamique." Electronic Thesis or Diss., Paris, ENST, 2018. http://www.theses.fr/2018ENST0015.
High Dynamic Range (HDR) imaging captures a wider dynamic range and color gamut, allowing us to draw on subtle yet discriminating details present in both the extremely dark and the extremely bright areas of a scene. This property is of potential interest for computer vision algorithms, whose performance degrades substantially when scenes are captured with traditional low dynamic range (LDR) imagery. While such algorithms have been exhaustively designed for traditional LDR images, little work has been done so far in the context of HDR content. In this thesis, we present a quantitative and qualitative analysis of HDR imagery for such task-specific algorithms. The thesis begins by identifying the most natural and important questions in using HDR content for low-level feature extraction, a task of fundamental importance for many high-level applications such as stereo vision, localization, matching, and retrieval. Through a performance evaluation study, we demonstrate how different HDR-based modalities improve algorithm performance with respect to LDR on a proposed dataset. However, we observe that none of them does so optimally across all scenes. To examine this sub-optimality, we investigate the importance of task-specific objectives for designing optimal modalities through an experimental study. Based on these insights, we attempt to overcome this sub-optimality by designing task-specific HDR tone-mapping operators (TMOs). We propose three learning-based methodologies aimed at optimally mapping HDR content to enhance the efficiency of local feature extraction at each stage, namely detection, description, and final matching.
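Tone-mapping operators of the kind discussed in this abstract compress HDR radiance into a range that LDR-oriented feature extractors can handle. As a minimal, hypothetical illustration (a classic global logarithmic operator, not the learned task-specific TMOs of the thesis):

```python
import numpy as np

def log_tone_map(hdr, eps=1e-6):
    """Global logarithmic tone mapping: compress HDR luminance,
    which can span many orders of magnitude, into [0, 1]."""
    lum = np.maximum(hdr, 0.0)
    return np.log(1.0 + lum) / np.log(1.0 + lum.max() + eps)

# A synthetic HDR patch spanning five orders of magnitude.
hdr = np.array([[0.01, 0.1],
                [10.0, 1000.0]])
ldr = log_tone_map(hdr)
```

The logarithm preserves the ordering of radiance values while strongly compressing the bright end, which is why log-like curves are a common baseline that learned TMOs are compared against.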
Shahid, Mustafizur Rahman. "Deep learning for Internet of Things (IoT) network security." Electronic Thesis or Diss., Institut polytechnique de Paris, 2021. http://www.theses.fr/2021IPPAS003.
The growing Internet of Things (IoT) introduces new security challenges for network activity monitoring. Most IoT devices are vulnerable because of a lack of security awareness among device manufacturers and end users. As a consequence, they have become prime targets for malware developers who want to turn them into bots. Contrary to general-purpose devices, an IoT device is designed to perform very specific tasks. Hence, its networking behavior is very stable and predictable, making it well suited for data analysis techniques. The first part of this thesis therefore focuses on leveraging recent advances in deep learning to develop network monitoring tools for the IoT. Two types of tools are explored: IoT device type recognition systems and IoT network intrusion detection systems (NIDS). For device type recognition, supervised machine learning algorithms are trained to classify network traffic and determine which IoT device the traffic belongs to. The IoT NIDS consists of a set of autoencoders, each trained for a different IoT device type. The autoencoders learn the profile of legitimate networking behavior and detect any deviation from it. Experiments using network traffic data produced by a smart home show that the proposed models achieve high performance. Despite these promising results, training and testing machine-learning-based network monitoring systems requires a tremendous amount of IoT network traffic data, yet very few IoT network traffic datasets are publicly available, and physically operating thousands of real IoT devices can be very costly and can raise privacy concerns. In the second part of this thesis, we propose to leverage Generative Adversarial Networks (GANs) to generate bidirectional flows that look as if they were produced by a real IoT device. A bidirectional flow consists of the sequence of the sizes of individual packets along with a duration.
Hence, in addition to generating packet-level features, which are the sizes of individual packets, our generator implicitly learns to comply with flow-level characteristics, such as the total number of packets and bytes in a bidirectional flow or the flow's total duration. Experimental results using data produced by a smart speaker show that our method generates high-quality, realistic-looking synthetic bidirectional flows.
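The NIDS described in this abstract flags traffic whose reconstruction error under a device-specific autoencoder exceeds a threshold learned from legitimate flows. A minimal sketch of that idea, with a linear (PCA) autoencoder and synthetic feature vectors standing in for the thesis's deep autoencoders and real IoT traffic:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative stand-in for per-flow features of one IoT device type
# (e.g. packet sizes and counts): data that lies near a low-dimensional
# subspace, as stable device behavior would.
latent = rng.normal(size=(500, 3))
mix = rng.normal(size=(3, 8))
legit = latent @ mix + 0.05 * rng.normal(size=(500, 8))

# A linear autoencoder via PCA: encode into 3 dimensions, decode back.
mean = legit.mean(axis=0)
_, _, vt = np.linalg.svd(legit - mean, full_matrices=False)
basis = vt[:3]

def reconstruction_error(x):
    code = (x - mean) @ basis.T      # encode
    recon = code @ basis + mean      # decode
    return np.linalg.norm(x - recon, axis=-1)

# Threshold: a generous quantile of errors on legitimate traffic only.
threshold = np.quantile(reconstruction_error(legit), 0.99)

def is_anomalous(x):
    """Flag flows the device-specific model cannot reconstruct well."""
    return reconstruction_error(x) > threshold
```

The key design point mirrors the thesis: the model is trained exclusively on one device type's legitimate traffic, so anything it cannot reconstruct, such as bot-like traffic, stands out as a deviation.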
Wei, Wen. "Apprentissage automatique des altérations cérébrales causées par la sclérose en plaques en neuro-imagerie multimodale." Thesis, Université Côte d'Azur, 2020. http://www.theses.fr/2020COAZ4021.
Multiple sclerosis (MS) is the most common progressive neurological disease of young adults worldwide and thus represents a major public health issue, with about 90,000 patients in France and more than 500,000 people affected in Europe. In order to optimize treatments, it is essential to be able to measure and track brain alterations in MS patients. MS is a multi-faceted disease involving different types of alterations, such as myelin damage and repair; consequently, multimodal neuroimaging is needed to fully characterize it. Magnetic resonance imaging (MRI) has emerged as a fundamental imaging biomarker for MS because of its high sensitivity in revealing macroscopic tissue abnormalities. Conventional MR scanning provides a direct way to detect MS lesions and their changes, and plays a dominant role in the diagnostic criteria of MS. Moreover, positron emission tomography (PET), an alternative imaging modality, can provide functional information and detect target tissue changes at the cellular and molecular level using various radiotracers. For example, with the radiotracer [11C]PIB, PET allows a direct pathological measure of myelin alteration. However, in clinical settings, not all modalities are available, for various reasons. In this thesis, we therefore focus on learning and predicting missing-modality-derived brain alterations in MS from multimodal neuroimaging data.
Wang, Yaohui. "Apprendre à générer des vidéos de personnes." Thesis, Université Côte d'Azur, 2021. http://www.theses.fr/2021COAZ4116.
Generative Adversarial Networks (GANs) have attracted increasing attention due to their ability to model complex visual data distributions, which allows them to generate and translate realistic images. While realistic video generation is the natural sequel, it is substantially more challenging in complexity and computation, owing to the simultaneous modeling of appearance as well as motion. Specifically, in inferring and modeling the distribution of human videos, generative models face three main challenges: (a) generating uncertain motion while retaining human appearance, (b) modeling spatio-temporal consistency, and (c) understanding latent representations. In this thesis, we propose three novel approaches to generating videos of high visual quality and interpreting the latent space of video generative models. We first introduce a method that learns to conditionally generate videos from single input images; the proposed model allows for controllable video generation via various motion categories. Secondly, we present a model able to produce videos from noise vectors by disentangling the latent space into appearance and motion; we demonstrate that both factors can be manipulated in both conditional and unconditional manners. Thirdly, we introduce an unconditional video generative model that allows for interpretation of the latent space, with emphasis on the interpretation and manipulation of motion; we show that the method discovers semantically meaningful motion representations, which in turn allow for control over the generated results. Finally, we describe a novel approach that combines generative modeling with contrastive learning for unsupervised person re-identification, leveraging generated data for data augmentation and showing that such data can boost re-identification accuracy.
Yedroudj, Mehdi. "Steganalysis and steganography by deep learning." Thesis, Montpellier, 2019. http://www.theses.fr/2019MONTS095.
Image steganography is the art of secret communication through the exchange of a hidden message. Image steganalysis, on the other hand, attempts to detect the presence of a hidden message by searching for artefacts within an image. For about ten years, the classic approach to steganalysis was to use an ensemble classifier fed with hand-crafted features. In recent years, studies have shown that well-designed convolutional neural networks (CNNs) can achieve superior performance compared to conventional machine-learning approaches. This thesis deals with the use of deep learning techniques for image steganography and steganalysis in the spatial domain. The first contribution is a fast and very effective convolutional neural network for steganalysis, named Yedroudj-Net. Compared to modern deep-learning-based steganalysis methods, Yedroudj-Net achieves state-of-the-art detection results while taking less time to converge, allowing the use of a large training set. Moreover, Yedroudj-Net can easily be improved with well-known add-ons; among these, we have evaluated data augmentation and the use of an ensemble of CNNs, both of which increase performance. The second contribution is the application of deep learning to steganography itself, i.e., the embedding. Among existing techniques, we focus on the 3-player game approach and propose an embedding algorithm that automatically learns how to hide a message secretly. The proposed steganography system is based on generative adversarial networks and is trained using three neural networks that compete against each other: the embedder, the extractor, and the steganalyzer.
For the steganalyzer, we use Yedroudj-Net because of its modest size and because its training requires no tricks that would increase the computation time. This second contribution defines a research direction, providing first elements of reflection along with promising initial results.
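In a 3-player game of this kind, the embedder balances three competing objectives: low distortion of the cover image, recoverability of the message by the extractor, and undetectability by the steganalyzer. A hedged sketch of such a composite loss (the weights and functional forms here are illustrative, not the thesis's exact formulation):

```python
import numpy as np

def embedder_loss(cover, stego, msg, msg_hat, detect_prob,
                  w_rec=1.0, w_msg=1.0, w_adv=0.1):
    """Composite objective for the embedder in a 3-player game:
    stay close to the cover, let the extractor recover the message,
    and fool the steganalyzer. Weights are illustrative."""
    distortion = np.mean((stego - cover) ** 2)       # image fidelity
    msg_err = np.mean((msg - msg_hat) ** 2)          # message recovery
    # The embedder wants the steganalyzer's detection probability -> 0.
    adv = -np.mean(np.log(1.0 - detect_prob + 1e-9))
    return w_rec * distortion + w_msg * msg_err + w_adv * adv

# Toy check: a perfect embedding scores lower than a poor one.
cover = np.zeros((4, 4))
msg = np.ones(8)
loss_good = embedder_loss(cover, cover, msg, msg, np.array([0.01]))
loss_bad = embedder_loss(cover, cover + 1.0, msg, 0 * msg, np.array([0.9]))
```

In training, each of the three networks minimizes its own such objective in alternation, with the steganalyzer pushing the embedder toward statistically inconspicuous perturbations.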
Pagliarini, Silvia. "Modeling the neural network responsible for song learning." Thesis, Bordeaux, 2021. http://www.theses.fr/2021BORD0107.
During the first period of their lives, babies and juvenile birds show comparable phases of vocal development: first, they listen to their parents or tutors in order to build a neural representation of the experienced auditory stimulus; then they start to produce sound and progressively get closer to reproducing their tutor's song. This phase of learning is called the sensorimotor phase and is characterized by the presence of babbling in babies and subsong in birds. It ends when the song crystallizes and becomes similar to the one produced by adults. Analogies exist between the brain pathways responsible for sensorimotor learning in humans and birds: a vocal production pathway involves direct projections from auditory areas to motor neurons, and a vocal learning pathway is responsible for imitation and plasticity. Behavioral studies and the neuroanatomical structure of the vocal control circuit in humans and birds provide the basis for bio-inspired models of vocal learning. In particular, birds have brain circuits exclusively dedicated to song learning, making them an ideal model for exploring the representation of vocal learning by imitation of tutors. This thesis aims to build a model of the vocal learning underlying song learning in birds. An extensive review of the existing literature is discussed: many previous studies have attempted to implement imitative learning in computational models, and they share a common structure. These learning architectures include the learning mechanisms and, possibly, exploration and evaluation strategies. A motor control function enables sound production, and a sensory response model describes either how sound is perceived or how it shapes the reward.
The inputs and outputs of these functions lie (1) in the motor space (the space of motor parameters), (2) in the sensory space (real sounds), and (3) either in the perceptual space (a low-dimensional representation of the sound) or in the internal representation of goals (a non-perceptual representation of the target sound). The first model proposed in this thesis is a theoretical inverse model based on a simplified vocal learning model in which the sensory space coincides with the motor space (i.e., there is no sound production). This simplification allows us to investigate how to introduce biological assumptions (e.g., non-linear responses) into a vocal learning model and which parameters most influence the computational power of the model. The influence of the sharpness of auditory selectivity and of the motor dimension is discussed. To obtain a complete model, able to both perceive and produce sound, we needed a motor control function capable of reproducing sounds similar to real data (e.g., recordings of adult canaries). We analyzed the capability of WaveGAN (a generative adversarial network) to provide a generator model able to produce realistic canary songs. In this generator, the input space becomes, after training, a latent space that represents a high-dimensional dataset on a lower-dimensional manifold. We obtained realistic canary sounds using only three dimensions for the latent space. Among other results, quantitative and qualitative analyses demonstrate the interpolation abilities of the model, suggesting that the generator can serve as a motor function in a vocal learning model. The second version of the sensorimotor model is a complete vocal learning model with a full action-perception loop (i.e., it includes the motor space, the sensory space, and the perceptual space). Sound production is performed by the GAN generator previously obtained.
A recurrent neural network classifying syllables serves as the perceptual sensory response. As in the first model, the mapping between the perceptual space and the motor space is learned via an inverse model. Preliminary results show the influence of the learning rate when different sensory response functions are implemented.
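Interpolation abilities of a generator, as mentioned in this abstract, are typically probed by walking between two latent vectors and decoding each intermediate point. A minimal sketch of spherical interpolation in a 3-D latent space like the one used with the WaveGAN generator (the generator network itself is omitted; the endpoints are arbitrary):

```python
import numpy as np

def slerp(z0, z1, t):
    """Spherical interpolation between two latent vectors, a common
    way to traverse a GAN latent space smoothly."""
    z0n = z0 / np.linalg.norm(z0)
    z1n = z1 / np.linalg.norm(z1)
    omega = np.arccos(np.clip(np.dot(z0n, z1n), -1.0, 1.0))
    if np.isclose(omega, 0.0):          # nearly parallel: fall back
        return (1.0 - t) * z0 + t * z1  # to linear interpolation
    return (np.sin((1.0 - t) * omega) * z0
            + np.sin(t * omega) * z1) / np.sin(omega)

# Three-dimensional latent space, as in the thesis.
z_a = np.array([1.0, 0.0, 0.0])
z_b = np.array([0.0, 1.0, 0.0])
path = [slerp(z_a, z_b, t) for t in np.linspace(0.0, 1.0, 5)]
```

Feeding each point of `path` to the generator would then yield a sequence of sounds morphing from one syllable to another, which is what the qualitative interpolation analyses assess.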
Hamis, Sébastien. "Compression de contenus visuels pour transmission mobile sur réseaux de très bas débit." Electronic Thesis or Diss., Institut polytechnique de Paris, 2020. http://www.theses.fr/2020IPPAS020.
The field of visual content compression (images, video, 2D/3D graphics elements) has seen spectacular advances for more than twenty years, with the emergence of numerous international standards such as JPEG and JPEG2000 for still image compression, or MPEG-1/2/4 for video and 3D graphics content coding. The advent of smartphones and their applications has also benefited from these advances, the image being ubiquitous today in a context of mobility. Nevertheless, image transmission requires reliable and available networks, since such visual data are inherently bandwidth-intensive. While developed countries benefit today from high-performance mobile networks (3G, 4G...), this is not the case in a number of regions of the world, particularly in emerging countries, where communications still rely on 2G SMS networks. Transmitting visual content in such a context becomes a highly ambitious challenge, requiring the elaboration of new compression algorithms for very low bitrates. The challenge is to ensure image transmission over a narrow bandwidth corresponding to a relatively small set (10 to 20) of SMS messages (140 bytes per SMS). To meet these constraints, multiple axes of development have been considered. After a state-of-the-art review of traditional image compression techniques, we oriented our research towards deep learning methods, aiming to apply post-processing to strongly compressed data in order to improve the quality of the decoded content. Our contributions are structured around the creation of a new compression scheme, combining existing codecs with a panel of post-processing bricks aimed at enhancing highly compressed content. These bricks are dedicated deep neural networks that perform super-resolution and/or compression artifact reduction, specifically trained to meet the targeted objectives.
These operations are carried out on the decoder side and can be interpreted as image reconstruction algorithms operating on heavily compressed versions. This approach offers the advantage of relying on existing codecs, which are particularly light and resource-efficient. In our work, we retained the BPG format, which represents the state of the art in the field, but other compression schemes can also be considered. Regarding the type of neural network, we adopted Generative Adversarial Networks (GANs), which are particularly well suited to reconstruction from incomplete data. Specifically, the two architectures retained and adapted to our objectives are the SRGAN and ESRGAN networks. The impact of the various elements and parameters involved, such as the super-resolution factors and the loss functions, is analyzed in detail. A final contribution concerns the experimental evaluation. After showing the limitations of objective metrics, which fail to take into account the visual quality of the image, we put in place a subjective evaluation protocol. The results obtained in terms of MOS (Mean Opinion Score) fully demonstrate the relevance of the proposed reconstruction approaches. Finally, we extend our work to different, more general use cases, in particular high-resolution image processing and video compression.
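The decoder-side pipeline described here chains an aggressive codec with a learned enhancement network. A schematic sketch using stand-ins (coarse quantization in place of BPG, and a simple box blur where the SRGAN/ESRGAN-style network would plug in) to show where each stage sits:

```python
import numpy as np

def compress_decompress(img, levels=8):
    """Stand-in for an aggressive codec such as BPG: coarse
    quantization discards detail, as heavy compression does."""
    return np.round(img * (levels - 1)) / (levels - 1)

def enhance(img):
    """Stand-in for the GAN-based post-processing network (SRGAN or
    ESRGAN in the thesis): here just a 3x3 box filter, marking where
    the learned reconstruction would operate on the decoded image."""
    pad = np.pad(img, 1, mode="edge")
    out = np.zeros_like(img)
    for dy in range(3):
        for dx in range(3):
            out += pad[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / 9.0

rng = np.random.default_rng(1)
image = rng.random((16, 16))       # original content
decoded = compress_decompress(image)  # what the receiver gets
restored = enhance(decoded)           # decoder-side reconstruction
```

The design choice this illustrates is the one the abstract emphasizes: the transmitter runs only a lightweight standard codec, and all the expensive learned reconstruction happens after decoding.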
Ben, Tanfous Amor. "Représentations parcimonieuses dans les variétés de formes pour la classification et la génération de trajectoires humaines." Thesis, Lille 1, 2019. http://www.theses.fr/2019LIL1I111.
Designing intelligent systems to understand video content has been a hot research topic in the past few decades, since it helps compensate for the limited human capability to analyze videos efficiently. In particular, human behavior understanding in videos is receiving huge interest due to its many potential applications. At the same time, the detection and tracking of human landmarks in video streams has become more reliable, partly due to the availability of affordable RGB-D sensors. These sensors provide time-varying geometric data that play an important role in automatic human motion analysis. However, such analysis remains challenging due to large view variations, inaccurate detection of landmarks, large intra- and inter-class variations, and the scarcity of annotated data. In this thesis, we propose novel frameworks to classify and generate 2D/3D sequences of human landmarks. We first represent them as trajectories in the shape manifold, which allows for a view-invariant analysis. However, this manifold is nonlinear, and standard computational tools and machine learning techniques therefore cannot be applied in a straightforward manner. As a solution, we exploit notions of Riemannian geometry to encode these trajectories using sparse coding and dictionary learning. This not only overcomes the nonlinearity of the manifold but also yields sparse representations that lie in a vector space and are more discriminative and less noisy than the original data. We study intrinsic and extrinsic paradigms of sparse coding and dictionary learning in the shape manifold and provide a comprehensive evaluation of their use according to the nature of the data (i.e., face or body, in 2D or 3D). Based on these sparse representations, we present two frameworks, one for 3D human action recognition and one for 2D micro- and macro-facial expression recognition, and show that they achieve competitive performance compared to the state of the art.
Finally, we design a generative model that synthesizes human actions. The main idea is to train a generative adversarial network to generate new sparse representations that are then transformed into pose sequences. This framework is applied to data augmentation, improving classification performance. In addition, the generated pose sequences are used to guide a second framework that generates human videos by transferring each pose onto a texture image. We show that the obtained videos are realistic and have better appearance and motion consistency than a recent state-of-the-art baseline.
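The generation pipeline summarized here decodes GAN-produced sparse codes into poses as linear combinations of dictionary atoms. A minimal sketch with an illustrative random dictionary (the dimensions and codes are hypothetical, and the GAN that would produce the codes is omitted):

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical sizes: 15 2-D landmarks per pose, 12 learned atoms.
n_atoms, pose_dim = 12, 2 * 15
dictionary = rng.normal(size=(pose_dim, n_atoms))

def decode_pose(alpha, dictionary):
    """Map a sparse code back to a pose vector as a linear
    combination of dictionary atoms: pose = D @ alpha."""
    return dictionary @ alpha

# A sparse code such as a generator might output: few non-zero atoms.
alpha = np.zeros(n_atoms)
alpha[[1, 4, 7]] = [0.8, -0.3, 0.5]
pose = decode_pose(alpha, dictionary)

# A sequence of codes decodes to a pose sequence, one pose per frame.
sequence = np.stack([decode_pose(alpha * s, dictionary)
                     for s in np.linspace(0.5, 1.5, 10)])
```

Because the codes live in an ordinary vector space, a standard GAN can model them directly, sidestepping the nonlinearity of the shape manifold that the sparse coding step was introduced to handle.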
Laifa, Oumeima. "A joint discriminative-generative approach for tumour angiogenesis assessment in computational pathology." Electronic Thesis or Diss., Sorbonne université, 2019. http://www.theses.fr/2019SORUS230.
Angiogenesis is the process through which new blood vessels are formed from pre-existing ones. During angiogenesis, tumour cells secrete growth factors that activate the proliferation and migration of endothelial cells and stimulate overproduction of vascular endothelial growth factor (VEGF). The fundamental role of vascular supply in tumour growth and anti-cancer therapies makes the evaluation of angiogenesis crucial for assessing the effect of anti-angiogenic treatments, a promising class of anti-cancer therapy. In this study, we establish a quantitative and qualitative panel to evaluate tumour blood vessel structures on non-invasive fluorescence images and on histopathological slides across the full tumour, in order to identify architectural features and quantitative measurements that are often associated with the prediction of therapeutic response. We develop a Markov Random Field (MRF) and watershed framework to segment blood vessel structures and tumour micro-environment components, so as to quantitatively assess the effect of the anti-angiogenic drug Pazopanib on the tumour vasculature and its interaction with the tumour micro-environment. Pazopanib showed a direct effect on the tumour vascular network via the endothelial cells crossing the whole tumour. Our results show a specific relationship between apoptotic neovascularization and nucleus density in murine tumours treated with Pazopanib. A qualitative evaluation of tumour blood vessel structures is then performed on whole-slide images, which are known to be very heterogeneous. We develop a discriminative-generative neural network model, combining a learning-driven model, a convolutional neural network (CNN), with a rule-based knowledge model, a marked point process (MPP), to segment blood vessels in very heterogeneous images using very little annotated data compared to the state of the art.
We detail the intuition and design behind the discriminative-generative model and analyze its similarity with Generative Adversarial Networks (GANs). Finally, we evaluate the performance of the proposed model on histopathology slides and on synthetic data. The limits of this promising framework, as well as its perspectives, are discussed.
Dutil, Francis. "Prédiction et génération de données structurées à l'aide de réseaux de neurones et de décisions discrètes." Thèse, 2018. http://hdl.handle.net/1866/22124.