Academic literature on the topic "Multimodal Knowledge Representation"


Browse the topical lists of articles, books, theses, conference proceedings, and other academic sources on the topic "Multimodal Knowledge Representation".


Journal articles on the topic "Multimodal Knowledge Representation"

1

Azañón, Elena, Luigi Tamè, Angelo Maravita, Sally A. Linkenauger, Elisa R. Ferrè, Ana Tajadura-Jiménez and Matthew R. Longo. "Multimodal Contributions to Body Representation". Multisensory Research 29, no. 6-7 (2016): 635–61. http://dx.doi.org/10.1163/22134808-00002531.

Abstract
Our body is a unique entity by which we interact with the external world. Consequently, the way we represent our body has profound implications in the way we process and locate sensations and in turn perform appropriate actions. The body can be the subject, but also the object of our experience, providing information from sensations on the body surface and viscera, but also knowledge of the body as a physical object. However, the extent to which different senses contribute to constructing the rich and unified body representations we all experience remains unclear. In this review, we aim to bring together recent research showing important roles for several different sensory modalities in constructing body representations. At the same time, we hope to generate new ideas of how and at which level the senses contribute to generate the different levels of body representations and how they interact. We will present an overview of some of the most recent neuropsychological evidence about multisensory control of pain, and the way that visual, auditory, vestibular and tactile systems contribute to the creation of coherent representations of the body. We will focus particularly on some of the topics discussed in the symposium on Multimodal Contributions to Body Representation held on the 15th International Multisensory Research Forum (2015, Pisa, Italy).
2

Coelho, Ana, Paulo Marques, Ricardo Magalhães, Nuno Sousa, José Neves and Victor Alves. "A Knowledge Representation and Reasoning System for Multimodal Neuroimaging Studies". Inteligencia Artificial 20, no. 59 (February 6, 2017): 42. http://dx.doi.org/10.4114/intartif.vol20iss59pp42-52.

Abstract
Multimodal neuroimaging analyses are of major interest for both research and clinical practice, enabling the combined evaluation of the structure and function of the human brain. These analyses generate large volumes of data and consequently increase the amount of possibly useful information. BrainArchive was developed in order to organize, maintain and share this complex array of neuroimaging data. It stores all the information available for each participant/patient, being dynamic by nature. Notably, the application of reasoning systems to this multimodal data has the potential to provide tools for the identification of undiagnosed diseases. In this work we explore how Artificial Intelligence techniques for decision support, namely Case-Based Reasoning (CBR), may be used to achieve this. In particular, we propose a reasoning system that uses the information stored in BrainArchive as past knowledge for the identification of individuals that are at risk of contracting some brain disease.
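The retrieval step of case-based reasoning mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' system: the feature names, weights, toy values and the weighted-similarity measure are all hypothetical, chosen only to show how past cases might be ranked against a new one.

```python
# Minimal sketch of CBR retrieval (hypothetical features and weights).
# Assumes every feature has already been scaled to the range [0, 1].

def similarity(case_a, case_b, weights):
    """Weighted similarity in [0, 1]; 1.0 means identical feature values."""
    total = sum(weights.values())
    score = sum(w * (1.0 - abs(case_a[f] - case_b[f])) for f, w in weights.items())
    return score / total

def retrieve(query, case_base, weights, k=2):
    """Return the k stored cases most similar to the query."""
    ranked = sorted(
        case_base,
        key=lambda case: similarity(query, case["features"], weights),
        reverse=True,
    )
    return ranked[:k]

# Invented toy case base; not data from the paper.
case_base = [
    {"id": "case-1", "features": {"volume": 0.2, "connectivity": 0.3}, "outcome": "at risk"},
    {"id": "case-2", "features": {"volume": 0.8, "connectivity": 0.9}, "outcome": "not at risk"},
    {"id": "case-3", "features": {"volume": 0.3, "connectivity": 0.2}, "outcome": "at risk"},
]
weights = {"volume": 1.0, "connectivity": 2.0}
query = {"volume": 0.22, "connectivity": 0.28}

nearest = retrieve(query, case_base, weights, k=2)
```

In a full CBR cycle the outcomes of the retrieved cases would then be reused and revised for the new case; only retrieval is sketched here.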
3

Bruni, E., N. K. Tran and M. Baroni. "Multimodal Distributional Semantics". Journal of Artificial Intelligence Research 49 (January 23, 2014): 1–47. http://dx.doi.org/10.1613/jair.4135.

Abstract
Distributional semantic models derive computational representations of word meaning from the patterns of co-occurrence of words in text. Such models have been a success story of computational linguistics, being able to provide reliable estimates of semantic relatedness for the many semantic tasks requiring them. However, distributional models extract meaning information exclusively from text, which is an extremely impoverished basis compared to the rich perceptual sources that ground human semantic knowledge. We address the lack of perceptual grounding of distributional models by exploiting computer vision techniques that automatically identify discrete “visual words” in images, so that the distributional representation of a word can be extended to also encompass its co-occurrence with the visual words of images it is associated with. We propose a flexible architecture to integrate text- and image-based distributional information, and we show in a set of empirical tests that our integrated model is superior to the purely text-based approach, and it provides somewhat complementary semantic information with respect to the latter.
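One simple way to picture the integration of text- and image-based distributional information is weighted concatenation of the two vectors, followed by cosine similarity. The sketch below assumes that approach; the abstract does not spell out the architecture, and the function names, the `alpha` weight and the toy co-occurrence counts are hypothetical.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def fuse(text_vec, image_vec, alpha=0.5):
    """Concatenate normalized text and visual-word vectors.

    alpha weights the text channel and (1 - alpha) the image channel.
    """
    t = l2_normalize(text_vec)
    v = l2_normalize(image_vec)
    return [alpha * x for x in t] + [(1.0 - alpha) * x for x in v]

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0

# Invented toy counts: co-occurrence with 3 context words and 2 visual words.
text = {"moon": [4, 0, 1], "sun": [3, 1, 0]}
image = {"moon": [2, 5], "sun": [1, 6]}

fused = {word: fuse(text[word], image[word]) for word in text}
relatedness = cosine(fused["moon"], fused["sun"])
```

Setting `alpha=1.0` reduces the fused model to the purely text-based one, which makes it easy to compare the two settings on the same word pairs.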
4

Toraldo, Maria Laura, Gazi Islam and Gianluigi Mangia. "Modes of Knowing". Organizational Research Methods 21, no. 2 (July 14, 2016): 438–65. http://dx.doi.org/10.1177/1094428116657394.

Abstract
The current article argues that video-based methodologies offer unique potential for multimodal research applications. Multimodal research, further, can respond to the problem of “elusive knowledges,” that is, tacit, aesthetic, and embodied aspects of organizational life that are difficult to articulate in traditional methodological paradigms. We argue that the multimodal qualities of video, including but not limited to its visual properties, provide a scaffold for translating embodied, tacit, and aesthetic knowledge into discursive and textual forms, enabling the representation of organizational knowledge through academic discourse. First, we outline the problem of representation by comparing different forms of elusive knowledge, framing this problem as one of cross-modal translation. Second, we describe how video’s unique affordances place it in an ideal position to address this problem. Third, we demonstrate how video-based solutions can contribute to research, providing examples both from the literature and our own applied case work as models for video-based approaches. Finally, we discuss the implications and limitations of the proposed video approaches as a methodological support.
5

Gül, Davut and Bayram Costu. "To What Extent Do Teachers of Gifted Students Identify Inner and Intermodal Relations in Knowledge Representation?" Mimbar Sekolah Dasar 8, no. 1 (April 30, 2021): 55–80. http://dx.doi.org/10.53400/mimbar-sd.v8i1.31333.

Abstract
Gifted students get bored of reading authoritative and descriptive multimodal texts. They need coherent, explanatory, and interactive texts. Moreover, because of the pandemic, gifted students took courses online, and teachers had to conduct their lessons on digital online tools with multimodal representations. They posted supplementary teaching materials as multimodal texts to the students. Hence, teachers of gifted students should pay attention to inner and intermodal relations to meet the needs of gifted students and support their learning experience. The research aims at examining to what extent teachers of gifted students identify inner and intermodal relations because before designing these relations, the teacher should recognize these types of relations. The educational descriptive case study was applied. Six experienced primary school teachers were involved. The data were analyzed via content analysis. The results showed that teachers just identified the primitive level of inner and intermodal relations. The conclusion can be drawn that several educational design research should be increased to construct professional development courses for teachers about this issue. Learning and applying inner and intermodal relations are crucial for teachers of gifted students, in addition to having curiosity, they have a high cognitive level in different areas, thus they demand advanced forms of multimodal texts.
6

Tomskaya, Maria and Irina Zaytseva. "MULTIMEDIA REPRESENTATION OF KNOWLEDGE IN ACADEMIC DISCOURSE". Verbum 8, no. 8 (January 19, 2018): 129. http://dx.doi.org/10.15388/verb.2017.8.11357.

Abstract
The article focuses on academic presentations created with the help of multimedia programmes. The presentation is regarded as a special form of new academic knowledge representation. An academic presentation is explored as a multimodal phenomenon due to the fact that different channels or modes are activated during its perception. Data perception constitutes a part of the context which in itself is a semiotic event involving various components (an addresser, an addressee, the message itself, the channel of communication and the code). The choice of the code and the channel depends on different factors (type of the audience, the nature of the message, etc). In this way, the information for non-professionals will be most likely presented through visualization with the help of infographics (schemes, figures, charts, etc). Talking about the professional audience the speaker may resort to visualization to a lesser degree or he may not use it at all. His message will be transmitted only with the help of verbal means, which will not prevent the audience from perceiving and understanding new knowledge correctly. The presentation regime of rapid successive slide show may be regarded the heritage of ‘clip thinking’ which is characterized by a non-linear, simultaneous way of information perception. At the present stage of technology development visualization is becoming the most common means of transmitting information in academic discourse, due to peculiarities of data perception by the man of today.
7

Cholewa, Wojciech, Marcin Amarowicz, Paweł Chrzanowski and Tomasz Rogala. "Development Environment for Diagnostic Multimodal Statement Networks". Key Engineering Materials 588 (October 2013): 74–83. http://dx.doi.org/10.4028/www.scientific.net/kem.588.74.

Abstract
Development of effective diagnostic systems for the recognition of technical conditions of complex objects or processes requires the use of knowledge from multiple sources. Gathering of diagnostic knowledge acquired from diagnostic experiments as well as independent experts in the form of an information system database is one of the most important stages in the process of designing diagnostic systems. The task can be supported through suitable modeling activities and diagnostic knowledge management. Briefly, this paper presents an example of an application of multimodal diagnostic statement networks for the purpose of knowledge representation. Multimodal statement networks allow for approximate diagnostic reasoning based on a knowledge that is imprecise or even contradictory in part. The authors also describe the software environment REx for the development and testing of multimodal statement networks. The environment is a system for integrating knowledge from various sources and from independent domain experts in particular.
8

Prieto-Velasco, Juan Antonio and Clara I. López Rodríguez. "Managing graphic information in terminological knowledge bases". Terminology 15, no. 2 (November 11, 2009): 179–213. http://dx.doi.org/10.1075/term.15.2.02pri.

Abstract
The cognitive shift in Linguistics has affected the way linguists, lexicographers and terminologists understand and describe specialized language, and the way they represent scientific and technical concepts. The representation of terminological knowledge, as part of our encyclopaedic knowledge about the world, is crucial in multimedia terminological knowledge bases, where different media coexist to enhance the multidimensional character of knowledge representations. However, so far little attention has been paid in Terminology and Linguistics to graphic information, including visual resources and pictorial material. Frame-based Terminology (Faber et al. 2005, 2006, 2007, 2008) advocates a multimodal conceptual description in which the structured information in terminographic definitions meshes with visual information for a better understanding of specialized concepts. In this article, we explore the relationship between visual and textual information, and search for a principled way to select images that best represent the linguistic, conceptual and contextual information contained in terminological knowledge bases, in order to contribute to a better transfer of specialized knowledge.
9

Laenen, Katrien and Marie-Francine Moens. "Learning Explainable Disentangled Representations of E-Commerce Data by Aligning Their Visual and Textual Attributes". Computers 11, no. 12 (December 10, 2022): 182. http://dx.doi.org/10.3390/computers11120182.

Abstract
Understanding multimedia content remains a challenging problem in e-commerce search and recommendation applications. It is difficult to obtain item representations that capture the relevant product attributes since these product attributes are fine-grained and scattered across product images with huge visual variations and product descriptions that are noisy and incomplete. In addition, the interpretability and explainability of item representations have become more important in order to make e-commerce applications more intelligible to humans. Multimodal disentangled representation learning, where the independent generative factors of multimodal data are identified and encoded in separate subsets of features in the feature space, is an interesting research area to explore in an e-commerce context given the benefits of the resulting disentangled representations such as generalizability, robustness and interpretability. However, the characteristics of real-world e-commerce data, such as the extensive visual variation, noisy and incomplete product descriptions, and complex cross-modal relations of vision and language, together with the lack of an automatic interpretation method to explain the contents of disentangled representations, mean that current approaches for multimodal disentangled representation learning do not suffice for e-commerce data. Therefore, in this work, we design an explainable variational autoencoder framework (E-VAE) which leverages visual and textual item data to obtain disentangled item representations by jointly learning to disentangle the visual item data and to infer a two-level alignment of the visual and textual item data in a multimodal disentangled space. As such, E-VAE tackles the main challenges in disentangling multimodal e-commerce data.
Firstly, with the weak supervision of the two-level alignment our E-VAE learns to steer the disentanglement process towards discovering the relevant factors of variations in the multimodal data and to ignore irrelevant visual variations which are abundant in e-commerce data. Secondly, to the best of our knowledge our E-VAE is the first VAE-based framework that has an automatic interpretation mechanism that allows to explain the components of the disentangled item representations with text. With our textual explanations we provide insight in the quality of the disentanglement. Furthermore, we demonstrate that with our explainable disentangled item representations we achieve state-of-the-art outfit recommendation results on the Polyvore Outfits dataset and report new state-of-the-art cross-modal search results on the Amazon Dresses dataset.
10

Li, Jinghua, Runze Liu, Dehui Kong, Shaofan Wang, Lichun Wang, Baocai Yin and Ronghua Gao. "Attentive 3D-Ghost Module for Dynamic Hand Gesture Recognition with Positive Knowledge Transfer". Computational Intelligence and Neuroscience 2021 (November 18, 2021): 1–12. http://dx.doi.org/10.1155/2021/5044916.

Abstract
Hand gesture recognition is a challenging topic in the field of computer vision. Multimodal hand gesture recognition based on RGB-D achieves higher accuracy than recognition based on RGB or depth alone. It is not difficult to conclude that the gain originates from the complementary information existing in the two modalities. However, in reality, multimodal data are not always easy to acquire simultaneously, while unimodal RGB or depth hand gesture data are more general. Therefore, a hand gesture system is expected in which only unimodal RGB or depth data is supported for testing, while multimodal RGB-D data is available for training so as to attain the complementary information. Fortunately, a kind of method via multimodal training and unimodal testing has been proposed. However, unimodal feature representation and cross-modality transfer still need to be further improved. To this end, this paper proposes a new 3D-Ghost and Spatial Attention Inflated 3D ConvNet (3DGSAI) to extract high-quality features for each modality. The baseline of the 3DGSAI network is Inflated 3D ConvNet (I3D), and two main improvements are proposed. One is the 3D-Ghost module, and the other is the spatial attention mechanism. The 3D-Ghost module can extract richer features for hand gesture representation, and the spatial attention mechanism makes the network pay more attention to the hand region. This paper also proposes an adaptive parameter for positive knowledge transfer, which ensures that the transfer always occurs from the strong modality network to the weak one. Extensive experiments on SKIG, VIVA, and NVGesture datasets demonstrate that our method is competitive with the state of the art. In particular, the performance of our method reaches 97.87% on the SKIG dataset using only RGB, which is the current best result.

Theses on the topic "Multimodal Knowledge Representation"

1

Palframan, Shirley Anne. "Multimodal representation and the making of knowledge : a social semiotic excavation of learning sites". Thesis, University College London (University of London), 2006. http://discovery.ucl.ac.uk/10019283/.

Abstract
This research is concerned with the construction of knowledge as evidenced in the multimodal representations of students. In the spirit of an archaeological excavation it seeks to uncover evidence of that which can not be seen; of learning. It provides systematic classification and analysis of multimodal texts retrieved from secondary school science and history lessons. By conducting this analysis and accounting for the conditions of representation that stimulate learning it also demonstrates the instrumentality of representational activity in the making of knowledge. Applying social semiotic theory to textual artefacts from the two sites, a new methodology is utilised to expose evidence of learning. This methodology is derived from theories of social semiotics (Halliday, 1978 and Hodge and Kress, 1988) and multimodality (Kress and van Leeuwen, 1996). It is based on a conception of learning as a process in which the status and identity of the individual are changed. It is informed by, amongst others, Bernstein (1996) - in relation to the socialising of individuals through systems of education and by Vygotsky (1962) - in relation to the shaping of consciousness. The thesis consists of the description and demonstration of new methods for multimodal analysis of students' representational activity. The technique used for the presentation of data is tracking semiosis and for analysis process charting and mode mapping. Together these methods expose changes arising from the reconfigurations, transformations and transductions undertaken by students engaged in representational activity. In so doing, new directions are offered for the orientation of education practices in the face of rapidly changing patterns of communication. The efficacy of learning in multiple modes is also established and groundwork laid for fresh approaches to assessment.
2

Guo, Xuan. "Discovering a Domain Knowledge Representation for Image Grouping: Multimodal Data Modeling, Fusion, and Interactive Learning". Thesis, Rochester Institute of Technology, 2017. http://pqdtopen.proquest.com/#viewpdf?dispub=10603860.

Abstract

In visually-oriented specialized medical domains such as dermatology and radiology, physicians explore interesting image cases from medical image repositories for comparative case studies to aid clinical diagnoses, educate medical trainees, and support medical research. However, general image classification and retrieval approaches fail in grouping medical images from the physicians' viewpoint. This is because fully-automated learning techniques cannot yet bridge the gap between image features and domain-specific content for the absence of expert knowledge. Understanding how experts get information from medical images is therefore an important research topic.

As a prior study, we conducted data elicitation experiments, where physicians were instructed to inspect each medical image towards a diagnosis while describing image content to a student seated nearby. Experts' eye movements and their verbal descriptions of the image content were recorded to capture various aspects of expert image understanding. This dissertation aims at an intuitive approach to extracting expert knowledge, which is to find patterns in expert data elicited from image-based diagnoses. These patterns are useful to understand both the characteristics of the medical images and the experts' cognitive reasoning processes.

The transformation from the viewed raw image features to interpretation as domain-specific concepts requires experts' domain knowledge and cognitive reasoning. This dissertation also approximates this transformation using a matrix factorization-based framework, which helps project multiple expert-derived data modalities to high-level abstractions.

To combine additional expert interventions with computational processing capabilities, an interactive machine learning paradigm is developed to treat experts as an integral part of the learning process. Specifically, experts refine medical image groups presented by the learned model locally, to incrementally re-learn the model globally. This paradigm avoids the onerous expert annotations for model training, while aligning the learned model with experts' sense-making.

3

Florén, Henrika. "Shapes of Knowledge : A multimodal study of six Swedish upper secondary students' meaning making and transduction of knowledge across essays and audiovisual presentations". Thesis, Stockholms universitet, Institutionen för pedagogik och didaktik, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-156907.

4

Adjali, Omar. "Dynamic architecture for multimodal applications to reinforce robot-environment interaction". Thesis, Université Paris-Saclay (ComUE), 2017. http://www.theses.fr/2017SACLV100.

Abstract
Knowledge Representation and Reasoning is at the heart of the great challenge of Artificial Intelligence. More specifically, in the context of robotic applications, knowledge representation and reasoning approaches are necessary to solve decision problems that autonomous robots face when it comes to evolve in uncertain, dynamic and complex environments or to ensure a natural interaction in human environment. In a robotic interaction system, information has to be represented and processed at various levels of abstraction: From sensor up to actions and plans. Thus, knowledge representation provides the means to describe the environment with different abstraction levels which allow performing appropriate decisions. In this thesis we propose a methodology to solve the problem of multimodal interaction by describing a semantic interaction architecture based on a framework that demonstrates an approach for representing and reasoning with environment knowledge representation language (EKRL), to enhance interaction between robots and their environment. This framework is used to manage the interaction process by representing the knowledge involved in the interaction with EKRL and reasoning on it to make inference. The interaction process includes fusion of values from different sensors to interpret and understand what is happening in the environment, and the fission which suggests a detailed set of actions that are for implementation. Before such actions are implemented by actuators, these actions are first evaluated in a virtual environment which mimics the real-world environment to assess the feasibility of the action implementation in the real world. During these processes, reasoning abilities are necessary to guarantee a global execution of a given interaction scenario. 
Thus, we provided the EKRL framework with reasoning techniques to draw deterministic inferences using unification algorithms, and probabilistic inferences to manage uncertain knowledge by combining statistical relational models, using the Markov Logic Networks (MLN) framework, with EKRL. The proposed work is validated through scenarios that demonstrate the usability and the performance of our framework in real-world applications.
5

Ben Salem, Yosra. "Fusion d'images multimodales pour l'aide au diagnostic du cancer du sein". Thesis, Ecole nationale supérieure Mines-Télécom Atlantique Bretagne Pays de la Loire, 2017. http://www.theses.fr/2017IMTA0062/document.

Abstract
Breast cancer is the most prevalent cancer among women over 40 years old. Studies have shown that early detection and appropriate treatment of breast cancer significantly increase the chances of survival. Mammography is the most widely used tool in the diagnosis of breast lesions. However, this technique may be insufficient to show the structures of the breast and reveal the anomalies present. The doctor can use additional imaging modalities such as MRI (magnetic resonance imaging). These modalities are generally complementary. Therefore, the doctor mentally fuses the information from the two images in order to make an adequate diagnosis. To assist the doctor in this process, we propose a solution to merge the two images. Although the idea of fusion seems simple, its implementation poses many problems related not only to the paradigm of fusion in general but also to the nature of medical images, which are generally poorly contrasted and present heterogeneous, inaccurate and ambiguous data. Mammography images and MRI images present very different information representations, since they are taken under different conditions. This leads us to pose the following question: how can we pass from the heterogeneous representation of information in the image space to another space of uniform representation for the two modalities? To address this problem, we opt for a multilevel processing approach: the pixel level, the primitive level, the object level and the scene level. We model the pathological objects extracted from the different images by local ontologies. The fusion is then performed on these local ontologies and results in a global ontology containing the different knowledge on the pathological objects of the studied case. This global ontology serves to instantiate a reference ontology modeling knowledge of the medical diagnosis of breast lesions.
Case-based reasoning (CBR) is used to provide the diagnostic reports of the most similar cases, which can help the doctor make the best decision. In order to model the imperfection of the processed information, we use possibility theory with the ontologies. The final result is a set of diagnostic reports containing the cases most similar to the studied case, with similarity degrees expressed as possibility measures. A 3D symbolic model completes the diagnostic report with a simplified overview of the studied scene.
6

Maatouk, Stefan. "Orientalism - A Netflix Unlimited Series : A Multimodal Critical Discourse Analysis of the Orientalist Representations of Arab Identity on Netflix Film and Television". Thesis, Malmö universitet, Malmö högskola, Institutionen för globala politiska studier (GPS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:mau:diva-43793.

Abstract
Orientalism was a term developed by post-colonial theorist Edward Said to describe the ways in which Europeans, or the West, portrayed the Orient as inferior, uncivilized, and wholly anti-Western. Netflix Inc., the world’s largest subscription-based streaming service, which as of 2018, expanded its streaming venue to over 190 countries globally, is the wellspring of knowledge for many people. Through the multimodal critical discourse analysis of 6 Netflix films and television programmes (Stateless, Gods of Egypt, Messiah, Al Hayba, Sand Castle, and Fauda) the study examines the extent to which the streaming giant is culpable in the reproduction of Orientalist discourses of power, i.e., discourses which facilitate the construction of the stereotyped Other. The results have shown that Netflix strengthens, through the dissemination and distribution of symbols and messages to the general population, the domination and authority over society and its political, economic, cultural, and ideological domains. Using Norman Fairclough’s approach to critical discourse analysis combined with a social semiotic perspective, this study endeavours to design a comprehensive methodological and theoretical framework which can be utilized by future researchers to analyse and critique particular power dynamics within society by exposing the dominant ideological world-view distortions which reinforce oppressive structures and institutional practices.
7

Huang, Zhi. "Integrative Analysis of Multimodal Biomedical Data with Machine Learning". Thesis, 2021.

Abstract
With the rapid development of high-throughput technologies and next-generation sequencing (NGS) during the past decades, the bottleneck for advances in computational biology and bioinformatics research has shifted from data collection to data analysis. As one of the central goals in precision health, understanding and interpreting high-dimensional biomedical data is of major interest in the computational biology and bioinformatics domains. Since significant effort has been committed to harnessing biomedical data for multiple analyses, this thesis aims to develop new machine learning approaches to help discover and interpret the complex mechanisms and interactions behind the high-dimensional features in biomedical data. Moreover, this thesis also studies the prediction of post-treatment response from histopathologic images with machine learning.

Capturing the important features behind biomedical data can be achieved in many ways, such as network and correlation analyses, dimensionality reduction, and image processing. In this thesis, we accomplish the computation through co-expression analysis, survival analysis, and matrix decomposition in supervised and unsupervised learning settings. We use co-expression analysis as upfront feature engineering and implement survival regression in deep learning to predict patient survival and discover associated factors. By integrating Cox proportional hazards regression into a non-negative matrix factorization algorithm, the latent clusters of human genes are uncovered. Using machine learning and an automatic feature extraction workflow, we extract thirty-six image features from histopathologic images and use them to predict post-treatment response. In addition, a web portal written in R is built to facilitate future biomedical studies and analyses.

In conclusion, driven by machine learning algorithms, this thesis focuses on the integrative analysis of multimodal biomedical data, especially supervised cancer patient survival prognosis, the recognition of latent gene clusters, and the prediction of post-treatment response from histopathologic images. The proposed computational algorithms show superiority compared to other state-of-the-art models and provide new insights for future biomedical and cancer studies.
8

Amelio, Ravelli Andrea. "Annotation of Linguistically Derived Action Concepts in Computer Vision Datasets". Doctoral thesis, 2020. http://hdl.handle.net/2158/1200356.

Abstract
In the present work, an in-depth exploration of the IMAGACT Ontology of Action Verbs has been traced, with a focus on exploiting the resource in NLP tasks. Starting from the Introduction, the idea of making use of IMAGACT multimodal action conceptualisation has been drawn, with some reflections on evidence of the deep linking between Language and Vision, and on the fact that action plays a key role in this linkage. Thus, the multimodal and multilingual features of IMAGACT have been described, along with some details on the framework of the resource building. A concrete case study on IMAGACT internal data followed, which led to the proposal of an inter-linguistic manual mapping between the Action Types of verbs which refer to cutting eventualities in English and Italian. Then, a series of experiments has been presented, involving the exploitation of IMAGACT in linking with other resources and building deliverable NLP products (such as the Ref-vectors of action verbs). One of the experiments has been described extensively: the visual enrichment of IMAGACT through instance population of its action concepts, making use of Audio Description of movies for visually impaired people. From this last experiment it emerged that dealing with non-conventional scenarios, such as assessing action reference similarity between texts from different domains, is particularly challenging, given that fine-grained differences among action concepts are difficult to derive purely from the textual representation.
9

Thompson, Robyn Dyan. "Philosophy for children in a foundation phase literacy classroom in South Africa : multimodal representations of knowledge". Thesis, 2015. http://hdl.handle.net/10539/17833.

Abstract
The aim of this research is to understand how children explore and communicate philosophical concepts in the oral, written and visual modes as part of a literacy lesson and how Philosophy for Children (P4C) can be used as an approach in the Foundation Phase classroom. An additional aim is to determine whether a P4C approach complies with the National Curriculum’s requirements as stipulated in the CAPS documents to develop young children’s creative and critical thinking. This research study was important as it has implications for the theory and the practice of teaching early literacy in South Africa, in particular thinking, reasoning and comprehension. The research was carried out with my own class of Grade Two children as active participants throughout the process. Action research proved to be the most suitable methodology for this study as this methodology encourages both practitioner-based research and self-reflective practice. The research provides evidence that the visual mode can be a sophisticated mode of communication and not only an aesthetic activity that supplements the written work. This mode allows children to express their own original ideas and offers rich material for reflection on children’s thinking and reasoning.

Books on the topic "Multimodal Knowledge Representation"

1

Reilly, Jamie and Nadine Martin. Semantic Processing in Transcortical Sensory Aphasia. Edited by Anastasia M. Raymer and Leslie J. Gonzalez Rothi. Oxford University Press, 2015. http://dx.doi.org/10.1093/oxfordhb/9780199772391.013.6.

Abstract
Transcortical sensory aphasia (TCSA) has historically been regarded as a disconnection syndrome characterized by impaired access between words and otherwise intact core object knowledge. Yet, an extensive body of research has also demonstrated a range of associated nonverbal semantic deficits in TCSA, suggestive of a multimodal semantic impairment that transcends representational modality (i.e., language). Here we delineate the semantic impairment incurred in TCSA within a neurologically constrained model of semantic memory premised upon dynamic interactivity between stored knowledge (e.g., semantic features) and integrative processes that serve to bind this knowledge into cohesive object representations. We discuss practical implications for clinical aphasiology and outline considerations for the broader fields of cognitive neuropsychology and neurolinguistics.
2

Dove, Guy. Abstract Concepts and the Embodied Mind. Oxford University Press, 2022. http://dx.doi.org/10.1093/oso/9780190061975.001.0001.

Abstract
Our thoughts depend on knowledge about objects, people, properties, and events. In order to think about where we left our keys, what we are going to make for dinner, when we last fed the dogs, and how we are going to survive our next visit with our family, we need to know something about locations, keys, cooking, dogs, survival, families, and so on. Researchers have sought to explain how our brains can store and access such general knowledge. A growing body of evidence suggests that many of our concepts are grounded in action, emotion, and perception systems. We appear to think about the world by means of the same mechanisms that we use to experience it. Abstract concepts like “democracy,” “fermion,” “piety,” “truth,” and “zero” represent a clear challenge to this idea. Given that they represent a uniquely human cognitive achievement, answering the question of how we acquire and use them is central to our ability to understand ourselves. In Abstract Concepts and the Embodied Mind, Guy Dove contends that abstract concepts are heterogeneous and pose three important challenges to embodied cognition. They force us to ask these questions: How do we generalize beyond the specifics of our experience? How do we think about things that we do not experience directly? How do we adapt our thoughts to specific contexts and tasks? He argues that a successful theory of grounding must embrace multimodal representations, hierarchical architecture, and linguistic scaffolding. Abstract concepts are the product of an elastic mind.

Book chapters on the topic "Multimodal Knowledge Representation"

1

Latoschik, Marc Erich, Peter Biermann and Ipke Wachsmuth. "Knowledge in the Loop: Semantics Representation for Multimodal Simulative Environments". In Smart Graphics, 25–39. Berlin, Heidelberg: Springer Berlin Heidelberg, 2005. http://dx.doi.org/10.1007/11536482_3.

2

De Silva, Daswin, Damminda Alahakoon and Shyamali Dharmage. "Extensions to Knowledge Acquisition and Effect of Multimodal Representation in Unsupervised Learning". In Studies in Computational Intelligence, 281–305. Berlin, Heidelberg: Springer Berlin Heidelberg, 2009. http://dx.doi.org/10.1007/978-3-642-01082-8_11.

3

McTear, Michael, Kristiina Jokinen, Mohnish Dubey, Gérard Chollet, Jérôme Boudy, Christophe Lohr, Sonja Dana Roelen, Wanja Mössing and Rainer Wieching. "Empowering Well-Being Through Conversational Coaching for Active and Healthy Ageing". In Lecture Notes in Computer Science, 257–65. Cham: Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-031-09593-1_21.

Abstract
With life expectancy growing rapidly over the past century, societies are being increasingly faced with a need to find smart living solutions for elderly care and active ageing. The e-VITA project, which is a joint European (H2020) and Japanese (MIC) funded project, is based on an innovative approach to virtual coaching that addresses the crucial domains of active and healthy ageing. In this paper we describe the role of spoken dialogue technology in the project. Requirements for the virtual coach were elicited through a process of participatory design in workshops, focus groups, and living labs, and a number of use cases were identified for development using the open-source RASA framework. Knowledge Graphs are used as a shared representation within the system, enabling an integration of multimodal data, context, and domain knowledge.
4

Danielsson, Kristina and Staffan Selander. "Semiotic Modes and Representations of Knowledge". In Multimodal Texts in Disciplinary Education, 17–23. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-63960-0_3.

Abstract
When organizing our understanding of the world around us, we use semiotic resources (e.g. Kress 2010). Semiotic resources are resources that we use to organize our understanding of the world and to make meaning in communication with others, or to make meaning for ourselves.
5

Moschini, Ilaria and Maria Grazia Sindoni. "The Digital Mediation of Knowledge, Representations and Practices through the Lenses of a Multimodal Theory of Communication". In Mediation and Multimodal Meaning Making in Digital Environments, 1–14. New York: Routledge, 2021. http://dx.doi.org/10.4324/9781003225423-1.

6

Zhang, Chao and Jiawei Han. "Data Mining and Knowledge Discovery". In Urban Informatics, 797–814. Singapore: Springer Singapore, 2021. http://dx.doi.org/10.1007/978-981-15-8983-6_42.

Abstract
Our physical world is being projected into online cyberspace at an unprecedented rate. People nowadays visit different places and leave behind them million-scale digital traces such as tweets, check-ins, Yelp reviews, and Uber trajectories. Such digital data are a result of social sensing: namely people act as human sensors that probe different places in the physical world and share their activities online. The availability of massive social-sensing data provides a unique opportunity for understanding urban space in a data-driven manner and improving many urban computing applications, ranging from urban planning and traffic scheduling to disaster control and trip planning. In this chapter, we present recent developments in data-mining techniques for urban activity modeling, a fundamental task for extracting useful urban knowledge from social-sensing data. We first describe traditional approaches to urban activity modeling, including pattern discovery methods and statistical models. Then, we present the latest developments in multimodal embedding techniques for this task, which learn vector representations for different modalities to model people's spatiotemporal activities. We study the empirical performance of these methods and demonstrate how data-mining techniques can be successfully applied to social-sensing data to extract actionable knowledge and facilitate downstream applications.
7

Scrocca, Mario, Marco Comerio, Alessio Carenini and Irene Celino. "Modelling Business Agreements in the Multimodal Transportation Domain Through Ontological Smart Contracts". In Towards a Knowledge-Aware AI. IOS Press, 2022. http://dx.doi.org/10.3233/ssw220016.

Abstract
The blockchain technology provides integrity and reliability of the information, thus offering a suitable solution to guarantee trustability in a multi-stakeholder scenario that involves actors defining business agreements. The Ride2Rail project investigated the use of the blockchain to record as smart contracts the agreements between different stakeholders defined in a multimodal transportation domain. Modelling an ontology to represent the smart contracts enables the possibility of having a machine-readable and interoperable representation of the agreements. On one hand, the underlying blockchain ensures trust in the execution of the contracts, on the other hand, their ontological representation facilitates the retrieval of information within the ecosystem. The paper describes the development of the Ride2Rail Ontology for Agreements to showcase how the concept of an ontological smart contract, defined in the OASIS ontology, can be applied to a specific domain. The usage of the designed ontology is discussed by describing the modelling as ontological smart contracts of business agreements defined in a ride-sharing scenario.
8

Farmer, Lesley S. J. "Extensions of Content Analysis in the Creation of Multimodal Knowledge Representations". In Advances in Library and Information Science, 63–81. IGI Global, 2018. http://dx.doi.org/10.4018/978-1-5225-5164-5.ch005.

Abstract
Information architecture is the structural design of shared information environments, optimizing users' interaction with that knowledge representation. This chapter explains knowledge representation and information architecture, focusing on comic arts' features for representing and structuring knowledge. Then it details information design theory and information behaviors relative to this format, also noting visual literacy. With this background, an expanded view of content analysis as a research method, combining information design to represent knowledge and information architecture within the context of comic arts, is explained and concretized. The chapter also recommends strategies for addressing knowledge acquisition and communication through effective knowledge representation.
9

Khakhalin, Gennady K., Sergey S. Kurbatov, Xenia Naidenova and Alex P. Lobzin. "Integration of the Image and NL-text Analysis/Synthesis Systems". In Intelligent Data Analysis for Real-Life Applications, 160–85. IGI Global, 2012. http://dx.doi.org/10.4018/978-1-4666-1806-0.ch009.

Abstract
A complex combining multimodal intelligent systems is described. The complex consists of the following systems: image analyzer, image synthesizer, linguistic analyzer of NL-text, and synthesizer of NL-text and applied ontology. The ontology describes the knowledge common for these systems. The analyzers use the applied ontology language for describing the results of their work, and this language is input for the synthesizers. The language of semantic hypergraphs has been selected for ontological knowledge representation. It is an extension of semantic networks. Plane geometry (planimetry) has been selected as an applied domain of the complex. The complex’s systems and their interaction are described.
10

Castellano Sanz, Margarida. "Challenging Picturebooks and Domestic Geographies". In Advances in Psychology, Mental Health, and Behavioral Studies, 213–35. IGI Global, 2022. http://dx.doi.org/10.4018/978-1-6684-4735-2.ch015.

Abstract
The COVID-19 pandemic has brought new ways of facing the world and its multiple realities. Picturebooks, as a crossover genre, help in the process of understanding new contexts while offering literature as therapy. The new challenges of the 21st century require the implementation of methodologies that focus both on words and on other modes of representation to construct knowledge, and to this end, literacy education through challenging picturebooks involves paying attention to the diverse pedagogical demands of a global, aesthetic, and multimodal world. This chapter supports the notion of visual literacy as a multidimensional concept and proposes the approach to different picturebooks dealing with neighbours and neighbourhoods through the pedagogy of the multiliteracies, in order to enhance the transformation of the self and the global and local understanding of the current world. A learning path for teachers is designed according to the four knowledge processes: experiencing, conceptualizing, analyzing, and applying.

Conference proceedings on the topic "Multimodal Knowledge Representation"

1

"KNOWLEDGE-BASED MULTIMODAL DATA REPRESENTATION AND QUERYING". In International Conference on Knowledge Engineering and Ontology Development. SciTePress - Science and Technology Publications, 2011. http://dx.doi.org/10.5220/0003627901520158.

2

Wang, Zikang, Linjing Li, Qiudan Li and Daniel Zeng. "Multimodal Data Enhanced Representation Learning for Knowledge Graphs". In 2019 International Joint Conference on Neural Networks (IJCNN). IEEE, 2019. http://dx.doi.org/10.1109/ijcnn.2019.8852079.

3

Sun, Chenkai, Weijiang Li, Jinfeng Xiao, Nikolaus Nova Parulian, ChengXiang Zhai and Heng Ji. "Fine-Grained Chemical Entity Typing with Multimodal Knowledge Representation". In 2021 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). IEEE, 2021. http://dx.doi.org/10.1109/bibm52615.2021.9669360.

4

Mousselly Sergieh, Hatem, Teresa Botschen, Iryna Gurevych and Stefan Roth. "A Multimodal Translation-Based Approach for Knowledge Graph Representation Learning". In Proceedings of the Seventh Joint Conference on Lexical and Computational Semantics. Stroudsburg, PA, USA: Association for Computational Linguistics, 2018. http://dx.doi.org/10.18653/v1/s18-2027.

5

Jenkins, Porter, Ahmad Farag, Suhang Wang and Zhenhui Li. "Unsupervised Representation Learning of Spatial Data via Multimodal Embedding". In CIKM '19: The 28th ACM International Conference on Information and Knowledge Management. New York, NY, USA: ACM, 2019. http://dx.doi.org/10.1145/3357384.3358001.

6

Ćalić, J., N. Campbell, S. Dasiopoulou and Y. Kompatsiaris. "An overview of multimodal video representation for semantic analysis". In 2nd European Workshop on the Integration of Knowledge, Semantics and Digital Media Technology (EWIMT 2005). IET, 2005. http://dx.doi.org/10.1049/ic.2005.0708.

7

Liu, Wenxuan, Hao Duan, Zeng Li, Jingdong Liu, Hong Huo and Tao Fang. "Entity Representation Learning with Multimodal Neighbors for Link Prediction in Knowledge Graph". In 2021 7th International Conference on Computer and Communications (ICCC). IEEE, 2021. http://dx.doi.org/10.1109/iccc54389.2021.9674496.

8

Gôlo, Marcos P. S., Rafael G. Rossi and Ricardo M. Marcacini. "Triple-VAE: A Triple Variational Autoencoder to Represent Events in One-Class Event Detection". In Encontro Nacional de Inteligência Artificial e Computacional. Sociedade Brasileira de Computação - SBC, 2021. http://dx.doi.org/10.5753/eniac.2021.18291.

Abstract
Events are phenomena that occur at a specific time and place. Their detection can bring benefits to society since it is possible to extract knowledge from these events. Event detection is a multimodal task since these events have textual, geographical, and temporal components. Most multimodal research in the literature uses the concatenation of the components to represent the events. These approaches use multi-class or binary learning to detect events of interest, which increases the user's labeling effort, since the user must label event classes even if there is no interest in detecting them. In this paper, we present the Triple-VAE approach that learns a unified representation from textual, spatial, and density modalities through a variational autoencoder, one of the state-of-the-art approaches in representation learning. Our proposed Triple-VAE obtains suitable event representations for one-class classification, where users provide labels only for events of interest, thereby reducing the labeling effort. We carried out an experimental evaluation with ten real-world event datasets, four multimodal representation methods, and five evaluation metrics. Triple-VAE outperforms the other three representation methods in all datasets, with a statistically significant difference. Therefore, Triple-VAE proved to be promising for representing events in the one-class event detection scenario.
9

Moctezuma, Daniela, Víctor Muníz and Jorge García. "Multimodal Data Evaluation for Classification Problems". In 7th International Conference on VLSI and Applications (VLSIA 2021). Academy and Industry Research Collaboration Center (AIRCC), 2021. http://dx.doi.org/10.5121/csit.2021.112105.

Abstract
Social media data is currently the main input to a wide variety of research works in many knowledge fields. This kind of data is generally multimodal, i.e., it contains different modalities of information, mainly text, images, video or audio. Dealing with multimodal data to tackle a specific task can be very difficult. One of the main challenges is to find useful representations of the data, capable of capturing the subtle information provided by the users who generate it, or even the way they use it. In this paper, we analysed the usage of two modalities of data, images and text, both separately and in combination, to address two classification problems: meme classification and user profiling. For images, we use a textual semantic representation obtained from a pre-trained image captioning model. A text classifier based on optimal lexical representations was then used to build a classification model. The usage of these two modalities of data yielded interesting findings, and the pros and cons of using them to solve the two classification problems are also discussed.
10

Oliveira, Angelo Schranko and Renato José Sassi. "Hunting Android Malware Using Multimodal Deep Learning and Hybrid Analysis Data". In Congresso Brasileiro de Inteligência Computacional. SBIC, 2021. http://dx.doi.org/10.21528/cbic2021-32.

Abstract
In this work, we propose a new multimodal Deep Learning (DL) Android malware detection method, Chimera, that combines both manual and automatic feature engineering by using the DL architectures, Convolutional Neural Networks (CNN), Deep Neural Networks (DNN), and Transformer Networks (TN) to perform feature learning from raw data (Dalvik Executables (DEX)), static analysis data (Android Intents & Permissions), and dynamic analysis data (system call sequences) respectively. To train and evaluate our model, we implemented the Knowledge Discovery in Databases (KDD) process and used the publicly available Android benchmark dataset Omnidroid. By leveraging a hybrid source of information to learn high-level feature representations for both the static and dynamic properties of Android applications, Chimera’s detection Accuracy, Precision, and Recall outperform classical Machine Learning (ML) algorithms, state-of-the-art Ensemble, and Voting Ensembles ML methods, as well as unimodal DL methods using CNNs, DNNs, TNs, and Long-Short Term Memory Networks (LSTM). To the best of our knowledge, this is the first work that successfully applies multimodal DL to combine those three different modalities of data using DNNs, CNNs, and TNs to learn a shared representation that can be used in Android malware detection tasks.