Academic literature on the topic 'Multi-modal Machine Learning'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Multi-modal Machine Learning.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference for the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Journal articles on the topic "Multi-modal Machine Learning"

1. Liang, Haotian, and Zhanqing Wang. "Hierarchical Attention Networks for Multimodal Machine Learning." Journal of Physics: Conference Series 2218, no. 1 (March 1, 2022): 012020. http://dx.doi.org/10.1088/1742-6596/2218/1/012020.

Abstract: The Visual Question Answering (VQA) task is to infer the correct answer to a free-form question based on a given image. The task is challenging because it requires the model to handle both visual and textual information. Most successful attempts at VQA have used attention mechanisms, which can capture inter-modal and intra-modal dependencies. In this paper, we propose a new attention-based model for VQA. We use question information to guide the model to concentrate on specific regions and attributes and to reason hierarchically toward the answer. We also propose a multi-modal fusion strategy based on a co-attention method to fuse visual and textual information. Extensive experiments on the VQA-v2.0 dataset show that, under the same experimental conditions, our method outperforms several state-of-the-art methods.

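To make the question-guided co-attention fusion described in this abstract concrete, here is a minimal sketch: a question vector scores image regions, the attended visual summary is fused element-wise with the question, and a classifier predicts the answer. This is an editorial illustration, not the authors' architecture; the dimensions, single attention layer, and classifier head are all assumptions.

```python
# Minimal sketch (not the paper's model): question-guided attention over image
# regions, then element-wise fusion of the attended visual and question vectors.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CoAttentionFusion(nn.Module):
    def __init__(self, img_dim=2048, q_dim=1024, hidden=512, n_answers=3000):
        super().__init__()
        self.img_proj = nn.Linear(img_dim, hidden)
        self.q_proj = nn.Linear(q_dim, hidden)
        self.att = nn.Linear(hidden, 1)
        self.classifier = nn.Linear(hidden, n_answers)

    def forward(self, img_regions, question):
        # img_regions: (B, R, img_dim) region features; question: (B, q_dim)
        v = self.img_proj(img_regions)                    # (B, R, hidden)
        q = self.q_proj(question).unsqueeze(1)            # (B, 1, hidden)
        scores = self.att(torch.tanh(v + q)).squeeze(-1)  # question-guided scores (B, R)
        alpha = F.softmax(scores, dim=1)                  # attention over regions
        attended = (alpha.unsqueeze(-1) * v).sum(dim=1)   # weighted visual summary
        fused = attended * q.squeeze(1)                   # element-wise multimodal fusion
        return self.classifier(fused)                     # answer logits
```
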
2. Nachiappan, Balusamy, N. Rajkumar, C. Viji, and Mohanraj A. "Artificial and Deceitful Faces Detection Using Machine Learning." Salud, Ciencia y Tecnología - Serie de Conferencias 3 (March 11, 2024): 611. http://dx.doi.org/10.56294/sctconf2024611.

Abstract: Security authentication is becoming popular for many applications, such as significant financial transactions. PIN and password authentication is the most common method, but because of the finite length of a password its security level is low and it can easily be compromised. This work adds a new sensing modality to state-of-the-art image-based face recognition, producing a multi-modal system. It combines visual features extracted from a recent facial recognition model with a custom Convolutional Neural Network (CNN) that provides facial authentication and feature-extraction capabilities to secure face recognition. The echo response depends on the geometry and material of the face and cannot be spoofed by pictures or videos, so the multi-modal design is harder to fool than purely image-based face recognition systems.

3. Liu, Ang, Tianying Lin, Hailong Han, Xiaopei Zhang, Ze Chen, Fuwan Gan, Haibin Lv, and Xiaoping Liu. "Analyzing modal power in multi-mode waveguide via machine learning." Optics Express 26, no. 17 (August 10, 2018): 22100. http://dx.doi.org/10.1364/oe.26.022100.

4. Liu, Huaping, Jing Fang, Xinying Xu, and Fuchun Sun. "Surface Material Recognition Using Active Multi-modal Extreme Learning Machine." Cognitive Computation 10, no. 6 (July 4, 2018): 937–50. http://dx.doi.org/10.1007/s12559-018-9571-z.

5. Wei, Jie, Huaping Liu, Gaowei Yan, and Fuchun Sun. "Robotic grasping recognition using multi-modal deep extreme learning machine." Multidimensional Systems and Signal Processing 28, no. 3 (March 3, 2016): 817–33. http://dx.doi.org/10.1007/s11045-016-0389-0.

6. A, Mr Balaji. "Extracting Audio from Image Using Machine Learning." INTERANTIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT 08, no. 04 (April 24, 2024): 1–5. http://dx.doi.org/10.55041/ijsrem31532.

Abstract: This study introduces a new method for extracting sound from pictures using machine learning. Lately, there has been a lot of excitement around multi-modal learning because of its ability to reveal valuable information from various sources, such as images and sound. Our research centers on using the distinct qualities of visual and auditory signals to predict sound content from pictures, which opens up possibilities for enhancing accessibility, creating content, and providing immersive user experiences. We start by exploring previous research in multi-modal learning, audio-visual processing, and tasks like image captioning and sound source localization. Based on this background, we introduce an approach that merges convolutional neural networks (CNNs) for image analysis with recurrent neural networks (RNNs) or transformers for sequence interpretation. The system is trained on a collection of matched images and associated audio tracks, allowing it to grasp the intricate connections between visual and auditory data. We carefully assess the performance of our proposed method using well-known metrics, comparing it to other methods and showing that it can accurately and quickly extract audio from images. We also show through qualitative analysis that our model can create clear audio representations from a variety of visual inputs. After a thorough discussion, we analyze the findings, pointing out both the advantages and drawbacks of our method. We pinpoint potential areas for further study, such as delving into more advanced architectures and incorporating semantic data to enhance audio extraction. To sum up, this study adds to the expanding field of multi-modal learning by introducing a promising model for extracting audio from images through machine learning. Our results emphasize the potential of this technology to improve accessibility, inspire creativity, and increase user engagement in different fields. Key Words: Audio Extraction, Machine Learning, Computer Vision, Deep Learning, Convolutional Neural Networks

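As a rough illustration of the CNN-plus-RNN pairing this abstract describes, the toy model below lets a convolutional image encoder initialize a recurrent decoder that predicts a sequence of audio frames. All module choices, shapes, and the teacher-forcing interface are assumptions for illustration, not details from the paper.

```python
# Editorial sketch: CNN image encoder conditions a GRU that emits audio frames
# (e.g., mel-spectrogram columns). Shapes and modules are hypothetical.
import torch
import torch.nn as nn

class Image2Audio(nn.Module):
    def __init__(self, audio_dim=80, hidden=256):
        super().__init__()
        self.cnn = nn.Sequential(                       # toy convolutional encoder
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.init_h = nn.Linear(32, hidden)             # image summary -> RNN state
        self.rnn = nn.GRU(audio_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, audio_dim)

    def forward(self, images, prev_frames):
        # images: (B, 3, H, W); prev_frames: (B, T, audio_dim) teacher-forced inputs
        h0 = self.init_h(self.cnn(images)).unsqueeze(0)  # (1, B, hidden)
        y, _ = self.rnn(prev_frames, h0)
        return self.out(y)                               # predicted next audio frames
```
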
7. Asim, Yousra, Basit Raza, Ahmad Kamran Malik, Saima Rathore, Lal Hussain, and Mohammad Aksam Iftikhar. "A multi-modal, multi-atlas-based approach for Alzheimer detection via machine learning." International Journal of Imaging Systems and Technology 28, no. 2 (January 10, 2018): 113–23. http://dx.doi.org/10.1002/ima.22263.

8. G, Nandhini, and Santosh K. Balivada. "Multi-Modal Feature Integration in Machine Learning Predictions for Cardiovascular Diseases." International Journal of Health Technology and Innovation 2, no. 03 (December 7, 2023): 15–18. http://dx.doi.org/10.60142/ijhti.v2i03.03.

Abstract: Early detection and prevention of cardiovascular illnesses rely heavily on the phonocardiogram (PCG) and electrocardiogram (ECG). This work presents a novel multi-modal machine learning strategy based on ECG and PCG data for predicting cardiovascular disease (CVD). ECG and PCG features are combined, and an optimal feature subset is selected using a genetic algorithm (GA). Machine learning classifiers are then used to classify signals as normal or abnormal.

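The GA-based feature-subset selection step can be sketched in a few lines: candidate subsets are bit masks, fitness is cross-validated accuracy, and fitter masks are recombined and mutated. This is a generic illustration under assumed settings (SVC fitness, single-point crossover, bit-flip mutation), not the paper's implementation.

```python
# Editorial sketch of GA feature-subset selection over concatenated ECG+PCG features.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)

def ga_select(X, y, pop=20, gens=30, mut=0.05):
    """Evolve boolean feature masks; fitness is cross-validated accuracy."""
    n = X.shape[1]
    masks = rng.random((pop, n)) < 0.5                 # random initial subsets

    def fitness(m):
        return cross_val_score(SVC(), X[:, m], y, cv=5).mean() if m.any() else 0.0

    for _ in range(gens):
        scores = np.array([fitness(m) for m in masks])
        top = masks[np.argsort(scores)[-(pop // 2):]]  # keep the fitter half
        children = []
        while len(top) + len(children) < pop:
            a, b = top[rng.integers(len(top), size=2)]
            cut = rng.integers(1, n)                   # single-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            child ^= rng.random(n) < mut               # bit-flip mutation
            children.append(child)
        masks = np.vstack([top, *children])
    scores = np.array([fitness(m) for m in masks])
    return masks[scores.argmax()]

# Usage with hypothetical arrays: best_mask = ga_select(np.hstack([ecg, pcg]), labels)
```
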
9. Liu, Huaping, Fengxue Li, Xinying Xu, and Fuchun Sun. "Multi-modal local receptive field extreme learning machine for object recognition." Neurocomputing 277 (February 2018): 4–11. http://dx.doi.org/10.1016/j.neucom.2017.04.077.

10. Lamichhane, Bidhan, Dinal Jayasekera, Rachel Jakes, Matthew F. Glasser, Justin Zhang, Chunhui Yang, Derayvia Grimes, et al. "Multi-modal biomarkers of low back pain: A machine learning approach." NeuroImage: Clinical 29 (2021): 102530. http://dx.doi.org/10.1016/j.nicl.2020.102530.

Dissertations / Theses on the topic "Multi-modal Machine Learning"

1. McCalman, Lachlan Robert. "Function Embeddings for Multi-modal Bayesian Inference." Thesis, The University of Sydney, 2013. http://hdl.handle.net/2123/12031.

Abstract: Tractable Bayesian inference is a fundamental challenge in robotics and machine learning. Standard approaches such as Gaussian process regression and Kalman filtering make strong Gaussianity assumptions about the underlying distributions. Such assumptions, however, can quickly break down when dealing with complex systems such as the dynamics of a robot or multi-variate spatial models. In this thesis we aim to solve Bayesian regression and filtering problems without making assumptions about the underlying distributions. We develop techniques to produce rich posterior representations for complex, multi-modal phenomena. Our work extends kernel Bayes' rule (KBR), which uses empirical estimates of distributions derived from a set of training samples and embeds them into a high-dimensional reproducing kernel Hilbert space (RKHS). Bayes' rule itself occurs on elements of this space. Our first contribution is the development of an efficient method for estimating posterior density functions from kernel Bayes' rule, applied to both filtering and regression. By embedding fixed-mean mixtures of component distributions, we can efficiently find an approximate pre-image by optimising the mixture weights using a convex quadratic program. The result is a complex, multi-modal posterior representation. Our next contributions are methods for estimating cumulative distributions and quantile estimates from the posterior embedding of kernel Bayes' rule. We examine a number of novel methods, including those based on our density estimation techniques, as well as directly estimating the cumulative through use of the reproducing property of RKHSs. Finally, we develop a novel method for scaling kernel Bayes' rule inference to large datasets, using a reduced-set construction optimised using the posterior likelihood. This method retains the ability to perform multi-output inference, as well as our earlier contributions to represent explicitly non-Gaussian posteriors and quantile estimates.

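As a toy rendering of the "fixed-mean mixture with convex-program weights" idea, the sketch below approximates an empirical kernel mean embedding by nonnegative, normalized weights over fixed candidate points. It is an editorial simplification under assumed choices (RBF kernel, simplex constraint, SLSQP solver), not the thesis code.

```python
# Editorial sketch: fit mixture weights so that sum_j w_j k(c_j, .) approximates
# the empirical kernel mean embedding mu_hat = (1/n) sum_i k(x_i, .).
import numpy as np
from scipy.optimize import minimize
from sklearn.metrics.pairwise import rbf_kernel

def mixture_weights(samples, candidates, gamma=1.0):
    K_cc = rbf_kernel(candidates, candidates, gamma=gamma)
    m = rbf_kernel(candidates, samples, gamma=gamma).mean(axis=1)  # <k(c_j,.), mu_hat>
    n = len(candidates)
    # Convex QP: minimize w'K_cc w - 2 w'm  subject to w >= 0, sum(w) = 1
    obj = lambda w: w @ K_cc @ w - 2 * w @ m
    cons = {"type": "eq", "fun": lambda w: w.sum() - 1}
    res = minimize(obj, np.full(n, 1 / n), bounds=[(0, None)] * n, constraints=cons)
    return res.x  # mixture weights defining the approximate pre-image
```
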
2. Bohg, Jeannette. "Multi-Modal Scene Understanding for Robotic Grasping." Doctoral thesis, KTH, Datorseende och robotik, CVAP, 2011. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-49062.

Abstract: Current robotics research is largely driven by the vision of creating an intelligent being that can perform dangerous, difficult or unpopular tasks. These can for example be exploring the surface of planet Mars or the bottom of the ocean, maintaining a furnace or assembling a car. They can also be more mundane, such as cleaning an apartment or fetching groceries. This vision has been pursued since the 1960s, when the first robots were built. Some of the tasks mentioned above, especially those in industrial manufacturing, are already frequently performed by robots. Others are still completely out of reach. Especially, household robots are far away from being deployable as general-purpose devices. Although advancements have been made in this research area, robots are not yet able to perform household chores robustly in unstructured and open-ended environments given unexpected events and uncertainty in perception and execution. In this thesis, we analyze which perceptual and motor capabilities are necessary for the robot to perform common tasks in a household scenario. In that context, an essential capability is to understand the scene that the robot has to interact with. This involves separating objects from the background but also from each other. Once this is achieved, many other tasks become much easier: the configuration of objects can be determined; they can be identified or categorized; their pose can be estimated; free and occupied space in the environment can be outlined. This kind of scene model can then inform grasp planning algorithms to finally pick up objects. However, scene understanding is not a trivial problem and even state-of-the-art methods may fail. Given an incomplete, noisy and potentially erroneously segmented scene model, the questions remain how suitable grasps can be planned and how they can be executed robustly. In this thesis, we propose to equip the robot with a set of prediction mechanisms that allow it to hypothesize about parts of the scene it has not yet observed. Additionally, the robot can also quantify how uncertain it is about this prediction, allowing it to plan actions for exploring the scene at specifically uncertain places. We consider multiple modalities including monocular and stereo vision, haptic sensing and information obtained through a human-robot dialog system. We also study several scene representations of different complexity and their applicability to a grasping scenario. Given an improved scene model from this multi-modal exploration, grasps can be inferred for each object hypothesis. Depending on whether the objects are known, familiar or unknown, different methodologies for grasp inference apply. In this thesis, we propose novel methods for each of these cases. Furthermore, we demonstrate the execution of these grasps both in a closed- and open-loop manner, showing the effectiveness of the proposed methods in real-world scenarios.

3. Ben-Younes, Hedi. "Multi-modal representation learning towards visual reasoning." Electronic Thesis or Diss., Sorbonne université, 2019. http://www.theses.fr/2019SORUS173.

Abstract: The quantity of images that populate the Internet is dramatically increasing. It has become critically important to develop technology for a precise and automatic understanding of visual contents. As image recognition systems become more and more relevant, researchers in artificial intelligence now seek the next generation of vision systems that can perform high-level scene understanding. In this thesis, we are interested in Visual Question Answering (VQA), which consists in building models that answer any natural language question about any image. Because of its nature and complexity, VQA is often considered a proxy for visual reasoning. Classically, VQA architectures are designed as trainable systems that are provided with images, questions about them, and their answers. To tackle this problem, typical approaches involve modern Deep Learning (DL) techniques. In the first part, we focus on developing multi-modal fusion strategies to model the interactions between image and question representations. More specifically, we explore bilinear fusion models and exploit concepts from tensor analysis to provide tractable and expressive factorizations of parameters. These fusion mechanisms are studied under the widely used visual attention framework: the answer to the question is provided by focusing only on the relevant image regions. In the last part, we move away from the attention mechanism and build a more advanced scene understanding architecture in which we consider objects and their spatial and semantic relations. All models are thoroughly evaluated on standard datasets, and the results are competitive with the literature.

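A compact way to picture the tensor-factorized bilinear fusion discussed here is a low-rank bilinear layer: each output coordinate sums rank-many products of projected image and question features. The snippet is a generic sketch in that spirit (dimensions and rank are arbitrary assumptions), not the thesis model.

```python
# Editorial sketch of low-rank bilinear fusion between image and question vectors.
import torch
import torch.nn as nn

class LowRankBilinearFusion(nn.Module):
    """z = sum over r of (U_r v) * (V_r q): a factorized bilinear interaction."""
    def __init__(self, v_dim=2048, q_dim=1024, out_dim=512, rank=10):
        super().__init__()
        self.U = nn.Linear(v_dim, out_dim * rank, bias=False)
        self.V = nn.Linear(q_dim, out_dim * rank, bias=False)
        self.out_dim, self.rank = out_dim, rank

    def forward(self, v, q):
        hv = self.U(v).view(-1, self.out_dim, self.rank)  # projected image factors
        hq = self.V(q).view(-1, self.out_dim, self.rank)  # projected question factors
        return (hv * hq).sum(dim=-1)                      # sum over rank -> (B, out_dim)
```
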
4. Michel, Fabrice. "Multi-Modal Similarity Learning for 3D Deformable Registration of Medical Images." Phd thesis, Ecole Centrale Paris, 2013. http://tel.archives-ouvertes.fr/tel-01005141.

Abstract: Even though the prospect of fusing images issued by different medical imaging systems is highly contemplated, its practical instantiation is subject to a theoretical hurdle: the definition of a similarity between images. Efforts in this field have proved successful for select pairs of images; however, defining a suitable similarity between images regardless of their origin is one of the biggest challenges in deformable registration. In this thesis, we chose to develop generic approaches that allow the comparison of any two given modalities. Recent advances in machine learning permitted us to provide innovative solutions to this very challenging problem. To tackle the problem of comparing incommensurable data, we chose to view it as a data embedding problem in which one embeds all the data in a common space where comparison is possible. To this end, we explored the projection of one image space onto the image space of the other, as well as the projection of both image spaces onto a common image space in which the comparison calculations are conducted. This was done by studying the correspondences between image features in a pre-aligned dataset. In pursuit of these goals, new methods for image regression as well as multi-modal metric learning were developed. The resulting learned similarities are then incorporated into a discrete optimization framework that mitigates the need for a differentiable criterion. Lastly, we investigate a new method that discards the constraint of a database of pre-aligned images, requiring only data annotated (segmented) by a physician. Experiments are conducted on two challenging medical image datasets (pre-aligned MRI images and PET/CT images) to justify the benefits of our approach.

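One simple way to realize the "common space" idea this abstract describes is canonical correlation analysis over paired patch features. The sketch below is an editorial illustration, not the method of the thesis; features_mri and features_ct stand in for paired patch descriptors from pre-aligned scans (here filled with random data purely so the snippet runs).

```python
# Editorial sketch: embed two modalities into a shared space with CCA and
# compare them there by distance. Inputs are hypothetical paired features.
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)
features_mri = rng.normal(size=(500, 64))   # hypothetical MRI patch features
features_ct = rng.normal(size=(500, 64))    # hypothetical CT patch features

cca = CCA(n_components=16)
cca.fit(features_mri, features_ct)          # learn the common embedding from pairs
z_mri, z_ct = cca.transform(features_mri, features_ct)
similarity = -np.linalg.norm(z_mri - z_ct, axis=1)  # higher = more similar pair
```
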
5. Svoboda, Jiří. "Multi-modální 'Restricted Boltzmann Machines'." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2013. http://www.nusl.cz/ntk/nusl-236426.

Abstract: This thesis explores how multi-modal Restricted Boltzmann Machines (RBMs) can be used in content-based image tagging. The work also contains a brief analysis of modalities that can be used for multi-modal classification, and describes various RBMs suitable for different kinds of input data. The design and implementation of a multimodal RBM are described together with the results of preliminary experiments.

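For readers new to RBMs, the sketch below shows one contrastive-divergence (CD-1) update for a binary RBM whose visible layer simply concatenates features from two modalities; a true multimodal RBM ties modalities through shared hidden units in a similar spirit. Everything here (shapes, learning rate, binary units) is an illustrative assumption, not the thesis implementation.

```python
# Editorial sketch: one CD-1 update for an RBM over concatenated modality features
# (e.g., image descriptors alongside tag indicators). All names are hypothetical.
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

def cd1_step(W, b, c, v0, lr=0.01):
    # W: (n_visible, n_hidden); b: (n_visible,); c: (n_hidden,)
    # v0: (batch, n_visible) visible units from both modalities, concatenated
    ph0 = sigmoid(v0 @ W + c)                        # positive-phase hidden probs
    h0 = (rng.random(ph0.shape) < ph0).astype(float) # sample hidden states
    pv1 = sigmoid(h0 @ W.T + b)                      # reconstruct visibles
    ph1 = sigmoid(pv1 @ W + c)                       # negative-phase hidden probs
    W += lr * (v0.T @ ph0 - pv1.T @ ph1) / len(v0)   # gradient approximation
    b += lr * (v0 - pv1).mean(axis=0)
    c += lr * (ph0 - ph1).mean(axis=0)
    return W, b, c
```
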
6. Partin, Michael. "Scalable, Pluggable, and Fault Tolerant Multi-Modal Situational Awareness Data Stream Management Systems." Wright State University / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=wright1567073723628721.

7. Stein, Sebastian. "Multi-modal recognition of manipulation activities through visual accelerometer tracking, relational histograms, and user-adaptation." Thesis, University of Dundee, 2014. https://discovery.dundee.ac.uk/en/studentTheses/61c22b7e-5f02-4f21-a948-bf9e7b497120.

Abstract: Activity recognition research in computer vision and pervasive computing has made a remarkable trajectory from distinguishing full-body motion patterns to recognizing complex activities. Manipulation activities as occurring in food preparation are particularly challenging to recognize, as they involve many different objects, non-unique task orders and are subject to personal idiosyncrasies. Video data and data from embedded accelerometers provide complementary information, which motivates an investigation of effective methods for fusing these sensor modalities. This thesis proposes a method for multi-modal recognition of manipulation activities that combines accelerometer data and video at multiple stages of the recognition pipeline. A method for accelerometer tracking is introduced that provides for each accelerometer-equipped object a location estimate in the camera view by identifying a point trajectory that matches well the accelerometer data. It is argued that associating accelerometer data with locations in the video provides a key link for modelling interactions between accelerometer-equipped objects and other visual entities in the scene. Estimates of accelerometer locations and their visual displacements are used to extract two new types of features: (i) Reference Tracklet Statistics characterizes statistical properties of an accelerometer's visual trajectory, and (ii) RETLETS, a feature representation that encodes relative motion, uses an accelerometer's visual trajectory as a reference frame for dense tracklets. In comparison to a traditional sensor fusion approach where features are extracted from each sensor-type independently and concatenated for classification, it is shown that combining RETLETS and Reference Tracklet Statistics with those sensor-specific features performs considerably better. Specifically addressing scenarios in which a recognition system would be primarily used by a single person (e.g., cognitive situational support), this thesis investigates three methods for adapting activity models to a target user based on user-specific training data. Via randomized control trials it is shown that these methods indeed learn user idiosyncrasies. All proposed methods are evaluated on two new challenging datasets of food preparation activities that have been made publicly available. Both datasets feature a novel combination of video and accelerometers attached to objects. The Accelerometer Localization dataset is the first publicly available dataset that enables quantitative evaluation of accelerometer tracking algorithms. The 50 Salads dataset contains 50 sequences of people preparing mixed salads with detailed activity annotations.

8. Husseini Orabi, Ahmed. "Multi-Modal Technology for User Interface Analysis including Mental State Detection and Eye Tracking Analysis." Thesis, Université d'Ottawa / University of Ottawa, 2017. http://hdl.handle.net/10393/36451.

Abstract: We present a set of easy-to-use methods and tools to analyze human attention, behaviour, and physiological responses. A potential application of our work is evaluating user interfaces being used in a natural manner. Our approach is designed to be scalable and to work remotely on regular personal computers using inexpensive and noninvasive equipment. The data sources our tool processes are nonintrusive and captured from video, i.e., eye tracking and facial expressions. For video data retrieval, we use a basic webcam. We investigate combinations of observation modalities to detect and extract affective and mental states. Our tool provides a pipeline-based approach that 1) collects observational data, 2) incorporates and synchronizes the signal modalities mentioned above, 3) detects users' affective and mental states, 4) records user interaction with applications and pinpoints the parts of the screen users are looking at, and 5) analyzes and visualizes the results. We describe the design, implementation, and validation of a novel multimodal signal fusion engine, the Deep Temporal Credence Network (DTCN). The engine uses deep neural networks to provide 1) a generative and probabilistic inference model, and 2) handling of multimodal data such that performance does not degrade due to the absence of some modalities. We report the recognition accuracy of basic emotions for each modality, and then evaluate our engine in terms of its effectiveness at recognizing the six basic emotions and six mental states: agreeing, concentrating, disagreeing, interested, thinking, and unsure. Our principal contributions include 1) the implementation of a multimodal signal fusion engine, 2) real-time recognition of affective and primary mental states from nonintrusive and inexpensive modalities, and 3) novel mental-state-based visualization techniques: 3D heatmaps, 3D scanpaths, and widget heatmaps that find parts of the user interface where users are perhaps unsure, annoyed, frustrated, or satisfied.

9. Siddiqui, Mohammad Faridul Haque. "A Multi-modal Emotion Recognition Framework Through The Fusion Of Speech With Visible And Infrared Images." University of Toledo / OhioLINK, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=toledo1556459232937498.

10. Cosa Liñán, Alejandro. "Analytical fusion of multimodal magnetic resonance imaging to identify pathological states in genetically selected Marchigian Sardinian alcohol-preferring (msP) rats." Doctoral thesis, Universitat Politècnica de València, 2017. http://hdl.handle.net/10251/90523.

Abstract: Alcohol abuse is one of the most alarming issues for health authorities. It is estimated that at least 23 million European citizens are affected by alcoholism, at a cost of around 270 million euros. Excessive alcohol consumption is related to physical harm and, although it damages most body organs, the liver, pancreas, and brain are the most severely affected. Not only is physical harm associated with alcohol-related disorders; other psychiatric disorders such as depression are often comorbid. Alcohol is also present in many violent behaviors and traffic injuries. Altogether, this reflects the high complexity of alcohol-related disorders and suggests the involvement of multiple brain systems. With the emergence of non-invasive diagnostic techniques such as neuroimaging or EEG, many neurobiological factors have been shown to be fundamental in the acquisition and maintenance of addictive behaviors, the risk of relapse, and the validity of available treatment alternatives. Alterations in brain structure and function reflected in non-invasive imaging studies have been repeatedly investigated. However, the extent to which imaging measures can precisely characterize and differentiate pathological stages of the disease, often accompanied by other pathologies, is not clear. The use of animal models has elucidated the role of neurobiological mechanisms paralleling alcohol misuse. Thus, combining animal research with non-invasive neuroimaging studies is a key tool in advancing the understanding of the disorder. As the volume of data of very diverse nature available in clinical and research settings increases, an integration of data sets and methodologies is required to explore multidimensional aspects of psychiatric disorders. Complementing conventional mass-univariate statistics, interest in the predictive power of statistical machine learning applied to neuroimaging data is currently growing in the scientific community.

This doctoral thesis has covered most of the aspects mentioned above. Starting from a well-established animal model in alcohol research, Marchigian Sardinian rats, we performed multimodal neuroimaging studies at several stages of an alcohol experimental design, including the etiological mechanisms modulating high alcohol consumption (in comparison to Wistar control rats), alcohol consumption, and treatment with the opioid antagonist Naltrexone, a well-established drug in clinics but with heterogeneous response. Multimodal magnetic resonance imaging acquisition included Diffusion Tensor Imaging, structural imaging, and the calculation of magnetically derived relaxometry maps. We designed an analytical framework based on algorithms widely used in the neuroimaging field, Random Forest and Support Vector Machine, combined in a wrapping fashion. The designed approach was applied to the same dataset with two different aims: exploring the validity of the approach to discriminate experimental stages at the subject level, and establishing predictive models at the voxel level to identify key anatomical regions modified during the course of the experiment. As expected, the combination of multiple magnetic resonance imaging modalities resulted in enhanced predictive power (between 3 and 16%), with heterogeneous contributions from each modality. Surprisingly, we identified some inborn alterations correlating with high alcohol preference, as well as thalamic neuroadaptations related to Naltrexone efficacy. Reproducible contributions of DTI- and relaxometry-related biomarkers were also repeatedly identified, guiding further studies in alcohol research. In summary, this research demonstrates the feasibility of incorporating multimodal neuroimaging, machine learning algorithms, and animal research in advancing the understanding of alcohol-related disorders.

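The "Random Forest and Support Vector Machine combined in a wrapping fashion" can be pictured with a small sketch: a forest ranks features within each MRI modality, and an SVM classifies the fused top-ranked subset. This is an editorial simplification, not the thesis pipeline; the feature counts, estimator settings, and cross-validation setup are assumptions.

```python
# Editorial sketch: per-modality Random Forest feature ranking, then an SVM on
# the fused top-ranked subset. Modality arrays are hypothetical inputs.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def fuse_and_classify(modalities, y, top_k=50):
    # modalities: list of (n_subjects, n_features) arrays, e.g., DTI, T1, relaxometry
    selected = []
    for X in modalities:
        rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
        keep = np.argsort(rf.feature_importances_)[-top_k:]  # top-ranked features
        selected.append(X[:, keep])
    X_fused = np.hstack(selected)                            # multimodal fusion
    return cross_val_score(SVC(kernel="rbf"), X_fused, y, cv=5).mean()
```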

Book chapters on the topic "Multi-modal Machine Learning"

1. Zhou, Ziqi, Xinna Guo, Wanqi Yang, Yinghuan Shi, Luping Zhou, Lei Wang, and Ming Yang. "Cross-Modal Attention-Guided Convolutional Network for Multi-modal Cardiac Segmentation." In Machine Learning in Medical Imaging, 601–10. Cham: Springer International Publishing, 2019. http://dx.doi.org/10.1007/978-3-030-32692-0_69.

2. Xu, Hua. "Early Unimodal Sentiment Analysis of Comment Text Based on Traditional Machine Learning." In Multi-Modal Sentiment Analysis, 53–134. Singapore: Springer Nature Singapore, 2023. http://dx.doi.org/10.1007/978-981-99-5776-7_3.

3. Yu, Zheng, Yanyuan Qiao, Yutong Xie, and Qi Wu. "Multi-modal Adapter for Medical Vision-and-Language Learning." In Machine Learning in Medical Imaging, 393–402. Cham: Springer Nature Switzerland, 2023. http://dx.doi.org/10.1007/978-3-031-45673-2_39.

4. Symeonidis, Panagiotis, and Christos Perentis. "Link Prediction in Multi-modal Social Networks." In Machine Learning and Knowledge Discovery in Databases, 147–62. Berlin, Heidelberg: Springer Berlin Heidelberg, 2014. http://dx.doi.org/10.1007/978-3-662-44845-8_10.

5. Tong, Tong, Katherine Gray, Qinquan Gao, Liang Chen, and Daniel Rueckert. "Nonlinear Graph Fusion for Multi-modal Classification of Alzheimer's Disease." In Machine Learning in Medical Imaging, 77–84. Cham: Springer International Publishing, 2015. http://dx.doi.org/10.1007/978-3-319-24888-2_10.

6. Ge, Hongkun, Guorong Wu, Li Wang, Yaozong Gao, and Dinggang Shen. "Hierarchical Multi-modal Image Registration by Learning Common Feature Representations." In Machine Learning in Medical Imaging, 203–11. Cham: Springer International Publishing, 2015. http://dx.doi.org/10.1007/978-3-319-24888-2_25.

7. Zhang, Sen, Changzheng Zhang, Lanjun Wang, Cixing Li, Dandan Tu, Rui Luo, Guojun Qi, and Jiebo Luo. "MSAFusionNet: Multiple Subspace Attention Based Deep Multi-modal Fusion Network." In Machine Learning in Medical Imaging, 54–62. Cham: Springer International Publishing, 2019. http://dx.doi.org/10.1007/978-3-030-32692-0_7.

8. Yang, Yang, Ye Zhonglin, Zhao Haixing, Li Gege, and Cao Shujuan. "BLR: A Multi-modal Sentiment Analysis Model." In Artificial Neural Networks and Machine Learning – ICANN 2023, 466–78. Cham: Springer Nature Switzerland, 2023. http://dx.doi.org/10.1007/978-3-031-44204-9_39.

9. Nai, Ruihua. "Intelligent Recognition Model for Machine Translation Based on Machine Learning Algorithm." In Application of Intelligent Systems in Multi-modal Information Analytics, 650–57. Cham: Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-031-05237-8_80.

10. Gadekar, Aumkar, Shreya Oak, Abhishek Revadekar, and Anant V. Nimkar. "MMAP: A Multi-Modal Automated Online Proctor." In Machine Learning and Big Data Analytics (Proceedings of International Conference on Machine Learning and Big Data Analytics (ICMLBDA) 2021), 314–25. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-82469-3_28.

Conference papers on the topic "Multi-modal Machine Learning"

1. Wachinger, Christian, and Nassir Navab. "Manifold Learning for Multi-Modal Image Registration." In British Machine Vision Conference 2010. British Machine Vision Association, 2010. http://dx.doi.org/10.5244/c.24.82.

2. Wang, Yan, Yawen Zeng, Junjie Liang, Xiaofen Xing, Jin Xu, and Xiangmin Xu. "RetrievalMMT: Retrieval-Constrained Multi-Modal Prompt Learning for Multi-Modal Machine Translation." In ICMR '24: International Conference on Multimedia Retrieval. New York, NY, USA: ACM, 2024. http://dx.doi.org/10.1145/3652583.3658018.

3. M. Nguyen, D., K. Wong, and C. Chung. "Multi-Modal Search with Convex Bounding Neighbourhood." In 2006 International Conference on Machine Learning and Cybernetics. IEEE, 2006. http://dx.doi.org/10.1109/icmlc.2006.258347.

4. Huang, Xin, Jiajun Zhang, and Chengqing Zong. "Entity-level Cross-modal Learning Improves Multi-modal Machine Translation." In Findings of the Association for Computational Linguistics: EMNLP 2021. Stroudsburg, PA, USA: Association for Computational Linguistics, 2021. http://dx.doi.org/10.18653/v1/2021.findings-emnlp.92.

5. Gupta, Danika. "Early detection of Alzheimer's via machine learning with multi-modal data." In Applications of Machine Learning 2022, edited by Michael E. Zelinski, Tarek M. Taha, and Jonathan Howe. SPIE, 2022. http://dx.doi.org/10.1117/12.2641481.

6. Freeman, Cecille, Dana Kulic, and Otman Basir. "Multi-modal Tree-Based SVM Classification." In 2013 12th International Conference on Machine Learning and Applications (ICMLA). IEEE, 2013. http://dx.doi.org/10.1109/icmla.2013.19.

7. Boishakhi, Fariha Tahosin, Ponkoj Chandra Shill, and Md Golam Rabiul Alam. "Multi-modal Hate Speech Detection using Machine Learning." In 2021 IEEE International Conference on Big Data (Big Data). IEEE, 2021. http://dx.doi.org/10.1109/bigdata52589.2021.9671955.

8. Irfan, Asim, Danish Azeem, Sanam Narejo, and Naresh Kumar. "Multi-Modal Hate Speech Recognition Through Machine Learning." In 2024 IEEE 1st Karachi Section Humanitarian Technology Conference (KHI-HTC). IEEE, 2024. http://dx.doi.org/10.1109/khi-htc60760.2024.10482031.

9. Menon Manghat, Neeraj, Vaishak Gopalakrishna, Sai Bonthu, Victor Hunt, Arthur Helmicki, and Doug McClintock. "Machine Learning Based Multi-Modal Transportation Network Planner." In International Conference on Transportation and Development 2024. Reston, VA: American Society of Civil Engineers, 2024. http://dx.doi.org/10.1061/9780784485514.034.

10. Ellis, Seth T., and Andre V. Harrison. "Detecting functional objects using multi-modal data." In Artificial Intelligence and Machine Learning for Multi-Domain Operations Applications V, edited by Latasha Solomon and Peter J. Schwartz. SPIE, 2023. http://dx.doi.org/10.1117/12.2664059.

Reports on the topic "Multi-modal Machine Learning"

1. Ehiabhi, Jolly, and Haifeng Wang. A Systematic Review of Machine Learning Models in Mental Health Analysis Based on Multi-Channel Multi-Modal Biometric Signals. INPLASY - International Platform of Registered Systematic Review and Meta-analysis Protocols, February 2023. http://dx.doi.org/10.37766/inplasy2023.2.0003.

Abstract: Review question / Objective: A systematic review of the diagnosis and prognosis of mental disorders using machine learning techniques with information from biometric signals; a review of the trend and status of these ML techniques in mental health diagnosis; and an investigation of how these signals are used to help increase the efficiency of mental health disease diagnosis. The review covers using machine learning techniques to classify mental health diseases, as opposed to using only expert knowledge for diagnosis, and feature extraction from biometric signals that helps classify sleep disorders. Rationale: To review the application of ML techniques to multimodal and multichannel PSG datasets obtained from the biosensors typically used in hospitals, and to help professionals grasp the steps of using machine learning to classify mental health diseases.