Academic literature on the topic 'Multimodal processing'

Author: Grafiati

Published: 1 June 2024

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Multimodal processing.'

Journal articles on the topic "Multimodal processing":

Ng, Vincent, and Shengjie Li. "Multimodal Propaganda Processing." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 13 (June 26, 2023): 15368–75. http://dx.doi.org/10.1609/aaai.v37i13.26792.

Abstract:

Propaganda campaigns have long been used to influence public opinion via disseminating biased and/or misleading information. Despite the increasing prevalence of propaganda content on the Internet, few attempts have been made by AI researchers to analyze such content. We introduce the task of multimodal propaganda processing, where the goal is to automatically analyze propaganda content. We believe that this task presents a long-term challenge to AI researchers and that successful processing of propaganda could bring machine understanding one important step closer to human understanding. We discuss the technical challenges associated with this task and outline the steps that need to be taken to address it.

Sinke, Christopher, Janina Neufeld, Daniel Wiswede, Hinderk M. Emrich, Stefan Bleich, and Gregor R. Szycik. "Multisensory processing in synesthesia — differences in the EEG signal during uni- and multimodal processing." Seeing and Perceiving 25 (2012): 53. http://dx.doi.org/10.1163/187847612x646749.

Abstract:

Synesthesia is a condition in which stimulation in one processing stream (e.g., letters or music) leads to perception in an unstimulated processing stream (e.g., colors). Behavioral differences in multisensory processing have been shown for multimodal illusions, but the differences in neural processing are still unclear. In the present study, we examined uni- and multimodal processing in 14 people with synesthesia and 13 controls using EEG recordings and a simple detection task. Stimuli were presented acoustically, visually, or multimodally (simultaneous visual and auditory stimulation). In the multimodal condition, auditory and visual stimuli were either matching or mismatching (e.g., a lion either roaring or ringing). The subjects had to press a button as soon as something was presented visually or acoustically. Results: ERPs revealed occipital group differences in the negative amplitude between 100 and 200 ms after stimulus presentation. Relative to controls, synesthetes showed an increased negative component peaking around 150 ms. This group difference is found in all visual conditions. Unimodal acoustical stimulation leads to increased negative amplitude in synesthetes in the same time window over parietal and visual electrodes. Overall, this shows that processing in the occipital lobe is different in synesthetes independent of the stimulated modality. In addition, differences in the negative amplitude between processing of incongruent and congruent multimodal stimuli could be detected in the same time window between synesthetes and controls over left frontal sites. This shows that multimodal integration processes are also different in synesthetes.

D'Ulizia, Arianna, Fernando Ferri, and Patrizia Grifoni. "Generating Multimodal Grammars for Multimodal Dialogue Processing." IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans 40, no. 6 (November 2010): 1130–45. http://dx.doi.org/10.1109/tsmca.2010.2041227.

Barricelli, Barbara Rita, Piero Mussio, Marco Padula, and Paolo Luigi Scala. "TMS for multimodal information processing." Multimedia Tools and Applications 54, no. 1 (April 27, 2010): 97–120. http://dx.doi.org/10.1007/s11042-010-0527-x.

Parsons, Aaron D., Stephen W. T. Price, Nicola Wadeson, Mark Basham, Andrew M. Beale, Alun W. Ashton, J. Frederick W. Mosselmans, and Paul D. Quinn. "Automatic processing of multimodal tomography datasets." Journal of Synchrotron Radiation 24, no. 1 (January 1, 2017): 248–56. http://dx.doi.org/10.1107/s1600577516017756.

Abstract:

With the development of fourth-generation high-brightness synchrotrons on the horizon, the already large volume of data that will be collected on imaging and mapping beamlines is set to increase by orders of magnitude. As such, an easy and accessible way of dealing with such large datasets as quickly as possible is required in order to be able to address the core scientific problems during the experimental data collection. Savu is an accessible and flexible big data processing framework that is able to deal with both the variety and the volume of data of multimodal and multidimensional scientific datasets output such as those from chemical tomography experiments on the I18 microfocus scanning beamline at Diamond Light Source.

Holler, Judith, and Stephen C. Levinson. "Multimodal Language Processing in Human Communication." Trends in Cognitive Sciences 23, no. 8 (August 2019): 639–52. http://dx.doi.org/10.1016/j.tics.2019.05.006.

Farzin, Faraz, Eric P. Charles, and Susan M. Rivera. "Development of Multimodal Processing in Infancy." Infancy 14, no. 5 (September 1, 2009): 563–78. http://dx.doi.org/10.1080/15250000903144207.

Zhang, Ge, Tianxiang Luo, Witold Pedrycz, Mohammed A. El-Meligy, Mohamed Abdel Fattah Sharaf, and Zhiwu Li. "Outlier Processing in Multimodal Emotion Recognition." IEEE Access 8 (2020): 55688–701. http://dx.doi.org/10.1109/access.2020.2981760.

Metaxakis, Athanasios, Dionysia Petratou, and Nektarios Tavernarakis. "Multimodal sensory processing in Caenorhabditis elegans." Open Biology 8, no. 6 (June 2018): 180049. http://dx.doi.org/10.1098/rsob.180049.

Abstract:

Multisensory integration is a mechanism that allows organisms to simultaneously sense and understand external stimuli from different modalities. These distinct signals are transduced into neuronal signals that converge into decision-making neuronal entities. Such decision-making centres receive information through neuromodulators regarding the organism's physiological state and accordingly trigger behavioural responses. Despite the importance of multisensory integration for efficient functioning of the nervous system, and also the implication of dysfunctional multisensory integration in the aetiology of neuropsychiatric disease, little is known about the relative molecular mechanisms. Caenorhabditis elegans is an appropriate model system to study such mechanisms and elucidate the molecular ways through which organisms understand external environments in an accurate and coherent fashion.

Nock, Harriet J., Giridharan Iyengar, and Chalapathy Neti. "Multimodal processing by finding common cause." Communications of the ACM 47, no. 1 (January 1, 2004): 51. http://dx.doi.org/10.1145/962081.962105.

Dissertations / Theses on the topic "Multimodal processing":

Cadène, Rémi. "Deep Multimodal Learning for Vision and Language Processing." Electronic Thesis or Diss., Sorbonne université, 2020. http://www.theses.fr/2020SORUS277.

Abstract:

Digital technologies have become instrumental in transforming our society. Recent statistical methods have been successfully deployed to automate the processing of the growing amount of images, videos, and texts we produce daily. In particular, deep neural networks have been adopted by the computer vision and natural language processing communities for their ability to perform accurate image recognition and text understanding once trained on big sets of data. Advances in both communities built the groundwork for new research problems at the intersection of vision and language. Integrating language into visual recognition could have an important impact on human life through the creation of real-world applications such as next-generation search engines or AI assistants. In the first part of this thesis, we focus on systems for cross-modal text-image retrieval. We propose a learning strategy to efficiently align both modalities while structuring the retrieval space with semantic information. In the second part, we focus on systems able to answer questions about an image. We propose a multimodal architecture that iteratively fuses the visual and textual modalities using a factorized bilinear model while modeling pairwise relationships between each region of the image. In the last part, we address the issues related to biases in the modeling. We propose a learning strategy to reduce the language biases which are commonly present in visual question answering systems.

Hu, Yongtao, and 胡永涛. "Multimodal speaker localization and identification for video processing." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2014. http://hdl.handle.net/10722/212633.

Chen, Xun. "Multimodal biomedical signal processing for corticomuscular coupling analysis." Thesis, University of British Columbia, 2014. http://hdl.handle.net/2429/45811.

Abstract:

Corticomuscular coupling analysis using multiple data sets such as electroencephalogram (EEG) and electromyogram (EMG) signals provides a useful tool for understanding human motor control systems. A popular conventional method to assess corticomuscular coupling is the pair-wise magnitude-squared coherence (MSC). However, there are certain limitations associated with MSC, including the difficulty in robustly assessing group inference, only dealing with two types of data sets simultaneously and the biologically implausible assumption of pair-wise interactions. In this thesis, we propose several novel signal processing techniques to overcome the disadvantages of current coupling analysis methods. We propose combining partial least squares (PLS) and canonical correlation analysis (CCA) to take advantage of both techniques, ensuring that the extracted components are maximally correlated across two data sets while also explaining the information within each data set well. Furthermore, we propose jointly incorporating response-relevance and statistical independence into a multi-objective optimization function, meaningfully combining the goals of independent component analysis (ICA) and PLS under the same mathematical umbrella. In addition, we extend the coupling analysis to multiple data sets by proposing a joint multimodal group analysis framework. Finally, to acquire independent components rather than merely uncorrelated ones, we improve the multimodal framework by exploiting the complementary property of multiset canonical correlation analysis (M-CCA) and joint ICA. Simulations show that our proposed methods can achieve superior performance compared with conventional approaches. We also apply the proposed methods to concurrent EEG, EMG and behavior data collected in a Parkinson's disease (PD) study. The results reveal highly correlated temporal patterns among the multimodal signals and corresponding spatial activation patterns. In addition to the expected motor areas, the corresponding spatial activation patterns demonstrate enhanced occipital connectivity in PD subjects, consistent with previous medical findings.

Sadr, Lahijany Nadi. "Multimodal Signal Processing for Diagnosis of Cardiorespiratory Disorders." Thesis, The University of Sydney, 2017. http://hdl.handle.net/2123/17636.

Abstract:

This thesis addresses the use of multimodal signal processing to develop algorithms for the automated processing of two cardiorespiratory disorders. The aim of the first application of this thesis was to reduce the false alarm rate in an intensive care unit. The goal was to detect five critical arrhythmias by processing multimodal signals including photoplethysmography, arterial blood pressure, Lead II and augmented right arm electrocardiogram (ECG). A hierarchical approach was used to process the signals, together with a custom signal processing technique for each arrhythmia type. Sleep disorders are a prevalent health issue, currently costly and inconvenient to diagnose, as they normally require an overnight hospital stay by the patient. In the second application of this project, we designed automated signal processing algorithms for the diagnosis of sleep apnoea with a main focus on ECG signal processing. We estimated the ECG-derived respiratory (EDR) signal using different methods: QRS-complex area, principal component analysis (PCA) and kernel PCA. We proposed two algorithms (segmented PCA and approximated PCA) for EDR estimation to enable applying the PCA method to overnight recordings and to address the computational and memory requirements. We compared the EDR information against the chest respiratory effort signals. The performance was evaluated using three machine learning classifiers, linear discriminant analysis (LDA), extreme learning machine (ELM) and support vector machine (SVM), on two databases: the MIT PhysioNet database and the St. Vincent’s database. The results showed that the QRS area method for EDR estimation combined with the LDA classifier was the highest performing method and that the EDR signals contain respiratory information useful for discriminating sleep apnoea. As a final step, heart rate variability (HRV) and cardiopulmonary coupling (CPC) features were extracted and combined with the EDR features, and temporal optimisation techniques were applied. The cross-validation results of the minute-by-minute apnoea classification achieved an accuracy of 89%, a sensitivity of 90%, a specificity of 88%, and an AUC of 0.95, which is comparable to the best results reported in the literature.
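For readers unfamiliar with the classification measures quoted in this abstract, the following is a minimal sketch of how accuracy, sensitivity and specificity are derived from a binary confusion matrix. The function name and the counts are hypothetical and chosen only for illustration; they are not data from the thesis.

```python
def binary_metrics(tp, fp, tn, fn):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)  # share of all segments classified correctly
    sensitivity = tp / (tp + fn)                # share of true apnoea segments that were detected
    specificity = tn / (tn + fp)                # share of normal segments correctly left unflagged
    return accuracy, sensitivity, specificity


# Hypothetical counts for 200 one-minute segments (100 apnoea, 100 normal).
accuracy, sensitivity, specificity = binary_metrics(tp=90, fp=12, tn=88, fn=10)
print(f"accuracy={accuracy:.2f} sensitivity={sensitivity:.2f} specificity={specificity:.2f}")
```

AUC is not covered by this sketch, since it requires the classifier's continuous scores rather than a single confusion matrix.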

Elshaw, Mark. "Multimodal neural grounding of language processing for robot actions." Thesis, University of Sunderland, 2005. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.420517.

Friedel, Paul. "Sensory information processing : detection, feature extraction, & multimodal integration." kostenfrei, 2008. http://mediatum2.ub.tum.de/doc/651333/651333.pdf.

Sadeghi, Ghandehari Soroush. "Multimodal signal processing in the peripheral and central vestibular pathways." Thesis, McGill University, 2009. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=95559.

Abstract:

The vestibular sensory apparatus and associated vestibular nuclei encode head movements during our daily activities. In addition to direct inputs from vestibular afferents, the vestibular nuclei receive substantial projections from cortical, cerebellar, spinal cord, and other brainstem structures. The present studies were aimed at investigating the coding strategies and the signals carried by the peripheral and central vestibular neurons under normal conditions and following vestibular compensation. In normal animals, we first studied the coding strategies of regular and irregular afferents using information theoretic measures over the range of functionally relevant frequencies and found differences between the two types of afferents as a result of different variability in the resting discharge (i.e., noise) and sensitivity (i.e., signal). We found that regular afferents carry information mostly in the spike times of their discharge, whereas irregular afferents carry information mostly in their firing rate and at the higher frequency range of stimuli, thus acting as event detectors. We next studied the signals carried by the vestibular-nerve afferents either as a result of direct vestibular stimulation or through efferent fibers, in normal conditions and following vestibular lesion. We first showed that the efferent vestibular system is functional in the alert monkey. In order to address the functional role of the efferent system, we then characterized the responses of vestibular afferents evoked by a wide range of stimuli. We found that vestibular afferents did not encode extravestibular signals and that their response properties do not change significantly following lesion. Thus the question of the functional role of the vestibular efferent system remains open. In addition, our findings demonstrate that the vestibular periphery (afferents and efferents) does not show the plasticity required to support vestibular compensation. Finally, we studied the central vestibular

Fateri, Sina. "Advanced signal processing techniques for multimodal ultrasonic guided wave response." Thesis, Brunel University, 2015. http://bura.brunel.ac.uk/handle/2438/11657.

Abstract:

Ultrasonic technology is commonly used in the field of Non-Destructive Testing (NDT) of metal structures such as steel, aluminium, etc. Compared to ultrasonic bulk waves that travel in infinite media with no boundary influence, Ultrasonic Guided Waves (UGWs) require a structural boundary for propagation such that they can be used to inspect and monitor long elements of a structure from a single position. The greatest challenges for any UGW system are the plethora of wave modes arising from the geometry of the structural element, which propagate with a range of frequency-dependent velocities, and the interpretation of these combined signals reflected by discontinuities in the structural element. In this thesis, a technique is developed which facilitates the measurement of Time of Arrival (ToA) and group velocity dispersion curves of wave modes for structures that are one-dimensional as far as wave propagation is concerned. A second technique is also presented which employs the dispersion curves to deliver enhanced range measurements in complex multimodal UGW responses. Ultimately, the aforementioned techniques are used as a part of the analysis of previously unreported signals arising from interactions of UGWs with piezoelectric transducers. The first signal processing technique uses a combination of frequency-sweep measurement, sampling rate conversion and the Fourier transform. The technique is applied to synthesized and experimental data in order to identify different wave modes in complex UGW signals. It is demonstrated that the technique has the capability to derive the ToA and group velocity dispersion curve of the wave modes of interest. The second signal processing technique uses broadband excitation, dispersion compensation and cross-correlation. The technique is applied to synthesized and experimental data in order to identify different wave modes in complex UGW signals. It is demonstrated that the technique noticeably improves the Signal to Noise Ratio (SNR) of the UGW response using a priori knowledge of the dispersion curve. It is also able to derive accurate quantitative information about the ToA and the propagation distance. During the development of the aforementioned signal processing techniques, some unwanted wave-packets are identified in the UGW responses which are found to be induced by the coupling of a shear mode piezoelectric transducer at the free edge of the waveguide. Accordingly, the effect of the force on the piezoelectric transducers and the corresponding reflections and mode conversions are studied experimentally. The aforementioned signal processing techniques are also employed as a part of the study. A Finite Element Analysis (FEA) procedure is also presented which can potentially improve the theoretical predictions and converge to results found in experimental routines. The approach enhances the confidence in the FEA models compared to traditional approaches. The outcome of the research conducted in this thesis paves the way to enhance the reliability of UGW inspections by utilizing the signal processing techniques and studying the multimodal responses.

Caglayan, Ozan. "Multimodal Machine Translation." Thesis, Le Mans, 2019. http://www.theses.fr/2019LEMA1016/document.

Abstract:

Machine translation aims at automatically translating documents from one language to another without human intervention. With the advent of deep neural networks (DNN), neural approaches to machine translation started to dominate the field, reaching state-of-the-art performance in many languages. Neural machine translation (NMT) also revived the interest in interlingual machine translation due to how it naturally fits the task into an encoder-decoder framework which produces a translation by decoding a latent source representation. Combined with the architectural flexibility of DNNs, this framework paved the way for further research in multimodality with the objective of augmenting the latent representations with other modalities such as vision or speech, for example. This thesis focuses on a multimodal machine translation (MMT) framework that integrates a secondary visual modality to achieve better and visually grounded language understanding. I specifically worked with a dataset containing images and their translated descriptions, where visual context can be useful for word sense disambiguation, missing word imputation, or gender marking when translating from a language with gender-neutral nouns to one with a grammatical gender system, as is the case with English to French. I propose two main approaches to integrate the visual modality: (i) a multimodal attention mechanism that learns to take into account both sentence and convolutional visual representations, (ii) a method that uses global visual feature vectors to prime the sentence encoders and the decoders. Through automatic and human evaluation conducted on multiple language pairs, the proposed approaches were demonstrated to be beneficial. Finally, I further show that by systematically removing certain linguistic information from the input sentences, the true strength of both methods emerges as they successfully impute missing nouns and colors, and can even translate when parts of the source sentences are completely removed.

Fridman, Linnea, and Victoria Nordberg. "Two Multimodal Image Registration Approaches for Positioning Purposes." Thesis, Linköpings universitet, Datorseende, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-157424.

Abstract:

This report is the result of a master's thesis by two students at Linköping University. The aim was to find an image registration method for visual and infrared images and to find an error measure for grading the registration performance. In practice this could be used for position determination by registering the infrared image taken at the current position to a set of visual images with known positions and determining which visual image matches the best. Two methods were tried, using different image feature extractors and different ways to match the features. The first method used phase information in the images to generate soft features and then minimised the square error of the optical flow equation to estimate the transformation between the visual and infrared image. The second method used the Canny edge detector to extract hard features from the images and Chamfer distance as an error measure. Both methods were evaluated for registration as well as position determination and yielded promising results. However, the performance of both methods was image dependent. The soft edge method proved to be more robust and precise and worked better than the hard edge method for both registration and position determination.
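The Chamfer distance named in this abstract as an error measure can be illustrated with a minimal sketch. It assumes edge pixels have already been extracted (for example by an edge detector) and are given as lists of (row, column) coordinates; the function names, the symmetric averaging and the sample points are illustrative assumptions, not the formulation used in the thesis.

```python
import math


def directed_chamfer(src, dst):
    """Mean distance from each point in src to its nearest neighbour in dst."""
    return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)


def chamfer_distance(a, b):
    """Symmetric Chamfer distance: average of the two directed distances."""
    return 0.5 * (directed_chamfer(a, b) + directed_chamfer(b, a))


# Hypothetical edge pixels: the same horizontal edge, shifted down by one row.
edges_visual = [(0, 0), (0, 1), (0, 2)]
edges_infrared = [(1, 0), (1, 1), (1, 2)]
score = chamfer_distance(edges_visual, edges_infrared)
print(score)
```

A lower score indicates better alignment of the two edge sets. This brute-force version compares every pair of points; larger images would usually call for a precomputed distance transform instead.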

Books on the topic "Multimodal processing":

Renals, Steve, Herve Bourlard, Jean Carletta, and Andrei Popescu-Belis, eds. Multimodal Signal Processing. Cambridge: Cambridge University Press, 2009. http://dx.doi.org/10.1017/cbo9781139136310.

Maragos, Petros, Alexandros Potamianos, and Patrick Gros, eds. Multimodal Processing and Interaction. Boston, MA: Springer US, 2008. http://dx.doi.org/10.1007/978-0-387-76316-3.

Renals, Steve. Multimodal signal processing: Human interactions in meetings. Cambridge: Cambridge University Press, 2012.

Thiran, Jean-Philippe, and Hervé Bourlard. Multimodal signal processing: Theory and applications for human-computer interaction. Amsterdam: Academic, 2010.

Gavrilova, Marina L. Multimodal biometrics and intelligent image processing for security systems. Hershey, PA: Information Science Reference, 2013.

Grifoni, Patrizia, ed. Multimodal human computer interaction and pervasive services. Hershey, PA: Information Science Reference, 2009.

Sappa, Angel D. Multimodal Interaction in Image and Video Applications. Berlin, Heidelberg: Springer Berlin Heidelberg, 2013.

Engelkamp, Johannes. Human memory: A multimodal approach. Seattle: Hogrefe & Huber Publishers, 1994.

Adams, Teresa M. Guidelines for the implementation of multimodal transportation location referencing systems. Washington, D.C.: National Academy Press, 2001.

Kühnel, Christine. Quantifying Quality Aspects of Multimodal Interactive Systems. Berlin, Heidelberg: Springer Berlin Heidelberg, 2012.

Book chapters on the topic "Multimodal processing":

Huang, Lihe. "Collecting and processing multimodal data." In Toward Multimodal Pragmatics, 99–108. London: Routledge, 2021. http://dx.doi.org/10.4324/9781003251774-5.

Gullberg, Marianne. "Studying Multimodal Language Processing." In The Routledge Handbook of Second Language Acquisition and Psycholinguistics, 137–49. New York: Routledge, 2022. http://dx.doi.org/10.4324/9781003018872-14.

Waibel, Alex, Minh Tue Vo, Paul Duchnowski, and Stefan Manke. "Multimodal Interfaces." In Integration of Natural Language and Vision Processing, 299–319. Dordrecht: Springer Netherlands, 1996. http://dx.doi.org/10.1007/978-94-009-1716-3_9.

Rivet, Bertrand, and Jonathon Chambers. "Multimodal Speech Separation." In Advances in Nonlinear Speech Processing, 1–11. Berlin, Heidelberg: Springer Berlin Heidelberg, 2010. http://dx.doi.org/10.1007/978-3-642-11509-7_1.

Néel, Françoise D., and Wolfgang M. Minker. "Multimodal Speech Systems." In Computational Models of Speech Pattern Processing, 404–30. Berlin, Heidelberg: Springer Berlin Heidelberg, 1999. http://dx.doi.org/10.1007/978-3-642-60087-6_34.

Maragos, Petros, Patrick Gros, Athanassios Katsamanis, and George Papandreou. "Cross-Modal Integration for Performance Improving in Multimedia: A Review." In Multimodal Processing and Interaction, 1–46. Boston, MA: Springer US, 2008. http://dx.doi.org/10.1007/978-0-387-76316-3_1.

Ferecatu, Marin, Nozha Boujemaa, and Michel Crucianu. "Interactive Image Retrieval Using a Hybrid Visual and Conceptual Content Representation." In Multimodal Processing and Interaction, 1–20. Boston, MA: Springer US, 2008. http://dx.doi.org/10.1007/978-0-387-76316-3_10.

Neumayer, Robert, and Andreas Rauber. "Multimodal Analysis of Text and Audio Features for Music Information Retrieval." In Multimodal Processing and Interaction, 1–17. Boston, MA: Springer US, 2008. http://dx.doi.org/10.1007/978-0-387-76316-3_11.

Petrakis, Euripides G. M. "Intelligent Search for Image Information on the Web through Text and Link Structure Analysis." In Multimodal Processing and Interaction, 1–17. Boston, MA: Springer US, 2008. http://dx.doi.org/10.1007/978-0-387-76316-3_12.

Potamianos, Alexandros, and Manolis Perakakis. "Design Principles for Multimodal Spoken Dialogue Systems." In Multimodal Processing and Interaction, 1–18. Boston, MA: Springer US, 2008. http://dx.doi.org/10.1007/978-0-387-76316-3_13.

Conference papers on the topic "Multimodal processing":

Potamianos, Alexandros. "Cognitive Multimodal Processing." In the 2014 Workshop. New York, New York, USA: ACM Press, 2014. http://dx.doi.org/10.1145/2666253.2666264.

Johnston, Michael. "Multimodal language processing." In 5th International Conference on Spoken Language Processing (ICSLP 1998). ISCA, 1998. http://dx.doi.org/10.21437/icslp.1998-278.

Yang, Lixin, Genshe Chen, Ronghua Xu, Sherry Chen, and Yu Chen. "Decentralized autonomous imaging data processing using blockchain." In Multimodal Biomedical Imaging XIV, edited by Fred S. Azar, Xavier Intes, and Qianqian Fang. SPIE, 2019. http://dx.doi.org/10.1117/12.2513243.

Barricelli, Barbara Rita, Marco Padula, and Paolo Luigi Scala. "TMS for Multimodal Information Processing." In 2009 20th International Workshop on Database and Expert Systems Application. IEEE, 2009. http://dx.doi.org/10.1109/dexa.2009.34.

Zhu, Junnan, Haoran Li, Tianshang Liu, Yu Zhou, Jiajun Zhang, and Chengqing Zong. "MSMO: Multimodal Summarization with Multimodal Output." In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, PA, USA: Association for Computational Linguistics, 2018. http://dx.doi.org/10.18653/v1/d18-1448.

Damian, Ionut, Michael Dietz, Frank Gaibler, and Elisabeth André. "Social signal processing for dummies." In ICMI '16: International Conference on Multimodal Interaction. New York, NY, USA: ACM, 2016. http://dx.doi.org/10.1145/2993148.2998527.

Kokkinidis, K., A. Stergiaki, and A. Tsagaris. "Machine learning via multimodal signal processing." In 2017 6th International Conference on Modern Circuits and Systems Technologies (MOCAST). IEEE, 2017. http://dx.doi.org/10.1109/mocast.2017.7937653.

Oviatt, Sharon. "Multimodal system processing in mobile environments." In the 13th annual ACM symposium. New York, New York, USA: ACM Press, 2000. http://dx.doi.org/10.1145/354401.354408.

Kharinov, Mikhail V., and Aleksandr N. Bykov. "Data Structure for Multimodal Signal Processing." In 2019 International Russian Automation Conference. IEEE, 2019. http://dx.doi.org/10.1109/rusautocon.2019.8867769.

Bangalore, Srinivas, and Michael Johnston. "Robust gesture processing for multimodal interaction." In the 10th international conference. New York, New York, USA: ACM Press, 2008. http://dx.doi.org/10.1145/1452392.1452439.

Reports on the topic "Multimodal processing":

Gazzaniga, Michael S. Multimodal Interactions in Sensory-Motor Processing. Fort Belvoir, VA: Defense Technical Information Center, June 1992. http://dx.doi.org/10.21236/ada255780.

Hughes, H. C., P. A. Reuter-Lorenz, R. Fendrich, G. Nozawa, and M. S. Gazzaniga. Multimodal Interactions in Sensory-Motor Processing. Fort Belvoir, VA: Defense Technical Information Center, September 1990. http://dx.doi.org/10.21236/ada229111.

Lewandowski, Lawrence J., Susan B. Hursh, and David A. Kobus. Multimodal versus Unimodal Information Processing of Words. Fort Belvoir, VA: Defense Technical Information Center, July 1985. http://dx.doi.org/10.21236/ada160517.

Varshney, Pramod K. Multimodal Signal Processing for Personnel Detection and Activity Classification for Indoor Surveillance. Fort Belvoir, VA: Defense Technical Information Center, November 2013. http://dx.doi.org/10.21236/ada606602.

Hamlin, Alexandra, Erik Kobylarz, James Lever, Susan Taylor, and Laura Ray. Assessing the feasibility of detecting epileptic seizures using non-cerebral sensor. Engineer Research and Development Center (U.S.), December 2021. http://dx.doi.org/10.21079/11681/42562.

Abstract:

This paper investigates the feasibility of using non-cerebral, time-series data to detect epileptic seizures. Data were recorded from fifteen patients (7 male, 5 female, 3 not noted, mean age 36.17 yrs), five of whom had a total of seven seizures. Patients were monitored in an inpatient setting using standard video electroencephalography (vEEG), while also wearing sensors monitoring electrocardiography, electrodermal activity, electromyography, accelerometry, and audio signals (vocalizations). A systematic and detailed study was conducted to identify the sensors and the features derived from the non-cerebral sensors that contribute most significantly to separability of data acquired during seizures from non-seizure data. Post-processing of the data using linear discriminant analysis (LDA) shows that seizure data are strongly separable from non-seizure data based on features derived from the signals recorded. The mean area under the receiver operating characteristic (ROC) curve for each individual patient that experienced a seizure during data collection, calculated using LDA, was 0.9682. The features that contribute most significantly to seizure detection differ for each patient. The results show that a multimodal approach to seizure detection using the specified sensor suite is promising in detecting seizures with both sensitivity and specificity. Moreover, the study provides a means to quantify the contribution of each sensor and feature to separability. Development of a non-electroencephalography (EEG) based seizure detection device would give doctors a more accurate seizure count outside of the clinical setting, improving treatment and the quality of life of epilepsy patients.
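
The per-patient ROC metric mentioned in the abstract above can be illustrated with a short sketch. It computes the area under the ROC curve from classifier scores using the pairwise-ranking identity, then averages over patients. This is not the report's code or data: the scores are invented stand-ins for the output of a classifier such as an LDA projection, and `roc_auc` and `mean_auc` are hypothetical names.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise ranking (ties count one half).

    labels use 1 for seizure samples and 0 for non-seizure samples.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one seizure and one non-seizure sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def mean_auc(per_patient):
    """Average the AUC over (scores, labels) pairs, one pair per patient."""
    aucs = [roc_auc(scores, labels) for scores, labels in per_patient]
    return sum(aucs) / len(aucs)


# Hypothetical classifier scores for two patients (not data from the report).
patient_a = ([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0])
patient_b = ([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0])

if __name__ == "__main__":
    print(roc_auc(*patient_a), roc_auc(*patient_b))
    print(mean_auc([patient_a, patient_b]))
```

Averaging per-patient values, rather than pooling all samples, keeps a patient with many recordings from dominating the summary figure.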

Lee, W. S., Victor Alchanatis, and Asher Levi. Innovative yield mapping system using hyperspectral and thermal imaging for precision tree crop management. United States Department of Agriculture, January 2014. http://dx.doi.org/10.32747/2014.7598158.bard.

Abstract:

Original objectives and revisions: The original overall objective was to develop, test and validate a prototype yield mapping system for unit area to increase yield and profit for tree crops. Specific objectives were: (1) to develop a yield mapping system for a static situation, using hyperspectral and thermal imaging independently, (2) to integrate hyperspectral and thermal imaging for improved yield estimation by combining thermal images with hyperspectral images to improve fruit detection, and (3) to expand the system to a mobile platform for a stop-measure-and-go situation. There were no major revisions to the overall objective; however, several revisions were made to the specific objectives. The revised specific objectives were: (1) to develop a yield mapping system for a static situation, using color and thermal imaging independently, (2) to integrate color and thermal imaging for improved yield estimation by combining thermal images with color images to improve fruit detection, and (3) to expand the system to an autonomous mobile platform for a continuous-measure situation.

Background, major conclusions, solutions and achievements: Yield mapping is considered an initial step in applying precision agriculture technologies. Although many yield mapping systems have been developed for agronomic crops, mapping the yield of tree crops remains difficult. In this project, an autonomous immature fruit yield mapping system was developed. The system could detect and count fruit at early growth stages of citrus fruit so that farmers could apply site-specific management based on the maps. There were two sub-systems, a navigation system and an imaging system. Robot Operating System (ROS) was the backbone for developing the navigation system using an unmanned ground vehicle (UGV). An inertial measurement unit (IMU), wheel encoders and a GPS were integrated using an extended Kalman filter to provide reliable and accurate localization information. A LiDAR was added to support simultaneous localization and mapping (SLAM) algorithms. The color camera on a Microsoft Kinect was used to detect citrus trees, and a new machine vision algorithm was developed to enable autonomous navigation in the citrus grove. A multimodal imaging system, which consisted of two color cameras and a thermal camera, was carried by the vehicle for video acquisition. A novel image registration method was developed for combining color and thermal images and matching fruit in both images, which achieved pixel-level accuracy. A new Color-Thermal Combined Probability (CTCP) algorithm was created to effectively fuse information from the color and thermal images to classify potential image regions into fruit and non-fruit classes. Algorithms were also developed to integrate image registration, information fusion and fruit classification and detection into a single step for real-time processing. The imaging system achieved a precision rate of 95.5% and a recall rate of 90.4% on immature green citrus fruit detection, which was a great improvement compared to previous studies.

Implications: The development of the immature green fruit yield mapping system will help farmers make early decisions for planning operations and marketing so high yield and profit can be achieved.
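
For readers unfamiliar with the detection metrics quoted in the abstract above, the following is a minimal sketch of how precision and recall are computed from detection counts, with the F1 score as their harmonic mean. The counts are hypothetical and are not taken from the report; the function names `precision_recall` and `f1_score` are likewise assumptions for illustration.

```python
def precision_recall(true_pos, false_pos, false_neg):
    """Return (precision, recall) for a detector from its raw counts.

    precision = TP / (TP + FP): share of reported fruit that are real fruit.
    recall    = TP / (TP + FN): share of real fruit that were reported.
    """
    if true_pos + false_pos == 0 or true_pos + false_neg == 0:
        raise ValueError("counts must include at least one detection and one true object")
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return precision, recall


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical counts: 9 correct detections, 1 false alarm, 3 missed fruit.
    p, r = precision_recall(true_pos=9, false_pos=1, false_neg=3)
    print(p, r, f1_score(p, r))
    # The same helper applies to the rates reported in the abstract.
    print(f1_score(0.955, 0.904))
```

Reporting both numbers matters because a detector can trade one for the other: fewer false alarms usually means more missed fruit, and vice versa.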

Federal Information Processing Standards Publication: detail specification for 62.5-µm core diameter/125-µm cladding diameter class IA multimode, graded-index optical waveguide fibers. Gaithersburg, MD: National Institute of Standards and Technology, 1989. http://dx.doi.org/10.6028/nist.fips.159.
