Academic literature on the topic 'Multimodal object tracking'

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Multimodal object tracking.'

Journal articles on the topic 'Multimodal object tracking'

1

Li, Kai, Lihua Cai, Guangjian He, and Xun Gong. "MATI: Multimodal Adaptive Tracking Integrator for Robust Visual Object Tracking." Sensors 24, no. 15 (2024): 4911. http://dx.doi.org/10.3390/s24154911.

Abstract:
Visual object tracking, pivotal for applications like earth observation and environmental monitoring, encounters challenges under adverse conditions such as low light and complex backgrounds. Traditional tracking technologies often falter, especially when tracking dynamic objects like aircraft amidst rapid movements and environmental disturbances. This study introduces an innovative adaptive multimodal image object-tracking model that harnesses the capabilities of multispectral image sensors, combining infrared and visible light imagery to significantly enhance tracking accuracy and robustness…
2

Zhang, Kunpeng, Yanheng Liu, Fang Mei, Jingyi Jin, and Yiming Wang. "Boost Correlation Features with 3D-MiIoU-Based Camera-LiDAR Fusion for MODT in Autonomous Driving." Remote Sensing 15, no. 4 (2023): 874. http://dx.doi.org/10.3390/rs15040874.

Abstract:
Three-dimensional (3D) object tracking is critical in 3D computer vision. It has applications in autonomous driving, robotics, and human–computer interaction. However, methods for using multimodal information among objects to increase multi-object detection and tracking (MOT) accuracy remain a critical focus of research. Therefore, we present a multimodal MOT framework for autonomous driving, boost correlation multi-object detection and tracking (BcMODT), in this research study to provide more trustworthy features and correlation scores for real-time detection tracking using both camera and LiDAR…
3

Zhang, Liwei, Jiahong Lai, Zenghui Zhang, Zhen Deng, Bingwei He, and Yucheng He. "Multimodal Multiobject Tracking by Fusing Deep Appearance Features and Motion Information." Complexity 2020 (September 25, 2020): 1–10. http://dx.doi.org/10.1155/2020/8810340.

Abstract:
Multiobject Tracking (MOT) is one of the most important abilities of autonomous driving systems. However, most of the existing MOT methods only use a single sensor, such as a camera, which has the problem of insufficient reliability. In this paper, we propose a novel Multiobject Tracking method by fusing deep appearance features and motion information of objects. In this method, the locations of objects are first determined based on a 2D object detector and a 3D object detector. We use the Nonmaximum Suppression (NMS) algorithm to combine the detection results of the two detectors…
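The NMS step mentioned in this abstract — keeping the highest-scoring detection and suppressing overlapping duplicates before the 2D and 3D results are combined — can be illustrated with a minimal greedy sketch. This is not the authors' code; the (x1, y1, x2, y2) box format and the 0.5 overlap threshold are illustrative assumptions.

```python
import numpy as np

def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximum suppression: keep the highest-scoring box,
    drop every remaining box that overlaps it above iou_thresh, repeat."""
    order = list(np.argsort(scores)[::-1])  # indices sorted by descending score
    keep = []
    while order:
        best = order.pop(0)
        keep.append(int(best))
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_thresh]
    return keep
```

In a fusion setting like the one the abstract describes, detections from both detectors would be pooled into `boxes`/`scores` before a single `nms` call.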
4

Hu, Xiantao, Ying Tai, Xu Zhao, et al. "Exploiting Multimodal Spatial-temporal Patterns for Video Object Tracking." Proceedings of the AAAI Conference on Artificial Intelligence 39, no. 4 (2025): 3581–89. https://doi.org/10.1609/aaai.v39i4.32372.

Abstract:
Multimodal tracking has garnered widespread attention as a result of its ability to effectively address the inherent limitations of traditional RGB tracking. However, existing multimodal trackers mainly focus on the fusion and enhancement of spatial features or merely leverage the sparse temporal relationships between video frames. These approaches do not fully exploit the temporal correlations in multimodal videos, making it difficult to capture the dynamic changes and motion information of targets in complex scenarios. To alleviate this problem, we propose a unified multimodal spatial-temporal…
5

Ye, Ping, Gang Xiao, and Jun Liu. "Multimodal Features Alignment for Vision–Language Object Tracking." Remote Sensing 16, no. 7 (2024): 1168. http://dx.doi.org/10.3390/rs16071168.

Abstract:
Vision–language tracking presents a crucial challenge in multimodal object tracking. Integrating language features and visual features can enhance target localization and improve the stability and accuracy of the tracking process. However, most existing fusion models in vision–language trackers simply concatenate visual and linguistic features without considering their semantic relationships. Such methods fail to distinguish the target’s appearance features from the background, particularly when the target changes dramatically. To address these limitations, we introduce an innovative technique…
6

Yao, Rui, Jiazhu Qiu, Yong Zhou, et al. "Visible and Infrared Object Tracking Based on Multimodal Hierarchical Relationship Modeling." Image Analysis and Stereology 43, no. 1 (2024): 41–51. http://dx.doi.org/10.5566/ias.3124.

Abstract:
Visible RGB and Thermal infrared (RGBT) object tracking has emerged as a prominent area of focus within the realm of computer vision. Nevertheless, the majority of existing RGBT tracking methods, which predominantly rely on Transformers, primarily emphasize the enhancement of features extracted by convolutional neural networks. Unfortunately, the latent potential of Transformers in representation learning has been inadequately explored. Furthermore, most studies tend to overlook the significance of distinguishing between the importance of each modality in the context of multimodal tasks…
7

Cao, Bing, Junliang Guo, Pengfei Zhu, and Qinghua Hu. "Bi-directional Adapter for Multimodal Tracking." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 2 (2024): 927–35. http://dx.doi.org/10.1609/aaai.v38i2.27852.

Abstract:
Due to the rapid development of computer vision, single-modal (RGB) object tracking has made significant progress in recent years. Considering the limitation of single imaging sensor, multi-modal images (RGB, infrared, etc.) are introduced to compensate for this deficiency for all-weather object tracking in complex environments. However, as acquiring sufficient multi-modal tracking data is hard while the dominant modality changes with the open environment, most existing techniques fail to extract multi-modal complementary information dynamically, yielding unsatisfactory tracking performance…
8

Fu, Teng, Haiyang Yu, Ke Niu, Bin Li, and Xiangyang Xue. "Foundation Model Driven Appearance Extraction for Robust Multiple Object Tracking." Proceedings of the AAAI Conference on Artificial Intelligence 39, no. 3 (2025): 3031–39. https://doi.org/10.1609/aaai.v39i3.32311.

Abstract:
Multiple Object Tracking (MOT) is a fundamental task in computer vision. Existing methods utilize motion information or appearance information to perform object tracking. However, these algorithms still struggle with special circumstances, such as occlusion and blurring in complex scenes. Inspired by the fact that people can pinpoint objects through verbal descriptions, we explore performing long-term robust tracking using semantic features of objects. Motivated by the success of the multimodal foundation model in text-image alignment, we reconsider the appearance feature extraction module in…
9

Jang, Eunseong, Sang Jun Lee, and HyungGi Jo. "A New Multimodal Map Building Method Using Multiple Object Tracking and Gaussian Process Regression." Remote Sensing 16, no. 14 (2024): 2622. http://dx.doi.org/10.3390/rs16142622.

Abstract:
Recent advancements in simultaneous localization and mapping (SLAM) have significantly improved the handling of dynamic objects. Traditionally, SLAM systems mitigate the impact of dynamic objects by extracting, matching, and tracking features. However, in real-world scenarios, dynamic object information critically influences decision-making processes in autonomous navigation. To address this, we present a novel approach for incorporating dynamic object information into map representations, providing valuable insights for understanding movement context and estimating collision risks. Our method…
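Gaussian process regression, which this paper pairs with multiple object tracking for map building, can be sketched in a few lines for the 1D case. The squared-exponential kernel, its hyperparameters, and the function names below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel: k(a, b) = variance * exp(-(a - b)^2 / (2 * length_scale^2))."""
    d = a[:, None] - b[None, :]  # pairwise differences between the two 1D input sets
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)

def gp_predict(x_train, y_train, x_test, noise=1e-3):
    """GP posterior mean at x_test, given noisy 1D observations (x_train, y_train)."""
    # Gram matrix of the training inputs, regularized by the observation noise.
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    # Cross-covariance between test and training inputs.
    k_star = rbf_kernel(x_test, x_train)
    # Posterior mean: k_star @ K^{-1} @ y_train.
    return k_star @ np.linalg.solve(K, y_train)
```

Fitting a smooth function through a handful of samples like this is the basic mechanism; a map-building system would apply it per spatial dimension with tracked-object observations as training data.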
10

Kota, John S., and Antonia Papandreou-Suppappola. "Joint Design of Transmit Waveforms for Object Tracking in Coexisting Multimodal Sensing Systems." Sensors 19, no. 8 (2019): 1753. http://dx.doi.org/10.3390/s19081753.

Abstract:
We examine a multiple object tracking problem by jointly optimizing the transmit waveforms used in a multimodal system. Coexisting sensors in this system were assumed to share the same spectrum. Depending on the application, a system can include radars tracking multiple targets or multiuser wireless communications and a radar tracking both multiple messages and a target. The proposed spectral coexistence approach was based on designing all transmit waveforms to have the same time-varying phase function while optimizing desirable performance metrics. Considering the scenario of tracking a target…

Dissertations / Theses on the topic "Multimodal object tracking"

1

De Goussencourt, Timothée. "Système multimodal de prévisualisation “on set” pour le cinéma." Thesis, Université Grenoble Alpes (ComUE), 2016. http://www.theses.fr/2016GREAT106/document.

Abstract:
On-set previz is a previsualization step that takes place directly during the shooting phase of a special-effects film. This form of previsualization consists of showing the director an assembled view of the final shot in real time. The work presented in this thesis focuses on one specific step of previsualization: compositing. This step consists of blending several image sources to compose a single, coherent shot. In our case, it involves blending a computer-generated image with an image from the camera present on the film set…
2

Mozaffari, Maaref Mohammad Hamed. "A Real-Time and Automatic Ultrasound-Enhanced Multimodal Second Language Training System: A Deep Learning Approach." Thesis, Université d'Ottawa / University of Ottawa, 2020. http://hdl.handle.net/10393/40477.

Abstract:
The critical role of language pronunciation in communicative competence is significant, especially for second language learners. Despite renewed awareness of the importance of articulation, it remains a challenge for instructors to handle the pronunciation needs of language learners. There are relatively scarce pedagogical tools for pronunciation teaching and learning, such as inefficient, traditional pronunciation instructions like listening and repeating. Recently, electronic visual feedback (EVF) systems (e.g., medical ultrasound imaging) have been exploited in new approaches in such a way…
3

Khalidov, Vasil. "Modèles de mélanges conjugués pour la modélisation de la perception visuelle et auditive." Grenoble, 2010. http://www.theses.fr/2010GRENM064.

Abstract:
In this thesis, we address the modeling of audio-visual perception with a robotic head. The associated problems, notably audio-visual calibration and the detection, localization, and tracking of audio-visual objects, are studied. A spatio-temporal approach to calibrating a robotic head is proposed, based on probabilistic multimodal matching of trajectories. The conjugate mixture model formalism is introduced, together with a family of efficient optimization algorithms for performing multimodal clustering. A particular case…
4

ur Réhman, Shafiq. "Expressing emotions through vibration for perception and control." Doctoral thesis, Umeå universitet, Institutionen för tillämpad fysik och elektronik, 2010. http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-32990.

Abstract:
This thesis addresses a challenging problem: “how to let the visually impaired ‘see’ others’ emotions”. We, human beings, are heavily dependent on facial expressions to express ourselves. A smile shows that the person you are talking to is pleased, amused, relieved, etc. People use emotional information from facial expressions to switch between conversation topics and to determine attitudes of individuals. Missing emotional information from facial expressions and head gestures makes it extremely difficult for the visually impaired to interact with others in social events. To enhance the visually impaired…
5

Rodríguez, Florez Sergio Alberto. "Contributions by vision systems to multi-sensor object localization and tracking for intelligent vehicles." Compiègne, 2010. http://www.theses.fr/2010COMP1910.

Abstract:
Driver assistance systems can improve road safety by helping users through warnings of dangerous situations or by triggering appropriate actions in the event of an imminent collision (airbags, emergency braking, etc.). In this context, knowledge of the position and speed of surrounding moving objects is key information. This is why, in this work, we focus on the detection and tracking of objects in a dynamic scene. Noting that multi-camera systems are increasingly present in vehicles…
6

Sattarov, Egor. "Etude et quantification de la contribution des systèmes de perception multimodale assistés par des informations de contexte pour la détection et le suivi d'objets dynamiques." Thesis, Université Paris-Saclay (ComUE), 2016. http://www.theses.fr/2016SACLS354.

Abstract:
The goal of this thesis is to study and quantify the contribution of context-aided multimodal perception for detecting and tracking moving objects. This study is applied to the detection and recognition of relevant objects in traffic environments for intelligent vehicles (IV). The results obtained should make it possible to transpose the proposed concept to a larger set of sensors and object classes, using an integrative system approach that involves learning methods. In particular, these learning methods…
7

Duarte, Diogo Ferreira. "The Multi-Object Tracking with Multimodal Information for Autonomous Surface Vehicles." Master's thesis, 2022. https://hdl.handle.net/10216/140667.


Book chapters on the topic "Multimodal object tracking"

1

Landabaso, José Luis, and Montse Pardàs. "Foreground Regions Extraction and Characterization Towards Real-Time Object Tracking." In Machine Learning for Multimodal Interaction. Springer Berlin Heidelberg, 2006. http://dx.doi.org/10.1007/11677482_21.

2

Mulhollan, Zachary, Marco Gamarra, Anthony Vodacek, and Matthew Hoffman. "Essential Properties of a Multimodal Hypersonic Object Detection and Tracking System." In Lecture Notes in Computer Science. Springer Nature Switzerland, 2024. http://dx.doi.org/10.1007/978-3-031-52670-1_3.

3

Mao, Chen, Chong Tan, Hong Liu, Jingqi Hu, and Min Zheng. "Stereo3DMOT: Stereo Vision Based 3D Multi-object Tracking with Multimodal ReID." In Pattern Recognition and Computer Vision. Springer Nature Singapore, 2023. http://dx.doi.org/10.1007/978-981-99-8555-5_39.

4

Auer, Peter, Barbara Laner, Martin Pfeiffer, and Kerstin Botsch. "Noticing and assessing nature." In Studies in Language and Social Interaction. John Benjamins Publishing Company, 2024. http://dx.doi.org/10.1075/slsi.36.09aue.

Abstract:
We analyze how walkers employ a verbal format, i.e., the combination of a perception imperative followed by a wie ‘how’-exclamative (e.g., KUCK ma wie TRAUMhaft das is; ‘look PTCL how wonderful that is’), in its multimodal embedding, thus contributing to a multimodal extension of interactional linguistics. The analysis heavily relies on mobile eye-tracking as a method to collect naturally occurring data. It is argued that this kind of analysis would not be possible without the use of this novel technology. We focus on the role of the verbal format in the process of transforming individual perception…
5

Diao, Qian, Jianye Lu, Wei Hu, Yimin Zhang, and Gary Bradski. "DBN Models for Visual Tracking and Prediction." In Bayesian Network Technologies. IGI Global, 2007. http://dx.doi.org/10.4018/978-1-59904-141-4.ch009.

Abstract:
In a visual tracking task, the object may exhibit rich dynamic behavior in complex environments that can corrupt target observations via background clutter and occlusion. Such dynamics and background induce nonlinear, non-Gaussian, and multimodal observation densities. These densities are difficult to model with traditional methods such as Kalman filter models (KFMs) due to their Gaussian assumptions. Dynamic Bayesian networks (DBNs) provide a more general framework in which to solve these problems. DBNs generalize KFMs by allowing arbitrary probability distributions, not just (unimodal) linear-Gaussian…
6

"Software for Automatic Gaze and Face/Object Tracking and its Use for Early Diagnosis of Autism Spectrum Disorders." In Multimodal Interactive Systems Management. EPFL Press, 2014. http://dx.doi.org/10.1201/b15535-14.

7

Bisson, Martin, Farida Cheriet, and Stefan Parent. "3D visualization tool for minimally invasive discectomy assistance." In Studies in Health Technology and Informatics. IOS Press, 2010. https://doi.org/10.3233/978-1-60750-573-0-55.

Abstract:
Multimodal fusion of 2D thoracoscopic images with a pre-operative 3D anatomical model of the spine is useful for minimally invasive surgical procedures using an angled monocular endoscope with varying focal length. An offline calibration procedure has been developed to compute initial endoscope parameters, such as lens distortion, focal length, and optical center before surgery. An optical tracking system is used to update extrinsic parameters describing the position and orientation of the endoscope in real time during the procedure. This calibration allows the registration of the thoracoscopic…
8

Tung, Tony, and Takashi Matsuyama. "Visual Tracking Using Multimodal Particle Filter." In Computer Vision. IGI Global, 2018. http://dx.doi.org/10.4018/978-1-5225-5204-8.ch044.

Abstract:
Visual tracking of humans or objects in motion is a challenging problem when observed data undergo appearance changes (e.g., due to illumination variations, occlusion, cluttered background, etc.). Moreover, tracking systems are usually initialized with predefined target templates, or trained beforehand using known datasets. Hence, they are not always efficient to detect and track objects whose appearance changes over time. In this paper, we propose a multimodal framework based on particle filtering for visual tracking of objects under challenging conditions (e.g., tracking various human body parts…
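The particle filtering that this chapter builds on follows a standard predict-reweight-resample cycle, and unlike a Kalman filter it can represent multimodal posteriors. A minimal 1D bootstrap variant is sketched below; the random-walk motion model and Gaussian observation likelihood are illustrative assumptions, not the chapter's actual models.

```python
import numpy as np

rng = np.random.default_rng(0)

def particle_filter_step(particles, weights, observation,
                         process_std=0.5, obs_std=1.0):
    """One predict-reweight-resample cycle of a bootstrap particle filter.
    The particle set, unlike a Kalman mean/covariance pair, can represent
    multimodal posterior densities."""
    # Predict: propagate each particle through a random-walk motion model.
    particles = particles + rng.normal(0.0, process_std, size=particles.shape)
    # Reweight: multiply by the Gaussian likelihood of the new observation.
    weights = weights * np.exp(-0.5 * ((observation - particles) / obs_std) ** 2)
    weights = weights / weights.sum()
    # Resample: draw a fresh particle set proportionally to the weights.
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    return particles[idx], np.full(len(particles), 1.0 / len(particles))
```

Iterating this step over a stream of observations concentrates the particle cloud around the target state while still allowing several competing hypotheses to coexist.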

Conference papers on the topic "Multimodal object tracking"

1

Borges, Eduardo, Luís Garrote, and Urbano Nunes. "A Modular Multimodal Multi-Object Tracking-by-Detection Approach, with Applications in Outdoor and Indoor Environments." In 21st International Conference on Informatics in Control, Automation and Robotics. SCITEPRESS - Science and Technology Publications, 2024. http://dx.doi.org/10.5220/0013073200003822.

2

Kukal, Rupanjali, Jay Patravali, Fuxun Yu, Simranjit Singh, Nikolaos Karianakis, and Rishi Madhok. "Click&Describe: Multimodal Grounding and Tracking for Aerial Objects." In 2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV). IEEE, 2025. https://doi.org/10.1109/wacv61041.2025.00586.

3

Muresan, Mircea Paul, and Sergiu Nedevschi. "Multimodal sparse LIDAR object tracking in clutter." In 2018 IEEE 14th International Conference on Intelligent Computer Communication and Processing (ICCP). IEEE, 2018. http://dx.doi.org/10.1109/iccp.2018.8516646.

4

Li, Xinlin, Osama A. Hanna, Christina Fragouli, Suhas Diggavi, Gunjan Verma, and Joydeep Bhattacharyya. "Feature Compression for Multimodal Multi-Object Tracking." In MILCOM 2023 - 2023 IEEE Military Communications Conference (MILCOM). IEEE, 2023. http://dx.doi.org/10.1109/milcom58377.2023.10356289.

5

Morrison, Katelyn, Daniel Yates, Maya Roman, and William W. Clark. "Using Object Tracking Techniques to Non-Invasively Measure Thoracic Rotation Range of Motion." In ICMI '20: INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION. ACM, 2020. http://dx.doi.org/10.1145/3395035.3425189.

6

Perez, Marc, and Antonio Agudo. "Robust Multimodal and Multi-Object Tracking for Autonomous Driving Applications." In 2023 21st International Conference on Advanced Robotics (ICAR). IEEE, 2023. http://dx.doi.org/10.1109/icar58858.2023.10406433.

7

Hu, Zhe-Kai, Sin-Ye Jhong, Hao-Wei Hwang, Shih-Hsuan Lin, Kai-Lung Hua, and Yung-Yao Chen. "Bi-Directional Bird’s-Eye View Features Fusion for 3D Multimodal Object Detection and Tracking." In 2023 International Automatic Control Conference (CACS). IEEE, 2023. http://dx.doi.org/10.1109/cacs60074.2023.10326208.

8

Perez, Marc. "Sensor-Agnostic Multimodal Fusion for Multiple Object Tracking from Camera, Radar, Lidar and V2X." In FISITA - Technology and Mobility Conference Europe 2023. FISITA, 2023. http://dx.doi.org/10.46720/fwc2023-sca-018.

9

Vyawahare, Vikram S., and Richard T. Stone. "Asymmetric Interface and Interactions for Bimanual Virtual Assembly With Haptics." In ASME 2012 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference. American Society of Mechanical Engineers, 2012. http://dx.doi.org/10.1115/detc2012-71543.

Abstract:
This paper discusses development of a new bimanual interface configuration for virtual assembly consisting of a haptic device at one hand and a 6DOF tracking device at the other hand. The two devices form a multimodal interaction configuration facilitating unique interactions for virtual assembly. Tasks for virtual assembly can consist of both “one hand one object” and “bimanual single object” interactions. For one hand one object interactions, this device configuration offers advantages in terms of increased manipulation workspace and provides a tradeoff between cost effectiveness and…
10

Valverde, Francisco Rivera, Juana Valeria Hurtado, and Abhinav Valada. "There is More than Meets the Eye: Self-Supervised Multi-Object Detection and Tracking with Sound by Distilling Multimodal Knowledge." In 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2021. http://dx.doi.org/10.1109/cvpr46437.2021.01144.
