Academic literature on the topic 'Multi-Objects perception'


Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Multi-Objects perception.'

Next to every source in the list of references there is an 'Add to bibliography' button. Click it, and we will automatically generate a bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Journal articles on the topic "Multi-Objects perception":

1. Martín, Francisco, Carlos E. Agüero, and José M. Cañas. "Active Visual Perception for Humanoid Robots." International Journal of Humanoid Robotics 12, no. 1 (March 2015): 1550009. http://dx.doi.org/10.1142/s0219843615500097.

Abstract:
Robots detect and keep track of relevant objects in their environment to accomplish their tasks. Many of them are equipped with mobile cameras as their main sensors; they process the images and maintain an internal representation of the detected objects. We propose a novel active visual memory that moves the camera to detect objects in the robot's surroundings and tracks their positions. This visual memory is based on a combination of multi-modal filters that efficiently integrates partial information. The visual attention subsystem is distributed among the software components in charge of detecting relevant objects. We demonstrate the efficiency and robustness of this perception system on a real humanoid robot participating in the RoboCup SPL competition.
2. O’Sullivan, James, Jose Herrero, Elliot Smith, Catherine Schevon, Guy M. McKhann, Sameer A. Sheth, Ashesh D. Mehta, and Nima Mesgarani. "Hierarchical Encoding of Attended Auditory Objects in Multi-talker Speech Perception." Neuron 104, no. 6 (December 2019): 1195–209. http://dx.doi.org/10.1016/j.neuron.2019.09.007.

3. Han, Dong, Hong Nie, Jinbao Chen, Meng Chen, Zhen Deng, and Jianwei Zhang. "Multi-modal haptic image recognition based on deep learning." Sensor Review 38, no. 4 (September 17, 2018): 486–93. http://dx.doi.org/10.1108/sr-08-2017-0160.

Abstract:
Purpose: This paper aims to improve the diversity and richness of haptic perception by recognizing multi-modal haptic images.
Design/methodology/approach: First, the multi-modal haptic data collected by BioTac sensors from different objects are pre-processed and then combined into haptic images. Second, a multi-class and multi-label deep learning model is designed, which can simultaneously learn four haptic features (hardness, thermal conductivity, roughness and texture) from the haptic images and recognize objects based on these features. Haptic images with different dimensions and modalities are provided to test the recognition performance of this model.
Findings: The results imply that multi-modal data fusion performs better than single-modal data on tactile understanding, and that haptic images of larger dimensions are conducive to more accurate haptic measurement.
Practical implications: The proposed method has important potential applications in unknown environment perception, dexterous grasping manipulation and other intelligent robotics domains.
Originality/value: This paper proposes a new deep learning model for extracting multiple haptic features and recognizing objects from multi-modal haptic images.
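As a hedged illustration of the paper's final step (recognizing objects from the four learned haptic features), here is a minimal sketch in which per-feature predictions are matched against a small database; the feature labels and the objects below are invented for illustration, not taken from the paper.

```python
# Hedged sketch: combine per-feature haptic predictions into object
# recognition by nearest match over a (hypothetical) object database.

HAPTIC_FEATURES = ("hardness", "thermal_conductivity", "roughness", "texture")

# Hypothetical database: object -> label per haptic feature (illustrative).
OBJECT_DB = {
    "steel_block": {"hardness": "hard", "thermal_conductivity": "high",
                    "roughness": "smooth", "texture": "flat"},
    "wood_plank":  {"hardness": "hard", "thermal_conductivity": "low",
                    "roughness": "rough", "texture": "grainy"},
    "foam_pad":    {"hardness": "soft", "thermal_conductivity": "low",
                    "roughness": "smooth", "texture": "porous"},
}

def recognize(predicted_labels):
    """Return the database object agreeing with most predicted feature labels."""
    def score(obj_labels):
        return sum(obj_labels[f] == predicted_labels[f] for f in HAPTIC_FEATURES)
    return max(OBJECT_DB, key=lambda name: score(OBJECT_DB[name]))
```

In the paper the per-feature labels come from a multi-label ConvNet over haptic images; here they are simply passed in as a dictionary.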
4. Lisowski, Józef. "Radar Perception of Multi-Object Collision Risk Neural Domains during Autonomous Driving." Electronics 13, no. 6 (March 13, 2024): 1065. http://dx.doi.org/10.3390/electronics13061065.

Abstract:
An analysis of the literature on methods for the perception and motion control of autonomous vehicles shows that they can be improved by using an artificial neural network to generate domains of prohibited maneuvers for passing objects, increasing the safety of autonomous driving in a variety of real environmental conditions. This article concerns radar perception, which involves receiving information about the movement of many autonomous objects, then identifying them, assigning them a collision risk, and preparing a maneuvering response. In the identification process, each object is assigned a domain generated by a previously trained neural network. The size of the domain is proportional to the collision risk and to distance changes during autonomous driving. Then, an optimal trajectory is determined from among the possible safe paths, ensuring minimum-time control. The presented solution to the radar perception task is illustrated with a computer simulation of autonomous driving in a situation of passing many objects. The main achievements presented in this article are the synthesis of a radar perception algorithm mapping the neural domains of autonomous objects that characterize their collision risk, and an assessment of the degree of radar perception using the example of a multi-object autonomous driving simulation.
5. Li, Yucheng, Fei Wang, Liangze Tao, and Juan Wu. "Multi-Modal Haptic Rendering Based on Genetic Algorithm." Electronics 11, no. 23 (November 24, 2022): 3878. http://dx.doi.org/10.3390/electronics11233878.

Abstract:
Multi-modal haptic rendering is an important research direction for improving realism in haptic rendering. It can produce various mechanical stimuli that render multiple perceptions, such as hardness and roughness. This paper proposes a multi-modal haptic rendering method based on a genetic algorithm (GA), which generates force and vibration stimuli of haptic actuators according to the user's target hardness and roughness. The work utilizes a back propagation (BP) neural network to implement the perception model f that establishes the mapping (I=f(G)) from objective stimulus features G to perception intensities I. We use the perception model to design the fitness function of the GA and set physically achievable constraints in the fitness calculation. The perception model is transformed into the force/vibration control model by the GA. Finally, we conducted realism evaluation experiments between real and virtual samples under single-mode or multi-modal haptic rendering, in which subjects gave scores from 0 to 100. The average score was 70.86 for multi-modal haptic rendering, compared with 57.81 for hardness rendering and 50.23 for roughness rendering, which shows that multi-modal haptic rendering is more realistic than a single mode. Building on this work, our method can be applied to render objects along more perceptual dimensions, not limited to hardness and roughness. It has significant implications for multi-modal haptic rendering.
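The inverse-rendering loop described in this abstract (a GA searching for stimulus features G whose predicted perception intensities f(G) match a target) can be sketched as follows. The linear f below is a stand-in for the paper's trained BP network, and the bounds, population size, and mutation scale are assumptions, not the paper's settings.

```python
import random

# Hedged sketch of GA-based inverse haptic rendering: find stimulus
# features G = (force, vibration) whose predicted perception intensities
# f(G) match target (hardness, roughness). f is an assumed linear model.

def f(G):
    force, vib = G
    return (0.8 * force + 0.1 * vib,   # perceived hardness
            0.2 * force + 0.9 * vib)   # perceived roughness

BOUNDS = (0.0, 1.0)  # physically achievable stimulus range (assumed)

def fitness(G, target):
    """Negative squared error between predicted and target intensities."""
    I = f(G)
    return -sum((a - b) ** 2 for a, b in zip(I, target))

def render(target, pop_size=40, generations=60, seed=0):
    rng = random.Random(seed)
    pop = [[rng.uniform(*BOUNDS) for _ in range(2)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda g: fitness(g, target), reverse=True)
        elite = pop[: pop_size // 2]            # truncation selection
        children = []
        while len(children) < pop_size - len(elite):
            a, b = rng.sample(elite, 2)
            child = [(x + y) / 2 for x, y in zip(a, b)]        # crossover
            child = [min(max(x + rng.gauss(0, 0.05), BOUNDS[0]), BOUNDS[1])
                     for x in child]            # mutation, clipped to bounds
            children.append(child)
        pop = elite + children
    return max(pop, key=lambda g: fitness(g, target))
```

Clipping mutated children to `BOUNDS` plays the role of the paper's "physically achievable constraints" in the fitness calculation.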
6. Zhou, Wenjun, Tianfei Wang, Xiaoqin Wu, Chenglin Zuo, Yifan Wang, Quan Zhang, and Bo Peng. "Salient Object Detection via Fusion of Multi-Visual Perception." Applied Sciences 14, no. 8 (April 18, 2024): 3433. http://dx.doi.org/10.3390/app14083433.

Abstract:
Salient object detection aims to distinguish the most visually conspicuous regions, playing an important role in computer vision tasks. However, complex natural scenarios can challenge salient object detection, hindering accurate extraction of objects with rich morphological diversity. This paper proposes a novel method for salient object detection leveraging multi-visual perception, mirroring the human visual system's rapid identification of, and focus on, striking objects/regions within complex scenes. First, a feature map is derived from the original image. Then, salient object detection results are obtained for each perception feature and combined via a feature fusion strategy to produce a saliency map. Finally, superpixel segmentation is employed for precise salient object extraction, removing interference areas. This multi-feature approach to salient object detection harnesses complementary features to adapt to complex scenarios. Competitive experiments on the MSRA10K and ECSSD datasets place our method in the first tier, achieving 0.1302 MAE and 0.9382 F-measure on the MSRA10K dataset and 0.0783 MAE and 0.9635 F-measure on the ECSSD dataset, demonstrating superior salient object detection performance in complex natural scenarios.
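The two metrics reported in this abstract can be reproduced on toy saliency data. The β² = 0.3 weighting in the F-measure is the convention commonly used in salient object detection, assumed here rather than quoted from the paper; maps are flattened to lists of pixel values in [0, 1].

```python
# Hedged sketch of the two reported metrics: Mean Absolute Error (MAE)
# between a predicted saliency map and a binary ground truth, and the
# weighted F-measure after thresholding the prediction.

def mae(pred, gt):
    """Mean absolute error over flattened saliency maps."""
    return sum(abs(p - g) for p, g in zip(pred, gt)) / len(pred)

def f_measure(pred, gt, thresh=0.5, beta2=0.3):
    """Weighted F-measure; beta2 = 0.3 emphasizes precision (convention)."""
    p_bin = [1 if p >= thresh else 0 for p in pred]
    tp = sum(p and g for p, g in zip(p_bin, gt))
    precision = tp / max(sum(p_bin), 1)
    recall = tp / max(sum(gt), 1)
    if precision + recall == 0:
        return 0.0
    return (1 + beta2) * precision * recall / (beta2 * precision + recall)
```

On a perfect thresholded prediction the F-measure reaches 1.0, while MAE still penalizes soft prediction values away from 0 or 1.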
7. Hirsch, Herb L., and Cathleen M. Moore. "Simulating Light Source Motion in Single Images for Enhanced Perceptual Object Detection." Journal of Defense Modeling and Simulation: Applications, Methodology, Technology 9, no. 3 (February 22, 2012): 269–78. http://dx.doi.org/10.1177/1548512911431814.

Abstract:
The SIPHER technique uses mathematically uncomplicated processing to impart interesting effects upon a static image. Importantly, it renders certain areas of an image more perceptible than others, and draws a human observer's attention to particular objects or portions of an image scene. By varying coefficients of the processing in a time-ordered sequence, we can create a multi-frame video wherein the frame-to-frame temporal dynamics further enhance human perception of image objects. In this article we first explain the mathematical formulations and present results from applying SIPHER to simple three-dimensional shapes. Then we explore SIPHER's utility in enhancing visual perception of targets or objects of military interest, in imagery with some typical backgrounds. We also explore how and why these effects enhance human visual perception of the image objects.
8. Marmodoro, Anna, and Matteo Grasso. "The Power of Color." American Philosophical Quarterly 57, no. 1 (January 1, 2020): 65–78. http://dx.doi.org/10.2307/48570646.

Abstract:
Are colors features of objects "out there in the world," or are they features of our inner experience and only "in our head"? Color perception has been the focus of extensive philosophical and scientific debate. In this paper we discuss the limitations of the view that Chalmers (2006) has characterized as Primitivism, and we develop Marmodoro's (2006) Constitutionalism further, to provide a metaphysical account of color perception in terms of causal powers. The result is Power-based Constitutionalism, the view that colors are (multi-track and multi-stage) powers of objects, whose (full) manifestations depend on the mutual manifestation of relevant powers of perceivers and the perceived objects being co-realized in mutual interaction. After a presentation of the tenets of Power-based Constitutionalism, we evaluate its strengths in contrast to two other recent power-based accounts: John Heil's (2003, 2012) powerful qualities view and Max Kistler's (2017) multi-track view.
9. Wang, Li, Ruifeng Li, Jingwen Sun, Xingxing Liu, Lijun Zhao, Hock Soon Seah, Chee Kwang Quah, and Budianto Tandianus. "Multi-View Fusion-Based 3D Object Detection for Robot Indoor Scene Perception." Sensors 19, no. 19 (September 21, 2019): 4092. http://dx.doi.org/10.3390/s19194092.

Abstract:
To autonomously move and operate objects in cluttered indoor environments, a service robot requires the ability of 3D scene perception. Though 3D object detection can provide an object-level environmental description to fill this gap, a robot always encounters incomplete object observation, recurring detections of the same object, errors in detection, or intersections between objects when conducting detection continuously in a cluttered room. To solve these problems, we propose a two-stage 3D object detection algorithm which fuses multiple views of 3D object point clouds in the first stage and eliminates unreasonable and intersecting detections in the second stage. For each view, the robot performs a 2D object semantic segmentation and obtains 3D object point clouds. Then, an unsupervised segmentation method called Locally Convex Connected Patches (LCCP) is utilized to segment the object accurately from the background. Subsequently, Manhattan Frame estimation is implemented to calculate the main orientation of the object, from which the 3D object bounding box can be obtained. To deal with the objects detected in multiple views, we construct an object database and propose an object fusion criterion to maintain it automatically. Thus, the same object observed in multiple views is fused together and a more accurate bounding box can be calculated. Finally, we propose an object filtering approach based on prior knowledge to remove incorrect and intersecting objects from the object database. Experiments are carried out on both the SceneNN dataset and a real indoor environment to verify the stability and accuracy of 3D semantic segmentation and bounding box detection of the object with multi-view fusion.
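The object database with a fusion criterion for recurring detections might look like the following sketch. The IoU threshold and the box-averaging merge rule are illustrative assumptions, not the paper's exact criterion.

```python
# Hedged sketch of a multi-view object database: a new detection is merged
# with an existing entry of the same class when their axis-aligned 3D
# bounding boxes overlap enough; otherwise it is stored as a new object.

def iou_3d(a, b):
    """IoU of axis-aligned boxes given as (xmin, ymin, zmin, xmax, ymax, zmax)."""
    inter = 1.0
    for i in range(3):
        lo, hi = max(a[i], b[i]), min(a[i + 3], b[i + 3])
        if hi <= lo:
            return 0.0          # no overlap on this axis
        inter *= hi - lo
    def vol(box):
        return (box[3] - box[0]) * (box[4] - box[1]) * (box[5] - box[2])
    return inter / (vol(a) + vol(b) - inter)

class ObjectDatabase:
    def __init__(self, iou_thresh=0.25):
        self.objects = []        # list of (class_label, box) entries
        self.iou_thresh = iou_thresh

    def add_detection(self, label, box):
        for i, (lab, old) in enumerate(self.objects):
            if lab == label and iou_3d(old, box) >= self.iou_thresh:
                # Recurring detection of the same object: fuse by averaging.
                merged = tuple((o + n) / 2 for o, n in zip(old, box))
                self.objects[i] = (lab, merged)
                return
        self.objects.append((label, box))   # genuinely new object
```

Averaging is the simplest possible fusion; a fuller version would weight views by detection confidence or point-cloud completeness, as the paper's criterion presumably does.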
10. Zhu, Jinchao, Xiaoyu Zhang, Shuo Zhang, and Junnan Liu. "Inferring Camouflaged Objects by Texture-Aware Interactive Guidance Network." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 4 (May 18, 2021): 3599–607. http://dx.doi.org/10.1609/aaai.v35i4.16475.

Abstract:
Camouflaged objects, similar to the background, show indefinable boundaries and deceptive textures, which increase the difficulty of the detection task and make the model rely on more informative features. Herein, we design a texture label to facilitate our network for accurate camouflaged object segmentation. Motivated by the complementary relationship between texture labels and camouflaged object labels, we propose an interactive guidance framework named TINet, which focuses on finding the indefinable boundary and the texture difference through progressive interactive guidance. It maximizes the guidance effect of refined multi-level texture cues on segmentation. Specifically, the texture perception decoder (TPD) makes a comprehensive analysis of texture information at multiple scales. The feature interaction guidance decoder (FGD) interactively refines multi-level features of camouflaged object detection and texture detection level by level. The holistic perception decoder (HPD) enhances FGD results by multi-level holistic perception. In addition, we propose a boundary weight map to help the loss function pay more attention to the object boundary. Extensive experiments conducted on COD and SOD datasets demonstrate that the proposed method performs favorably against 23 state-of-the-art methods.

Dissertations / Theses on the topic "Multi-Objects perception":

1. Haddad, Lilas. "Impact of multiple affordances on object perception in natural scenes." Electronic thesis or dissertation, Université de Lille (2022-....), 2023. https://pepite-depot.univ-lille.fr/ToutIDP/EDSHS/2023/2023ULILH060.pdf.

Abstract:
Object perception and action perception are closely interrelated. Perceiving visual objects also leads to the perception of various grasping components evoked by the objects, known as micro-affordances. We have numerous pieces of evidence that a single object may evoke micro-affordances such as a right- or left-hand grasp depending on object handle orientation or a power or precision grip depending on object size. However, natural scenes are usually composed of several objects evoking multiple affordances that may impact object perceptual processing. Moreover, objects presented in a common scene are usually semantically related, as they are part of the same context. The semantic relations between objects may then modulate how one perceives objects and their affordances. In this view, thematic relations between objects (e.g., key-lock) are particularly interesting as they share cognitive and neural substrates with use gesture knowledge. The aim of this thesis is to investigate the consequences of the evocation of multiple affordances on the perception and selection of a given object in naturalistic scenes. We investigated how the similarity of affordances would impact object selection and how thematic relations between objects would modulate object perceptual processing. In a first online behavioral study using a stimulus and response compatibility paradigm, we highlighted a processing cost when pairs of unrelated objects had similar right- or left-hand grasp affordances, with the similarity of affordances slowing down target selection. Furthermore, the cost entailed by similar handle affordances was restricted to action relevant situations, when responding with the dominant hand and when the response was compatible with the affordance of the target. In a second behavioral experiment using the stimulus and response compatibility paradigm in a 3D environment, we were able to extend these first findings to other types of micro-affordances (grasp size affordances). 
Again, we demonstrated a perceptual processing cost when pairs of objects had similar grasp size affordances. Furthermore, we highlighted a suppression of the cost entailed by similar affordances on target selection when objects were thematically related. In a third neurophysiological study using electroencephalography, we evaluated the correlates of the cost entailed by similar affordances on µ rhythm desynchronization, which is assumed to reflect the activity of the motor neural network during perception. Results revealed that during target selection, μ desynchronization was reduced when affordances were similar rather than dissimilar. This effect disappeared when objects were thematically related. Overall, behavioral and neurophysiological evidence supports the model of affordance inhibition proposed by Vainio and Ellis (2020) and Caligiore et al. (2013). According to the inhibition hypothesis, the observer needs to inhibit distractor objects to select the target object. When the different objects in the scene have similar affordances, inhibition of the distractor object and its affordances leads to the automatic inhibition of the target affordance, which slows down target processing. The present work provides behavioral and neural evidence in favor of the inhibition model of affordance and object selection in more naturalistic scenes involving familiar meaningful objects. In addition, it demonstrates for the first time the role of semantic relations in the regulation of affordance inhibition in naturalistic scenes.
2. Vivet, Damien. "Perception de l'environnement par radar hyperfréquence. Application à la localisation et la cartographie simultanées, à la détection et au suivi d'objets mobiles en milieu extérieur." PhD thesis, Université Blaise Pascal - Clermont-Ferrand II, 2011. http://tel.archives-ouvertes.fr/tel-00659270.

Abstract:
In the context of outdoor mobile robotics, perception and localization are essential to the autonomous operation of a vehicle. The objectives of this thesis work are multiple and lead toward the goal of simultaneous localization and mapping of a dynamic outdoor environment with detection and tracking of moving objects (SLAMMOT), using a single rotating exteroceptive radar sensor under so-called "realistic" driving conditions, that is, at high speed (about 30 km/h). Note that at such speeds, the data acquired by a rotating sensor are corrupted by the vehicle's own motion. This distortion, usually regarded as a disturbance, is analyzed here as a source of information. The study also evaluates the potential of an FMCW (frequency-modulated continuous wave) radar sensor for operating an autonomous robotic vehicle. We propose several contributions: - an on-the-fly correction of the distortion using proprioceptive sensors, which led to a simultaneous localization and mapping (SLAM) application; - a method for evaluating segment-based SLAM results; - an exploitation of data distortion for proprioceptive purposes, leading to a SLAM application; - an odometry principle based on the Doppler data inherent to the radar sensor; - a method for detecting and tracking moving objects: DATMO with a single radar.
3. Chavez Garcia, Ricardo Omar. "Multiple sensor fusion for detection, classification and tracking of moving objects in driving environments." Thesis, Grenoble, 2014. http://www.theses.fr/2014GRENM034/document.

Abstract:
Advanced driver assistance systems (ADAS) help drivers to perform complex driving tasks and to avoid or mitigate dangerous situations. The vehicle senses the external world using sensors and then builds and updates an internal model of the environment configuration. Vehicle perception consists of establishing the spatial and temporal relationships between the vehicle and the static and moving obstacles in the environment. Vehicle perception is composed of two main tasks: simultaneous localization and mapping (SLAM) deals with modelling static parts; and detection and tracking moving objects (DATMO) is responsible for modelling moving parts in the environment. In order to perform a good reasoning and control, the system has to correctly model the surrounding environment. The accurate detection and classification of moving objects is a critical aspect of a moving object tracking system. Therefore, many sensors are part of a common intelligent vehicle system. Classification of moving objects is needed to determine the possible behaviour of the objects surrounding the vehicle, and it is usually performed at tracking level. Knowledge about the class of moving objects at detection level can help improve their tracking. Most of the current perception solutions consider classification information only as aggregate information for the final perception output. Also, management of incomplete information is an important requirement for perception systems. Incomplete information can be originated from sensor-related reasons, such as calibration issues and hardware malfunctions; or from scene perturbations, like occlusions, weather issues and object shifting. It is important to manage these situations by taking them into account in the perception process. The main contributions in this dissertation focus on the DATMO stage of the perception problem. 
Precisely, we believe that by including the object's class as a key element of the object's representation and by managing the uncertainty from multiple sensor detections, we can improve the results of the perception task, i.e., obtain a more reliable list of moving objects of interest represented by their dynamic state and appearance information. Therefore, we address the problems of sensor data association and sensor fusion for object detection, classification, and tracking at different levels within the DATMO stage. Although we focus on a set of three main sensors: radar, lidar, and camera, we propose a modifiable architecture to include other types or numbers of sensors. First, we define a composite object representation to include class information as part of the object state from the early stages to the final output of the perception task. Second, we propose, implement, and compare two different perception architectures to solve the DATMO problem according to the level at which object association, fusion, and classification information is included and performed. Our data fusion approaches are based on the evidential framework, which is used to manage and include the uncertainty from sensor detections and object classifications. Third, we propose an evidential data association approach to establish a relationship between two sources of evidence from object detections. We observe how the class information improves the final result of the DATMO component. Fourth, we integrate the proposed fusion approaches as part of a real-time vehicle application. This integration has been performed in a real vehicle demonstrator from the interactIVe European project. Finally, we analysed and experimentally evaluated the performance of the proposed methods. We compared our evidential fusion approaches against each other and against a state-of-the-art method using real data from different driving scenarios.
These comparisons focused on the detection, classification and tracking of different moving objects: pedestrians, bikes, cars and trucks.
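The evidential framework used in the dissertation is based on Dempster-Shafer theory. As an illustration only (not the author's actual implementation), a minimal sketch of Dempster's combination rule over the four object classes might look like this, with the mass values chosen purely for the example:

```python
from itertools import product

def combine(m1, m2):
    """Dempster's rule of combination over mass functions keyed by frozenset focal elements."""
    fused, conflict = {}, 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            fused[inter] = fused.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb                 # mass falling on the empty set
    # renormalize by the non-conflicting mass
    return {k: v / (1.0 - conflict) for k, v in fused.items()}

CLASSES = frozenset({"pedestrian", "bike", "car", "truck"})
lidar = {frozenset({"car", "truck"}): 0.6, CLASSES: 0.4}   # shape-based, imprecise evidence
camera = {frozenset({"car"}): 0.7, CLASSES: 0.3}           # appearance-based evidence
fused = combine(lidar, camera)
```

Here an imprecise lidar detection supporting {car, truck} combined with a camera classifier favouring {car} concentrates 0.70 of the mass on {car}, showing how class evidence from two sensors reinforces a single hypothesis.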
4

Shao, Hang. "A Fast MLP-based Learning Method and its Application to Mine Countermeasure Missions." Thèse, Université d'Ottawa / University of Ottawa, 2012. http://hdl.handle.net/10393/23512.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
In this research, a novel machine learning method is designed and applied to Mine Countermeasure Missions. Similar to some kernel methods, the proposed approach seeks to compute a linear model in another, higher-dimensional feature space. However, no kernel is used and the feature mapping is explicit: computation can be done directly in the accessible feature space. In the proposed approach, the feature projection is implemented by constructing a large hidden layer, which differs from the traditional belief that a Multi-Layer Perceptron should be funnel-shaped, with the hidden layer acting as a feature extractor. The proposed approach is a general method that can be applied to various problems. It is able to improve the performance of neural network based methods and the learning speed of support vector machines. The classification speed of the proposed approach is also faster than that of kernel machines on the mine countermeasure mission task.
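The idea of an explicit mapping through a large random hidden layer followed by a linear readout can be sketched as follows. This is a generic extreme-learning-machine-style illustration, not the thesis's actual method; the hidden-layer size, ridge regularizer and toy regression target are all assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def fit(X, y, n_hidden=200, reg=1e-3):
    """Project inputs through a large fixed random hidden layer, then solve a ridge readout."""
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)                      # explicit high-dimensional feature map
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ y)
    return W, b, beta

def predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

# Toy regression: learn y = x1 * x2, which is not linear in the raw inputs.
X = rng.uniform(-1, 1, size=(200, 2))
y = X[:, 0] * X[:, 1]
W, b, beta = fit(X, y)
mse = np.mean((predict(X, W, b, beta) - y) ** 2)
```

Only the linear readout `beta` is trained, which is why such models fit much faster than backpropagated networks or kernel machines.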
5

Asvadi, Alireza. "Multi-Sensor Object Detection for Autonomous Driving." Doctoral thesis, 2018. http://hdl.handle.net/10316/81236.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Thesis submitted to the Department of Electrical and Computer Engineering of the Faculty of Science and Technology of the University of Coimbra in partial fulfillment of the requirements for the Degree of Doctor of Philosophy
In this thesis, we propose on-board multisensor obstacle and object detection systems using a 3D-LIDAR, a monocular color camera and GPS-aided Inertial Navigation System (INS) positioning data, with application to self-driving road vehicles. Firstly, an obstacle detection system is proposed that incorporates 4D data (3D spatial data and time) and is composed of two main modules: (i) a ground surface estimation using piecewise planes, and (ii) a voxel grid model for static and moving obstacle detection using ego-motion information. An extension of the proposed obstacle detection system to a Detection And Tracking of Moving Objects (DATMO) system is proposed to achieve an object-level perception of dynamic scenes, followed by the fusion of 3D-LIDAR with camera data to improve the tracking function of the DATMO system. The proposed obstacle detection effectively models the dynamic driving environment, and the proposed DATMO method is able to deal with the localization error of the position sensing system when computing motion. The proposed fusion tracking module integrates multiple sensors to improve object tracking. Secondly, an object detection system based on the hypothesis generation and verification paradigms is proposed using 3D-LIDAR data and Convolutional Neural Networks (ConvNets). Hypothesis generation is performed by applying clustering to the point cloud data. In the hypothesis verification phase, a depth map is generated from the 3D-LIDAR data and its values are input to a ConvNet for object detection. Finally, a multimodal object detection is proposed using a hybrid neural network composed of deep ConvNets and a Multi-Layer Perceptron (MLP) neural network. Three modalities, depth and reflectance maps (both generated from 3D-LIDAR data) and a color image, are used as inputs. Three deep ConvNet-based object detectors run individually on each modality to detect the object bounding boxes. 
Detections from each modality are jointly learned and fused by an MLP-based late-fusion strategy. The purpose of the multimodal detection fusion is to reduce the misdetection rate of each modality, which leads to a more accurate detection. Quantitative and qualitative evaluations were performed using datasets derived from the 'Object Detection Evaluation' and 'Object Tracking Evaluation' sets of the KITTI Vision Benchmark Suite. Reported results demonstrate the applicability and efficiency of the proposed obstacle and object detection approaches in urban scenarios.
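The depth-map generation step from LIDAR data can be sketched with a pinhole projection. The intrinsics and test points below are made-up values; a real pipeline must first apply the LIDAR-to-camera extrinsic calibration, which is omitted here by assuming the points are already in the camera frame:

```python
import numpy as np

def lidar_to_depth_map(points_cam, K, h, w):
    """Project 3D points (already in the camera frame, shape (N, 3)) into a sparse depth image."""
    pts = points_cam[points_cam[:, 2] > 0]          # keep points in front of the camera
    uv = (K @ pts.T).T
    u = (uv[:, 0] / uv[:, 2]).astype(int)
    v = (uv[:, 1] / uv[:, 2]).astype(int)
    ok = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    depth = np.zeros((h, w))
    order = np.argsort(-pts[ok, 2])                 # write far points first; near ones overwrite
    depth[v[ok][order], u[ok][order]] = pts[ok, 2][order]
    return depth

K = np.array([[100.0, 0.0, 50.0],
              [0.0, 100.0, 50.0],
              [0.0, 0.0, 1.0]])                     # toy pinhole intrinsics
pts = np.array([[0.0, 0.0, 5.0],                    # projects to pixel (50, 50)
                [0.0, 0.0, 2.0],                    # same pixel but nearer: wins
                [0.0, 0.0, -1.0]])                  # behind the camera: discarded
depth = lidar_to_depth_map(pts, K, 100, 100)
```

Sorting by depth before writing ensures that when several points land on the same pixel, the nearest one is kept, which matches what a camera would see.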

Book chapters on the topic "Multi-Objects perception":

1

Bruder, S., M. Farooq, and M. Bayoumi. "Multi-Sensor Integration for Robots Interacting with Autonomous Objects." In Active Perception and Robot Vision, 395–411. Berlin, Heidelberg: Springer Berlin Heidelberg, 1992. http://dx.doi.org/10.1007/978-3-642-77225-2_20.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Porquis, Lope Ben, Masashi Konyo, Naohisa Nagaya, and Satoshi Tadokoro. "Multi-contact Vacuum-Driven Tactile Display for Representing Force Vectors Applied on Grasped Objects." In Haptics: Perception, Devices, Mobility, and Communication, 218–21. Berlin, Heidelberg: Springer Berlin Heidelberg, 2012. http://dx.doi.org/10.1007/978-3-642-31404-9_40.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Hummel, Emilie, Claudio Pacchierotti, Valérie Gouranton, Ronan Gaugne, Theophane Nicolas, and Anatole Lécuyer. "Haptic Rattle: Multi-modal Rendering of Virtual Objects Inside a Hollow Container." In Haptics: Science, Technology, Applications, 189–97. Cham: Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-031-06249-0_22.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
The sense of touch plays a strong role in the perception of the properties and characteristics of hollow objects. The action of shaking a hollow container to get an insight into its content is a natural and common interaction. In this paper, we present a multi-modal rendering approach for the simulation of virtual moving objects inside a hollow container, based on the combination of haptic and audio cues generated by voice-coil actuators and high-fidelity headphones, respectively. We conducted a user study in which thirty participants were asked to interact with a target cylindrical hollow object and estimate the number of moving objects inside, relying on haptic feedback only, audio feedback only, or a combination of both. Results indicate that the combination of various senses is important in the perception of the content of a container.
4

Altamirano Cabrera, Miguel, Juan Heredia, and Dzmitry Tsetserukou. "Tactile Perception of Objects by the User’s Palm for the Development of Multi-contact Wearable Tactile Displays." In Haptics: Science, Technology, Applications, 51–59. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-58147-3_6.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Chen, Jian, Bingxi Jia, and Kaixiang Zhang. "Range Identification of Moving Objects." In Multi-View Geometry Based Visual Perception and Control of Robotic Systems, 17–126. Boca Raton, FL: CRC Press/Taylor & Francis Group, 2018. http://dx.doi.org/10.1201/9780429489211-7.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Chen, Jian, Bingxi Jia, and Kaixiang Zhang. "Motion Estimation of Moving Objects." In Multi-View Geometry Based Visual Perception and Control of Robotic Systems, 127–40. Boca Raton, FL: CRC Press/Taylor & Francis Group, 2018. http://dx.doi.org/10.1201/9780429489211-8.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Imanov, Elbrus, and Zubair Shah. "Applying Multi-layers Feature Fusion in SSD for Detection of Small-Scale Objects." In 11th International Conference on Theory and Application of Soft Computing, Computing with Words and Perceptions and Artificial Intelligence - ICSCCW-2021, 552–59. Cham: Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-030-92127-9_74.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

Blasco, Jose, and Francisco Rovira-Más. "Advances in local perception for orchard robotics." In Burleigh Dodds Series in Agricultural Science, 75–102. Burleigh Dodds Science Publishing, 2024. http://dx.doi.org/10.19103/as.2023.0124.03.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
The development of digital technologies, cost pressures and the increasing need for sustainability have heightened interest in the application of robotics and automation to improve the efficiency of agricultural operations. Sensors for autonomous navigation require precise positioning and perception to keep robots on track, avoid obstacles and correctly identify target objects such as fruit. Sensors capable of providing three-dimensional information, such as stereo cameras, time-of-flight cameras and laser scanners, are emerging as effective solutions. Colour, multi- or hyperspectral and thermal cameras are also widely used for real-time crop sensing. This chapter reviews the advantages and limitations of these sensors for practical farming operations.
9

Ababsa, Fakhreddine, Iman Maissa Zendjebil, and Jean-Yves Didier. "3D Camera Tracking for Mixed Reality using Multi-Sensors Technology." In Geographic Information Systems, 2164–75. IGI Global, 2013. http://dx.doi.org/10.4018/978-1-4666-2038-4.ch128.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
The concept of Mixed Reality (MR) aims at completing our perception of the real world by adding fictitious elements that are not perceptible naturally, such as computer-generated images, virtual objects, texts, symbols, graphics, sounds, smells, et cetera. One of the major challenges for an efficient Mixed Reality system is to ensure the spatiotemporal coherence of the augmented scene between the virtual and the real objects. The quality of the Real/Virtual registration depends mainly on the accuracy of the 3D camera pose estimation. The goal of this chapter is to provide an overview of the recent multi-sensor fusion approaches used in Mixed Reality systems for 3D camera tracking. We describe the main sensors used in those approaches and detail the issues surrounding their use (calibration process, fusion strategies, etc.). We include the description of some Mixed Reality techniques developed in recent years that use multi-sensor technology. Finally, we highlight new directions and open problems in this research field.
10

Yu, Hong, Zhiyue Wang, Yuanqiu Liu, and Han Liu. "Boosting Visual Question Answering Through Geometric Perception and Region Features." In Frontiers in Artificial Intelligence and Applications. IOS Press, 2023. http://dx.doi.org/10.3233/faia230607.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Visual question answering (VQA) is a crucial yet challenging task in multimodal understanding. To correctly answer questions about an image, VQA models are required to comprehend the fine-grained semantics of both the image and the question. Recent advances have shown that both grid and region features contribute to improving VQA performance, with grid features surprisingly outperforming region features. However, grid features inevitably introduce visual semantic noise due to their fine granularity. Besides, ignoring geometric relationships makes it difficult for VQA models to understand the relative positions of objects in the image and answer questions accurately. In this paper, we propose a visual enhancement network for VQA that leverages region features and position information to enhance grid features, thus generating richer visual grid semantics. First, the grid enhancement multi-head guided-attention module utilizes regions around the grid to provide visual context, forming rich visual grid semantics and effectively compensating for the fine granularity of the grid. Second, a novel geometric perception multi-head self-attention is introduced to process the two types of features, incorporating geometric relations such as the relative direction between objects while exploring internal semantic interactions. Extensive experiments demonstrate that the proposed method obtains competitive results over other strong baselines.
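The guided-attention idea, where grid features attend over region features to gain visual context, can be illustrated with a single-head cross-attention sketch. The multi-head and geometric-bias components of the paper are omitted, and the shapes and values are arbitrary:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def guided_attention(grid, regions):
    """Each grid feature (G, d) attends over region features (R, d); a residual add keeps grid semantics."""
    d = grid.shape[-1]
    attn = softmax(grid @ regions.T / np.sqrt(d))   # (G, R) attention weights, rows sum to 1
    return grid + attn @ regions                    # region context enhances each grid cell

rng = np.random.default_rng(1)
grid = rng.normal(size=(49, 8))                     # e.g. a 7x7 grid of 8-d features
regions = np.tile(rng.normal(size=(1, 8)), (5, 1))  # five identical region features (degenerate case)
out = guided_attention(grid, regions)
```

In the degenerate case above, because all region features are identical and the attention rows sum to one, every grid cell is shifted by exactly that region vector, which makes the residual behaviour easy to verify.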

Conference papers on the topic "Multi-Objects perception":

1

Wang, Yi Ru, Yuchi Zhao, Haoping Xu, Sagi Eppel, Alán Aspuru-Guzik, Florian Shkurti, and Animesh Garg. "MVTrans: Multi-View Perception of Transparent Objects." In 2023 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2023. http://dx.doi.org/10.1109/icra48891.2023.10161089.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Nayeem, Rashida, Salah Bazzi, Mohsen Sadeghi, Reza Sharif Razavian, and Dagmar Sternad. "Multi-modal Interactive Perception in Human Control of Complex Objects." In 2023 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2023. http://dx.doi.org/10.1109/icra48891.2023.10160375.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Luciani, Annie, Sile O'Modhrain, Charlotte Magnusson, Jean-Loup Florens, and Damien Couroussé. "Perception of Virtual Multi-Sensory Objects: Some Musings on the Enactive Approach." In 2008 International Conference on Cyberworlds (CW). IEEE, 2008. http://dx.doi.org/10.1109/cw.2008.107.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Ji, Jia-Hui, Yu Zhao, Jing-Wen Bu, and Tao Zhang. "Point Cloud Holographic Encryption Display System involving 3D Face Recognition and air-writing." In 3D Image Acquisition and Display: Technology, Perception and Applications. Washington, D.C.: Optica Publishing Group, 2023. http://dx.doi.org/10.1364/3d.2023.jw2a.22.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
We propose a holographic display system involving face recognition, air-writing, and multiple point cloud gridding encryption (M-PCGE) methods to provide multi-level security for objects. The feasibility of the proposed methods is confirmed by numerical reconstruction.
5

Amiri, Saeid, Suhua Wei, Shiqi Zhang, Jivko Sinapov, Jesse Thomason, and Peter Stone. "Multi-modal Predicate Identification using Dynamically Learned Robot Controllers." In Twenty-Seventh International Joint Conference on Artificial Intelligence {IJCAI-18}. California: International Joint Conferences on Artificial Intelligence Organization, 2018. http://dx.doi.org/10.24963/ijcai.2018/645.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Intelligent robots frequently need to explore the objects in their working environments. Modern sensors have enabled robots to learn object properties via perception of multiple modalities. However, object exploration in the real world poses a challenging trade-off between information gain and exploration action costs. The mixed observability Markov decision process (MOMDP) is a framework for planning under uncertainty while accounting for both fully and partially observable components of the state; robot perception frequently has to face such mixed observability. This work enables a robot equipped with an arm to dynamically construct query-oriented MOMDPs for multi-modal predicate identification (MPI) of objects. The robot's behavioral policy is learned from two datasets collected using real robots. Our approach enables a robot to explore object properties significantly faster, while improving accuracy, in comparison to existing methods that rely on hand-coded exploration strategies.
6

Martin Martin, Roberto, and Oliver Brock. "Online interactive perception of articulated objects with multi-level recursive estimation based on task-specific priors." In 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2014). IEEE, 2014. http://dx.doi.org/10.1109/iros.2014.6942902.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Levine, Evan, Can Chen, Manuel Martinello, Mahdi Nezamabadi, Siu-Kei Tin, Jinwei Ye, and Francisco Imai. "Challenges and solutions in 3D object capture: High-precision multi-view camera calibration using a rotating state; and 3D reconstruction of mirror-like objects using efficient ray coding." In 3D Image Acquisition and Display: Technology, Perception and Applications. Washington, D.C.: OSA, 2017. http://dx.doi.org/10.1364/3d.2017.dw4f.2.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

Skaza, Maciej. "Between virtuality and reality: remarks about perception of city architecture." In Virtual City and Territory. Barcelona: Centre de Política de Sòl i Valoracions, 2016. http://dx.doi.org/10.5821/ctv.8055.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
In contemporary reality, the term "diversity" has become the basic feature characterizing both the creation and the perception of the surrounding world. Trying to describe the city, the place where half of the Earth's population lives, faces the same problem that occurs during attempts to define styles or tendencies in architecture, urbanism or any other area of human activity. Therefore it is not possible to indicate one model of the contemporary city, or to determine its appropriate scale, structure and function. Considering the complexity of contemporaneity, its multi-layering and the variety of possible reference points (named here "perception"), the only element that can be identified as prevalent in the discussion about the city is man. The development of space in our cities is accompanied by the continuous development of a parallel virtual world. Perhaps it is still too early to call it "virtual reality", comprehended in the way in which we perceive the reality around us. This does not change the fact that fragments of electronic space, acting as digital memory, change our perception of architecture and cities. Currently, technological development affects Homo sapiens much more than other factors in the environment where we live. One can ask whether this new reality won't entirely replace the need for direct contact with the real world. The city and its architecture are perceived through electronic prostheses. The surrounding world ceases to be perceived in a natural way, and images of images become the objects of human perception. The intention of these considerations is not to answer these questions, but to focus attention on the problems arising from the change in how architecture is perceived.
9

Faykus, Max Henry, Bradley Selee, and Melissa Smith. "Utilizing Neural Networks for Semantic Segmentation on RGB/LiDAR Fused Data for Off-road Autonomous Military Vehicle Perception." In WCX SAE World Congress Experience. 400 Commonwealth Drive, Warrendale, PA, United States: SAE International, 2023. http://dx.doi.org/10.4271/2023-01-0740.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Image segmentation has historically been a technique for analyzing terrain for military autonomous vehicles. One of the weaknesses of image segmentation from camera data is that it lacks depth information, and it can be affected by environment lighting. Light detection and ranging (LiDAR) is an emerging technology in image segmentation that is able to estimate distances to the objects it detects. One advantage of LiDAR is the ability to gather accurate distances regardless of day, night, shadows, or glare. This study examines LiDAR and camera image segmentation fusion to improve an advanced driver-assistance system (ADAS) algorithm for off-road autonomous military vehicles. The volume of points generated by LiDAR provides the vehicle with distance and spatial data surrounding the vehicle. Processing these point clouds with semantic segmentation is a computationally intensive process requiring fusion of camera and LiDAR data so that the neural network can process depth and image data simultaneously. We create fused depth images by using a projection method from the LiDAR onto the images to create depth images (RGB-Depth). A neural network is trained to segment the fused data from RELLIS-3D, which is a multi-modal data set for off-road robotics. This data set contains both LiDAR point clouds and corresponding RGB images for training the neural network. The labels from the data set are grouped as objects, traversable terrain, non-traversable terrain, and sky to balance underrepresented classes. Results on a modified version of DeepLabv3+ with a ResNet-18 backbone achieve an overall accuracy of 93.989 percent.
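Grouping fine-grained segmentation labels into the four classes mentioned (objects, traversable, non-traversable, sky) is typically done with a lookup table applied to the label mask. The fine label IDs below are hypothetical placeholders, not the actual RELLIS-3D IDs:

```python
import numpy as np

# Hypothetical fine-label IDs grouped into the four coarse classes from the abstract.
GROUPS = {
    "objects": [1, 2, 3],          # e.g. person, vehicle, barrier
    "traversable": [4, 5],         # e.g. grass, dirt
    "non_traversable": [6, 7],     # e.g. water, deep mud
    "sky": [8],
}

def build_lut(groups, n_labels):
    """Build a lookup table mapping fine label IDs to coarse group IDs (0 = void/unlabeled)."""
    lut = np.zeros(n_labels, dtype=np.uint8)
    for gid, ids in enumerate(groups.values(), start=1):
        lut[ids] = gid
    return lut

lut = build_lut(GROUPS, 9)
mask = np.array([[1, 4],
                 [8, 0]])          # a tiny 2x2 label mask
coarse = lut[mask]                 # vectorized remap of the whole mask at once
```

Indexing the LUT with the full mask array remaps every pixel in a single vectorized operation, which scales to full-resolution label images.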
10

Wippelhauser, András, Arpita Chand, Somak Datta Gupta, and Andras Varadi. "Performance and Network Architecture Options of Consolidated Object Data Service for Multi-RAT Vehicular Communication." In WCX SAE World Congress Experience. 400 Commonwealth Drive, Warrendale, PA, United States: SAE International, 2023. http://dx.doi.org/10.4271/2023-01-0857.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
With the proliferation of ADAS and autonomous systems, the quality and quantity of the data to be used by vehicles has become crucial. In-vehicle sensors are evolving, but their usability is limited by their field of view and detection distance. V2X communication systems solve these issues by creating a cooperative perception domain amongst road users and the infrastructure by communicating accurate, real-time information. In this paper, we propose a novel Consolidated Object Data Service (CODS) for multi-Radio Access Technology (RAT) V2X communication. This service collects information using BSM packets from the vehicular network and perception information from infrastructure-based sensors. The service then fuses the collected data, offering the communication participants a consolidated, deduplicated, and accurate object database. Since fusing the objects is resource intensive, this service can save in-vehicle computation costs. The combination of diverse input sources improves the object detection accuracy, which can benefit the vehicle's ADAS or autonomous driving functions. A testbed was developed to evaluate the performance of the system under three network architectures: local RSU, Edge and the Cloud. The CODS resided in virtual machines in the corresponding three locations. The OBUs and RSU used had multi-RAT (C-V2X PC5 and 5G Uu) connectivity. A connected thermal camera was used as the infrastructure sensor in our setup. The paper presents the performance evaluation of various CODS realizations and deployment details of the testbed on a live network, and introduces our promising experimental results, explaining the trade-offs of the different deployment schemes and their effects on system fidelity and communication characteristics.
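The consolidation and deduplication step of such a service can be sketched as a greedy distance-gated merge of object reports. The 2-metre gate and the position-only state are simplifying assumptions; a real CODS would also fuse velocity, class, and timestamps:

```python
import math

def dedup(objects, gate=2.0):
    """Greedy consolidation: reports within `gate` metres of a fused object are averaged into it."""
    fused = []
    for obj in objects:
        for f in fused:
            if math.dist(obj["pos"], f["pos"]) < gate:
                n = f["n"]
                # running mean of positions for the merged object
                f["pos"] = tuple((n * a + b) / (n + 1) for a, b in zip(f["pos"], obj["pos"]))
                f["n"] = n + 1
                break
        else:
            fused.append({"pos": obj["pos"], "n": 1})
    return fused

# Two nearby reports of the same vehicle plus one distant object.
reports = [{"pos": (0.0, 0.0)}, {"pos": (0.5, 0.0)}, {"pos": (10.0, 0.0)}]
fused = dedup(reports)
```

The two reports within the gate collapse into one averaged object, so downstream consumers receive a deduplicated list rather than one entry per sensor report.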

To the bibliography