Academic literature on the topic 'Computer Vision, Object Recognition, Vision and Scene Understanding, Object Detection'

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Computer Vision, Object Recognition, Vision and Scene Understanding, Object Detection.'

Journal articles on the topic "Computer Vision, Object Recognition, Vision and Scene Understanding, Object Detection"

1

Achirei, Ștefan-Daniel. "Short Literature Review for Visual Scene Understanding." Bulletin of the Polytechnic Institute of Iași. Electrical Engineering, Power Engineering, Electronics Section 67, no. 3 (September 1, 2021): 57–72. http://dx.doi.org/10.2478/bipie-2021-0017.

Abstract:
Individuals are highly accurate for visually understanding natural scenes. By extracting and extrapolating data we reach the highest stage of scene understanding. In the past few years it proved to be an essential part in computer vision applications. It goes further than object detection by bringing machine perceiving closer to the human one: integrates meaningful information and extracts semantic relationships and patterns. Researchers in computer vision focused on scene understanding algorithms, the aim being to obtain semantic knowledge from the environment and determine the properties of objects and the relations between them. For applications in robotics, gaming, assisted living, augmented reality, etc., a fundamental task is to be aware of spatial position and capture depth information. First part of this paper focuses on deep learning solutions for scene recognition following the main leads: low-level features and object detection. In the second part we present extensively the most relevant datasets for visual scene understanding. We take into consideration both directions having in mind future applications.
2

Singh, Ankita. "Face Mask Detection using Deep Learning to Manage Pandemic Guidelines." Journal of Management and Service Science (JMSS) 1, no. 2 (2021): 1–21. http://dx.doi.org/10.54060/jmss/001.02.003.

Abstract:
The field of Computer Vision is a branch of science of the computers and systems of software in which one can visualize and as well as comprehend the images and scenes given in the input. This field is consisting of numerous aspects for example image recognition, the detection of objects, generation of images, image super resolution and more others. Object detection is broadly utilized for the detection of faces, the detection of vehicles, counting of pedestrians on a certain street, images displayed on the web, security systems and cars with the feature of self-driving. This process also encompasses the precision of every technique for recognizing the objects. The detection of objects is a crucial task; however, it is also a very challenging vision task. It is an analytical subdivide of various applications such as searching of images, image auto-annotation or scene understanding and tracking of various objects. The tracking of objects in motion of video image sequence was one of the most important subjects in computer vision.
3

Bao, Sid Yingze, Min Sun, and Silvio Savarese. "Toward coherent object detection and scene layout understanding." Image and Vision Computing 29, no. 9 (August 2011): 569–79. http://dx.doi.org/10.1016/j.imavis.2011.08.001.

4

Zolghadr, Esfandiar, and Borko Furht. "Context-Based Scene Understanding." International Journal of Multimedia Data Engineering and Management 7, no. 1 (January 2016): 22–40. http://dx.doi.org/10.4018/ijmdem.2016010102.

Abstract:
Context plays an important role in performance of object detection. There are two popular considerations in building context models for computer vision applications; type of context (semantic, spatial, scale) and scope of the relations (pairwise, high-order). In this paper, a new unified framework is presented that combines multiple sources of context in high-order relations to encode semantical coherence and consistency of the scenes. This framework introduces a new descriptor called context relevance score to model context-based distribution of the response variables and apply it to two distributions. First model incorporates context descriptor along with annotation response into a supervised Latent Dirichlet Allocation (LDA) built on multi-variate Bernoulli distribution called Context-Based LDA (CBLDA). The second model is based on multi-variate Wallenius' non-central Hyper-geometric distribution and is called Wallenius LDA (WLDA). WLDA incorporates context knowledge as bias parameter. Scene context is modeled as a graph and effectively used in object detection framework to maximize semantical consistency of the scene. The graph can also be used in recognition of out-of-context objects. Annotation metadata of Sun397 dataset is used to construct the context model. Performance of the proposed approaches was evaluated on ImageNet dataset. Comparison between proposed approaches and state-of-art multi-class object annotation algorithm shows superiority of presented approach in labeling of scene content.
5

Sriram, K. V., and R. H. Havaldar. "Analytical review and study on object detection techniques in the image." International Journal of Modeling, Simulation, and Scientific Computing 12, no. 05 (May 21, 2021): 2150031. http://dx.doi.org/10.1142/s1793962321500318.

Abstract:
Object detection is one of the most fundamental but challenging issues in the field of computer vision. Object detection identifies the presence of various individual objects in an image. Great success has been attained for object detection/recognition problems in controlled environments, but the problem remains unsolved in uncontrolled settings, particularly when the objects are placed in arbitrary poses in an occluded and cluttered environment. In the last few years, researchers have made considerable efforts to resolve this issue because of its wide range of applications in computer vision tasks, such as content-enabled image retrieval, event or activity recognition, scene understanding, and so on. This review provides a detailed survey of 50 research papers presenting object detection techniques, such as machine learning-based techniques, gradient-based techniques, the Fast Region-based Convolutional Neural Network (Fast R-CNN) detector, and foreground-based techniques. Here, the machine learning-based approaches are classified into deep learning-based approaches, random forest, Support Vector Machine (SVM), and so on. Moreover, the challenges faced by the existing techniques are explained in the gaps and issues section. The analysis based on the classification, toolset, datasets utilized, published year, and the performance metrics is discussed. The future dimension of the research is based on the gaps and issues identified from the existing research works.
6

Wang, Chang, Jinyu Sun, Shiwei Ma, Yuqiu Lu, and Wang Liu. "Multi-stream Network for Human-object Interaction Detection." International Journal of Pattern Recognition and Artificial Intelligence 35, no. 08 (March 12, 2021): 2150025. http://dx.doi.org/10.1142/s0218001421500257.

Abstract:
Detecting the interaction between humans and objects in images is a critical problem for obtaining a deeper understanding of the visual relationship in a scene and also a critical technology in many practical applications, such as augmented reality, video surveillance and information retrieval. Be that as it may, due to the fine-grained actions and objects in the real scene and the coexistence of multiple interactions in one scene, the problem is far from being solved. This paper differs from prior approaches, which focused only on the features of instances, by proposing a method that utilizes a four-stream CNNs network for human-object interaction (HOI) detection. More detailed visual features, spatial features and pose features from human-object pairs are extracted to solve the challenging task of detection in images. Specially, the core idea is that the region where people interact with objects contains important identifying cues for specific action classes, and the detailed cues can be fused to facilitate HOI recognition. Experiments on two large-scale HOI public benchmarks, V-COCO and HICO-DET, are carried out and the results show the effectiveness of the proposed method.
7

Achirei, Stefan-Daniel, Mihail-Cristian Heghea, Robert-Gabriel Lupu, and Vasile-Ion Manta. "Human Activity Recognition for Assisted Living Based on Scene Understanding." Applied Sciences 12, no. 21 (October 24, 2022): 10743. http://dx.doi.org/10.3390/app122110743.

Abstract:
The growing share of the population over the age of 65 is putting pressure on the social health insurance system, especially on institutions that provide long-term care services for the elderly or to people who suffer from chronic diseases or mental disabilities. This pressure can be reduced through the assisted living of the patients, based on an intelligent system for monitoring vital signs and home automation. In this regard, since 2008, the European Commission has financed the development of medical products and services through the ambient assisted living (AAL) program—Ageing Well in the Digital World. The SmartCare Project, which integrates the proposed Computer Vision solution, follows the European strategy on AAL. This paper presents an indoor human activity recognition (HAR) system based on scene understanding. The system consists of a ZED 2 stereo camera and a NVIDIA Jetson AGX processing unit. The recognition of human activity is carried out in two stages: all humans and objects in the frame are detected using a neural network, then the results are fed to a second network for the detection of interactions between humans and objects. The activity score is determined based on the human–object interaction (HOI) detections.
8

Joshi, Rakesh Chandra, Saumya Yadav, Malay Kishore Dutta, and Carlos M. Travieso-Gonzalez. "Efficient Multi-Object Detection and Smart Navigation Using Artificial Intelligence for Visually Impaired People." Entropy 22, no. 9 (August 27, 2020): 941. http://dx.doi.org/10.3390/e22090941.

Abstract:
Visually impaired people face numerous difficulties in their daily life, and technological interventions may assist them to meet these challenges. This paper proposes an artificial intelligence-based fully automatic assistive technology to recognize different objects, and auditory inputs are provided to the user in real time, which gives better understanding to the visually impaired person about their surroundings. A deep-learning model is trained with multiple images of objects that are highly relevant to the visually impaired person. Training images are augmented and manually annotated to bring more robustness to the trained model. In addition to computer vision-based techniques for object recognition, a distance-measuring sensor is integrated to make the device more comprehensive by recognizing obstacles while navigating from one place to another. The auditory information that is conveyed to the user after scene segmentation and obstacle identification is optimized to obtain more information in less time for faster processing of video frames. The average accuracy of this proposed method is 95.19% and 99.69% for object detection and recognition, respectively. The time complexity is low, allowing a user to perceive the surrounding scene in real time.
9

Tian, Minghui, Shouhong Wan, and Lihua Yue. "A Visual Attention Model for Natural Scenes Based on Dynamic Feature Combination." International Journal of Software Engineering and Knowledge Engineering 20, no. 08 (December 2010): 1077–95. http://dx.doi.org/10.1142/s0218194010005043.

Abstract:
In recent years, many research works indicate that human's visual attention is very helpful in some research areas that are related to computer vision, such as object recognition, scene understanding and object-based image/video retrieval or annotation. This paper presents a visual attention model for natural scenes based on a dynamic feature combination strategy. The model can be divided into three parts, which are feature extraction, dynamic feature combination and salient objects detection. First, the saliency features of color, information entropy and salient boundary are extracted from an original colored image. After that, two different evaluation measurements are proposed for two different categories of feature maps defined in this dynamic combination strategy, which measures the contribution of each feature map to saliency and carries out a dynamic weighting of individual feature maps. Finally, salient objects are located from an integrated saliency map and a computational method is given to simulate the location shift of the real human visual attention. Experimental results show that this model is effective and robust for saliency detection in natural scenes, also similar to the real human visual attention mechanism.
10

Xiao, Jiangjian, Hui Cheng, Feng Han, and Harpreet Sawhney. "Geo-Based Aerial Surveillance Video Processing for Scene Understanding and Object Tracking." International Journal of Pattern Recognition and Artificial Intelligence 23, no. 07 (November 2009): 1285–307. http://dx.doi.org/10.1142/s0218001409007582.

Abstract:
This paper presents an approach to extract semantic layers from aerial surveillance videos for scene understanding and object tracking. The input videos are captured by low flying aerial platforms and typically consist of strong parallax from non-ground-plane structures as well as moving objects. Our approach leverages the geo-registration between video frames and reference images (such as those available from Terraserver and Google satellite imagery) to establish a unique geo-spatial coordinate system for pixels in the video. The geo-registration process enables Euclidean 3D reconstruction with absolute scale unlike traditional monocular structure from motion where continuous scale estimation over long periods of time is an issue. Geo-registration also enables correlation of video data to other stored information sources such as GIS (Geo-spatial Information System) databases. In addition to the geo-registration and 3D reconstruction aspects, the other key contributions of this paper also include: (1) providing a reliable geo-based solution to estimate camera pose for 3D reconstruction, (2) exploiting appearance and 3D shape constraints derived from geo-registered videos for labeling of structures such as buildings, foliage, and roads for scene understanding, and (3) elimination of moving object detection and tracking errors using 3D parallax constraints and semantic labels derived from geo-registered videos. Experimental results on extended time aerial video data demonstrates the qualitative and quantitative aspects of our work.

Dissertations / Theses on the topic "Computer Vision, Object Recognition, Vision and Scene Understanding, Object Detection"

1

Simonelli, Andrea. "3D Object Detection from Images." Doctoral thesis, Università degli studi di Trento, 2022. http://hdl.handle.net/11572/353602.

Abstract:
Remarkable advancements in the field of Computer Vision, Artificial Intelligence and Machine Learning have led to unprecedented breakthroughs in what machines are able to achieve. In many tasks such as in Image Classification in fact, they are now capable of even surpassing human performance. While this is truly outstanding, there are still many tasks in which machines lag far behind. Walking in a room, driving on a highway, grabbing some food for example. These are all actions that feel natural to us but can be quite unfeasible for them. Such actions require to identify and localize objects in the environment, effectively building a robust understanding of the scene. Humans easily gain this understanding thanks to their binocular vision, which provides a high-resolution and continuous stream of information to our brain that efficiently processes it. Unfortunately, things are much different for machines. With cameras instead of eyes and artificial neural networks instead of a brain, gaining this understanding is still an open problem. In this thesis we will not focus on solving this problem as a whole, but instead delve into a very relevant part of it. We will in fact analyze how to make machines be able to identify and precisely localize objects in the 3D space by relying only on visual input i.e. 3D Object Detection from Images. One of the most complex aspects of Image-based 3D Object Detection is that it inherently requires the solution of many different sub-tasks e.g. the estimation of the object’s distance and its rotation. A first contribution of this thesis is an analysis of how these sub-tasks are usually learned, highlighting a destructive behavior which limits the overall performance and the proposal of an alternative learning method that avoids it.
A second contribution is the discovery of a flaw in the computation of the metric which is widely used in the field, affecting the re-computation of the performance of all published methods and the introduction of a novel un-flawed metric which has now become the official one. A third contribution is focused on one particular sub-task, i.e. estimation of the object’s distance, which is demonstrated to be the most challenging. Thanks to the introduction of a novel approach which normalizes the appearance of objects with respect to their distance, detection performances can be greatly improved. A last contribution of the thesis is the critical analysis of the recently proposed Pseudo-LiDAR methods. Two flaws in their training protocol have been identified and analyzed. On top of this, a novel method able to achieve state-of-the-art in Image-based 3D Object Detection has been developed.
2

Del Pero, Luca. "Top-Down Bayesian Modeling and Inference for Indoor Scenes." Diss., The University of Arizona, 2013. http://hdl.handle.net/10150/297040.

Abstract:
People can understand the content of an image without effort. We can easily identify the objects in it, and figure out where they are in the 3D world. Automating these abilities is critical for many applications, like robotics, autonomous driving and surveillance. Unfortunately, despite recent advancements, fully automated vision systems for image understanding do not exist. In this work, we present progress restricted to the domain of images of indoor scenes, such as bedrooms and kitchens. These environments typically have the "Manhattan" property that most surfaces are parallel to three principal ones. Further, the 3D geometry of a room and the objects within it can be approximated with simple geometric primitives, such as 3D blocks. Our goal is to reconstruct the 3D geometry of an indoor environment while also understanding its semantic meaning, by identifying the objects in the scene, such as beds and couches. We separately model the 3D geometry, the camera, and an image likelihood, to provide a generative statistical model for image data. Our representation captures the rich structure of an indoor scene, by explicitly modeling the contextual relationships among its elements, such as the typical size of objects and their arrangement in the room, and simple physical constraints, such as 3D objects do not intersect. This ensures that the predicted image interpretation will be globally coherent geometrically and semantically, which allows tackling the ambiguities caused by projecting a 3D scene onto an image, such as occlusions and foreshortening. We fit this model to images using MCMC sampling. Our inference method combines bottom-up evidence from the data and top-down knowledge from the 3D world, in order to explore the vast output space efficiently. Comprehensive evaluation confirms our intuition that global inference of the entire scene is more effective than estimating its individual elements independently. 
Further, our experiments show that our approach is competitive and often exceeds the results of state-of-the-art methods.
3

Moria, Kawther. "Computer vision-based detection of fire and violent actions performed by individuals in videos acquired with handheld devices." Thesis, 2016. http://hdl.handle.net/1828/7423.

Abstract:
Advances in social networks and multimedia technologies greatly facilitate the recording and sharing of video data on violent social and/or political events via Internet. These video data are a rich source of information in terms of identifying the individuals responsible for damaging public and private property through violent behavior. Any abnormal, violent individual behavior could trigger a cascade of undesirable events, such as vandalism and damage to stores and public facilities. When such incidents occur, investigators usually need to analyze thousands of hours of videos recorded using handheld devices in order to identify suspects. The exhaustive manual investigation of these video data is highly time and resource-consuming. Automated detection techniques of abnormal events and actions based on computer vision would offer a more efficient solution to this problem. The first contribution described in this thesis consists of a novel method for fire detection in riot videos acquired with handheld cameras and smart-phones. This is a typical example of computer vision in the wild, where we have no control over the data acquisition process, and where the quality of the video data varies considerably. The proposed spatial model is based on the Mixtures of Gaussians model and exploits color adjacency in the visible spectrum of incandescence. The experimental results demonstrate that using this spatial model in concert with motion cues leads to highly accurate results for fire detection in noisy, complex scenes of rioting crowds. The second contribution consists in a method for detecting abnormal, violent actions that are performed by individual subjects and witnessed by passive crowds. The problem of abnormal individual behavior, such as a fight, witnessed by passive bystanders gathered into a crowd has not been studied before. We show that the presence of a passive, standing crowd is an important indicator that an abnormal action might occur.
Thus, detecting the standing crowd improves the performance of detecting the abnormal action. The proposed method performs crowd detection first, followed by the detection of abnormal motion events. Our main theoretical contribution consists in linking crowd detection to abnormal, violent actions, as well as in defining novel sets of features that characterize static crowds and abnormal individual actions in both spatial and spatio-temporal domains. Experimental results are computed on a custom dataset, the Vancouver Riot Dataset, that we generated using amateur video footage acquired with handheld devices and uploaded on public social network sites. Our approach achieves good precision and recall values, which validates our system’s reliability of localizing the crowds and the abnormal actions. To summarize, this thesis focuses on the detection of two types of abnormal events occurring in violent street movements. The data are gathered by passive participants to these movements using handheld devices. Although our data sets are drawn from one single social movement (the Vancouver 2011 Stanley cup riot) we are confident that our approaches would generalize well and would be helpful to forensic activities performed in the context of other similar violent occasions.
Graduate

Book chapters on the topic "Computer Vision, Object Recognition, Vision and Scene Understanding, Object Detection"

1

Liu, Zhuo, Xuemei Xie, and Xuyang Li. "Scene Semantic Guidance for Object Detection." In Pattern Recognition and Computer Vision, 355–65. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-88004-0_29.

2

Tong, Jiaxing, Tao Chen, Qiong Wang, and Yazhou Yao. "Few-Shot Object Detection via Understanding Convolution and Attention." In Pattern Recognition and Computer Vision, 674–87. Cham: Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-031-18907-4_52.

3

Zolghadr, Esfandiar, and Borko Furht. "Context-Based Scene Understanding." In Computer Vision, 754–73. IGI Global, 2018. http://dx.doi.org/10.4018/978-1-5225-5204-8.ch029.

Abstract:
Context plays an important role in performance of object detection. There are two popular considerations in building context models for computer vision applications; type of context (semantic, spatial, scale) and scope of the relations (pairwise, high-order). In this paper, a new unified framework is presented that combines multiple sources of context in high-order relations to encode semantical coherence and consistency of the scenes. This framework introduces a new descriptor called context relevance score to model context-based distribution of the response variables and apply it to two distributions. First model incorporates context descriptor along with annotation response into a supervised Latent Dirichlet Allocation (LDA) built on multi-variate Bernoulli distribution called Context-Based LDA (CBLDA). The second model is based on multi-variate Wallenius' non-central Hyper-geometric distribution and is called Wallenius LDA (WLDA). WLDA incorporates context knowledge as bias parameter. Scene context is modeled as a graph and effectively used in object detection framework to maximize semantical consistency of the scene. The graph can also be used in recognition of out-of-context objects. Annotation metadata of Sun397 dataset is used to construct the context model. Performance of the proposed approaches was evaluated on ImageNet dataset. Comparison between proposed approaches and state-of-art multi-class object annotation algorithm shows superiority of presented approach in labeling of scene content.
4

Dogra, Debi Prosad. "Visual Attention Guided Object Detection and Tracking." In Innovative Research in Attention Modeling and Computer Vision Applications, 99–114. IGI Global, 2016. http://dx.doi.org/10.4018/978-1-4666-8723-3.ch004.

Abstract:
Scene understanding and object recognition heavily depend on the success of visual attention guided salient region detection in images and videos. Therefore, summarizing computer vision techniques that take the help of visual attention models to accomplish video object recognition and tracking, can be helpful to the researchers of computer vision community. In this chapter, it is aimed to present a philosophical overview of the possible applications of visual attention models in the context of object recognition and tracking. At the beginning of this chapter, a brief introduction to various visual saliency models suitable for object recognition is presented, that is followed by discussions on possible applications of attention models on video object tracking. The chapter also provides a commentary on the existing techniques available on this domain and discusses some of their possible extensions. It is believed that, prospective readers will benefit since the chapter comprehensively guides a reader to understand the pros and cons of this particular topic.
5

Jamalian, Amirhossein, and Fred H. Hamker. "Biologically-Inspired Models for Attentive Robot Vision." In Innovative Research in Attention Modeling and Computer Vision Applications, 69–98. IGI Global, 2016. http://dx.doi.org/10.4018/978-1-4666-8723-3.ch003.

Abstract:
A rich stream of visual data enters the cameras of a typical artificial vision system (e.g., a robot) and considering the fact that processing this volume of data in real-time is almost impossible, a clever mechanism is required to reduce the amount of trivial visual data. Visual Attention might be the solution. The idea is to control the information flow and thus to improve vision by focusing the resources merely on some special aspects instead of the whole visual scene. However, does attention only speed up processing or can the understanding of human visual attention provide additional guidance for robot vision research? In this chapter, first, some basic concepts of the primate visual system and visual attention are introduced. Afterward, a new taxonomy of biologically-inspired models of attention, particularly those that are used in robotics applications (e.g., in object detection and recognition) is given and finally, future research trends in modelling of visual attention and its applications are highlighted.
6

Morozov, Alexei Alexandrovich, Olga Sergeevna Sushkova, and Alexander Fedorovich Polupanov. "Object-Oriented Logic Programming of Intelligent Visual Surveillance for Human Anomalous Behavior Detection." In Optoelectronics in Machine Vision-Based Theories and Applications, 134–87. IGI Global, 2019. http://dx.doi.org/10.4018/978-1-5225-5751-7.ch006.

Abstract:
The idea of the logic programming-based approach to the intelligent visual surveillance is in usage of logical rules for description and analysis of people behavior. New prospects in logic programming of the intelligent visual surveillance are connected with the usage of 3D machine vision methods and adaptation of the multi-agent approach to the intelligent visual surveillance. The main advantage of usage of 3D vision instead of the conventional 2D vision is that the first one can provide essentially more complete information about the video scene. The availability of exact information about the coordinates of the parts of the body and scene geometry provided by means of 3D vision is a key to the automation of behavior analysis, recognition, and understanding. This chapter supplies the first systematic and complete description of the method of object-oriented logic programming of the intelligent visual surveillance, special software implementing this method, and new trends in the research area linked with the usage of novel 3D data acquisition equipment.

Conference papers on the topic "Computer Vision, Object Recognition, Vision and Scene Understanding, Object Detection"

1

Bao, Sid Ying-Ze, Min Sun, and Silvio Savarese. "Toward coherent object detection and scene layout understanding." In 2010 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2010. http://dx.doi.org/10.1109/cvpr.2010.5540229.

2

Lin, Dahua, Sanja Fidler, and Raquel Urtasun. "Holistic Scene Understanding for 3D Object Detection with RGBD Cameras." In 2013 IEEE International Conference on Computer Vision (ICCV). IEEE, 2013. http://dx.doi.org/10.1109/iccv.2013.179.

3

Wu, Zuxuan, Yanwei Fu, Yu-Gang Jiang, and Leonid Sigal. "Harnessing Object and Scene Semantics for Large-Scale Video Understanding." In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2016. http://dx.doi.org/10.1109/cvpr.2016.339.

4

Xiao, Jiangjian, Hui Cheng, Feng Han, and Harpreet Sawhney. "Geo-spatial aerial video processing for scene understanding and object tracking." In 2008 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2008. http://dx.doi.org/10.1109/cvpr.2008.4587434.

5

Dong, Jingming, Xiaohan Fei, and Stefano Soatto. "Visual-Inertial-Semantic Scene Representation for 3D Object Detection." In 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2017. http://dx.doi.org/10.1109/cvpr.2017.380.

6

Wang, Tao, Xuming He, Songzhi Su, and Yin Guan. "Efficient Scene Layout Aware Object Detection for Traffic Surveillance." In 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). IEEE, 2017. http://dx.doi.org/10.1109/cvprw.2017.128.

7

Yao, Jian, S. Fidler, and R. Urtasun. "Describing the scene as a whole: Joint object detection, scene classification and semantic segmentation." In 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2012. http://dx.doi.org/10.1109/cvpr.2012.6247739.

8

Cheng, Gong, Junwei Han, Lei Guo, and Tianming Liu. "Learning coarse-to-fine sparselets for efficient object detection and scene classification." In 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2015. http://dx.doi.org/10.1109/cvpr.2015.7298721.

9

Liu, Yong, Ruiping Wang, Shiguang Shan, and Xilin Chen. "Structure Inference Net: Object Detection Using Scene-Level Context and Instance-Level Relationships." In 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2018. http://dx.doi.org/10.1109/cvpr.2018.00730.

10

Wu, Aming, and Cheng Deng. "Single-Domain Generalized Object Detection in Urban Scene via Cyclic-Disentangled Self-Distillation." In 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2022. http://dx.doi.org/10.1109/cvpr52688.2022.00092.
