Dissertations / Theses on the topic '2D/3D object discovery'
Kara, Sandra. "Unsupervised object discovery in images and video data." Electronic Thesis or Diss., université Paris-Saclay, 2025. http://www.theses.fr/2025UPASG019.
This thesis explores self-supervised learning methods for object localization, commonly known as Object Discovery. Object localization in images and videos is an essential component of computer vision tasks such as detection, re-identification, and tracking. Current supervised algorithms can localize (and classify) objects accurately but are costly due to the need for annotated data. The labeling process is typically repeated for each new dataset or category of interest, limiting their scalability. Additionally, semantically specialized approaches require prior knowledge of the target classes, restricting their use to known objects. Object Discovery aims to address these limitations by being more generic. The first contribution of this thesis focused on the image modality, investigating how features from self-supervised vision transformers can serve as cues for multi-object discovery. To localize objects in their broadest definition, we extended our focus to video data, leveraging motion cues and targeting the localization of objects that can move. We introduced background modeling and knowledge distillation in object discovery to tackle the background over-segmentation issue in existing object discovery methods and to reintegrate static objects, significantly improving the signal-to-noise ratio in predictions. Recognizing the limitations of single-modality data, we incorporated 3D data through a cross-modal distillation framework. The knowledge exchange between 2D and 3D domains improved alignment on object regions between the two modalities, enabling the use of multi-modal consistency as a confidence criterion.
Shao, Zhimin. "3D/2D object recognition from surface patterns." Thesis, University of Surrey, 1997. http://epubs.surrey.ac.uk/844055/.
Sirtkaya, Salim. "Moving Object Detection In 2d And 3d Scenes." Master's thesis, METU, 2004. http://etd.lib.metu.edu.tr/upload/2/12605310/index.pdf.
[...] “Kanade-Lucas Feature Tracker”. For non-stationary camera sequences, different algorithms are developed based on the scene structure and camera motion characteristics. In planar scenes, where the scene is flat or distant from the camera and/or the camera only rotates, a method is proposed that uses 2D parametric registration based on the affine parameters of the dominant plane for independently moving object detection. A modified version of the 2D parametric registration approach is used when the scene is not planar but consists of a small number of planes at different depths and the camera undergoes translational motion. Optical flow field segmentation and sequential registration are the key points for this case. For 3D scenes, where the depth variation within the scene is high, a parallax rigidity based approach is developed for moving object detection. All these algorithms are integrated to form a unified independently moving object detector that works in stationary and non-stationary camera sequences and with different scene and camera motion structures. Optical flow field estimation and segmentation are used for this purpose.
Toth, Levente. "3D object recognition based on constrained 2D views." Thesis, University of Plymouth, 1998. http://hdl.handle.net/10026.1/1808.
Govender, Natasha. "Active object recognition for 2D and 3D applications." Doctoral thesis, University of Cape Town, 2015. http://hdl.handle.net/11427/16520.
Active object recognition provides a mechanism for selecting informative viewpoints to complete recognition tasks as quickly and accurately as possible. One can manipulate the position of the camera or the object of interest to obtain more useful information. This approach can improve the computational efficiency of the recognition task by only processing viewpoints selected based on the amount of relevant information they contain. Active object recognition methods are based around how to select the next best viewpoint and how to integrate the extracted information. Most active recognition methods do not use local interest points, which have been shown to work well in other recognition tasks, and are tested on images containing a single object with no occlusions or clutter. In this thesis we investigate using local interest points (SIFT) in probabilistic and non-probabilistic settings for active single and multiple object and viewpoint/pose recognition. The test images used contain objects that are occluded and occur in significant clutter. Visually similar objects are also included in our dataset. Initially we introduce a non-probabilistic 3D active object recognition system which consists of a mechanism for selecting the next best viewpoint and an integration strategy to provide feedback to the system. A novel approach to weighting the uniqueness of extracted features is presented, using a vocabulary tree data structure. This process is then used to determine the next best viewpoint by selecting the one with the highest number of unique features. A Bayesian framework uses the modified statistics from the vocabulary structure to update the system's confidence in the identity of the object. New test images are only captured when the belief hypothesis is below a predefined threshold. This vocabulary tree method is tested against randomly selecting the next viewpoint and a state-of-the-art active object recognition method by Kootstra et al. Our approach outperforms both methods by correctly recognizing more objects with less computational expense. This vocabulary tree method is extended for use in a probabilistic setting to improve the object recognition accuracy. We introduce Bayesian approaches for object recognition and for joint object and pose recognition. Three likelihood models are introduced which incorporate various parameters and levels of complexity. The occlusion model, which includes geometric information and variables that cater for the background distribution and occlusion, correctly recognizes all objects in our challenging database. This probabilistic approach is further extended to recognizing multiple objects and poses in a test image. We show through experiments that this model can recognize multiple objects which occur in close proximity to distractor objects. Our viewpoint selection strategy is also extended to the multiple object application and performs well when compared to randomly selecting the next viewpoint, the activation model, and mutual information. We also study the impact of using active vision for shape recognition. Fourier descriptors are used as input to our shape recognition system, with mutual information as the active vision component. We build multinomial and Gaussian distributions using this information, which correctly recognize a sequence of objects. We demonstrate the effectiveness of active vision in object recognition systems. We show that even in different recognition applications using different low-level inputs, incorporating active vision improves the overall accuracy and decreases the computational expense of object recognition systems.
Noé, Estelle. "3D layered articulated object from a single 2D drawing." Thesis, KTH, Medieteknik och interaktionsdesign, MID, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-216943.
Modeling articulated objects made of rigid, layered parts, which are used to populate 3D scenes in video games and film production, is a complex and time-consuming task for digital artists. This study proposes a sketch-based approach for efficiently modeling layered articulated 3D objects, such as animals with rigid shells and armor, by manually annotating a 2D photo, and optionally creating the object from automatically computed 2D patterns. By considering symmetric objects seen under a 3/4 view and annotating salient features, such as the extremities of the rigid articulated parts, as a mixture of circular and Bézier curves, the approach can recover information about depth, hidden parts, and rotationally articulated structures. The final shape consists of a set of quadrilateral polygons that can be flattened into 2D. Details such as ears, tails, and legs are left to future models that use dedicated annotations. The accuracy of the reconstruction has been validated on synthetic cylindrical examples, and its robustness on reconstructing 3D models of a suit of armor, an armadillo, and a shrimp. The latter was finally created using paper.
Zhu, Yonggen. "Feature extraction and 2D/3D object recognition using geometric invariants." Thesis, King's College London (University of London), 1996. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.362731.
Gamal, Eldin Ahmed. "Point process and graph cut applied to 2D and 3D object extraction." Nice, 2011. http://www.theses.fr/2011NICE4107.
The topic of this thesis is to develop a novel approach for 3D object detection from a 2D image. This approach takes into consideration occlusions and perspective effects. The work is embedded in a marked point process framework, which has proved efficient for solving many challenging problems dealing with high-resolution images. The work accomplished during the thesis can be presented in two parts. In the first part, we propose a novel probabilistic approach to handle occlusions and perspective effects. The proposed method is based on 3D scene simulation on the GPU using OpenGL. It is an object-based method embedded in a marked point process framework. We apply it to the size estimation of a penguin colony, where we model a penguin colony as an unknown number of 3D objects. The main idea of the proposed approach is to sample candidate configurations consisting of 3D objects lying on the real plane. A Gibbs energy is defined on the configuration space, which takes into account both prior and data information. The proposed configurations are projected onto the image plane, and the configurations are modified until convergence. To evaluate a proposed configuration, we measure the similarity between the projected image of the proposed configuration and the real image, by defining a data term and a prior term that penalizes overlapping objects. We introduced modifications to the optimization algorithm to take into account new dependencies that exist in our 3D model. In the second part, we propose a new optimization method which we call “Multiple Births and Cut” (MBC). It combines the recently developed optimization algorithm Multiple Births and Deaths (MBD) and Graph-Cut. The MBD and MBC optimization methods are applied to the optimization of a marked point process. We compared the MBC and MBD algorithms, showing that the main advantages of our newly proposed algorithm are the reduction in the number of parameters, the speed of convergence, and the quality of the obtained results. We validated our algorithm on the problem of counting flamingos in a colony.
Gomez-Donoso, Francisco. "Contributions to 3D object recognition and 3D hand pose estimation using deep learning techniques." Doctoral thesis, Universidad de Alicante, 2020. http://hdl.handle.net/10045/110658.
Sambra-Petre, Raluca-Diana. "2D/3D knowledge inference for intelligent access to enriched visual content." Phd thesis, Institut National des Télécommunications, 2013. http://tel.archives-ouvertes.fr/tel-00917972.
Madi, Kamel. "Inexact graph matching : application to 2D and 3D Pattern Recognition." Thesis, Lyon, 2016. http://www.theses.fr/2016LYSE1315/document.
Graphs are powerful mathematical modeling tools used in various fields of computer science, in particular in Pattern Recognition. Graph matching is the main operation in graph-based Pattern Recognition. Finding solutions to the graph matching problem that ensure optimality in terms of accuracy and time complexity is a difficult research challenge and a topical issue. In this thesis, we investigate the resolution of this problem in two fields: 2D and 3D Pattern Recognition. Firstly, we address the problem of geometric graph matching and its applications to 2D Pattern Recognition. The recognition of Kites (archaeological structures) in satellite images is the main application considered in this first part. We present a complete graph-based framework for Kite recognition in satellite images. We propose two main contributions. The first is an automatic process that transforms Kites from real images into graphs, together with a process for randomly generating synthetic Kite graphs. This allows us to construct a benchmark of Kite graphs (real and synthetic) structured into different levels of deformation. The second contribution in this part is a new graph similarity measure adapted to geometric graphs and consequently to Kite graphs. The proposed approach combines graph invariants with a geometric graph edit distance computation. Secondly, we address the problem of recognizing deformable 3D objects represented by graphs, i.e., triangular tessellations. We propose a new decomposition of triangular tessellations into a set of substructures that we call triangle-stars. Based on this new decomposition, we propose a new graph matching algorithm to measure the distance between triangular tessellations. The proposed algorithm offers a better measure by ensuring a minimum number of triangle-stars covering a larger neighbourhood, and uses a set of descriptors which are invariant or at least oblivious under most common deformations. Finally, we propose a more general graph matching approach founded on a new formalization based on the stable marriage problem. The proposed approach is optimal in terms of execution time, i.e., its time complexity is quadratic, O(n²), and flexible in terms of applicability (2D and 3D). The analysis of the time complexity of the proposed algorithms and the extensive experiments conducted on Kite graph data sets (real and synthetic) and standard data sets (2D and 3D) attest to the effectiveness, high performance, and accuracy of the proposed approaches and show that they are extensible and quite general.
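The stable marriage formulation mentioned in this abstract can be illustrated with the classic Gale-Shapley procedure. The sketch below is not the thesis's algorithm: it only shows why a stable-marriage-style assignment costs O(n²) for n items per side. The function name `stable_match` and the node labels `u1`, `v1`, etc. are hypothetical, and it assumes each node of one graph already has a complete preference ranking over the nodes of the other (for example, derived from some node similarity score).

```python
from collections import deque


def stable_match(proposer_prefs, acceptor_prefs):
    """Gale-Shapley: return a stable matching as a dict acceptor -> proposer.

    Both arguments map each participant to a list of ALL participants on the
    other side, most preferred first. Each proposer proposes to each acceptor
    at most once, so the loop runs at most n * n times.
    """
    # rank[a][p] is the position of proposer p in acceptor a's list (lower is better).
    rank = {
        a: {p: i for i, p in enumerate(prefs)}
        for a, prefs in acceptor_prefs.items()
    }
    next_choice = {p: 0 for p in proposer_prefs}  # index of the next acceptor to try
    engaged = {}                                  # acceptor -> current proposer
    free = deque(proposer_prefs)                  # proposers without a partner

    while free:
        p = free.popleft()
        a = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = engaged.get(a)
        if current is None:
            engaged[a] = p              # acceptor was unmatched
        elif rank[a][p] < rank[a][current]:
            engaged[a] = p              # acceptor prefers the new proposer
            free.append(current)
        else:
            free.append(p)              # rejected; p tries its next choice later
    return engaged


# Hypothetical example: nodes u1, u2 of one graph ranked against v1, v2 of another.
g1_prefs = {"u1": ["v1", "v2"], "u2": ["v1", "v2"]}
g2_prefs = {"v1": ["u2", "u1"], "v2": ["u1", "u2"]}
matching = stable_match(g1_prefs, g2_prefs)
```

In this toy case both `u1` and `u2` prefer `v1`, but `v1` prefers `u2`, so `u1` ends up matched with `v2`.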
Wu, Siju. "Study and design of interaction techniques to facilitate object selection and manipulation in virtual environments on mobile devices." Thesis, Université Paris-Saclay (ComUE), 2015. http://www.theses.fr/2015SACLE023/document.
Advances in the field of NUIs (Natural User Interfaces) provide more and more guidelines for designers to develop efficient and easy-to-use techniques for 3D interaction. In this context, mobile devices attract much attention as a platform for 3D interaction techniques intended for ubiquitous usage. Our research work focuses on proposing new techniques to facilitate object selection and manipulation in virtual environments on mobile devices. Indeed, the efficiency and accuracy of object selection are highly affected by the target size and the cluster density. To overcome the fingertip occlusion issue on smartphones, we have designed two touch-based selection techniques. We have also designed two freehand hybrid techniques for the selection of small objects displayed at a distance. To perform constrained manipulation on tablet PCs, we have proposed a bimanual technique based on the asymmetrical model. Both hands can be used in collaboration in order to specify the constraint, determine the manipulation mode, and control the transformation. We have also proposed two other single-hand manipulation techniques using identified touch inputs. The evaluations of our techniques demonstrate that they can improve the users’ interaction experience on mobile devices. Our results also yield some guidelines for improving the design of 3D interaction techniques on mobile devices.
Sankoh, Hiroshi. "Object Extraction for Virtual-viewpoint Video Synthesis." 京都大学 (Kyoto University), 2015. http://hdl.handle.net/2433/200465.
Qiu, Xuchong. "2D and 3D Geometric Attributes Estimation in Images via deep learning." Thesis, Marne-la-vallée, ENPC, 2021. http://www.theses.fr/2021ENPC0005.
The visual perception of 2D and 3D geometric attributes (e.g. translation, rotation, spatial size, etc.) is important in robotic applications. It helps a robotic system build knowledge about its surrounding environment and can serve as the input for downstream tasks such as motion planning and physical intersection with objects. The main goal of this thesis is to automatically detect the positions and poses of objects of interest for robotic manipulation tasks. In particular, we are interested in the low-level task of estimating occlusion relationships to discriminate different objects and the high-level tasks of visual object tracking and object pose estimation. The first focus is to track the object of interest with correct locations and sizes in a given video. We first systematically study the tracking framework based on the discriminative correlation filter (DCF) and propose to leverage semantic information in two tracking stages: the visual feature encoding stage and the target localization stage. Our experiments demonstrate that the involvement of semantics improves the performance of both localization and size estimation in our DCF-based tracking framework. We also analyze failure cases. The second focus is using object shape information to improve the performance of object 6D pose estimation and to perform object pose refinement. We propose to estimate the 2D projections of object 3D surface points with deep models to recover object 6D poses. Our results show that the proposed method benefits from the large number of 3D-to-2D point correspondences and achieves better performance. As a second part, we study the constraints of existing object pose refinement methods and develop a pose refinement method for objects in the wild. Our experiments demonstrate that our models trained on either real data or generated synthetic data can refine pose estimates for objects in the wild, even though these objects are not seen during training. The third focus is studying geometric occlusion in single images to better discriminate objects in the scene. We first formalize the definition of geometric occlusion and propose a method to automatically generate high-quality occlusion annotations. Then we propose a new occlusion relationship formulation (i.e. abbnom) and the corresponding inference method. Experiments on occlusion reasoning benchmarks demonstrate the superiority of the proposed formulation and method. To recover accurate depth discontinuities, we also propose a depth map refinement method and a single-stage monocular depth estimation method. All the methods that we propose leverage the versatility and power of deep learning. This should facilitate their integration in the visual perception module of modern robotic systems. Besides the above methodological advances, we also made available software (for occlusion and pose estimation) and datasets (of high-quality occlusion information) as a contribution to the scientific community.
Sambra-Petre, Raluca-Diana. "2D/3D knowledge inference for intelligent access to enriched visual content." Electronic Thesis or Diss., Evry, Institut national des télécommunications, 2013. http://www.theses.fr/2013TELE0012.
This Ph.D. thesis tackles the issue of still image and video object categorization. The objective is to associate semantic labels with 2D objects present in natural images/videos. The principle of the proposed approach consists of exploiting categorized 3D model repositories in order to identify unknown 2D objects based on 2D/3D matching techniques. We propose here an object recognition framework designed to work for real-time applications. The similarity between classified 3D models and unknown 2D content is evaluated with the help of the 2D/3D description. A voting procedure is further employed in order to determine the most probable categories of the 2D object. A representative viewing angle selection strategy and a new contour-based descriptor (so-called AH) are proposed. The experimental evaluation proved that, by employing the intelligent selection of views, the number of projections can be decreased significantly (up to 5 times) while obtaining similar performance. The results have also shown the superiority of AH with respect to other state-of-the-art descriptors. An objective evaluation of the intra- and inter-class variability of the 3D model repositories involved in this work is also proposed, together with a comparative study of the retained indexing approaches. An interactive, scribble-based segmentation approach is also introduced. The proposed method is specifically designed to overcome compression artefacts such as those introduced by JPEG compression. We finally present an indexing/retrieval/classification Web platform, so-called Diana, which integrates the various methodologies employed in this thesis.
Sharma, Naresh. "Arbitrarily Shaped Virtual-Object Based Video Compression." Columbus, Ohio : Ohio State University, 2009. http://rave.ohiolink.edu/etdc/view?acc%5Fnum=osu1238165271.
Paternesi, Claudio. "Virtual Reality Labelling Tool for 3D Semantic Segmentation." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2019.
Lengyel, Kristián. "Zobrazování medicínských dat v reálném čase." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2010. http://www.nusl.cz/ntk/nusl-235544.
Batmaz, Anil Ufuk. "Speed, precision and grip force analysis of human manual operations with and without direct visual input." Thesis, Strasbourg, 2018. http://www.theses.fr/2018STRAJ056/document.
A surgeon's perceptual system must adapt to multisensory constraints on the planning, control, and execution of image-guided surgical operations. Three experimental setups are designed to explore these visual and haptic constraints in image-guided training. Results show that subjects are faster and more precise with direct vision than with image guidance. Stereoscopic 3D viewing does not offer a performance advantage for complete beginners. In virtual reality, variations in object length, width, position, and complexity affect motor performance. The grip force applied on a surgical robot system depends on the user's experience level. In conclusion, both time and precision matter critically, but having trainees become as precise as possible before getting faster should be a priority. Study group homogeneity and background play a key role in surgical training research. The findings have direct implications for individual skill monitoring in image-guided applications.
"3D object reconstruction from 2D and 3D line drawings." 2008. http://library.cuhk.edu.hk/record=b5893538.
Thesis (M.Phil.)--Chinese University of Hong Kong, 2008.
Includes bibliographical references (leaves 78-85).
Abstracts in English and Chinese.
Chapter 1 --- Introduction and Related Work
Chapter 1.1 --- Reconstruction from 2D Line Drawings and the Applications
Chapter 1.2 --- Previous Work on 3D Reconstruction from Single 2D Line Drawings
Chapter 1.3 --- Other Related Work on Interpretation of 2D Line Drawings
Chapter 1.3.1 --- Line Labeling and Superstrictness Problem
Chapter 1.3.2 --- CAD Reconstruction
Chapter 1.3.3 --- Modeling from Images
Chapter 1.3.4 --- Identifying Faces in the Line Drawings
Chapter 1.4 --- 3D Modeling Systems
Chapter 1.5 --- Research Problems and Our Contributions
Chapter 1.5.1 --- Recovering Complex Manifold Objects from Line Drawings
Chapter 1.5.2 --- The Vision-based Sketching System
Chapter 2 --- Reconstruction from Complex Line Drawings
Chapter 2.1 --- Introduction
Chapter 2.2 --- Assumptions and Terminology
Chapter 2.3 --- Separation of a Line Drawing
Chapter 2.3.1 --- Classification of Internal Faces
Chapter 2.3.2 --- Separating a Line Drawing along Internal Faces of Type 1
Chapter 2.3.3 --- Detecting Internal Faces of Type 2
Chapter 2.3.4 --- Separating a Line Drawing along Internal Faces of Type 2
Chapter 2.4 --- 3D Reconstruction
Chapter 2.4.1 --- 3D Reconstruction from a Line Drawing
Chapter 2.4.2 --- Merging 3D Manifolds
Chapter 2.4.3 --- The Complete 3D Reconstruction Algorithm
Chapter 2.5 --- Experimental Results
Chapter 2.6 --- Summary
Chapter 3 --- A Vision-Based Sketching System for 3D Object Design
Chapter 3.1 --- Introduction
Chapter 3.2 --- The Sketching System
Chapter 3.3 --- 3D Geometry of the System
Chapter 3.3.1 --- Locating the Wand
Chapter 3.3.2 --- Calibration
Chapter 3.3.3 --- Working Space
Chapter 3.4 --- Wireframe Input and Object Editing
Chapter 3.5 --- Surface Generation
Chapter 3.5.1 --- Face Identification
Chapter 3.5.2 --- Planar Surface Generation
Chapter 3.5.3 --- Smooth Curved Surface Generation
Chapter 3.6 --- Experiments
Chapter 3.7 --- Summary
Chapter 4 --- Conclusion and Future Work
Chapter 4.1 --- Conclusion
Chapter 4.2 --- Future Work
Chapter 4.2.1 --- Learning-Based Line Drawing Reconstruction
Chapter 4.2.2 --- New Query Interface for 3D Object Retrieval
Chapter 4.2.3 --- Curved Object Reconstruction
Chapter 4.2.4 --- Improving the 3D Sketch System
Chapter 4.2.5 --- Other Directions
Bibliography
Richards, Whitman, Jan J. Koenderink, and D. D. Hoffman. "Inferring 3D Shapes from 2D Codons." 1985. http://hdl.handle.net/1721.1/5613.
Grimson, W. Eric, Daniel P. Huttenlocher, and T. D. Alter. "Recognizing 3D Objects of 2D Images: An Error Analysis." 1992. http://hdl.handle.net/1721.1/5959.
Mou, Chia-Chang, and 牟家昌. "Object Recognition Using 2D Image and 3D Point Clouds Data." Thesis, 2011. http://ndltd.ncl.edu.tw/handle/35279695086781961762.
National Chiao Tung University (國立交通大學), Institute of Electrical and Control Engineering (電控工程研究所), academic year 99
In recent years, research on three-dimensional object recognition in point cloud data has become increasingly popular. Appearance-based features, such as object silhouettes, directly affect recognition performance when objects are viewed from different positions and angles. To tackle this problem, this thesis proposes a recognition system that integrates two features. One is the Fourier descriptor of the contour in a range image, and the other is a structure descriptor extracted from point clouds. The Fourier descriptor is used to identify an object at a far distance. Additionally, a view-angle interpolation method is proposed to increase the correct recognition rate. The structure descriptor is used to recognize an object at close range, since the contour information then lacks the ability to describe the object. Furthermore, a strategy is presented for selecting the appropriate feature for object recognition. Ten different control towers are used to verify the performance of the proposed approach. The experimental results show that the proposed system performs better than methods using only the range-image feature or only the point-cloud feature across the entire distance range.
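For readers unfamiliar with the contour Fourier descriptor named in this abstract, the following is a minimal, dependency-free sketch of the textbook construction. It is not the thesis's implementation: the function name `fourier_descriptor`, the parameter `n_coeffs`, and the toy square contour are assumptions made for illustration, and the contour is assumed to be closed, sampled in counter-clockwise order, and to have more than `n_coeffs + 2` points.

```python
import cmath
import math


def fourier_descriptor(contour, n_coeffs=8):
    """Return a descriptor of a closed 2D contour that ignores translation,
    rotation, uniform scale, and the choice of starting point.

    contour: list of (x, y) boundary points in counter-clockwise order.
    """
    z = [complex(x, y) for x, y in contour]  # treat each point as x + iy
    n = len(z)
    # Plain O(n^2) discrete Fourier transform, kept simple for clarity.
    coeffs = [
        sum(z[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    # coeffs[0] holds the centroid, so skipping it removes translation.
    # Taking magnitudes removes rotation and starting-point phase.
    # Dividing by |coeffs[1]| removes uniform scale.
    scale = abs(coeffs[1])
    return [abs(coeffs[k]) / scale for k in range(2, 2 + n_coeffs)]


# Toy contour: a 4x4 square sampled at 16 points, counter-clockwise.
square = (
    [(x, 0) for x in range(4)]
    + [(4, y) for y in range(4)]
    + [(x, 4) for x in range(4, 0, -1)]
    + [(0, y) for y in range(4, 0, -1)]
)
descriptor = fourier_descriptor(square)
```

Two contours can then be compared by, for example, the Euclidean distance between their descriptors; that comparison step is likewise an illustrative choice rather than something stated in the abstract.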
Xian, Xiaohua. "2D & 3D UML-based software visualization for object-oriented programs." Thesis, 2003. http://spectrum.library.concordia.ca/2345/1/MQ83923.pdf.
Chen, Yi-Chun, and 陳奕均. "An Efficient 2D to 3D Image Conversion with Object-based Segmentation." Thesis, 2010. http://ndltd.ncl.edu.tw/handle/85772474159077980474.
National Chiao Tung University (國立交通大學), Institute of Electronics (電子研究所), academic year 99
3D image processing has become a trend in the field of visual processing. Many automatic 2D-to-3D conversion algorithms have been proposed to address the lack of 3D content, but there is still no fast algorithm that converts single monocular images well. In this thesis, we propose a fast conversion algorithm that comprises image segmentation, image classification, an object boundary tracing method, and 3D image generation. The image segmentation adopts the watershed method to easily collect depth cue information. Then, the image classification recovers the geometry of the scene in the image. With the depth cue and geometry information, the object boundary tracing method is proposed to detect objects in the image efficiently. Finally, the object result is used to generate a depth map and a 3D anaglyph image. To evaluate the results, we compare the stereo images with those of other 2D-to-3D conversion systems. Experimental results show that the proposed 2D-to-3D conversion algorithm performs better than the compared systems in depth accuracy and processing speed when converting monocular images.
Guo, Jiang-Yu, and 郭江禹. "Reconstruction of a 3D Object Model Using 2D Image Contours Data." Thesis, 2009. http://ndltd.ncl.edu.tw/handle/r465mt.
National Taiwan Normal University (國立臺灣師範大學), Institute of Mechatronic Technology (機電科技研究所), academic year 97
This thesis proposes reconstructing a 3D object model from 2D image contour data: by extracting the two-dimensional contours of an object from images, a three-dimensional model of the object can be built. The approach can be applied to medical physical therapy, for example magnetic resonance imaging systems, nuclear medicine systems, and so on, and can be combined with a robot to form an automatic system. Three approaches to generating three-dimensional models are currently the most common. The first and most direct is to use 3D modeling software (such as 3ds Max). The second is to scan the object with a three-dimensional measurement scanning system and build a computer model directly from the captured 3D information. The third is to use cameras to acquire two-dimensional digital images and to build the three-dimensional model through digital image processing and suitable algorithms. This study adopts the third approach. First, following the principle of projection, two CCD cameras emulate the vertical and horizontal projection planes of first-quadrant projection in engineering drawing, and the algorithm proposed in the thesis extracts two-dimensional coordinate values from the image sequences of these two projections. The resulting three-dimensional coordinate points are then linked using OpenGL functions to obtain a rough three-dimensional model, after which a mesh is laid out and shading and lighting are applied to produce the final model. The experimental results show that this research establishes an automatic 3D object model reconstruction system, which builds a 3D object model in the computer using CCD cameras and multiple image processing techniques. The model will be provided to a robot for use in the next stage.
Payet, Nadia. "From shape-based object recognition and discovery to 3D scene interpretation." Thesis, 2011. http://hdl.handle.net/1957/21316.
Graduation date: 2011
Access restricted to the OSU Community at author's request from May 12, 2011 - May 12, 2012
Castelhano, João Miguel Seabra. "Neural substrates of 2D/3D object perception: a combined EEG/fMRI approach." Doctoral thesis, 2015. http://hdl.handle.net/10316/26307.
Perceptual decision making is defined as the choice of possible interpretations of the world based on the incoming sensory evidence. The role of temporal coding in this process and coherent perception, defined as hierarchical grouping of local elements, remains controversial. Oscillatory processes in the gamma frequency range (>30 Hz) have been proposed to play a role in signaling emerging object percepts in the brain. Studies using Electroencephalography and Magnetoencephalography (EEG and MEG) have suggested that gamma-band oscillations are related to the integration of information and the ability to form coherent gestalts as well as attention and working memory processes. It is accepted that gamma-band synchrony reflects binding of information across different brain regions leading to the emergence of a coherent percept. There are also reports that correlate gamma activity with many other cognitive processes. Hence, a wide variety of gamma-band patterns and sources were reported for different tasks. In this line, both animal and human studies have suggested that understanding oscillatory activity patterning can be important to understand normal and abnormal cognitive function. However, it remains unclear whether distinct patterns across the gamma frequency range related to different cognitive modules do coexist in the same task. We investigated visual perceptual recognition moments based on EEG analysis with ambiguous Mooney stimuli (black and white incomplete pictures). We departed from classical paradigms which are based on contrasts between stimuli conditions that are fixed in time, and adopted a paradigm whereby the moment of perception of an emergent global pattern was variable. Therefore, we could directly compare perception vs. no perception states for the same stimuli and separate sensory and motor processing components.
We found a direct link between gamma-band temporal patterns (in two distinct sub-bands: ~40 Hz and ~60 Hz) and the presence versus absence of emerging holistic perception of variable onset. These findings were confirmed in a data driven manner with a support vector machine classification approach based on time-frequency features. Unimodal studies do not have enough resolution to test for non-unitary sources of these sub-bands and to establish their spatial distribution. Using a simultaneous Electroencephalography and functional Magnetic Resonance Imaging (EEG/fMRI) approach we provided new evidence for separable gamma activity patterns reflecting holistic perception. We found that distinct gamma frequency sub-bands reflect different neural substrates and cognitive mechanisms when comparing object perception states vs. no categorical perception. Accordingly, at least two separate neural modules are involved in holistic perceptual decision, one in the visual cortex (~60 Hz) and the other in the anterior insula (~40 Hz). These findings showed that current neuronal models of gamma-band spatial distribution need to consider the duality by separating low and high sub-bands. This provides a step forward in understanding the functional specialization of decision-making networks and the role of gamma frequency range sub-bands in signaling their different neural and cognitive components. This may shed new light on the role of gamma-band response in normal cognition and in neuropsychiatric disorders such as autism and schizophrenia, where both visual and decision making circuits may be impaired. Importantly, it remains unclear whether oscillation amplitude is relevant for encoding global stimulus properties or, alternatively, it is neural synchrony that plays a pivotal role in gestalt formation. 
In this study, we addressed this question by studying Williams Syndrome (WS), a well characterized model of impaired central coherence, using EEG and a set of experimental tasks requiring visual integration. It has been hypothesized that neural synchrony underlies central coherence that is a well-known model for cognitive dysfunction in autistic spectrum disorders. WS patients show markedly disrupted visual perceptual coherence and holistic integration. Using this human model of loss of coherence, we showed for the first time that neuronal synchrony is reduced across stimulus conditions and this is associated with increased amplitude modulation at 25-45 Hz. This combination of a dramatic loss of synchrony despite increased oscillatory activity represents strong evidence that synchrony underlies central coherence. To directly identify the sources of those specific sub-bands within gamma range and clarify their roles, we used Electrocorticography (ECoG) with the added value of greater spatial and temporal resolution. We used the unique opportunity provided by functional mapping in epilepsy and tested an epileptic patient. Interestingly, we identified a stimulus dependent graded posteroanterior sharpening of frequency responses. Lower frequencies dominated in the anterior ventro-temporal areas and higher frequency modulations in occipital regions. In summary, this set of works addressed several critical points to understand the role of oscillatory activity in perceptual decision mechanisms. We conclude that separable gamma sub-bands reflect different cognitive mechanisms. A distinct spatial source map is present for different gamma sub-bands activity during visual holistic perception. Low gamma (40 Hz) activity is related to the decision making network and High gamma (60 Hz) is localized to early visual processing regions. Moreover, we showed that synchrony underlies central coherence. 
These demonstrations of a clear functional topography for distinct gamma sub-bands within the same task show that distinct gamma-band modulations (amplitude and synchrony) underlie sensory processing and perceptual decision mechanisms. These results have potential implications for the development of new diagnostic biomarkers and therapeutic targets.
Perceptual decision represents the process of choosing among possible interpretations of the world based on external sensory evidence. The role of brain rhythms in this process and in the emergence of holistic object perception, from the hierarchical processing of local elements, remains controversial. However, it has been proposed that oscillations in a frequency range known as the gamma band (>30 Hz) are involved in this processing, with particular relevance for identifying objects from ambiguous stimuli. Several EEG and MEG (Electroencephalography and Magnetoencephalography) studies have suggested that oscillations in this frequency band are related to the integration of information from different brain areas and to the ability to make perceptual decisions. Other cognitive processes, such as attention or working memory capacity, also appear to rely on analogous mechanisms. In this sense, understanding the mechanisms by which oscillations emerge in relation to cognitive processes, as well as their neural bases, is important for understanding cognitive function in health and/or in neuropsychiatric disorders. Despite the growing interest in this field, it is still unclear whether the multiplicity of patterns found is related to different cognitive modules that coexist in the same task. In this study, we used EEG to study the moments of perceptual decision in visual tasks with ambiguous stimuli (Mooney stimuli, images composed of black and white fragments with no immediate perceptual interpretation). Because the moment of object perception was variable, it was possible to separate the sensory and motor components from those related to the perceptual decision. We thus built new paradigms to directly compare perception vs. no-perception states for the same physical stimulus.
We were able to identify activity in two distinct sub-bands (40 Hz and 60 Hz) that is important for holistic perception. These results were confirmed with automatic classifiers that took as inputs the signal features obtained in the frequency domain. To identify the sources of these sub-bands (in terms of their spatial distribution in the brain), we turned to a multimodal technique with better spatiotemporal resolution than EEG. Using simultaneous electroencephalography and functional Magnetic Resonance Imaging (EEG/fMRI), we found that those gamma sub-bands reflect different neural substrates and cognitive mechanisms. Accordingly, at least two modules are involved in the holistic perception network: one located in the visual cortex (60 Hz) and the other in the anterior insula (40 Hz). These results give a better understanding of the specialization of decision-making networks and show that current neuronal models of the spatial localization of the gamma band should consider its duality, separating it into different sub-bands with different functions. These data may bring new perspectives on the functional role of the different sub-bands in normal cognition, as well as in disorders such as autism or schizophrenia, where several circuits (both visual and decision-related) appear to be affected. A question of considerable scientific interest is whether oscillation amplitude is the relevant factor for encoding stimuli as a whole (holistic perception) or whether, on the other hand, it is synchronization between brain areas that plays the key role. To answer this question, we studied a population with Williams syndrome (WS). This condition is characterized by difficulties in visual integration and holistic processing ("as if they could not see the forest, only the trees"). Since synchrony is related to central coherence, it should be affected in this model of disrupted holistic perception.
For the first time, we showed that neuronal synchronization is reduced in this group while, at the same time, the amplitude of oscillations in the same frequency band (25-45 Hz) is increased. This combination of a dramatic loss of synchrony even in the presence of a concomitant increase in amplitude represents strong evidence that synchrony underlies central coherence. With the aim of improving spatial resolution and directly identifying the sources of these oscillations, we took advantage of the unique opportunity provided by the functional mapping of a patient with epilepsy using electrocorticography (ECoG). In this case, we were able to identify a posterior-anterior pattern of activity that is consistent with the notion that there are bands representative of different cognitive processes. Lower frequencies (<100 Hz) dominate in the anterior ventro-temporal areas, and frequency modulations in a higher band dominate in the occipital regions. In summary, these studies helped to clarify the role of oscillations in perceptual decision mechanisms. We thus conclude that different sub-bands can be separated within the gamma band and that they reflect different cognitive mechanisms. They have a specific function in perceptual decision and a distinct origin in the cortex. Activity in the lower band (40 Hz) is related to the decision-making network. The 60 Hz band, on the other hand, is essentially localized to primary visual processing regions. The demonstration of a specific functional topography for specific sub-bands within the same task shows that different modulations of these sub-bands (amplitude and synchrony) underlie sensory processing and perceptual decision mechanisms.
These results, together with the study of the molecular mechanisms that give rise to the oscillations, have implications for understanding perceptual phenomena in health and disease, as well as for the possible development of new diagnostic biomarkers and therapeutic targets.
FCT - SFRH/BD/65341/2009
Chen, Yu-Ru, and 陳妤如. "Robot Arm Autonomous Object Grasping System Based on 2D and 3D Vision Techniques." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/84u98v.
國立臺灣科技大學
機械工程系
107
This research develops an integrated 2D and 3D vision system that commands a six-axis robot arm to grasp objects fully autonomously. The techniques used include 2D object recognition using deep learning, Point Pair Features (PPF), Image Based Visual Servoing (IBVS), and Perspective-n-Point (PnP). Given the variety of objects and scenes found in household settings, 3D object pose estimation is the core of the system. PPF is a very effective technique for 6D object pose estimation, but it requires constant sampling, which leads to a huge amount of computation, and the matching result may be wrong. Therefore, this study uses deep-learning object recognition on the RGB information to identify the object and find its 2D pixel position. The 2D position is then converted into a 3D coordinate in the RGB-D camera frame, and only the point cloud in the approximate region of the object is kept for matching. This removes unnecessary points, avoids a large number of sampling steps, saves considerable matching time, and increases the recall rate of PPF matching. After matching, the robot arm can be guided to the position set in the template of the matched object. To overcome various errors, the system uses an artificial marker on the object to perform IBVS or PnP, which moves the robot arm to a more accurate grasping position. IBVS can grasp moving objects but converges slowly, whereas PnP can quickly move the robot arm to the precise grasping position of a stationary object. This study also carried out several object grasping experiments to confirm the practicality and time efficiency of the developed system.
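The step of converting a detected 2D pixel position into a 3D camera coordinate and keeping only the nearby point cloud can be sketched as follows. This is a minimal illustration that assumes a pinhole camera model with known intrinsics; the function names, the intrinsic values, and the spherical crop region are assumptions for illustration, not details taken from the thesis.

```python
from typing import List, Tuple

Point3D = Tuple[float, float, float]


def deproject_pixel(u: float, v: float, depth: float,
                    fx: float, fy: float, cx: float, cy: float) -> Point3D:
    """Back-project pixel (u, v) with metric depth into camera coordinates (pinhole model)."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


def crop_cloud(cloud: List[Point3D], center: Point3D, radius: float) -> List[Point3D]:
    """Keep only the points within `radius` of `center` (assumed spherical region)."""
    r2 = radius * radius
    return [p for p in cloud
            if sum((a - b) ** 2 for a, b in zip(p, center)) <= r2]


# Hypothetical detection at pixel (420, 240) with 0.8 m depth and made-up intrinsics.
center = deproject_pixel(u=420.0, v=240.0, depth=0.8,
                         fx=600.0, fy=600.0, cx=320.0, cy=240.0)
cloud = [(0.13, 0.0, 0.8), (0.5, 0.5, 1.5), (0.15, 0.02, 0.82)]
roi = crop_cloud(cloud, center, radius=0.1)  # only this subset would go to PPF matching
```

Cropping before matching is what reduces the sampling cost: the matcher only sees points near the detected object instead of the whole scene.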
Ghobadi, Seyed Eghbal [Verfasser]. "Real time object recognition and tracking using 2D/3D images / von Seyed Eghbal Ghobadi." 2010. http://d-nb.info/1009885472/34.
Hegazy, Doaa Abd al-Kareem Mohammed [Verfasser]. "Boosting for generic 2D/3D object recognition / von Doaa Abd Al-Kareem Mohammed Hegazy." 2010. http://d-nb.info/1001518209/34.
Jiang, Ci-syu, and 江麒旭. "A correction method of the multiple-object 3D model reconstruction based on 2D images." Thesis, 2009. http://ndltd.ncl.edu.tw/handle/93624808001814518129.
國立臺灣科技大學
機械工程系
97
A 3D digitizer can be used for reverse engineering. It photographs a workpiece layer by layer; after the images are segmented, they can be used to reconstruct a CAD model in computer software. Although the threshold method is useful for segmenting single-object images, the multi-threshold method did not work on multiple-object images. After segmentation by the multi-threshold method, these images showed three kinds of image errors: material transitions, image profile boundaries, and tooling marks. Both the material transition and profile boundary errors were caused by pixels whose gray level was too similar to that of the target. Tooling marks were caused randomly by the cutting tool. In this study, we used the linear regression method to correct the image error. In our cases, the material transition was between 4% and 10%. We then used image morphology to correct the profile boundaries and tooling marks. With the image processing from this study, we can now correct images from 3D digitizers and thus provide accurate images for our applications.
Lin, Chin-Hsin, and 林進星. "Converting 2D Video Sequences Using Object Tracking and Depth-Maps for 3D Stereoscopic Display." Thesis, 2007. http://ndltd.ncl.edu.tw/handle/62275354545388440493.
中原大學
資訊工程研究所
95
A computer framework is presented for converting a 2D video sequence to 3D for stereoscopic display, based on the vanishing lines and vanishing point within the image frame. Given a 2D video sequence of a single-view scene, the main processes are to automatically segment and track moving objects and to generate a depth map of the scene with respect to a vanishing point. The depths along the motion path of a moving object can then be estimated accordingly. As a result, binocular-view images are generated and recombined for stereoscopic display. The experimental results are promising: the moving objects draw viewers' attention and create a virtual 3D experience. In addition, special 3D effects can be created by superimposing the moving objects onto various backgrounds. In conclusion, our computer framework provides a systematic way of creating 3D video sequences for stereoscopic display, especially for the 3D experience of moving objects in various single-view scenes.
Huang, Ying-Yuan, and 黃盈源. "3D Object Model Recovery from 2D Images Utilizing Corner Detection and Virtual Mesh Grid." Thesis, 2011. http://ndltd.ncl.edu.tw/handle/08443820117389508599.
國立臺灣師範大學
機電科技研究所
99
This research proposes a new method for reconstructing a 3D object model from 2D images. It uses a stereo vision algorithm, which is one type of non-contact scanning measurement. Stereo vision simulates human eyes to capture the depth information of an object, so this research uses two CCD cameras to capture two images of the object and then finds the matching points between the two images. Combining the matching points with 1) the parameters of the two CCD cameras and 2) the transformation matrix between the world coordinate system and the camera coordinate system gives the depth of each point in space, from which the object's 3D model can be reconstructed. The key issue in stereo vision is how to find the matching points in the two images accurately. To solve this issue, many previous studies projected structured light onto the object's surfaces to measure the matching points. The proposed system can also find the matching points using structured light, but that method is restricted by the color of the object surface. This research therefore proposes a method for reconstructing the 3D model without projecting structured light. The system uses corner detection and a virtual mesh grid to reconstruct objects with simple geometry and objects with curved surfaces. The feature points of a simple geometric object usually lie at the corners of its contour, so they can be found by corner detection; the system then calculates the depth of the feature points, projects them into 3D coordinate space, and reconstructs the object's 3D model from them. A curved-surface object, however, has no visible feature points, so this work builds a virtual mesh grid on the left image.
The system then estimates the matching points using epipolar geometry and builds the corresponding virtual mesh grid on the right image. Finally, the system reconstructs the 3D model from the stereo vision principle and the virtual mesh grids of the two images.
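The depth computation behind stereo matching can be illustrated with the simplest case. The sketch below assumes a rectified camera pair (parallel optical axes, matching points on the same image row), which is a simplification: the thesis works with general camera parameters and a world-to-camera transformation. All names and numeric values are hypothetical.

```python
from typing import Tuple


def triangulate_rectified(u_left: float, u_right: float, v: float,
                          focal_px: float, baseline_m: float,
                          cx: float, cy: float) -> Tuple[float, float, float]:
    """Recover a 3D point (left-camera frame) from one matched pixel pair.

    Assumes a rectified pair, so depth follows Z = f * B / d with
    disparity d = u_left - u_right.
    """
    disparity = u_left - u_right
    if disparity <= 0:
        raise ValueError("non-positive disparity: point at infinity or bad match")
    z = focal_px * baseline_m / disparity
    x = (u_left - cx) * z / focal_px
    y = (v - cy) * z / focal_px
    return (x, y, z)


# Hypothetical match: a corner seen at column 350 (left) and 330 (right).
point = triangulate_rectified(u_left=350.0, u_right=330.0, v=240.0,
                              focal_px=800.0, baseline_m=0.1,
                              cx=320.0, cy=240.0)
```

In a rectified pair the epipolar line of a left-image point is simply the same row in the right image, which is why the search for the matching mesh node only needs to vary the column.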
Lu, Kuan-Ju, and 呂冠儒. "3D Object Point Estimation by 2D Image Using Multi-Stage Deep Convolutional Neural Networks." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/f2pmma.
Borkowski, Maciej. "2D to 3D conversion with direct geometrical search and approximation spaces." 2007. http://hdl.handle.net/1993/2827.
October 2007
Su, Tzung-Min, and 蘇宗敏. "Robust 3D Object Recognition using 2D Views via an Incremental Similarity-Based Aspect-Graph Approach." Thesis, 2007. http://ndltd.ncl.edu.tw/handle/39874419206497670787.
國立交通大學
電機與控制工程系所
96
This work presents a framework for robustly recognizing 3D objects from 2D views. The proposed framework comprises two stages: the pre-processing stage and the incremental database construction stage. In the pre-processing stage, foreground objects are extracted from 2D views and used for building the 3D database and for recognition. In the incremental database construction stage, a 3D object database is built and updated using 2D views randomly sampled from a viewing sphere. A background subtraction scheme involving highlight and shadow removal (BSHSR) is proposed as the pre-processing stage of the framework. Foreground regions can be precisely extracted from 2D views using the BSHSR despite illumination variations and dynamic backgrounds. The BSHSR comprises three models: the color-based probabilistic background model (CBM), the gradient-based version of the color-based probabilistic background model (GBM), and a cone-shape illumination model (CSIM). The Gaussian mixture model (GMM) is applied to construct the CBM using pixel statistics. Based on the CBM, the short-term color-based background model (STCBM) and the long-term color-based background model (LTCBM) can be extracted and applied to build the GBM. Furthermore, a new dynamic cone-shape boundary in the RGB color space, called the CSIM, is proposed to distinguish among shadow, highlight, and foreground pixels. An incremental database construction method based on the similarity-based aspect-graph (ISAG) is proposed for building the 3D object database using 2D views. The similarity-based aspect-graph, which contains a set of aspects and the characteristic views for these aspects, is employed to represent the database of 3D objects. An incremental database construction method that maximizes the similarity of views in the same aspect and minimizes the similarity of prototypes is proposed as the core of the framework.
To imitate the ability of human cognition, 2D views randomly sampled from a viewing sphere are applied for building and updating a 3D object database. The effectiveness of the BSHSR is demonstrated via experiments with several video clips collected in a complex indoor environment. The BSHSR is applied in the proposed framework to extract foreground object from 2D views. The proposed framework is evaluated on various 3D object recognition problems, including 3D rigid recognition, human posture recognition, and scene recognition. Shape and color features are employed in different applications with the proposed framework to show the efficiency of the proposed method.
Chan, Ya-Ping, and 詹雅評. "3D Video Conversion from 2D Video with Both Camera and Object Motion Using Bundle Adjustment." Thesis, 2011. http://ndltd.ncl.edu.tw/handle/47194804895905129787.
國立臺灣大學
資訊工程學研究所
99
This thesis presents a system that converts 2D video to 3D video by reconstructing high-quality video disparity maps. First, we find the background disparity maps of videos that contain both camera motion and object motion. We formulate this as an energy minimization problem, using a color constraint and a geometric constraint to recover the disparity maps. Because the goal is to estimate the background disparity, we discard all information about moving (foreground) objects, such as color and segmentation, and the disparity of pixels occluded by the foreground is influenced by the disparity of their neighbors. Given the background disparity maps, we recover the background images. With the background images and disparity maps, we synthesize a left-eye and right-eye video pair using the depth image-based rendering (DIBR) method.
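To make the final rendering step concrete, the following is a minimal sketch of depth image-based rendering on a single image row: each pixel is shifted horizontally by its disparity to form a new view. The integer disparities, the "nearer pixel wins" rule for collisions, and the left-neighbor hole filling are simplifying assumptions for illustration and are not claimed to be the thesis's actual procedure.

```python
from typing import List, Optional


def render_view(row: List[int], disparity: List[int], sign: int) -> List[int]:
    """Warp one image row; each pixel moves sign * disparity[x] columns."""
    width = len(row)
    out: List[Optional[int]] = [None] * width
    nearest = [-1] * width  # disparity of the pixel currently stored at each target column
    for x in range(width):
        tx = x + sign * disparity[x]
        # When two pixels land on the same column, the larger disparity (nearer surface) wins.
        if 0 <= tx < width and disparity[x] > nearest[tx]:
            out[tx] = row[x]
            nearest[tx] = disparity[x]
    # Assumed hole filling: copy the last filled value from the left
    # (leading holes take the first filled value in the row).
    last = next((p for p in out if p is not None), 0)
    filled = []
    for p in out:
        if p is not None:
            last = p
        filled.append(last)
    return filled


row = [10, 20, 30, 40, 50]        # hypothetical gray levels
disparity = [0, 0, 2, 0, 0]       # only the middle pixel is near the camera
left_view = render_view(row, disparity, +1)
right_view = render_view(row, disparity, -1)
```

The holes that appear behind shifted pixels are the disocclusions; having a recovered background image, as the abstract describes, is one way to fill them with plausible content instead of copying neighbors.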
HSIEH, ZONG-YOU, and 謝宗佑. "3D Object Recognition System of Indoor Scene Based on Point Cloud And 2D SURF Feature Points." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/mk75r4.
國立臺北科技大學
資訊工程系
107
In our previous work, we proposed a system for 3D point cloud object recognition that gives intelligent robots a deeper ability to understand objects. However, recognition in that system requires comparing the input with the objects in a database. Because there are many 3D objects in the world, the object database grows as the number of recognizable 3D objects increases. When an object awaiting recognition must be matched against the database without any similarity index search, matching takes a long time and the efficiency of the system drops. This work proposes a recognition system based on 2D SURF feature points and 3D point clouds. Each 3D object to be recognized is divided into thirty-two orientations at intervals of 11.25 degrees. The SURF (Speeded-Up Robust Features) algorithm extracts keypoints from the image of the object and stores them in the database. For the object point cloud at the corresponding angle, the information needed for recognition, including normals and 3D keypoints, is calculated. Because 2D image recognition is faster than 3D, before performing 3D point cloud matching the system first matches 2D SURF features against the database to find the objects most similar to the object under test, and then uses the 3D point cloud to confirm. This process significantly reduces the number of objects that need to be compared and the time spent on 3D matching, so multiple objects can be identified without losing efficiency or accuracy. In experiments on specific test objects, more than 88% of the dissimilar objects on average can be excluded from the 3D matching process, and the average object identification accuracy is 80% or more.
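The view sampling and the coarse-to-fine matching described above can be sketched as follows. The 32 views at 11.25 degrees come from the abstract; the scoring dictionary, the threshold, and the `confirm_3d` callback are hypothetical stand-ins for the SURF matcher and the point cloud matcher.

```python
from typing import Callable, Dict, List, Tuple

NUM_VIEWS = 32
STEP_DEG = 360.0 / NUM_VIEWS  # 11.25 degrees between stored orientations


def view_index(angle_deg: float) -> int:
    """Map an arbitrary viewing angle to the nearest of the 32 stored orientations."""
    return int(round((angle_deg % 360.0) / STEP_DEG)) % NUM_VIEWS


def two_stage_match(scores_2d: Dict[str, float],
                    confirm_3d: Callable[[str], bool],
                    keep_threshold: float) -> Tuple[List[str], List[str]]:
    """Filter candidates with cheap 2D scores, then confirm survivors with the 3D check."""
    candidates = [name for name, s in scores_2d.items() if s >= keep_threshold]
    confirmed = [name for name in candidates if confirm_3d(name)]
    return candidates, confirmed


# Hypothetical 2D similarity scores for three database objects.
scores = {"mug": 0.9, "box": 0.2, "lamp": 0.7}
candidates, confirmed = two_stage_match(scores, lambda name: name == "mug", 0.5)
```

Only the candidates that pass the 2D filter reach the expensive 3D stage, which is where the reported reduction in matching time would come from.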
HUANG, WEI-SHIANG, and 黃暐翔. "Development of Software System for Autonomous Object Operations by 6-axis Robot Arm with 2D/3D Vison Capabilities." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/m95y7h.
國立臺灣科技大學
機械工程系
107
This research proposes an extensible software system architecture for a fully autonomous robot arm object operating system. An autonomous intelligent object operating system needs to integrate various intelligent vision systems for identifying and locating objects and for controlling the movement of the robot arm. However, existing intelligent vision systems are numerous, and each has its own scope of application. Moreover, new vision algorithms and robot object operation modules are continuously being developed. A practical robot arm operating software system must therefore be able to update and add new technical modules. To address this issue, this research designed a scalable software system architecture that can grow over time. By defining the working directories and the interface specifications of the subprograms in advance, the user can easily replace the recognition code without rewriting the main program. This research combines robot arm control with a variety of 2D/3D vision systems to develop a smart, fully automatic robotic object operating system. The system performs object recognition and 6D pose estimation in cluttered scenes with a 3D camera, moves the robot arm to the preset grasping point of the object, uses the 2D camera at the end of the arm to perform visual servo control based on the landmark on the object, and finally moves the robot arm to the precise grasping point and grasps the object. The system also registers the applicable environment and reliability of the different intelligent vision modules for each object in the data set, which increases the robustness of the robot arm system against environmental variation. This research also examines the capabilities of the system through multiple experiments and carefully explores future directions for improvement.
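One way to picture a main program that swaps recognition modules without being rewritten is a registry of functions that all follow one agreed signature. The sketch below is an assumption about how such a convention could look; the thesis describes fixed working directories and subprogram specifications, and the registry, names, and return format here are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# Agreed interface (hypothetical): image bytes in, list of (label, xyz position) out.
Detection = Tuple[str, Tuple[float, float, float]]
Recognizer = Callable[[bytes], List[Detection]]

REGISTRY: Dict[str, Recognizer] = {}


def register(name: str) -> Callable[[Recognizer], Recognizer]:
    """Decorator that adds a recognizer to the registry under `name`."""
    def wrap(fn: Recognizer) -> Recognizer:
        REGISTRY[name] = fn
        return fn
    return wrap


@register("dummy_2d")
def dummy_2d(image: bytes) -> List[Detection]:
    # Placeholder for a 2D vision module.
    return [("cup", (0.1, 0.0, 0.5))]


@register("dummy_3d")
def dummy_3d(image: bytes) -> List[Detection]:
    # Placeholder for a 3D vision module.
    return [("box", (0.3, 0.2, 0.7))]


def run_pipeline(module_name: str, image: bytes) -> List[Detection]:
    """Main program: looks up the module by name and never changes when modules are added."""
    return REGISTRY[module_name](image)


result = run_pipeline("dummy_2d", b"")
```

Adding a new vision module then means writing one more registered function; `run_pipeline` stays untouched.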
Yu, Ming-Jyun, and 余明駿. "Markers Based 3D Position Estimation for Rod Shaped Object Using 2D Image and Its Application In Endoscopic MIS Instrument Tracking." Thesis, 2015. http://ndltd.ncl.edu.tw/handle/mwg4jq.
國立雲林科技大學
電機工程系
103
The aim of our research is to estimate the 3D position of a uniform circular rod-shaped object (such as an endoscopic surgical instrument) filmed by a single-lens camera. We attach two markers to the rod-shaped object; the markers have the same shape but different colors. Using digital image processing, we can detect these markers and estimate the 2D information of the rod-shaped object efficiently. From the geometric relationship between the lens and the imaging positions on the camera sensor, we can then quickly estimate the 3D position information of the rod-shaped object. This 3D position information has seven parameters in total: the 3D coordinates (X, Y, Z), the in-plane and out-of-plane angles (alpha, beta, gamma), and the rotation angle (theta), which we propose to estimate using binary encoding.
Chen, Kuan-Chieh, and 陳冠傑. "A Study on Autonomous Vehicle Navigation by 2D Object Image Matching and 3D Computer Vision Analysis for Indoor Security Patrolling Applications." Thesis, 2008. http://ndltd.ncl.edu.tw/handle/34337662999305229023.
國立交通大學
多媒體工程研究所
96
A vision-based vehicle system for security patrolling in indoor environments using an autonomous vehicle is proposed. A small vehicle with wireless control and a web camera which has the capabilities of panning, tilting, and zooming is used as a test bed. At first, an easy-to-use learning technique is proposed, which has the capability of extracting specific features, including navigation path, floor color, monitored object, and vehicle location with respect to monitored objects. Next, a security patrolling method by vehicle navigation with obstacle avoidance and security monitoring capabilities is proposed. The vehicle navigates according to the node data of the path map which is created in the learning phase and monitors concerned objects by a simplified scale-invariant feature transform (simplified-SIFT) algorithm proposed in this study. Accordingly, we can extract the features of each monitored object from acquired images and match them with the corresponding learned data by the Hough transform. Furthermore, a vehicle location estimation technique for path correction utilizing the monitored object matching result is proposed. In addition, techniques for obstacle avoidance are also proposed, which can be used to find the clusters of floor colors, detect obstacles in environments with various floor colors, and integrate a technique of goal-directed minimum path following to guide the vehicle to avoid obstacles. Good experimental results show the flexibility and feasibility of the proposed methods for the application of security patrolling in indoor environments.
SYU, JIA-CYUAN, and 許家銓. "Double-Rings Markers Based 3D Complete Eight Quadrants Position Estimation for Rod Shaped Object Using 2D Image and Its Application In Endoscopic MIS Instrument Tracking." Thesis, 2016. http://ndltd.ncl.edu.tw/handle/92550082240203623869.
國立雲林科技大學
電機工程系
104
This paper is based on the results of Mr. Ming-Jyun Yu's 2015 master's thesis from our laboratory, titled "Markers Based 3D Position Estimation for Rod Shaped Object Using 2D Image and Its Application In Endoscopic MIS Instrument Tracking", which features fast and accurate estimation of the six 3D position parameters from a single 2D image through a set of deterministic formulas (equations). In this thesis, we completed four research goals. First, we extend the original formula for one particular pose to formulas for any pose. Second, we select the colors of the two rings, as well as the RGB thresholds for fast and accurate ring shape extraction from the 2D image, by analyzing a large number of laparoscopic images. Third, we propose an algorithm that takes a 2D laparoscopic image as input and outputs the corresponding six 3D pose parameters. We also verify the correctness of the proposed formulas and algorithm by conducting extensive experiments in which the six 3D parameters are measured (by human observers) and estimated (by the proposed algorithm), and their differences are analyzed for various poses. The results of this analysis can be used to further improve the accuracy of the proposed method. Finally, we demonstrate the possibility of synchronizing the motion of a real rod-shaped body with its Unity3D-based 3D model for further application in Augmented Reality. The six 3D pose parameters of the real rod-shaped body are estimated by the proposed system and transmitted to drive its remote 3D model. Compared to other existing MIS pose estimation methods, the proposed Double-Ring marker based algorithm is accurate and computationally very efficient.
Liu, Cheng Hsiung, and 劉政雄. "RECOGNITION OF 3D OBJECTS BY SINGLE CAMERA VIEWS USING CAMERA CALIBRATION, SURFACE BACKPROJECTION, AND 2D MODEL MATCHING TECHNIQUES BASED ON OBJECT SHAPE AND SURFACE PATTERN INFORMATION." Thesis, 1993. http://ndltd.ncl.edu.tw/handle/99374905602646662157.
國立交通大學
資訊工程研究所
81
A new approach is proposed for recognizing three different classes of 3D objects from single camera views using a combination of camera calibration, surface backprojection, and 2D model matching techniques. The three classes of 3D objects are cuboids, cylinders, and regular prisms, which are commonly seen in commercial products and industrial parts. Both the silhouette shape and the surface pattern of the object are used in the recognition scheme. For each class, objects of different sizes and different surface patterns can be recognized. To recognize an input object of each class, a new camera calibration technique is first employed to compute the camera parameters as well as the object dimension parameters analytically from a single camera view of the object. The availability of analytical solutions for the camera parameters makes the proposed technique faster in parameter computation than other camera calibration approaches that require iterative parameter computation. The calibration technique is based on the lines or curves formed by the intersections of the object surfaces. A surface backprojection technique is then adopted to reconstruct the pattern on each surface patch of the input object. This technique transforms the 3D surface data into a set of 2D surface patch patterns, which makes the subsequent model matching process 2D in nature. Finally, in the model matching process, each surface patch pattern is matched with those of each object model using the distance-weighted correlation measure. Experimental results show the feasibility of the proposed approach.