Dissertations / Theses on the topic 'Binocular vision. Depth perception. Computer vision'
Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles
Consult the top 28 dissertations / theses for your research on the topic 'Binocular vision. Depth perception. Computer vision.'
Next to every source in the list of references there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.
Browse dissertations / theses on a wide variety of disciplines and organise your bibliography correctly.
Tsang, Kong Chau. "Preference for phase-based disparity in a neuromorphic implementation of the binocular energy model /." View Abstract or Full-Text, 2003. http://library.ust.hk/cgi/db/thesis.pl?ELEC%202003%20TSANG.
Includes bibliographical references (leaves 64-66). Also available in electronic version. Access restricted to campus users.
Val, Petran. "Binocular Depth Perception, Probability, Fuzzy Logic, and Continuous Quantification of Uniqueness." Case Western Reserve University School of Graduate Studies / OhioLINK, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=case1504749439893027.
Gampher, John Eric. "Perception of motion-in-depth induced motion effects on monocular and binocular cues /." Birmingham, Ala. : University of Alabama at Birmingham, 2008. https://www.mhsl.uab.edu/dt/2009r/gampher.pdf.
Title from PDF title page (viewed Mar. 30, 2010). Additional advisors: Franklin R. Amthor, James E. Cox, Timothy J. Gawne, Rosalyn E. Weller. Includes bibliographical references (p. 104-114).
Chan, Y. M. "Depth perception in visual images." Thesis, University of Brighton, 1987. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.380238.
Zotov, Alexander. "Models of disparity gradient estimation in the visual cortex." Birmingham, Ala. : University of Alabama at Birmingham, 2007. https://www.mhsl.uab.edu/dt/2008r/zotov.pdf.
Parton, Andrew D. "The role of binocular disparity and motion parallax information in the perception of depth and shape of physical and simulated stimuli." Thesis, University of Surrey, 2000. http://epubs.surrey.ac.uk/843854/.
Grafton, Catherine E. "Binocular vision and three-dimensional motion perception : the use of changing disparity and inter-ocular velocity differences." Thesis, University of St Andrews, 2011. http://hdl.handle.net/10023/1922.
Riddell, Patricia Mary. "Vergence eye movements and dyslexia." Thesis, University of Oxford, 1987. http://ora.ox.ac.uk/objects/uuid:fc695d53-073a-467d-bc8d-8d47c0b9321e.
Ulusoy, Ilkay. "Active Stereo Vision: Depth Perception For Navigation, Environmental Map Formation And Object Recognition." PhD thesis, METU, 2003. http://etd.lib.metu.edu.tr/upload/12604737/index.pdf.
… internal parameters bring a high computational load. Thus, it is preferable to find the strategy to be followed in a simulated world and then apply it on a real robot for real applications. In this study, we describe an algorithm for object recognition and cognitive map formation using stereo image data in a 3D virtual world, in which 3D objects and a robot with an active stereo imaging system are simulated. The stereo imaging system is simulated so that the properties of the actual human visual system are parameterized. Only the stereo images obtained from this world are supplied to the virtual robot. By applying our disparity algorithm, a depth map for the current stereo view is extracted. Using the depth information for the current view, a cognitive map of the environment is updated gradually while the virtual agent explores the environment. The agent explores its environment in an intelligent way, using the current view and the environmental map built up so far. If a new object is observed during exploration, the robot turns around it, obtains stereo images from different directions, and extracts a 3D model of the object. Using the available set of possible objects, it recognizes the object.
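For context on the disparity-to-depth step that such an algorithm relies on, here is a minimal sketch (ours, not the thesis's code) of converting a disparity map from a rectified stereo pair into metric depth; the focal length and baseline values are hypothetical:

```python
import numpy as np

def disparity_to_depth(disparity, focal_px, baseline_m, min_disp=1e-6):
    """Convert a disparity map (pixels) to depth (metres) for a
    rectified stereo pair: Z = f * B / d."""
    disparity = np.asarray(disparity, dtype=np.float64)
    depth = np.full(disparity.shape, np.inf)
    valid = disparity > min_disp          # zero disparity -> point at infinity
    depth[valid] = focal_px * baseline_m / disparity[valid]
    return depth

# Hypothetical values: 700 px focal length, 10 cm baseline.
d = np.array([[35.0, 70.0], [0.0, 7.0]])
print(disparity_to_depth(d, focal_px=700.0, baseline_m=0.1))
# 35 px -> 2.0 m, 70 px -> 1.0 m, 0 px -> inf, 7 px -> 10.0 m
```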
McIntire, John Paul. "Investigating the Relationship between Binocular Disparity, Viewer Discomfort, and Depth Task Performance on Stereoscopic 3D Displays." Wright State University / OhioLINK, 2014. http://rave.ohiolink.edu/etdc/view?acc_num=wright1400790668.
Buckley, John G., Gurvinder K. Panesar, Michael J. MacLellan, Ian E. Pacey, and Brendan T. Barrett. "Changes to Control of Adaptive Gait in Individuals with Long-standing Reduced Stereoacuity." Association for Research in Vision and Ophthalmology, 2010. http://hdl.handle.net/10454/4728.
RCUK (Research Councils, UK)
Héjja-Brichard, Yseult. "Spatial and temporal integration of binocular disparity in the primate brain." Thesis, Toulouse 3, 2020. http://www.theses.fr/2020TOU30086.
The primate visual system relies heavily on the small differences between the two retinal projections to perceive depth. However, it is not fully understood how these binocular disparities are computed and integrated by the nervous system. On the one hand, single-unit recordings in macaques give access to the neuronal encoding of disparity at a very local level. On the other hand, functional neuroimaging (fMRI) studies in humans shed light on the cortical networks involved in disparity processing at a macroscopic level, but in a different species. In this thesis, we use an fMRI approach in macaques to bridge the gap between single-unit and fMRI recordings conducted in the non-human and human primate brain, respectively, allowing direct comparisons between the two species. More specifically, we focused on the temporal and spatial processing of binocular disparities at the cortical but also at the perceptual level. Investigating cortical activity in response to motion-in-depth, we showed for the first time that 1) there is a dedicated network in the macaque that comprises areas beyond the MT cluster and its surroundings, and 2) there are homologies with the human network involved in processing very similar stimuli. In a second study, we tried to establish a link between perceptual biases that reflect statistical regularities of the three-dimensional visual environment and cortical activity, by investigating whether such biases exist and can be related to specific responses at a macroscopic level. We found stronger activity for the stimulus reflecting natural statistics in one subject, demonstrating a potential influence of spatial regularities on cortical activity; further work is needed before firm conclusions about such a link can be drawn. Nonetheless, we robustly confirmed the existence of a vast cortical network responding to correlated disparities in the macaque brain. Finally, we measured for the first time corresponding retinal points on the vertical meridian of a macaque subject performing a behavioural task (a forced-choice procedure) and compared them to data we collected in several human observers with the very same protocol. In the discussion sections, we show how these findings open the door to varied perspectives.
Salvi, Joaquim. "An approach to coded structured light to obtain three dimensional information." Doctoral thesis, Universitat de Girona, 1998. http://hdl.handle.net/10803/7714.
The stereo vision principle is based on obtaining the three-dimensional position of an object point from the positions of its projections in the two camera image planes. However, before 3D information can be inferred, the mathematical models of both cameras have to be known. This step is known as camera calibration and is described at length in the thesis. Perhaps the most important problem in stereo vision is determining the pair of homologous points in the two images, known as the correspondence problem; it is also one of the most difficult to solve and is currently investigated by many researchers. Epipolar geometry allows us to reduce the correspondence problem, and an approach to it is described in the thesis. Nevertheless, epipolar geometry does not solve the problem completely, as many other considerations have to be taken into account; for example, some points have no correspondence at all, due to surface occlusion or simply to projecting outside the other camera's field of view.
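To make the triangulation step concrete, here is a minimal sketch (an illustration, not code from the thesis) of recovering a 3D point from a pair of homologous image points once both cameras are calibrated; the intrinsics, baseline and matched points below are hypothetical:

```python
import numpy as np
import cv2

# Hypothetical intrinsics: f = 700 px, principal point (320, 240).
K = np.array([[700.0, 0.0, 320.0],
              [0.0, 700.0, 240.0],
              [0.0, 0.0, 1.0]])

# Left camera at the origin; right camera translated 0.1 m along x.
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-0.1], [0.0], [0.0]])])

# One matched (homologous) point pair, as 2xN pixel arrays.
pts1 = np.array([[320.0], [240.0]])
pts2 = np.array([[285.0], [240.0]])   # 35 px disparity

X_h = cv2.triangulatePoints(P1, P2, pts1, pts2)  # 4xN homogeneous coordinates
print((X_h[:3] / X_h[3]).ravel())                # -> approx. (0, 0, 2) metres
```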
The interest of the thesis is focused on structured light, which is one of the most frequently used techniques for reducing the problems associated with stereo vision. Structured light is based on the relationship between a projected light pattern and an image sensor: the deformation between the pattern projected onto the scene and the one captured by the camera makes it possible to obtain three-dimensional information about the illuminated scene. The technique has been widely used in applications such as 3D object reconstruction, robot navigation and quality control. Although the projection of regular patterns solves the problem of points without a match, it does not solve the problem of multiple matching, which forces the use of computationally expensive algorithms to search for the correct matches.
In recent years, another structured light technique has grown in importance. It is based on codifying the light projected onto the scene so that it can be used to obtain a unique match: each token of light imaged by the camera carries a label, and we have to read that label (decode the pattern) in order to solve the correspondence problem. The advantages and disadvantages of stereo vision versus structured light, together with a survey of coded structured light, are presented and discussed. The work carried out in the frame of this thesis has led to a new coded structured light pattern which solves the correspondence problem uniquely and robustly: uniquely, because each token of light is coded by a different word, which removes the problem of multiple matching; robustly, because the pattern is coded using the position of each token of light with respect to both coordinate axes. Algorithms and experimental results are included in the thesis. The reader will find examples of 3D measurement of static objects, and of the more complicated measurement of moving objects; the technique can be used in both cases because the pattern is coded in a single projection shot. It can therefore be used in several robot vision applications.
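For illustration, here is a sketch of one classical codification scheme, Gray-code stripe patterns, widely used in the coded structured light literature (not necessarily the pattern proposed in this thesis): each projector column is labelled by a unique codeword spread over a sequence of stripe images, and decoding the word seen at a camera pixel identifies its projector column uniquely.

```python
import numpy as np

def gray_code_patterns(n_cols, n_bits):
    """One stripe image per bit; column c is labelled by gray(c) = c ^ (c >> 1)."""
    cols = np.arange(n_cols)
    gray = cols ^ (cols >> 1)
    # patterns[k][c] is bit k of the Gray code of column c (1 = bright stripe).
    return [((gray >> k) & 1).astype(np.uint8) for k in range(n_bits)]

def decode_column(bits):
    """Recover the column index from the per-pixel decoded bits (LSB first)."""
    gray = 0
    for k, b in enumerate(bits):
        gray |= int(b) << k
    binary, shift = gray, 1
    while (gray >> shift) > 0:        # Gray -> binary conversion
        binary ^= gray >> shift
        shift += 1
    return binary

patterns = gray_code_patterns(n_cols=1024, n_bits=10)
col = 417
bits = [p[col] for p in patterns]     # what a camera pixel viewing column 417 reads
print(decode_column(bits))            # -> 417, a unique match for that token
```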
Our interest is focused on the mathematical study of the camera and pattern projector models: how these models can be obtained by calibration, and how they can be used to obtain three-dimensional information from two corresponding points. Furthermore, we have studied structured light and coded structured light, and we have presented a new coded structured light pattern. However, this thesis starts from the assumption that the corresponding points can be well segmented from the captured image. Computer vision is a huge problem, and much work is being done at all levels of human vision modelling, starting from a) image acquisition, b) image enhancement, filtering and processing, and c) image segmentation, which involves thresholding, thinning, contour detection, texture and colour analysis, and so on. The interest of this thesis begins at the next step, usually known as depth perception or 3D measurement.
Djikic, Addi. "Segmentation and Depth Estimation of Urban Road Using Monocular Camera and Convolutional Neural Networks." Thesis, KTH, Robotik, perception och lärande, RPL, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-235496.
Deep learning for safe autonomous transport systems is emerging more and more in research and development. Fast and robust perception of the environment will be crucial for future autonomous vehicles navigating urban areas with heavy traffic. In this thesis we derive a new form of neural network that we call AutoNet. The network is designed as an autoencoder for pixel-wise depth estimation of the free, drivable road surface in urban areas, using only a monocular camera and its images. The proposed depth estimation network is treated as a regression problem. AutoNet is also constructed as a classification network that classifies and segments only the drivable road surface in real time with monocular vision. This is handled as a supervised classification problem, which also proves to be a simpler and more robust solution for finding road surface in urban areas. We also implement one of the leading neural networks, ENet, for comparison. ENet is designed for fast, real-time semantic segmentation with high prediction speed. The evaluation shows that AutoNet outperforms ENet in every accuracy measurement but proves slower in terms of frames per second. Various optimisations are proposed as future work for increasing the model's frame rate while retaining its robustness. All training and evaluation is done on the Cityscapes dataset. New data for training and evaluating the road depth estimation is created with a new approach that combines pre-computed depth maps with semantic road labels. Data collection with a Scania vehicle fitted with a monocular camera is also carried out to test the final derived model. The proposed AutoNet network proves to be a promising, top-performing model for road depth estimation and road classification in urban areas.
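As a toy illustration of the encoder-decoder idea behind a network like AutoNet (whose exact architecture is not given in this abstract), a minimal PyTorch sketch of pixel-wise depth regression from a monocular image might look like this:

```python
import torch
import torch.nn as nn

class TinyDepthAutoencoder(nn.Module):
    """Toy encoder-decoder: RGB image in, one depth value per pixel out."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),   # H/2
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),  # H/4
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),  # H/2
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1),              # H
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = TinyDepthAutoencoder()
img = torch.randn(1, 3, 128, 256)          # dummy batch
pred = model(img)                          # (1, 1, 128, 256) depth map
loss = nn.functional.mse_loss(pred, torch.randn_like(pred))  # regression loss
loss.backward()
```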
Zins, Matthieu. "Color Fusion and Super-Resolution for Time-of-Flight Cameras." Thesis, Linköpings universitet, Datorseende, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-141956.
Shakeel, Amlaan. "Service robot for the visually impaired: Providing navigational assistance using Deep Learning." Miami University / OhioLINK, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=miami1500647716257366.
Radu, Orghidan. "Catadioptric stereo based on structured light projection." Doctoral thesis, Universitat de Girona, 2006. http://hdl.handle.net/10803/7733.
Visual perception is enhanced when a large field of view is available. This thesis focuses on the visual perception of depth by means of omnidirectional cameras. In computer vision, 3D sensing is obtained with stereo configurations, at the cost of having to match features between the images. The solution offered in this dissertation uses structured light projection to solve the matching problem.
First, a survey of omnidirectional vision systems was carried out. Then the sensor design was addressed and the particular stereo configuration of the proposed sensor was decided. An accurate model was obtained through a careful study of both components of the sensor, and the model parameters are measured by a set of calibration methods.
The results obtained are encouraging and prove that the sensor can be used in depth perception applications such as scene modeling, pipe inspections, robot navigation, etc.
Lovell, P. G., Marina Bloj, and J. M. Harris. "Optimal integration of shading and binocular disparity for depth perception." 2012. http://hdl.handle.net/10454/6070.
Lee, Hwan Sean. "Double-matching in anti-correlated random dot stereograms of Panum's limiting case reveals the interactions among the elementary disparity signals across scale." 2006. http://www.mhsl.uab.edu/dt/Hwan%20Sean%20Lee%20phd--2006-RANDOM%20DOT%20STEREOGRAMS.pdf.
Full textVan, der Merwe Juliaan Werner. "An evaluation of local two-frame dense stereo matching algorithms." Thesis, 2012. http://hdl.handle.net/10210/4966.
The process of extracting depth information from multiple two-dimensional images of the same scene is known as stereo vision. It is of central importance to the field of machine vision, as it is a low-level task required by many higher-level applications. The past few decades have witnessed the development of hundreds of stereo vision algorithms, which makes it difficult to classify and compare the various approaches. In this research we provide an overview of the types of approaches that exist to solve the stereo vision problem, focusing on a specific subset known as local stereo algorithms. Our goal is to critically analyse and compare a representative sample of local stereo algorithms in terms of both speed and accuracy. We also divide the algorithms into discrete, interchangeable components and experiment to determine the effect each alternative component has on an algorithm's speed and accuracy, and we go further to quantify and analyse the effect of various design choices within specific components. Finally, we assemble the knowledge gained through the experimentation to compose and optimise a novel algorithm. The experimentation highlighted that by far the most important component of a local stereo algorithm is the manner in which it aggregates matching costs. All of the top-performing local stereo algorithms dynamically define the shape of the windows over which matching costs are aggregated, aiming to include in a window only pixels that are likely to lie at the same depth as the window's centre pixel. Since depth is unknown, these cost aggregation techniques use colour and proximity information as a best guess for whether pixels are at the same depth when defining the window shapes. Local stereo algorithms are usually less accurate than global methods, but they are supposed to be faster and more parallelisable; the adaptive aggregation techniques yield very accurate depth estimates but are computationally very expensive. We believe the focus of local stereo algorithm development should be speed. Using the experimental results, we developed an algorithm that achieves accuracies in the same order of magnitude as the state-of-the-art algorithms while reducing computation time by over 50%.
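As a reference point for the local algorithms discussed above, here is a minimal fixed-window sketch (simple box aggregation, i.e. the baseline that adaptive-window methods improve on): matching costs are aggregated over a square window and disparity is chosen winner-take-all.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def local_stereo_sad(left, right, max_disp, window=9):
    """Fixed-window local stereo: SAD cost, box aggregation, winner-take-all.
    left/right are rectified greyscale images (2D float arrays)."""
    h, w = left.shape
    cost = np.full((max_disp + 1, h, w), np.inf)
    for d in range(max_disp + 1):
        diff = np.abs(left[:, d:] - right[:, :w - d])      # per-pixel matching cost
        cost[d, :, d:] = uniform_filter(diff, size=window) # aggregate over the window
    return np.argmin(cost, axis=0).astype(np.float32)      # disparity map

# Toy usage with a synthetic pair; real use needs a rectified stereo pair.
rng = np.random.default_rng(0)
left = rng.random((120, 160))
right = np.roll(left, -4, axis=1)   # synthetic 4-px disparity
disp = local_stereo_sad(left, right, max_disp=16)
```

Replacing the `uniform_filter` step with an adaptively shaped, colour- and proximity-weighted aggregation is exactly the design choice the thesis identifies as the dominant factor in accuracy.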
Maloney, R. T., M. Kaestner, Alison Bruce, Marina Bloj, J. M. Harris, and A. R. Wade. "Sensitivity to velocity- and disparity based cues to motion-in-depth with and without spared stereopsis in binocular visual impairment." 2018. http://hdl.handle.net/10454/16547.
Purpose: Two binocular sources of information serve motion-in-depth (MID) perception: changes in disparity over time (CD), and interocular velocity differences (IOVD). While CD requires the computation of small spatial disparities, IOVD could be computed from a much lower-resolution signal. IOVD signals therefore might still be available under conditions of binocular vision impairment (BVI) with limited or no stereopsis, e.g. amblyopia. Methods: Sensitivity to CD and IOVD was measured in adults who had undergone therapy to correct optical misalignment or amblyopia in childhood (n=16), as well as normal vision controls with good stereoacuity (n=8). Observers discriminated the interval containing a smoothly-oscillating MID "test" stimulus from a "control" stimulus in a two-interval forced choice (2IFC) paradigm. Results: Of the BVI observers with no static stereoacuity (n=9), one displayed evidence for sensitivity to IOVD only, while there was otherwise no sensitivity for either CD or IOVD in the group. Generally, BVI observers with measurable stereoacuity (n=7) displayed a pattern resembling the control group: showing a similar sensitivity for both cues. A neutral-density (ND) filter placed in front of the fixing eye in a subset of BVI observers did not improve performance. Conclusions: In one BVI observer there was preserved sensitivity to IOVD but not CD, though overall only those BVI observers with at least gross stereopsis were able to detect disparity-based or velocity-based cues to MID. The results imply that these logically distinct information sources are somehow coupled, and in some cases BVI observers with no stereopsis may still retain sensitivity to IOVD.
UK Biotechnology and Biological Sciences Research Council (BBSRC): BB/M002543/1 (Alex R. Wade), BB/M001660/1 (Julie M. Harris) and BB/M001210/1 (Marina Bloj)
Adler, P., Andy J. Scally, and Brendan T. Barrett. "Test-retest variability of Randot stereoacuity measures gathered in an unselected sample of UK primary school children." 2012. http://hdl.handle.net/10454/6782.
Full textSaksena, Harsh. "A Novel Fusion Technique For 2D LIDAR And Stereo Camera Data Using Fuzzy Logic For Improved Depth Perception." Thesis, 2021. http://dx.doi.org/10.7912/C2/45.
Obstacle detection, avoidance and path finding for autonomous vehicles require precise information about the vehicle's environment for faultless navigation and decision making. Vision and depth perception sensors have therefore become an integral part of autonomous vehicles in current research and development. The advancements made in vision sensors such as radar, Light Detection And Ranging (LIDAR) sensors and compact high-resolution cameras are encouraging; however, individual sensors can be prone to error and misinformation due to environmental factors such as scene illumination, object reflectivity and object transparency. Sensor fusion, in which multiple sensors perceiving similar or related information are combined over a network, is applied to provide more robust and complete system information and to minimize the overall perceived error of the system. 3D LIDAR and the monocular camera are the most commonly used vision sensors for sensor fusion. 3D LIDARs offer high accuracy and resolution for depth capture in any given environment and have a broad range of applications, such as terrain mapping and 3D reconstruction. Despite 3D LIDAR being the superior depth sensor, its high cost and sensitivity to the environment make it a poor choice for mid-range applications such as autonomous rovers, RC cars and robots. 2D LIDARs are more affordable, more easily available and have a wider range of applications than 3D LIDARs, making them the obvious choice for budget projects. The primary objective of this thesis is to implement a smart and robust sensor fusion system using a 2D LIDAR and a stereo depth camera to capture depth and colour information of an environment. The depth points generated by the LIDAR are fused with the depth map generated by the stereo camera by a fuzzy system that implements smart fusion and corrects gaps in the stereo camera's depth information. The use of a fuzzy system for fusing 2D LIDAR and stereo camera data is a novel approach to the sensor fusion problem, and the output of the fuzzy fusion provides higher depth confidence than either sensor alone. In this thesis we explore the multiple layers of sensor and data fusion applied to the vision system, both on the camera and LIDAR data individually and in relation to each other. We detail the development and implementation of the fuzzy-logic-based fusion approach, the fuzzification of the input data, and the method of selecting the fuzzy system for depth-specific fusion for the given vision system, and we show how fuzzy logic can be used to provide information that is vastly more reliable than the information provided by the camera and LIDAR separately.
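As a rough illustration of the idea (the thesis's actual rule base and membership functions are not given in this abstract), a fuzzy fusion step can weight each sensor's depth reading by a membership-derived confidence before blending; the membership shapes and the texture-based stereo confidence below are hypothetical toys:

```python
import numpy as np

def tri(x, a, b, c):
    """Triangular membership function peaking at b."""
    return np.clip(np.minimum((x - a) / (b - a + 1e-9),
                              (c - x) / (c - b + 1e-9)), 0.0, 1.0)

def fuse_depth(z_lidar, z_stereo, stereo_texture):
    """Blend LIDAR and stereo depth per pixel (weighted-average defuzzification).
    stereo_texture in [0, 1] is a toy proxy for how trustworthy the stereo
    match is at that pixel."""
    w_stereo = tri(stereo_texture, 0.2, 1.0, 1.8)        # trust stereo where texture is high
    w_lidar = np.where(np.isfinite(z_lidar), 1.0, 0.0)   # trust LIDAR where it returned a point
    z_lidar_safe = np.where(np.isfinite(z_lidar), z_lidar, 0.0)
    w_sum = w_stereo + w_lidar
    fused = (w_stereo * z_stereo + w_lidar * z_lidar_safe) / np.maximum(w_sum, 1e-9)
    return np.where(w_sum > 0, fused, np.nan)            # no information at all -> NaN

z_l = np.array([2.0, np.inf, 2.2])     # LIDAR: no return at index 1
z_s = np.array([2.1, 2.5, 0.0])        # stereo depth; index 2 had no reliable match
tex = np.array([0.9, 0.8, 0.0])        # stereo confidence proxy
print(fuse_depth(z_l, z_s, tex))       # gaps in one sensor are covered by the other
```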
"Motion and shape from apparent flow." 2013. http://library.cuhk.edu.hk/record=b5549772.
Determining general camera motion and reconstructing the depth map of the imaged scene from captured video is important for computer vision and various robotics tasks, including visual control and autonomous navigation. A camera (or a cluster of cameras) is usually mounted on the end-effector of a robot arm when performing these tasks. Determining the relative geometry between the camera frame and the end-effector frame, commonly referred to as hand-eye calibration, is essential for proper operation in visual control. Similarly, determining the relative geometry of multiple cameras is important to various applications requiring a multi-camera rig.
The relative motion between an observer and the imaged scene generally induces apparent flow in the video. The difficulty of the problem lies mainly in that the flow pattern directly observable in the video is generally not the full flow field induced by the motion, but only partial information about it: the component orthogonal to the iso-brightness contour of the spatial image intensity profile. This partial flow field is known as the normal flow field. This thesis addresses several important problems in computer vision: determining camera motion, recovering the depth map, and performing hand-eye calibration from the apparent flow (normal flow) pattern in the video data directly, not from a full flow field interpolated from the apparent flow. This approach has a number of significant contributions. It does not require interpolating the flow field and in turn does not demand that the imaged scene be smooth. In contrast to optical flow, no sophisticated optimization procedures for handling flow discontinuities are required; such techniques are generally computationally expensive. It also breaks the classical chicken-and-egg problem between scene depth and camera motion: no prior knowledge about the locations of the discontinuities is required for motion determination. In this thesis, several direct methods are proposed to determine camera motion using three different types of imaging systems, namely a monocular camera, a stereo camera, and a multi-camera rig.
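For concreteness, here is a minimal sketch of extracting normal flow itself, using the standard brightness-constancy construction (not code from the thesis): the only component of image motion the data directly constrains is the one along the intensity gradient, with magnitude -I_t/|∇I|.

```python
import numpy as np

def normal_flow(frame0, frame1, grad_thresh=1e-2):
    """Normal flow from two consecutive greyscale frames (float arrays).
    Brightness constancy gives grad(I) . v + I_t = 0, so the flow component
    along the gradient direction is -I_t / |grad(I)|."""
    Iy, Ix = np.gradient(frame0)          # spatial gradients (rows = y, cols = x)
    It = frame1 - frame0                  # temporal derivative (one-frame step)
    mag = np.hypot(Ix, Iy)
    valid = mag > grad_thresh             # gradient too weak -> no measurable flow
    n = np.zeros(frame0.shape + (2,))
    n[valid, 0] = -It[valid] * Ix[valid] / mag[valid] ** 2   # x component
    n[valid, 1] = -It[valid] * Iy[valid] / mag[valid] ** 2   # y component
    return n, valid

# Toy usage: a sinusoidal pattern translating one pixel to the right.
f0 = np.sin(np.linspace(0, 8 * np.pi, 64))[None, :] * np.ones((64, 1))
f1 = np.roll(f0, 1, axis=1)
flow, valid = normal_flow(f0, f1)
```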
This thesis begins with the Apparent Flow Positive Depth (AFPD) constraint to determine the motion parameters using all observable normal flows from a monocular camera. The constraint presents itself as an optimization problem to estimate the motion parameters. An iterative process in a constrained dual coarse-to-fine voting framework on the motion parameter space is used to exploit the constraint.
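To give the flavour of such positive-depth voting, here is a sketch simplified, on our own initiative, to pure translation (the thesis handles general motion): each candidate focus of expansion (FOE) is scored by how many normal flow measurements are consistent with positive depth, i.e. with flow pointing away from the FOE.

```python
import numpy as np

def score_foe(points, normal_flows, foe):
    """Count normal-flow measurements consistent with positive depth for a
    candidate FOE (pure forward translation): the true flow at p lies along
    (p - foe)/Z with Z > 0, so its projection onto the gradient direction
    must agree in sign with the measured normal flow."""
    radial = points - foe
    agree = np.einsum('ij,ij->i', radial, normal_flows)
    return np.count_nonzero(agree >= 0)

def search_foe(points, normal_flows, grid):
    """Coarse grid search: return the candidate FOE satisfying most constraints."""
    scores = [score_foe(points, normal_flows, foe) for foe in grid]
    return grid[int(np.argmax(scores))]

# Toy data: flow radiating from a true FOE at (20, 10), observed only as
# its projection onto random gradient directions (i.e. as normal flow).
rng = np.random.default_rng(1)
pts = rng.uniform(-100, 100, (500, 2))
true_flow = (pts - np.array([20.0, 10.0])) / rng.uniform(1, 5, (500, 1))  # 1/Z scaling
g = rng.normal(size=(500, 2))
g /= np.linalg.norm(g, axis=1, keepdims=True)                 # unit gradient directions
nflow = np.einsum('ij,ij->i', true_flow, g)[:, None] * g      # projected (normal) flow
grid = np.array([[x, y] for x in range(-50, 60, 10) for y in range(-50, 60, 10)])
print(search_foe(pts, nflow, grid))   # -> close to (20, 10)
```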
Due to the finite video sampling rate, the extracted normal flow field is generally more accurate in its direction component than in its magnitude. This thesis proposes two constraints: one related to the direction component of the normal flow field, the Apparent Flow Direction (AFD) constraint, and the other to the magnitude component of the field, the Apparent Flow Magnitude (AFM) constraint, to determine motion. The first presents itself as a system of linear inequalities that bind the direction of the motion parameters; the second uses the globality of the rotational magnitude across all image positions to constrain the motion parameters further. A two-stage iterative process in a coarse-to-fine framework on the motion parameter space is used to exploit the two constraints.
Yet without the interpolation step, normal flow is only raw information extracted locally, and it generally suffers from flow extraction error arising from the finiteness of the image resolution and video sampling rate. This thesis explores a remedy to the problem: increasing the visual field of the imaging system by fixating a number of cameras together to form an approximate spherical eye. With a substantially widened visual field, the normal flow data points are far more numerous, which helps combat the local flow extraction error at each image point. More importantly, the directions of the translational and rotational components of general motion can be estimated separately with the use of the novel Apparent Flow Separation (AFS) and Extended Apparent Flow Separation (EAFS) constraints.
Instead of using a monocular camera or a spherical imaging system, stereo vision contributes another visual cue, making it possible to determine the magnitude of translation and the depth map without an arbitrary scaling of the magnitude. The conventional approach in stereo vision is to determine feature correspondences across the two input images, but establishing correspondences is often difficult. This thesis explores two direct methods to recover the complete camera motion from the stereo system without explicit point-to-point correspondence matching. The first method extends the AFD and AFM constraints to the stereo camera and provides a robust geometrical method to determine the translation magnitude. The second method, which requires the stereo image pair to have a largely overlapping field of view, provides a closed-form solution requiring no iterative computation. Once the motion parameters are determined, the depth map can be reconstructed without any difficulty. The depth map resulting from normal flows is generally sparse in nature. We can interpolate the depth map and then utilize it as an initial estimate in a conventional TV-L₁ framework; the result is not only better reconstruction performance but also faster computation.
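A minimal sketch of that initialization step, assuming SciPy's griddata for the scattered interpolation (the TV-L₁ refinement itself is omitted):

```python
import numpy as np
from scipy.interpolate import griddata

def densify_sparse_depth(sparse_depth):
    """Fill a sparse depth map (NaN = no estimate) by scattered interpolation,
    to serve as the initial estimate for an iterative refinement (e.g. TV-L1)."""
    h, w = sparse_depth.shape
    ys, xs = np.nonzero(~np.isnan(sparse_depth))
    values = sparse_depth[ys, xs]
    gy, gx = np.mgrid[0:h, 0:w]
    dense = griddata((ys, xs), values, (gy, gx), method='linear')
    # linear interpolation leaves the convex-hull exterior NaN; pad with nearest
    holes = np.isnan(dense)
    dense[holes] = griddata((ys, xs), values, (gy[holes], gx[holes]), method='nearest')
    return dense

# Toy usage: depth known at roughly 5% of pixels only.
rng = np.random.default_rng(0)
depth = np.full((60, 80), np.nan)
mask = rng.random(depth.shape) < 0.05
depth[mask] = 2.0 + rng.random(np.count_nonzero(mask))
init = densify_sparse_depth(depth)
```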
Calibration of hand-eye geometry is usually based on feature correspondences. This thesis presents an alternative method that uses the normal flows generated by an active camera system to perform self-calibration. To make the method more robust to noise, the strategy is to use the direction component of the flow field, which is more noise-immune, to recover the direction part of the hand-eye geometry first. Outliers are then detected using some intrinsic properties of the flow field together with the partially recovered hand-eye geometry, and the final solution is refined using a robust method. The method can also be used to determine the relative geometry of multiple cameras without demanding overlap in their visual fields.
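For contrast with the thesis's normal-flow approach, the conventional correspondence-based route solves the classical AX = XB hand-eye equation from robot and camera pose pairs, for which OpenCV ships a solver (cv2.calibrateHandEye). The poses below are synthetic placeholders constructed to be consistent with a chosen ground-truth X, purely to show the API:

```python
import numpy as np
import cv2

def rt(axis, angle, t):
    """Build a 4x4 homogeneous transform from an axis-angle rotation and translation."""
    T = np.eye(4)
    T[:3, :3], _ = cv2.Rodrigues(np.asarray(axis, float) * angle)
    T[:3, 3] = t
    return T

# Hypothetical ground-truth hand-eye transform X (camera -> gripper).
X = rt([0, 0, 1], 0.2, [0.05, 0.0, 0.1])
T_t2b = rt([0, 1, 0], 0.1, [0.5, 0.0, 0.8])   # fixed calibration target in base frame

axes = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0], [0, 1, 1]]
R_g2b, t_g2b, R_t2c, t_t2c = [], [], [], []
for k, ax in enumerate(axes):                 # several robot stations
    T_g2b = rt(ax, 0.3 + 0.1 * k, [0.1 * k, 0.05 * k, 0.3])
    # Camera pose consistent with X: T_t2c = X^-1 * T_g2b^-1 * T_t2b
    T_t2c = np.linalg.inv(X) @ np.linalg.inv(T_g2b) @ T_t2b
    R_g2b.append(T_g2b[:3, :3]); t_g2b.append(T_g2b[:3, 3:])
    R_t2c.append(T_t2c[:3, :3]); t_t2c.append(T_t2c[:3, 3:])

R_est, t_est = cv2.calibrateHandEye(R_g2b, t_g2b, R_t2c, t_t2c)
print(np.allclose(R_est, X[:3, :3], atol=1e-6))   # True: X is recovered
```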
Hui, Tak Wai.
Thesis (Ph.D.)--Chinese University of Hong Kong, 2013.
Includes bibliographical references (leaves 159-165).
Abstracts in English and Chinese.
Acknowledgements
Abstract
Lists of Figures
Lists of Tables
Chapter 1 --- Introduction
1.1 --- Background
1.2 --- Motivation
1.3 --- Research Objectives
1.4 --- Thesis Outline
Chapter 2 --- Literature Review
2.1 --- Introduction
2.2 --- Recovery of Optical Flows
2.3 --- Egomotion Estimation Based on Optical Flow Field
2.3.1 --- Bilinear Constraint
2.3.2 --- Subspace Method
2.3.3 --- Partial Search Method
2.3.4 --- Fixation
2.3.5 --- Region Alignment
2.3.6 --- Linearity and Divergence Properties of Optical Flows
2.3.7 --- Constraint Lines and Collinear Points
2.3.8 --- Multi-Camera Rig
2.3.9 --- Discussion
2.4 --- Determining Egomotion Using Direct Methods
2.4.1 --- Introduction
2.4.2 --- Classical Methods
2.4.3 --- Pattern Matching
2.4.4 --- Search Subspace Method
2.4.5 --- Histogram-Based Method
2.4.6 --- Multi-Camera Rig
2.4.7 --- Discussion
2.5 --- Determining Egomotion Using Feature Correspondences
2.6 --- Hand-Eye Calibration
2.7 --- Summary
Chapter 3 --- Determining Motion from Monocular Camera Using Merely the Positive Depth Constraint
3.1 --- Introduction
3.2 --- Related Works
3.3 --- Background
3.3 --- Apparent Flow Positive Depth (AFPD) Constraint
3.4 --- Numerical Solution to AFPD Constraint
3.5 --- Constrained Coarse-to-Fine Searching
3.6 --- Experimental Results
3.7 --- Conclusion
Chapter 4 --- Determining Motion from Monocular Camera Using Direction and Magnitude of Normal Flows Separately
4.1 --- Introduction
4.2 --- Related Works
4.3 --- Apparent Flow Direction (AFD) Constraint
4.3.1 --- The Special Case: Pure Translation
4.3.1.1 --- Locus of Translation Using Full Flow as a Constraint
4.3.1.2 --- Locus of Translation Using Normal Flow as a Constraint
4.3.2 --- The Special Case: Pure Rotation
4.3.2.1 --- Locus of Rotation Using Full Flow as a Constraint
4.3.2.2 --- Locus of Rotation Using Normal Flow as a Constraint
4.3.3 --- Solving the System of Linear Inequalities for the Two Special Cases
4.3.5 --- Ambiguities of AFD Constraint
4.4 --- Apparent Flow Magnitude (AFM) Constraint
4.5 --- Putting the Two Constraints Together
4.6 --- Experimental Results
4.6.1 --- Simulation
4.6.2 --- Video Data
4.6.2.1 --- Pure Translation
4.6.2.2 --- General Motion
4.7 --- Conclusion
Chapter 5 --- Determining Motion from Multi-Cameras with Non-Overlapping Visual Fields
5.1 --- Introduction
5.2 --- Related Works
5.3 --- Background
5.3.1 --- Image Sphere
5.3.2 --- Planar Case
5.3.3 --- Projective Transformation
5.4 --- Constraint from Normal Flows
5.5 --- Approximation of Spherical Eye by Multiple Cameras
5.6 --- Recovery of Motion Parameters
5.6.1 --- Classification of a Pair of Normal Flows
5.6.2 --- Classification of a Triplet of Normal Flows
5.6.3 --- Apparent Flow Separation (AFS) Constraint
5.6.3.1 --- Constraint to Direction of Translation
5.6.3.2 --- Constraint to Direction of Rotation
5.6.3.3 --- Remarks about the AFS Constraint
5.6.4 --- Extension of Apparent Flow Separation Constraint (EAFS)
5.6.4.1 --- Constraint to Direction of Translation
5.6.4.2 --- Constraint to Direction of Rotation
5.6.5 --- Solution to the AFS and EAFS Constraints
5.6.6 --- Apparent Flow Magnitude (AFM) Constraint
5.7 --- Experimental Results
5.7.1 --- Simulation
5.7.2 --- Real Video
5.7.2.1 --- Using Feature Correspondences
5.7.2.2 --- Using Optical Flows
5.7.2.3 --- Using Direct Methods
5.8 --- Conclusion
Chapter 6 --- Motion and Shape from Binocular Camera System: An Extension of AFD and AFM Constraints
6.1 --- Introduction
6.2 --- Related Works
6.3 --- Recovery of Camera Motion Using Search Subspaces
6.4 --- Correspondence-Free Stereo Vision
6.4.1 --- Determination of Full Translation Using Two 3D Lines
6.4.2 --- Determination of Full Translation Using All Normal Flows
6.4.3 --- Determination of Full Translation Using a Geometrical Method
6.5 --- Experimental Results
6.5.1 --- Synthetic Image Data
6.5.2 --- Real Scene
6.6 --- Conclusion
Chapter 7 --- Motion and Shape from Binocular Camera System: A Closed-Form Solution for Motion Determination
7.1 --- Introduction
7.2 --- Related Works
7.3 --- Background
7.4 --- Recovery of Camera Motion Using a Linear Method
7.4.1 --- Region-Correspondence Stereo Vision
7.3.2 --- Combined with Epipolar Constraints
7.4 --- Refinement of Scene Depth
7.4.1 --- Using Spatial and Temporal Constraints
7.4.2 --- Using Stereo Image Pairs
7.5 --- Experiments
7.5.1 --- Synthetic Data
7.5.2 --- Real Image Sequences
7.6 --- Conclusion
Chapter 8 --- Hand-Eye Calibration Using Normal Flows
8.1 --- Introduction
8.2 --- Related Works
8.3 --- Problem Formulation
8.3 --- Model-Based Brightness Constraint
8.4 --- Hand-Eye Calibration
8.4.1 --- Determining the Rotation Matrix R
8.4.2 --- Determining the Direction of Position Vector T
8.4.3 --- Determining the Complete Position Vector T
8.4.4 --- Extrinsic Calibration of a Multi-Camera Rig
8.5 --- Experimental Results
8.5.1 --- Synthetic Data
8.5.2 --- Real Image Data
8.6 --- Conclusion
Chapter 9 --- Conclusion and Future Work
Related Publications
Bibliography
Appendix
A --- Apparent Flow Direction Constraint
B --- Ambiguity of AFD Constraint
C --- Relationship between the Angle Subtended by any two Flow Vectors in Image Plane and the Associated Flow Vectors in Image Sphere
Buckley, J. G., G. K. Panesar, M. J. MacLellan, I. E. Pacey, and B. T. Barrett. "Changes to control of adaptive gait in individuals with long-standing reduced stereoacuity." 2010. http://hdl.handle.net/10454/5896.
Sabihuddin, Siraj. "Dense Stereo Reconstruction in a Field Programmable Gate Array." Thesis, 2008. http://hdl.handle.net/1807/11161.
Super, Selwyn. "Stereopsis and its educational significance." Thesis, 2014. http://hdl.handle.net/10210/11832.
Stereopsis (binocular depth perception) is a visual function which falls within the ambit of the hyperacuities. The term hyperacuity was coined by Westheimer (1976) to describe thresholds of discrimination which cannot be explained on the basis of the optical components or sensory elements of the eyes alone; by implication, such levels of discrimination are effected by higher levels of brain function. It is reasoned that an individual's stereoscopic hyperacuity should in some way relate to other measures of higher sensory and motor brain function. In a school situation, hyperacuity should relate to measures of intelligence as well as scholastic and sporting achievement. The design and implementation of an experiment to test this premise forms the basis of this thesis. A review of the literature relevant to this study is reported, together with a description of the stereoscopic testing instruments commonly available in clinical practice and a rationale for modifying these instruments and testing methods to suit the needs of this study. The study exposes new knowledge about the process of static near-point stereopsis. This stereopsis proves to be a complex of diverse skills, which are significantly age-related and developmental in nature; these skills are seen to influence, and be influenced by, educational interventions. It may be concluded from this study that there is value in measuring stereopsis in more depth than has been done previously, and that it is crucial to measure the speed of stereo performance in its own right, in addition to measures of stereoacuity. The study reveals significant differences of performance relating to stereopsis in front of, as opposed to behind, the plane of regard, and also relating to figure/ground contrast differences. The two non-stereoscopic tests and the six different stereoscopic tests described in this thesis prove to be highly discriminative and diagnostic with respect to age, grade level, I.Q., scholastic achievement and sporting ability.
Elliott, D. B., and G. J. Chapman. "Adaptive gait changes due to spectacle magnification and dioptric blur in older people." 2010. http://hdl.handle.net/10454/5961.