Dissertations / Theses on the topic '3D semantic scene completion'
Consult the dissertations and theses listed below for research on the topic '3D semantic scene completion.'
Roldão, Jimenez Luis Guillermo. "3D Scene Reconstruction and Completion for Autonomous Driving." Electronic Thesis or Diss., Sorbonne université, 2021. http://www.theses.fr/2021SORUS415.
In this thesis, we address the challenges of 3D scene reconstruction and completion from sparse point clouds of heterogeneous density, proposing different techniques to create a 3D model of the surroundings. In the first part, we study the use of three-dimensional occupancy grids for multi-frame reconstruction, useful for localization and HD-map applications. This is done by exploiting ray-path information to resolve ambiguities in partially occupied cells. Our sensor model reduces discretization inaccuracies and enables occupancy updates in dynamic scenarios. We also focus on single-frame environment perception by introducing a 3D implicit surface reconstruction algorithm capable of dealing with heterogeneous-density data through an adaptive neighborhood strategy. Our method completes small regions of missing data and outputs a continuous representation useful for physical modeling or terrain traversability assessment. Finally, we turn to deep learning for the novel task of semantic scene completion, which completes and semantically annotates entire 3D input scans. Given the little consensus found in the literature, we present an in-depth survey of existing methods and introduce our lightweight multiscale semantic completion network for outdoor scenarios. Our method employs a new hybrid pipeline based on a 2D CNN backbone branch to reduce computation overhead and 3D segmentation heads to predict the complete semantic scene at different scales, making it significantly lighter and faster than existing approaches.
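To make the ray-path idea concrete, here is a minimal sketch of a log-odds occupancy update along a sensor ray, assuming a sparse voxel grid; the voxel size, log-odds increments, and helper names are illustrative assumptions, not values or code from the thesis.

```python
import numpy as np

VOXEL = 0.2                 # voxel edge length in metres (assumed)
L_FREE, L_OCC = -0.4, 0.85  # log-odds increments for free/occupied evidence (assumed)

def world_to_voxel(p, grid_origin):
    """Map a world-frame point to integer voxel indices."""
    return tuple(((p - grid_origin) / VOXEL).astype(int))

def update_ray(grid, sensor_pos, hit_pos, grid_origin):
    """Add free-space evidence along the ray and occupied evidence at the hit."""
    direction = hit_pos - sensor_pos
    steps = max(int(np.linalg.norm(direction) / (VOXEL * 0.5)), 1)
    for s in range(steps):  # sample strictly before the endpoint
        p = sensor_pos + direction * (s / steps)
        idx = world_to_voxel(p, grid_origin)
        grid[idx] = grid.get(idx, 0.0) + L_FREE
    end = world_to_voxel(hit_pos, grid_origin)
    grid[end] = grid.get(end, 0.0) + L_OCC

grid = {}  # sparse log-odds grid: voxel index -> log-odds
update_ray(grid, np.array([0.0, 0.0, 1.5]), np.array([4.0, 1.0, 0.5]),
           grid_origin=np.zeros(3))
occupied = {idx for idx, logodds in grid.items() if logodds > 0}
```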
Garbade, Martin [Verfasser]. "Semantic Segmentation and Completion of 2D and 3D Scenes / Martin Garbade." Bonn : Universitäts- und Landesbibliothek Bonn, 2019. http://d-nb.info/1201728010/34.
Jaritz, Maximilian. "2D-3D scene understanding for autonomous driving." Thesis, Université Paris sciences et lettres, 2020. https://pastel.archives-ouvertes.fr/tel-02921424.
In this thesis, we address the challenges of label scarcity and of fusing heterogeneous 3D point clouds and 2D images. We adopt the strategy of end-to-end race driving, where a neural network is trained to map sensor input (camera images) directly to control output, making the strategy independent of annotations in the visual domain. We employ deep reinforcement learning, where the algorithm learns from reward through interaction with a realistic simulator, and propose new training strategies and reward functions for better driving and faster convergence. However, training time remains very long, which is why the remainder of the thesis focuses on perception, studying point cloud and image fusion. We propose two different methods for 2D-3D fusion. First, we project 3D LiDAR point clouds into 2D image space, resulting in sparse depth maps, and propose a novel encoder-decoder architecture to fuse dense RGB and sparse depth for the task of depth completion, which enhances point cloud resolution to image level. Second, we fuse directly in 3D space to prevent information loss through projection: we compute image features with a 2D CNN over multiple views and lift them into a global 3D point cloud for fusion, followed by a point-based network that predicts 3D semantic labels. Building on this work, we introduce the more difficult novel task of cross-modal unsupervised domain adaptation, where one is provided with multi-modal data in a labeled source dataset and an unlabeled target dataset. We propose to perform 2D-3D cross-modal learning via mutual mimicking between image and point cloud networks to address the source-target domain shift, and we show that our method is complementary to the existing uni-modal technique of pseudo-labeling.
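The first fusion method rests on a standard pinhole projection of LiDAR points into image space to form a sparse depth map. A minimal sketch under that assumption, with placeholder intrinsics and image size; the thesis's actual pipeline is not reproduced here.

```python
import numpy as np

def project_to_depth_map(points_cam, K, h, w):
    """points_cam: (N, 3) LiDAR points already in the camera frame."""
    z = points_cam[:, 2]
    valid = z > 0                              # keep points in front of the camera
    uvw = (K @ points_cam[valid].T).T          # pinhole projection
    u = (uvw[:, 0] / uvw[:, 2]).astype(int)
    v = (uvw[:, 1] / uvw[:, 2]).astype(int)
    depth = np.zeros((h, w), dtype=np.float32) # 0 marks "no measurement"
    inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    # Keep the nearest point per pixel: write far-to-near so near overwrites far.
    order = np.argsort(-z[valid][inside])
    depth[v[inside][order], u[inside][order]] = z[valid][inside][order]
    return depth

K = np.array([[721.5, 0.0, 609.6],   # placeholder intrinsics
              [0.0, 721.5, 172.8],
              [0.0, 0.0, 1.0]])
pts = np.random.rand(1000, 3) * [20, 5, 30] - [10, 2.5, 0]
sparse_depth = project_to_depth_map(pts, K, h=375, w=1242)
```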
Dewan, Ayush [Verfasser], and Wolfram [Akademischer Betreuer] Burgard. "Leveraging motion and semantic cues for 3D scene understanding." Freiburg : Universität, 2020. http://d-nb.info/1215499493/34.
Lind, Johan. "Make it Meaningful : Semantic Segmentation of Three-Dimensional Urban Scene Models." Thesis, Linköpings universitet, Datorseende, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-143599.
Piewak, Florian Pierre Joseph [Verfasser], and J. M. [Akademischer Betreuer] Zöllner. "LiDAR-based Semantic Labeling : Automotive 3D Scene Understanding / Florian Pierre Joseph Piewak ; Betreuer: J. M. Zöllner." Karlsruhe : KIT-Bibliothek, 2020. http://d-nb.info/1212512405/34.
Minto, Ludovico. "Deep learning for scene understanding with color and depth data." Doctoral thesis, Università degli studi di Padova, 2018. http://hdl.handle.net/11577/3422424.
In recent years, remarkable progress has been made both in data acquisition and in the tools and algorithms needed to process the data. On the one hand, the introduction of depth sensors into the consumer market has made it possible to acquire three-dimensional data at negligible cost, overcoming the limitations that typically affect applications based on color processing alone. At the same time, increasingly powerful graphics processors have allowed research to extend to computationally demanding algorithms and to their application to large amounts of data. On the other hand, the development of increasingly effective machine learning algorithms, including deep learning techniques, has made it possible to exploit the enormous quantity of data available today. In light of these premises, this thesis presents three typical computer vision problems and proposes for each an approach that exploits both convolutional neural networks and the joint information conveyed by color and depth data. In particular, it presents an approach to the semantic segmentation of color/depth images that uses both information extracted with the aid of a convolutional neural network and geometric information derived through more traditional algorithms. It describes a method for the classification of three-dimensional shapes, likewise based on a convolutional neural network operating on particular representations of the available 3D data. Finally, it proposes the use of a convolutional network to estimate the confidence associated with depth data collected with a ToF sensor and a stereo system, respectively, in order to successfully guide their fusion without employing complicated noise models for the same purpose.
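The final contribution hinges on confidence-guided fusion of ToF and stereo depth. A minimal sketch of one possible blending rule, assuming per-pixel confidences in [0, 1] (e.g. predicted by a CNN, as in the thesis); the convex combination below is an illustrative stand-in, not the thesis's exact fusion scheme.

```python
import numpy as np

def fuse_depth(d_tof, c_tof, d_stereo, c_stereo, eps=1e-6):
    """Blend two depth maps by relative per-pixel confidence."""
    w = c_tof / (c_tof + c_stereo + eps)       # relative trust in the ToF measurement
    return w * d_tof + (1.0 - w) * d_stereo

h, w = 240, 320
d_tof = np.full((h, w), 2.0, dtype=np.float32)     # toy ToF depth map
d_stereo = np.full((h, w), 2.2, dtype=np.float32)  # toy stereo depth map
c_tof = np.random.rand(h, w).astype(np.float32)    # stand-in confidences
c_stereo = np.random.rand(h, w).astype(np.float32)
fused = fuse_depth(d_tof, c_tof, d_stereo, c_stereo)
```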
Lai, Po Kong. "Immersive Dynamic Scenes for Virtual Reality from a Single RGB-D Camera." Thesis, Université d'Ottawa / University of Ottawa, 2019. http://hdl.handle.net/10393/39663.
Yalcin, Bayramoglu Neslihan. "Range Data Recognition: Segmentation, Matching, And Similarity Retrieval." PhD thesis, METU, 2011. http://etd.lib.metu.edu.tr/upload/12613586/index.pdf.
However, there is still a gap in 3D semantic analysis between the requirements of applications and the results obtained. In this thesis we study the 3D semantic analysis of range data. Under this broad title we address the segmentation of range scenes, correspondence matching of range images, and similarity retrieval of range models. Inputs are considered as single-view depth images. First, possible research topics related to 3D semantic analysis are introduced. Planar structure detection in range scenes is analyzed and some modifications to available methods are proposed. A novel algorithm is also presented that segments a 3D point cloud (obtained via a ToF camera) into objects by using spatial information. We propose a novel local range-image matching method that combines 3D surface properties with the 2D scale-invariant feature transform. Next, we present our proposal for retrieving similar models where both the query and the database consist only of range models. Finally, an analysis of the heat diffusion process on range data is presented, together with challenges and some experimental results.
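Planar structure detection of this kind is commonly bootstrapped with RANSAC plane fitting. A minimal sketch under that assumption; the thresholds and iteration count are arbitrary illustrative choices, not the thesis's settings.

```python
import numpy as np

def ransac_plane(points, n_iters=200, threshold=0.02, rng=None):
    """Return a boolean inlier mask for the dominant plane in a point cloud."""
    rng = rng or np.random.default_rng(0)
    best_inliers = np.zeros(len(points), dtype=bool)
    for _ in range(n_iters):
        sample = points[rng.choice(len(points), 3, replace=False)]
        normal = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        norm = np.linalg.norm(normal)
        if norm < 1e-9:                      # degenerate (collinear) sample
            continue
        normal /= norm
        d = -normal @ sample[0]
        dist = np.abs(points @ normal + d)   # point-to-plane distances
        inliers = dist < threshold
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    return best_inliers

pts = np.random.rand(500, 3)
pts[:400, 2] = 0.5 + 0.005 * np.random.randn(400)  # synthetic dominant plane
plane_mask = ransac_plane(pts)
```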
Duan, Liuyun. "Modélisation géométrique de scènes urbaines par imagerie satellitaire." Thesis, Université Côte d'Azur (ComUE), 2017. http://www.theses.fr/2017AZUR4025.
Automatic city modeling from satellite imagery is one of the biggest challenges in urban reconstruction. The ultimate goal is to produce compact and accurate 3D city models that benefit many application fields such as urban planning, telecommunications and disaster management. Compared with aerial acquisition, satellite imagery provides appealing advantages such as low acquisition cost, worldwide coverage and high collection frequency. However, the satellite context also imposes a set of technical constraints, such as a lower pixel resolution and a wider acquisition angle, that challenge 3D city reconstruction. In this PhD thesis, we present a set of methodological tools for generating compact, semantically aware and geometrically accurate 3D city models from stereo pairs of satellite images. The proposed pipeline relies on two key ingredients. First, geometry and semantics are retrieved simultaneously, providing robust handling of occlusion areas and low image quality. Second, it operates at the scale of geometric atomic regions, which allows the shape of urban objects to be well preserved, with a gain in scalability and efficiency. Images are first decomposed into convex polygons that capture geometric details via a Voronoi diagram. Semantic classes, elevations and 3D geometric shapes are then retrieved in a joint classification and reconstruction process operating on the polygons. Experimental results on various cities around the world show the robustness, scalability and efficiency of the proposed approach.
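The polygonal decomposition step can be illustrated with a plain Voronoi partition: the cells of a set of seed points are convex polygons covering the image plane. A minimal sketch assuming uniform random seeds, not the thesis's detail-aware seeding.

```python
import numpy as np
from scipy.spatial import Voronoi

h, w = 512, 512
rng = np.random.default_rng(0)
seeds = rng.uniform([0, 0], [w, h], size=(200, 2))  # illustrative seed placement
vor = Voronoi(seeds)

# Collect the bounded cells as arrays of polygon vertex coordinates.
polygons = []
for region_idx in vor.point_region:
    region = vor.regions[region_idx]
    if region and -1 not in region:            # skip cells extending to infinity
        polygons.append(vor.vertices[region])  # (k, 2) convex polygon
```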
Yin, Wei. "3D Scene Reconstruction from A Monocular Image." Thesis, 2022. https://hdl.handle.net/2440/134585.
Thesis (Ph.D.) -- University of Adelaide, School of Computer Science, 2022.
Zhuo, Wei. "2D+3D Indoor Scene Understanding from a Single Monocular Image." PhD thesis, 2018. http://hdl.handle.net/1885/144616.
Wu, Sheng-Han, and 吳昇翰. "Indoor Scene Semantic Modeling with Scalable 3D Model Retrieval to Interact with Real-world Environment in Virtual Reality." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/p74u4k.
Full text國立交通大學
資訊科學與工程研究所
108
In recent years, Virtual Reality (VR) applications have developed rapidly. However, few of them support interaction with the real-world environment, because large efforts are required to build 3D models that closely match real objects and to place them in the VR environment. In this thesis, we propose a fully automatic method for indoor scene semantic modeling. Moreover, the reconstructed 3D model fits the real scene exactly, allowing the user to touch real objects in VR. First, we acquire the real indoor scene using SemanticFusion, which provides a point cloud with semantic labels. We then present a method to handle incorrect labels and extract individual object point clouds. Finally, a novel 3D object model retrieval method is proposed. Unlike existing works, our method generates geometrically faithful models and works well even when no exactly matching 3D object exists in the shape database or the object point cloud is incomplete. The result has been applied to a VR application, showing that the reconstructed model is precise enough for haptic touch in VR.
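Retrieval without an exact database match typically relies on comparing global shape descriptors. A minimal sketch using the classic D2 shape distribution (a histogram of pairwise point distances) as an illustrative stand-in for the retrieval features the thesis actually uses.

```python
import numpy as np

def d2_descriptor(points, n_pairs=2000, bins=32, rng=None):
    """D2 shape distribution: normalized histogram of random pairwise distances."""
    rng = rng or np.random.default_rng(0)
    i = rng.integers(0, len(points), n_pairs)
    j = rng.integers(0, len(points), n_pairs)
    dists = np.linalg.norm(points[i] - points[j], axis=1)
    hist, _ = np.histogram(dists / (dists.max() + 1e-9), bins=bins,
                           range=(0, 1), density=True)
    return hist / (np.linalg.norm(hist) + 1e-9)

def retrieve(query_pts, database):
    """Return database keys sorted by descriptor similarity to the query."""
    q = d2_descriptor(query_pts)
    scores = {name: q @ d2_descriptor(pts) for name, pts in database.items()}
    return sorted(scores, key=scores.get, reverse=True)

db = {"chair": np.random.rand(1000, 3),           # toy stand-in shape database
      "table": np.random.rand(1000, 3) * [2.0, 1.0, 0.8]}
ranking = retrieve(np.random.rand(800, 3), db)
```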
Najafi, Mohammad. "On the Role of Context at Different Scales in Scene Parsing." PhD thesis, 2017. http://hdl.handle.net/1885/116302.