Doctoral dissertations: "MULTI VIEW VIDEOS"

1

Wang, Dongang. "Action Recognition in Multi-view Videos". Thesis, The University of Sydney, 2018. http://hdl.handle.net/2123/19740.


Abstract:

A long-standing goal in artificial intelligence is to develop agents that can perceive and understand the rich visual world around us. With advances in deep learning and neural networks, many earlier difficulties in computer vision have been resolved. For example, image classification accuracy has even exceeded human performance in the ImageNet challenge. However, some problems still attract attention in the community, such as action recognition and its application to multi-view videos. Building on a large body of work from the last few years, this thesis proposes a new Dividing and Aggregating Network (DA-Net) to address action recognition in multi-view videos. First, the DA-Net learns view-independent representations shared by all views at lower layers and one view-specific representation for each view at higher layers. We then train view-specific action classifiers based on the view-specific representation for each view, and a view classifier based on the shared representation at lower layers. The view classifier predicts how likely each video is to belong to each view. Finally, the predicted view probabilities from multiple views are used as weights when fusing the prediction scores of the view-specific action classifiers. We also propose a new approach based on the conditional random field (CRF) formulation to pass messages among view-specific representations from different branches so that they can help each other. Experiments on three benchmark datasets clearly demonstrate the effectiveness of the proposed DA-Net for multi-view action recognition. We also conduct an ablation study, which indicates that the three proposed modules provide steady improvements in prediction accuracy.
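The fusion step described in this abstract can be illustrated with a minimal Python sketch. It assumes a single input video, one score vector per view-specific branch, and a plain weighted sum; the function name `fuse_scores` and the example numbers are hypothetical and are not taken from the thesis.

```python
def fuse_scores(view_probs, branch_scores):
    """Weight each view-specific branch's action scores by the predicted
    probability that the video belongs to that view, then sum them.

    view_probs:    list of V probabilities from the view classifier
    branch_scores: list of V lists, each holding C action scores
    (Both inputs are illustrative assumptions, not the thesis's data format.)
    """
    num_classes = len(branch_scores[0])
    fused = [0.0] * num_classes
    for prob, scores in zip(view_probs, branch_scores):
        for c, score in enumerate(scores):
            fused[c] += prob * score
    return fused


# Three views, two action classes (made-up numbers).
view_probs = [0.7, 0.2, 0.1]
branch_scores = [[2.0, 0.5], [0.0, 1.0], [1.0, 1.0]]
fused = fuse_scores(view_probs, branch_scores)
predicted_action = max(range(len(fused)), key=fused.__getitem__)
```

With these numbers the first branch dominates because the view classifier assigns it the highest probability, which is the intended effect of probability-weighted fusion.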


2

Canavan, Shaun. "Face recognition by multi-frame fusion of rotating heads in videos". Connect to resource online, 2008. http://rave.ohiolink.edu/etdc/view?acc_num=ysu1210446052.


3

Canavan, Shaun J. "Face Recognition by Multi-Frame Fusion of Rotating Heads in Videos". Youngstown State University / OhioLINK, 2008. http://rave.ohiolink.edu/etdc/view?acc_num=ysu1210446052.


4

Balusu, Anusha. "Multi-Vehicle Detection and Tracking in Traffic Videos Obtained from UAVs". University of Cincinnati / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1593266183551245.


5

Twinanda, Andru Putra. "Vision-based approaches for surgical activity recognition using laparoscopic and RGBD videos". Thesis, Strasbourg, 2017. http://www.theses.fr/2017STRAD005/document.


Abstract:

The aim of this thesis is to design methods for the automatic recognition of surgical activities. This recognition is a key element in developing systems that react to the clinical context and in applications such as automatic assistance during complex surgeries. We approach this problem with computer vision methods, since cameras make it possible to perceive the environment without disturbing the surgery. Two types of video are used: laparoscopic videos and multi-view RGBD videos. We first studied the results obtained with state-of-the-art methods, then proposed new approaches based on deep learning. We also generated large datasets consisting of recordings of surgeries. The results show that our methods achieve better performance for automatic surgical activity recognition than the state of the art.
The main objective of this thesis is to address the problem of activity recognition in the operating room (OR). Activity recognition is an essential component in the development of context-aware systems, which will enable various applications, such as automated assistance during difficult procedures. Here, we focus on vision-based approaches, since cameras are a common source of information for observing the OR without disrupting the surgical workflow. Specifically, we propose to use two complementary video types: laparoscopic and OR-scene RGBD videos. We investigate how state-of-the-art computer vision approaches perform on these videos and propose novel approaches based on deep learning to carry out the tasks. To evaluate our proposed approaches, we generate large datasets of recordings of real surgeries. The results demonstrate that the proposed approaches outperform state-of-the-art methods in surgical activity recognition on these new datasets.


6

Ozcinar, Cagri. "Multi-view video communication". Thesis, University of Surrey, 2015. http://epubs.surrey.ac.uk/807807/.


Abstract:

The proliferation of three-dimensional (3D) video technology increases the demand for multiview video (MVV) communication tremendously. Applications that involve MVV constitute the next step in 3D media technology, given that they offer a more realistic viewing experience. The distribution of MVV to users brings significant challenges due to the large volume of data involved and the inherent limitations imposed by the communication protocols. As the number of views increases, current systems will struggle to meet the demand of delivering the MVV at a consistent quality level. To this end, this thesis addresses efficient coding, adaptive streaming, and loss-resilient delivery techniques for MVV. The first contribution of this thesis addresses the problem of cost-efficient transmission of MVV with a provided per-view depth map. The primary goal is to facilitate the delivery of the maximum possible number of MVV streams over the Internet in order to ensure that the MVV reconstruction quality is maximised. Accordingly, a novel view scalable MVV coding approach is introduced, which includes a new view discarding and reconstruction algorithm. The results of extensive experiments demonstrate that the proposed MVV coding scheme has considerably improved rate-distortion (R-D) performance compared to the state-of-the-art standards. The second contribution of this thesis is the design of an adaptive MVV streaming technique that offers uninterrupted high-quality delivery to users. In order to achieve this, a number of novel mechanisms are introduced that can adapt the MVV content to collaborative peer-to-peer (P2P) and server-client multimedia dissemination networks. Experiment results show that the suggested adaptation technique yields a superior playback performance over a broad range of network conditions. The final contribution of this thesis is the design of an error-resilient scheme that addresses packet losses for MVV streaming. 
The aim is to make the MVV streaming more reliable against communication failures. Simulation results clearly show that the proposed approach outperforms reference solutions by a significant margin, not only objectively, but through subjective testing as well.


7

Salvador, Marcos Jordi. "Surface reconstruction for multi-view video". Doctoral thesis, Universitat Politècnica de Catalunya, 2011. http://hdl.handle.net/10803/108907.


Abstract:

This thesis introduces a methodology for obtaining an alternative representation of video sequences captured by calibrated multi-camera systems in controlled environments with known scene background. This representation consists in a 3D description of the surfaces of foreground objects, which allows for the recovering of part of the 3D information of the original scene lost in the projection process in each camera. The choice of the type of representation and the design of the reconstruction techniques are driven by three requirements that appear in smart rooms or recording studios. In these scenarios, video sequences captured by a multi-camera rig are used both for analysis applications and interactive visualization methods. The requirements are: the reconstruction method must be fast in order to be usable in interactive applications, the surface representation must provide a compression of the multi-view data redundancies and this representation must also provide all the relevant information to be used for analysis applications as well as for free-viewpoint video. Once foreground and background are segregated for each view, the reconstruction process is divided in two stages. The first one obtains a sampling of the foreground surfaces (including orientation and texture), whereas the second provides closed, continuous surfaces from the samples, through interpolation. The sampling process is interpreted as a search for 3D positions that result in feature matchings between different views. This search process can be driven by different mechanisms: an image-based approach, another one based on the deformation of a surface from frame to frame or a statistical sampling approach where samples are searched around the positions of other detected samples, which is the fastest and easiest to parallelize of the three approaches. A meshing algorithm is also presented, which allows for the interpolation of surfaces between samples. 
Starting from an initial triangle, which connects three coherently oriented points, an iterative expansion of the surface over the complete set of samples takes place. The proposed method produces a very accurate reconstruction and a correct topology. Furthermore, it is fast enough to be used interactively. The presented methodology for surface reconstruction yields a fast, compressed and complete representation of foreground elements in multi-view video, as reflected by the experimental results.
This thesis presents several techniques for defining a methodology to obtain an alternative representation of video sequences captured by calibrated multi-camera systems in controlled environments with a known scene background. As the title of the thesis suggests, this representation consists of a three-dimensional description of the surfaces of the foreground objects. This approach to representing multi-view data makes it possible to recover part of the three-dimensional information of the original scene that is lost in the projection process performed by each camera. The choice of the type of representation and the design of the scene reconstruction techniques respond to three requirements that arise in controlled environments such as smart rooms or recording studios, where the sequences captured by the multi-camera system are used both for analysis applications and for various interactive visualization methods. The first requirement is that the reconstruction method must be fast, so that it can be used in interactive applications. The second is that the surface representation must be efficient, so that it yields a compression of the multi-view data. The third requirement is that this representation must be effective, that is, usable in analysis applications as well as for visualization. Once the foreground and background content of each view have been separated (possible in controlled environments with a known background), the strategy followed in the thesis is to divide the reconstruction process into two stages. The first consists of obtaining a sampling of the surfaces (including orientation and texture). The second stage provides closed, continuous surfaces from the set of samples by means of an interpolation process.
The result of the first stage is a set of oriented points in 3D space that locally represent the position, orientation and texture of the surfaces visible to the set of cameras. The sampling process is interpreted as a search for 3D positions that result in image feature correspondences between different views. This search process can be driven by different mechanisms, which are presented in the first part of this thesis. The first proposal is an image-based method that searches for surface samples along the ray that starts at the center of projection of each camera and passes through a given point of the corresponding image. This method is well suited to exploiting photo-consistency in a static scenario and has favorable characteristics for use on GPUs, which is desirable, but it is not designed to exploit the temporal redundancies present in multi-view sequences, nor does it provide closed surfaces. The second method performs the search starting from a sampled initial surface that encloses the space containing the objects to be reconstructed. Searching in the direction opposite to the normals, pointing inward, yields closed surfaces with an algorithm that exploits the temporal correlation of the scene to evolve successive 3D reconstructions over time. A drawback of this method is the set of topological operations on the initial surface, which in general cannot be applied efficiently on GPUs. The third sampling strategy is oriented toward parallelization (GPU) and toward exploiting temporal and spatial correlations in the search for surface samples. After defining an initial search space that includes the objects to be reconstructed, a few seed samples are searched for at random on the surface of the objects.
Next, new surface samples continue to be searched for around each seed (the expansion process) until a sufficient density is reached. To improve the efficiency of the initial seed search, it is proposed to reduce the search space, on the one hand by exploiting temporal correlations in multi-view sequences and on the other by applying multi-resolution. The expansion then proceeds, exploiting the spatial correlation in the distribution of the surface samples. The second part of the thesis presents a meshing algorithm that interpolates the surface between the samples. Starting from an initial triangle that connects three coherently oriented points, the surface is iteratively expanded over the complete set of samples. Compared with the state of the art, the proposed method provides a very accurate reconstruction (it does not modify the position of the samples) and results in a correct topology. Moreover, it is fast enough to be usable in interactive applications, unlike most available methods. The final results, applying both stages (sampling and interpolation), demonstrate the validity of the proposal. The experimental data show how the presented methodology makes it possible to obtain a fast, efficient (compressed) and effective (complete) representation of the foreground elements of the scene.


8

Abdullah, Jan Mirza, and Mahmododfateh Ahsan. "Multi-View Video Transmission over the Internet". Thesis, Linköping University, Department of Electrical Engineering, 2010. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-57903.


Abstract:

3D television using multi-view rendering is receiving increasing interest. In this technology, a number of video sequences are transmitted simultaneously to provide a larger view of the scene or a stereoscopic viewing experience. With two views, stereoscopic rendition is possible. 3D displays are now available that can show several views simultaneously, so that users see different views by moving their head.

The thesis work aims at implementing a demonstration system with a number of simultaneous views. The system will include two cameras, computers at both the transmitting and receiving end and a multi-view display. Besides setting up the hardware, the main task is to implement software so that the transmission can be done over an IP-network.

This thesis report includes an overview and experiences of similar published systems, the implementation of real time video, its compression, encoding, and transmission over the internet with the help of socket programming and finally the multi-view display in 3D format. This report also describes the design considerations more precisely regarding the video coding and network protocols.


9

Fecker, Ulrich. "Coding techniques for multi-view video signals". Aachen : Shaker, 2009. http://d-nb.info/993283179/04.


10

Ozkalayci, Burak Oguz. "Multi-view Video Coding Via Dense Depth Field". Master's thesis, METU, 2006. http://etd.lib.metu.edu.tr/upload/12607517/index.pdf.


Abstract:

Emerging 3-D applications and 3-D display technologies raise transmission problems for next-generation multimedia data. Multi-view Video Coding (MVC) is one of the challenging topics in this area and is on its way to standardization via ISO MPEG. In this thesis, a 3-D geometry-based MVC approach is proposed and analyzed in terms of its compression performance. For this purpose, the overall study is partitioned into three parts. The first is dense depth estimation of a view from a fully calibrated multi-view set. The calibration information and smoothness assumptions are used to determine dense correspondences via a Markov Random Field (MRF) model, which is solved by the Belief Propagation (BP) method. In the second part, the estimated dense depth maps are used to generate (predict) arbitrary (other-camera) views of a scene, which is known as novel view generation. A 3-D warping algorithm, followed by an occlusion-compatible hole-filling process, is implemented for this purpose. To suppress occlusion artifacts, an intermediate novel view generation method is developed that fuses two novel views generated from different source views. Finally, the dense depth estimation and intermediate novel view generation tools are used in the proposed H.264-based MVC scheme to remove the spatial redundancies between different views. The performance of the proposed approach is compared against simulcast coding and a recent MVC proposal, which is expected to become the standard recommendation for MPEG in the near future. These results show that geometric approaches in MVC can still be used, especially in certain 3-D applications, in addition to conventional temporal motion compensation techniques, although the rate-distortion performance of geometry-free approaches is superior.


11

Cigla, Cevahir. "Real-time Stereo To Multi-view Video Conversion". Phd thesis, METU, 2012. http://etd.lib.metu.edu.tr/upload/12614513/index.pdf.


Abstract:

A novel and efficient methodology is presented for converting stereo to multi-view video in order to address the 3D content requirements of next-generation 3D-TVs and auto-stereoscopic multi-view displays. There are two main algorithmic blocks in such a conversion system: stereo matching and virtual view rendering, which enable the extraction of 3D information from stereo video and the synthesis of nonexistent virtual views, respectively. In the intermediate steps of these functional blocks, a novel edge-preserving filter is proposed that recursively constructs connected support regions for each pixel among color-wise similar neighboring pixels. The proposed recursive update structure eliminates the pre-defined window dependency of conventional approaches, providing complete content adaptability with quite low computational complexity. Based on extensive tests, the proposed filtering technique yields better or competitive results compared with some leading techniques in the literature. The proposed filter is mainly applied in stereo matching to aggregate cost functions, and it also handles occlusions, which enables high-quality disparity maps for the stereo pairs. Similar to the box filter paradigm, this technique matches arbitrary-shaped regions in constant time. Based on Middlebury benchmarking, the proposed technique is currently the best local matching technique in the literature in terms of both precision and complexity. Next, virtual view synthesis is conducted through depth-image-based rendering, in which the reference color views of the left and right pairs are warped to the desired virtual view using the estimated disparity maps. A feedback mechanism based on disparity error is introduced at this step to remove salient distortions for the sake of visual quality. Furthermore, the proposed edge-aware filter is reused to assign proper texture to holes and occluded regions during view synthesis. The efficiency of the proposed scheme is validated by a real-time implementation on a graphics card that enables parallel computing.
Based on extensive experiments on stereo matching and virtual view rendering, the proposed method yields fast execution, low memory requirements and high-quality outputs, with superior performance compared to most state-of-the-art techniques.
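The warping step of depth-image-based rendering mentioned in this abstract can be sketched for a single scanline. This is a generic illustration, not the thesis's implementation: it assumes rectified views, the convention that a left-view pixel at column x appears at column x - d in the right view, and a hypothetical blend parameter `alpha` for the virtual camera position. Holes are only marked here; the abstract's filter-based hole filling is not reproduced.

```python
HOLE = None  # marker for pixels that receive no source sample


def warp_row(row, disparity, alpha):
    """Forward-warp one scanline of the left view toward a virtual view.

    alpha = 0 reproduces the left view, alpha = 1 approximates the right view.
    When two source pixels land on the same target, the one with the larger
    disparity (the nearer surface) wins, as in a simple z-buffer.
    """
    width = len(row)
    out = [HOLE] * width
    best_disp = [-1.0] * width
    for x in range(width):
        d = disparity[x]
        target = int(round(x - alpha * d))
        if 0 <= target < width and d > best_disp[target]:
            out[target] = row[x]
            best_disp[target] = d
    return out


# A near object (disparity 2) in front of a background (disparity 0).
left_row = [10, 20, 30, 40, 50]
disparity = [0, 0, 2, 2, 0]
virtual_row = warp_row(left_row, disparity, 1.0)
```

In this example the near object shifts two pixels to the left, covers the background there, and leaves a disoccluded gap behind it, which a later hole-filling step would have to paint.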


12

Lawan, Sagir. "Adaptive intra refresh for robust wireless multi-view video". Thesis, Brunel University, 2016. http://bura.brunel.ac.uk/handle/2438/13078.


Abstract:

Mobile wireless communication technology is a fast-developing field, and new mobile communication techniques and means become available every day. In this thesis, multi-view video (MVV) is also referred to as 3D video. The transmission of 3D video signals over wireless links is shaping the telecommunication industry and academia. However, wireless channels are prone to high levels of bit and burst errors that severely degrade the quality of service (QoS). Noise along the wireless transmission path can introduce distortion or cause a compressed bitstream to lose vital information. Errors caused by noise progressively spread to subsequent frames and across multiple views because of prediction. Such errors may compel the receiver to pause momentarily and wait for the next INTRA picture before it can continue decoding. Pausing the video stream affects the user's Quality of Experience (QoE). Thus, an error resilience strategy is needed to protect the compressed bitstream against transmission errors. This thesis focuses on the Adaptive Intra Refresh (AIR) error resilience technique. The AIR method is developed to make compressed 3D video more robust to channel errors. The process involves the periodic injection of intra-coded macroblocks in a cyclic pattern using the H.264/AVC standard. The algorithm takes into account individual features of each macroblock and the feedback information sent by the decoder about the channel condition in order to generate an MVV-AIR map. MVV-AIR map generation regulates the order of packet arrival and identifies the motion activity in each macroblock. Based on the level of motion activity, the MVV-AIR map classifies macroblocks as high or low motion. A proxy MVV-AIR transcoder is used to validate the efficiency of the generated MVV-AIR map. The MVV-AIR transcoding algorithm uses a spatial and view downscaling scheme to convert from MVV to a single view.
Various experimental results indicate that the proposed error resilient MVV-AIR transcoder technique effectively improves the quality of reconstructed 3D video in wireless networks. A comparison of MVV-AIR transcoder algorithm with some traditional error resilience techniques demonstrates that MVV-AIR algorithm performs better in an error prone channel. Results of simulation revealed significant improvements in both objective and subjective qualities. No additional computational complexity emanates from the scheme while the QoS and QoE requirements are still fully met.
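The cyclic part of intra refresh can be shown with a small Python sketch. It covers only the basic cyclic pattern: the motion classification and decoder feedback that make the thesis's scheme adaptive are not modelled, and the function name and parameters are hypothetical.

```python
def intra_refresh_schedule(num_macroblocks, per_frame, num_frames):
    """Return, for each frame, the macroblock indices forced to intra mode.

    A fixed number of macroblocks is refreshed per frame, and the starting
    position advances cyclically so every macroblock is eventually refreshed.
    (Plain cyclic refresh; the adaptive selection is intentionally omitted.)
    """
    schedule = []
    start = 0
    for _ in range(num_frames):
        frame_blocks = [(start + k) % num_macroblocks for k in range(per_frame)]
        schedule.append(frame_blocks)
        start = (start + per_frame) % num_macroblocks
    return schedule


# Six macroblocks, two refreshed per frame: the cycle repeats every 3 frames.
schedule = intra_refresh_schedule(num_macroblocks=6, per_frame=2, num_frames=4)
```

An adaptive variant would, for instance, raise `per_frame` when the decoder reports a worse channel or visit high-motion macroblocks more often; those rules are design choices that this sketch leaves open.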


13

Talebpourazad, Mahsa. "3D-TV Content generation and multi-view video coding". Thesis, University of British Columbia, 2010. http://hdl.handle.net/2429/25949.


Abstract:

The success of the 3D technology and the speed at which it will penetrate the entertainment market will depend on how well the challenges faced by the 3D-broadcasting system are resolved. The three main 3D-broadcasting system components are 3D content generation, 3D video transmission and 3D display. One obvious challenge is the unavailability of a wide variety of 3D content. Thus, besides generating new 3D-format videos, it is equally important to convert existing 2D material to the 3D format. This is because the generation of new 3D content is highly demanding and in most cases, involves post-processing correction algorithms. Another major challenge is that of transmitting a huge amount of data. This problem becomes much more severe in the case of multiview video content. This thesis addresses three aspects of the 3D-broadcasting system challenges. Firstly, the problem of converting 2D acquired video to a 3D format is addressed. Two new and efficient methods were proposed, which exploit the existing relationship between the motion of objects and their distance from the camera, to estimate the depth map of the scene in real-time. These methods can be used at the transmitter and receiver-ends. It is especially advantageous to employ them at the receiver-end since they do not increase the transmission bandwidth requirements. Performance evaluations show that our methods outperform the other existing technique by providing better depth approximation and thus a better 3D visual effect. Secondly, we studied one of the problems caused by unsynchronized zooming in stereo-camera video acquisition. We developed an effective algorithm for correcting unsynchronized zoom in 3D videos. The proposed scheme finds corresponding pairs of pixels between the left and right views and the relationship between them. This relationship is used to estimate the amount of scaling and translation needed to align the views. 
Experimental results show our method produces videos with negligible scale difference and vertical parallax. Lastly, the transmission of 3D-content problem is addressed and two schemes for multiview video coding (MVC) are proposed. While both methods outperform the current MVC standard, one of them introduces significantly less random access delay compared to the MVC standard.
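Estimating a scale and translation from corresponding pixel pairs, as described for the zoom correction above, can be sketched as an ordinary least-squares line fit. This is an assumption about one plausible formulation, not the thesis's algorithm: it fits a single coordinate axis, ignores outliers, and the function name and sample coordinates are hypothetical.

```python
def fit_scale_translation(src, dst):
    """Least-squares fit of dst ≈ s * src + t for 1-D pixel coordinates.

    src, dst: coordinates of corresponding points in the two views.
    Requires at least two distinct src values (otherwise the variance is zero).
    """
    n = len(src)
    mean_src = sum(src) / n
    mean_dst = sum(dst) / n
    var = sum((x - mean_src) ** 2 for x in src)
    cov = sum((x - mean_src) * (y - mean_dst) for x, y in zip(src, dst))
    s = cov / var
    t = mean_dst - s * mean_src
    return s, t


# Made-up matches: the right view is zoomed by 1.1 and shifted by 5 pixels.
left_x = [100.0, 200.0, 300.0]
right_x = [115.0, 225.0, 335.0]
scale, shift = fit_scale_translation(left_x, right_x)
```

Applying the inverse mapping `(x - shift) / scale` to the second view would then bring it back into alignment with the first along that axis; running the same fit on vertical coordinates gives the vertical correction.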


14

Bouyagoub, Samira. "Multi-camera optimisation for view synthesis and video communications". Thesis, University of Bristol, 2010. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.529898.


15

Lee, Yung-Lyul. "Trend of Multi-View Video Coding in Korea (3D AV)". INTELLIGENT MEDIA INTEGRATION NAGOYA UNIVERSITY / COE, 2005. http://hdl.handle.net/2237/10360.


16

Ekmekcioglu, Erhan. "Advanced three-dimensional multi-view video coding and evaluation techniques". Thesis, University of Surrey, 2009. http://epubs.surrey.ac.uk/843601/.


Abstract:

3D video services constitute the next step in multimedia services, as they offer more natural visualisation and provide a sense of "being there". A lot of research effort has been put into realising 3D video services, especially in the context of stereoscopic video. 3D multi-view video is a step beyond stereoscopic video, creating a much wider scene navigation range and improved user interaction, at the cost of a larger source data size. With the aid of the extracted scene geometry and depth information, any arbitrary viewpoint can be reconstructed. Research in multi-view video is not as mature as research in stereoscopic video, although there is a lot of ongoing work towards the realisation of practical multi-view video based 3D video applications. This thesis addresses compression and quality assessment aspects of 3D multi-view video, for reduced bandwidth usage and more reliable evaluation of perceived quality. In the first part of the thesis, efficient compression algorithms for multi-view video with depth information are studied that take several constraints into account. These include the ease of viewpoint scalability and fast viewpoint random access. To be standards conformant, the proposed methods are implemented on the multi-view codec standard. The second part of the thesis studies processing and block-based coding approaches for depth map video sequences, taking into account their special characteristics and effects on the 3D scene reconstruction process. Some state-of-the-art compression approaches, like reduced resolution coding, are extended to exploit scene texture and geometry information for improved performance. The last part of the thesis is devoted to the quality assessment problem for synthesized camera viewpoints, a core element of multi-view based free-viewpoint video applications.
Depth based rendering related aspects are exploited to quantify the objective quality of synthesized scenes in an improved way, by extending the state-of-the-art 2D video quality assessment tools.


17

Pouladzadeh, Parvaneh. "Design and Implementation of Video View Synthesis for the Cloud". Thesis, Université d'Ottawa / University of Ottawa, 2017. http://hdl.handle.net/10393/37048.


Abstract:

In multi-view video applications, view synthesis is a computationally intensive task that needs to be done correctly and efficiently in order to deliver a seamless user experience. In order to provide fast and efficient view synthesis, in this thesis, we present a cloud-based implementation that will be especially beneficial to mobile users whose devices may not be powerful enough for high quality view synthesis. Our proposed implementation balances the view synthesis algorithm’s components across multiple threads and utilizes the computational capacity of modern CPUs for faster and higher quality view synthesis. For arbitrary view generation, we utilize the depth map of the scene from the cameras’ viewpoint and estimate the depth information conceived from the virtual camera. The estimated depth is then used in a backward direction to warp the cameras’ image onto the virtual view. Finally, we use a depth-aided inpainting strategy for the rendering step to reduce the effect of disocclusion regions (holes) and to paint the missing pixels. For our cloud implementation, we employed an automatic scaling feature to offer elasticity in order to adapt the service load according to the fluctuating user demands. Our performance results using 4 multi-view videos over 2 different scenarios show that our proposed system achieves average improvement of 3x speedup, 87% efficiency, and 90% CPU utilization for the parallelizable parts of the algorithm.


18

Yang, Fan. "Integral Video Coding". Thesis, KTH, Kommunikationsteori, 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-162922.


Abstract:

In recent years, 3D camera products and prototypes based on the integral imaging (II) technique have gradually emerged and gained broad attention. II is a method that spatially samples the natural light (light field) of a scene, usually using a microlens array or a camera array, and records the light field using a high-resolution 2D image sensor. The large amount of data generated by II, together with the redundancy it contains, creates the need for an efficient compression scheme. The compression of 3D integral images has been widely researched in recent years. Nevertheless, few approaches have been proposed for the compression of integral videos (IVs). The objective of the thesis is to investigate efficient coding methods for integral videos. The integral video frames used are captured by Lytro, the first consumer light field camera. One of the coding methods encodes the video data directly with an H.265/HEVC encoder. In the other coding schemes, the integral video is first converted to an array of sub-videos with different view perspectives. The sub-videos are then encoded either independently or following a specific reference picture pattern using an MV-HEVC encoder. In this way, the redundancy between the multi-view videos is exploited instead of that between the original elemental images. Moreover, by varying the pattern of the sub-video input array and the number of inter-layer reference pictures, the coding performance can be further improved. Considering the intrinsic properties of the input video sequences, a QP-per-layer scheme is also proposed in this thesis. Although more studies would be required regarding time and complexity constraints for real-time applications, as well as a dramatic increase in the number of views, the methods proposed in this thesis prove to be efficient for the compression of integral videos.
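The conversion of an integral image into an array of sub-views, which this abstract applies frame by frame to obtain sub-videos, can be sketched as a pixel regrouping. The sketch assumes an idealized square microlens grid with `lens_size` x `lens_size` pixels under each lens and no calibration or resampling; real light field captures generally need such preprocessing first, and the function name is hypothetical.

```python
def extract_subviews(image, lens_size):
    """Split a lenslet image into lens_size x lens_size sub-views.

    Sub-view (u, v) collects the pixel at offset (u, v) under every microlens,
    so each sub-view shows the scene from one viewing direction.
    image: 2-D list of pixel values whose dimensions are multiples of lens_size.
    """
    rows = len(image) // lens_size
    cols = len(image[0]) // lens_size
    return {
        (u, v): [
            [image[i * lens_size + u][j * lens_size + v] for j in range(cols)]
            for i in range(rows)
        ]
        for u in range(lens_size)
        for v in range(lens_size)
    }


# A 4x4 toy image with 2x2 pixels per microlens gives four 2x2 sub-views.
toy_image = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
subviews = extract_subviews(toy_image, lens_size=2)
```

Stacking the same sub-view across all frames yields one sub-video per viewing direction, which is the kind of input a multi-view encoder expects.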

19

Cigla, Cevahir. "Dense Depth Map Estimation For Object Segmentation In Multi-view Video". Master's thesis, METU, 2007. http://etd.lib.metu.edu.tr/upload/12608647/index.pdf.

Abstract:

In this thesis, novel approaches for dense depth field estimation and object segmentation from mono, stereo and multiple views are presented. In the first stage, a novel graph-theoretic color segmentation algorithm is proposed, in which the popular Normalized Cuts [6] segmentation algorithm is improved through modifications to its graph structure. Segmentation is obtained by the recursive partitioning of the weighted graph. Simulation results comparing the proposed segmentation scheme with well-known segmentation methods, such as Recursive Shortest Spanning Tree [3], Mean-Shift [4] and the conventional Normalized Cuts, show clear improvements over these traditional methods. The proposed region-based approach is also utilized during the dense depth map estimation step, which is based on a novel modified plane- and angle-sweeping strategy. In the proposed dense depth estimation technique, the whole scene is assumed to be region-wise planar, and 3D models of these plane patches are estimated by a greedy-search algorithm that also considers the visibility constraint. In order to refine the depth maps and relax the planarity assumption of the scene, two refinement techniques, based on region splitting and on pixel-based optimization via Belief Propagation [32], are applied at the final step. Finally, the image segmentation algorithm is extended to object segmentation in multi-view video with additional depth and optical flow information. Optical flow is estimated via two different methods, the KLT tracker and region-based block matching, and the two methods are compared. The experimental results indicate that using depth and motion information improves segmentation performance.
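
As an illustration of the criterion behind recursive graph partitioning, the following minimal sketch computes the standard normalized cut cost of a two-way split of a weighted graph. It shows the conventional criterion only, not the modified graph structure proposed in the thesis; the function name `ncut_value` and the toy weights are hypothetical.

```python
def ncut_value(weights, part_a):
    """Normalized cut cost of splitting a weighted graph into A and its complement.

    weights: symmetric matrix (list of lists) of non-negative edge weights.
    part_a: set of node indices forming segment A.
    Assumes every segment has at least one edge, so no division by zero occurs.
    """
    nodes = range(len(weights))
    part_b = set(nodes) - set(part_a)
    if not part_a or not part_b:
        raise ValueError("both segments must be non-empty")
    # Total weight of edges crossing the split.
    cut = sum(weights[i][j] for i in part_a for j in part_b)
    # Total weight of edges from each segment to all nodes.
    assoc_a = sum(weights[i][j] for i in part_a for j in nodes)
    assoc_b = sum(weights[i][j] for i in part_b for j in nodes)
    return cut / assoc_a + cut / assoc_b


# Toy graph (hypothetical): two tightly linked pairs joined by one weak edge.
toy_weights = [
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
]
balanced_cost = ncut_value(toy_weights, {0, 1})
lopsided_cost = ncut_value(toy_weights, {0})
```

On this toy graph the split that cuts only the weak edge has the lower cost, which is the behaviour a recursive partitioning scheme relies on when choosing where to divide a region.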

20

Hany, Hanafy Mahmoud Said. "Low bitrate multi-view video coding based on H.264/AVC". Thesis, Staffordshire University, 2015. http://eprints.staffs.ac.uk/2206/.

Abstract:

Multi-view Video Coding (MVC) is vital for low bitrate applications that have constraints in bandwidth, battery capacity and memory size. Symmetric and mixed spatial-resolution coding approaches are addressed in this thesis, where Prediction Architecture (PA) is investigated using block matching statistics. The impact of camera separation is studied for symmetric coding to define a criterion for the best usage of MVC. Visual enhancement is studied for mixed spatial-resolution coding to improve the visual quality of the interpolated frames by utilising the information derived from disparity compensation. In the symmetric coding investigations, camera separation is found not to be a sufficient criterion for selecting a suitable coding solution for a given video. Prediction architectures are proposed; an MVC codec that uses these architectures has higher coding performance than the corresponding codec that deploys a set of other prediction architectures, with a coding gain of up to 2.3 dB. An Adaptive Reference Frame Ordering (ARFO) algorithm is proposed that saves up to 6.2% in bits compared to static reference frame ordering when coding a sequence that contains hard scene changes. In the mixed spatial-resolution coding investigations, a new PA is proposed that saves 13.1 Kbps compared to the corresponding codec that uses the extended architecture based on 3D-digital multimedia. The codec that uses the hierarchical B-picture PA has higher coding efficiency than the corresponding codec that employs the proposed PA, with a bitrate saving of 24.9 Kbps. The ARFO algorithm has been integrated with the proposed PA, where it saves up to 35.4 Kbps compared to the corresponding codec that uses other prediction architectures. A visual enhancement algorithm is proposed and integrated within the presented PA. It provides the highest quality improvement for the interpolated frames, with a coding gain of up to 0.9 dB compared to the corresponding frames coded by other prediction architectures.

21

Mohib, Hamdullah. "End-to-end 3D video communication over heterogeneous networks". Thesis, Brunel University, 2014. http://bura.brunel.ac.uk/handle/2438/8293.

Abstract:

Three-dimensional technology, more commonly referred to as 3D technology, has revolutionised many fields, including entertainment, medicine, and communications, to name a few. In addition to 3D films, games, and sports channels, 3D perception has made tele-medicine a reality. Consumer electronics manufacturers predict that, by the year 2015, 30% of all HD panels at home will be 3D-enabled. Stereoscopic cameras, a mature technology compared to other 3D systems, are now being used by ordinary citizens to produce 3D content and share it at the click of a button, just as they do with 2D content via sites like YouTube. But technical challenges still exist, including with autostereoscopic multiview displays. Because of its increased amount of data, 3D content requires many complex considerations for transmission or storage, including how to represent it and which compression format is best. Any decision must be taken in the light of the available bandwidth or storage capacity, quality and user expectations. Free viewpoint navigation also remains partly unsolved. The most pressing issue standing in the way of widespread uptake of consumer 3D systems is the ability to deliver 3D content to heterogeneous consumer displays over heterogeneous networks. Optimising 3D video communication solutions must consider the entire pipeline, from optimisation at the video source, through transmission, to the end display. Multi-view offers the most compelling solution for 3D video, providing motion parallax and freedom from wearing headgear for 3D video perception. Optimising multi-view video for delivery and display could increase the demand for true 3D in the consumer market. This thesis focuses on end-to-end quality optimisation in 3D video communication/transmission, offering solutions for optimisation at the compression, transmission, and decoder levels.

22

Andersson, Håkan. "3D Video Playback : A modular cross-platform GPU-based approach for flexible multi-view 3D video rendering". Thesis, Mittuniversitetet, Institutionen för informationsteknologi och medier, 2010. http://urn.kb.se/resolve?urn=urn:nbn:se:miun:diva-12389.

Abstract:

The evolution of depth-perception visualization technologies, emerging format standardization work, and research within the field of multi-view 3D video and imagery create a need for flexible 3D video visualization. The wide variety of available 3D-display types and visualization techniques for multi-view video, as well as the high throughput requirements for high definition video, call for a real-time 3D video playback solution that takes advantage of hardware accelerated graphics while providing a high degree of flexibility through format configuration and cross-platform interoperability. A modular, component-based software solution is proposed that uses FFmpeg for video demultiplexing and decoding, OpenGL and GLUT for hardware accelerated graphics, and POSIX threads for increased CPU utilization. The solution has been verified to have sufficient throughput to display 1080p video at the native video frame rate on the experimental system, which is considered a standard high-end desktop PC using only commercial hardware. To evaluate the performance of the proposed solution, a number of throughput evaluation metrics have been introduced, measuring average frame rate as a function of video bit rate, video resolution and number of views. The results indicate that the GPU constitutes the primary bottleneck in a multi-view lenticular rendering system and that multi-view rendering performance degrades as the number of views increases. This is a result of current GPU square-matrix texture cache architectures, which lead to texture lookup access times that follow random memory access patterns when the number of views is high. The proposed solution has been found to have low CPU efficiency, i.e. low CPU hardware utilization, and it is recommended to increase performance by investigating the gains of scalable multithreading techniques. It is also recommended to investigate the gains of introducing video frame buffering in video memory or of moving more calculations to the CPU in order to increase GPU performance.

23

Yamaguchi, Tatsuhisa. "3D Video Capture of a Moving Object in a Wide Area Using Active Cameras". 京都大学 (Kyoto University), 2013. http://hdl.handle.net/2433/180466.

24

Su, Tianyu. "An Architecture for 3D Multi-view video Transmission based on Dynamic Adaptive Streaming over HTTP (DASH)". Thesis, Université d'Ottawa / University of Ottawa, 2015. http://hdl.handle.net/10393/32505.

Abstract:

Recent advances in cameras and image processing technology have generated a paradigm shift from traditional 2D and 3D video to Multi-view Video (MVV) technology, while at the same time improving video quality and compression through standards such as High Efficiency Video Coding (HEVC). In multi-view video, cameras are placed in predetermined positions to capture the scene from various views. Delivering such views with high quality over the Internet is challenging: MVV consists of multiple video sequences, each captured from a different angle, so its traffic is several times larger than that of traditional video and it requires more bandwidth than single-view video. Also, the Internet is prone to packet loss, delay, and bandwidth variation, which adversely affect MVV transmission. Another challenge is that end users’ devices have different capabilities in terms of computing power, display, and access link capacity, requiring MVV to be adapted to each user’s context. In this paper, we propose an HEVC multi-view system using Dynamic Adaptive Streaming over HTTP (DASH) to overcome the above-mentioned challenges. Our system uses an adaptive mechanism to adjust the video bitrate to bandwidth variations in best-effort networks. We also propose a novel way to scale Multi-view Video and Depth (MVD) content for 3D video in terms of the number of transmitted views. Our objective measurements show that our method of transmitting MVV content can maximize the perceptual quality of virtual views after rendering and hence increase the user’s quality of experience.
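
The idea of adjusting the video bitrate to bandwidth variations can be sketched with a generic throughput-based rule of the kind commonly used by DASH clients. The abstract does not describe the adaptation mechanism of the thesis, so this is an assumption for illustration only; the function names, the smoothing factor `alpha`, the `safety` margin and the bitrate ladder are all hypothetical.

```python
def estimate_throughput(samples_kbps, alpha=0.8):
    """Smooth measured segment throughputs with an exponential moving average."""
    estimate = samples_kbps[0]
    for sample in samples_kbps[1:]:
        estimate = alpha * estimate + (1 - alpha) * sample
    return estimate


def select_bitrate(available_kbps, throughput_kbps, safety=0.9):
    """Pick the highest representation bitrate that fits the bandwidth budget.

    Falls back to the lowest representation when none fits.
    """
    budget = safety * throughput_kbps
    candidates = [b for b in available_kbps if b <= budget]
    return max(candidates) if candidates else min(available_kbps)


# Hypothetical bitrate ladder (kbps) and throughput measurements (kbps).
ladder = [500, 1000, 2000, 4000]
smoothed = estimate_throughput([1000, 2000], alpha=0.5)
chosen = select_bitrate(ladder, 2500)
fallback = select_bitrate(ladder, 100)
```

A multi-view client could apply the same rule per segment and, in addition, reduce the number of requested views when even the lowest representation exceeds the budget; that extension is not shown here.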

25

Danielsen, Eivind. "An exploration of user needs and experiences towards an interactive multi-view video presentation". Thesis, Norwegian University of Science and Technology, Department of Electronics and Telecommunications, 2009. http://urn.kb.se/resolve?urn=urn:nbn:no:ntnu:diva-8997.

Abstract:

Following a literature review of multi-view video technologies, this work focused on a multi-view video presentation in which the user receives multiple video streams and can freely switch between them. User interaction was considered a key function of this system. The goal was to explore user needs and expectations towards an interactive multi-view video presentation. A multi-view video player was implemented according to specifications derived from possible scenarios and from user needs and expectations gathered through an online survey. The media player was written in Objective-C with Cocoa and was developed using the Xcode integrated development environment and the Interface Builder graphical user interface tool. The media player was built around QuickTime's QTKit framework. A plugin, Perian, added extra media format support to QuickTime. The results from the online survey show that only a minority of respondents have experience with such a multi-view video presentation. However, those who had tried multi-view video are positive towards it. The usage of the system is strongly dependent on content, which should be highly entertainment- and action-oriented. Users who took part in the test of the multi-view video player considered view switching a key feature. This feature provides a more interactive application and more satisfied users when the content is suitable for multi-view video. Rearranging and hiding views also contributed to a positive viewing experience. It is important to note, however, that these results are not sufficient to fully characterise user needs and expectations towards an interactive multi-view video presentation.

26

Fecker, Ulrich. "Coding Techniques for Multi-View Video Signals : Verfahren zur Codierung von Mehrkamera-Videosignalen". Aachen : Shaker, 2009. http://d-nb.info/1156517311/34.

27

Gale, Nicholas C. "FUSION OF VIDEO AND MULTI-WAVEFORM FMCW RADAR FOR TRAFFIC SURVEILLANCE". Wright State University / OhioLINK, 2011. http://rave.ohiolink.edu/etdc/view?acc_num=wright1315857639.

28

Fülöp-Balogh, Beatrix-Emőke. "Acquisition multi-vues et rendu de scènes animées". Thesis, Lyon, 2021. http://www.theses.fr/2021LYSE1308.

Abstract:

Recent technological breakthroughs have led to an abundance of consumer-friendly video recording devices. Nowadays new smartphone models, for instance, are equipped not only with multiple cameras, but also with depth sensors. This means that any event can easily be captured by several different devices and technologies at the same time, which raises questions about how to process the data in order to render a meaningful 3D scene. Most current solutions focus on static scenes only: LiDAR scanners produce extremely accurate depth maps, and multi-view stereo algorithms can reconstruct a scene in 3D from a handful of images. However, these ideas are not directly applicable to dynamic scenes. Depth sensors trade accuracy for speed, or vice versa, and color image based methods suffer from temporal inconsistencies or are too computationally demanding. In this thesis we aim to provide consumer-friendly solutions that fuse multiple, possibly heterogeneous, technologies to reconstruct and render 3D dynamic scenes. First, we introduce an algorithm that corrects distortions produced by small motions in time-of-flight acquisitions and outputs a corrected animated sequence. We do so by combining a slow but high-resolution time-of-flight LiDAR system and a fast but low-resolution consumer depth sensor. We cast the problem as a curve-to-volume registration, by seeing the LiDAR point cloud as a curve in 4-dimensional spacetime and the captured low-resolution depth video as a 4-dimensional spacetime volume. We then advect the details of the high-resolution point cloud to the depth video using its optical flow. Second, we tackle the reconstruction and rendering of dynamic scenes captured by multiple RGB cameras. In casual settings, the two problems are hard to merge: structure from motion (SfM) produces spatio-temporally unstable and sparse point clouds, while the rendering algorithms that rely on the reconstruction need to produce temporally consistent videos. To ease the challenge, we consider the two steps together. First, for SfM, we recover stable camera poses, then we defer the requirement for temporally consistent points across the scene and reconstruct only a sparse point cloud per timestep that is noisy in space-time. Second, for rendering, we present a variational diffusion formulation on depths and colors that lets us robustly cope with the noise by enforcing spatio-temporal consistency via per-pixel reprojection weights derived from the input views. Overall, our work contributes to the understanding of the acquisition and rendering of casually captured dynamic scenes.

29

Ding, Sihao. "Multi-Perspective Image and Video Processing for Human-Machine Interaction". The Ohio State University, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=osu1488462115943949.

30

Thonat, Théo. "Complétion d'image, segmentation et mixture de vidéos dans un contexte multi-vue, pour un rendu basé image plus polyvalent". Thesis, Université Côte d'Azur (ComUE), 2019. http://theses.univ-cotedazur.fr/2019AZUR4047.

Abstract:

Creating realistic images with the traditional rendering pipeline requires tedious work, starting with complex manual work to create 3D models, materials, and lighting, followed by computationally expensive realistic rendering. Such a process requires both skilled artists and significant computing power. Image Based Rendering (IBR) is an alternative way to create high quality content using only an unstructured set of photos as input. IBR allows casual users to create and render realistic and immersive scenes in real time, for applications such as virtual tourism, cultural heritage, interactive mapping, urban and architecture planning, and movie production. Existing IBR methods generally produce good image quality, but still suffer from limitations. First, many types of scene content produce visually unappealing rendering artifacts because the underlying scene representation is insufficient, e.g., for reflective surfaces, thin structures, and dynamic content. Second, scenes are often captured under real-world constraints and require editing to meet user requirements, yet existing IBR methods do not allow this. To address editing, we propose to extend single image inpainting to allow sparse multi-view object removal. Such inpainting requires hallucinating both color and geometry behind the object to be removed in a multi-view coherent fashion. Our method reduces rendering artifacts by removing objects which are not well represented by IBR methods or by moving well represented objects in the scene. To address rendering quality, we enlarge the scope of casual IBR in two different ways. First, we deal with the case of thin structures, which are extremely challenging for multi-view 3D reconstruction and represent a major limitation for IBR in an urban context. We propose a pipeline which locates and renders thin structures supported by simple surfaces. We introduce both a multi-view segmentation algorithm for thin structures and a rendering method which extends traditional IBR with transparency information. Second, we propose an approach to extend IBR to dynamic content. By focusing on time-dependent stochastic textures, we preserve both the casual capture setup and the free-viewpoint navigation of the rendered scene. Our key insight is to use a video representation which is adapted to video looping and spatio-temporal blending. Our results for all methods show improved visual quality compared to previous solutions on a variety of input scenes.

31

Kulasekera, Sunera C. "Multiplierless DFT, DCT Approximations for Multi-Beam RF Aperture and HEVC HD Video Applications: Digital Systems Implementation". University of Akron / OhioLINK, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=akron1454023102.

32

Powers, Jennifer Ann. ""Designing" in the 21st century English language arts classroom processes and influences in creating multimodal video narratives /". [Kent, Ohio] : Kent State University, 2007. http://rave.ohiolink.edu/etdc/view?acc%5Fnum=kent1194639677.

Abstract:

Thesis (Ph.D.)--Kent State University, 2007.
Title from PDF t.p. (viewed Mar. 31, 2008). Advisor: David Bruce. Keywords: multiliteracies, multi-modal literacies, language arts education, secondary education, video composition. Includes survey instrument. Includes bibliographical references (p. 169-179).

33

Bartoli, Simone. "Deploying deep learning for 3D reconstruction from monocular video sequences". Master's thesis, Alma Mater Studiorum - Università di Bologna, 2021. http://amslaurea.unibo.it/22402/.

Abstract:

3D reconstruction from monocular video sequences is a field of increasing interest in recent years. Before the growth of deep learning, retrieving depth information from single images was possible only with RGBD sensors or algorithmic approaches. However, the availability of more and more data has allowed the training of monocular depth estimation neural networks, introducing innovative data-driven techniques. Since recovering ground-truth labels for depth estimation is very challenging, most research has focused on unsupervised or semi-supervised training approaches. The current state of the art for 3D reconstruction is defined by an algorithmic method which exploits a Structure from Motion and Multi-View Stereo pipeline. Nevertheless, the whole approach is based on keypoint extraction, which has well-known limitations when it comes to texture-less, reflective and/or transparent surfaces. Consequently, a possible way to predict dense depth maps even in the absence of keypoints is to employ neural networks. This work proposes a novel data-driven pipeline for 3D reconstruction from monocular video sequences. It exploits a fine-tuning technique to adjust the weights of a pre-trained depth estimation neural network depending on the input scene. In doing so, the network can learn the features of a particular object and can provide semi-real-time depth predictions for 3D reconstruction. Furthermore, the project provides a comparison with a custom implementation of the current state-of-the-art approach and shows the potential of this innovative data-driven pipeline.

34

Mora, Elie-Gabriel. "Codage multi-vues multi-profondeur pour de nouveaux services multimédia". Thesis, Paris, ENST, 2014. http://www.theses.fr/2014ENST0007/document.

Abstract:

This PhD thesis deals with improving the coding efficiency of 3D-HEVC. We propose both constrained approaches aimed at standardization and more innovative approaches based on optical flow. In the constrained approaches category, we first propose a method that predicts the depth Intra modes from those of the texture. The inheritance is driven by a criterion measuring how closely the two are expected to match. Second, we propose two simple ways to improve inter-view motion prediction in 3D-HEVC. The first adds an inter-view disparity vector candidate to the Merge list and the second modifies the derivation process of this disparity vector. Third, an inter-component tool is proposed in which the link between the texture and depth quadtree structures is exploited to save both runtime and bits through joint coding of the quadtrees. In the more innovative approaches category, we propose two methods that are based on dense motion vector field estimation using optical flow. The first computes such a field on a reconstructed base view. The field is then warped to a dependent view, where it is inserted as a dense candidate in the Merge list of the prediction units in that view. The second method improves the view synthesis process: four fields are computed at the left and right reference views using a past and a future temporal reference. These are then warped to the synthesized view and corrected using an epipolar constraint. The four corresponding predictions are then blended together. Both methods bring significant coding gains, which confirms the potential of such innovative solutions.


35

Bosc, Emilie. "Compression des données Multi-View-plus-Depth (MVD) : De l'analyse de la qualité perçue à l'élaboration d'outils pour le codage des données MVD". Phd thesis, INSA de Rennes, 2012. http://tel.archives-ouvertes.fr/tel-00777710.

Abstract:

This thesis addresses the compression of multi-view video, guided throughout by a concern for human perception of the medium, in the context of 3D video. The studies and choices made in this thesis are driven by the search for the best possible perceived quality of synthesized views. The challenge of this work lies in investigating new compression techniques for multi-view-plus-depth (MVD) data that limit, as far as possible, the perceptible degradations in views synthesized from the decoded data. The difficulty is that the sources of degradation in synthesized views are both numerous and hard to measure with current image quality assessment techniques. For this reason, the work is organized around two main axes: assessing the quality of synthesized views and their specific artifacts, and studying MVD compression schemes aided by perceptual criteria. During this thesis we carried out studies to characterize the artifacts related to DIBR algorithms. Analyses of Student's t-tests performed on the scores of paired-comparison and ACR-HR tests made it possible to determine how suitable subjective quality assessment methods are for synthesized views. The evaluation of objective image/video quality metrics also made it possible to establish their correlation with the subjective scores. We then focused on depth map compression, proposing two derived methods for depth map coding based on the LAR method. Building on our observations, we proposed a representation and coding strategy adapted to the need to preserve the discontinuities of the map while achieving high compression ratios.
Comparisons with state-of-the-art codecs (H.264/AVC, HEVC) show that our method provides images of better visual quality at low bit rates. We also carried out studies on the allocation of bit rate between texture and depth when compressing MVD sequences. The results of this thesis can be used to help design new quality assessment protocols for synthesized data; to design new quality metrics; to improve coding schemes for MVD data, notably through the original approaches proposed; and to optimize MVD coding schemes, based on our studies of the relationship between texture and depth.


36

Hossain, Md Amjad. "DESIGN OF CROWD-SCALE MULTI-PARTY TELEPRESENCE SYSTEM WITH DISTRIBUTED MULTIPOINT CONTROL UNIT BASED ON PEER TO PEER NETWORK". Kent State University / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=kent1606570495229229.


37

Ezzo, Anthony John. "Using typography and iconography to express emotion (or meaning) in motion graphics as a learning tool for ESL (English as a second language) in a multi-device platform". Kent State University / OhioLINK, 2016. http://rave.ohiolink.edu/etdc/view?acc_num=kent1460146374.


38

魏震豪. "Multi-view video synthesis from stereo videos with iterative depth refinement". Thesis, 2013. http://ndltd.ncl.edu.tw/handle/77312469437977067092.

Abstract:

Master's thesis
National Tsing Hua University
Department of Computer Science
101
In this thesis, we propose a novel algorithm that refines depth maps and generates multi-view video sequences from two-view video sequences for modern autostereoscopic displays. High-quality depth maps are critical to generating realistic content for virtual views, so refining the depth maps is the main challenge in this task. We propose an iterative depth refinement algorithm, consisting of error detection and error correction, to correct errors in the depth map. Errors are classified into across-view color-depth-inconsistency errors and local color-depth-inconsistency errors, and the erroneous pixels are corrected by sampling local candidates. Next, we apply a trilateral filter whose weights combine intensity, spatial, and temporal terms to enhance temporal and spatial consistency across frames. The virtual views are then synthesized from the refined depth maps. To combine the two warped images, disparity-based view interpolation is introduced to alleviate translucent artifacts, and a directional filter is applied to reduce aliasing around object boundaries. In this way, high-quality virtual views between the two input views are generated. Experiments on benchmark image and video datasets demonstrate that the virtual views synthesized by the proposed algorithm have superior image quality compared with state-of-the-art view synthesis methods.


39

Dai, Yu-Chia, and 戴佑家. "Automatic Alignment of Multi-View Event Videos by Fast Sequence Matching". Thesis, 2011. http://ndltd.ncl.edu.tw/handle/10220095380663952412.

Abstract:

Master's thesis
National Taiwan University
Graduate Institute of Computer Science and Information Engineering
99
The wide availability of digital video capture devices and the growing diversity of social video sharing sites make sharing and searching videos easy. Multi-view event videos provide diverse visual content and different audio information for the same event. Compared with a single-view video, users prefer more diverse and comprehensive views (video segments) of the same event, so aligning multi-view event videos is becoming increasingly important. The task is challenging because the visual appearance of a scene differs markedly from one view to another. It has previously been solved using audio, but audio is not always available. In this work, we investigate the effect of different visual features and focus on regions of interest. Moreover, we propose a time-sensitive dynamic time warping algorithm that takes the temporal factor into consideration. In addition, LSH indexing reduces the computational cost and improves time efficiency. Experimental results show that the proposed method provides an efficient way to align videos and yields robust matching results.
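
As an illustration only, dynamic time warping with a temporal term might be sketched as below. The abstract does not say how the temporal factor enters the cost, so the `time_weight * |i - j|` penalty, the scalar per-frame features, and the function name are assumptions made for this sketch, not the thesis's formulation.

```python
from math import inf


def time_sensitive_dtw(a, b, time_weight=0.0):
    """Cumulative DTW cost between two sequences of scalar frame features.

    The extra term time_weight * |i - j| is a hypothetical way to penalize
    alignments that drift far from the diagonal; with time_weight == 0 this
    is plain DTW.
    """
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1]) + time_weight * abs(i - j)
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats a frame
                cost[i][j - 1],      # b advances, a repeats a frame
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]
```

With `time_weight=0` a repeated frame in one sequence costs nothing; raising the weight makes the same stretch progressively more expensive.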


40

Lee, Ji-Tang, and 李繼唐. "Efficient Caching for Multi-view 3D Videos with Depth-Image-Based Rendering". Thesis, 2016. http://ndltd.ncl.edu.tw/handle/02216570548253704949.

Abstract:

Master's thesis
National Taiwan University
Graduate Institute of Communication Engineering
104
With the emergence of mobile 3D and VR devices, multi-view 3D videos are expected to play an increasingly important role in the near future. A multi-view 3D video is expected to require more storage space than a traditional single-view video. Nevertheless, efficient caching of multi-view 3D videos in a proxy has not been explored in the literature. In this thesis, we first observe that the storage space can be effectively reduced by leveraging Depth Image Based Rendering (DIBR) in multi-view 3D video. We then formulate a new cache replacement problem, named View Selection and Cache Operation (VSCO), and find the optimal policy using a Markov Decision Process. In addition, we devise an efficient and effective algorithm, named Efficient View Exploration Algorithm (EVEA), to solve large instances of the problem. Simulation results show that the proposed algorithm significantly improves the cache hit rate and reduces the total cost compared with well-known existing cache replacement algorithms.
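
The storage saving from DIBR can be illustrated with a small sketch: a requested view counts as a hit if it is cached, or if a cached view on each side lies close enough for it to be synthesized. The integer view indices, the `max_span` limit on the distance between the two reference views, and the function name are assumptions for illustration; the abstract does not specify the synthesis constraint.

```python
def can_serve(view, cached, max_span):
    """Return True if `view` can be delivered from the cached view set.

    A view is served directly when cached. Otherwise it is assumed
    (hypothetically) to be synthesizable by DIBR when the nearest cached
    views to its left and right are at most `max_span` indices apart.
    """
    if view in cached:
        return True
    lefts = [v for v in cached if v < view]
    rights = [v for v in cached if v > view]
    if not lefts or not rights:
        return False  # DIBR here needs a reference on both sides
    return min(rights) - max(lefts) <= max_span
```

Under these assumptions, caching views 1, 3, and 5 with `max_span=2` is enough to serve all of views 1 to 5, which is the kind of saving the abstract refers to.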


41

GOEL, YUVRAJ. "3D VIDEO CODING". Thesis, 2011. http://dspace.dtu.ac.in:8080/jspui/handle/repository/13874.

Abstract:

M.TECH
Interest in 3DTV has increased recently, with more and more products and services becoming available for the consumer market. 3D video is an emerging trend in the development of digital video systems. Three-dimensional multi-view video is typically obtained from a set of synchronized cameras that capture the same scene from different viewpoints. The video (texture) plus depth (V+D) representation is an interesting method for realizing 3D video. A depth map is simply a grayscale image that encodes the distance between a pixel and the camera. However, a major problem when dealing with multi-view video is the intrinsically large amount of data to be compressed, decompressed, and rendered. We extend the standard H.264/MPEG-4 MVC to handle the compression of multi-view video. An algorithm is implemented to compress the data in which, instead of separate bit-streams for depth and texture, only one bit-stream for texture (also containing depth data) is produced. As opposed to the Multi-view Video Coding (MVC) standard, which encodes only the multi-view texture data, the proposed algorithm compresses both the texture and the depth multi-view sequences. The proposed extension is based on exploiting the correlation between the multiple camera views. The goal of this thesis is to establish an efficient method for encoding depth information along with multiple but limited numbers of views. The software used is JMVC (Joint Multi-view Video Coding) jmvc8.0, which is open-source software for the Multi-view Video Coding (MVC) project of the Joint Video Team (JVT) of the ISO/IEC Moving Pictures Experts Group (MPEG) and the ITU-T Video Coding Experts Group (VCEG).
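
To make the grayscale depth map concrete, the sketch below maps a metric depth to an 8-bit gray level using the inverse-depth quantization commonly used for video-plus-depth content (near is bright, far is dark). The abstract does not state which mapping the thesis uses, so the formula choice, the function name, and the `z_near`/`z_far` parameters are assumptions.

```python
def depth_to_gray(z, z_near, z_far):
    """Quantize depth z to an 8-bit gray level (255 = nearest, 0 = farthest).

    Uses inverse-depth quantization, which spends more gray levels on
    nearby objects. z is clamped to [z_near, z_far].
    """
    if not 0 < z_near < z_far:
        raise ValueError("expected 0 < z_near < z_far")
    z = min(max(z, z_near), z_far)  # clamp to the valid depth range
    scale = (1.0 / z - 1.0 / z_far) / (1.0 / z_near - 1.0 / z_far)
    return round(255 * scale)
```

For example, with `z_near=1.0` and `z_far=10.0`, a point at depth 2.0 already falls below mid-gray, showing how the mapping favors resolution close to the camera.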


42

Szu-Hua Wu and 吳思樺. "A Parallax Adjustable Multi-view Rendering System based on Stereoscopic Videos with Depth Information". Thesis, 2013. http://ndltd.ncl.edu.tw/handle/61927844729721099451.

Abstract:

Master's thesis
National Cheng Kung University
Institute of Computer and Communication Engineering
101
With the development of 3D techniques, related products have become available in recent years. However, traditional 3D television requires viewers to wear polarized, shutter, or red-cyan glasses to perceive the stereo effect, which is inconvenient. Therefore, to broaden the application area of 3D techniques, naked-eye stereoscopic displays are the trend of future development. In addition, a multiview display system can provide different viewing angles from which to perceive stereo vision. This thesis focuses on a parallax-adjustable multiview rendering system. With richer view information, it can provide better rendering results than a system based on one view plus one depth map. The proposed stereo-based direct single-image multiview rendering algorithm builds on the traditional depth-image-based rendering (DIBR) algorithm and directly renders the output image with multiview information. The proposed stereo-based parallax-adjustable multiview rendering system is implemented on a GPU to reduce rendering time.


43

Huang, Yao-Ching, and 黃耀慶. "Summarization of Multi-view Surveillance Videos by an Object-Based Key Frame Extraction Method". Thesis, 2011. http://ndltd.ncl.edu.tw/handle/06942591994137463963.

Abstract:

Master's thesis
Fu Jen Catholic University
Department of Electrical Engineering
99
Video summarization is an important technique that has attracted interest in many research fields; it generates a short summary of a video to present to users for browsing and navigation. Multi-view summarization is also beneficial to video surveillance, because the many cameras installed across large public security areas produce a huge amount of unimportant information that must be filtered. In this paper, we propose a multi-view video summarization approach that extracts semantic-level key frames from multiple cameras using object information. Our main goal is to avoid redundant key frames in multi-view videos; to this end, we present dominant camera selection to decentralize the key frame extraction approach. The proposed approach is a new formulation that integrates a camera selection algorithm into key frame extraction for optimization. It has been verified on a large video dataset covering different surveillance scenes and compared with other camera selection methods. Experiments show that the method not only extracts representative key frames but also reduces redundant key frames in multi-view videos.


44

Tung Hsiao and 蕭桐. "Improved Depth Upsampling and Multi-view Generation for Depacking Centralized Texture Depth Depacked 3D Videos". Thesis, 2019. http://ndltd.ncl.edu.tw/handle/rt8jj7.


45

Huang, Xin-Xian, and 黃信憲. "Efficient Multi-view Video Coding Algorithm Using Inter-View Information". Thesis, 2012. http://ndltd.ncl.edu.tw/handle/zma67u.

Abstract:

Master's thesis
National Dong Hwa University
Department of Electrical Engineering
100
Multi-view video coding (MVC) is an extension of H.264/AVC that improves the coding efficiency of multi-view video. However, MVC has much higher computational complexity than single-view video coding, so improving coding efficiency is an important issue for the many applications of multi-view video. This thesis proposes a fast mode decision algorithm that addresses this computational complexity by deciding the mode partition early. We use the best mode partition in the reference views to determine the complexity of the macroblock in the current view, and the candidate modes that need to be evaluated are then obtained according to that complexity. If the complexity is low or medium, the search range can be reduced. The rate-distortion cost threshold for the current macroblock is calculated from the costs of the co-located and neighboring macroblocks in the previously coded view and is used as the criterion for early termination. The motion vector difference with respect to the co-located macroblock in the reference view is used to adaptively adjust the search range for the current macroblock. Experimental results verify that the proposed algorithm achieves time savings of 81.04% and 92.34% for the fast TZ search and full search, respectively, while maintaining good quality and bit-rate performance.


46

Hu, Kai-Wen, and 胡凱文. "Multicast Multi-View 3D Video over WLANs". Thesis, 2017. http://ndltd.ncl.edu.tw/handle/zt7d57.

Abstract:

Master's thesis
National Taiwan University
Graduate Institute of Computer Science and Information Engineering
105
In recent years, many video service providers have started to offer 3D video content, and a new service called multi-view 3D video has arisen. Multi-view 3D video lets users choose among multiple viewpoints and offers a more immersive experience than traditional single-view 3D video. However, transmitting all views of a multi-view 3D video consumes significant bandwidth. Another interesting and well-developed technology is the Wireless Local Area Network (WLAN), which can support efficient multicast streaming over limited wireless bandwidth. However, multicasting to a set of heterogeneous users over multiple wireless Access Points (APs) is also a complicated problem. In this thesis, we want to solve the problem of multicasting multi-view 3D video in WLANs. We exploit Depth Image Based Rendering (DIBR) to synthesize a user's subscribed view from nearby left and right views. The problem is therefore to decide the user-to-AP association and the view sessions to be multicast from each AP, with the aim of maximizing the number of satisfied users under a bandwidth constraint. We propose an efficient algorithm to solve this problem, and simulation results show that the algorithm can further take view synthesis and inter-AP coordination into account and effectively satisfy most user demands.


47

Wang, Poching, and 王柏青. "Morphing-based View Synthesis without Depth for Multi-view Video Players". Thesis, 2012. http://ndltd.ncl.edu.tw/handle/03497455498882364187.

Abstract:

Master's thesis
National Chung Cheng University
Graduate Institute of Computer Science and Information Engineering
100
In this thesis, we present a morphing-based view synthesis approach. With the proposed algorithm, users can use two cameras to generate virtual views from different viewpoints. First, we use the SIFT algorithm to detect and extract feature points. We then use the correspondences and the normalized direct linear transformation to solve for the parameters of the multi-view geometry. After the morphed view is obtained, the distortion in the initial virtual view is eliminated in the Image Re-projection and Repairing stage. The method does not need calibration; instead, it relies on multiple view geometry to achieve this effect. Moreover, since the proposed method is morphing-based, no model needs to be built. The method can be applied in various applications, including digital photo frames, multi-angle displays, and 360-degree street-level imagery.


48

Mahapatra, Ansuman. "Framework and Algorithms for Multi-view Video Synopsis". Thesis, 2018. http://ethesis.nitrkl.ac.in/9442/1/2018_PhD_AMahapatra_511CS108_Framework.pdf.

Abstract:

Summary or synopsis production for videos shot with a single static camera has been well studied over the last decade. It has numerous applications, especially in the surveillance business. With the advent of multi-camera networks (MCN), new challenges have surfaced before the research community, and synopsis generation is one of them. In an MCN, a scene is recorded from multiple angles, i.e. the network of cameras records multi-view videos. Applying single-video synopsis generation methods to each view of an MCN would not only lead to redundancy but also make the synopses cumbersome to comprehend. Besides, the background of each camera view is different, making it difficult to bring all views under a single view. Furthermore, the coherence among the multiple views is another issue that demands special attention when generating a single synopsis. This doctoral research focuses on developing a framework that generates a single synopsis of multi-view videos. Alongside the framework, various methods are proposed that help in synopsis generation. The methods are grouped into three categories: pre-processing, synopsis generation, and post-processing. Some of the pre-processing methods are adapted from existing literature, while the rest are proposed. The framework uses the top view of the surveillance site as the common ground plane, onto which objects detected in different views are mapped through a homographic technique. The mapped object locations are clustered spatially and then temporally, adapting a density-based clustering algorithm to form the track of each object. An action recognition module is also used in the framework to recognize the objects' actions and prioritize them, so that objects performing important actions can be included in the synopsis, leading to the omission of trivial content and a reduction in synopsis length.
Two more methods are suggested in the framework: an interaction detection method makes the generated synopsis richer in information, and a collision detection method helps exclude colliding tracks from the generated synopsis. The generation of a multi-view video synopsis is modelled as a scheduling problem. Two sets of solutions are suggested in the research: deterministic and non-deterministic. Under the deterministic category, four approaches are proposed. A table-driven method is proposed to schedule object tracks with zero collisions by carefully selecting objects. A contradictory graph coloring based approach is proposed that allows a small number of collisions in the synopsis to reduce its length. A greedy object scheduling method is also proposed for scheduling a larger number of objects per schedule. Lastly, a dynamic programming based scheduling algorithm is proposed that considers both the number of collisions and the actions performed by the objects to generate a synopsis. Under the non-deterministic head, synopsis generation is modelled as a multi-objective optimization problem that takes into account components such as synopsis length, number of collisions, and the actions and interactions performed by the objects. Simulated Annealing (SA) and Genetic Algorithm (GA) are used to optimize the cost function. A fuzzy post-processing method is also proposed that further reduces the synopsis length by computing the visibility score of each object track. The generated synopsis is visualized by presenting the objects on top of the common ground plane. The proposed framework demonstrates its efficacy when tested on different datasets and compared with the state of the art.
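
Treating synopsis generation as a scheduling problem can be illustrated with a minimal collision-free greedy scheduler. This is a generic sketch, not the thesis's table-driven or greedy method: the track representation (one ground-plane `(x, y)` position per frame), the `min_dist` collision radius, and both function names are assumptions.

```python
def collides(track_a, start_a, track_b, start_b, min_dist):
    """True if two time-shifted tracks come closer than min_dist in any shared frame."""
    first = max(start_a, start_b)
    last = min(start_a + len(track_a), start_b + len(track_b))
    for t in range(first, last):
        ax, ay = track_a[t - start_a]
        bx, by = track_b[t - start_b]
        if (ax - bx) ** 2 + (ay - by) ** 2 < min_dist ** 2:
            return True
    return False


def greedy_schedule(tracks, min_dist=1.0):
    """Give each track the earliest synopsis start frame with zero collisions.

    Tracks are placed in input order; each is shifted later one frame at a
    time until it collides with no previously placed track. The loop always
    terminates because a track starting after all others have ended cannot
    collide with them.
    """
    starts = []
    for i, track in enumerate(tracks):
        start = 0
        while any(
            collides(track, start, tracks[j], starts[j], min_dist)
            for j in range(i)
        ):
            start += 1
        starts.append(start)
    return starts
```

The synopsis length is then `max(start + len(track))` over all tracks; allowing a few collisions, as the graph coloring approach in the abstract does, would trade visual overlap for a shorter result.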


49

Kuan, Yuan-Kai, and 官元凱. "Error Concealment Algorithm Using Inter-View Correlation for Multi-View Video Decoding". Thesis, 2013. http://ndltd.ncl.edu.tw/handle/44843855194809419466.

Abstract:

Master's thesis
National Dong Hwa University
Department of Electrical Engineering
101
This thesis proposes an error concealment algorithm for whole-frame loss in multi-view video decoding. Compared with H.264/AVC, Multi-view Video Coding (MVC) exploits inter-view correlation to reduce the bit rate. However, when network transmission is delayed or errors occur during transmission, the decoded video is damaged and the errors propagate, so concealing them is an important issue. When a whole frame is lost or damaged in a two-view sequence, the proposed algorithm uses the inter-view and intra-view domains to conceal the damaged frame. Experimental results show that the proposed algorithm provides better video quality than previous work and reduces error propagation.


50

Huang, Jun-Te, and 黃潤德. "View Synthesis for Multi-view Video Plus Depth Using Spatial-temporal Information". Thesis, 2014. http://ndltd.ncl.edu.tw/handle/52759014517694521488.

Abstract:

Master's thesis
National Chung Cheng University
Graduate Institute of Computer Science and Information Engineering
102
View synthesis is an important technique for reducing the transmission bit rate in free-viewpoint TV applications. To reduce the bit rate while displaying high-quality video, a general solution is to use reference viewpoint video plus depth information (the multi-view video plus depth format) to synthesize virtual viewpoint video. In this study, a view synthesis method for multi-view video plus depth using spatial-temporal information is proposed. The proposed approach includes five steps: (1) disparity map estimation, optimization, and projection onto a virtual viewpoint; (2) synthesis of a virtual view using spatial information; (3) synthesis of a virtual view using temporal information; (4) selection of the best virtual view; and (5) motion compensation. Experimental results show that the views synthesized by the proposed approach are better than those of the view synthesis reference software (VSRS).
