To see the other types of publications on this topic, follow the link: Multiview2.

Dissertations / Theses on the topic 'Multiview2'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the top 50 dissertations / theses for your research on the topic 'Multiview2.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Browse dissertations / theses on a wide variety of disciplines and organise your bibliography correctly.

1

ZARDINI, Alessandro. "Gli impatti organizzativi delle piattaforme di Enterprise Content Management sui processi decisionali." Doctoral thesis, Università degli Studi di Verona, 2010. http://hdl.handle.net/11562/343376.

Full text
Abstract:
The objective of this research thesis is to analyze the correlations between competitive advantage, associated with the improvement of the decision-making process, and the management of enterprise content through Enterprise Content Management (ECM) platforms. This contribution is therefore intended to extend the Knowledge Management (KM) literature, in particular on the relationship between Knowledge Management systems, Enterprise Content Management and the management of decision-making processes. Within the KM literature, ECM platforms have so far been analyzed only through Transaction-Costs Theory (Reimer, 2002; McKeever, 2003; Smith and McKeen, 2003; O'Callaghan and Smits, 2005; Tyrväinen et al., 2006) and are generally described as systems useful for reducing the cost of managing the content present within the organization. Specifically, through empirical analyses, various authors have shown that ECM tools can increase the efficiency of managing company information, reducing the cost of its management and retrieval. An examination of the articles in the management literature readily shows that, to date, there is no universally accepted definition of the ECM concept. Examined jointly, however, they show certain analogies: the distinction depends not on the content but on the focus the researcher adopts to describe, analyze and interpret ECM systems. Few researchers, however, have studied the impacts that such Content Management tools have on the organization and on business processes. In particular, no research has ever highlighted the strategic role of ECM platforms in the management of enterprise content (Gupta et al., 2002; Helfat and Peteraf, 2003; Smith and McKeen, 2003; O'Callaghan and Smits, 2005). To analyze and interpret the values observed in the case study, the Knowledge-Based View will be used, since knowledge assets are considered the strategic resources useful for achieving and maintaining competitive advantage (Conner and Prahalad, 1996; Choi et al., 2008). ECM systems will not be analyzed from a managerial standpoint, i.e. by evaluating the efficiency gains connected with improved management of company information; instead, the analysis will address the evolution of company performance connected with the development of the decision-making process. The analysis will examine whether the knowledge contained within organizations is fundamental for company development and growth (Wernerfelt, 1984; Grant, 1991; Penrose, 1995; Grant, 1996; Prusak, 1996; Teece et al., 1997; Piccoli et al., 2000; Piccoli et al., 2002). Information, however, acquires real value only when it can be managed easily within the decision-making process for the maintenance of a competitive advantage. To improve company performance, it is essential to transform the many "passive" enterprise contents into "active" sources. The potential of Enterprise Content Management systems lies in their capacity to process large volumes of information, providing the end user, or a Decision Support System (DSS), with all the information useful for decision making.
In this way, better performance of the decision maker's activity is achieved not only by increasing the quality and quantity of the information entering the decision-making process but also through a better formalization of the knowledge held in the organizational memory. The research method used is the so-called "interpretive case study", which is particularly useful for examining a phenomenon in its natural evolution (Benbasat, 1984). The case study method was also chosen because it can be an ideal vehicle for reaching a deeper understanding of explicit and implicit business processes, and for better understanding the role of actors within organizational systems (Campbell, 1975; Hamel et al., 1993; Lee, 1999; Stake, 2000). The company is used as the unit of analysis (Yin, 1984), both when analyzing relations with the market and when analyzing the behavior of individual participants in a process (Zardini et al., 2010). The thesis first analyzes some of the most significant definitions of knowledge in the literature, highlighting the strengths and weaknesses of each. It starts from the formulation proposed by Polanyi (Polanyi, 1958; Polanyi, 1967), which is then integrated with the studies by Nonaka, Takeuchi and Konno (Nonaka, 1991; Nonaka and Takeuchi, 1995; Nonaka and Konno, 1998; Nonaka et al., 2000). It moves from the general concept of knowledge to the notion of knowledge assets, identified as intangible resources generated inside the firm and difficult to purchase on the market. After establishing that knowledge can be considered an important resource for obtaining a competitive advantage (Grant, 1996b; Prusak, 1996; Alavi and Leidner, 1999; Earl and Scott, 1999; Piccoli et al., 2002), the first chapter closes by contextualizing the concept of knowledge assets within the Knowledge-Based View. The second chapter sets out the knowledge-creation process and identifies the three types of Knowledge Management Systems, closing with a review of the main KM systems used for creating, analyzing and maintaining the knowledge held in the organizational memory. The third chapter examines the main components of the decision-making process and the KM tools specifically aimed at improving it, and closes with a description and review of decision support systems. The fourth section defines the term "enterprise content" and links it to the concept of dynamic capabilities (Teece et al., 1997; Eisenhardt and Martin, 2000; Helfat et al., 2007). It then analyzes all the phases of the information life cycle: from the creation of a new content item to its cataloguing, storage and possible modification or deletion. Having delimited the concept of content, the definitions present in the literature are analyzed.
The chapter closes with a study of the main components of ECM systems and, in particular, with an analysis of the tools that support decision-making processes within organizations. The final chapter reviews the Action Research methodology, analyzing its strengths and critical aspects. It then follows the approach proposed by Baskerville (Baskerville, 1999), according to which the term "Action Research" on the one hand identifies an investigation method for the social sciences and on the other represents a sub-category that distinguishes it from the other sub-methods. The analysis leads to the model of Baskerville and Wood-Harper (Baskerville and Wood-Harper, 1998), according to which ten distinct forms of Action Research can be identified in the Information Systems literature; among these, Multiview, and in particular Multiview2, is the reference methodology used to test the theoretical framework within the case study.
The focus of this thesis is to analyze the correlations between competitive advantage, associated with the improvement of the decision-making process, and content management through Enterprise Content Management (ECM) platforms. One aim of this work is to extend the Knowledge Management (KM) literature, and in particular to examine the correlation between ECM systems and Decision Support Systems. Enterprise Content Management platforms have largely been analyzed according to Transaction Cost Theory (Reimer, 2002; McKeever, 2003; Smith and McKeen, 2003; O'Callaghan and Smits, 2005; Tyrväinen et al., 2006) and are generally described as useful for reducing content management costs inside an organization (McKeever, 2003). Through empirical analyses, various authors have stressed that ECM tools increase efficiency and reduce management and retrieval costs. Few studies consider the impacts of these tools on the organization or company processes. In particular, no research has highlighted the strategic role of ECM platforms in enterprise content management (Gupta et al., 2002; Helfat and Peteraf, 2003; Smith and McKeen, 2003; O'Callaghan and Smits, 2005) as a means to improve and speed up the decision-making process. The case study will be analyzed through the Knowledge-Based View. Specifically, the knowledge-based view (KBV) constitutes a fundamental essence of the resource-based view (RBV; Conner and Prahalad, 1996), reflecting the importance of knowledge assets. The knowledge and enterprise content generated can thus be interpreted not only as strategic resources to achieve or maintain a competitive advantage but also as useful tools for developing and expanding the company's ability to respond promptly to unexpected events in the external environment, and therefore to refine decision making within the organization. According to several authors (Barney, 1991; Amit and Schoemaker, 1993; Peteraf, 1993; Winter, 1995; Grover et al., 2009), the RBV cites knowledge as a resource that can generate information asymmetries and thus a competitive advantage for the enterprises that possess it. Reconsidering the general theory of the RBV and including knowledge assets among an enterprise's intangible resources leads naturally to the KBV. If the term "acquired resources" from the general RBV proposed by Lippman and Rumelt (1982) and Barney (1986) is replaced by "knowledge," the result is KBV theory, and knowledge represents one of the strategic factors for maintaining a competitive advantage (Grant and Baden-Fuller, 1995; Grant, 1996c; Teece et al., 1997; Sambamurthy and Subramani, 2005; Bach et al., 2008; Choi et al., 2008). The availability of content is thus a necessary but not a sufficient condition for improving the decision-making process and company performance. Rather, the company also needs to transform "passive" contents, such as unused information within the boundaries of organizational memory, into "active" sources that are integral to the decision-making process. To improve the decision-making process and create value, enterprises must enrich the quality and quantity of all information that provides critical input to a decision. The goal therefore involves an ability to manage knowledge inside and outside the organization by transforming data into knowledge.
In the case analyzed, decision makers achieve the best performance not only by improving the quantity and quality of the information entering the decisional process but also thanks to a better formalization of the knowledge included in all phases of the process. In this view, ECM platforms are advanced KM tools that are fundamental to the development of a competitive advantage, in that they simplify and speed up the management (creation, classification, storage, change, deletion) of information, increase the productivity of each member, and improve the efficiency of the system (McKeever, 2003; Nordheim and Päivärinta, 2004; O'Callaghan and Smits, 2005). By implementing an ECM system, the company not only gains an effective means for creating, tracking, managing, and archiving all company content but can also integrate business processes, develop collaborative actions through the systematic organization of work teams, and create a search engine with specialized "business logic views." Standardized contents and layouts, associated with a definition of content owners and users (i.e., management of authorizations) and document processes, support the spread of updated, error-free information to the various organizational actors. Similar to business intelligence systems, ECM platforms support decision making inside organizations in terms of viewing and retrieving data and analyzing and sharing information (thus increasing organizational memory), as well as its storage and continuous maintenance along the life cycle of the enterprise. For the analysis of the case study, this study employs the action research method (Lewin, 1946; Checkland, 1985; Checkland and Scholes, 1990), and specifically Multiview2 (Avison and Wood-Harper, 2003). The original Multiview concept assumed a continuous interaction between analysts and method, including the present situation and the future scenario originated by application of the methodology. In some respects, the original definition was limited, in that it did not describe the function of each element and the trend of possible interactions (Avison and Wood-Harper, 2003). Multiview2 fills these gaps by taking into consideration the action and reaction generated by the interactions of the elements. The three macro-categories therefore must be aligned to conduct an organizational, socio-technical, and technological analysis (Avison et al., 1998; Avison and Wood-Harper, 2003). The researcher provides a clear contribution that matches the theoretical framework used as a reference, and in subsequent phases measures and evaluates the results obtained from the implemented actions.
2

Vetro, Anthony, Emin Martinian, Jun Xin, Alexander Behrens, and Huifang Sun. "TECHNIQUES FOR MULTIVIEW VIDEO CODING." INTELLIGENT MEDIA INTEGRATION NAGOYA UNIVERSITY / COE, 2005. http://hdl.handle.net/2237/10361.

Full text
3

Shafaei, Alireza. "Multiview depth-based pose estimation." Thesis, University of British Columbia, 2015. http://hdl.handle.net/2429/56180.

Full text
Abstract:
Commonly used human motion capture systems require intrusive attachment of markers that are visually tracked with multiple cameras. In this work we present an efficient and inexpensive solution to markerless motion capture using only a few Kinect sensors. We use our system to design a smart home platform with a network of Kinects that are installed inside the house. Our first contribution is a multiview pose estimation system. Unlike previous work on 3D pose estimation using a single depth camera, we relax constraints on the camera location and do not assume a co-operative user. We apply recent image segmentation techniques with convolutional neural networks to depth images and use curriculum learning to train our system on purely synthetic data. Our method accurately localizes body parts without requiring an explicit shape model. The body joint locations are then recovered by combining evidence from multiple views in real-time. Our second contribution is a dataset of 6 million synthetic depth frames for pose estimation from multiple cameras with varying levels of complexity to make curriculum learning possible. We show the efficacy and applicability of our data generation process through various evaluations. Our final system exceeds the state-of-the-art results on multiview pose estimation on the Berkeley MHAD dataset. Our third contribution is a scalable software platform to coordinate Kinect devices in real-time over a network. We use various compression techniques and develop software services that allow communication with multiple Kinects through TCP/IP. The flexibility of our system allows real-time orchestration of up to 10 Kinect devices over Ethernet.
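The abstract does not detail how evidence from the views is combined; a standard way to fuse per-view 2D joint detections from calibrated cameras is linear (DLT) triangulation. The minimal sketch below illustrates that idea under this assumption; `triangulate_joint` and the toy cameras are illustrative, not code from the thesis.

```python
import numpy as np

def triangulate_joint(points_2d, projections):
    """Linear (DLT) triangulation of one body joint from several views.

    points_2d  : list of (x, y) pixel detections, one per camera
    projections: list of 3x4 camera projection matrices
    Returns the 3D joint position as a length-3 array.
    """
    rows = []
    for (x, y), P in zip(points_2d, projections):
        # Each view contributes two linear constraints on the homogeneous 3D point.
        rows.append(x * P[2] - P[0])
        rows.append(y * P[2] - P[1])
    A = np.stack(rows)
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]

# Toy usage: two cameras observing a point at (0, 0, 2).
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])               # reference camera
P2 = np.hstack([np.eye(3), np.array([[-0.5], [0], [0]])])   # translated camera
X = np.array([0.0, 0.0, 2.0, 1.0])
x1 = (P1 @ X)[:2] / (P1 @ X)[2]
x2 = (P2 @ X)[:2] / (P2 @ X)[2]
print(triangulate_joint([x1, x2], [P1, P2]))  # ~ [0, 0, 2]
```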
4

Khattak, Shadan. "Low complexity multiview video coding." Thesis, De Montfort University, 2014. http://hdl.handle.net/2086/10511.

Full text
Abstract:
3D video is a technology that has seen tremendous attention in recent years. Multiview Video Coding (MVC) is an extension of the popular H.264 video coding standard and is commonly used to compress 3D videos. It offers an improvement of 20% to 50% in compression efficiency over simulcast encoding of multiview videos using the conventional H.264 video coding standard. However, there are two important problems associated with it: (i) its superior compression performance comes at the cost of significantly higher computational complexity, which hampers the real-world realization of the MVC encoder in applications such as 3D live broadcasting and interactive Free Viewpoint Television (FTV), and (ii) compressed 3D videos can suffer from packet loss during transmission, which can degrade the viewing quality of the 3D video at the decoder. This thesis aims to solve these problems by presenting techniques to reduce the computational complexity of the MVC encoder and by proposing a consistent error concealment technique for frame losses in 3D video transmission. The thesis first analyses the complexity of the MVC encoder. It then proposes two novel techniques to reduce the complexity of motion and disparity estimation. The first method achieves complexity reduction in the disparity estimation process by exploiting the relationship between temporal levels, type of macroblocks and search ranges, while the second method achieves it by exploiting the geometrical relationship between motion and disparity vectors in stereo frames. These two methods are then combined with other state-of-the-art methods in a unique framework where gains add up. Experimental results show that the proposed low-complexity framework can reduce the encoding time of the standard MVC encoder by over 93% while maintaining similar compression efficiency performance. The addition of new View Synthesis Prediction (VSP) modes to the MVC encoding framework improves the compression efficiency of MVC. However, testing additional modes comes at the cost of increased encoding complexity. In order to reduce the encoding complexity, the thesis next proposes a Bayesian early mode decision technique for a VSP-enhanced MVC coder. It exploits the statistical similarities between the RD costs of the VSP SKIP mode in neighbouring views to terminate the mode decision process early. Results indicate that the proposed technique can reduce the encoding time of the enhanced MVC coder by over 33% at similar compression efficiency levels. Finally, compressed 3D videos are usually required to be broadcast to a large number of users, where transmission errors can lead to frame losses which can degrade the video quality at the decoder. A simple reconstruction of the lost frames can lead to an inconsistent reconstruction of the 3D scene, which may negatively affect the viewing experience of a user. To solve this problem, the thesis finally proposes a consistency model for recovering frames lost during transmission. The proposed consistency model is used to evaluate inter-view and temporal consistencies while selecting candidate blocks for concealment. Experimental results show that the proposed technique is able to recover the lost frames with higher consistency and better quality than two standard error concealment methods and a baseline technique based on the boundary matching algorithm.
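The abstract leaves the Bayesian early mode decision at a high level. A minimal sketch of how such a test could look, assuming Gaussian models of SKIP and non-SKIP RD costs collected from the co-located region in a neighbouring view; the function name and the 0.9 confidence level are illustrative choices, not the thesis's actual model.

```python
import numpy as np

def early_skip_decision(rd_skip, neighbor_skip_costs, neighbor_nonskip_costs):
    """Decide whether to stop the mode search after evaluating SKIP.

    Models the RD costs of SKIP-coded and non-SKIP-coded macroblocks in the
    neighbouring view as Gaussians and stops early when the posterior
    probability of SKIP given the observed cost exceeds a confidence level.
    """
    def gauss(x, mu, sigma):
        sigma = max(sigma, 1e-6)  # guard against a degenerate spread
        return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

    prior = len(neighbor_skip_costs) / (len(neighbor_skip_costs) + len(neighbor_nonskip_costs))
    like_skip = gauss(rd_skip, np.mean(neighbor_skip_costs), np.std(neighbor_skip_costs))
    like_other = gauss(rd_skip, np.mean(neighbor_nonskip_costs), np.std(neighbor_nonskip_costs))
    posterior = like_skip * prior / (like_skip * prior + like_other * (1 - prior) + 1e-12)
    return posterior > 0.9  # terminate the mode decision early if confident

# A SKIP cost close to the neighbouring view's SKIP statistics triggers early termination.
print(early_skip_decision(95.0, [100, 90, 110], [300, 280, 320]))  # True
```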
5

Barba, Ferrer Pere. "Multiview Landmark Detection forIdentity-Preserving Alignment." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2013. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-142475.

Full text
Abstract:
Face recognition is a fundamental task in computer vision and has been an important field of study for many years. Its importance in activities such as face recognition and classification, 3D animation, virtual modelling or biomedicine makes it a highly demanded activity, but finding accurate solutions still represents a great challenge nowadays. This report presents a unified process for automatically extracting a set of face landmarks and removing all differences related to pose, expression and environment by bringing faces to a neutral, pose-centred state. Landmark detection is based on a multiple-viewpoint Pictorial Structure model, which specifies, first, a part for each landmark we want to extract; second, a tree structure to constrain its position within the face geometry; and third, multiple trees to model differences due to orientation. In this report we address both the problem of how to find a set of landmarks from a model and the problem of training such a model from a set of labelled examples. We show how such a model successfully captures a great range of deformations while needing far fewer training examples than common commercial face detectors. The alignment process basically aims to remove differences between multiple faces so they can all be analysed under the same criteria. It is carried out with thin-plate splines to adjust the detected set of landmarks to the desired configuration. With this method we ensure smooth interpolations while the subject identity is preserved, by modifying the original extracted configuration of parts and creating a generic distribution with the help of a reference face dataset. We present results of our algorithms both in a constrained environment and on the challenging LFPW face database. The successful outcomes show our method to be a solid process for jointly recognising and warping faces in the wild, on a par with other state-of-the-art procedures.
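Thin-plate spline alignment of detected landmarks to a reference configuration can be sketched with SciPy's RBF interpolator, which supports a thin-plate-spline kernel. A minimal example with hypothetical landmark coordinates; the thesis's identity-preserving adjustment of the target configuration is not reproduced here.

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

# Detected landmarks in the input face and their target (neutral, pose-centred)
# positions; five points are enough to illustrate the mapping.
detected = np.array([[120, 95], [180, 92], [150, 130], [125, 170], [178, 168]], float)
reference = np.array([[110, 100], [190, 100], [150, 135], [120, 175], [180, 175]], float)

# One thin-plate-spline interpolant per output coordinate: (x, y) -> (x', y').
warp = RBFInterpolator(detected, reference, kernel="thin_plate_spline")

# Any pixel coordinate in the source face can now be mapped into the aligned
# frame; with zero smoothing, the landmark points map onto the reference.
print(warp(detected).round(1))
print(warp(np.array([[150.0, 120.0], [130.0, 150.0]])))
```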
6

Mendonça, Paulo Ricardo dos Santos. "Multiview geometry : profiles and self-calibration." Thesis, University of Cambridge, 2001. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.621114.

Full text
7

Aksay, Anil. "Error Resilient Multiview Video Coding And Streaming." PhD thesis, METU, 2010. http://etd.lib.metu.edu.tr/upload/12611682/index.pdf.

Full text
Abstract:
In this thesis, a number of novel techniques for error-resilient coding and streaming of multiview video are presented. First of all, a novel coding technique for stereoscopic video is proposed where additional coding gain is achieved by downsampling one of the views spatially or temporally, based on the well-known theory that the human visual system can perceive high frequencies in 3D from the higher-quality view. With the proposed coding technique, stereoscopic videos can be coded at a rate up to 1.2 times that of monoscopic videos with little visual quality degradation. Next, a systematic method for the design and optimization of multi-threaded multiview video encoding/decoding algorithms on multi-core processors is proposed. The proposed multi-core decoding architectures are compliant with the current international standards and enable multi-threaded processing with negligible loss of encoding efficiency and minimum processing overhead. An end-to-end 3D streaming system over the Internet is implemented using current standards. A heuristic methodology for modeling the end-to-end rate-distortion characteristic of this system is suggested, and the parameters of the system are optimally selected using this model. An end-to-end 3D broadcasting system over DVB-H is also implemented using current standards. Extensive testing is employed to show the importance and characteristics of several error resilience tools. Finally, the end-to-end RD characteristics are modeled to optimize the encoding and protection parameters.
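The asymmetric coding idea (spatially downsampling one view and restoring it at the decoder) can be sketched in a few lines. This is a minimal illustration with random frames standing in for decoded video, not the thesis's actual codec integration.

```python
import numpy as np
import cv2

def encode_asymmetric(left, right, factor=0.5):
    """Keep the left view at full resolution and downsample the right view.

    Returns the pair that would be fed to the encoder; the decoder upsamples
    the right view back to full size, relying on binocular suppression to
    hide the lost high frequencies.
    """
    h, w = right.shape[:2]
    right_small = cv2.resize(right, (int(w * factor), int(h * factor)),
                             interpolation=cv2.INTER_AREA)
    return left, right_small

def decode_right(right_small, full_size):
    return cv2.resize(right_small, full_size, interpolation=cv2.INTER_CUBIC)

# Toy example with random frames standing in for decoded video.
left = np.random.randint(0, 256, (480, 640), np.uint8)
right = np.random.randint(0, 256, (480, 640), np.uint8)
_, right_small = encode_asymmetric(left, right)
right_rec = decode_right(right_small, (640, 480))
mse = np.mean((right.astype(float) - right_rec.astype(float)) ** 2)
print("PSNR of upsampled right view: %.1f dB" % (10 * np.log10(255 ** 2 / mse)))
```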
8

Richter, Stefan. "Compression and View Interpolation for Multiview Imagery." Thesis, KTH, Ljud- och bildbehandling, 2011. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-37699.

Full text
9

Jutla, Dawn N. "Multiview model for protection and access control." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1997. http://www.collectionscanada.ca/obj/s4/f2/dsk3/ftp04/NQ31529.pdf.

Full text
10

Seeling, Christian. "MultiView-Systeme zur explorativen Analyse unstrukturierter Information." Aachen: Shaker, 2007. http://d-nb.info/1000271293/34.

Full text
11

Milborrow, Stephen. "Multiview active shape models with SIFT descriptors." Doctoral thesis, University of Cape Town, 2016. http://hdl.handle.net/11427/22867.

Full text
Abstract:
This thesis presents techniques for locating landmarks in images of human faces. A modified Active Shape Model (ASM [21]) is introduced that uses a form of SIFT descriptors [68]. Multivariate Adaptive Regression Splines (MARS [40]) are used to efficiently match descriptors around landmarks. This modified ASM is fast and performs well on frontal faces. The model is then extended to also handle non-frontal faces. This is done by first estimating the face's pose, rotating the face upright, then applying one of three ASM submodels specialized for frontal, left, or right three-quarter views. The multiview model is shown to be effective on a variety of datasets.
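A minimal sketch of the multiview dispatch described above: normalize the in-plane rotation, then select a view-specific submodel by the estimated yaw. The ±20° threshold and the rotation sign convention are illustrative assumptions, not values from the thesis.

```python
import numpy as np
import cv2

def rotate_upright(image, roll_deg):
    """Undo the estimated in-plane rotation so the face is upright.

    The sign convention depends on the pose estimator; OpenCV rotates
    counter-clockwise for positive angles.
    """
    h, w = image.shape[:2]
    M = cv2.getRotationMatrix2D((w / 2.0, h / 2.0), roll_deg, 1.0)
    return cv2.warpAffine(image, M, (w, h))

def select_submodel(yaw_deg, threshold=20.0):
    """Pick the ASM submodel specialized for the estimated head yaw."""
    if yaw_deg < -threshold:
        return "left34"       # left three-quarter view
    if yaw_deg > threshold:
        return "right34"      # right three-quarter view
    return "frontal"

img = np.zeros((100, 100), np.uint8)
upright = rotate_upright(img, roll_deg=12.0)
print(select_submodel(-35.0))  # -> left34
```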
12

Chan, Kevin S. (Kevin Sao Wei). "Multiview monocular depth estimation using unsupervised learning methods." Thesis, Massachusetts Institute of Technology, 2018. http://hdl.handle.net/1721.1/119753.

Full text
Abstract:
Existing learned methods for monocular depth estimation use only a single view of a scene for depth evaluation, so they inherently overfit to their training scenes and cannot generalize well to new datasets. This thesis presents a neural network for multiview monocular depth estimation. Teaching a network to estimate depth via structure from motion allows it to generalize better to new environments with unfamiliar objects. This thesis extends recent work on unsupervised methods for single-view monocular depth estimation and uses the reconstruction losses for training posed in those works. The models and baseline models were evaluated on a variety of datasets, and the results indicate that multiview models generalize across datasets better than previous work. This work is unique in that it emphasizes cross-domain performance and the ability to generalize more than performance on the training set.
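The reconstruction losses referred to above penalize the photometric difference between a target frame and a source frame warped into it using the predicted depth and relative pose. A minimal NumPy sketch with nearest-neighbour sampling (published methods use differentiable bilinear sampling); the function and the toy example are illustrative.

```python
import numpy as np

def reconstruction_loss(target, source, depth, K, R, t):
    """Photometric loss used as self-supervision for depth networks.

    Back-projects each target pixel with the predicted depth, moves it by
    the relative pose (R, t), reprojects into the source view and samples
    it there (nearest neighbour for brevity); the mean L1 difference to the
    target image is the training signal.
    """
    h, w = target.shape
    ys, xs = np.mgrid[0:h, 0:w]
    pix = np.stack([xs, ys, np.ones_like(xs)], axis=-1).reshape(-1, 3).T
    cam = np.linalg.inv(K) @ pix * depth.reshape(1, -1)   # 3D points, target frame
    proj = K @ (R @ cam + t.reshape(3, 1))                # into the source camera
    u = np.round(proj[0] / proj[2]).astype(int)
    v = np.round(proj[1] / proj[2]).astype(int)
    valid = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    warped = np.zeros(h * w)
    warped[valid] = source[v[valid], u[valid]]
    return np.abs(warped - target.reshape(-1))[valid].mean()

# Identity pose and constant depth: the warped image equals the target, loss ~0.
K = np.array([[500.0, 0, 320], [0, 500.0, 240], [0, 0, 1]])
img = np.random.rand(480, 640)
print(reconstruction_loss(img, img, np.full((480, 640), 2.0), K, np.eye(3), np.zeros(3)))
```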
13

Chari, Visesh. "Shape estimation of specular objects from multiview images." Thesis, Grenoble, 2012. http://www.theses.fr/2012GRENM106/document.

Full text
Abstract:
One of the simplest models of a refractive surface is a plane. Although such surfaces are ubiquitous in our world, in the form of transparent panes, windows, or the surface of still water, very little is known about the multiview geometry induced by refraction through them. In the first part of this thesis, we analyze the multiview geometry of a planar refractive surface. We consider the case where one or more cameras in one medium (e.g., air) observe a scene in another medium (e.g., water), with a planar interface between the two media; underwater photography, for instance, fits this description. Since the perspective projection model does not correspond to this scenario, we derive the camera model and its associated projection matrix. We show that 3D lines in the scene map to quartic curves in the images. An interesting property of this configuration is that, for a homogeneous refractive index, each 3D line in the world corresponds to a unique curve in the image. We then describe and develop elements of multiview geometry, such as the fundamental and homography matrices associated with the scene, and provide tools for estimating camera pose from multiple viewpoints. We also show that when the medium is denser, the horizon line corresponds to a conic that can be decomposed to deduce the parameters of the interface. Next, we extend our approach by proposing algorithms to estimate the geometry of several planar refractive surfaces from a single image; looking through an aquarium is a typical example of such a scenario. We propose a simple method for computing the normals of such surfaces in various scenarios by reducing the system to an axial camera. This allows us to use RANSAC-based approaches, such as the 8-point algorithm for fundamental matrix computation, in a manner similar to the estimation of axial distortions in the computer vision literature. We also show that the same model can be adapted directly to reconstruct reflective surfaces under a piecewise-planar assumption. We present encouraging 3D reconstruction results and analyze their accuracy. While the two previous approaches focus only on reconstructing one or more planar refractive surfaces using geometric information alone, specular surfaces also modify how light energy is redistributed at the surface; the underlying model is explained by the Fresnel equations. By exploiting both this geometric and photometric information, we propose a method to reconstruct the shape of arbitrary specular surfaces. We show that our approach entails a simple acquisition setup. First, we analyze several minimal cases for shape reconstruction and derive a new constraint that combines geometry with Fresnel theory for transparent surfaces. We then illustrate the complementary nature of these cues, which help us obtain additional information about the object that is otherwise difficult to acquire.
Finally, we discuss the practical aspects of our reconstruction algorithm and present results on difficult, non-trivial data.
The task of understanding, 3D reconstruction and analysis of the multiple view geometry related to transparent objects is one of the long-standing challenging problems in computer vision. In this thesis, we look at novel approaches to analyze images of transparent surfaces to deduce their geometric and photometric properties. At first, we analyze the multiview geometry of the simple case of planar refraction. We show how the image of a 3D line is a quartic curve in an image, and thus derive the first imaging model that accounts for planar refraction. We use this approach to then derive other properties that involve multiple cameras, like fundamental and homography matrices. Finally, we propose approaches to estimate the refractive surface parameters and camera poses, given images. We then extend our approach to derive algorithms for recovering the geometry of multiple planar refractive surfaces from a single image. We propose a simple technique to compute the normal of such surfaces in various scenarios by equating our setup to an axial camera. We then show that the same model can be used to reconstruct reflective surfaces using a piecewise planar assumption. We show encouraging 3D reconstruction results, and analyze the accuracy of results obtained using this approach. We then focus our attention on using both geometric and photometric cues for reconstructing transparent 3D surfaces. We show that in the presence of known illumination, we can recover the shape of such objects from single or multiple views. The cornerstone of our approach is the Fresnel equations, and we both derive and analyze their use for 3D reconstruction. Finally, we show that our approach can be used to produce high-quality reconstructions, and discuss other potential future applications.
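The Fresnel equations at the core of the photometric part are standard physics and can be evaluated directly. A small sketch computing the power reflectance of unpolarized light at a dielectric interface (air to water by default); this illustrates the equations themselves, not the thesis's reconstruction constraint.

```python
import numpy as np

def fresnel_reflectance(theta_i, n1=1.0, n2=1.33):
    """Fresnel power reflectance for unpolarized light at a dielectric interface.

    theta_i: angle of incidence in radians; n1, n2: refractive indices
    (air to water by default). Returns 1.0 beyond the critical angle.
    """
    sin_t = n1 / n2 * np.sin(theta_i)
    if abs(sin_t) > 1.0:
        return 1.0                      # total internal reflection
    theta_t = np.arcsin(sin_t)          # Snell's law
    # Amplitude coefficients for s- and p-polarized light.
    rs = (n1 * np.cos(theta_i) - n2 * np.cos(theta_t)) / \
         (n1 * np.cos(theta_i) + n2 * np.cos(theta_t))
    rp = (n2 * np.cos(theta_i) - n1 * np.cos(theta_t)) / \
         (n2 * np.cos(theta_i) + n1 * np.cos(theta_t))
    return 0.5 * (rs ** 2 + rp ** 2)    # unpolarized: average of the two powers

for deg in (0, 30, 60, 85):
    print(deg, round(fresnel_reflectance(np.deg2rad(deg)), 4))
```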
14

Price, Timothy C. "Using Multiview Annotation to Annotate Multiple Images Simultaneously." BYU ScholarsArchive, 2017. https://scholarsarchive.byu.edu/etd/6560.

Full text
Abstract:
In order for a system to learn a model for object recognition, it must have many positive images to learn from. Because of this, datasets of similar objects are built to train the model. These object datasets are best when they are large, diverse, and annotated. But the process of obtaining the images and creating the annotations is often time-consuming and costly. We use a method that quickly obtains many images of the same objects from different angles and then reconstructs those images into a 3D model. We then use the 3D reconstruction of these images of an object to connect information about the different images of the same object. We use that information to annotate all of the images quickly and cheaply. These annotated images are then used to train the model.
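Once a 3D reconstruction links the views, an annotation made on one image can be transferred to all others by projecting the corresponding 3D point through each camera. A minimal sketch, assuming known 3x4 projection matrices; the function is illustrative, not the tool described in the thesis.

```python
import numpy as np

def propagate_annotation(X, projections, image_sizes):
    """Project one reconstructed 3D annotation point into every view.

    X: 3-vector in world coordinates; projections: list of 3x4 matrices
    from the multiview reconstruction; returns pixel coords or None when
    the point falls outside an image.
    """
    Xh = np.append(X, 1.0)
    labels = []
    for P, (w, h) in zip(projections, image_sizes):
        x = P @ Xh
        if x[2] <= 0:                    # behind the camera
            labels.append(None)
            continue
        u, v = x[0] / x[2], x[1] / x[2]
        labels.append((u, v) if 0 <= u < w and 0 <= v < h else None)
    return labels

P = np.hstack([np.eye(3), np.zeros((3, 1))])
print(propagate_annotation(np.array([0.1, 0.2, 4.0]), [P], [(640, 480)]))
```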
15

Garbas, Jens-Uwe [Verfasser]. "Scalable Wavelet-Based Multiview Video Coding / Jens-Uwe Garbas." München : Verlag Dr. Hut, 2010. http://d-nb.info/1009972251/34.

Full text
16

Seeling, Christian [Verfasser]. "MultiView-Systeme zur explorativen Analyse unstrukturierter Information / Christian Seeling." Aachen : Shaker, 2007. http://d-nb.info/1000271293/34.

Full text
17

Blais, Gerard. "Creating 3D computer objects by integrating multiview range data." Thesis, McGill University, 1993. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=69716.

Full text
Abstract:
The research presented in this thesis deals with the problem of range image registration for the purpose of building surface models of three-dimensional objects. The registration task consists of finding the translation and rotation parameters which properly align overlapping views of the surface so as to reconstruct from these partial surfaces, the complete surface representation of the object.
The approach taken is to express the registration task as an optimization problem. We define a function which measures the quality of the alignment between the partial surfaces contained in two range images. The function computes a sum of Euclidean distances between a set of control points on one of the surfaces to corresponding points on the other surface. The strength of the approach resides in the method used to determine point correspondences across range images.
Dual-view registration experiments conducted with the algorithm produced very good results in a reasonable time. A multi-view registration experiment was also completed successfully, although it required a large processing time. A complete surface model of a typical 3D object was then constructed by integrating its multiple partial views. (Abstract shortened by UMI.)
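The registration described above (control points on one surface matched to corresponding points on the other, minimizing a sum of Euclidean distances) is in the spirit of iterative closest point methods. A minimal ICP sketch with nearest-neighbour correspondences, which stands in for the thesis's own correspondence method.

```python
import numpy as np
from scipy.spatial import cKDTree

def icp(control_pts, target_pts, iters=50):
    """Minimal point-to-point ICP.

    Repeatedly matches each control point to its nearest neighbour on the
    other surface and solves for the rigid motion (Kabsch/SVD) that
    minimizes the sum of squared Euclidean distances.
    """
    R, t = np.eye(3), np.zeros(3)
    tree = cKDTree(target_pts)
    src = control_pts.copy()
    for _ in range(iters):
        _, idx = tree.query(src)                 # point correspondences
        matched = target_pts[idx]
        mu_s, mu_m = src.mean(0), matched.mean(0)
        H = (src - mu_s).T @ (matched - mu_m)
        U, _, Vt = np.linalg.svd(H)
        D = np.diag([1, 1, np.sign(np.linalg.det(Vt.T @ U.T))])
        R_step = Vt.T @ D @ U.T                  # best rotation for this matching
        t_step = mu_m - R_step @ mu_s
        src = src @ R_step.T + t_step
        R, t = R_step @ R, R_step @ t + t_step
    return R, t

# Recover a known rigid motion on a random point cloud.
rng = np.random.default_rng(0)
pts = rng.normal(size=(500, 3))
a = np.deg2rad(10)
R_true = np.array([[np.cos(a), -np.sin(a), 0], [np.sin(a), np.cos(a), 0], [0, 0, 1]])
moved = pts @ R_true.T + np.array([0.1, -0.2, 0.05])
R_est, _ = icp(pts, moved)
print(np.allclose(R_est, R_true, atol=1e-2))
```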
18

Zatt, Bruno. "Energy-efficient algorithms and architectures for multiview video coding." Biblioteca Digital de Teses e Dissertações da UFRGS, 2012. http://hdl.handle.net/10183/70197.

Full text
Abstract:
The strong popularization of 3D video over the last decade, allied to the omnipresence of multimedia-capable smart mobile devices, has led to intense development and research focusing on efficient 3D-video encoding techniques, display technologies, and 3D-video-capable mobile devices. In this scenario, the Multiview Video Coding (MVC) standard is a key enabler of current 3D-video systems, providing meaningful data reduction through advanced encoding techniques. However, real-time MVC encoding of high-definition video demands high processing performance and, consequently, high energy consumption. These requirements are met by neither the performance budget nor the energy envelope available in state-of-the-art mobile devices. As a result, the realization of MVC targeting mobile systems has been posing serious challenges to industry and academia. The main goal of this thesis is to propose and demonstrate energy-efficient MVC solutions to enable high-definition 3D-video encoding on mobile battery-powered embedded systems. To deliver high performance under severe energy constraints, this thesis proposes jointly considering energy-efficient optimizations at the algorithmic and architectural levels. On the one hand, extensive application knowledge and data analysis were employed to reduce and control the MVC complexity and energy consumption at the algorithmic level. On the other hand, hardware architectures specifically designed for the proposed algorithms were implemented applying low-power design techniques, dynamic voltage scaling, and application-aware dynamic power management. The algorithmic contribution lies in reducing MVC energy by shortening the computational complexity of the most energy-hungry encoder blocks: the mode decision and the motion and disparity estimation (ME/DE). The proposed energy-efficient algorithms take advantage of the video properties along with the strong correlation available within the 3D neighborhood (spatial, temporal and disparity) space in order to efficiently reduce energy consumption. Our Multi-Level Fast Mode Decision defines two complexity-reduction operation modes able to provide, on average, 63% and 71% complexity reduction, respectively. Additionally, the proposed Fast ME/DE algorithm reduces the complexity by about 83% in the average case. Considering the run-time variations posed by changing coding parameters and video content, an Energy-Aware Complexity Adaptation algorithm is proposed to handle the energy versus coding efficiency tradeoff while providing graceful quality degradation under severe battery-draining scenarios by employing asymmetric video coding. Finally, to cope with eventual video quality losses posed by the energy-efficient algorithms, we define a video quality management technique based on our Hierarchical Rate Control. The Hierarchical Rate Control implements a frame-level rate control based on a Model Predictive Controller able to increase the overall video quality by 0.8 dB (Bjøntegaard). The video quality is increased by 1.9 dB (Bjøntegaard) with the integration of the basic-unit-level rate control designed using a Markov Decision Process and Reinforcement Learning. Even though the energy-efficient algorithms lead to meaningful energy reduction, hardware acceleration is mandatory to reach the energy efficiency demanded by MVC.
Aware of this requirement, this thesis brings architectural solutions for the Motion and Disparity Estimation unit focusing on energy reduction while meeting real-time throughput requirements. To achieve the desired results, as shown throughout this volume, there is a need to reduce the energy related to the ME/DE computation and to the intense memory communication. Therefore, the ME/DE architectures incorporate the Fast ME/DE algorithm in order to reduce the computational complexity, while the memory hierarchy was carefully designed to find the optimal energy tradeoff between external memory accesses and on-chip video memory size. Statistical analyses were used to define the size and organization of the on-chip cache memory while avoiding increased memory misses and the consequent data retransmission. A prefetching technique based on search window prediction also supports the reduction of external memory accesses. Moreover, a memory power-gating technique based on dynamic search window formation and an application-aware power management scheme were proposed to reduce the static energy consumption of the on-chip video memory. To implement these techniques, an SRAM memory featuring multiple power states was used. The architectural contribution contained in this thesis extends the state of the art by achieving real-time ME/DE processing for four-view HD1080p video running at 300 MHz and consuming 57 mW.
19

Mahmood, Nasrul Humaimi. "3D surface reconstruction from multiviews for orthotic and prosthetic design." Thesis, University of Reading, 2009. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.494971.

Full text
Abstract:
Existing methods that use a fringe projection technique for orthotic and prosthetic design produce good results for the trunk and lower limbs; however, the devices used for this purpose are expensive. This thesis investigates the use of an inexpensive passive method involving 3D surface reconstruction from video images taken at multiple views. A design and evaluation methodology, consisting of a number of techniques suitable for orthotic and prosthetic design, is developed. A method is presented that focuses on fitting a reference model (3D model) of an object to the target data (3D data). The 3D model is obtained by a computer program, while the 3D data is obtained with the shape-from-silhouette technique over an approximately circular motion.
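The shape-from-silhouette step can be sketched as visual-hull carving: a voxel survives only if it projects inside the silhouette in every view. A minimal version, assuming known camera matrices and boolean silhouette masks; this is an illustration of the general technique, not the thesis's implementation.

```python
import numpy as np

def carve(silhouettes, projections, grid_pts):
    """Shape-from-silhouette: keep voxels whose projection falls inside
    every silhouette (the visual hull).

    silhouettes: list of boolean HxW masks from the circular camera motion
    projections: matching list of 3x4 camera matrices
    grid_pts   : Nx3 array of candidate voxel centres
    """
    keep = np.ones(len(grid_pts), bool)
    Xh = np.hstack([grid_pts, np.ones((len(grid_pts), 1))])
    for mask, P in zip(silhouettes, projections):
        h, w = mask.shape
        x = Xh @ P.T
        u = np.round(x[:, 0] / x[:, 2]).astype(int)
        v = np.round(x[:, 1] / x[:, 2]).astype(int)
        inside = (x[:, 2] > 0) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
        inside[inside] &= mask[v[inside], u[inside]]   # must land on the silhouette
        keep &= inside
    return grid_pts[keep]
```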
20

Kanaan, Samir. "Multiview pattern recognition methods for data visualization, embedding and clustering." Doctoral thesis, Universitat Politècnica de Catalunya, 2017. http://hdl.handle.net/10803/662852.

Full text
Abstract:
Multiview data is defined as data for whose samples there exist several different data views, i.e. different data matrices obtained through different experiments, methods or situations. Multiview dimensionality reduction methods transform a high-dimensional, multiview dataset into a single, low-dimensional space or projection. Their goal is to provide a more manageable representation of the original data, either for data visualization or to simplify the following analysis stages. Multiview clustering methods receive a multiview dataset and propose a single clustering assignment of the data samples in the dataset, considering the information from all the input data views. The main hypothesis defended in this work is that using multiview data along with methods able to exploit their information richness produces better dimensionality reduction and clustering results than simply using single views or concatenating all views into a single matrix. Consequently, the objectives of this thesis are to develop and test multiview pattern recognition methods based on well-known single-view dimensionality reduction and clustering methods. Three multiview pattern recognition methods are presented: multiview t-distributed stochastic neighbour embedding (MV-tSNE), multiview multidimensional scaling (MV-MDS) and a novel formulation of multiview spectral clustering (MVSC-CEV). These methods can be applied both to dimensionality reduction tasks and to clustering tasks. The MV-tSNE method computes a matrix of probabilities based on distances between samples for each input view. It then merges the different probability matrices using results from expert opinion pooling theory to get a common matrix of probabilities, which is then used as a reference to build a low-dimensional projection of the data whose probabilities are similar. The MV-MDS method computes the common eigenvectors of all the normalized distance matrices in order to obtain a single low-dimensional space that embeds the essential information from all the input spaces, avoiding the inclusion of redundant information. The MVSC-CEV method computes the symmetric Laplacian matrices of the similarity matrices of all data views. It then generates a single, low-dimensional representation of the input data by computing the common eigenvectors of the Laplacian matrices, obtaining a projection of the data that embeds the most relevant information of the input data views, again avoiding the addition of redundant information. A thorough set of experiments has been designed and run in order to compare the proposed methods with their single-view counterparts. The proposed methods have also been compared with all the available results of equivalent methods in the state of the art. Finally, a comparison between the three proposed methods is presented in order to provide guidelines on which method to use for a given task. MVSC-CEV consistently produces better clustering results than other multiview methods in the state of the art. MV-MDS produces overall better results than the reference methods in dimensionality reduction experiments. MV-tSNE does not excel at either of these tasks. Consequently, MVSC-CEV is recommended for multiview clustering tasks, and MV-MDS for multiview dimensionality reduction tasks. Although several multiview dimensionality reduction or clustering methods have been proposed in the state of the art, no software implementation is available.
In order to compensate for this fact and to provide the community with a potentially useful set of multiview pattern recognition methods, an R software package containing the proposed methods has been developed and released to the public.
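The thesis releases its methods as an R package; purely as an illustration of the MVSC-CEV idea, the Python sketch below builds a symmetric normalized Laplacian per view and, as a simple stand-in for extracting common eigenvectors, averages the Laplacians before the eigendecomposition and k-means step.

```python
import numpy as np
from sklearn.cluster import KMeans

def multiview_spectral_clustering(views, k, sigma=1.0):
    """Spectral clustering over several data views (illustrative stand-in)."""
    n = views[0].shape[0]
    L_sum = np.zeros((n, n))
    for X in views:
        d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
        W = np.exp(-d2 / (2 * sigma ** 2))          # Gaussian similarity matrix
        np.fill_diagonal(W, 0)
        d_inv_sqrt = np.diag(1.0 / np.sqrt(W.sum(1) + 1e-12))
        L_sum += np.eye(n) - d_inv_sqrt @ W @ d_inv_sqrt   # symmetric Laplacian
    _, vecs = np.linalg.eigh(L_sum / len(views))
    U = vecs[:, :k]                                  # smallest-eigenvalue vectors
    U /= np.linalg.norm(U, axis=1, keepdims=True) + 1e-12
    return KMeans(n_clusters=k, n_init=10).fit_predict(U)

# Two noisy views of the same two-cluster structure.
rng = np.random.default_rng(1)
base = np.vstack([rng.normal(0, .3, (20, 2)), rng.normal(3, .3, (20, 2))])
labels = multiview_spectral_clustering([base, base + rng.normal(0, .1, base.shape)], k=2)
print(labels)
```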
21

Önder, Gül, and Aydın Kayacık. "Multiview Face Detection Using Gabor Filter and Support Vector Machines." Thesis, Halmstad University, School of Information Science, Computer and Electrical Engineering (IDE), 2008. http://urn.kb.se/resolve?urn=urn:nbn:se:hh:diva-2152.

Full text
Abstract:

Face detection is a preprocessing step for face recognition algorithms. It is the localization of face/faces in an image or image sequence. Once the face(s) are localized, other computer vision algorithms such as face recognition, image compression, camera auto focusing etc are

applied. Because of the multiple usage areas, there are many research efforts in face processing. Face detection is a challenging computer vision problem because of lighting conditions, a high degree of variability in size, shape, background, color, etc. To build fully

automated systems, robust and efficient face detection algorithms are required.

Numerous techniques have been developed to detect faces in a single image; in this project we have used a classification-based face detection method using Gabor filter features. We have designed five frequencies corresponding to eight orientations channels for extracting facial features from local images. The feature vector based on Gabor filter is used as the input of the face/non-face classifier, which is a Support Vector Machine (SVM) on a reduced feature

subspace extracted by using principal component analysis (PCA).

Experimental results show promising performance especially on single face images where 78% accuracy is achieved with 0 false acceptances.
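The described pipeline (a Gabor filter bank with five frequencies and eight orientations, PCA reduction, and an SVM classifier) can be sketched with OpenCV and scikit-learn. The pooling of each filter response into mean and standard deviation, and all sizes and labels below, are illustrative assumptions rather than the thesis's exact settings.

```python
import numpy as np
import cv2
from sklearn.decomposition import PCA
from sklearn.svm import SVC

def gabor_features(gray, frequencies=(0.05, 0.1, 0.2, 0.3, 0.4), n_orient=8):
    """Gabor feature vector: 5 frequencies x 8 orientations, as in the text."""
    feats = []
    for f in frequencies:
        lambd = 1.0 / f                              # wavelength of the carrier
        for i in range(n_orient):
            theta = i * np.pi / n_orient
            kern = cv2.getGaborKernel((15, 15), sigma=3.0, theta=theta,
                                      lambd=lambd, gamma=0.5, psi=0)
            resp = cv2.filter2D(gray.astype(np.float32), cv2.CV_32F, kern)
            feats += [resp.mean(), resp.std()]       # compact channel encoding
    return np.array(feats)

# Train a face/non-face classifier on PCA-reduced Gabor features
# (random patches and labels stand in for a real training set).
rng = np.random.default_rng(0)
X = np.array([gabor_features(rng.integers(0, 256, (32, 32))) for _ in range(40)])
y = np.array([0] * 20 + [1] * 20)
pca = PCA(n_components=10).fit(X)
clf = SVC(kernel="rbf").fit(pca.transform(X), y)
print(clf.predict(pca.transform(X[:3])))
```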

22

Tola, Engin. "Multiview 3D Reconstruction of a Scene Containing Independently Moving Objects." Master's thesis, METU, 2005. http://etd.lib.metu.edu.tr/upload/12606411/index.pdf.

Full text
Abstract:
In this thesis, the structure from motion problem for calibrated scenes containing independently moving objects (IMO) has been studied. For this purpose, the overall reconstruction process is partitioned into various stages. The first stage deals with the fundamental problem of estimating structure and motion by using only two views. This process starts with finding some salient features using a sub-pixel version of the Harris corner detector. The features are matched with the help of a similarity- and neighborhood-based matcher. In order to reject the outliers and estimate the fundamental matrix of the two images, a robust estimation is performed via the RANSAC and normalized 8-point algorithms. Two-view reconstruction is finalized by decomposing the fundamental matrix and estimating the 3D point locations as a result of triangulation. The second stage of the reconstruction is the generalization of the two-view algorithm to the N-view case. This goal is accomplished by first reconstructing an initial framework from the first stage and then relating additional views by finding correspondences between each new view and the already reconstructed views. In this way, 3D-2D projection pairs are determined, and the projection matrix of the new view is estimated using a robust procedure. The final section deals with scenes containing IMOs. In order to reject the correspondences due to moving objects, the parallax-based rigidity constraint is used. In utilizing this constraint, an automatic background pixel selection algorithm is developed and an IMO rejection algorithm is also proposed. The results of the proposed algorithm are compared against those of a robust outlier rejection algorithm and found to be quite promising in terms of execution time versus reconstruction quality.
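The two-view stage maps closely onto standard OpenCV building blocks. A sketch under the stated calibrated-scene assumption, estimating the essential matrix directly with RANSAC; ORB features stand in for the thesis's sub-pixel Harris detector and similarity matcher.

```python
import numpy as np
import cv2

def two_view_reconstruction(img1, img2, K):
    """Calibrated two-view structure from motion: detect and match features,
    reject outliers robustly, recover the relative pose and triangulate.

    img1, img2: grayscale images; K: 3x3 intrinsic matrix.
    """
    orb = cv2.ORB_create(2000)
    k1, d1 = orb.detectAndCompute(img1, None)
    k2, d2 = orb.detectAndCompute(img2, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(d1, d2)
    p1 = np.float32([k1[m.queryIdx].pt for m in matches])
    p2 = np.float32([k2[m.trainIdx].pt for m in matches])
    # RANSAC outlier rejection while estimating the epipolar geometry.
    E, inl = cv2.findEssentialMat(p1, p2, K, method=cv2.RANSAC, threshold=1.0)
    inl = inl.ravel().astype(bool)
    # Decompose into rotation and translation, then triangulate 3D points.
    _, R, t, _ = cv2.recoverPose(E, p1[inl], p2[inl], K)
    P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = K @ np.hstack([R, t])
    X = cv2.triangulatePoints(P1, P2, p1[inl].T, p2[inl].T)
    return (X[:3] / X[3]).T, R, t
```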
23

Muddala, Suryanarayana Murthy, Mårten Sjöström, Roger Olsson, and Sylvain Tourancheau. "Edge-aided virtual view rendering for multiview video plus depth." Mittuniversitetet, Avdelningen för informations- och kommunikationssystem, 2013. http://urn.kb.se/resolve?urn=urn:nbn:se:miun:diva-18474.

Full text
Abstract:
Depth-Image-Based Rendering (DIBR) of virtual views is a fundamental method in three-dimensional (3D) video applications to produce different perspectives from texture and depth information, in particular the multi-view-plus-depth (MVD) format. Artifacts are still present in virtual views as a consequence of imperfect rendering using existing DIBR methods. In this paper, we propose an alternative DIBR method for MVD. In the proposed method we introduce an edge pixel and interpolate pixel values in the virtual view using the actual projected coordinates from two adjacent views, by which cracks and disocclusions are automatically filled. In particular, we propose a method to merge pixel information from two adjacent views in the virtual view before the interpolation; we apply a weighted averaging of projected pixels within the range of one pixel in the virtual view. We compared virtual view images rendered by the proposed method to the corresponding view images rendered by state-of-the-art methods. Objective metrics demonstrated an advantage of the proposed method for most investigated media contents. Subjective test results showed preference for different methods depending on media content, and the test could not demonstrate a significant difference between the proposed method and state-of-the-art methods.
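The merging step (weighted averaging of projected pixels within the range of one pixel in the virtual view) can be sketched directly. A minimal, unoptimized version; the projected coordinates are assumed to come from a separate warping step, and the weighting scheme is an illustrative choice.

```python
import numpy as np

def merge_projected(coords_a, colors_a, coords_b, colors_b, width, height):
    """Merge pixels projected from two adjacent views into the virtual view.

    Every projected sample whose real-valued coordinate lies within one
    pixel of an integer grid position contributes to it, weighted by
    proximity, which fills cracks without a separate inpainting pass.
    """
    acc = np.zeros((height, width))
    wsum = np.zeros((height, width))
    for coords, colors in ((coords_a, colors_a), (coords_b, colors_b)):
        for (x, y), c in zip(coords, colors):
            xi, yi = int(np.floor(x)), int(np.floor(y))
            for u in (xi, xi + 1):
                for v in (yi, yi + 1):
                    if 0 <= u < width and 0 <= v < height:
                        w = max(0.0, 1 - abs(u - x)) * max(0.0, 1 - abs(v - y))
                        acc[v, u] += w * c
                        wsum[v, u] += w
    return np.where(wsum > 0, acc / np.maximum(wsum, 1e-12), 0)

coords = np.array([[10.3, 5.7], [11.1, 5.2]])
colors = np.array([120.0, 130.0])
out = merge_projected(coords, colors, coords + 0.4, colors, 32, 16)
print(out[5:7, 10:13].round(1))
```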
APA, Harvard, Vancouver, ISO, and other styles
24

Feja, Sven [Verfasser]. "Validierung von MultiView-basierten Prozessmodellen mit grafischen Validierungsregeln / Sven Feja." Kiel : Universitätsbibliothek Kiel, 2012. http://d-nb.info/1022376136/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
25

Zhang, Guang Yao. "Computational complexity optimization on H.264 scalable/multiview video coding." Thesis, University of Central Lancashire, 2014. http://clok.uclan.ac.uk/10569/.

Full text
Abstract:
The H.264/MPEG-4 Advanced Video Coding (AVC) standard is a high-efficiency and flexible video coding standard compared to previous standards. The high efficiency is achieved by utilizing a comprehensive full-search motion estimation method. Although the H.264 standard improves the visual quality at low bitrates, it enormously increases the computational complexity. The research described in this thesis focuses on optimization of the computational complexity of H.264 scalable and multiview video coding. Nowadays, video application areas range from multimedia messaging and mobile video to high-definition television, and they use different types of transmission systems. The Scalable Video Coding (SVC) extension of the H.264/AVC standard is able to scale the video stream in order to adapt to a variety of devices with different capabilities. Furthermore, a rate control scheme is utilized to improve the visual quality under the constraints of capability and channel bandwidth; however, the computational complexity is increased. A simplified rate control scheme is proposed to reduce the computational complexity. In the proposed scheme, the quantisation parameter can be computed directly instead of using the exhaustive Rate-Quantisation model. The linear Mean Absolute Distortion (MAD) prediction model is used to predict scene changes, and the quantisation parameter is increased directly by a threshold when the scene changes abruptly; otherwise, the comprehensive Rate-Quantisation model is used. Results show that the optimized rate control scheme is effective at saving encoding time. Multiview Video Coding (MVC) is efficient at reducing the huge amount of data in multiple-view video coding. Inter-view reference frames from the adjacent views are exploited for prediction in addition to temporal prediction. However, due to the increase in the number of reference frames, the computational complexity is also increased. In order to manage the reference frames efficiently, a phase correlation algorithm is utilized to remove inefficient inter-view reference frames from the reference list. The dependency between an inter-view reference frame and the current frame is decided based on the phase correlation coefficients. If the inter-view reference frame is highly related to the current frame, it remains enabled in the reference list; otherwise, it is disabled. The experimental results show that the proposed scheme saves encoding time without loss in visual quality or increase in bitrate. The proposed optimization algorithms are efficient in reducing the computational complexity of the H.264/AVC extensions, and such low-complexity algorithms are useful in the design of future video coding standards, especially for low-power handheld devices.
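The reference-list decision above rests on phase correlation between the current frame and a candidate inter-view reference. A minimal NumPy sketch of computing the phase-correlation coefficient and thresholding it is shown below; the threshold value is an illustrative assumption, not the thesis's tuned setting.

```python
# Hedged sketch of phase correlation used to gauge how related an
# inter-view reference frame is to the current frame.
import numpy as np

def phase_correlation_peak(cur, ref):
    # Cross-power spectrum normalised to unit magnitude; its inverse FFT
    # peaks sharply when the two frames are related by a mere shift.
    F1 = np.fft.fft2(cur)
    F2 = np.fft.fft2(ref)
    cps = F1 * np.conj(F2)
    cps /= np.abs(cps) + 1e-12
    corr = np.real(np.fft.ifft2(cps))
    return corr.max()  # phase-correlation coefficient, at most 1.0

def keep_in_reference_list(cur, ref, threshold=0.05):
    # Disable weakly correlated inter-view references to save search time.
    return phase_correlation_peak(cur, ref) >= threshold
```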
APA, Harvard, Vancouver, ISO, and other styles
26

Gelman, Andriy. "Compression of multiview images using a sparse layer-based representation." Thesis, Imperial College London, 2012. http://hdl.handle.net/10044/1/9653.

Full text
Abstract:
Multiview images are obtained by recording a scene from different viewpoints. The additional information can be used to improve the performance of various applications ranging from e-commerce to security surveillance. Many such applications process large arrays of images, and therefore it is important to consider how the information is stored and transmitted. In this thesis we address the issue of multiview image compression. Our approach is based on the concept that a point in 3D space maps to a constant-intensity line in specific multiview image arrays. We use this property to develop a sparse representation of multiview images. To obtain the representation we segment the data into layers, where each layer is defined by an object located at a constant depth in the scene. We extract the layers by initialising the layer contours and then iteratively evolving them in the direction which minimises an appropriate cost function. To obtain the sparse representation we reduce the redundancy of each layer by using a multi-dimensional discrete wavelet transform (DWT). We apply the DWT in a separable approach: first across the camera viewpoint dimensions, followed by a 2D DWT applied to the spatial dimensions. The camera viewpoint DWT is modified to take into account the structure of each layer, and also the occluded regions. Based on the sparse representation, we propose two compression algorithms. The first is a centralised approach, which achieves a high compression but requires the transmission of all the data. The second is an interactive method, which trades off compression performance in order to facilitate random access to the multiview image dataset. In addition, we address the issue of rate allocation between encoding of the layer contours and the texture. We demonstrate that the proposed centralised and interactive methods outperform H.264/MVC and JPEG 2000, respectively.
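The separable transform described above (a DWT across the camera-viewpoint dimension followed by a 2D spatial DWT) can be illustrated with a single-level Haar decomposition in pure NumPy. This sketch ignores the thesis's shape-adaptive handling of layer contours and occlusions; even-length dimensions are assumed, and all function names are ours.

```python
# Single-level separable Haar DWT over a layer stored as an array of
# shape (views, height, width); even sizes along each axis assumed.
import numpy as np

def haar_1d(a, axis):
    a = np.moveaxis(a, axis, 0)
    lo = (a[0::2] + a[1::2]) / np.sqrt(2)  # approximation band
    hi = (a[0::2] - a[1::2]) / np.sqrt(2)  # detail band
    return np.moveaxis(lo, 0, axis), np.moveaxis(hi, 0, axis)

def separable_multiview_dwt(stack):
    # First across the camera-viewpoint dimension...
    lo_v, hi_v = haar_1d(stack, axis=0)
    subbands = []
    for band in (lo_v, hi_v):
        # ...then a 2D spatial DWT on each resulting band.
        l, h = haar_1d(band, axis=1)   # rows
        ll, lh = haar_1d(l, axis=2)    # columns
        hl, hh = haar_1d(h, axis=2)
        subbands.append((ll, lh, hl, hh))
    return subbands
```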
APA, Harvard, Vancouver, ISO, and other styles
27

Bilen, Cagdas. "A Hybrid Approach For Full Frame Loss Concealment Of Multiview Video." Master's thesis, METU, 2007. http://etd.lib.metu.edu.tr/upload/12608559/index.pdf.

Full text
Abstract:
Multiview video is one of the emerging research areas, especially within the video coding community. Transmission of multiview video over an error-prone network is possible with efficient compression of these videos. Alongside work on efficient compression of multiview video, however, new error concealment and error protection methods are necessary to overcome the problems caused by erroneous channel conditions in practical applications. In packet-switching networks, packet losses may lead to block losses in a frame or the loss of an entire frame in an encoded video sequence. In recent years several algorithms have been proposed to handle the loss of an entire frame efficiently. However, methods for full frame losses in stereoscopic or multiview videos are limited in the literature. In this thesis a stereoscopic approach for full frame loss concealment of multiview video is proposed. In the proposed methods, the redundancy and disparity between the views and the motion information between previously decoded frames are used to estimate the lost frame. Even though multiview video can be composed of more than two views, at most three views are utilized for concealment. The performance of the proposed algorithms is tested against monoscopic methods and the conditions under which the proposed methods are superior are investigated. The proposed algorithms are applied to both stereoscopic and multiview video.
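One ingredient of such concealment, disparity-compensated prediction of the lost frame from a neighboring view, can be sketched with simple block matching. This toy version is our own simplification, not the thesis algorithm: it estimates a horizontal disparity per block against the last correctly decoded frame of the lost view (assuming limited motion between consecutive frames) and copies the matched block from the other view. Block size and search range are illustrative.

```python
# Toy sketch: conceal a lost frame by block-based disparity compensation
# from the other view of a rectified stereo pair.
import numpy as np

def conceal_from_other_view(other, prev_lost, block=16, search=32):
    h, w = other.shape
    out = np.zeros_like(other)  # border remainder stays zero in this sketch
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            # Co-located block in the previous decoded frame of the lost view
            tgt = prev_lost[y:y+block, x:x+block].astype(int)
            best, best_sad = 0, np.inf
            for d in range(-search, search + 1):  # horizontal disparity only
                xs = x + d
                if 0 <= xs and xs + block <= w:
                    cand = other[y:y+block, xs:xs+block].astype(int)
                    sad = np.abs(cand - tgt).sum()
                    if sad < best_sad:
                        best_sad, best = sad, d
            out[y:y+block, x:x+block] = other[y:y+block, x+best:x+best+block]
    return out
```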
APA, Harvard, Vancouver, ISO, and other styles
28

Wood-Harper, A. T. C. "Comparison of information systems definition methodologies : an action research, multiview perspective." Thesis, University of East Anglia, 1989. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.235602.

Full text
APA, Harvard, Vancouver, ISO, and other styles
29

Suárez, Vanessa Isabel Tardillo. "Sistema de microscopia com multi-pontas : força atômica e campo próximo." Universidade Federal de Alagoas, 2012. http://repositorio.ufal.br/handle/riufal/1008.

Full text
Abstract:
In this work we review the operation of a multi-probe microscope combining Atomic Force Microscopy and Near-Field Scanning Optical Microscopy. Currently, the Nanonics Multiview 4000 installed at the Materials Characterization and Microscopy Laboratory (LCMMAT) is not yet fully operational. At present we are able to perform Atomic Force Microscopy (AFM) and reflection- and transmission-mode Scanning Near-Field Optical Microscopy (SNOM) measurements. This kind of microscope has three probes capable of simultaneous measurements of AFM, C-AFM, SNOM and Raman microscopy, as well as nanolithography. It is the first multi-probe microscope to be installed in Latin America. This work studies the structure of this kind of microscope, how it performs AFM and SNOM measurements, and how those measurements are analyzed. We examine the different electronic circuits used in this kind of microscope and compare optical and tuning-fork feedback. We explain step by step how to perform AFM and SNOM measurements, and study the processing and analysis of these measurements. Finally, we present several measurements made with these techniques. Some of these measurements are compared with results found in the literature in order to identify possible applications that could be useful for future research at our laboratory.
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior
APA, Harvard, Vancouver, ISO, and other styles
30

Koreshev, Iliya. "Improvements of interpolation and extrapolation view synthesis rendering for 3D and multiview displays." Thesis, University of British Columbia, 2013. http://hdl.handle.net/2429/45009.

Full text
Abstract:
To display video content in 3D, traditional stereoscopic televisions require two views of the same scene filmed at a small distance from one another. Unfortunately, having the required number of views is not always possible due to the complexity of obtaining them and the required bandwidth for transmission. In cases where more advanced auto-stereoscopic televisions require more than two views, the issue of obtaining and transmitting those additional views becomes even more complex. These issues led to the idea of having a small number of real views and their corresponding depth maps, showing the distance of each object from the viewing plane, which together can be used to generate virtual intermediate views. These virtual synthesized views are generated by moving different objects in the real views a specific number of pixels based on their distance from the viewing plane. The need for synthesizing virtual views is more pronounced with the introduction of stereoscopic and autostereoscopic (multiview) displays to the consumer market. In this case, as it is not practical to capture all the required views for different multiview display technologies, a limited number of views are captured and the remaining views are synthesized using the available views. View synthesis is also important in converting existing 2D content to 3D, a development that is necessary in the quest for 3D content, which has been deemed a vital factor for faster adoption of 3D technology. In this thesis a new hybrid approach for synthesizing views for stereoscopic and multiview applications is presented. This approach utilizes a unique and effective hole-filling method that generates high-quality 3D content. First, we present a new method for view interpolation where missing-texture areas are filled with data from the other available view and a unique warping approach that stretches background objects to fill in these areas. Second, a view extrapolation method is proposed where small areas of the image are filled using nearest-neighbor interpolation and larger areas are filled with the same unique image warping approach. Subjective evaluations confirm that this approach outperforms current state-of-the-art pixel-interpolation-based as well as existing warping-based techniques.
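The hole-filling intuition above, that missing areas should be completed from the background rather than the foreground, can be illustrated with a simple depth-guided row-wise fill. This is a minimal sketch under assumed conventions (a binary hole mask, a warped depth map available at non-hole pixels, larger depth value meaning farther from the camera), not the thesis's warping-based method.

```python
# Minimal sketch of depth-guided hole filling in a synthesized view:
# each disoccluded pixel is copied from the nearest non-hole neighbour
# on the background side of the hole.
import numpy as np

def fill_holes_from_background(img, depth, hole):
    out = img.copy()
    h, w = hole.shape
    for y in range(h):
        for x in np.where(hole[y])[0]:
            left, right = x - 1, x + 1
            while left >= 0 and hole[y, left]:
                left -= 1
            while right < w and hole[y, right]:
                right += 1
            cands = [c for c in (left, right) if 0 <= c < w]
            if cands:
                # pick the neighbour farther from the camera (background);
                # larger depth = farther is an assumed convention here
                src = max(cands, key=lambda c: depth[y, c])
                out[y, x] = img[y, src]
    return out
```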
APA, Harvard, Vancouver, ISO, and other styles
31

Chu, Jiaqi. "Orbital angular momentum encoding/decoding of 2D images for scalable multiview colour displays." Thesis, University of Cambridge, 2018. https://www.repository.cam.ac.uk/handle/1810/274903.

Full text
Abstract:
Three-dimensional (3D) displays project 3D images that give 3D perceptions and mimic real-world objects. Among the rich variety of 3D displays, multiview displays take advantage of light's various degrees of freedom and provide some of the 3D perceptions by projecting 2D subsamplings of a 3D object. More 2D subsampling is required to project images with smoother parallax and a more realistic sensation. As an additional degree of freedom with a theoretically unlimited state space, orbital angular momentum (OAM) modes may be an alternative to conventional multiview approaches and can potentially project more images. This research involves exploring the possibility of encoding/decoding off-axis points in 2D images with OAM modes, development of the optical system, and design and development of a multiview colour display architecture. The first part of the research explores encoding/decoding off-axis points with OAM modes. Conventionally, OAM modes are used to encode/decode on-axis information only. Analysis of on-axis OAM beams referenced to off-axis points suggests representing off-axis displacements as a set of expanded OAM components. At the current stage, off-axis points within an effective coding area can be encoded/decoded with chosen OAM modes for multiplexing. Experimentally, a 2D image is encoded/decoded with an OAM mode. When the encoding and decoding OAM modes match, the image is reconstructed; otherwise, a dark region with zero intensity is shown. The dark region indicates the effective coding area for multiplexing. The final part of the research develops a multiview colour display. Based on the off-axis representation as a set of different OAM components and experimental tests of the optical system, three 1 mm monochromatic images are encoded, multiplexed and projected. Having studied wavelength effects on OAM coding, the initial architecture is updated to a scalable colour display using four wavelengths.
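The matched/mismatched decoding behaviour described above can be mimicked numerically with spiral phase masks. The NumPy sketch below is purely conceptual (scalar field, no propagation or mode filtering): encoding multiplies the image by exp(i*l*theta), decoding by the conjugate mask of mode l'; when l' equals l the phase cancels and the image is recovered, otherwise a residual helical phase of charge l - l' remains.

```python
# Conceptual sketch of OAM encoding/decoding of a 2D image field.
import numpy as np

def oam_phase(shape, l):
    h, w = shape
    y, x = np.mgrid[:h, :w]
    theta = np.arctan2(y - h / 2, x - w / 2)
    return np.exp(1j * l * theta)  # spiral phase of topological charge l

img = np.random.rand(256, 256)             # stand-in for a 2D image field
encoded = img * oam_phase(img.shape, 3)    # encode with OAM mode l = 3

matched    = encoded * np.conj(oam_phase(img.shape, 3))  # decode with l' = 3
mismatched = encoded * np.conj(oam_phase(img.shape, 5))  # decode with l' = 5

print(np.allclose(matched.real, img))  # True: image reconstructed
print(np.ptp(np.angle(mismatched)))    # residual spiral phase (charge -2 here)
```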
APA, Harvard, Vancouver, ISO, and other styles
32

Dou, Qingxu. "From small to large baseline multiview stereo : dealing with blur, clutter and occlusions." Thesis, Heriot-Watt University, 2011. http://hdl.handle.net/10399/2466.

Full text
Abstract:
This thesis addresses the problem of reconstructing the three-dimensional (3D) digital model of a scene from a collection of two-dimensional (2D) images taken from it. To address this fundamental computer vision problem, we propose three algorithms. They are the main contributions of this thesis. First, we solve multiview stereo with the off-axis aperture camera. This system has a very small baseline as images are captured from viewpoints close to each other. The key idea is to change the size or the 3D location of the aperture of the camera so as to extract selected portions of the scene. Our imaging model takes both defocus and stereo information into account and allows us to solve shape reconstruction and image restoration in one go. The off-axis aperture camera can be used in a small-scale space where the camera motion is constrained by the surrounding environment, such as in 3D endoscopy. Second, to solve multiview stereo with large baseline, we present a framework that poses the problem of recovering a 3D surface in the scene as a regularized minimal partition problem of a visibility function. The formulation is convex and hence guarantees that the solution converges to the global minimum. Our formulation is robust to view-varying extensive occlusions, clutter and image noise. At any stage during the estimation process the method does not rely on the visual hull, 2D silhouettes, approximate depth maps, or knowing which views are dependent (i.e., overlapping) and which are independent (i.e., non-overlapping). Furthermore, the degenerate solution, the null surface, is not included as a global solution in this formulation. One limitation of this algorithm is that its computational complexity grows with the number of views that we combine simultaneously. To address this limitation, we propose a third formulation. In this formulation, the visibility functions are integrated within a narrow band around the estimated surface by setting weights to each point along optical rays. This thesis presents technical descriptions of each algorithm and detailed analyses to show how these algorithms improve existing reconstruction techniques.
APA, Harvard, Vancouver, ISO, and other styles
33

Sampaio, Felipe Martin. "Energy-efficient memory hierarchy for motion and disparity estimation in multiview video coding." reponame:Biblioteca Digital de Teses e Dissertações da UFRGS, 2013. http://hdl.handle.net/10183/71292.

Full text
Abstract:
This Master's Thesis proposes a memory hierarchy for Motion and Disparity Estimation (ME/DE) centered on the encoding references, called Reference-Centered Data Reuse (RCDR), focusing on energy reduction in Multiview Video Coding (MVC). In MVC encoders the ME/DE represents more than 98% of the overall energy consumption. Moreover, up to 90% of the overall ME/DE energy is related to memory, and only 10% to actual computation. Two issues are of concern: (1) off-chip memory communication to fetch the reference samples (45%) and (2) on-chip memory to keep the search window samples stored and send them to the ME/DE processing core (45%). The main goal of this work is to jointly minimize the on-chip and off-chip energy consumption related to ME/DE in MVC. The memory hierarchy is composed of an on-chip video memory (which stores the entire search window), an on-chip memory gating control, and a partial results compressor. A search control unit is also proposed to exploit the search behavior to achieve further energy reduction. This work also aggregates a low-complexity reference frame compressor to the memory hierarchy. The experimental results prove that the proposed system accomplishes the goal of jointly minimizing the on-chip and off-chip energies. The RCDR provides off-chip energy savings of up to 68% when compared to the state-of-the-art MB-centered approach. The partial results compressor reduces by 52% the off-chip memory communication needed to handle this RCDR overhead. When compared to techniques that do not access the entire search window, the proposed RCDR also achieves the best results in off-chip energy consumption, due to a regular access pattern that allows many DDR burst reads (30% less off-chip energy consumption). Besides, the reference frame compressor is able to improve the off-chip memory communication savings by 2.6x, with negligible losses in MVC encoding performance. The on-chip video memory size required for the RCDR is up to 74% smaller than for MB-centered Level C approaches. On top of that, the power-gating control saves 82% of leakage energy. The dynamic energy is addressed by the candidate merging technique, with savings of more than 65%. Owing to the joint off-chip communication and on-chip storage energy savings, the proposed memory hierarchy system is able to meet the MVC constraints for ME/DE processing.
APA, Harvard, Vancouver, ISO, and other styles
34

Goyal, Anil. "Learning a Multiview Weighted Majority Vote Classifier : Using PAC-Bayesian Theory and Boosting." Thesis, Lyon, 2018. http://www.theses.fr/2018LYSES037/document.

Full text
Abstract:
With the tremendous generation of data, we have data collected from different information sources with heterogeneous properties, so it is important to consider these representations, or views, of the data. This machine learning problem is referred to as multiview learning. It has many applications; for example, in medical imaging we can represent the human brain with different sets of features such as MRI, t-fMRI, EEG, etc. In this thesis, we focus on supervised multiview learning, where we see multiview learning as a combination of different view-specific classifiers or views. Therefore, from our point of view, it is interesting to tackle the multiview learning issue through the PAC-Bayesian framework. It is a tool derived from statistical learning theory for studying models expressed as majority votes. One of the advantages of PAC-Bayesian theory is that it allows one to directly capture the trade-off between accuracy and diversity between voters, which is important for multiview learning. The first contribution of this thesis extends the classical PAC-Bayesian theory (with a single view) to multiview learning (with more than two views). To do this, we considered a two-level hierarchy of distributions over the view-specific voters and the views. Based on this strategy, we derived PAC-Bayesian generalization bounds (both probabilistic and expected risk bounds) for multiview learning. From a practical point of view, we designed two multiview learning algorithms based on our two-level PAC-Bayesian strategy. The first algorithm is a one-step boosting-based multiview learning algorithm called PB-MVBoost. It iteratively learns the weights over the views by optimizing the multiview C-Bound, which controls the trade-off between the accuracy and the diversity of the views. The second algorithm is based on a late fusion approach where we combine the predictions of view-specific classifiers using the PAC-Bayesian algorithm CqBoost proposed by Roy et al. Finally, we show that minimization of the classification error for the multiview weighted majority vote is equivalent to the minimization of Bregman divergences. This allowed us to derive a parallel-update optimization algorithm (referred to as MωMvC2) to learn our multiview weighted majority vote.
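A minimal sketch of the two-level multiview weighted majority vote described above is given below; the learning of the weights (e.g., PB-MVBoost's C-Bound optimization) is omitted, and the weights, predictions, and array layout are illustrative assumptions. Labels are in {-1, +1}.

```python
# Two-level hierarchy: weights over each view's voters (level 1), then
# weights over the views themselves (level 2).
import numpy as np

def multiview_majority_vote(preds, voter_w, view_w):
    """preds[v]  : (n_voters_v, n_samples) array of {-1,+1} predictions
    voter_w[v]   : weights over view v's voters (sum to 1)
    view_w       : weights over the views (sum to 1)"""
    per_view = [voter_w[v] @ preds[v] for v in range(len(preds))]  # level 1
    combined = sum(w * mv for w, mv in zip(view_w, per_view))      # level 2
    return np.sign(combined)

# Example with two views, each holding two voters, on three samples:
preds = [np.array([[1, -1, 1], [1, 1, -1]]),
         np.array([[-1, -1, 1], [1, -1, 1]])]
voter_w = [np.array([0.7, 0.3]), np.array([0.5, 0.5])]
view_w = np.array([0.6, 0.4])
print(multiview_majority_vote(preds, voter_w, view_w))  # [ 1. -1.  1.]
```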
APA, Harvard, Vancouver, ISO, and other styles
35

Marques, Hugo R. "Efficient and scalable architecture for multiview real-time media distribution for next generation networks." Thesis, University of Surrey, 2017. http://epubs.surrey.ac.uk/813526/.

Full text
Abstract:
With the massive deployment of broadband access to end-users, the continuous improvement of the hardware capabilities of end devices and better video compression techniques, acceptable conditions have been met to unleash over-the-top bandwidth-demanding and time-stringent P2P applications, such as multiview real-time media distribution. Such applications enable the transmission of multiple views of the same scene, providing consumers with a more immersive visual experience. This thesis proposes an architecture to distribute multiview real-time media content using hybrid DVB-T2, client-server and P2P paradigms, supported by a likewise novel QoS solution. The approach minimizes packet delay, inter-ISP traffic and traffic at the ISP core network, which are some of the main drawbacks of P2P networks, whilst still meeting stringent QoS demands. The proposed architecture uses DVB-T2 to distribute a self-contained and fully decodable base-layer video signal, assumed to be always available to the end-user, and an IP network to distribute in parallel - with increased delay - additional IP video streams. The result is a decoded video quality that adapts to individual end-user conditions and maximizes viewing experience. To achieve its target goal this architecture: defines new services for the ISP’s services network and new roles for the ISP core, edge and border routers; makes use of pure IP multicast transmission at the ISP’s core network, greatly minimizing bandwidth consumption; constructs a geographically contained P2P network that uses P2P application-level multicast trees to assist the distribution of the IP video streams at the ISP access networks, greatly reducing inter-ISP traffic; and describes a novel QoS control architecture that takes advantage of Internet resource over-provisioning techniques to meet stringent QoS demands in a scalable manner. The proposed architecture has been implemented in both a real testbed implementation and ns-2 simulations. Results have shown a highly scalable P2P overlay construction algorithm with very fast computation of application-level multicast trees (in the order of milliseconds) and efficient reaction to peer churn, with no perceptually annoying impairments noticed. Furthermore, huge bandwidth savings are achieved at the ISP core network, which considerably lower the management and investment costs in infrastructure. The QoS-based results have also shown that the proposed approach effectively deploys a fast and scalable resource and admission control mechanism, greatly minimizing QoS-related signalling events by using a per-class over-provisioning approach and thus preventing per-flow QoS reservation signalling messages. Moreover, the QoS control architecture is aware of network link resources in real time and supports service differentiation and network convergence by guaranteeing that each admitted traffic flow receives the contracted QoS. Finally, the proposed Scalable Architecture for Multiview Real-Time Media Distribution for Next Generation Networks, as a component of a large project demonstrator, has been evaluated by an independent panel of experts following ITU recommendations, obtaining an excellent evaluation as computed by Mean Opinion Score.
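As an illustration of the P2P application-level multicast trees mentioned in the abstract above, the sketch below builds a tree in which each peer forwards the stream to at most a fixed number of children; breadth-first assignment keeps the tree shallow, which bounds the added playout delay. This is a generic sketch, not the thesis's algorithm; the peer names and the uniform-capacity model are assumptions.

```python
# Capacity-constrained, breadth-first construction of an application-level
# multicast tree rooted at the stream source.
from collections import deque

def build_multicast_tree(source, peers, capacity=2):
    parent = {source: None}
    slots = deque([source] * capacity)  # open forwarding slots, FIFO = BFS
    for p in peers:
        u = slots.popleft()             # attach to the shallowest open slot
        parent[p] = u
        slots.extend([p] * capacity)    # the new peer offers its own slots
    return parent

tree = build_multicast_tree("src", [f"peer{i}" for i in range(6)])
for node, par in tree.items():
    print(f"{node} <- {par}")
```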
APA, Harvard, Vancouver, ISO, and other styles
36

Renström, Ida. "Evaluation of autostereoscopic 3Dvideo for short-term exposure : produced using semiautomatic stereo-to-multiview conversion." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-161885.

Full text
Abstract:
In this study, I investigate semiautomatic conversion from stereoscopic 3D to autostereoscopic 3D with multiple views. The conversion simplifies the production compared to creating autostereoscopic 3D from scratch. The research question of this study is: what level of holistic experience can be achieved from converted 3D compared to stereoscopic 3D video with glasses? The intended 3D contexts for this study require or are favored by glasses-free 3D with multiple views, and the exposure to 3D is short-term. I conducted user tests in a controlled setting as well as a public setting. The results show that it is difficult to make a general evaluation of the user experience of the final product, because different individuals perceive 3D very differently. Results from experiments in a controlled setting, where stereoscopic 3D was used as a direct reference, indicate that converted autostereoscopic 3D does not achieve the same perceived video quality as stereoscopic 3D. However, the fact that no glasses are needed compensates for this in the overall user experience. In an experiment with a public setting, where the participants' previous experiences of stereoscopic 3D were used as a reference, a majority perceived the quality of converted autostereoscopic 3D to be better than, or equivalent to, that of previous experiences with stereoscopic 3D. A majority also said that the experience was positive. The latter experiment made use of an environment and situation that was close to real life and the intended types of contexts. Therefore, these results argue that autostereoscopic multiview 3D video converted from stereoscopic 3D is useful and gives a good holistic experience compared to stereoscopic 3D with glasses, in contexts favored by glasses-free 3D with multiple views and where the exposure to 3D is short-term. An autostereoscopic display in a retail space, where people walk by and view advertised material for a few seconds, is one example of a context suited for converted autostereoscopic 3D.
APA, Harvard, Vancouver, ISO, and other styles
37

Watson, Heather. "A critical study of the multiview methodology : a poststructuralist textual analysis of concepts in inquiry." Thesis, University of Salford, 1995. http://usir.salford.ac.uk/14713/.

Full text
Abstract:
This thesis considers the concept of information as meaning through the following research question: how can we work critically with a tradition of information systems development methodologies? Motivation for this derives from the way 'hard' methodologies have traditionally regarded information as structured data. This neglects 'soft' concerns for how people attribute meaning to data through a process of 'inward-forming' as they use data to make sense of a situation. The research is potentially important insofar as it considers how viewing information as structured data may have confused attempts at theory building. That is, if information is conceived of as structured data, then this may be reflected in how we conceive of a methodology's theory with the result that the meaning of a methodology becomes guaranteed by the theory. This gives rise to a prescriptive tradition of theory that is potentially misleading because it neglects the personal skills of those who use methodologies. This is investigated through a descriptive/interpretive research approach using a poststructuralist textual analysis of concepts in the theory and practice of a methodology. While structuralism views meaning as something static contained 'within' a text that readers passively consume, poststructuralism emphasises how readers actively derive meaning through their interactions with texts. In addressing the hermeneutic and deconstructive aspects of poststructuralism, the research draws on the philosophers, Paul Ricoeur and Jacques Derrida respectively. With regard to Derrida, deconstruction is used to argue how the main position asserted by a methodology's texts is undermined by elements within the texts themselves. This critically questions the foundations on which a methodology claims to be based. The general purpose is to build theories of methodology that address information as meaning. To this end, the thesis centres on four areas of investigation: it considers themes associated with linking 'hard' and 'soft' methodologies, investigates a specific methodology that links such approaches, raises a critical element by deconstructing concepts in inquiry, and considers implications for the relationship between theory and practice of methodology. The area of application for the research was Multiview Methodology (MVM) because it combines a range of existing methodologies that reflect 'soft' concerns for how people interpret meaning as well as a traditional 'hard' focus on structuring data for use on computerised information systems. The deconstructive approach used in this research is not yet common in the field of information systems. As such, this research is intended to contribute towards new critical strategies that challenge methodologies as conceptual systems in their own right as distinct from strategies that challenge their authors. 
Focusing on the conceptual implications of methodologies rather than their authors' intentions resulted in four main outcomes: a conception of paradigm as network, which refers to a shared conception of meaning, though commitments to beliefs in particular models vary from heuristic to ontological; a Trojan horse phenomenon, which refers to tendencies to reiterate limitations criticised in others; constraints of traditional print media insofar as these are associated with linear and static descriptions of methodology in use; and methodology as metaphor, which refers to the process through which we understand the unfamiliar in terms of the familiar thereby creating new concepts while still retaining aspects of our past experiences.
APA, Harvard, Vancouver, ISO, and other styles
38

Sheik, Osman Wan Rozaini. "Using multiview to develop information systems for Malaysian small and medium scale manufacturing industries (SMIs)." Thesis, University of Salford, 1996. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.360442.

Full text
APA, Harvard, Vancouver, ISO, and other styles
39

Renström, Ida. "Evaluation of autostereoscopic 3D video for short-term exposure : produced using semiautomatic stereo-to-multiview conversion." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-159188.

Full text
APA, Harvard, Vancouver, ISO, and other styles
40

Schweiger, Florian [Verfasser], Eckehard [Akademischer Betreuer] Steinbach, and Gerhard [Akademischer Betreuer] Rigoll. "Spatio-temporal Analysis of Multiview Video / Florian Schweiger. Gutachter: Gerhard Rigoll ; Eckehard Steinbach. Betreuer: Eckehard Steinbach." München : Universitätsbibliothek der TU München, 2013. http://d-nb.info/1045023523/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
41

Montgomery, Peter Roland James. "Improving operations viability and reducing variety using A.D.I.S (Accurate drawing information system) : a multiview methodology of design." Master's thesis, University of Cape Town, 1997. http://hdl.handle.net/11427/9497.

Full text
Abstract:
Includes bibliographical references.
Gabriel S.A. is a South African shock-absorber manufacturing company which has undergone a strategic repositioning to become internationally competitive. This entailed a move away from the traditional hierarchical management structure and production-line manufacturing, to a flatter structure with cross-functional Business Units. Each Business Unit is made up of self-contained manufacturing cells run by self-directed work teams. The objective of this change is to ensure that Gabriel S.A. becomes a world-class manufacturer. The company has gone a long way down this road in implementing World Class Manufacturing techniques through the Gabriel Total Quality Production System (GTQPS). However, problems still arise within the system, especially with regard to new product/component designs and changed designs reaching the shop floor timeously. This is aggravated by the necessity to penetrate new markets and retain existing ones successfully. The number of quotations to be prepared will increase, as will the subsequent number of required assembly and component drawings and modifications to existing products. These, in turn, will involve revisions to current drawings. This is compounded by the fact that in current business operations there are already concerns regarding the routine drawing information requirements. This thesis investigates the effect of the drawing information system on the viability of the manufacturing cells and documents the intervention of a socio-technical drawing information system.
APA, Harvard, Vancouver, ISO, and other styles
42

Berent, Jesse. "Coherent multi-dimensional segmentation of multiview images using a variational framework and applications to image based rendering." Thesis, Imperial College London, 2008. http://hdl.handle.net/10044/1/1419.

Full text
Abstract:
Image Based Rendering (IBR) and in particular light field rendering has attracted a lot of attention for interpolating new viewpoints from a set of multiview images. New images of a scene are interpolated directly from nearby available ones, thus enabling a photorealistic rendering. Sampling theory for light fields has shown that exact geometric information in the scene is often unnecessary for rendering new views. Indeed, the light field function is approximately bandlimited and new views can be rendered using classical interpolation methods. However, IBR using undersampled light fields suffers from aliasing effects and is difficult particularly when the scene has large depth variations and occlusions. In order to deal with these cases, we study two approaches: New sampling schemes have recently emerged that are able to perfectly reconstruct certain classes of parametric signals that are not bandlimited but characterized by a finite number of parameters. In this context, we derive novel sampling schemes for piecewise sinusoidal and polynomial signals. In particular, we show that a piecewise sinusoidal signal with arbitrarily high frequencies can be exactly recovered given certain conditions. These results are applied to parametric multiview data that are not bandlimited. We also focus on the problem of extracting regions (or layers) in multiview images that can be individually rendered free of aliasing. The problem is posed in a multidimensional variational framework using region competition. Extending previous methods, layers are considered as multi-dimensional hypervolumes. Therefore the segmentation is done jointly over all the images and coherence is imposed throughout the data. However, instead of propagating active hypersurfaces, we derive a semi-parametric methodology that takes into account the constraints imposed by the camera setup and the occlusion ordering. The resulting framework is a global multi-dimensional region competition that is consistent in all the images and efficiently handles occlusions. We show the validity of the approach with captured light fields. Other special effects such as augmented reality and disocclusion of hidden objects are also demonstrated.
APA, Harvard, Vancouver, ISO, and other styles
43

Temerinac-Ott, Maja [Verfasser], and Hans [Akademischer Betreuer] Burkhardt. "Multiview reconstruction for 3D Images from light sheet based fluorescence microscopy = Rekonstruktion für 3D Aufnahmen von lichtschichtbasierter Fluoreszenzmikroskopie." Freiburg : Universität, 2012. http://d-nb.info/112347222X/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
44

Bai, Baochun. "Multiview Video Compression." Phd thesis, 2009. http://hdl.handle.net/10048/744.

Full text
Abstract:
With the progress of computer graphics and computer vision technologies, 3D/multiview video applications such as 3D-TV and tele-immersive conferencing are becoming more and more popular and are very likely to emerge as a prime application in the near future. A successful 3D/multiview video system needs synergistic integration of various technologies such as 3D/multiview video acquisition, compression, transmission and rendering. In this thesis, we focus on addressing the challenges of multiview video compression. In particular, we make five major contributions: (1) We propose a novel neighbor-based multiview video compression system which helps remove the inter-view redundancies among multiple video streams and improves performance. An optimal stream encoding order algorithm is designed to enable the encoder to automatically decide the stream encoding order and find the best reference streams. (2) A novel multiview video transcoder is designed and implemented. The proposed multiview video transcoder can be used to encode multiple compressed video streams and reduce the cost of a multiview video acquisition system. (3) A learning-based multiview video compression scheme is invented. The novel multiview video compression algorithms build on recent advances in semi-supervised learning and achieve compression by finding a sparse representation of images. (4) Two novel distributed source coding algorithms, EETG and SNS-SWC, are put forward. Both EETG and SNS-SWC are able to achieve the whole Slepian-Wolf rate region and are syndrome-based schemes. EETG simplifies the code construction algorithm for distributed source coding schemes using an extended Tanner graph and is able to handle mismatched bits at the encoder. SNS-SWC has two independent decoders and thus can simplify the decoding process. (5) We propose a novel distributed multiview video coding scheme which allows flexible rate allocation between two distributed multiview video encoders. SNS-SWC is used as the underlying Slepian-Wolf coding scheme. It is the first work to realize simultaneous Slepian-Wolf coding of stereo videos with the help of a distributed source code that achieves the whole Slepian-Wolf rate region. The proposed scheme has a better rate-distortion performance than the separate H.264 coding scheme in the high-rate case.
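The syndrome-based principle behind codes like EETG and SNS-SWC can be illustrated with a toy Hamming(7,4) example: the encoder transmits only the 3-bit syndrome of a 7-bit source block, and the decoder recovers the block from that syndrome plus correlated side information differing in at most one bit. Real schemes use far stronger codes; this sketch only shows the mechanism.

```python
# Toy syndrome-based Slepian-Wolf coding with a Hamming(7,4) code.
import numpy as np

H = np.array([[1, 0, 1, 0, 1, 0, 1],   # parity-check matrix of Hamming(7,4)
              [0, 1, 1, 0, 0, 1, 1],
              [0, 0, 0, 1, 1, 1, 1]])

def encode(x):                      # transmit 3 syndrome bits instead of 7
    return H @ x % 2

def decode(syndrome, y):
    # Syndrome of the error pattern x XOR y between source and side info.
    diff = (H @ y + syndrome) % 2
    if not diff.any():
        return y.copy()
    # Locate the single differing bit: the column of H equal to `diff`.
    col = np.flatnonzero((H == diff[:, None]).all(axis=0))[0]
    x_hat = y.copy()
    x_hat[col] ^= 1
    return x_hat

x = np.array([1, 0, 1, 1, 0, 0, 1])
y = x.copy(); y[4] ^= 1             # correlated side information
assert (decode(encode(x), y) == x).all()
```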
Computer Networks and Multimedia Systems
APA, Harvard, Vancouver, ISO, and other styles
45

Hsiang-YiChen and 陳湘宜. "A Multiview Video Face Recognition System." Thesis, 2010. http://ndltd.ncl.edu.tw/handle/06064592936521970707.

Full text
Abstract:
Master's thesis
National Cheng Kung University
Institute of Computer and Communication Engineering
98
Human-computer interaction (HCI) has developed rapidly in recent years. Computer vision has been used in surveillance systems and gradually plays an important role in our lives. In this thesis, we discuss the performance of still-image-based and video-based face recognition. To improve the recognition rate, we use multi-view video face images to synthesize a virtual frontal face. For still-image face recognition, we first present a fast face detector. Compared with the Viola-Jones face detector, experimental results show that our method improves detection speed and reduces the false alarm rate. For video-based face recognition, we then use PCA and LDA to analyze how the number of test images, the different face views, and the frontal views generated from non-frontal images affect recognition performance. We used the AT&T and Stereo face databases as well as our own multi-view face database for experiments and validation.
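As a generic illustration of the PCA-based recognition evaluated in the abstract above (the LDA stage and the frontal-view synthesis are omitted), a minimal eigenfaces-style sketch with nearest-neighbour matching might look as follows; X_train, y_train, X_test are assumed flattened face images and labels.

```python
# Eigenfaces-style sketch: PCA feature extraction + 1-NN matching.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier

def pca_face_recognizer(X_train, y_train, n_components=50):
    pca = PCA(n_components=n_components, whiten=True)
    feats = pca.fit_transform(X_train)      # project faces onto eigenfaces
    clf = KNeighborsClassifier(n_neighbors=1).fit(feats, y_train)
    return pca, clf

def predict(pca, clf, X_test):
    return clf.predict(pca.transform(X_test))
```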
APA, Harvard, Vancouver, ISO, and other styles
46

Yu, Jo-Heng, and 余若珩. "Multiview 3-D Laser Foot Scanner." Thesis, 2016. http://ndltd.ncl.edu.tw/handle/40468084335274786086.

Full text
Abstract:
Master's thesis
National Chi Nan University
Department of Computer Science and Information Engineering
104
As 3D scanning technology matures, 3D scanners have become indispensable in many fields. In the footwear industry, when people choose shoes they mostly rely on how the shoes look, how well they fit, and how comfortable they are. However, whether the shoes can support the foot on a long walk and remain comfortable to wear depends largely on whether the shoe size fits the shape of the user's foot. Therefore, this work aims to develop a multi-view 3D laser scanner dedicated to 3D human foot scanning. The developed system consists of easily obtainable parts such as laser line modules, webcams, and stepper motors. The main advantage of the proposed foot scanner is that it does not require the user to wear socks or to paste fiducial markers at the phalange joints. To compute the 3D foot shape, it is essential to calibrate the cameras in order to obtain the extrinsic and intrinsic camera parameters. The camera parameters are used to compose camera projection matrices. Based on the projection matrices, epipolar constraints are evaluated to simplify the stereo matching of laser light stripes projected on the foot surface. The computed stereo correspondences are then used to compute the 3D point cloud of the foot surface. The point cloud is used to compute the length, the width, and the ball girth of the foot. Before 3D scanning, the user is asked to align her/his foot to a reference line marked on the measurement area; therefore, the width and the length of the foot can be computed from the dimensions of the bounding box of the point cloud. Since the ball girth is closely related to the phalange joints of the first and the fifth toes, the first step is to compute those two joint positions in a range proportional to the foot length. Then, a plane perpendicular to the ground plane and passing through the two phalange joint points is constructed and the cross-sectional curve of the foot is computed to evaluate the ball girth. To verify the proposed system, we compare the estimation errors of manual and automatic measurements. Seven subjects, five males and two females, participated in the experiments. The experimental results show that the measurement errors of the proposed method are all less than 5 mm, which is satisfactory for the shoe-fitting application.
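Assuming the point cloud is already axis-aligned (x along the foot, z vertical, units in mm) and the two phalange-joint points have been located, the measurement step described above can be sketched as follows; the function names, the tolerance, and the convex-hull approximation of the cross-sectional curve are our assumptions, not the thesis's exact procedure.

```python
# Sketch: foot length/width from the bounding box, ball girth from the
# cross-section through the two phalange-joint points p1 and p5.
import numpy as np
from scipy.spatial import ConvexHull

def foot_length_width(pts):
    # Bounding-box extents: length along x, width along y.
    return np.ptp(pts[:, 0]), np.ptp(pts[:, 1])

def ball_girth(pts, p1, p5, tol=1.5):
    # Select points near the vertical plane through p1 and p5.
    d = p5[:2] - p1[:2]
    n = np.array([-d[1], d[0]]) / np.linalg.norm(d)  # in-ground-plane normal
    dist = (pts[:, :2] - p1[:2]) @ n
    section = pts[np.abs(dist) < tol]
    # Approximate the girth as the perimeter of the section's 2D convex
    # hull in the cutting plane (coordinates: along d, and height z).
    u = d / np.linalg.norm(d)
    coords = np.c_[(section[:, :2] - p1[:2]) @ u, section[:, 2]]
    return ConvexHull(coords).area  # for 2D hulls, .area is the perimeter
```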
APA, Harvard, Vancouver, ISO, and other styles
47

Lan, Hsu-Kai, and 藍旭凱. "MULTIVIEW SYNTHESIS FROM MONOCULAR IMAGE SEQUENCES." Thesis, 2011. http://ndltd.ncl.edu.tw/handle/76635093102080390696.

Full text
Abstract:
Master's thesis
Tatung University
Graduate Institute of Communication Engineering
99
Depth estimation plays an important role in computer vision and computer graphics, with applications such as robotics, 3D reconstruction and image refocusing. Conventional methods for estimating the depth of a scene have relied on multiple images. In this paper, we focus on the challenging problem of estimating depth from a single defocused image captured by an uncalibrated camera. Due to the limited depth of focus, the image of objects nearer or farther than the point of focus tends to be defocused and blurred. Theoretically, the degree of image blur caused by defocus is quantitatively related to the variation of local spatial frequency. In this work, we derive the depth information by measuring the local spatial frequency of textures and edges using multi-resolution wavelet analysis and Lipschitz regularity. The proposed depth recovery algorithm consists of three steps. First, we compute wavelet coefficients by applying the Haar wavelet transform on the local windows of each pixel in the image; by manipulating the coefficients, we obtain a fair initial depth estimate. Then we use mean-shift segmentation to divide the image into segments; by adjusting the size of the segmented blocks, we further refine the estimated depth. Finally, we compute Lipschitz exponents from the slope of the wavelet modulus maxima curves in the logarithmic domain. We find the foreground segments by combining the refined depth and the Lipschitz exponents of each segment; the depth of the foreground segments is then rectified by an optimization equation. We evaluate several defocused images, and simulation results demonstrate that our depth estimation algorithm is capable of producing accurate depth maps.
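A minimal sketch of the first step, measuring local high-frequency content with a single-level Haar transform, is given below: in-focus regions retain more detail-band energy than defocused ones, which is the cue the initial depth estimate builds on. Even image dimensions and the window size are assumptions; this is not the thesis's full multi-resolution scheme.

```python
# Single-level 2x2 Haar detail-band energy, averaged per local window,
# as a simple sharpness (and hence defocus/depth) cue.
import numpy as np

def haar_detail_energy(img, win=8):
    a = img[0::2, 0::2]; b = img[0::2, 1::2]
    c = img[1::2, 0::2]; d = img[1::2, 1::2]
    lh = (a - b + c - d) / 2.0          # horizontal detail
    hl = (a + b - c - d) / 2.0          # vertical detail
    hh = (a - b - c + d) / 2.0          # diagonal detail
    energy = lh**2 + hl**2 + hh**2
    # Average the detail energy over local windows: the initial depth cue.
    h, w = energy.shape
    h, w = h - h % win, w - w % win
    blocks = energy[:h, :w].reshape(h // win, win, w // win, win)
    return blocks.mean(axis=(1, 3))
```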
APA, Harvard, Vancouver, ISO, and other styles
48

Huang, Tzu-Kuei, and 黃子魁. "Computational Photography Applications on Multiview Images." Thesis, 2017. http://ndltd.ncl.edu.tw/handle/2v8hcg.

Full text
Abstract:
Doctoral dissertation
National Taiwan University
Graduate Institute of Computer Science and Information Engineering
105
Multi-view images provide more useful information than a single image if we can find the correspondences between the views, and these additional cues can improve computational photography applications. This thesis presents three applications on multi-view images. The first introduces a warping-based novel view synthesis framework for both binocular stereoscopic images and videos. Large-size autostereoscopic displays require multiple views, while most stereoscopic cameras and digital video recorders can only capture two. Obtaining accurate depth maps from two-view images or video is still difficult and time-consuming, yet popular novel view synthesis methods, such as depth-image-based rendering (DIBR), often rely heavily on them. The proposed framework requires neither depth maps nor user intervention. Dense and reliable features are extracted to find the correspondences between the two views. Images are then warped based on these correspondences to synthesize novel views while maintaining stereoscopic properties, preserving image structures, and keeping temporal coherence in video. Our method produces higher-quality multi-view images and video more efficiently, without tedious parameter tuning. This is useful for converting stereoscopic images and videos taken by binocular cameras into multi-view content ready to be displayed on autostereoscopic displays. 3D printing has become an important and prevalent tool, and image-based modeling is a popular way to acquire 3D models for further editing and printing. However, existing tools are often not robust enough for users to obtain the 3D models they want: the constructed models are often incomplete, disjoint, and noisy. We therefore propose a shape-from-silhouette system to reconstruct 3D models more robustly. The second part of this thesis introduces a robust automatic method for segmenting an object out of the background using a set of multi-view images. The segmentation is performed by minimizing an energy function which incorporates color statistics, spatial coherency, appearance proximity, epipolar constraints, and back-projection consistency of 3D feature points; it can be efficiently optimized using the min-cut algorithm. With the segmentation, the visual hull method is applied to reconstruct the 3D model of the object. The primary weakness of this approach, however, is its inability to reproduce concave regions. To fix this problem, we use the multi-view photo-consistency principle introduced in the third part of this thesis: voxels belonging to the object surface are identified, yielding a refined model with more detail. Experiments show that the proposed method generates better models than some popular systems.
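The visual hull step mentioned above admits a compact sketch: carve a voxel grid by projecting every voxel into each view and keeping only those that fall inside all silhouettes. The projection matrices P, silhouette masks sils, and voxel list are assumed inputs, and voxels are assumed to project in front of every camera; the photo-consistency refinement is not shown.

```python
# Visual hull by silhouette carving over a list of candidate voxels.
import numpy as np

def visual_hull(voxels, P, sils):
    """voxels: (N, 3) candidate centres; P: list of 3x4 projection
    matrices; sils: list of binary silhouette masks. Returns a keep-mask."""
    keep = np.ones(len(voxels), dtype=bool)
    hom = np.c_[voxels, np.ones(len(voxels))]           # homogeneous coords
    for Pi, sil in zip(P, sils):
        proj = hom @ Pi.T                               # project to view i
        uv = (proj[:, :2] / proj[:, 2:3]).round().astype(int)
        h, w = sil.shape
        inside = (uv[:, 0] >= 0) & (uv[:, 0] < w) & \
                 (uv[:, 1] >= 0) & (uv[:, 1] < h)
        hit = np.zeros(len(voxels), dtype=bool)
        hit[inside] = sil[uv[inside, 1], uv[inside, 0]]
        keep &= hit                           # must lie inside every silhouette
    return keep
```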
APA, Harvard, Vancouver, ISO, and other styles
49

Pooja, A. "A Multiview Extension Of The ICP Algorithm." Thesis, 2010. http://etd.iisc.ernet.in/handle/2005/1284.

Full text
Abstract:
The Iterative Closest Point (ICP) algorithm has been an extremely popular method for 3D point or surface registration. Given two point sets, it simultaneously solves for correspondences and estimates the motion between these two point sets. However, by only registering two such views at a time, ICP fails to exploit the redundant information available in multiple views that have overlapping regions. In this thesis, a multiview extension of the ICP algorithm is provided that simultaneously averages the redundant information available in the views with overlapping regions. Variants of this method that carry out such simultaneous registration in a causal manner and that utilize the transitivity property of point correspondences are also provided. The improved registration accuracy of these motion-averaged approaches in comparison with the conventional ICP method is established through extensive experiments. In addition, the motion-averaged approaches are compared with the existing multiview techniques of Bergevin et al. and Benjemaa et al. The results of the methods applied to the Happy Buddha and Stanford Bunny datasets of the Stanford 3D repository and to the Pooh and Bunny datasets of the Ohio (MSU/WSU) Range Image database are also presented.
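One building block of such motion averaging can be sketched as follows: several redundant estimates of the same rotation (e.g., a direct pairwise ICP estimate and compositions through other views) are combined by the chordal L2 mean, i.e., the arithmetic mean projected back onto SO(3) via an SVD. This is an illustrative fragment, not the thesis's full multiview scheme.

```python
# Chordal L2 averaging of redundant rotation estimates.
import numpy as np

def average_rotations(Rs):
    M = np.mean(Rs, axis=0)            # arithmetic mean, not a rotation
    U, _, Vt = np.linalg.svd(M)
    D = np.diag([1.0, 1.0, np.linalg.det(U @ Vt)])  # enforce det = +1
    return U @ D @ Vt                  # closest rotation in Frobenius norm

# e.g. noisy copies of the same relative rotation from overlapping pairs:
# R_avg = average_rotations([R_ij_direct, R_ik @ R_kj])
```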
APA, Harvard, Vancouver, ISO, and other styles
50

Yi-HsiangChiu and 邱怡翔. "GPU Implementation of Versatile Multiview DIBR Algorithms." Thesis, 2012. http://ndltd.ncl.edu.tw/handle/81139116823348649874.

Full text
Abstract:
Master's thesis
National Cheng Kung University
Institute of Computer and Communication Engineering
100
Naked-eye stereo displays offer a new kind of viewing experience: audiences can enjoy stereo images without stereo glasses. As little native 3D content is available, 2D-to-3D conversion is considered a solution, and how to render or synthesize 3D images from conventional 2D images with post-processing technologies or systems is a major topic. The multiview naked-eye stereo display is considered the next mainstream of stereo displays since it provides a less restricted and more comfortable viewing experience. This thesis targets a real-time multiview video rendering system: with the proposed Direct Single-image Multiview Rendering (DMSR) algorithm, based on the depth-image-based rendering (DIBR) algorithm, stereo images with intact multiview information can be rendered in real time from a one-view-plus-one-depth video sequence. The system can handle versatile multiview displays and be adopted by a variety of 3D applications.
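As a hedged illustration of the DIBR principle underlying such a renderer (the thesis's GPU implementation and DMSR specifics are not reproduced here), the sketch below forward-warps one view by a depth-derived disparity, once per synthesized viewpoint. The scale parameter standing in for baseline times focal length, and the disparity-like depth convention (larger value = nearer), are assumptions.

```python
# Basic DIBR forward warp from a one-view-plus-depth input; z-buffering
# resolves overlaps, hole filling is omitted.
import numpy as np

def dibr_warp(img, depth, scale):
    h, w = depth.shape
    out = np.zeros_like(img)
    zbuf = np.full((h, w), -np.inf)
    disp = (scale * depth).round().astype(int)  # nearer pixels shift more
    for y in range(h):
        for x in range(w):
            xt = x + disp[y, x]
            if 0 <= xt < w and depth[y, x] > zbuf[y, xt]:
                zbuf[y, xt] = depth[y, x]       # keep the closest pixel
                out[y, xt] = img[y, x]
    return out

# An N-view display is served by calling dibr_warp once per view offset:
# views = [dibr_warp(img, depth, s) for s in np.linspace(-k, k, N)]
```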
APA, Harvard, Vancouver, ISO, and other styles