Dissertations / Theses on the topic 'Video image analysis'

To see the other types of publications on this topic, follow the link: Video image analysis.

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the top 50 dissertations / theses for your research on the topic 'Video image analysis.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse dissertations / theses in a wide variety of disciplines and organise your bibliography correctly.

1

GIACHELLO, SILVIA. "Identità e memoria visuale: comunità, eventi, documentazione." Doctoral thesis, Politecnico di Torino, 2012. http://hdl.handle.net/11583/2540089.

Full text
Abstract:
The project investigates the potential of images, with particular reference to photography, to describe, document and interpret cultural events, along with their distinctive, functional language; the consequences of the proliferating spread of visual culture (grassroots media, social networks, etc.) for the interpretation of reality and the construction of memory; and the value attributed by the public and by qualified experts to the concept of the 'cultural event', through the survey, monitoring and documentation of cultural consumption, together with the identity-forming function of that concept. The research was structured by developing visual sociology techniques in an intercultural perspective, through a combined ad hoc methodology: photo-elicitation based on images I produced to document a case study (the Ganesh Festival in Pune, Maharashtra, India), and analysis of Visual Memos (visual self-documentation products). In short, the research has two aims: identifying the frames used in defining the cultural event and determining its capacity to condition personal and collective identity; and analysing the identity-forming and mnemonic function of the visual products that mediate and/or document the event, and their role in determining the event's permanence over time. The fieldwork was carried out in Pune, a city in Maharashtra (India), second only to Mumbai in size and in economic and cultural importance within the state. This opportunity was used to compare different cultural perspectives on both research objectives, as a cognitive basis for designing campaigns to communicate and promote cultural events in an intercultural (glocal culture) perspective.
APA, Harvard, Vancouver, ISO, and other styles
2

Dye, Brigham R. "Reliability of pre-service teachers' coding of teaching videos using a video-analysis tool /." Diss., CLICK HERE for online access, 2007. http://contentdm.lib.byu.edu/ETD/image/etd2020.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Kim, Tae-Kyun. "Discriminant analysis of patterns in images, image ensembles, and videos." Thesis, University of Cambridge, 2008. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.612084.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Sdiri, Bilel. "2D/3D Endoscopic image enhancement and analysis for video guided surgery." Thesis, Sorbonne Paris Cité, 2018. http://www.theses.fr/2018USPCD030.

Full text
Abstract:
Minimally invasive surgery has made remarkable progress in the last decades and has become a very popular diagnosis and treatment tool, especially with the rapid medical and technological advances leading to innovative new tools such as robotic surgical systems and wireless capsule endoscopy. Due to the intrinsic characteristics of the endoscopic environment, including dynamic illumination conditions and moist tissues with high reflectance, endoscopic images often suffer from several degradations, such as large dark regions with low contrast and sharpness, and many artifacts such as specular reflections and blur. These challenges, together with the introduction of three-dimensional (3D) imaging surgical systems, have raised the question of endoscopic image quality, which needs to be enhanced. The enhancement process aims either to provide the surgeons/doctors with better visual feedback or to improve the outcomes of subsequent tasks such as feature extraction for 3D organ reconstruction and registration. This thesis addresses the problem of endoscopic image quality enhancement by proposing novel enhancement techniques for both two-dimensional (2D) and stereo (i.e. 3D) endoscopic images. In the context of automatic tissue abnormality detection and classification for gastro-intestinal tract disease diagnosis, we proposed a pre-processing enhancement method for 2D endoscopic images and wireless capsule endoscopy that improves both local and global contrast. The proposed method exposes subtle inner structures and tissue details, which improves the feature detection process and the automatic classification rate of neoplastic, non-neoplastic and inflammatory tissues. Inspired by the binocular vision attention features of the human visual system, we proposed in another work an adaptive enhancement technique for stereo endoscopic images combining depth and edginess information.
The adaptability of the proposed method consists in adjusting the enhancement to both local image activity and the depth level within the scene, while controlling the inter-view difference using a binocular perception model. A subjective experiment was conducted to evaluate the performance of the proposed algorithm in terms of visual quality by both expert and non-expert observers, whose scores demonstrated the efficiency of our 3D contrast enhancement technique. In the same scope, in another recent stereo endoscopic image enhancement work we resort to the wavelet domain to target the enhancement towards specific image components, using the multiscale representation and the efficient space-frequency localization property. The proposed joint enhancement methods rely on cross-view processing and depth information, for both the wavelet decomposition and the enhancement steps, to exploit the inter-view redundancies together with perceptual human visual system properties related to contrast sensitivity and binocular combination and rivalry. The visual quality of the processed images and objective assessment metrics demonstrate the efficiency of our joint stereo enhancement in adjusting the image illumination in both dark and saturated regions and emphasizing local image details such as fine veins and micro vessels, compared to other endoscopic enhancement techniques for 2D and 3D images.
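The combined local/global contrast idea in this abstract can be illustrated with a toy sketch: a crude global min-max stretch followed by a 3x3 local boost. This is plain Python with invented function names (`stretch`, `local_boost`), not Sdiri's actual enhancement method.

```python
def stretch(img):
    """Global contrast stretch of a grayscale grid to the full [0, 255] range."""
    lo = min(min(row) for row in img)
    hi = max(max(row) for row in img)
    if hi == lo:
        return [[0 for _ in row] for row in img]
    return [[(v - lo) * 255 // (hi - lo) for v in row] for row in img]

def local_boost(img, gain=2):
    """Amplify each pixel's deviation from its 3x3 neighbourhood mean,
    clipping the result back into [0, 255]."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            nb = [img[ny][nx]
                  for ny in range(max(0, y - 1), min(h, y + 2))
                  for nx in range(max(0, x - 1), min(w, x + 2))]
            mean = sum(nb) / len(nb)
            out[y][x] = max(0, min(255, round(mean + gain * (img[y][x] - mean))))
    return out
```

A real endoscopic pipeline would of course use perceptually motivated, depth-aware weighting rather than a fixed gain.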
APA, Harvard, Vancouver, ISO, and other styles
5

Li, Dong. "Thermal image analysis using calibrated video imaging." Diss., Columbia, Mo. : University of Missouri-Columbia, 2006. http://hdl.handle.net/10355/4455.

Full text
Abstract:
Thesis (Ph.D.)--University of Missouri-Columbia, 2006.
The entire dissertation/thesis text is included in the research.pdf file; the official abstract appears in the short.pdf file (which also appears in the research.pdf); a non-technical general description, or public abstract, appears in the public.pdf file. Title from title screen of research.pdf file (viewed on April 23, 2009) Includes bibliographical references.
APA, Harvard, Vancouver, ISO, and other styles
6

Eastwood, Brian S. Taylor Russell M. "Multiple layer image analysis for video microscopy." Chapel Hill, N.C. : University of North Carolina at Chapel Hill, 2009. http://dc.lib.unc.edu/u?/etd,2813.

Full text
Abstract:
Thesis (Ph. D.)--University of North Carolina at Chapel Hill, 2009.
Title from electronic title page (viewed Mar. 10, 2010). "... in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Department of Computer Science." Discipline: Computer Science; Department/School: Computer Science.
APA, Harvard, Vancouver, ISO, and other styles
7

Sheikh, Faridul Hasan. "Analysis of 3D color matches for the creation and consumption of video content." Thesis, Saint-Etienne, 2014. http://www.theses.fr/2014STET4001.

Full text
Abstract:
The objective of this thesis is to propose a solution to the problem of color consistency between images originating from the same scene, irrespective of acquisition conditions. We therefore present a new color mapping framework that is able to compensate color differences and achieve color consistency between views of the same scene. Our proposed framework works in two phases. In the first phase, we propose a new method that can robustly collect color correspondences from the neighborhood of sparse feature correspondences, despite the low accuracy of the feature correspondences. In the second phase, from these color correspondences, we introduce a new two-step robust estimation of the color mapping model: first, a nonlinear channel-wise estimation; second, a linear cross-channel estimation. For experimental assessment, we propose two new image datasets: one with ground truth for quantitative assessment, and another without ground truth for qualitative assessment. We have conducted a series of experiments to investigate the robustness of our proposed framework and to compare it with the state of the art. We have also provided a brief overview, sample results, and future perspectives for various applications of color mapping. In our experimental results, we have demonstrated that, unlike many state-of-the-art methods, our proposed color mapping is robust to changes of illumination spectrum, illumination intensity, imaging devices (sensor, optics), imaging device settings (exposure, white balance), and viewing conditions (viewing angle, viewing distance).
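The per-channel fitting step of a color mapping model can be caricatured in a few lines. Here a simple linear least-squares fit stands in for the thesis's nonlinear channel-wise estimation; `fit_channel` and `map_color` are invented names for illustration only.

```python
def fit_channel(src, dst):
    """Least-squares fit dst ~ a*src + b for one colour channel, from
    lists of corresponding channel values collected across two views."""
    n = len(src)
    mx = sum(src) / n
    my = sum(dst) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(src, dst))
    var = sum((x - mx) ** 2 for x in src)
    a = cov / var if var else 1.0
    return a, my - a * mx

def map_color(pixel, models):
    """Apply fitted (a, b) models channel-wise to an (R, G, B) pixel."""
    return tuple(a * c + b for c, (a, b) in zip(pixel, models))
```

A faithful implementation would add the robust (RANSAC-style) correspondence filtering and the linear cross-channel step described in the abstract.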
APA, Harvard, Vancouver, ISO, and other styles
8

Lee, Sangkeun. "Video analysis and abstraction in the compressed domain." Diss., Available online, Georgia Institute of Technology, 2004:, 2003. http://etd.gatech.edu/theses/available/etd-04072004-180041/unrestricted/lee%5fsangkeun%5f200312%5fphd.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Guo, Y. (Yimo). "Image and video analysis by local descriptors and deformable image registration." Doctoral thesis, Oulun yliopisto, 2013. http://urn.fi/urn:isbn:9789526201412.

Full text
Abstract:
Image description plays an important role in representing the inherent properties of entities and scenes in static images. Within the last few decades, it has become a fundamental issue in many practical vision tasks, such as texture classification, face recognition, material categorization, and medical image processing. The study of static image analysis can also be extended to video analysis, such as dynamic texture recognition, classification and synthesis. This thesis contributes to the research and development of image and video analysis in two respects. In the first part of this work, two image description methods are presented to provide discriminative representations for image classification. They are designed in an unsupervised (i.e., class labels of texture images are not available) and a supervised (i.e., class labels are available) manner, respectively. First, a supervised model is developed to learn discriminative local patterns, which formulates image description as an integrated three-layered model that estimates an optimal subset of patterns of interest by simultaneously considering the robustness, discriminative power and representation capability of features. Second, for the case where class labels of training images are unavailable, a linear configuration model is presented to describe microscopic image structures in an unsupervised manner, which is subsequently combined with a local descriptor: the local binary pattern (LBP). This description is theoretically verified to be rotation invariant and is able to provide a discriminative complement to conventional LBPs. In the second part of the thesis, based on static image description and deformable image registration, video analysis is studied for applications in dynamic texture description, synthesis and recognition.
First, a dynamic texture synthesis model is proposed to create a continuous and infinitely varying stream of images from a finite input video, stitching video clips in the time domain by selecting properly matching frames and organizing them into a logical order. Second, a method for facial expression recognition is proposed, which formulates the dynamic facial expression recognition problem as the construction of longitudinal atlases and a groupwise image registration problem.
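Since this abstract builds on the local binary pattern (LBP) descriptor, a minimal sketch of the basic 8-neighbour LBP code may help. This is the standard textbook LBP, not the thesis's learned variant; `lbp_code` is an invented name.

```python
def lbp_code(patch):
    """Basic 8-neighbour LBP code for a 3x3 grayscale patch: threshold
    each neighbour at the centre value and read the bits as a byte."""
    c = patch[1][1]
    # neighbours clockwise from the top-left corner
    nbrs = [patch[0][0], patch[0][1], patch[0][2], patch[1][2],
            patch[2][2], patch[2][1], patch[2][0], patch[1][0]]
    return sum((1 << i) for i, v in enumerate(nbrs) if v >= c)
```

A texture descriptor is then typically the histogram of these codes over the whole image, which is what variants like rotation-invariant LBP refine.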
APA, Harvard, Vancouver, ISO, and other styles
10

Stobaugh, John David. "Novel use of video and image analysis in a video compression system." Thesis, University of Iowa, 2015. https://ir.uiowa.edu/etd/1766.

Full text
Abstract:
As consumer demand for higher-quality video at lower bit-rates increases, so does the need for more sophisticated methods of compressing videos into manageable file sizes. This research attempts to address these concerns while still maintaining reasonable encoding times. Modern segmentation and grouping analysis are used together with code vectorization techniques and other optimization paradigms to improve quality and performance within the next-generation coding standard, High Efficiency Video Coding. On average, this research achieved a 50% decrease in encoder run-time with only marginal decreases in perceived quality.
APA, Harvard, Vancouver, ISO, and other styles
11

Forsthoefel, Dana. "Leap segmentation in mobile image and video analysis." Diss., Georgia Institute of Technology, 2013. http://hdl.handle.net/1853/50285.

Full text
Abstract:
As demand for real-time image processing increases, the need to improve the efficiency of image processing systems is growing. The process of image segmentation is often used in preprocessing stages of computer vision systems to reduce image data and increase processing efficiency. This dissertation introduces a novel image segmentation approach known as leap segmentation, which applies a flexible definition of adjacency to allow groupings of pixels into segments which need not be spatially contiguous and thus can more accurately correspond to large surfaces in the scene. Experiments show that leap segmentation correctly preserves an average of 20% more original scene pixels than traditional approaches, while using the same number of segments, and significantly improves execution performance (executing 10x - 15x faster than leading approaches). Further, leap segmentation is shown to improve the efficiency of a high-level vision application for scene layout analysis within 3D scene reconstruction. The benefits of applying image segmentation in preprocessing are not limited to single-frame image processing. Segmentation is also often applied in the preprocessing stages of video analysis applications. In the second contribution of this dissertation, the fast, single-frame leap segmentation approach is extended into the temporal domain to develop a highly-efficient method for multiple-frame segmentation, called video leap segmentation. This approach is evaluated for use on mobile platforms where processing speed is critical using moving-camera traffic sequences captured on busy, multi-lane highways. Video leap segmentation accurately tracks segments across temporal bounds, maintaining temporal coherence between the input sequence frames. It is shown that video leap segmentation can be applied with high accuracy to the task of salient segment transformation detection for alerting drivers to important scene changes that may affect future steering decisions. 
Finally, while research efforts in the field of image segmentation have often recognized the need for efficient implementations for real-time processing, many of today's leading image segmentation approaches exhibit processing times that exceed their camera frame periods, making them infeasible for use in real-time applications. The third research contribution of this dissertation focuses on developing fast implementations of the single-frame leap segmentation approach for use on both single-core and multi-core platforms, as well as on both high-performance and resource-constrained systems. While the design of leap segmentation lends itself to efficient implementations, the efficiency achieved by this algorithm, as with any algorithm, can be improved with careful implementation optimizations. The leap segmentation approach is analyzed in detail, and highly optimized implementations of the approach are presented with in-depth studies ranging from storage considerations to realizing parallel processing potential. The final implementations of leap segmentation for both serial and parallel platforms are shown to achieve real-time frame rates even when processing very high resolution input images. Leap segmentation's accuracy and speed make it a highly competitive alternative to today's leading segmentation approaches for modern, real-time computer vision systems.
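The "flexible adjacency" idea, grouping pixels by appearance so that a segment need not be spatially contiguous, can be sketched as a toy greedy assignment. The `leap_segments` helper below is invented for illustration and is not Forsthoefel's actual algorithm.

```python
def leap_segments(img, tol=16):
    """Assign each grayscale pixel to the first segment whose reference
    value lies within tol; pixels far apart in the image can share a
    segment, so segments need not be spatially contiguous."""
    refs, labels = [], []
    for row in img:
        label_row = []
        for v in row:
            for i, r in enumerate(refs):
                if abs(v - r) <= tol:
                    label_row.append(i)
                    break
            else:
                # no existing segment matches: open a new one
                refs.append(v)
                label_row.append(len(refs) - 1)
        labels.append(label_row)
    return labels, refs
```

Note how, in the usage below, the bright pixels in different rows land in the same segment even though they never touch, which is the property the dissertation exploits to cover large scene surfaces with few segments.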
APA, Harvard, Vancouver, ISO, and other styles
12

Acosta, Jesus-Adolfo. "Pavement surface distress evaluation using video image analysis." Case Western Reserve University School of Graduate Studies / OhioLINK, 1994. http://rave.ohiolink.edu/etdc/view?acc_num=case1057760579.

Full text
APA, Harvard, Vancouver, ISO, and other styles
13

McEuen, Matt. "Expert object recognition in video /." Link to online version, 2005. https://ritdml.rit.edu/dspace/handle/1850/1168.

Full text
APA, Harvard, Vancouver, ISO, and other styles
14

Todd, Douglas Wallace, and Douglas Wallace Todd. "Zebrafish Video Analysis System for High-Throughput Drug Assay." Thesis, The University of Arizona, 2016. http://hdl.handle.net/10150/623150.

Full text
Abstract:
Zebrafish swimming behavior is used in a new, automated drug assay system as a biomarker to measure drug efficacy in preventing or reversing hearing loss. The system records video of zebrafish larvae under infrared lighting using Raspberry Pi cameras and measures fish swimming behavior. This automated system significantly reduces the operator time required to process experiments in parallel. Multiple tanks, each hosting sixteen experiments, are operated in parallel. Once a set of experiments starts, all data transfer and processing operations are automatic. A web interface allows the operator to configure, monitor and control the experiments and review reports. Ethernet connects the various hardware components, allowing loose coupling of the distributed software used to schedule and run the experiments. The operator can configure the data processing to be done on the local computer or offloaded to a high-performance computing cluster to achieve even higher throughput. Computationally efficient image processing algorithms provide automated zebrafish detection and motion analysis. Quantitative assessment of error in the position and orientation of the detected fish uses manual analysis by human observers as the reference. The system's error in orientation and position is comparable to human inter-operator error.
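A minimal stand-in for the detection step might look like the following frame-differencing sketch (invented `detect_motion` helper operating on grayscale frames; the thesis system is far more elaborate, with orientation estimation and tracking):

```python
def detect_motion(prev, curr, thresh=30):
    """Difference two grayscale frames (lists of rows) and return the
    centroid (x, y) of pixels that changed by more than thresh, or None
    when nothing moved."""
    pts = [(x, y)
           for y, (prow, crow) in enumerate(zip(prev, curr))
           for x, (p, c) in enumerate(zip(prow, crow))
           if abs(c - p) > thresh]
    if not pts:
        return None
    n = len(pts)
    return (sum(x for x, _ in pts) / n, sum(y for _, y in pts) / n)
```

Under infrared lighting with a static tank, frame differencing of this kind is a common first stage before fitting the fish's position and heading.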
APA, Harvard, Vancouver, ISO, and other styles
15

Wright, Geoffrey A. "How does video analysis impact teacher reflection-for-action? /." Diss., CLICK HERE for online access, 2008. http://contentdm.lib.byu.edu/ETD/image/etd2347.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
16

Thomson, Malcolm S. "Real-time image processing for traffic analysis." Thesis, Edinburgh Napier University, 1995. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.260986.

Full text
APA, Harvard, Vancouver, ISO, and other styles
17

Dickinson, Keith William. "Traffic data capture and analysis using video image processing." Thesis, University of Sheffield, 1986. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.306374.

Full text
APA, Harvard, Vancouver, ISO, and other styles
18

Liu, Gaowen. "Learning with Shared Information for Image and Video Analysis." Doctoral thesis, Università degli studi di Trento, 2017. https://hdl.handle.net/11572/368806.

Full text
Abstract:
Image and video recognition is a fundamental and challenging problem in computer vision, one that has progressed tremendously fast in recent years. In the real world, a realistic setting for image or video recognition is that some classes contain plenty of training data while many classes contain only a small amount. How to use the frequent classes to help learn the rare classes is therefore an open question. Learning with shared information is an emerging topic that can address this problem. Different components can be shared during concept modeling and the machine learning procedure, such as generic object parts, attributes, transformations, regularization parameters and training examples. For example, representations based on attributes define a finite vocabulary that is common to all categories, with each category using a subset of the attributes; sharing some common attributes across multiple classes thus benefits the final recognition system. In this thesis, we investigate some challenging image and video recognition problems under the framework of learning with shared information. The research comprises two parts. The first part focuses on two-domain (source and target) problems, where the emphasis is on boosting recognition performance on the target domain by utilizing useful knowledge from the source domain. The second part focuses on multi-domain problems, where all domains are considered equally important, and the aim is to improve performance for all domains by exploring useful information across them. In particular, we investigate three topics towards this goal: active domain adaptation, multi-task learning, and dictionary learning.
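The attribute-sharing example in the abstract can be made concrete with a toy classifier. The `ATTRS` vocabulary and `classify` function below are hypothetical, chosen only to illustrate a shared attribute vocabulary; they are not the thesis's method.

```python
# A shared attribute vocabulary: each class uses a subset of the
# same finite set of attributes (hypothetical example classes).
ATTRS = {"cat": {"furry", "four_legs", "tail"},
         "bird": {"wings", "tail", "beak"}}

def classify(observed):
    """Pick the class whose attribute subset best matches the observed
    attributes, scored by Jaccard overlap."""
    def score(cls):
        s = ATTRS[cls]
        return len(s & observed) / len(s | observed)
    return max(ATTRS, key=score)
```

Because "tail" is shared, evidence for it contributes to both classes at once, which is the sense in which rare classes can borrow statistical strength from frequent ones.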
APA, Harvard, Vancouver, ISO, and other styles
19

Liu, Gaowen. "Learning with Shared Information for Image and Video Analysis." Doctoral thesis, University of Trento, 2017. http://eprints-phd.biblio.unitn.it/2011/1/PhD-Thesis.pdf.

Full text
Abstract:
Image and video recognition is a fundamental and challenging problem in computer vision, one that has progressed tremendously fast in recent years. In the real world, a realistic setting for image or video recognition is that some classes contain plenty of training data while many classes contain only a small amount. How to use the frequent classes to help learn the rare classes is therefore an open question. Learning with shared information is an emerging topic that can address this problem. Different components can be shared during concept modeling and the machine learning procedure, such as generic object parts, attributes, transformations, regularization parameters and training examples. For example, representations based on attributes define a finite vocabulary that is common to all categories, with each category using a subset of the attributes; sharing some common attributes across multiple classes thus benefits the final recognition system. In this thesis, we investigate some challenging image and video recognition problems under the framework of learning with shared information. The research comprises two parts. The first part focuses on two-domain (source and target) problems, where the emphasis is on boosting recognition performance on the target domain by utilizing useful knowledge from the source domain. The second part focuses on multi-domain problems, where all domains are considered equally important, and the aim is to improve performance for all domains by exploring useful information across them. In particular, we investigate three topics towards this goal: active domain adaptation, multi-task learning, and dictionary learning.
APA, Harvard, Vancouver, ISO, and other styles
20

Hocking, Laird Robert. "Shell-based geometric image and video inpainting." Thesis, University of Cambridge, 2018. https://www.repository.cam.ac.uk/handle/1810/281805.

Full text
Abstract:
The subject of this thesis is a class of fast inpainting methods (image or video) based on the idea of filling the inpainting domain in successive shells from its boundary inwards. Image pixels (or video voxels) are filled by assigning them a color equal to a weighted average of either their already filled neighbors (the "direct" form of the method) or those neighbors plus additional neighbors within the current shell (the "semi-implicit" form). In the direct form, pixels (voxels) in the current shell may be filled independently, but in the semi-implicit form they are filled simultaneously by solving a linear system. We focus in this thesis mainly on the image inpainting case, where the literature contains several methods corresponding to the direct form of the method; the semi-implicit form is introduced for the first time here. These methods effectively differ only in the order in which pixels (voxels) are filled, the weights used for averaging, and the neighborhood that is averaged over. All of them are very fast, but at the same time all of them leave undesirable artifacts such as "kinking" (bending) or blurring of extrapolated isophotes. This thesis has two main goals. First, we introduce new algorithms within this class, aimed at reducing or eliminating these artifacts and targeting a specific application: the 3D conversion of images and film. The first part of this thesis is concerned with introducing 3D conversion as well as Guidefill, a method in the above class adapted to the inpainting problems arising in 3D conversion. However, the second and more significant goal of this thesis is to study these algorithms as a class. In particular, we develop a mathematical theory aimed at understanding the origins of the artifacts mentioned. Through this, we seek to understand which artifacts can be eliminated (and how), and which artifacts are inevitable (and why). Most of the thesis is occupied with this second goal.
Our theory is based on two separate limits. The first is a continuum limit, in which the pixel width h → 0 and the algorithm converges to a partial differential equation. The second is an asymptotic limit in which h is very small but non-zero. This latter limit, which is based on a connection to random walks, relates the inpainted solution to a type of discrete convolution. The former is useful for studying kinking artifacts, while the latter is useful for studying blur. Although all the theoretical work has been done in the context of image inpainting, experimental evidence is presented suggesting a simple generalization to video. Finally, in the last part of the thesis we explore shell-based video inpainting. In particular, we introduce spacetime transport, a natural generalization of the ideas of Guidefill and its predecessor, coherence transport, to three dimensions (two spatial dimensions plus one time dimension). Spacetime transport is shown to have much in common with shell-based image inpainting methods. In particular, kinking and blur artifacts persist, and the former may be alleviated in exactly the same way as in two dimensions. At the same time, spacetime transport is shown to be related to optical-flow-based video inpainting. In particular, a connection is derived between spacetime transport and a generalized Lucas-Kanade optical flow that does not distinguish between time and space.
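The direct-form shell fill described above can be sketched in a few lines. This is an illustrative reconstruction, not the author's code: unknown pixels are filled shell by shell with the plain average of their already-filled 8-neighbours, whereas real methods such as coherence transport or Guidefill use direction-dependent weights (precisely where the kinking artifacts discussed above originate).

```python
def shell_inpaint(img, mask):
    """Direct-form shell inpainting sketch: fill masked pixels in successive
    shells from the hole boundary inwards. Each shell pixel receives the
    unweighted average of its already-filled 8-neighbours, and pixels within
    a shell are filled independently (the 'direct' form).
    img: 2-D list of floats (grayscale); mask: 2-D list of bools, True = missing."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    filled = [[not m for m in row] for row in mask]
    nbrs = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]

    def shell():
        # current shell: missing pixels with at least one filled neighbour
        return [(y, x) for y in range(h) for x in range(w)
                if not filled[y][x] and any(
                    0 <= y + dy < h and 0 <= x + dx < w and filled[y + dy][x + dx]
                    for dy, dx in nbrs)]

    cur = shell()
    while cur:
        vals = {}
        for y, x in cur:
            acc, n = 0.0, 0
            for dy, dx in nbrs:
                yy, xx = y + dy, x + dx
                if 0 <= yy < h and 0 <= xx < w and filled[yy][xx]:
                    acc += out[yy][xx]
                    n += 1
            vals[(y, x)] = acc / n  # n >= 1 by shell construction
        for (y, x), v in vals.items():
            out[y][x] = v
            filled[y][x] = True
        cur = shell()
    return out
```

On a constant image this fill recovers the constant exactly; the artifacts the thesis studies appear only once isophotes must be extrapolated.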
APA, Harvard, Vancouver, ISO, and other styles
21

Hampson, Robert W. "Video-based nearshore depth inversion using WDM method." Access to citation, abstract and download form provided by ProQuest Information and Learning Company; downloadable PDF file, 129 p, 2009. http://proquest.umi.com/pqdweb?did=1650507521&sid=2&Fmt=2&clientId=8331&RQT=309&VName=PQD.

Full text
APA, Harvard, Vancouver, ISO, and other styles
22

Salehi, Doolabi Saeed. "Cubic-Panorama Image Dataset Analysis for Storage and Transmission." Thèse, Université d'Ottawa / University of Ottawa, 2013. http://hdl.handle.net/10393/24053.

Full text
Abstract:
This thesis involves systems for virtual presence in remote locations, a field referred to as telepresence. Recent image-based representations such as Google Maps' Street View provide a familiar example. Several areas of research are open: such image-based representations are huge in size, so efficient compression of the data for storage is essential. On the other hand, users are usually located in remote areas, so efficient transmission of the visual information is another issue of great importance. In this work, real-world images are used in preference to computer graphics representations, mainly for the photorealism they provide and to avoid the high computational cost of simulating large-scale environments. The cubic format is selected for panoramas in this thesis. A major feature of the cubic-panoramic image datasets captured in this work is the assumption of static scenes, and the major issues for the system are compression efficiency and random access for storage, as well as computational complexity for transmission upon remote users' requests. First, in order to enable smooth navigation across different viewpoints, a method for aligning cubic-panorama image datasets using the geometry of the scene is proposed and tested. Feature detection and camera calibration are incorporated, and unlike the existing method, which is limited to a pair of panoramas, our approach is applicable to datasets with a large number of panoramic images, with no need for extra numerical estimation. Second, the problem of cubic-panorama image dataset compression is addressed in a number of ways. Two state-of-the-art approaches, namely the standardized scheme of H.264 and a wavelet-based codec named Dirac, are used and compared for the application of virtual navigation in image-based representations of real-world environments.
Different frame prediction structures and group-of-pictures lengths are investigated and compared for this new type of visual data. At this stage, based on the obtained results, an efficient prediction structure and bitstream syntax are proposed that exploit features of the data and satisfy the major requirements of the system. Third, we propose novel methods to address the important issue of disparity estimation. A client-server scheme is assumed, in which a remote user seeks information at each navigation step. For the compression stage, a fast method that uses our previous work on the geometry of the scene, together with the proposed prediction structure and the cubic format of panoramas, estimates disparity vectors efficiently. For the transmission stage, a new transcoding scheme is introduced and a number of different frame-format conversion scenarios are addressed towards the goal of free navigation. Different types of navigation scenarios are covered, including forward and backward navigation as well as user pan, tilt, and zoom. In all the aforementioned cases, results are compared both visually, through error images and videos, and using objective measures. Altogether, our work facilitates free navigation within the captured panoramic image datasets and can be incorporated into state-of-the-art emerging cubic-panorama image dataset compression/transmission schemes.
APA, Harvard, Vancouver, ISO, and other styles
23

Fletcher, M. J. "A modular system for video based motion analysis." Thesis, University of Reading, 1991. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.293144.

Full text
APA, Harvard, Vancouver, ISO, and other styles
24

Massaro, James. "A PCA based method for image and video pose sequencing /." Online version of thesis, 2010. http://hdl.handle.net/1850/11991.

Full text
APA, Harvard, Vancouver, ISO, and other styles
25

Jain, Raja P. "Extraction and interaction analysis of foreground objects in panning video /." Link to online version, 2006. https://ritdml.rit.edu/dspace/handle/1850/1879.

Full text
APA, Harvard, Vancouver, ISO, and other styles
26

Bothmann, Ludwig [Verfasser]. "Efficient statistical analysis of video and image data / Ludwig Bothmann." München : Verlag Dr. Hut, 2017. http://d-nb.info/1135594317/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
27

Howard, Elizabeth Helen Civil & Environmental Engineering Faculty of Engineering UNSW. "A laboratory study of the 'shoreline' detected in video imagery." Publisher: University of New South Wales. Civil & Environmental Engineering, 2008. http://handle.unsw.edu.au/1959.4/41497.

Full text
Abstract:
A controlled laboratory experiment was undertaken to simulate varying swash zone characteristics and sensor-target geometry found in digital images collected by ARGUS coastal imaging systems. Using a hyperspectral sensor, reflectance data were integrated over the respective red, blue and green wavelengths corresponding to a standard ARGUS video imaging sensor. The dominant swash zone parameters affecting shoreline detection were found to be the presence or absence of surface foam, site-specific sediment characteristics (especially colour), and water depth. Winter versus summer solar elevation and the sensor zenith were also found to affect the cross-shore location of the detected waterline. With this new information, site- and time-specific corrections can be applied to coastal digital imagery, to improve the confidence of shoreline detection.
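As a toy illustration of the colour-based waterline detection studied in this kind of work (a hypothetical simplification, not the ARGUS processing chain): dry sand pixels tend to be relatively red, while water pixels are relatively blue, so a crude detector can scan a cross-shore intensity profile for the point where the red-minus-blue difference changes sign.

```python
def detect_waterline(red, blue, thresh=0.0):
    """Hypothetical colour-channel shoreline detector: given cross-shore
    profiles of red and blue channel intensities (land first, sea last),
    return the first index where the red-minus-blue difference falls below
    `thresh`, i.e. the estimated land/water transition.
    Returns None if no such transition exists."""
    for i, (r, b) in enumerate(zip(red, blue)):
        if r - b < thresh:
            return i
    return None
```

The thesis's findings suggest why such a simple rule is fragile in practice: foam, sediment colour, water depth, and solar elevation all shift the cross-shore location of the detected transition.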
APA, Harvard, Vancouver, ISO, and other styles
28

Baradel, Fabien. "Structured deep learning for video analysis." Thesis, Lyon, 2020. http://www.theses.fr/2020LYSEI045.

Full text
Abstract:
With the massive increase of video content on the Internet and beyond, automatic understanding of visual content could impact many different application fields such as robotics, health care, content search, and filtering. The goal of this thesis is to provide methodological contributions in Computer Vision and Machine Learning for automatic content understanding from videos. We focus on two problems: fine-grained human action recognition and visual reasoning from object-level interactions. In the first part of this manuscript, we tackle the problem of fine-grained human action recognition. We introduce two different attention mechanisms trained on the visual content and conditioned on articulated human pose. The first method is able to automatically draw attention to important pre-selected points of the video, conditioned on learned features extracted from the articulated human pose. We show that such a mechanism improves performance on the final task and provides a good way to visualize the most discriminative parts of the visual content. The second method goes beyond pose-based human action recognition: we develop a method able to automatically identify unstructured feature clouds of interest in the video using contextual information. Furthermore, we introduce a learned distributed system for aggregating the features in a recurrent manner and taking decisions in a distributed way. We demonstrate that we can achieve better performance than previously obtained, without using articulated pose information at test time. In the second part of this thesis, we investigate video representations from an object-level perspective. Given a set of detected persons and objects in the scene, we develop a method which learns to infer the important object interactions through space and time using video-level annotation only. This allows important objects and object interactions to be identified for a given action, as well as potential dataset bias.
Finally, in a third part, we go beyond the task of classification and supervised learning from visual content by tackling causality in interactions, in particular the problem of counterfactual learning. We introduce a new benchmark, namely CoPhy, where, after watching a video, the task is to predict the outcome after modifying the initial stage of the video. We develop a method based on object-level interactions able to infer object properties without supervision, as well as future object locations after the intervention.
APA, Harvard, Vancouver, ISO, and other styles
29

Cutolo, Alfredo. "Image partition and video segmentation using the Mumford-Shah functional." Doctoral thesis, Universita degli studi di Salerno, 2012. http://hdl.handle.net/10556/280.

Full text
Abstract:
2010 - 2011
The aim of this thesis is to present an image partition and video segmentation procedure based on the minimization of a modified version of the Mumford-Shah functional. The Mumford-Shah functional used for image partition has then been extended to develop a video segmentation procedure. Unlike image processing, in video analysis, besides the usual spatial connectivity of pixels (or regions) on each single frame, we have a natural notion of "temporal" connectivity between pixels (or regions) on consecutive frames, given by the optical flow. In this case, it makes sense to extend the tree data structure used to model a single image to a graph data structure that can handle a video sequence. The video segmentation procedure is based on the minimization of a modified version of the Mumford-Shah functional. In particular, the functional used for image partition allows merging neighboring regions with similar color without considering their movement; our idea has been to merge neighboring regions with similar color and similar optical flow vector. Also in this case, the minimization of the Mumford-Shah functional can be very complex if we consider each possible combination of the graph nodes. The computation becomes tractable if we take into account a hierarchy of partitions constructed starting from the nodes of the graph. [edited by author]
X n.s.
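The colour-plus-flow merging criterion described in this abstract can be illustrated with a small greedy sketch (our own simplification, using scalar colour and flow values and a union-find structure rather than the author's hierarchy of partitions over the region graph):

```python
def merge_regions(regions, edges, tau, beta=1.0):
    """Greedy region merging in the spirit of the modified Mumford-Shah
    criterion: adjacent regions are merged while their colour distance plus
    a weighted optical-flow distance stays below a threshold tau.
    regions: {id: (mean_color, mean_flow, size)}; edges: set of (id, id)
    adjacency pairs. Returns each id mapped to its merged region's stats."""
    parent = {r: r for r in regions}

    def find(r):
        while parent[r] != r:
            parent[r] = parent[parent[r]]  # path compression
            r = parent[r]
        return r

    merged = True
    while merged:
        merged = False
        for a, b in sorted(edges):
            ra, rb = find(a), find(b)
            if ra == rb:
                continue
            (ca, fa, sa), (cb, fb, sb) = regions[ra], regions[rb]
            cost = abs(ca - cb) + beta * abs(fa - fb)
            if cost < tau:
                s = sa + sb  # size-weighted means of the merged region
                regions[ra] = ((ca * sa + cb * sb) / s, (fa * sa + fb * sb) / s, s)
                parent[rb] = ra
                merged = True
    return {r: regions[find(r)] for r in regions}
```

Setting beta = 0 recovers pure colour-based merging as used for single-image partition; beta > 0 keeps regions apart when their motion differs even if their colours are close.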
APA, Harvard, Vancouver, ISO, and other styles
30

Emmot, Sebastian. "Characterizing Video Compression Using Convolutional Neural Networks." Thesis, Luleå tekniska universitet, Datavetenskap, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:ltu:diva-79430.

Full text
Abstract:
Can the compression parameters used in video encoding be estimated given only the visual information of the resulting compressed video? If so, these parameters could potentially improve existing parametric video quality estimation models. Today, parametric models use information like bitrate to estimate the quality of a given video. This method is inaccurate since it does not consider the coding complexity of a video. The constant rate factor (CRF) parameter for H.264 encoding aims to keep quality constant while varying the bitrate; if the CRF for a video is known together with the bitrate, a better quality estimate could potentially be achieved. In recent years, artificial neural networks, and specifically convolutional neural networks, have shown great promise in the field of image processing. In this thesis, convolutional neural networks are investigated as a way of estimating the constant rate factor for a degraded video by identifying the compression artifacts and their relation to the CRF used. With the use of ResNet, a model for estimating the CRF of each frame of a video is derived; these per-frame predictions are then used in a video classification model which makes an overall CRF prediction for the video. The results show that it is possible to find a relation between the visual encoding artifacts and the CRF used. The top-5 accuracy achieved by the model is 61.9% with limited training data. Given that today's parametric bitrate-based quality models have no information about coding complexity, even a rough estimate of the CRF could improve their precision.
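The two-stage structure described above, a per-frame classifier followed by a video-level aggregation, can be sketched as follows (an illustrative simplification: we assume a hypothetical per-frame classifier, e.g. a ResNet, has already produced a probability vector over candidate CRF values for each frame, and aggregate by simple averaging rather than the learned video model of the thesis):

```python
def video_crf_prediction(frame_probs, k=5):
    """Aggregate per-frame CRF class probabilities into a video-level
    prediction: average the per-frame distributions and return the k most
    likely CRF classes (cf. the top-5 accuracy reported above).
    frame_probs: list of equal-length probability vectors, one per frame."""
    n = len(frame_probs)
    dims = len(frame_probs[0])
    avg = [sum(p[i] for p in frame_probs) / n for i in range(dims)]
    ranked = sorted(range(dims), key=lambda i: avg[i], reverse=True)
    return ranked[:k]
```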
APA, Harvard, Vancouver, ISO, and other styles
31

Vercillo, Richard 1953. "Very high resolution video display memory and base image memory for a radiologic image analysis console." Thesis, The University of Arizona, 1988. http://hdl.handle.net/10150/276707.

Full text
Abstract:
Digital radiographic images are created by a variety of diagnostic imaging modalities. A multi-modality workstation, known as the Arizona Viewing Console (AVC), was designed and built by the University of Arizona Radiology Department to support research in radiographic image processing and image display. Two specially designed VMEbus components, the base image memory and the video display memory, were integrated into the AVC and are the subject of this thesis. The base image memory is a multi-ported, 8 megabyte memory array based on random access memory used for raw image storage. It supports a 10 megapixel per second image processor and can interface to a 320 megabit per second network. The video display memory utilizes video memories and is capable of displaying two independent high resolution images, each 1024 pixels by 1536 lines, on separate video monitors. In part, these two memory designs have allowed the AVC to excel as a radiographic image workstation.
APA, Harvard, Vancouver, ISO, and other styles
32

Gatica, Perez Daniel. "Extensive operators in lattices of partitions for digital video analysis /." Thesis, Connect to this title online; UW restricted, 2001. http://hdl.handle.net/1773/5874.

Full text
APA, Harvard, Vancouver, ISO, and other styles
33

Wang, Feng. "Video content analysis and its applications for multimedia authoring of presentations /." View abstract or full-text, 2006. http://library.ust.hk/cgi/db/thesis.pl?CSED%202007%20WANG.

Full text
APA, Harvard, Vancouver, ISO, and other styles
34

Zhao, Bin. "Towards Scalable Analysis of Images and Videos." Research Showcase @ CMU, 2014. http://repository.cmu.edu/dissertations/583.

Full text
Abstract:
With the widespread availability of low-cost devices capable of photo shooting and high-volume video recording, we are facing an explosion of both image and video data. The sheer volume of such visual data poses both challenges and opportunities for machine learning and computer vision research. In image classification, most previous research has focused on small- to medium-scale data sets containing objects from dozens of categories; however, we can easily access images spanning thousands of categories. Unfortunately, despite the well-known advantages and recent advancements of multi-class classification techniques in machine learning, complexity concerns have driven most research on such super large-scale data sets back to simple methods such as nearest neighbor search and one-vs-one or one-vs-rest approaches. Facing an image classification problem with such a huge task space, it is no surprise that these classical algorithms, often favored for their simplicity, are brought to their knees, not only because of the training time and storage cost they incur, but also because of the conceptual awkwardness of such algorithms in massive multi-class paradigms. Therefore, it is our goal to directly address the bigness of image data: not only the large number of training images and high-dimensional image features, but also the large task space. Specifically, we present algorithms capable of efficiently and effectively training classifiers that can differentiate tens of thousands of image classes. As with images, one of the major difficulties in video analysis is the huge amount of data, in the sense that videos can be hours long or even endless. However, it is often true that only a small portion of a video contains important information.
Consequently, algorithms that can automatically detect unusual events within streaming or archival video would significantly improve the efficiency of video analysis and save valuable human attention for only the most salient content. Moreover, given lengthy recorded videos, such as those captured by digital cameras on mobile phones or by surveillance cameras, most users do not have the time or energy to edit the video so that only the most salient and interesting part of the original is kept. To this end, we also develop an algorithm for automatic video summarization without human intervention. Finally, we further extend our research on video summarization to a supervised formulation, where users are asked to generate summaries for a subset of a class of videos of similar nature. Given such manually generated summaries, our algorithm learns the preferred storyline within the given class of videos and automatically generates summaries for the rest of the videos in the class, capturing a storyline similar to that of the manually summarized videos.
APA, Harvard, Vancouver, ISO, and other styles
35

Xu, K. "An investigation of sewer pipe deformation by image analysis of video surveys." Thesis, Swansea University, 1995. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.636703.

Full text
Abstract:
Closed-Circuit Television (CCTV) surveys of sewers are widely used in the UK to assess the structural integrity of sewer pipes. The video images are examined visually and classified into five grades according to the degree of damage that can be observed. For severely damaged pipes this technique is adequate, but there is considerable doubt in classifying pipes with very slight damage. In addition, the archiving of very many video tapes is expensive, and repeated access to video images of particular sections of pipe is difficult and time consuming. Therefore, an automatic sewer-pipe inspection system is required, based on the CCTV survey, which can extract and assess the structural condition of sewer pipes to ensure accuracy, efficiency, and economy of sewer pipe examination. To this end, the objective of the thesis is to investigate the practical use of computer vision for automatic pipe-joint assessment, with the main effort concentrated on software development. Initial damage in sewer pipes is associated with changes in the shape of the pipe profile, the undamaged profile being circular. Preliminary work was conducted to investigate the use of the pipe joint as a measure of the pipe shape. Automated pipe-joint shape assessments were investigated using existing software, but this could not handle images from video tapes; however, a manual technique proved that pipe joints could be used to assess pipe shape change. The main achievement of this work is the investigation of image processing algorithms, and associated software development, for pipe-joint boundary extraction that works with relatively poor contrast and noisy backgrounds, as well as boundary shape recognition and analysis that deals with incomplete boundary outlines. A reference circle for the undamaged profile was also estimated for use in pipe-joint shape discrimination. For most video pictures, reasonable results are obtained with these algorithms.
Two algorithms have been investigated for crack detection, one based on boundary curvature analysis and the other on a new boundary segment analysis technique. A neural network was also introduced into pipe-joint shape discrimination. A Sewer Image Processing System (SIPS) has been established, using Microsoft Windows application software, based on the image processing techniques developed in this work.
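The shape-discrimination step, comparing an extracted joint boundary against the reference circle for the undamaged profile, can be sketched as follows (an illustrative reconstruction; the tolerance value is an arbitrary placeholder, not a figure from the thesis):

```python
import math

def pipe_deformation(boundary, cx, cy, r_ref, tol=0.05):
    """Compare extracted pipe-joint boundary points against a reference
    circle (centre (cx, cy), radius r_ref) for the undamaged profile.
    Flags the joint as deformed when the largest relative radial deviation
    exceeds `tol`. Returns (is_deformed, max_relative_deviation)."""
    dev = max(abs(math.hypot(x - cx, y - cy) - r_ref) / r_ref
              for x, y in boundary)
    return dev > tol, dev
```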
APA, Harvard, Vancouver, ISO, and other styles
36

Srinivasan, Sabeshan. "Object Tracking in Distributed Video Networks Using Multi-Dimentional Signatures." Fogler Library, University of Maine, 2006. http://www.library.umaine.edu/theses/pdf/SrinivasanSX2006.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
37

Cheng, Guangchun. "Video Analytics with Spatio-Temporal Characteristics of Activities." Thesis, University of North Texas, 2015. https://digital.library.unt.edu/ark:/67531/metadc799541/.

Full text
Abstract:
As video capturing devices become more ubiquitous, from surveillance cameras to smartphones, the demand for automated video analysis is increasing as never before. One obstacle in this process is to efficiently locate where a human operator's attention should be, and another is to determine the specific types of activities or actions without ambiguity. It is the special interest of this dissertation to locate spatial and temporal regions of interest in videos and to develop a better action representation for video-based activity analysis. This dissertation follows the scheme of "locating then recognizing" activities of interest in videos, i.e., the locations of potentially interesting activities are estimated before performing in-depth analysis. Theoretical properties of regions of interest in videos are first exploited, based on which a unifying framework is proposed to locate both spatial and temporal regions of interest with the same parameter settings. The approach estimates the distribution of motion based on 3D structure tensors and locates regions of interest according to persistent occurrences of low probability. Two further contributions are made to better represent actions. The first is a unifying model of spatio-temporal relationships between reusable mid-level actions, which bridge low-level pixels and high-level activities. Dense trajectories are clustered to construct mid-level actionlets, and the temporal relationships between actionlets are modeled as Action Graphs based on Allen interval predicates. The second is a novel and efficient representation of action graphs based on a sparse coding framework. Action graphs are first represented using Laplacian matrices and then decomposed as a linear combination of primitive dictionary items following a sparse coding scheme. The optimization is eventually formulated and solved as a determinant maximization problem, and 1-nearest neighbor is used for action classification. The experiments have shown better results than existing approaches for region-of-interest detection and action recognition.
APA, Harvard, Vancouver, ISO, and other styles
38

Maczyta, Léo. "Dynamic visual saliency in image sequences." Thesis, Rennes 1, 2020. http://www.theses.fr/2020REN1S046.

Full text
Abstract:
Our thesis research is concerned with the estimation of motion saliency in image sequences. First, we define an original method to detect frames in which a salient motion is present; for this, we propose a framework relying on a deep neural network and on compensation of the dominant camera motion. Second, we design a method for estimating motion saliency maps. This method requires no learning: the motion saliency cue is obtained by an optical flow inpainting step, followed by a comparison with the initial flow. Third, we consider the problem of trajectory saliency estimation to handle saliency that builds up progressively over time. We build a weakly supervised framework based on a recurrent auto-encoder that represents trajectories with latent codes. The performance of the three methods was experimentally assessed on real video datasets.
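The flow-inpainting saliency cue can be illustrated with a 1-D toy (our own simplification: linear interpolation stands in for the actual optical-flow inpainting, and the candidate region is given rather than estimated). Flow values inside the region are reconstructed from the surrounding flow, and saliency is the magnitude of the residual between the original and reconstructed flow, so pixels moving like their surroundings get low saliency.

```python
def motion_saliency(flow, region):
    """1-D sketch of the flow-inpainting saliency cue: 'inpaint' the flow
    values at the indices in `region` by linear interpolation between the
    flow just outside the region, then score each pixel by the absolute
    difference between its original and inpainted flow.
    flow: list of floats; region: set of interior indices (not at the ends)."""
    n = len(flow)
    sal = [0.0] * n
    inside = sorted(region)
    if not inside:
        return sal
    left, right = inside[0] - 1, inside[-1] + 1  # boundary pixels
    fl, fr = flow[left], flow[right]
    span = inside[-1] - inside[0] + 2
    for i in inside:
        t = (i - left) / span
        inpainted = (1 - t) * fl + t * fr
        sal[i] = abs(flow[i] - inpainted)
    return sal
```

An object moving against a static background gets saliency equal to its flow magnitude, while globally coherent motion yields near-zero residuals everywhere.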
APA, Harvard, Vancouver, ISO, and other styles
39

Luo, Ying. "Statistical semantic analysis of spatio-temporal image sequences /." Thesis, Connect to this title online; UW restricted, 2004. http://hdl.handle.net/1773/5884.

Full text
APA, Harvard, Vancouver, ISO, and other styles
40

Kong, Lingchao. "Modeling of Video Quality for Automatic Video Analysis and Its Applications in Wireless Camera Networks." University of Cincinnati / OhioLINK, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1563295836742645.

Full text
APA, Harvard, Vancouver, ISO, and other styles
41

Schneider, Bradley A. "Gait Analysis from Wearable Devices using Image and Signal Processing." Wright State University / OhioLINK, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=wright1514820042511803.

Full text
APA, Harvard, Vancouver, ISO, and other styles
42

Kumara, Muthukudage Jayantha. "Automated Real-time Objects Detection in Colonoscopy Videos for Quality Measurements." Thesis, University of North Texas, 2013. https://digital.library.unt.edu/ark:/67531/metadc283843/.

Full text
Abstract:
The effectiveness of colonoscopy depends on the quality of the inspection of the colon, and there has been no automated measurement method to evaluate that quality. This thesis addresses the issue by investigating an automated post-procedure quality measurement technique and proposing a novel approach that automatically determines the percentage of stool areas in images from digitized colonoscopy video files. It involves classifying image pixels based on their color features, using a new method of planes in RGB (red, green and blue) color space. The limitation of post-procedure quality measurement is that the measurements only become available long after the procedure is done and the patient has been released. A better approach is to flag any sub-optimal inspection immediately, so that the endoscopist can improve the quality in real time during the procedure. This thesis therefore also proposes an extension of the post-procedure method that detects stool, bite-block, and blood regions in real time using color features in HSV color space; these three objects play a major role in colonoscopy quality measurement. The proposed method partitions the very large set of positive examples of each of these objects into a number of groups, formed by taking the intersection of the positive examples with a hyperplane, named the 'positive plane'. Convex hulls are used to model the positive planes. Comparisons with traditional classifiers such as K-nearest neighbor (K-NN) and support vector machines (SVM) prove the soundness of the proposed method in terms of accuracy and speed, both of which are critical in the targeted real-time quality measurement system.
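As a rough illustration of classifying pixels by color features in HSV space (the thesis's own positive-plane/convex-hull method is more elaborate), a minimal HSV-range classifier might look like this. The hue and saturation thresholds below are purely hypothetical placeholders, not values from the thesis:

```python
import colorsys

def classify_pixel(r, g, b, hue_range, sat_min=0.3, val_min=0.2):
    """Illustrative HSV-range test for a single RGB pixel (channels
    in 0..1): the pixel is a 'positive' if its hue falls in the
    target range and it is sufficiently saturated and bright."""
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    lo, hi = hue_range
    return lo <= h <= hi and s >= sat_min and v >= val_min

def region_fraction(pixels, hue_range):
    """Fraction of pixels falling in the target color region, e.g.
    to estimate the percentage of stool area in a frame."""
    hits = sum(classify_pixel(r, g, b, hue_range) for r, g, b in pixels)
    return hits / len(pixels)
```

A real system would classify whole frames vectorized (e.g. with NumPy or OpenCV) rather than per pixel, and would learn the decision regions from labeled examples as the thesis does.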
APA, Harvard, Vancouver, ISO, and other styles
43

Robinault, Lionel. "Mosaïque d’images multi résolution et applications." Thesis, Lyon 2, 2009. http://www.theses.fr/2009LYO20039.

Full text
Abstract:
The work presented in this thesis centres on the use of motorized cameras with three degrees of freedom, also called PTZ cameras. These cameras can be steered along two angles: the panorama angle (θ) allows rotation around a vertical axis, and the tilt angle (ϕ) allows rotation around a horizontal axis. While such cameras therefore in theory allow an omnidirectional view, in practice they usually limit rotation in the panorama angle and, above all, in the tilt angle. Beyond controlling the rotations, these cameras also allow the focal length to be controlled, providing an additional degree of freedom. Compared with other models, PTZ cameras thus make it possible to build a very high-resolution panorama, i.e. an extended representation of a scene constructed from a collection of images. The first step in building a panorama is the acquisition of the different views. To this end, we carried out a theoretical study of how to cover the sphere optimally with rectangular surfaces while limiting the overlap zones. This study lets us compute an optimal camera trajectory and limit the number of views needed to represent the scene. We also propose various processing steps that appreciably improve the rendering and correct most of the defects tied to assembling a collection of images acquired with different capture parameters. A significant part of our work was devoted to automatic image registration in real time, meaning that each step is performed in under 40 ms so that 25 frames per second can be processed. The technology we developed achieves particularly precise registration with an execution time of about 4 ms (AMD 1.8 GHz).
Finally, we propose two applications for tracking moving objects that follow directly from our research. The first combines a PTZ camera with a spherical mirror; together, these two elements make it possible to detect any moving object in the scene and then to focus on one of them. Within this application, we propose an automatic calibration algorithm for the camera and mirror assembly. The second application uses only the PTZ camera and allows objects in the scene to be segmented and tracked while the camera moves. Compared with classical target-tracking applications using a PTZ camera, our approach differs in that we perform a fine segmentation of the objects, allowing their classification.
The thesis considers the use of motorized cameras with three degrees of freedom, commonly called PTZ cameras. The orientation of such a camera is controlled through two angles: the panorama angle (θ) describes rotation around a vertical axis, and the tilt angle (ϕ) rotation around a horizontal axis. Theoretically, these cameras can cover an omnidirectional field of vision of 4π steradians; in practice, the panorama angle and especially the tilt angle are limited. In addition to controlling the orientation of the camera, it is also possible to control the focal distance, allowing an additional degree of freedom. Compared to other equipment, PTZ cameras thus allow one to build a panorama of very high resolution, a panorama being a wide representation of a scene built from a collection of images. The first stage in the construction of a panorama is the acquisition of the various images. To this end, we made a theoretical study to determine the optimal paving of the sphere with rectangular surfaces that minimizes the zones of overlap. This study enables us to calculate an optimal trajectory of the camera and to limit the number of images necessary for the representation of the scene. We also propose various processing techniques which appreciably improve the rendering of the mosaic image and correct the majority of the defects related to assembling a collection of images acquired with differing capture parameters. A significant part of our work was devoted to automatic image registration in real time, i.e. in under 40 ms per frame. The technology that we developed makes it possible to obtain particularly precise image registration with a computation time of about 4 ms (AMD 1.8 GHz). Our research leads directly to two proposed applications for the tracking of moving objects. The first involves the use of a PTZ camera and a spherical mirror.
The combination of these two elements makes it possible to detect any moving object in the scene and then to focus on one of them. Within the framework of this application, we propose an automatic calibration algorithm for the camera and mirror system. The second application exploits only the PTZ camera and allows the segmentation and tracking of objects in the scene during the movement of the camera. Compared to traditional applications of motion detection with a PTZ camera, our approach differs in that it computes a fine segmentation of the objects, allowing their classification.
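The acquisition-planning idea, choosing pan/tilt steps so that neighbouring views overlap just enough, can be sketched as follows. This is an illustrative flat-grid approximation, not the thesis's optimal sphere paving (a true paving needs fewer pan positions at high tilt, where circles of latitude shrink):

```python
import math

def pan_tilt_grid(hfov_deg, vfov_deg, overlap=0.2,
                  pan_range=360.0, tilt_range=120.0):
    """Illustrative PTZ acquisition grid: angular steps chosen so that
    neighbouring views overlap by a given fraction of the field of
    view, and the number of shots needed to cover the reachable
    pan/tilt range. All parameter values are hypothetical examples.
    """
    pan_step = hfov_deg * (1.0 - overlap)      # advance less than one FOV
    tilt_step = vfov_deg * (1.0 - overlap)
    n_pan = math.ceil(pan_range / pan_step)
    n_tilt = math.ceil(tilt_range / tilt_step)
    return pan_step, tilt_step, n_pan * n_tilt
```

For example, a 60°x40° field of view with 20% overlap over a 360°x120° range yields an 8x4 grid of 32 views.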
APA, Harvard, Vancouver, ISO, and other styles
44

Bothmann, Ludwig [Verfasser], and Göran [Akademischer Betreuer] Kauermann. "Efficient statistical analysis of video and image data / Ludwig Bothmann ; Betreuer: Göran Kauermann." München : Universitätsbibliothek der Ludwig-Maximilians-Universität, 2016. http://d-nb.info/1115654764/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
45

Al-Jawad, Naseer. "Exploiting statistical properties of wavelet coefficients for image/video processing and analysis tasks." Thesis, University of Buckingham, 2009. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.601354.

Full text
Abstract:
In this thesis the statistical properties of the high-frequency sub-bands of the wavelet transform are used and exploited in three main applications: image/video feature-preserving compression, face-biometric content-based video retrieval, and face feature extraction for face verification and recognition. The main idea of this thesis was also used previously in watermarking (Dietze 2005), where the watermark can be hidden automatically near the significant features in the wavelet sub-bands. The idea is also used in image compression, where special integer compression is applied on low-constrained devices (Ehlers 2008). In image quality measurement, the Laplace Distribution Histogram (LDH) is likewise used to measure image quality: the theoretical LDH of any high-frequency wavelet sub-band can match the histogram produced from the same high-frequency wavelet sub-band of a high-quality picture, while the LDH of a noisy or blurred one can be fitted against the theoretical curve (Wang and Simoncelli 2005). Some research has used the idea of wavelet high-frequency sub-band feature extraction implicitly; in this thesis we focus explicitly on using the statistical properties of the wavelet sub-bands in the multi-resolution wavelet transform. The fact that the coefficients of each high-frequency wavelet sub-band follow a Laplace distribution (LD) (also called a generalized Gaussian distribution) has been mentioned in the literature. Here the relation between the statistical properties of the wavelet high-frequency sub-bands and feature extraction is well established. The LDH has two tails, making its shape either symmetrical or skewed to the left or the right. This symmetry or skewing is normally around the mean, which is theoretically equal to zero. In our study we paid close attention to these tails, as they represent the significant image features, which can be mapped from the wavelet domain to the spatial domain.
These features can be maintained, accessed, and modified very easily using a certain threshold.
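The Laplace model of high-frequency sub-band coefficients mentioned above can be sketched as follows, using a toy one-level Haar-style detail band and the standard maximum-likelihood Laplace fit (median for location, mean absolute deviation for scale). This is an illustration of the statistical model, not the thesis's implementation:

```python
import numpy as np

def haar_horizontal_detail(img):
    """Toy one-level horizontal high-frequency band: half-differences
    of adjacent column pairs (the Haar wavelet detail filter)."""
    img = np.asarray(img, dtype=float)
    return (img[:, 0::2] - img[:, 1::2]) / 2.0

def fit_laplace(coeffs):
    """ML fit of a Laplace distribution to sub-band coefficients:
    location = sample median, scale = mean absolute deviation from it.
    For natural images the high-frequency sub-band histogram is well
    modelled this way, peaked at zero with heavy tails."""
    c = np.ravel(coeffs)
    mu = np.median(c)
    b = np.mean(np.abs(c - mu))
    return mu, b
```

The tails of the fitted distribution (coefficients with |c - mu| much larger than b) then correspond to the significant image features discussed in the abstract.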
APA, Harvard, Vancouver, ISO, and other styles
46

Al-Jawad, Naseer. "Exploiting Statistical Properties of Wavelet Coefficients for Image/Video Processing and Analysis Tasks." Thesis, University of Exeter, 2009. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.515492.

Full text
APA, Harvard, Vancouver, ISO, and other styles
47

Queiroz, Isabela Nascimento Fernandes De. "Study on methodology for analysis of traffic flow based on video image data." 京都大学 (Kyoto University), 2005. http://hdl.handle.net/2433/145385.

Full text
APA, Harvard, Vancouver, ISO, and other styles
48

Tun, Min Han. "Virtual image sensors to track human activity in a smart house." Thesis, Curtin University, 2007. http://hdl.handle.net/20.500.11937/904.

Full text
Abstract:
With the advancement of computer technology, demand for more accurate and intelligent monitoring systems has also risen. Uses of computer vision and video analysis range from industrial inspection to surveillance. Object detection and segmentation are the first and most fundamental tasks in the analysis of dynamic scenes. Traditionally, this detection and segmentation are done through temporal differencing or statistical modelling methods. One of the most widely used background modelling and segmentation algorithms is the Mixture of Gaussians method developed by Stauffer and Grimson (1999). During the past decade many such algorithms have been developed, ranging from parametric to non-parametric. Many of them use pixel intensities to model the background, while some use texture properties such as Local Binary Patterns. These algorithms function quite well under normal environmental conditions, and each has its own set of advantages and shortcomings. However, they share two drawbacks. The first is the stationary-object problem: when moving objects become stationary, they are merged into the background. The second is that of light changes: when rapid illumination changes occur in the environment, these background modelling algorithms produce large areas of false positives. The algorithms are capable of adapting to the change; however, the quality of the segmentation is very poor during the adaptation phase. In this thesis, a framework to suppress these false positives is introduced. Image properties such as edges and textures are used to reduce the number of false positives during the adaptation phase. The framework is built on the idea of sequential pattern recognition. In any background modelling algorithm, the importance of multiple image features as well as of different spatial scales cannot be overlooked.
Failure to focus attention on these two factors results in difficulty detecting and reducing false alarms caused by rapid light change and other conditions. The use of edge features in false-alarm suppression is also explored. Edges are somewhat more resistant to environmental changes in video scenes: the assumption is that regardless of environmental changes, such as illumination change, the edges of objects should remain the same. The edge-based approach is tested on several videos containing rapid light changes and shows promising results. Texture is then used to analyse video images and remove false-alarm regions. A texture-gradient approach and Laws Texture Energy Measures are used to find and remove false positives; Laws Texture Energy Measures are found to perform better than the gradient approach. The results of using edges, texture, and different combinations of the two for false-positive suppression are also presented in this work. The false-positive suppression framework is applied to a smart-house scenario that uses cameras to model "virtual sensors" detecting interactions of occupants with devices. Results show that the accuracy of the virtual sensors, compared with the ground truth, is improved.
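The edge-based suppression idea, that illumination changes shift pixel intensities but largely preserve edges, can be sketched as follows (an illustrative simplification, not the thesis's framework):

```python
import numpy as np

def detect_foreground(frame, background, t_int=30.0, t_edge=10.0):
    """Toy false-positive suppression: a pixel is kept as foreground
    only if both its intensity AND its local gradient (edge strength)
    differ from the background model. Under a global illumination
    change, intensities shift but edges are largely preserved, so
    edge-stable pixels are suppressed. Thresholds are hypothetical.
    """
    frame = np.asarray(frame, dtype=float)
    background = np.asarray(background, dtype=float)
    intensity_change = np.abs(frame - background) > t_int

    def grad_mag(a):
        # Simple gradient magnitude as an edge cue.
        gy, gx = np.gradient(a)
        return np.hypot(gx, gy)

    edge_change = np.abs(grad_mag(frame) - grad_mag(background)) > t_edge
    return intensity_change & edge_change
```

A uniform brightness jump over the whole frame triggers the intensity test everywhere but the edge test nowhere, so it is fully suppressed; a newly appeared object introduces new edges and survives.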
APA, Harvard, Vancouver, ISO, and other styles
49

Qadir, Ghulam. "Digital foresnic analysis for compressed images and videos." Thesis, University of Surrey, 2013. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.604341.

Full text
Abstract:
The advancement of imaging devices and image manipulation software has made the tasks of tracking and protecting digital multimedia content increasingly difficult. To protect and verify the integrity of digital content, many active watermarking and passive forensic techniques have been developed for various image and video formats over the past decade or so. In this thesis, we focus on the research and development of digital image forensic techniques, particularly for recovering the processing history of JPEG2000 (J2K) images. J2K is a new and improved format introduced by the Joint Photographic Experts Group (JPEG). Unlike JPEG, it is based on the Discrete Wavelet Transform (DWT) and has a more complex coding system. However, the compression efficiency of J2K is significantly better than that of JPEG, and it can be used for storing CCTV data and for digital cinema applications. In this thesis, the novel use of Benford's Law for the analysis of J2K compressed images is investigated. Benford's Law is essentially a statistical law that has previously been used for the detection of financial and accounting frauds. Initial results obtained after testing 1,338 grayscale images show that the first-digit probability distribution of the DWT coefficients follows Benford's Law. However, when images are compressed with J2K compression, the first-digit probability graph starts to deviate from the distribution predicted by Benford's Law, so the compression can be detected via a divergence factor derived from the graph. Furthermore, Benford's Law can be applied to the analysis of an image feature known as glare, by investigating the anomaly in the first-digit probability curve of the DWT coefficients. The results show that out of the 1,338 images, 122 exhibit an irregular peak at digit 5, and each of these images contains glare. This can potentially be used as a tool to isolate images containing glare in large-scale image databases.
This thesis also presents a novel J2K compression strength detection technique. The compression strength is classified into three categories, low, medium and high, corresponding to high, medium and low subjective image quality respectively, over bit rates ranging from 0 to 1 bit per pixel (bpp). The proposed technique employs a no-reference (NR) perceptual blur metric and double-compression calibration to identify heuristic rules that are then used to design an unsupervised classifier for determining the J2K compression strength of a given image. In our experiments we use 100 images to identify the heuristic rules, followed by another set of 100 different images for testing the performance of our method. The results show that compression strength detection achieves an accuracy of approximately 90%. The thesis also presents a new benchmarking tool for video forensics known as the Surrey University Library for Forensic Analysis (SULFA). The library is considered to be the first of its kind available to the research community and contains 150 untouched original videos obtained from three digital cameras of different makes and models, as well as a number of tampered videos and supporting ground-truth datasets that can be used for video forensic experiments and analysis.
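The Benford's Law test itself is easy to sketch: compare the empirical leading-digit distribution of, for example, DWT coefficient magnitudes with the theoretical curve P(d) = log10(1 + 1/d), and use the total deviation as a divergence factor. The sketch below illustrates the principle only; the thesis's divergence factor may be defined differently:

```python
import math

def benford_expected():
    """Benford's Law: P(d) = log10(1 + 1/d) for leading digit d."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}

def first_digit_distribution(values):
    """Empirical leading-digit distribution of the non-zero values
    (e.g. DWT coefficient magnitudes)."""
    digits = [int(f"{abs(v):e}"[0]) for v in values if v != 0]
    n = len(digits)
    return {d: digits.count(d) / n for d in range(1, 10)}

def divergence(values):
    """Total absolute deviation from the Benford curve; compression
    is expected to increase this divergence."""
    exp, obs = benford_expected(), first_digit_distribution(values)
    return sum(abs(exp[d] - obs.get(d, 0.0)) for d in range(1, 10))
```

Benford-conforming data (e.g. powers of 2) gives a small divergence, while data with a uniform leading-digit distribution gives a large one.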
APA, Harvard, Vancouver, ISO, and other styles
50

Olgemar, Markus. "Camera Based Navigation : Matching between Sensor reference and Video image." Thesis, Linköping University, Department of Electrical Engineering, 2008. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-15952.

Full text
Abstract:

Aircraft navigation typically relies on an Inertial Navigation System (INS) and a Global Navigational Satellite System (GNSS). In navigational warfare the GNSS can be jammed, so a third navigational system is needed. The system tried in this thesis is camera-based navigation, in which the position is determined from a video camera and a sensor reference. This thesis deals with the matching between the sensor reference and the video image.

Two methods have been implemented: normalized cross correlation and position determination through a homography. Normalized cross correlation produces a correlation matrix from which the best match is selected. The other method uses point correspondences between the images to estimate a homography, and a position is obtained from the homography; the more point correspondences available, the better the position determination.

The results have been quite good: both methods recover the correct position when the Euler angles of the UAV are known, with normalized cross correlation performing best among the tested methods.
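Normalized cross correlation as used here can be sketched in a few lines: slide the template over the image, normalize each window, and take the peak of the resulting correlation matrix (a didactic brute-force version; real implementations use FFTs or integral images):

```python
import numpy as np

def ncc_match(image, template):
    """Minimal normalized cross correlation: slide the template over
    the image and build the correlation matrix; the peak gives the
    most likely position of the template (here, of the sensor
    reference patch within the video image). Returns (row, col).
    """
    ih, iw = image.shape
    th, tw = template.shape
    t = template - template.mean()             # zero-mean template
    t_norm = np.sqrt((t ** 2).sum())
    scores = np.full((ih - th + 1, iw - tw + 1), -1.0)
    for y in range(scores.shape[0]):
        for x in range(scores.shape[1]):
            w = image[y:y + th, x:x + tw]
            wz = w - w.mean()                  # zero-mean window
            denom = np.sqrt((wz ** 2).sum()) * t_norm
            if denom > 0:
                scores[y, x] = (wz * t).sum() / denom
    return np.unravel_index(np.argmax(scores), scores.shape)
```

The zero-mean normalization is what makes the score robust to brightness and contrast differences between the sensor reference and the video image.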

APA, Harvard, Vancouver, ISO, and other styles
