Dissertations / Theses on the topic 'Video'


Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the top 50 dissertations / theses for your research on the topic 'Video.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse dissertations / theses from a wide variety of disciplines and organise your bibliography correctly.

1

Sedlařík, Vladimír. "Informační strategie firmy." Master's thesis, Vysoké učení technické v Brně. Fakulta podnikatelská, 2012. http://www.nusl.cz/ntk/nusl-223526.

Full text
Abstract:
This thesis analyzes the YouTube service and describes its main deficiencies. Based on theoretical methods and analyses, its main goal is to design a service that will solve the main YouTube problems, build a company around this service and introduce this service to the market. This service will not replace YouTube, but it will supplement it. Further, this work will suggest a possible structure, strategy and information strategy of this new company and its estimated financial results in the first few years.
APA, Harvard, Vancouver, ISO, and other styles
2

Lindskog, Eric, and Jesper Wrang. "Design of video players for branched videos." Thesis, Linköpings universitet, Institutionen för datavetenskap, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-148592.

Full text
Abstract:
Interactive branched video allows users to make viewing decisions while watching that affect the playback path of the video and potentially the outcome of the story. This type of video introduces new design challenges, for example in displaying the playback progress and the structure of the branched video, as well as the choices that viewers can make. In this thesis we test three implementations of working video players with different types of playback bars: one showing the full timeline with no moving parts, one that zooms into the currently watched section of the video, and one that leverages a fisheye distortion. A number of usability tests are carried out using surveys, complemented with observations made during the tests. Based on these user tests we concluded that the implementation with a zoomed-in playback bar was the easiest to understand, and that the fisheye effect received mixed reactions, ranging from distracting and annoying to interesting and clear. With this feedback, a new set of implementations was created and solutions for each component of the video player were identified. These new implementations support more general solutions for the shape of the branch segments and for the placement of the choices for upcoming branches. The new implementations have not gone through any testing, but we expect that future work can further explore this subject with the help of our code and suggestions.
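The fisheye playback bar mentioned above can be illustrated with a standard graphical fisheye distortion in the style of Sarkar and Brown; this is a sketch of the general technique, not the authors' implementation, and the distortion degree `d` is an assumed parameter:

```python
def fisheye(x: float, focus: float, d: float = 3.0) -> float:
    """Map a normalized timeline position x in [0, 1] to its distorted
    position, magnifying the region around `focus`.

    d is the degree of distortion (d = 0 leaves positions unchanged).
    Endpoints 0 and 1 and the focus itself are fixed points.
    """
    dmax = (1.0 - focus) if x >= focus else focus  # distance focus -> edge
    if dmax == 0.0:
        return x                                   # focus on an edge: no-op
    t = abs(x - focus) / dmax                      # normalized distance
    g = ((d + 1.0) * t) / (d * t + 1.0)            # fisheye magnification curve
    return focus + (g * dmax if x >= focus else -g * dmax)
```

Positions near the focus spread apart while the ends of the bar stay anchored, which is what lets a narrow section of a branched timeline be inspected without losing the overall structure.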
APA, Harvard, Vancouver, ISO, and other styles
3

Salam, Sazilah. "VidIO : a model for personalized video information management." Thesis, University of Southampton, 1996. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.242411.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Aklouf, Mourad. "Video for events : Compression and transport of the next generation video codec." Electronic Thesis or Diss., université Paris-Saclay, 2022. http://www.theses.fr/2022UPASG029.

Full text
Abstract:
The acquisition and delivery of video content with minimal latency have become essential in several business areas such as sports broadcasting, video conferencing, telepresence, remote vehicle operation, and remote system control. The live streaming industry grew in 2020, and it will expand further in the next few years with the emergence of new high-efficiency video codecs based on the Versatile Video Coding (VVC) standard and the fifth generation of mobile networks (5G). HTTP Adaptive Streaming (HAS) methods such as MPEG-DASH, using algorithms to adapt the transmission rate of compressed video, have proven very effective at improving the quality of experience (QoE) in a video-on-demand (VOD) context. Nevertheless, minimizing the delay between image acquisition and display at the receiver is essential in applications where latency is critical. Most rate adaptation algorithms are developed to optimize video transmission from a server in the core network to mobile clients. In applications requiring low-latency streaming, such as remote control of drones or broadcasting of sports events, the role of the server is played by a mobile terminal, which acquires and compresses the video and transmits the compressed stream via a radio access channel to one or more clients. Client-driven rate adaptation approaches are therefore unsuitable in this context because of the variability of the channel characteristics. In addition, HAS schemes, whose decisions are made with a periodicity on the order of a second, are not sufficiently reactive when the server is moving, which may introduce significant delays. It is therefore important to use a very fine adaptation granularity in order to reduce the end-to-end delay. The reduced size of the transmission and reception buffers (to minimize latency) makes rate adaptation more difficult in our use case.
When the bandwidth varies with a time constant smaller than the period at which the regulation is performed, bad transmission rate decisions can induce a significant latency overhead. The aim of this thesis is to provide some answers to the problem of low-latency delivery of video acquired, compressed, and transmitted by mobile terminals. We first present a frame-by-frame rate adaptation algorithm for low-latency broadcasting. A Model Predictive Control (MPC) approach is proposed to determine the coding rate of each frame to be transmitted. This approach uses information about the buffer level of the transmitter and about the characteristics of the transmission channel. Since the frames are coded live, a model relating the quantization parameter (QP) to the output rate of the video encoder is required. We therefore propose a new model linking the rate to the QP of the current frame and to the distortion of the previous frame. In the context of a frame-by-frame decision on the coding rate, this model provides much better results than the reference models in the literature. In addition to the above techniques, we have also proposed tools to reduce the complexity of video encoders such as VVC. The current version of the VVC encoder (VTM10) has an execution time nine times that of the HEVC encoder; the VVC encoder is therefore not suitable for real-time encoding and streaming applications on currently available platforms. In this context, we present a systematic branch-and-prune method to identify a set of coding tools that can be disabled while satisfying a constraint on coding efficiency. This work contributes to the realization of a real-time VVC encoder.
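The frame-by-frame decision described in this abstract can be sketched as follows, assuming a toy rate model R = α · D_prev · 2^(−QP/6) in place of the thesis's actual model (α, the buffer-correction gain, and the H.26x-style QP clamp are illustrative assumptions):

```python
import math

def frame_bit_budget(bandwidth_bps: float, fps: float,
                     buffer_bits: float, target_buffer_bits: float,
                     gain: float = 0.5) -> float:
    """Per-frame bit budget: the nominal share of the channel plus a
    correction that drives the sender buffer toward its low-latency target."""
    return max(0.0, bandwidth_bps / fps
               + gain * (target_buffer_bits - buffer_bits))

def qp_for_budget(budget_bits: float, alpha: float,
                  prev_distortion: float) -> int:
    """Invert the toy model R = alpha * prev_distortion * 2**(-qp/6)
    to pick the quantization parameter for the next live-coded frame."""
    qp = -6.0 * math.log2(budget_bits / (alpha * prev_distortion))
    return max(0, min(51, round(qp)))  # clamp to the usual H.26x QP range
```

An MPC-style controller would evaluate such a model over a short horizon of future frames; the sketch keeps only the single-frame inversion to show how buffer level and channel rate feed the per-frame QP choice.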
APA, Harvard, Vancouver, ISO, and other styles
5

Le, Thuc Trinh. "Video inpainting and semi-supervised object removal." Thesis, Université Paris-Saclay (ComUE), 2019. http://www.theses.fr/2019SACLT026/document.

Full text
Abstract:
Nowadays, the rapid growth of video content creates a massive demand for video editing applications. In this dissertation, we solve several problems related to video post-processing, focusing on object removal in video. We divide this task into two problems: (1) a video object segmentation step to select which objects to remove, and (2) a video inpainting step to fill the damaged regions. For the video segmentation problem, we design a system suitable for object removal applications with different requirements in terms of accuracy and efficiency. Our approach relies on the combination of Convolutional Neural Networks (CNNs) for segmentation and a classical mask tracking method. In particular, we adopt image segmentation networks and apply them to video by performing frame-by-frame segmentation. By exploiting both offline and online training with only a first-frame annotation, the networks are able to produce highly accurate video object segmentation. Besides, we propose a mask tracking module to ensure temporal continuity and a mask linking module to ensure identity coherence across frames. Moreover, we introduce a simple way to learn the dilation layer in the mask, which helps us create suitable masks for the video object removal application. For the video inpainting problem, we divide our work into two categories based on the type of background. In particular, we present a simple motion-guided pixel propagation method to deal with static-background cases. We show that the problem of object removal with a static background can be solved efficiently using a simple motion-based technique. To deal with dynamic backgrounds, we introduce a video inpainting method that optimizes a global patch-based energy function. To increase the speed of the algorithm, we propose a parallel extension of the 3D PatchMatch algorithm. To improve accuracy, we systematically incorporate the optical flow in the overall process.
We end up with a video inpainting method that is able to reconstruct moving objects as well as reproduce dynamic textures while running in a reasonable time. Finally, we combine the video object segmentation and video inpainting methods into a unified system to remove undesired objects in videos. To the best of our knowledge, this is the first system of its kind. In our system, the user only needs to approximately delimit, in the first frame, the objects to be edited. This annotation process is facilitated by superpixels. These annotations are then refined and propagated through the video by the video object segmentation method. One or several objects can then be removed automatically using our video inpainting methods. This results in a flexible computational video editing tool, with numerous potential applications ranging from crowd suppression to the correction of unphysical scenes.
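The pixel propagation idea for static backgrounds can be illustrated with a much-simplified sketch that copies each hole pixel from the temporally nearest frame where it is visible; the thesis's method is motion-guided, whereas this toy version assumes a perfectly static, aligned background and uses no optical flow:

```python
import numpy as np

def fill_static_background(frames: np.ndarray, masks: np.ndarray) -> np.ndarray:
    """frames: (T, H, W) grayscale video; masks: (T, H, W) bool, True = hole.

    For each frame, fill every hole pixel with its value from the
    temporally nearest frame in which that pixel is not occluded.
    """
    out = frames.astype(float).copy()
    T = frames.shape[0]
    for t in range(T):
        remaining = masks[t].copy()            # hole pixels still unfilled
        for dt in range(1, T):
            if not remaining.any():
                break
            for s in (t - dt, t + dt):         # nearest neighbours first
                if 0 <= s < T:
                    take = remaining & ~masks[s]   # visible in frame s
                    out[t][take] = frames[s][take]
                    remaining &= ~take
    return out
```

Pixels that are occluded in every frame stay unfilled, which is exactly where a patch-based inpainting method (the dynamic-background case above) has to take over.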
APA, Harvard, Vancouver, ISO, and other styles
6

Lei, Zhijun. "Video transcoding techniques for wireless video communications." Thesis, University of Ottawa (Canada), 2004. http://hdl.handle.net/10393/29134.

Full text
Abstract:
The transmission of compressed video over channels with different capacities may require a reduction in bit rate when the transmission channel has a lower capacity than that required by the video bit-stream, or when the channel capacity changes over time. The process of converting one compressed video format into another is known as transcoding. This thesis addresses the specific transcoding problem of dynamic bit-rate adaptation for transmission over low-bandwidth wireless channels. Transmitting compressed video over lower-bandwidth wireless channels requires accurate and efficient rate-control schemes. In this thesis, we propose several techniques to improve transcoding performance. Based on our experimental results, we present an approximate linear bit allocation model and a macroblock-layer rate-control algorithm, which together achieve accurate transcoding bit-rates. By reusing useful statistical information from the incoming compressed video, the bit-rate of the transcoded video can be determined according to the video scene context. Considering a specific bursty-error wireless channel, we propose a solution that combines video transcoding with an ARQ protocol to transmit compressed video over this channel. To ensure that the end decoder can decode and play the transcoded video within the required end-to-end delay, we analyze the rate and buffer constraints of the transcoder and derive the conditions that the transcoder must meet. To test the proposed solution, we use a statistical channel model to simulate the wireless channel, and use this model together with channel observations to estimate the effective channel bandwidth, which is fed back to the transcoder for better rate control. In this thesis, we discuss two applications.
For real time video communication over wireless channel, we propose an algorithm that determines the transcoding scaling factor considering end-to-end delay, buffer fullness and effective channel bandwidth. For pre-encoded video distribution over wireless channels, we propose an algorithm which can determine the transcoding bit budget based on end-to-end delay, effective bandwidth, and original video bit profile. The proposed algorithm outperforms H.263 TMN8 in terms of video quality and buffer behavior with the same computational requirements.
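A minimal sketch of the delay-constrained bit allocation described above; the drain-within-delay cap and the EWMA bandwidth estimator are illustrative assumptions, not the thesis's exact formulas:

```python
def transcoder_frame_budget(eff_bw_bps: float, fps: float,
                            buffer_bits: float, max_delay_s: float) -> float:
    """Bit budget for the next transcoded frame: its nominal share of the
    effective bandwidth, capped so that everything already queued in the
    transcoder buffer can still drain within the end-to-end delay bound."""
    nominal = eff_bw_bps / fps
    delay_cap = eff_bw_bps * max_delay_s - buffer_bits
    return max(0.0, min(nominal, delay_cap))

def update_eff_bw(est_bps: float, observed_bps: float, w: float = 0.9) -> float:
    """EWMA estimate of the effective channel bandwidth, updated from
    channel observations and fed back to the transcoder's rate control."""
    return w * est_bps + (1.0 - w) * observed_bps
```

As the buffer fills (for example after a burst of channel errors triggers ARQ retransmissions), the delay cap shrinks the budget toward zero, forcing the transcoder to scale down the bit-rate rather than violate the delay constraint.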
APA, Harvard, Vancouver, ISO, and other styles
7

Milovanovic, Marta. "Pruning and compression of multi-view content for immersive video coding." Electronic Thesis or Diss., Institut polytechnique de Paris, 2023. http://www.theses.fr/2023IPPAT023.

Full text
Abstract:
This thesis addresses the problem of efficient compression of immersive video content represented in the Multiview Video plus Depth (MVD) format. The Moving Picture Experts Group (MPEG) standard for the transmission of MVD data is called MPEG Immersive Video (MIV); it utilizes 2D video codecs to compress the source texture and depth information. Compared to traditional video coding, immersive video coding is more complex and constrained not only by the trade-off between bitrate and quality, but also by the pixel rate. Because of that, MIV uses pruning to reduce the pixel rate and inter-view correlations, creating a mosaic of image pieces (patches). Decoder-side depth estimation (DSDE) has emerged as an alternative approach that improves the immersive video system by avoiding the transmission of depth maps and moving the depth estimation process to the decoder side. DSDE has been studied for the case of numerous fully transmitted views (without pruning). In this thesis, we demonstrate possible advances in immersive video coding, with an emphasis on pruning the source content. We go beyond DSDE and examine the distinct effect of patch-level depth restoration at the decoder side. We propose two approaches to incorporate DSDE on content pruned with MIV. The first approach excludes a subset of depth maps from the transmission, and the second approach uses the quality of depth patches estimated at the encoder side to distinguish between those that need to be transmitted and those that can be recovered at the decoder side. Our experiments show an average BD-rate gain of 4.63% for Y-PSNR. Furthermore, we explore the use of neural image-based rendering (IBR) techniques to enhance the quality of novel view synthesis, and show that neural synthesis itself provides the information needed to prune the content. Our results show a good trade-off between pixel rate and synthesis quality, achieving view synthesis improvements of 3.6 dB on average.
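The second approach, deciding at the encoder which depth patches must be transmitted and which can be recovered by decoder-side estimation, can be sketched as follows; the PSNR criterion and the 35 dB threshold are assumptions for illustration, not the thesis's actual quality measure:

```python
import numpy as np

def patches_to_transmit(orig_depth, est_depth, psnr_threshold=35.0):
    """Given original depth patches and their encoder-side DSDE estimates,
    return indices of patches whose estimate is too poor (PSNR below the
    threshold) and must therefore be transmitted; the remaining patches
    are left for the decoder to recover."""
    send = []
    for i, (o, e) in enumerate(zip(orig_depth, est_depth)):
        o, e = np.asarray(o, float), np.asarray(e, float)
        mse = float(np.mean((o - e) ** 2))
        psnr = float("inf") if mse == 0 else 10.0 * np.log10(255.0 ** 2 / mse)
        if psnr < psnr_threshold:
            send.append(i)
    return send
```

Raising the threshold transmits more patches (lower pixel-rate savings, safer depth); lowering it leans harder on decoder-side estimation, which is exactly the trade-off the abstract describes.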
APA, Harvard, Vancouver, ISO, and other styles
8

Arrufat, Batalla Adrià. "Multiple transforms for video coding." Thesis, Rennes, INSA, 2015. http://www.theses.fr/2015ISAR0025/document.

Full text
Abstract:
State-of-the-art video codecs use transforms to ensure a compact signal representation. The transform stage is where compression takes place; however, little variety is observed in the types of transforms used in standardized video coding schemes: often a single transform is considered, usually a Discrete Cosine Transform (DCT). Recently, other transforms have started being considered in addition to the DCT. For instance, in the latest video coding standard, High Efficiency Video Coding (HEVC), 4x4 blocks can make use of the Discrete Sine Transform (DST) and, in addition, it is also possible not to transform them. This reveals an increasing interest in considering a plurality of transforms to achieve higher compression rates. This thesis focuses on extending HEVC through the use of multiple transforms. After a general introduction to video compression and transform coding, two transform designs are studied in detail: the Karhunen-Loève Transform (KLT) and a Rate-Distortion Optimised Transform. These two methods are compared against each other by replacing the transforms in HEVC, an experiment that validates the appropriateness of the designs. A coding scheme that incorporates and boosts the use of multiple transforms is then introduced: several transforms are made available to the encoder, which chooses the one providing the best rate-distortion trade-off. Accordingly, a design method for building systems using multiple transforms is also described. With this coding scheme, significant bit-rate savings are achieved over HEVC, especially when using many complex transforms. However, these improvements come at the expense of increased complexity in terms of encoding, decoding and storage requirements. As a result, simplifications are considered that limit the impact on bit-rate savings. A first approach is introduced in which incomplete transforms are used.
These transforms use a single basis vector and are conceived to work as companions of the HEVC transforms. This technique is evaluated and provides significant complexity reductions over the previous system, although the bit-rate savings are modest. A systematic method is then designed that determines the best trade-offs between the number of transforms and the bit-rate savings. This method uses two different types of transforms: separable orthogonal transforms and, in particular, Discrete Trigonometric Transforms (DTTs). Several designs are presented, allowing for different complexity and bit-rate savings trade-offs. These systems reveal the interest of using multiple transforms for video coding.
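The KLT design step studied in the thesis can be illustrated with the standard construction, learning the basis from the empirical covariance of vectorized residual blocks (a generic sketch, not the thesis's exact training procedure):

```python
import numpy as np

def learn_klt(blocks: np.ndarray) -> np.ndarray:
    """Learn a Karhunen-Loeve Transform from training data.

    blocks: (N, d) matrix of vectorized residual blocks.
    Returns a (d, d) orthonormal basis whose columns are the eigenvectors
    of the empirical covariance, sorted by decreasing variance.
    """
    x = blocks - blocks.mean(axis=0)          # center the training set
    cov = x.T @ x / len(x)                    # empirical covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)    # ascending eigenvalues
    return eigvecs[:, ::-1]                   # strongest component first

# Transform coefficients of a block are its projections onto the basis:
#   coeffs = (block - training_mean) @ basis
```

Because the basis diagonalizes the training covariance, signal energy concentrates in the first coefficients, which is what makes the KLT a natural candidate when tailoring transforms to specific residual statistics.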
APA, Harvard, Vancouver, ISO, and other styles
9

Le, Thuc Trinh. "Video inpainting and semi-supervised object removal." Electronic Thesis or Diss., Université Paris-Saclay (ComUE), 2019. http://www.theses.fr/2019SACLT026.

Full text
Abstract:
De nos jours, l'augmentation rapide de les vidéos crée une demande massive d'applications d'édition de vidéos. Dans cette thèse, nous résolvons plusieurs problèmes relatifs au post-traitement vidéo. Nous nous concentrons sur l'application de suppression d'objets en vidéo. Pour mener à bien cette tâche, nous l'avons divisée en deux problèmes : (1) une étape de segmentation des objets vidéo pour sélectionner les objets à supprimer et (2) une étape d'inpainting vidéo pour remplir les zones endommagées. Pour le problème de la segmentation vidéo, nous concevons un système adapté aux applications de suppression d'objets avec différentes exigences en termes de précision et d'efficacité. Notre approche repose sur la combinaison de réseaux de neurones convolutifs (CNN) pour la segmentation et de la méthode classique de suivi de masques. Nous adoptons des réseaux de segmentation d'images et les appliquons au cas vidéo en effectuant une segmentation image par image. En exploitant à la fois les entraînements en ligne et hors ligne avec uniquement une annotation de première image, les réseaux sont en mesure de produire une segmentation extrêmement précise des objets vidéo. En outre, nous proposons un module de suivi de masque pour assurer la continuité temporelle et un module de liaison de masque pour assurer la cohérence de l'identité entre les trames. De plus, nous présentons un moyen simple d'apprendre la couche de dilatation dans le masque, ce qui nous aide à créer des masques appropriés pour l'application de suppression d'objets vidéo. Pour le problème d'inpainting vidéo, nous divisons notre travail en deux catégories basées sur le type de fond. En particulier, nous présentons une méthode simple de propagation de pixels guidée par le mouvement pour traiter les cas d'arrière-plan statiques. Nous montrons que le problème de la suppression d'objets avec un arrière-plan statique peut être résolu efficacement en utilisant une technique simple basée sur le mouvement.
Pour traiter le fond dynamique, nous introduisons une méthode d'inpainting vidéo optimisant une fonction d'énergie globale basée sur des patchs. Pour augmenter la vitesse de l'algorithme, nous avons proposé une extension parallèle de l'algorithme 3D PatchMatch. Pour améliorer la précision, nous intégrons systématiquement le flux optique dans le processus global. Nous nous retrouvons avec une méthode d'inpainting vidéo capable de reconstruire des objets en mouvement ainsi que de reproduire des textures dynamiques tout en fonctionnant dans des délais raisonnables. Enfin, nous combinons les méthodes de segmentation des objets vidéo et d'inpainting vidéo dans un système unifié pour supprimer les objets non souhaités dans les vidéos. À notre connaissance, il s'agit du premier système de ce type. Dans notre système, l'utilisateur n'a qu'à délimiter approximativement dans le premier cadre les objets à modifier. Ce processus d'annotation est facilité par l'aide de superpixels. Ensuite, ces annotations sont affinées et propagées dans la vidéo par la méthode de segmentation des objets vidéo. Un ou plusieurs objets peuvent ensuite être supprimés automatiquement à l'aide de nos méthodes d'inpainting vidéo. Il en résulte un outil de montage vidéo informatique flexible, avec de nombreuses applications potentielles, allant de la suppression de la foule à la correction de scènes non physiques.
Nowadays, the rapid increase of video creates a massive demand for video-based editing applications. In this dissertation, we solve several problems relating to video post-processing and focus on the object removal application in video. To complete this task, we divided it into two problems: (1) a video object segmentation step to select which objects to remove, and (2) a video inpainting step to fill the damaged regions. For the video segmentation problem, we design a system which is suitable for object removal applications with different requirements in terms of accuracy and efficiency. Our approach relies on the combination of Convolutional Neural Networks (CNNs) for segmentation and the classical mask tracking method. In particular, we adopt segmentation networks designed for the image case and apply them to the video case by performing frame-by-frame segmentation. By exploiting both offline and online training with first-frame annotation only, the networks are able to produce highly accurate video object segmentation. Besides, we propose a mask tracking module to ensure temporal continuity and a mask linking module to ensure identity coherence across frames. Moreover, we introduce a simple way to learn the dilation layer in the mask, which helps us create suitable masks for the video object removal application. For the video inpainting problem, we divide our work into two categories based on the type of background. In particular, we present a simple motion-guided pixel propagation method to deal with static background cases. We show that the problem of object removal with a static background can be solved efficiently using a simple motion-based technique. To deal with dynamic backgrounds, we introduce a video inpainting method that optimizes a global patch-based energy function. To increase the speed of the algorithm, we propose a parallel extension of the 3D PatchMatch algorithm. To improve accuracy, we systematically incorporate the optical flow in the overall process.
We end up with a video inpainting method which is able to reconstruct moving objects as well as reproduce dynamic textures while running in a reasonable time. Finally, we combine the video object segmentation and video inpainting methods into a unified system to remove undesired objects in videos. To the best of our knowledge, this is the first system of this kind. In our system, the user only needs to approximately delimit, in the first frame, the objects to be edited. This annotation process is facilitated by the help of superpixels. Then, these annotations are refined and propagated through the video by the video object segmentation method. One or several objects can then be removed automatically using our video inpainting methods. This results in a flexible computational video editing tool, with numerous potential applications, ranging from crowd suppression to unphysical scene correction.
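The static-background case described above lends itself to a very small illustration. The sketch below is not the dissertation's motion-guided propagation algorithm; it is a simplified stand-in (all names are invented) that fills each masked object pixel with the temporal median of the frames in which the background is visible, which suffices when the camera and background are truly static.

```python
import numpy as np

def fill_static_background(frames, masks):
    """Fill each masked (object) pixel with the temporal median of the
    frames in which that pixel shows the background."""
    frames = np.asarray(frames, dtype=float)   # shape (T, H, W)
    masks = np.asarray(masks, dtype=bool)      # True where the object covers the pixel
    out = frames.copy()
    _, H, W = frames.shape
    for y in range(H):
        for x in range(W):
            visible = frames[~masks[:, y, x], y, x]
            if visible.size:                   # background observed at least once
                out[masks[:, y, x], y, x] = np.median(visible)
    return out

bg = np.arange(25, dtype=float).reshape(5, 5)          # static background
frames = np.stack([bg.copy() for _ in range(3)])
masks = np.zeros((3, 5, 5), dtype=bool)
for t in range(3):                                     # a 2x2 "object" slides across
    frames[t, 1:3, t:t + 2] = 255.0
    masks[t, 1:3, t:t + 2] = True
restored = fill_static_background(frames, masks)
assert np.allclose(restored, np.stack([bg] * 3))       # object fully removed
```

Because every background pixel is unoccluded in at least one frame, the temporal median recovers the background exactly; a moving camera would first require the flow-based alignment the dissertation describes.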
APA, Harvard, Vancouver, ISO, and other styles
10

Dufour, Sophie-Isabelle. "Imaginem video : L'image vidéo dans l'"histoire longue" des images." Paris 3, 2004. http://www.theses.fr/2004PA030054.

Full text
Abstract:
Que vois-je devant la vidéo ? Je ne vois pas une vidéo, mais de l'image. La présente étude propose d'interroger le statut de l'image vidéo dans l'"histoire longue" des images. Il s'agit de déployer des problèmes multiséculaires qui ont surgi bien avant l'invention technique du médium considéré. Notre présupposé est qu'il y a une différence entre l'image, en tant que notion, et les images : l'on pourrait dire que l'image, difficile à définir, ne peut être appréhendée que dans les différents médiums qui l'actualisent. Aussi la vidéo est-elle traitée, ici, en tant qu'elle questionne l'image elle-même. Priorité sera donnée aux œuvres d'art, considérées comme plus révélatrices du statut de l'image vidéo. Mais par-delà l'esthétique, les pouvoirs de l'image dépasseront ceux de l'art. Si la première question qu'affronte la présente étude est celle de l'amour de l'image, posée exemplairement par le mythe de Narcisse, c'est que celle-ci fait ensuite surgir d'autres questions fondamentales. C'est ainsi que la notion de fluidité se posera comme le fil conducteur de notre réflexion sur la fantomalité de l'image vidéo et sur sa spatialité. Notre étude des rapports entre l'image vidéo et le temps sera, quant à elle, orientée par la notion de flux, celle de Bergson en particulier. Il s'agira en fin de compte de penser l'image vidéo dans toute sa singularité.
What do I see when I look at a video? Actually, I do not see a video but an image. My purpose is to study the status of the video image from the point of view of the so-called "long history" of images, dealing therefore with very ancient problems that occurred long before the technical invention of the medium. A distinction must be made between images and the very notion of image: one could say that the difficult notion of image can be specified only through the various media in which it embodies itself. In this study, video questions image itself. Artworks will keep their privilege, because through them the status of the video image is best revealed; but my intention is to show that the powers of the image go far beyond aesthetics. The first problem will be the one raised by the myth of Narcissus, as a lover of image(s), because it is seminal. It leads, for instance, to the notion of fluidity, which will prove essential in my study of the "ghostliness" of the video image (as well as in my study of space in video). Last but not least, the relations between time and the video image should be specified with Bergson's help, and I shall try to show how useful this philosopher's notion of time can be when one hopes to understand the singularity of the video image.
APA, Harvard, Vancouver, ISO, and other styles
11

Hammouri, Ghassan. "Video++, an object-oriented approach to video algebra." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1997. http://www.collectionscanada.ca/obj/s4/f2/dsk2/ftp04/mq26329.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
12

Pu, Ruonan. "Target-sensitive video segmentation for seamless video composition /." View abstract or full-text, 2007. http://library.ust.hk/cgi/db/thesis.pl?CSED%202007%20PU.

Full text
APA, Harvard, Vancouver, ISO, and other styles
13

Bhat, Abharana Ramdas. "A new video quality metric for compressed video." Thesis, Robert Gordon University, 2012. http://hdl.handle.net/10059/794.

Full text
Abstract:
Video compression enables multimedia applications such as mobile video messaging and streaming, video conferencing and, more recently, online social video interactions. Since most multimedia applications are meant for the human observer, measuring perceived video quality during the design and testing of these applications is important. The performance of existing perceptual video quality measurement techniques is limited by poor correlation with subjective quality and by implementation complexity. Therefore, this thesis presents new techniques for measuring the perceived quality of compressed multimedia video using computationally simple and efficient algorithms. A new full-reference perceptual video quality metric, called the MOSp metric, is developed for measuring the subjective quality of multimedia video sequences compressed using block-based video coding algorithms. The metric predicts the subjective quality of compressed video from the mean squared error between the original and compressed sequences, together with the video content. Factors which influence the visibility of compression-induced distortion, such as spatial texture masking, temporal masking and cognition, are considered when quantifying video content. The MOSp metric is simple to implement and can be integrated into block-based video coding algorithms for real-time quality estimation. Performance results presented for a variety of multimedia content compressed to a large range of bitrates show that the metric correlates highly with subjective quality and performs better than popular video quality metrics. As an application of the MOSp metric to perceptual video coding, a new MOSp-based mode selection algorithm for an H.264/AVC video encoder is developed.
Results show that, by integrating the MOSp metric into the mode selection process, it is possible to make coding decisions based on estimated visual quality rather than mathematical error measures, and to achieve visual quality gains in content that the MOSp metric identifies as visually important. The novel algorithms developed in this research are particularly useful for integration into block-based video encoders such as H.264/AVC, for making real-time visual quality estimates and coding decisions based on estimated visual quality rather than the currently used mathematical error measures.
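The abstract describes predicting subjective quality from MSE plus content-dependent masking. The toy score below is not the MOSp metric itself, only a hedged illustration of the general shape of such a model (all names and constants are invented): squared error is attenuated by a crude texture term, frame variance, so the same MSE hurts a flat frame more than a textured one.

```python
import numpy as np

def toy_perceptual_score(ref, dist, k=0.5):
    """Toy content-weighted quality score in (0, 1]; 1 means identical.
    The MSE is divided by a crude texture-masking term before being
    mapped through 1 / (1 + k * masked_mse)."""
    ref = ref.astype(float)
    dist = dist.astype(float)
    mse = np.mean((ref - dist) ** 2)
    masking = 1.0 + ref.var() / 255.0      # textured content hides distortion
    return 1.0 / (1.0 + k * mse / masking)

flat = np.full((8, 8), 128.0)
textured = np.tile([0.0, 255.0], (8, 4))   # high-variance checker rows
assert toy_perceptual_score(flat, flat) == 1.0
# identical MSE, but the textured frame masks it better:
assert toy_perceptual_score(textured, textured + 10) > toy_perceptual_score(flat, flat + 10)
```

The real metric additionally models temporal masking and cognition; this sketch only shows why a content term must sit alongside the MSE.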
APA, Harvard, Vancouver, ISO, and other styles
14

Tsoi, Yau Chat. "Video cosmetics : digital removal of blemishes from video /." View Abstract or Full-Text, 2003. http://library.ust.hk/cgi/db/thesis.pl?COMP%202003%20TSOI.

Full text
Abstract:
Thesis (M. Phil.)--Hong Kong University of Science and Technology, 2003.
Includes bibliographical references (leaves 83-86). Also available in electronic version. Access restricted to campus users.
APA, Harvard, Vancouver, ISO, and other styles
15

Banda, Dalitso Hansini. "Deep video-to-video transformations for accessibility applications." Thesis, Massachusetts Institute of Technology, 2018. https://hdl.handle.net/1721.1/121622.

Full text
Abstract:
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2018
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 73-79).
We develop a class of visual assistive technologies that can learn visual transforms to improve accessibility as an alternative to traditional methods that mostly rely on extracted symbolic information. In this thesis, we mainly focus on how we can apply this class of systems to address photosensitivity. People with photosensitivity may have seizures, migraines or other adverse reactions to certain visual stimuli such as flashing images and alternating patterns. We develop deep learning models that learn to identify and transform video sequences containing such stimuli whilst preserving video quality and content. Using descriptions of the adverse visual stimuli, we train models to learn transforms to remove such stimuli. We show that these deep learning models are able to generalize to real-world examples of images with these problematic stimuli. From our experimental trials, human subjects rated video sequences transformed by our models as having significantly less problematic stimuli than their input. We extend these ideas; we show how these deep transformation networks can be applied in other visual assistive domains through demonstration of an application addressing the problem of emotion recognition in those with the Autism Spectrum Disorder.
by Dalitso Hansini Banda.
M. Eng.
M.Eng. Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science
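The thesis above learns its transforms with deep networks; as a crude, non-learned stand-in, a temporal moving average already illustrates the goal of damping rapid luminance flashes (the function name and window size are illustrative, not from the thesis).

```python
import numpy as np

def suppress_flashes(frames, window=3):
    """Replace each frame with a temporal moving average, damping the
    rapid luminance alternations that can trigger photosensitive reactions."""
    f = np.asarray(frames, dtype=float)    # shape (T, H, W)
    out = np.empty_like(f)
    for t in range(len(f)):
        lo = max(0, t - window // 2)
        hi = min(len(f), t + window // 2 + 1)
        out[t] = f[lo:hi].mean(axis=0)     # average over the temporal window
    return out

# a hard 0/255 flicker at full frame rate
flashing = np.array([0.0, 255.0] * 4).reshape(-1, 1, 1) * np.ones((8, 4, 4))
calmed = suppress_flashes(flashing)
# the frame-to-frame luminance jump is strictly reduced
assert np.abs(np.diff(calmed, axis=0)).max() < np.abs(np.diff(flashing, axis=0)).max()
```

A learned model can remove the stimulus while preserving content; this filter trades sharpness for safety, which is exactly the gap the thesis's networks address.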
APA, Harvard, Vancouver, ISO, and other styles
16

Napolitano, Pasquale. "Video-Design: progettare lo spazio con il video." Doctoral thesis, Università degli Studi di Salerno, 2010. http://hdl.handle.net/10556/123.

Full text
Abstract:
2008 - 2009
The research project, of which this text represents a first, partial milestone, consisted mainly in identifying and describing how video design, even before being a corollary of technical and design skills, constitutes a particular disposition towards the contemporary visualscape: a practice of the gaze able to follow the hybrid threads interwoven in every audiovisual object. The general objective of the research was to use speculation on video design to outline a series of symbolic forms that shape a peculiar type of gaze. The vision shaped by cinema adheres to the canons traditionally attributed to Renaissance perspective, with its vectorial conception of the gaze, in accordance with the theory of Erwin Panofsky, who sees in linear perspective the symbolic form of the modern era. The type of gaze proposed by video objects, by contrast, is no longer vectorial but synthetic: a gaze that does not reduce to synthesis, but remains paratactic. The analysis also seeks to bring out the range of symbolic forms underlying the video form, through a historical excursus (in the first chapter) and targeted forays into the contemporary, especially those audiovisual forms that cannot yet be precisely catalogued but present themselves as hybrids between video and space (the chapters on motion pictures, sensitive-interactive environments, sound sculptures and live media).
VIII ciclo n.s.
APA, Harvard, Vancouver, ISO, and other styles
17

Chen, Juan. "Content-based Digital Video Processing. Digital Videos Segmentation, Retrieval and Interpretation." Thesis, University of Bradford, 2009. http://hdl.handle.net/10454/4256.

Full text
Abstract:
Recent research approaches in semantics-based video content analysis require shot boundary detection as the first step to divide video sequences into sections. Furthermore, with the advances in networking and computing capability, efficient retrieval of multimedia data has become an important issue. Content-based retrieval technologies have been widely implemented to protect intellectual property rights (IPR). In addition, automatic recognition of highlights from videos is a fundamental and challenging problem for content-based indexing and retrieval applications. In this thesis, a paradigm is proposed to segment, retrieve and interpret digital videos. Five algorithms are presented to solve the video segmentation task. Firstly, a simple shot cut detection algorithm is designed for real-time implementation. Secondly, a systematic method is proposed for shot detection using content-based rules and an FSM (finite state machine). Thirdly, shot detection is implemented using local and global indicators. Fourthly, a context-awareness approach is proposed to detect shot boundaries. Fifthly, a fuzzy logic method is implemented for shot detection. Furthermore, a novel analysis approach is presented for the detection of video copies. It is robust to complicated distortions and capable of locating copied segments inside original videos. Then, objects and events are extracted from MPEG sequences for video highlights indexing and retrieval. Finally, a human fighting detection algorithm is proposed for movie annotation.
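The simplest member of the shot-detection family surveyed above can be illustrated in a few lines: frame-to-frame histogram differencing with a fixed threshold. The bin count and threshold below are arbitrary illustrative choices, not values from the thesis.

```python
import numpy as np

def shot_cuts(frames, bins=16, threshold=0.5):
    """Flag a cut at every frame whose normalized histogram distance
    from the previous frame exceeds a threshold."""
    cuts = []
    prev_hist = None
    for i, f in enumerate(frames):
        hist, _ = np.histogram(f, bins=bins, range=(0, 256))
        hist = hist / hist.sum()                        # normalize to a distribution
        if prev_hist is not None:
            d = 0.5 * np.abs(hist - prev_hist).sum()    # distance in [0, 1]
            if d > threshold:
                cuts.append(i)
        prev_hist = hist
    return cuts

rng = np.random.default_rng(0)
dark = [rng.integers(0, 60, (32, 32)) for _ in range(3)]       # shot 1
bright = [rng.integers(180, 256, (32, 32)) for _ in range(3)]  # shot 2
assert shot_cuts(dark + bright) == [3]    # single cut, at the shot boundary
assert shot_cuts(dark) == []              # no cut within a shot
```

Global histograms miss gradual transitions and are fooled by flashes, which is precisely why the thesis layers FSM rules, local/global indicators, context awareness and fuzzy logic on top of this basic idea.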
APA, Harvard, Vancouver, ISO, and other styles
18

Krist, Antonín. "Pokročilé metody postprodukce a distribuce videa s využitím IT." Master's thesis, Vysoká škola ekonomická v Praze, 2010. http://www.nusl.cz/ntk/nusl-19121.

Full text
Abstract:
This thesis deals with advanced methods of digital video post-production and distribution using broadcasting technologies and the internet protocol. It describes and compares distribution methods using information technology and discusses current problems. It describes digitization methods and methods that can save bandwidth during distribution, deals with a possible practical implementation of the distribution of three-dimensional video in upcoming standards, and analyzes the possibilities of their future development. It also discusses the overall problems of transmission standardization and advanced video coding. In conclusion, based on a comparison of methods and the author's practical experience, the thesis recommends certain procedures for adoption into the standard and indicates the direction of the technological solutions.
APA, Harvard, Vancouver, ISO, and other styles
19

Bordes, Philippe. "Adapting video compression to new formats." Thesis, Rennes 1, 2016. http://www.theses.fr/2016REN1S003/document.

Full text
Abstract:
Les nouvelles techniques de compression vidéo doivent intégrer un haut niveau d'adaptabilité, à la fois en termes de bande passante réseau, de scalabilité des formats (taille d'images, espace de couleur…) et de compatibilité avec l'existant. Dans ce contexte, cette thèse regroupe des études menées en lien avec le standard HEVC. Dans une première partie, plusieurs adaptations qui exploitent les propriétés du signal et qui sont mises en place lors de la création du bit-stream sont explorées. L'étude d'un nouveau partitionnement des images pour mieux s'ajuster aux frontières réelles du mouvement permet des gains significatifs. Ce principe est étendu à la modélisation long-terme du mouvement à l'aide de trajectoires. Nous montrons que l'on peut aussi exploiter la corrélation inter-composantes des images et compenser les variations de luminance inter-images pour augmenter l'efficacité de la compression. Dans une seconde partie, des adaptations réalisées sur des flux vidéo compressés existants et qui s'appuient sur des propriétés de flexibilité intrinsèque de certains bit-streams sont investiguées. En particulier, un nouveau type de codage scalable qui supporte des espaces de couleur différents est proposé. De ces travaux, nous dérivons des métadonnées et un modèle associé pour opérer un remapping couleur générique des images. Le stream-switching est aussi exploré comme une application particulière du codage scalable. Plusieurs de ces techniques ont été proposées à MPEG. Certaines ont été adoptées dans le standard HEVC et aussi dans la nouvelle norme UHD Blu-ray Disc. Nous avons investigué des méthodes variées pour adapter le codage de la vidéo aux différentes conditions de distribution et aux spécificités de certains contenus. Suivant les scénarios, on peut sélectionner et combiner plusieurs d'entre elles pour répondre au mieux aux besoins des applications.
New video codecs should be designed with a high level of adaptability in terms of network bandwidth, format scalability (size, color space…) and backward compatibility. This thesis was carried out in this context and within the scope of the HEVC standard development. In the first part, several video coding adaptations that exploit the signal properties and take place at bit-stream creation are explored. The study of improved frame partitioning for inter prediction allows a better fit to the actual motion boundaries and shows significant gains. This principle is further extended to long-term motion modeling with trajectories. We also show how the cross-component correlation statistics and the luminance change between pictures can be exploited to increase coding efficiency. In the second part, post-creation stream adaptations relying on intrinsic stream flexibility are investigated. In particular, a new color gamut scalability scheme addressing color space adaptation is proposed. From this work, we derive color remapping metadata and an associated model to provide a low-complexity, general-purpose color remapping feature. We also explore adaptive resolution coding and how to extend a scalable codec to stream-switching applications. Several of the described techniques have been proposed to MPEG. Some of them have been adopted in the HEVC standard and in the UHD Blu-ray Disc format. Various techniques for adapting video compression to the content characteristics and to the distribution use cases have been considered; depending on the application requirements, they can be selected or combined.
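Color remapping metadata of the kind mentioned above generally takes the shape of per-component lookup tables combined with a cross-component matrix. The sketch below applies such a mapping; it illustrates that general structure only and does not follow the actual HEVC metadata syntax (function and variable names are invented).

```python
import numpy as np

def apply_color_remap(frame, luts, matrix):
    """Apply a simplified color remap: a 1-D LUT per component, then a
    3x3 cross-component matrix, then clipping back to the 8-bit range."""
    h, w, _ = frame.shape
    # per-component lookup: each channel value indexes its own table
    mapped = np.stack([luts[c][frame[..., c]] for c in range(3)], axis=-1)
    out = mapped.reshape(-1, 3) @ matrix.T          # cross-component mix
    return np.clip(out, 0, 255).reshape(h, w, 3)

identity_lut = np.arange(256, dtype=float)
luts = [identity_lut, identity_lut, identity_lut]
frame = np.random.default_rng(2).integers(0, 256, (4, 4, 3))
out = apply_color_remap(frame, luts, np.eye(3))
assert np.array_equal(out, frame)                   # identity remap is a no-op
```

A real deployment signals the tables and matrix alongside the bit-stream so a legacy display can map, say, a wide-gamut picture into its own color space at negligible cost.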
APA, Harvard, Vancouver, ISO, and other styles
20

Yu, Jin Nah. "Video dithering." Thesis, Texas A&M University, 2004. http://hdl.handle.net/1969.1/505.

Full text
Abstract:
In this work, we present mathematical and artistic techniques for the easy creation of artistic screening animations at video resolution, extending the artistic screening technique of adapting various patterns as screen dots for generating halftones. Video dithering requires three different animations: one for the screen dots, a simple black-and-white animation; another for the goal (or perceived) animation on the screen; and a third for controlling the color and size of the screen dots. By combining the three animations with video dithering techniques, two animations appear simultaneously on the resulting video screen, producing complex and unique animation. Our techniques ensure the creation of aesthetically pleasing movies by providing frame-to-frame coherence and avoiding the spatial and temporal aliasing that can be caused by low-quality video images. We show how powerful and effective this technique is for creating artistic results by demonstrating a variety of video dithering examples.
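The core operation of artistic screening can be illustrated for a single frame: a grayscale goal image is thresholded against a tiled screen-dot pattern. This is a hedged sketch only; the tile here is a tiny ordered-dither matrix rather than an artistic screen-dot animation, and all names are invented.

```python
import numpy as np

def screen_halftone(image, dot_tile):
    """Binary halftone: tile the screen-dot pattern over the image and
    turn a pixel white wherever the image is brighter than the tile."""
    h, w = image.shape
    ty, tx = dot_tile.shape
    reps = ((h + ty - 1) // ty, (w + tx - 1) // tx)
    screen = np.tile(dot_tile, reps)[:h, :w]   # repeat tile, crop to image
    return (image > screen).astype(np.uint8)   # 1 = white, 0 = black

gradient = np.tile(np.linspace(0, 255, 64), (64, 1))      # goal image
bayer2 = np.array([[0, 128], [192, 64]], dtype=float)     # 2x2 threshold tile
out = screen_halftone(gradient, bayer2)
assert out.shape == (64, 64)
assert 0 < out.mean() < 1    # mixture of black and white, tracking brightness
```

In the paper's setting the threshold tile is itself a frame of a black-and-white screen-dot animation, so both the goal animation and the dot animation remain visible in the halftoned result.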
APA, Harvard, Vancouver, ISO, and other styles
21

Waldemarsson, Lars-Åke. "Holografisk Video." Thesis, Linköping University, Department of Electrical Engineering, 2006. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-7049.

Full text
Abstract:

This thesis is based on an article describing a method for creating holographic video. The aim of the work is to recreate this method, which builds on projecting holograms using parts from a projector, a laser and a few lenses.

First, a literature study is conducted to understand how the method works. It covers how the eye perceives depth and what types of displays exist for rendering three-dimensional holographic images. The difference between optical and computer-generated holography is then described; this work deals only with computer-generated holography.

Diffraction (the bending of light rays) and interference between light rays form the basis of the method for creating holographic images. In optical holography, light rays from an object and a reference beam are made to interfere with each other, and their interference pattern is captured on photographic film. A hologram of the object can then be reconstructed by illuminating the photographic film with the same reference beam.

To render three-dimensional holographic images, an SLM ("Spatial Light Modulator") is needed. The SLM used here is Texas Instruments' DLP ("Digital Light Processing"), found in DLP projectors, whose main component is a DMD ("Digital Micromirror Device"). A DMD is a computer chip consisting of microscopically small mirrors in a grid pattern. In a projector the DMD is illuminated by a lamp, and here by a laser. Each micromirror can be tilted towards or away from the light source, thereby passing on its small bundle of light or not.

Computer-generated holography simulates optical holography through a Fourier transform. The transform takes as input a numerical description of an object and outputs an interference pattern that is fed into the DLP. The light rays incident on the DMD act according to the interference pattern and render a hologram; compare the photographic film in optical holography.

The second part of the thesis covers my recreation of the method. Matlab was chosen to implement the transform. The input to the program is two two-dimensional images, which are placed in a volume at a mutual distance from each other along the z-axis. This volume is the object for which a hologram is to be created. The program outputs a two-dimensional image constituting the interference pattern for the object.

Great emphasis was placed on optimizing this program by exploiting Matlab's strength in matrix operations and by simplifying the computation for the points of the hologram that are transparent, i.e. the points that do not belong to the object.

The results section presents the interference pattern for a given object. One conclusion is that computing the transform for normal-sized or larger objects is a very time-consuming process; great computing power and better optimization are required to reach acceptable computation times. Only interference patterns for single objects are computed here, while holographic video requires around 24 frames per second. It is entirely possible to create holographic video with the presented program, but the computation would take far too long.

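A toy computer-generated hologram along the lines described above can be sketched with an FFT: the far-field diffraction of an object plane is its Fourier transform, and recording the intensity of its interference with a tilted reference wave yields a real-valued pattern that a binary SLM could display. This is purely illustrative and is not the thesis's Matlab program; all names and constants are invented.

```python
import numpy as np

def cgh_pattern(object_plane):
    """Toy far-field computer-generated hologram: Fourier-transform the
    object field and record its interference intensity with a tilted
    plane reference wave (the CGH analogue of photographic film)."""
    field = np.fft.fftshift(np.fft.fft2(object_plane))
    h, w = field.shape
    yy, xx = np.mgrid[0:h, 0:w]
    reference = np.exp(2j * np.pi * xx / 4)        # tilted reference beam
    intensity = np.abs(field + reference) ** 2     # what the SLM would show
    return intensity / intensity.max()             # normalize to [0, 1]

obj = np.zeros((32, 32))
obj[12:20, 12:20] = 1.0                            # a square aperture as the object
pattern = cgh_pattern(obj)
assert pattern.shape == (32, 32)
assert pattern.max() == 1.0 and pattern.min() >= 0.0
```

The thesis additionally stacks two image planes at different z depths, which in this framing would mean propagating each plane (e.g. with a Fresnel kernel) before summing the fields.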
APA, Harvard, Vancouver, ISO, and other styles
22

Parnow, Klaus. "Arbeitsgruppe Video." Universität Potsdam, 1999. http://opus.kobv.de/ubp/volltexte/2005/304/.

Full text
APA, Harvard, Vancouver, ISO, and other styles
23

Yilmaz, Fatih Levent. "Video Encryption." Thesis, Linnéuniversitetet, Institutionen för datavetenskap, fysik och matematik, DFM, 2011. http://urn.kb.se/resolve?urn=urn:nbn:se:lnu:diva-12604.

Full text
Abstract:
Video encryption is one of the best methods for blocking unwanted interception and viewing of any transmitted video or information. Several useful techniques are available for encrypting video. However, the human eye is uniquely good at spotting irregularities in video caused by weak video decoding or a poor choice of video encryption hardware. For this reason it is very important to select the right hardware, or else the video transmission may not be secure, or the decoded video may be unwatchable. Every technique has advantages and disadvantages compared with the other methods.   The line-cut-and-rotate video encryption method is perhaps the best way of obtaining safe, secure, good-quality encrypted video. In this method, every line in the video frame is cut and rotated at a different point, and these cut points are generated from a random matrix. The advantages of this method are that it supplies a coherent video signal, gives an excellent degree of obscurity, as well as good decode quality and stability. Its disadvantages are complex timing control and the need for specialized encryption equipment.
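The line-cut-and-rotate idea can be sketched in a few lines (the function name, key handling and offset source below are illustrative, not taken from any real encryption product): each scan line is circularly rotated by an offset drawn from a keyed pseudo-random sequence, and decryption rotates back.

```python
import numpy as np

def line_cut_rotate(frame, key, decrypt=False):
    """Rotate each row of a 2-D frame by a keyed pseudo-random offset.
    Encryption and decryption differ only in the sign of the rotation."""
    rng = np.random.default_rng(key)                    # keyed cut-point sequence
    offsets = rng.integers(0, frame.shape[1], size=frame.shape[0])
    sign = -1 if decrypt else 1
    out = np.empty_like(frame)
    for i, off in enumerate(offsets):
        out[i] = np.roll(frame[i], sign * int(off))     # circular "cut and rotate"
    return out

frame = np.arange(20).reshape(4, 5)
scrambled = line_cut_rotate(frame, key=42)
restored = line_cut_rotate(scrambled, key=42, decrypt=True)
assert np.array_equal(restored, frame)                  # exact round trip
```

Note the properties the abstract highlights: every row remains a valid video line (a coherent signal), and a receiver without the key sees each line displaced by an unknown amount.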
APA, Harvard, Vancouver, ISO, and other styles
24

Daniel, G. W. "Video visualisation." Thesis, Swansea University, 2004. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.636344.

Full text
Abstract:
The main contributions of this work can be summarised as:
• Presenting a collection of hypotheses that form the backbone of, and underpin the motivation for, work conducted within the field of video visualisation.
• Presenting a prototype system to demonstrate the technical feasibility of video visualisation within a surveillance context, along with detailing its generic pipeline.
• Providing an investigation into video visualisation, offering a general solution by utilising volume visualisation techniques such as spatial and opacity transfer functions.
• Providing the first set of evidence to support some of the presented hypotheses.
• Demonstrating both stream- and hardware-based rendering in the context of video visualisation.
• Incorporating and evaluating a collection of change detection (CD) metrics, concerning their ability to produce effective video visualisations.
• Presenting a novel investigation into interaction control protocols within multi-user and multi-camera environments.
Video datasets are a type of volume dataset and are treated as such, allowing ray-traced rendering and advanced volume modelling techniques to be applied to the video. It is shown how the interweaving of image processing and volume visualisation techniques can be used to create effective visualisations that aid the human visual system in the interpretation of video-based content and features. Through the application of CD methodologies, it is shown how feature volumes are created and rendered to show temporal variations within a period.
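The change-detection feature volumes described above can be illustrated minimally: stacking per-pixel absolute frame differences yields a volume whose above-threshold voxels mark temporal change, i.e. the quantity a volume renderer would map to opacity. A hedged sketch (threshold, shapes and the function name are arbitrary):

```python
import numpy as np

def change_volume(frames, threshold=20):
    """Stack per-pixel absolute frame differences into a (T-1, H, W)
    feature volume; voxels above the threshold mark temporal change."""
    v = np.abs(np.diff(np.asarray(frames, dtype=float), axis=0))
    return v > threshold

frames = np.zeros((4, 8, 8))
frames[2, 3:5, 3:5] = 200      # a 2x2 object appears in frame 2, gone in frame 3
vol = change_volume(frames)
assert vol.shape == (3, 8, 8)
assert vol[0].sum() == 0       # nothing changed between frames 0 and 1
assert vol[1].sum() == 4       # appearance marked between frames 1 and 2
assert vol[2].sum() == 4       # disappearance marked between frames 2 and 3
```

Rendering this boolean (or graded) volume with an opacity transfer function is what lets a whole surveillance period be inspected in a single image.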
APA, Harvard, Vancouver, ISO, and other styles
25

Sasnett, Russ. "Reconfigurable video." Thesis, Massachusetts Institute of Technology, 1985. http://hdl.handle.net/1721.1/15100.

Full text
Abstract:
Thesis (M.S.V.S.)--Massachusetts Institute of Technology, Dept. of Architecture, 1986.
MICROFICHE COPY AVAILABLE IN ARCHIVES AND ROTCH
Bibliography: leaves 105-107.
by Russell Mayo Sasnett.
M.S.V.S.
APA, Harvard, Vancouver, ISO, and other styles
26

Lee, Ying 1979. "Scalable video." Thesis, Massachusetts Institute of Technology, 2000. http://hdl.handle.net/1721.1/9071.

Full text
Abstract:
Thesis (M.Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2000.
Includes bibliographical references (p. 51).
This thesis presents the design and implementation of a scalable video scheme that accommodates the uncertainties of networks and the differences in receivers' display mechanisms. To achieve scalability, a video stream is encoded into two kinds of layers, namely the base layer and the enhancement layer. The decoder must process the base layer in order to display minimally acceptable video quality. For higher quality, the decoder simply combines the base layer with one or more enhancement layers. Combined with the IP multicast system, the result is a highly flexible and extensible structure that facilitates video viewing on a wide variety of devices, yet customizes the presentation for each individual receiver.
by Ying Lee.
M.Eng.
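The base/enhancement layering described above can be illustrated with a toy scalar quantizer (purely illustrative, not the thesis's actual codec): the base layer is a coarse quantization of the frame and the enhancement layer is the residual, so a decoder adds layers for higher fidelity.

```python
import numpy as np

def encode_layers(frame, step=32):
    """Split a frame into a coarsely quantized base layer and a residual
    enhancement layer; decoders sum the layers they receive."""
    base = (frame // step) * step + step // 2   # coarse reconstruction level
    enhancement = frame - base                  # residual refinement
    return base, enhancement

frame = np.arange(0, 256, 8, dtype=np.int32).reshape(4, 8)
base, enh = encode_layers(frame)
assert np.array_equal(base + enh, frame)        # base + enhancement: exact
assert np.abs(base - frame).max() <= 16         # base alone: coarse but bounded error
```

Under IP multicast, each layer goes to its own group: a constrained receiver joins only the base-layer group, while a capable one also joins the enhancement group and sums the two.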
APA, Harvard, Vancouver, ISO, and other styles
27

Jovičic, Zoran. "Video - film." Master's thesis, Vysoké učení technické v Brně. Fakulta výtvarných umění, 2009. http://www.nusl.cz/ntk/nusl-232206.

Full text
APA, Harvard, Vancouver, ISO, and other styles
28

Jirka, Roman. "Časosběrné video." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2011. http://www.nusl.cz/ntk/nusl-236934.

Full text
Abstract:
This thesis is an introduction to time-lapse video creation. It focuses on cases where a tripod is not used and it is therefore necessary to eliminate the resulting shortcomings, the main ones being the differing positions of individual frames and differences in brightness and color. It also describes which principles should be followed during the creation process. The thesis describes and implements methods for eliminating these main shortcomings when processing long time-lapse videos recorded by hand, covering image registration and brightness and color correction in detail, and also considers histogram comparison. The result of this work is an application which eliminates the problems described above.
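One of the corrections mentioned above, removing brightness flicker across hand-held time-lapse frames, can be sketched by normalizing each frame's global mean and contrast to those of the first frame. This is a crude stand-in for the thesis's brightness and color correction; all names are invented.

```python
import numpy as np

def deflicker(frames):
    """Normalize each frame's global mean and contrast to the first frame,
    removing exposure drift between hand-held time-lapse shots."""
    frames = [np.asarray(f, dtype=float) for f in frames]
    ref_mean, ref_std = frames[0].mean(), frames[0].std()
    out = []
    for f in frames:
        g = (f - f.mean()) / (f.std() + 1e-9) * ref_std + ref_mean
        out.append(np.clip(g, 0, 255))
    return out

rng = np.random.default_rng(1)
scene = rng.integers(0, 200, (16, 16)).astype(float)
flickery = [scene, scene * 1.3 + 10, scene * 0.8 - 5]   # exposure drift
fixed = deflicker(flickery)
# affine exposure changes are undone exactly
assert np.allclose(fixed[1], fixed[0]) and np.allclose(fixed[2], fixed[0])
```

Real footage also needs the geometric registration step first, since mean/std matching assumes the frames show the same scene content.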
APA, Harvard, Vancouver, ISO, and other styles
29

Richtr, Pavel. "Video syntezátor." Master's thesis, Vysoké učení technické v Brně. Fakulta výtvarných umění, 2016. http://www.nusl.cz/ntk/nusl-240574.

Full text
Abstract:
Generating a video signal with the ATtiny85, and authoring software for the ATARI 2600 video game console, on the theme of UAV attacks and their media image: a reinterpretation using "low-res" generated video.
APA, Harvard, Vancouver, ISO, and other styles
30

Arrieta, Concha José Luis, and Huamán Glendha Falconí. "Video Wall." Bachelor's thesis, Universidad Peruana de Ciencias Aplicadas (UPC), 2013. http://hdl.handle.net/10757/273539.

Full text
APA, Harvard, Vancouver, ISO, and other styles
31

Horyna, Miroslav. "Video telefon." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2015. http://www.nusl.cz/ntk/nusl-221059.

Full text
Abstract:
This thesis deals with a door video phone built on the Raspberry Pi platform. It describes the Raspberry Pi platform, the Raspberry Pi camera module, and operating systems for the Raspberry Pi, and covers software installation and configuration. It then describes the concept and the programs created for the door video phone, and the design of additional modules.
APA, Harvard, Vancouver, ISO, and other styles
32

Wang, Yi. "Design and Evaluation of Contextualized Video Interfaces." Diss., Virginia Tech, 2010. http://hdl.handle.net/10919/28798.

Full text
Abstract:
Videos have been increasingly used in multiple applications, including surveillance, teleconferencing, learning and experience sharing. Since a video captures a scene from a particular viewpoint, it can often be understood better if presented within a larger spatial context. We call such interactive visualizations that combine videos with their spatial context "Contextualized Videos". Over recent years, multiple innovative Contextualized Video interfaces have been proposed to take advantage of the latest computer graphics and video processing technologies. These interfaces opened a huge design space with numerous design possibilities, each with its own benefits and limitations. To avoid piecemeal understanding of the design space, this dissertation systematically designs and evaluates Contextualized Video interfaces based on a taxonomy of tasks that can potentially benefit from Contextualized Videos. This dissertation first formalizes a design space. New designs are created incrementally along the four major dimensions of the design space. These designs are then empirically compared through a series of controlled experiments using multiple tasks. The tasks are carefully selected from a task taxonomy, which helps to avoid piecemeal understanding of the effect of the designs. Our design practices and empirical evaluations result in a set of design guidelines on how to choose proper designs according to the characteristics of the tasks and the users. Finally, we demonstrate how to apply the design guidelines to prototype a complex interface for a specific video surveillance application.
Ph. D.
APA, Harvard, Vancouver, ISO, and other styles
33

But, Jason. "A novel MPEG-1 partial encryption scheme for the purposes of streaming video." Monash University, Dept. of Electrical and Computer Systems Engineering, 2004. http://arrow.monash.edu.au/hdl/1959.1/9709.

Full text
APA, Harvard, Vancouver, ISO, and other styles
34

Lopes, Jose E. F. C. "Audio-coupled video content understanding of unconstrained video sequences." Thesis, Loughborough University, 2011. https://dspace.lboro.ac.uk/2134/8306.

Full text
Abstract:
Unconstrained video understanding is a difficult task. The main aim of this thesis is to recognise the nature of objects, activities and environment in a given video clip using both audio and video information. Traditionally, audio and video information has not been applied together for solving such a complex task, and for the first time we propose, develop, implement and test a new framework of multi-modal (audio and video) data analysis for context understanding and labelling of unconstrained videos. The framework relies on feature selection techniques and introduces a novel algorithm (PCFS) that is faster than the well-established SFFS algorithm. We use the framework for studying the benefits of combining audio and video information in a number of different problems. We begin by developing two independent content recognition modules. The first one is based on image sequence analysis alone, and uses a range of colour, shape, texture and statistical features from image regions with a trained classifier to recognise the identity of objects, activities and environment present. The second module uses audio information only, and recognises activities and environment. Both of these approaches are preceded by detailed pre-processing to ensure that correct video segments containing both audio and video content are present, and that the developed system can be made robust to changes in camera movement, illumination, random object behaviour etc. For both audio and video analysis, we use a hierarchical approach of multi-stage classification such that difficult classification tasks can be decomposed into simpler and smaller tasks. When combining both modalities, we compare fusion techniques at different levels of integration and propose a novel algorithm that combines advantages of both feature and decision-level fusion. The analysis is evaluated on a large amount of test data comprising unconstrained videos collected for this work.
We finally, propose a decision correction algorithm which shows that further steps towards combining multi-modal classification information effectively with semantic knowledge generates the best possible results.
APA, Harvard, Vancouver, ISO, and other styles
35

Barannik, Vlad, Y. Babenko, S. Shulgin, and M. Parkhomenko. "Video encoding to increase video availability in telecommunication systems." Thesis, Taras Shevchenko National University of Kyiv, 2020. https://openarchive.nure.ua/handle/document/16582.

Full text
Abstract:
The article shows an imbalance between the productivity of modern and prospective information communication technologies and the information intensity of bit streams. It describes how this imbalance can be reduced by increasing the efficiency of information processing technologies, and notes that the JPEG platform is the basic concept for constructing compression technologies. It therefore proposes further development of video processing methods using individual components of the JPEG platform, to improve the integrity of information while ensuring the required level of availability.
APA, Harvard, Vancouver, ISO, and other styles
36

Park, Dong-Jun. "Video event detection framework on large-scale video data." Diss., University of Iowa, 2011. https://ir.uiowa.edu/etd/2754.

Full text
Abstract:
Detection of events and actions in video entails substantial processing of very large, even open-ended, video streams. Video data presents a unique challenge for the information retrieval community because properly representing video events is challenging. We propose a novel approach to analyze temporal aspects of video data. We consider video data as a sequence of images that form a 3-dimensional spatiotemporal structure, and perform multiview orthographic projection to transform the video data into 2-dimensional representations. The projected views allow a unique way to represent video events and capture the temporal aspect of video data. We extract local salient points from 2D projection views and perform a detection-via-similarity approach on a wide range of events against real-world surveillance data. We demonstrate our example-based detection framework is competitive and robust. We also investigate synthetic-example-driven retrieval as a basis for query-by-example.
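The multiview orthographic projection described above can be sketched in a few lines of numpy: treat the clip as a (T, H, W) volume and collapse it along each axis. Averaging is assumed as the projection operator here for illustration; the dissertation's exact operator may differ.

```python
import numpy as np

def orthographic_views(video):
    """Project a (T, H, W) spatio-temporal volume onto its three
    orthographic views by averaging along each axis."""
    video = np.asarray(video, dtype=np.float64)
    top   = video.mean(axis=0)   # H x W: collapses time
    front = video.mean(axis=1)   # T x W: collapses rows
    side  = video.mean(axis=2)   # T x H: collapses columns
    return top, front, side

# A bright point moving left-to-right over 5 frames: at time t it sits
# at row 1, column t of a 4x6 frame.
video = np.zeros((5, 4, 6))
video[np.arange(5), 1, np.arange(5)] = 255
top, front, side = orthographic_views(video)
# `front` now shows a diagonal streak: the temporal trace of the motion.
```

The diagonal streak in the `front` view is exactly the kind of 2D temporal signature from which the dissertation extracts salient points.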
APA, Harvard, Vancouver, ISO, and other styles
37

Nouri, Marwen. "Propagation de Marquages pour le Matting Vidéo." Phd thesis, Université René Descartes - Paris V, 2013. http://tel.archives-ouvertes.fr/tel-00799753.

Full text
Abstract:
This thesis concerns the development of a video manipulation system, more precisely the extraction and composition of video objects. In still-image processing, extraction and unmixing techniques (known as matting) and composition techniques have improved markedly over the last decade, especially with the appearance of semi-automatic methods that exploit user interaction to overcome the semantic gap. This has led to increasingly fast and robust algorithms. For video, the problem remains a very interesting challenge because of the volume of the data, in terms of both complexity and the number of frames in a video. As a result, the task the user must perform to mark an object of interest can be very tedious, or often impossible. The work carried out in this thesis focuses on extending and adapting the distance transform and active contours to propagate markings of video objects. We also propose an improvement of a technique that can be used with these markings for video-object extraction. The first chapter presents the context and the problem addressed. The second chapter surveys existing approaches and video-editing tools on the market, classifying them into two families: editing by pieces or blocks, and editing by video objects. We then give a brief state of the art on segmentation, divided into three parts: classical segmentation, interactive segmentation and image matting. We also detail the extension of image matting to video matting by presenting the main existing approaches.
Chapter 3 presents our first approach to marking propagation in videos, a 2D+T volumetric approach whose strength comes from the CDT (colour distance transform) that we built. Chapter 4 presents the evolution of our approach towards a more robust and efficient marking-propagation process based on active contours. We begin with a short state of the art on active contours, then present our model and its application, and detail the dynamic weight-management mechanism we put in place. In Chapter 5 we discuss the application of our system to video matting and present the improvements we made to the Spectral Matting approach for this purpose.
APA, Harvard, Vancouver, ISO, and other styles
38

Barrios, Núñez Juan Manuel. "Content-based video copy detection." Tesis, Universidad de Chile, 2013. http://www.repositorio.uchile.cl/handle/2250/115521.

Full text
Abstract:
Doctor of Sciences, specialization in Computer Science
The amount and use of video on the Internet has grown exponentially in recent years. Academic research on video topics has been developing for decades, yet the current ubiquity of video presses for new and better algorithms. There are many needs to satisfy and many open problems that require scientific research. In particular, Video Copy Detection (VCD) addresses the need to find videos that are copies of an original document. The detection process compares the content of videos in a manner robust to different audiovisual transformations. This thesis presents a VCD system called P-VCD, which uses novel algorithms and techniques to achieve high effectiveness and efficiency. The thesis is divided into two parts. The first part focuses on the state of the art, reviewing common image-processing and similarity-search techniques, analysing the definition and scope of VCD, and presenting current techniques for solving this problem. The second part details the work carried out and its contributions to the state of the art, analysing each of the tasks that make up the solution, namely: video preprocessing, video segmentation, feature extraction, similarity search, and copy localization. Regarding effectiveness, the ideas of video-quality normalization, multiple description of content, distance combination, and metric versus non-metric distances are developed. As a result, the thesis proposes techniques for the automatic creation of spatio-temporal descriptors from frame descriptors, audio descriptors that can be combined with visual descriptors, automatic weight selection, and a spatio-temporal distance for descriptor combination.
Regarding efficiency, metric-space and pivot-table approaches are developed to accelerate searches. As a result, the thesis proposes an approximate search that uses pivot objects to estimate and discard distances, multimodal searches in large collections, and an index that exploits the similarity between consecutive query objects. The thesis has been evaluated using the MUSCLE-VCD-2007 collection and through participation in the TRECVID 2010 and 2011 evaluations. The performance achieved in these evaluations is satisfactory: on MUSCLE-VCD-2007 it surpasses the best published result for that collection, achieving the maximum possible effectiveness, while on TRECVID it is competitive with other state-of-the-art systems.
APA, Harvard, Vancouver, ISO, and other styles
39

Dye, Brigham R. "Reliability of Pre-Service Teachers Coding of Teaching Videos Using Video-Annotation Tools." BYU ScholarsArchive, 2007. https://scholarsarchive.byu.edu/etd/990.

Full text
Abstract:
Teacher education programs that aspire to helping pre-service teachers develop expertise must help students engage in deliberate practice along dimensions of teaching expertise. However, field teaching experiences often lack the quantity and quality of feedback that is needed to help students engage in meaningful teaching practice. The limited availability of supervising teachers makes it difficult to personally observe and evaluate each student teacher's field teaching performances. Furthermore, when a supervising teacher debriefs such an observation, the supervising teacher and student may struggle to communicate meaningfully about the teaching performance. This is because the student teacher and supervisor often have very different perceptions of the same teaching performance. Video analysis tools show promise for improving the quality of feedback student teachers receive in their teaching performance by providing a common reference for evaluative debriefing and allowing students to generate their own feedback by coding videos of their own teaching. This study investigates the reliability of pre-service teacher coding using a video analysis tool. This study found that students were moderately reliable coders when coding video of an expert teacher (49%-68%). However, when the reliability of student coding of their own teaching videos was audited, students showed a high degree of accuracy (91%). These contrasting findings suggest that coding reliability scores may not be simple indicators of student understanding of the teaching competencies represented by a coding scheme. Instead, reliability scores may also be subject to the influence of extraneous factors. For example, reliability scores in this study were influenced by differences in the technical aspects of how students implemented the coding system. Furthermore, reliability scores were influenced by how coding proficiency was measured. 
Because this study also suggests that students can be taught to improve their coding reliability, further research may improve reliability scores, and make them a more valid reflection of student understanding of teaching competency, by training students about the technical aspects of implementing a coding system.
APA, Harvard, Vancouver, ISO, and other styles
40

Corbillon, Xavier. "Enable the next generation of interactive video streaming." Thesis, Ecole nationale supérieure Mines-Télécom Atlantique Bretagne Pays de la Loire, 2018. http://www.theses.fr/2018IMTA0103/document.

Full text
Abstract:
Les vidéos omnidirectionnelles, également appelées vidéos sphériques ou vidéos360°, sont des vidéos avec des pixels enregistrés dans toutes les directions de l’espace. Un utilisateur qui regarde un tel contenu avec un Casques de Réalité Virtuelle (CRV) peut sélectionner la partie de la vidéo à afficher, usuellement nommée viewport, en bougeant la tête. Pour se sentir totalement immergé à l’intérieur du contenu, l’utilisateur a besoin de voir au moins 90 viewports par seconde en 4K. Avec les technologies de streaming traditionnelles, fournir une telle qualité nécessiterait un débit de plus de100 Mbit s−1, ce qui est bien trop élevé. Dans cette thèse, je présente mes contributions pour rendre possible le streaming de vidéos omnidirectionnelles hautement immersives sur l’Internet. On peut distinguer six contributions : une proposition d’architecture de streaming viewport adaptatif réutilisant une partie des technologies existantes ; une extension de cette architecture pour des vidéos à six degrés de liberté ; deux études théoriques des vidéos à qualité spatiale non-homogène; un logiciel open source de manipulation des vidéos 360°; et un jeu d’enregistrements de déplacements d’utilisateurs regardant des vidéos 360°
Omnidirectional videos, also denoted as spherical videos or 360° videos, are videos with pixels recorded from a given viewpoint in every direction of space. A user watching such an omnidirectional content with a Head Mounted Display (HMD) can select the portion of the video to display, usually denoted as viewport, by moving her head. To feel high immersion inside the content, a user needs to see the viewport with 4K resolution and a 90 Hz frame rate. With traditional streaming technologies, providing such quality would require a data rate of more than 100 Mbit s−1, which is far too high compared to the median Internet access bandwidth. In this dissertation, I present my contributions to enable the streaming of highly immersive omnidirectional videos on the Internet. We can distinguish six contributions: a viewport-adaptive streaming architecture proposal reusing a part of existing technologies; an extension of this architecture for videos with six degrees of freedom; two theoretical studies of videos with non-homogeneous spatial quality; an open-source software for handling 360° videos; and a dataset of recorded users' trajectories while watching 360° videos
APA, Harvard, Vancouver, ISO, and other styles
41

Cain, Julia. "Understanding film and video as tools for change : applying participatory video and video advocacy in South Africa." Thesis, Stellenbosch : Stellenbosch University, 2009. http://hdl.handle.net/10019.1/1431.

Full text
Abstract:
Thesis (DPhil (Drama))--Stellenbosch University, 2009.
The purpose of this study is to examine critically the phenomenon of participatory video and to situate within this the participatory video project that was initiated as part of this study in the informal settlement area of Kayamandi, South Africa. The overall objective of the dissertation is to consider the potential of participatory video within current-day South Africa towards enabling marginalised groups to represent themselves and achieve social change. As will be shown, the term 'participatory video' has been used broadly and applied to many different types of video products and processes. For the preliminary purposes of this dissertation, participatory video is defined as any video (or film) process dedicated to achieving change through which the subject(s) has been an integral part of the planning and/or production, as well as a primary end-user or target audience. The two key elements that distinguish participatory video are thus (1) understanding video (or film) as a tool for social change; and (2) understanding participation by the subject as integral to the video process. An historical analysis thus considers various filmmaking developments that fed into the emergence of participatory video. These include various film practices that used film as a tool for change -- from Soviet agitprop through to the documentary movement of the 1930s, as well as various types of filmmaking in the 1960s that opened up questions of participation. The Fogo process, developed in the late 1960s, marked the start of participatory video and video advocacy and provided guiding principles for the Kayamandi project initiated as part of this dissertation. Practitioners of the Fogo process helped initiate participatory video practice in South Africa when they brought the process to South African anti-apartheid activists in the early 1970s. The Kayamandi Participatory Video Project draws on this background and context in its planned methodology and its implementation.
Out of this, various theoretical issues arising from participatory video practice contextualise a reflection and an analysis of the Kayamandi project. Lastly, this study draws conclusions and recommendations on participatory video practice in South Africa.
APA, Harvard, Vancouver, ISO, and other styles
42

He, Chao. "Advanced wavelet application for video compression and video object tracking." Connect to resource, 2005. http://rave.ohiolink.edu/etdc/view?acc%5Fnum=osu1125659908.

Full text
Abstract:
Thesis (Ph. D.)--Ohio State University, 2005.
Title from first page of PDF file. Document formatted into pages; contains xvii, 158 p.; also includes graphics (some col.). Includes bibliographical references (p. 150-158). Available online via OhioLINK's ETD Center
APA, Harvard, Vancouver, ISO, and other styles
43

Kozica, Ermin. "Paradigms for Real-Time Video Communication and for Video Distribution." Doctoral thesis, KTH, Ljud- och bildbehandling, 2011. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-32203.

Full text
Abstract:
The use of new information technologies has drastically changed the way that we lead our lives. Communication technologies in particular have had a great impact on our day-to-day behavior. For example, it is now common to hear the voice and see the face of our loved ones on other continents, or work with colleagues across the globe on a daily basis. With this change in behavior and the fast adoption of emerging technologies, new challenges in the telecommunications area are arising. This thesis is concerned with two such challenges: real-time video communication and video distribution. The latency constraint in real-time video communication is in essence incompatible with the uncertainty of best-effort networks, such as the Internet. The recent arrival of smart-phones has added another requirement to the application, in terms of the limited computational and battery power. The research community has invested a large amount of effort in developing techniques that allow a mobile sender to outsource video encoding complexity to an unconstrained receiver by means of a feedback channel. We question that approach with respect to real-time applications, arguing that long round-trip-times may render any feedback unusable at best, and costly in practice. We investigate the effect of channel round-trip-times on the popular distributed video coding setup, as well as on the traditional hybrid video coding architecture. Using a simple analytical framework, we propose the use of systems that adapt to the video content and the network in real time. Our results show that substantial improvements in video quality can be achieved when the feedback channel is used correctly. The use of mobile devices has also a significant impact on the application of video distribution. In general, the multitude of devices that can be used to download and view video places new requirements on video distribution systems.
The system must not only be able to scale to a large number of receivers in a bandwidth efficient manner, it must also support a wide range of network capacities and display capabilities. We address this problem by optimizing the set of rates that is used to provide video to receivers with heterogeneous requirements. Our approach is based on a favorable interpretation of the underlying mathematical problem, allowing the use of well-known quantization theoretic concepts. The resulting solution provides the possibility to design video distribution systems that adapt to changes in receiver characteristics online, with minimal delay.
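The rate-set optimization described above is interpreted through quantization theory; a generic Lloyd-style 1-D quantizer over receiver bandwidths conveys the flavour, though the thesis optimizes a video-specific objective rather than this toy squared-error one, and all names here are illustrative.

```python
def lloyd_rates(bandwidths, k=2, iters=100):
    """Lloyd-style scalar quantizer: alternate between assigning each
    receiver to its nearest rate and moving each rate to its group mean."""
    bw = sorted(bandwidths)
    # seed the codebook with evenly spaced order statistics
    rates = [bw[len(bw) * i // k] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for b in bw:
            j = min(range(k), key=lambda i: abs(b - rates[i]))
            groups[j].append(b)
        # each rate moves to the centroid of the receivers it serves
        rates = [sum(g) / len(g) if g else rates[i]
                 for i, g in enumerate(groups)]
    return sorted(rates)

# receivers cluster around ~1 Mbit/s (mobile) and ~10 Mbit/s (broadband)
rates = lloyd_rates([0.9, 1.0, 1.2, 9.5, 10.0, 10.5], k=2)
```

With two natural receiver populations, the two selected rates settle near the centre of each cluster, which is the intuition behind serving heterogeneous receivers with a small, well-placed set of encodings.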
QC 20110411
APA, Harvard, Vancouver, ISO, and other styles
44

Chhina, Gagun S. "Video gaming parlours : the emergence of video gaming in India." Thesis, University of Manchester, 2016. https://www.research.manchester.ac.uk/portal/en/theses/video-gaming-parlours-the-emergence-of-video-gaming-in-india(75217f0f-c060-4c68-b708-ed496b3988e1).html.

Full text
Abstract:
This thesis critically interrogates the role of local context in the adoption and interpretation of video technology and gaming practices in the little studied locale of India. Video gaming is a recent phenomenon in India which has been rapidly increasing in popularity, yet it has gained little academic attention in digital gaming research. The project seeks to understand the emergence of practices of consumption of video games in India from the point of view of Indians themselves through the exploration of how Indian video gamers situate, interpret and negotiate the practice of video game play. In his book Video Gamers (2012), Gary Crawford makes a case for analysing game play as a practice, situated within everyday experiences and social networks. Crawford identifies two deficiencies in gaming studies: the dominance of a Western-centric viewpoint and the disregard for player context. This research addresses these shortcomings in two ways. First, through situating the field research in Chandigarh. Second, by employing a mixed methods qualitative approach - observations, interviews, focus groups, field notes, pictures and video recordings – to elicit the detail of the gamers' cultural context. Situating these practices within the broader social, historical, geographical and cultural milieu allows for the conceptualisation of contextual factors in terms of their influence on the adoption and interpretation of the global gaming practice in a local setting. These methods allow for the examination of, first, multiple culturally embedded factors and, second, the players' processes of sense making applied to video gaming. Each method makes the social world of the gamers visible in different ways. Fieldwork predominantly took place in video gaming parlours. Investigating game players in the space of the video gaming parlour enabled a more honed understanding of how the practice of video gaming was ‘glocalised’ within particular social, geographical and cultural contexts. 
A smaller second study was conducted in Manchester, to collect data in a setting that is culturally different from India. This contrasting data provided greater sensitivity to cultural factors in India which might have otherwise been overlooked or which had been obscured. The research draws theoretically upon Bourdieu’s theories of social field, habitus, and capital, combining these with Goffman’s notions of dramaturgy and framing, and Robertson’s concept of glocalisation. These concepts provided a theoretical framework that enabled an interrogation of the data to reveal the sociocultural processes embedded in the gaming parlours, and the individual’s creative engagements with video game products themselves. The methodological and theoretical framework, then, were complementary, offering both an experiential and contextual approach. This study found that video gamers interpret and make sense of the practice of video gaming through their contextual situation, and that they will both consciously and unconsciously attempt to glocalise the practice of video gaming so that it becomes culturally more acceptable.
APA, Harvard, Vancouver, ISO, and other styles
45

Keen, Seth. "Video chaos : multilinear narrative structuration in new media video practice /." Electronic version, 2005. http://adt.lib.uts.edu.au/public/adt-NTSM20050921.151215/index.html.

Full text
APA, Harvard, Vancouver, ISO, and other styles
46

Chen, Liyong. "Joint image/video inpainting for error concealment in video coding." Click to view the E-thesis via HKUTO, 2007. http://sunzi.lib.hku.hk/HKUTO/record/B39558915.

Full text
APA, Harvard, Vancouver, ISO, and other styles
47

Chen, Liyong, and 陳黎勇. "Joint image/video inpainting for error concealment in video coding." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2007. http://hub.hku.hk/bib/B39558915.

Full text
APA, Harvard, Vancouver, ISO, and other styles
48

Di Caterina, Gaetano. "Video analytics algorithms and distributed solutions for smart video surveillance." Thesis, University of Strathclyde, 2013. http://oleg.lib.strath.ac.uk:80/R/?func=dbin-jump-full&object_id=18949.

Full text
Abstract:
The growth in the number of surveillance cameras deployed and the progress in digital technologies in recent years have steered the video surveillance market towards the usage of computer systems to automatically analyse video feeds in a collaborative and distributive fashion. The semantic analysis and interpretation of video surveillance data through signal and image processing techniques is called Video Analytics (VA). In this thesis new video analytics methods are presented that are shown to be effective and efficient when compared to existing methods. A novel adaptive template matching algorithm for robust target tracking based on a modified Sum of Absolute Differences (SAD) called Sum of Weighted Absolute Differences (SWAD) is developed. A Gaussian weighting kernel is employed to reduce the effects of partial occlusion, while the target template is updated using an Infinite Impulse Response (IIR) filter. Experimental results demonstrate that the SWAD-based tracker outperforms conventional SAD in terms of efficiency and accuracy, and its performance is comparable to more complex trackers. Moreover, a novel technique for complete occlusion handling in the context of such a SWAD-based tracker is presented that is shown to preserve the template and recover the target after complete occlusion. A DSP embedded implementation of the SWAD-based tracker is then described, showing that such an algorithm is ideal for real-time implementations on devices with low computational capabilities, as in the case of fixed-point embedded DSP platforms. When colour is selected as the target feature to track, the mean shift (MS) tracker can be used. Although it has been shown to be fast, effective and robust in many scenarios, it fails in case of severe and complete occlusion or fast moving targets. A new improved MS tracker is presented which incorporates a failure recovery strategy.
The improved MS is simple and fast, and experimental results show that it can effectively recover a target after complete occlusion or loss, and successfully track targets in complex scenarios, such as crowd scenes. Although many methods have been proposed in the literature to detect abandoned and removed objects, they are not really designed to be able to trigger alerts within a time interval defined by the user. It is actually the background model updating procedure that dictates when the alerts are triggered. A novel algorithm for abandoned and removed object detection in real-time is presented. A detection time can be directly specified and the background is "healed" only after a new event has been detected. Moreover the actual detection time and the background model updating rate are computed in an adaptive way with respect to the algorithm frame processing rate, so that even on different machines the detection time is generally the same. This is in contrast with other algorithms, where either the frame rate or the background updating rate is considered to be fixed. The algorithm is employed in the context of a reactive smart surveillance system, which notifies registered users of the occurrence of events of interest, within seconds, through SMS alerts. In the context of multi-camera systems, spatio-temporal information extracted from a set of semantically clustered cameras can be fused together and exploited, to achieve a better understanding of the environment surrounding the cameras and to monitor areas wider than a single camera FOV. A highly flexible decentralised system software architecture is presented, for decentralised multi-view target tracking, where synchronisation constraints among processes can be relaxed. The improved MS tracker is extended to a collaborative multi-camera environment, wherein algorithm parameters are set automatically in separate views, upon colour characteristics of the target.
Such a decentralised multi-camera tracking system does not rely on camera positional information to initialise the trackers or to handle camera hand-off events. Tracking in separate camera views is performed solely on the visible characteristics of the target, reducing the system setup phase to a minimum. The system can automatically select, from a set of views, the one that gives the best visualisation of the target. Moreover, camera overlap information can be exploited to overcome target occlusion.
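The best-view selection step can be illustrated with a small sketch (the visibility measure and all names here are assumptions of ours, not the thesis design): each camera reports the fraction of the target that is unoccluded in its view, and the system switches to the view with the highest fraction.

```python
def select_best_view(visibility, min_visible=0.2):
    # visibility: camera id -> fraction of the target area that is
    # unoccluded in that view (0.0 - 1.0)
    best = max(visibility, key=visibility.get)
    if visibility[best] < min_visible:
        return None   # target too occluded in every view
    return best
```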
APA, Harvard, Vancouver, ISO, and other styles
49

Bai, Yannan. "Video analytics system for surveillance videos." Thesis, 2018. https://hdl.handle.net/2144/30739.

Full text
Abstract:
Developing an intelligent inspection system that can enhance public safety is challenging. An efficient video analytics system can help monitor unusual events and mitigate possible damage or loss. This thesis aims to analyze surveillance video data, report abnormal activities and retrieve the corresponding video clips. The surveillance video dataset used in this thesis is derived from the ALERT Dataset, a collection of surveillance videos recorded at airport security checkpoints. The video analytics system in this thesis can be thought of as a pipelined process: the system takes the surveillance video as input and passes it through a series of processing stages such as object detection, multi-object tracking, person-bin association and re-identification. In the end, we obtain trajectories of passengers and baggage in the surveillance videos. Abnormal events, such as taking away others' belongings, are detected and trigger an alarm automatically. The system can also retrieve the corresponding video clips based on a user-defined query.
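Once the person-bin association stage has established ownership, the "taking away others' belongings" check reduces to comparing the owner of a bin with the person picking it up. A minimal sketch, with hypothetical data shapes of our own choosing:

```python
def detect_theft(owner_of, pickups):
    # owner_of: bin id -> person id, established by person-bin association
    # pickups:  list of (person id, bin id) events observed by the tracker
    # returns the pickups where someone takes a bin they do not own
    return [(person, bin_id) for person, bin_id in pickups
            if owner_of.get(bin_id) != person]
```

In a real pipeline, re-identification would ensure that the person ids are consistent across camera views before this comparison is made.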
APA, Harvard, Vancouver, ISO, and other styles
50

Parimala, Anusha. "Video Enhancement: Video Stabilization." Thesis, 2018. http://ethesis.nitrkl.ac.in/9977/1/2018_MT_216EC6252_AParimala_Video.pdf.

Full text
Abstract:
Video stabilization is required in digital video cameras because most of them are hand-held or mounted on moving platforms (e.g. cars) and therefore undergo vibrations. Videos taken by cameras with CMOS sensors also suffer rapid distortions because of the rolling shutter, which scans an image horizontally or vertically instead of recording the whole scene at the same instant. Moving cameras additionally cause geometric distortion of objects in the video because of parallax. A method is proposed that handles parallax and rolling-shutter effects, along with removing the unwanted frame motion caused by camera movement, by modelling the camera motion as a bundle of camera paths. The bundled camera paths motion model divides the video frames into mesh grid cells and extracts matched features for every pair of consecutive frames. Given these matched features and the grid cell vertices, new vertices are obtained using bilinear interpolation, resulting in a warp of each frame in accordance with the camera motion. Homographies representing the camera motion of each grid cell are obtained from the original and new vertices. For each grid cell, the homographies of the corresponding cell across all frames of the video are multiplied to obtain that cell's camera path, yielding the bundle of camera paths. These camera paths are then smoothed adaptively; the adaptive smoothing handles cropping. Finally, the smoothed homographies are applied to the original video to produce the stabilized video.
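The per-cell accumulation of homographies into camera paths can be sketched as follows. This is a simplified illustration: the multiplication order is one common convention, the names are ours, and the fixed-window smoothing stands in for the adaptive smoothing the thesis describes.

```python
import numpy as np

def camera_paths(per_frame_H):
    # per_frame_H: list over frames of {cell: 3x3 homography for that cell}
    # returns {cell: [C_0, C_1, ...]} with C_0 = I and C_t = C_{t-1} @ H_t,
    # i.e. one accumulated camera path per mesh grid cell
    paths = {}
    for frame in per_frame_H:
        for cell, H in frame.items():
            path = paths.setdefault(cell, [np.eye(3)])
            path.append(path[-1] @ H)
    return paths

def smooth_path(path, window=3):
    # moving-average smoothing of one cell's path; a fixed window is used
    # here purely for illustration in place of adaptive smoothing
    out = []
    for t in range(len(path)):
        lo, hi = max(0, t - window), min(len(path), t + window + 1)
        out.append(sum(path[lo:hi]) / (hi - lo))
    return out
```

For pure translations the accumulation behaves additively: two per-frame shifts of one pixel along x yield an accumulated path matrix translating by two pixels.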
APA, Harvard, Vancouver, ISO, and other styles
