Academic literature on the topic 'MULTI VIEW VIDEOS'


Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'MULTI VIEW VIDEOS.'


Journal articles on the topic "MULTI VIEW VIDEOS"

1. Luo, Lei, Rong Xin Jiang, Xiang Tian, and Yao Wu Chen. "Reference Viewpoints Selection for Multi-View Video Plus Depth Coding Based on the Network Bandwidth Constraint." Applied Mechanics and Materials 303-306 (February 2013): 2134–38. http://dx.doi.org/10.4028/www.scientific.net/amm.303-306.2134.

Abstract:
In multi-view video plus depth (MVD) coding based free viewpoint video applications, a few reference viewpoints' texture and depth videos are compressed and transmitted at the server side. At the terminal side, the displayed views can be the decoded reference views or virtual viewpoints' videos synthesized by DIBR technology. The overall video quality of all display views is determined by the number of reference viewpoints and by the compression distortion of each reference viewpoint's texture and depth videos. This paper studies the impact of reference viewpoint selection on the overall video quality of all display views. The results show that, depending on the available network bandwidth, MVD coding requires different selections of reference viewpoints to maximize the overall video quality of all display views.
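
The selection problem this abstract describes has the shape of a constrained combinatorial search. As a minimal illustrative sketch (not the paper's algorithm), assume a hypothetical per-view bitrate table `rate` and a quality oracle `quality_of` that returns the mean quality over all display views for a given reference subset:

```python
from itertools import combinations

def select_reference_views(candidates, rate, quality_of, bandwidth):
    """Brute-force search over reference-viewpoint subsets under a
    network bandwidth cap; maximizes overall display-view quality."""
    best_subset, best_quality = None, float("-inf")
    for k in range(1, len(candidates) + 1):
        for refs in combinations(candidates, k):
            # Texture + depth bitrates of the chosen references must fit
            # within the available network bandwidth.
            if sum(rate[r] for r in refs) > bandwidth:
                continue
            q = quality_of(refs)  # mean quality over all display views
            if q > best_quality:
                best_subset, best_quality = refs, q
    return best_subset, best_quality
```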

2. Chen, Jiawei, Zhenshi Zhang, and Xupeng Wen. "Target Identification via Multi-View Multi-Task Joint Sparse Representation." Applied Sciences 12, no. 21 (October 28, 2022): 10955. http://dx.doi.org/10.3390/app122110955.

Abstract:
The monitoring efficiency and accuracy of visible and infrared video surveillance have so far been relatively low. In this paper, we propose an automatic target identification method using surveillance video, which provides an effective solution for surveillance video data. Specifically, a target identification method via multi-view, multi-task sparse learning is proposed, where the multiple views include various types of visual features such as textures, edges, and invariant features. Each view of a candidate is regarded as a template, and the potential relationship between different tasks and different views is considered. These multiple views are integrated into a multi-task sparse learning framework. The proposed MVMT method can be applied to ship identification. Extensive experiments are conducted on public datasets and custom sequence frames (i.e., six sequence frames from ship videos). The experimental results show that the proposed method is superior to other classical methods, both qualitatively and quantitatively.
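
As a rough sketch of the sparse-representation idea, one can code a candidate's features against a class-labeled template dictionary per view and pick the class with the smallest total residual. The version below codes each view independently with scikit-learn's Lasso, which omits the joint multi-task regularization the paper proposes; all names and shapes are illustrative:

```python
import numpy as np
from sklearn.linear_model import Lasso

def identify(views, dictionaries, labels, alpha=0.01):
    """views[v]: feature vector for view v (textures, edges, ...);
    dictionaries[v]: (dim x n_templates) matrix whose columns are
    class-labeled templates; labels: class index of each template."""
    labels = np.asarray(labels)
    residual = np.zeros(labels.max() + 1)
    for x, D in zip(views, dictionaries):
        # Sparse code of this view over the template dictionary.
        code = Lasso(alpha=alpha, max_iter=5000).fit(D, x).coef_
        for c in range(residual.size):
            mask = labels == c  # templates belonging to class c
            residual[c] += np.linalg.norm(x - D[:, mask] @ code[mask])
    return int(np.argmin(residual))  # class with lowest total residual
```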

3. Zhong, Chengzhang, Amy R. Reibman, Hansel A. Mina, and Amanda J. Deering. "Multi-View Hand-Hygiene Recognition for Food Safety." Journal of Imaging 6, no. 11 (November 7, 2020): 120. http://dx.doi.org/10.3390/jimaging6110120.

Abstract:
A majority of foodborne illnesses result from inappropriate food handling practices. One proven practice to reduce pathogens is to perform effective hand-hygiene before all stages of food handling. In this paper, we design a multi-camera system that uses video analytics to recognize hand-hygiene actions, with the goal of improving hand-hygiene effectiveness. Our proposed two-stage system processes untrimmed video from both egocentric and third-person cameras. In the first stage, a low-cost coarse classifier efficiently localizes the hand-hygiene period; in the second stage, more complex refinement classifiers recognize seven specific actions within the hand-hygiene period. We demonstrate that our two-stage system has significantly lower computational requirements without a loss of recognition accuracy. Specifically, the computationally complex refinement classifiers process less than 68% of the untrimmed videos, and we anticipate further computational gains in videos that contain a larger fraction of non-hygiene actions. Our results demonstrate that a carefully designed video action recognition system can play an important role in improving hand hygiene for food safety.
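
The two-stage design can be paraphrased in a few lines. In this hypothetical sketch, `coarse` and `refine` stand in for the paper's classifiers; only the cheap coarse model sees the whole untrimmed video:

```python
def recognize_hand_hygiene(frames, coarse, refine, win=16):
    """Stage 1: a low-cost classifier flags windows that look like
    hand-hygiene activity. Stage 2: costlier refinement classifiers
    label specific actions only inside the localized period."""
    flags = [coarse(frames[i:i + win]) for i in range(0, len(frames), win)]
    hygiene = [i for i, f in enumerate(flags) if f]
    if not hygiene:
        return []  # no hand-hygiene period found in this video
    start, end = hygiene[0] * win, (hygiene[-1] + 1) * win
    # Only the localized span incurs the expensive per-action inference.
    return [refine(frames[i:i + win]) for i in range(start, end, win)]
```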

4. Kumar, Yaman, Rohit Jain, Khwaja Mohd Salik, Rajiv Ratn Shah, Yifang Yin, and Roger Zimmermann. "Lipper: Synthesizing Thy Speech Using Multi-View Lipreading." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 2588–95. http://dx.doi.org/10.1609/aaai.v33i01.33012588.

Abstract:
Lipreading has many potential applications, such as surveillance and video conferencing. Despite this, most of the work on building lipreading systems has been limited to classifying silent videos into classes representing text phrases. However, there are multiple problems associated with making lipreading a text-based classification task, such as its dependence on a particular language and vocabulary mapping. Thus, in this paper we propose a multi-view lipreading-to-audio system, namely Lipper, which models it as a regression task. The model takes silent videos as input and produces speech as the output. With multi-view silent videos, we observe an improvement over single-view speech reconstruction results. We show this by presenting an exhaustive set of experiments for speaker-dependent, out-of-vocabulary and speaker-independent settings. Further, we compare the delay values of Lipper with other speechreading systems in order to show the real-time nature of the audio produced. We also perform a user study on the produced audio to understand the level of comprehensibility of audio produced using Lipper.

5. Ata, Sezin Kircali, Yuan Fang, Min Wu, Jiaqi Shi, Chee Keong Kwoh, and Xiaoli Li. "Multi-View Collaborative Network Embedding." ACM Transactions on Knowledge Discovery from Data 15, no. 3 (April 12, 2021): 1–18. http://dx.doi.org/10.1145/3441450.

Abstract:
Real-world networks often exist with multiple views, where each view describes one type of interaction among a common set of nodes. For example, on a video-sharing network, two user nodes can be linked in one view if they have common favorite videos, and linked in another view if they share common subscribers. Unlike traditional single-view networks, multiple views maintain different semantics that complement each other. In this article, we propose Multi-view collAborative Network Embedding (MANE), a multi-view network embedding approach to learn low-dimensional representations. Similar to existing studies, MANE hinges on diversity and collaboration: while diversity enables views to maintain their individual semantics, collaboration enables views to work together. However, we also discover a novel form of second-order collaboration that has not been explored previously, and further unify it into our framework to attain superior node representations. Furthermore, as each view often has varying importance w.r.t. different nodes, we propose an attention-based extension of MANE to model node-wise view importance. Finally, we conduct comprehensive experiments on three public, real-world multi-view networks, and the results demonstrate that our models consistently outperform state-of-the-art approaches.

6. Pan, Yingwei, Yue Chen, Qian Bao, Ning Zhang, Ting Yao, Jingen Liu, and Tao Mei. "Smart Director: An Event-Driven Directing System for Live Broadcasting." ACM Transactions on Multimedia Computing, Communications, and Applications 17, no. 4 (November 30, 2021): 1–18. http://dx.doi.org/10.1145/3448981.

Abstract:
Live video broadcasting normally requires a multitude of skills and expertise with domain knowledge to enable multi-camera productions. As the number of cameras keeps increasing, directing a live sports broadcast has now become more complicated and challenging than ever before. The broadcast directors need to be much more concentrated, responsive, and knowledgeable during the production. To relieve the directors from their intensive efforts, we develop an innovative automated sports broadcast directing system, called Smart Director, which aims at mimicking the typical human-in-the-loop broadcasting process to automatically create near-professional broadcasting programs in real-time by using a set of advanced multi-view video analysis algorithms. Inspired by the so-called "three-event" construction of sports broadcast [14], we build our system with an event-driven pipeline consisting of three consecutive novel components: (1) the Multi-View Event Localization to detect events by modeling multi-view correlations, (2) the Multi-View Highlight Detection to rank camera views by visual importance for view selection, and (3) the Auto-Broadcasting Scheduler to control the production of broadcasting videos. To the best of our knowledge, our system is the first end-to-end automated directing system for multi-camera sports broadcasting, completely driven by the semantic understanding of sports events. It is also the first system to solve the novel problem of multi-view joint event detection by cross-view relation modeling. We conduct both objective and subjective evaluations on a real-world multi-camera soccer dataset, which demonstrate that the quality of our auto-generated videos is comparable to that of the human-directed videos. Thanks to its faster response, our system is able to capture more fast-passing and short-duration events which are usually missed by human directors.
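
A minimal event-driven directing loop, with `localize_events` and `highlight_score` as placeholders for the paper's multi-view event localization and highlight detection components, might look like this sketch:

```python
def direct(batches, localize_events, highlight_score):
    """batches: iterable of {view_id: frames} captured in lockstep.
    For each detected event, cut to the view with the highest
    predicted visual importance (auto-broadcasting schedule)."""
    program = []
    for batch in batches:
        for event in localize_events(batch):   # multi-view event detection
            best_view = max(batch, key=lambda v: highlight_score(batch[v], event))
            program.append((event, best_view))  # scheduled camera cut
    return program
```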

7. Salik, Khwaja Mohd, Swati Aggarwal, Yaman Kumar, Rajiv Ratn Shah, Rohit Jain, and Roger Zimmermann. "Lipper: Speaker Independent Speech Synthesis Using Multi-View Lipreading." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 10023–24. http://dx.doi.org/10.1609/aaai.v33i01.330110023.

Abstract:
Lipreading is the process of understanding and interpreting speech by observing a speaker's lip movements. In the past, most of the work in lipreading has been limited to classifying silent videos into a fixed number of text classes. However, this limits the applications of lipreading, since human language cannot be bound to a fixed set of words or languages. The aim of this work is to reconstruct intelligible acoustic speech signals from silent videos of a person in various poses that Lipper has never seen before. Lipper, therefore, is a vocabulary- and language-agnostic, speaker-independent, near real-time model that deals with a variety of poses of a speaker. The model leverages silent video feeds from multiple cameras recording a subject to generate intelligible speech of the speaker. It uses a deep-learning-based STCNN+BiGRU architecture to achieve this goal. We evaluate speech reconstruction for speaker-independent scenarios and demonstrate the speech output by overlaying the audio reconstructed by Lipper on the corresponding videos.

8. Obayashi, Mizuki, Shohei Mori, Hideo Saito, Hiroki Kajita, and Yoshifumi Takatsume. "Multi-View Surgical Camera Calibration with None-Feature-Rich Video Frames: Toward 3D Surgery Playback." Applied Sciences 13, no. 4 (February 14, 2023): 2447. http://dx.doi.org/10.3390/app13042447.

Abstract:
Mounting multi-view cameras within a surgical light is a practical choice, since some of the cameras can be expected to observe the surgery with few occlusions. Such multi-view videos must be reassembled for easy reference. A typical way is to reconstruct the surgery in 3D. However, the geometrical relationship among the cameras changes because each camera moves independently every time the lighting is reconfigured (i.e., every time surgeons touch the surgical light). Moreover, feature matching between surgical images is potentially challenging because rich features are missing. To address this challenge, we propose a feature-matching strategy that enables robust calibration of the multi-view camera system by collecting a small number of matches over time while the cameras stay stationary. Our approach enables conversion from multi-view videos to a 3D video. However, surgical videos are long and, thus, the cost of the conversion grows rapidly. Therefore, we implement a video player where only selected frames are converted, to minimize time and data until playback. We demonstrate that sufficient calibration quality with real surgical videos can lead to a promising 3D mesh and a recently emerged 3D multi-layer representation. We reviewed comments from surgeons to discuss the differences between those 3D representations on an autostereoscopic display with respect to medical usage.
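
The match-accumulation strategy can be sketched with standard OpenCV primitives. The feature type (ORB), the thresholds, and the two-camera fundamental-matrix target below are illustrative assumptions rather than the paper's exact pipeline:

```python
import cv2
import numpy as np

def calibrate_pair(frame_pairs, min_matches=60):
    """Collect a few reliable matches per frame over time (the cameras
    stay stationary between lighting reconfigurations) until enough
    accumulate for a robust epipolar-geometry estimate."""
    orb = cv2.ORB_create(nfeatures=500)
    bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    pts1, pts2 = [], []
    for img1, img2 in frame_pairs:
        k1, d1 = orb.detectAndCompute(img1, None)
        k2, d2 = orb.detectAndCompute(img2, None)
        if d1 is None or d2 is None:
            continue  # feature-poor frame; keep collecting
        matches = sorted(bf.match(d1, d2), key=lambda m: m.distance)[:10]
        pts1 += [k1[m.queryIdx].pt for m in matches]
        pts2 += [k2[m.trainIdx].pt for m in matches]
        if len(pts1) >= min_matches:
            F, _ = cv2.findFundamentalMat(
                np.float32(pts1), np.float32(pts2), cv2.FM_RANSAC)
            return F
    return None  # not enough matches collected yet
```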

9. Du, Ming, Aswin C. Sankaranarayanan, and Rama Chellappa. "Robust Face Recognition From Multi-View Videos." IEEE Transactions on Image Processing 23, no. 3 (March 2014): 1105–17. http://dx.doi.org/10.1109/tip.2014.2300812.

10. Mallik, Bruhanth, Akbar Sheikh-Akbari, Pooneh Bagheri Zadeh, and Salah Al-Majeed. "HEVC Based Frame Interleaved Coding Technique for Stereo and Multi-View Videos." Information 13, no. 12 (November 25, 2022): 554. http://dx.doi.org/10.3390/info13120554.

Abstract:
The standard HEVC codec and its extension for coding multiview videos, known as MV-HEVC, have proven to deliver improved visual quality compared to their predecessor, H.264/MPEG-4 AVC's multiview extension H.264-MVC, at the same frame resolution with up to 50% bitrate savings. MV-HEVC's framework is similar to that of H.264-MVC, which uses a multi-layer coding approach. Hence, MV-HEVC requires all frames from other reference layers to be decoded prior to decoding a new layer. The multi-layer coding architecture is thus a bottleneck when it comes to quicker frame streaming across different views. In this paper, an HEVC-based Frame Interleaved Stereo/Multiview Video Codec (HEVC-FISMVC) that uses a single-layer encoding approach to encode stereo and multiview video sequences is presented. The frames of stereo or multiview video sequences are interleaved in such a way that encoding the resulting monoscopic video stream maximizes the exploitation of temporal, inter-view, and cross-view correlations, thus improving the overall coding efficiency. The coding performance of the proposed HEVC-FISMVC codec is assessed and compared with that of the standard MV-HEVC for three standard multi-view video sequences, namely "Poznan_Street", "Kendo" and "Newspaper1". Experimental results show that the proposed codec provides more substantial coding gains than the anchor MV-HEVC for coding both stereo and multi-view video sequences.
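
The core interleaving step is easy to illustrate; the actual compression is then done by an unmodified single-layer HEVC encoder on the reordered stream. A minimal sketch:

```python
def interleave_views(views):
    """Interleave per-view frame lists into one monoscopic sequence,
    e.g. [L0, R0, L1, R1, ...] for a stereo pair, so a single-layer
    encoder can exploit temporal, inter-view, and cross-view correlation."""
    return [frame for group in zip(*views) for frame in group]

def deinterleave(stream, n_views):
    """Recover per-view sequences from the decoded monoscopic stream."""
    return [stream[v::n_views] for v in range(n_views)]
```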

Dissertations / Theses on the topic "MULTI VIEW VIDEOS"

1. Wang, Dongang. "Action Recognition in Multi-view Videos." Thesis, The University of Sydney, 2018. http://hdl.handle.net/2123/19740.

Abstract:
A long-lasting goal in the field of artificial intelligence is to develop agents that can perceive and understand the rich visual world around us. With the improvement in deep learning and neural networks, many previous difficulties in the computer vision area have been resolved. For example, accuracy in image classification has even exceeded human performance in the ImageNet challenge. However, some problems remain challenging for the community, such as action recognition and its application to multi-view videos. Building on a large number of previous works from the last few years, this thesis proposes a new Dividing and Aggregating Network (DA-Net) to address action recognition in multi-view videos. First, DA-Net learns view-independent representations shared by all views at lower layers and one view-specific representation for each view at higher layers. We then train view-specific action classifiers based on the view-specific representation for each view, and a view classifier based on the shared representation at the lower layers. The view classifier predicts how likely each video belongs to each view. Finally, the predicted view probabilities from multiple views are used as weights when fusing the prediction scores of the view-specific action classifiers. We also propose a new approach based on the conditional random field (CRF) formulation to pass messages among view-specific representations from different branches so they can help each other. Comprehensive experiments on three benchmark datasets clearly demonstrate the effectiveness of the proposed DA-Net for multi-view action recognition. An ablation study further indicates that the three proposed modules provide steady improvements in prediction accuracy.
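
The fusion rule described above reduces to a probability-weighted average of per-view classifier scores. A toy numpy sketch (shapes and values are illustrative):

```python
import numpy as np

def fuse_predictions(view_probs, view_scores):
    """view_probs:  (V,)  view-classifier output, sums to 1
    view_scores: (V, C) action scores from V view-specific classifiers
    Returns the (C,) fused action scores."""
    return view_probs @ view_scores

# Toy usage with 2 views and 3 action classes:
probs = np.array([0.7, 0.3])                  # predicted view probabilities
scores = np.array([[0.1, 0.8, 0.1],
                   [0.2, 0.5, 0.3]])          # per-view action scores
print(fuse_predictions(probs, scores).argmax())  # fused action label
```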

2. Canavan, Shaun. "Face recognition by multi-frame fusion of rotating heads in videos." Connect to resource online, 2008. http://rave.ohiolink.edu/etdc/view?acc_num=ysu1210446052.

3. Canavan, Shaun J. "Face Recognition by Multi-Frame Fusion of Rotating Heads in Videos." Youngstown State University / OhioLINK, 2008. http://rave.ohiolink.edu/etdc/view?acc_num=ysu1210446052.

4. Balusu, Anusha. "Multi-Vehicle Detection and Tracking in Traffic Videos Obtained from UAVs." University of Cincinnati / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1593266183551245.

5. Twinanda, Andru Putra. "Vision-based approaches for surgical activity recognition using laparoscopic and RGBD videos." Thesis, Strasbourg, 2017. http://www.theses.fr/2017STRAD005/document.

Abstract:
The main objective of this thesis is to address the problem of activity recognition in the operating room (OR). Activity recognition is an essential component in the development of context-aware systems, which will allow various applications, such as automated assistance during difficult procedures. Here, we focus on vision-based approaches since cameras are a common source of information to observe the OR without disrupting the surgical workflow. Specifically, we propose to use two complementary video types: laparoscopic and OR-scene RGBD videos. We investigate how state-of-the-art computer vision approaches perform on these videos and propose novel approaches, consisting of deep learning approaches, to carry out the tasks. To evaluate our proposed approaches, we generate large datasets of recordings of real surgeries. The results demonstrate that the proposed approaches outperform the state-of-the-art methods in performing surgical activity recognition on these new datasets.

6. Ozcinar, Cagri. "Multi-view video communication." Thesis, University of Surrey, 2015. http://epubs.surrey.ac.uk/807807/.

Abstract:
The proliferation of three-dimensional (3D) video technology increases the demand for multiview video (MVV) communication tremendously. Applications that involve MVV constitute the next step in 3D media technology, given that they offer a more realistic viewing experience. The distribution of MVV to users brings significant challenges due to the large volume of data involved and the inherent limitations imposed by the communication protocols. As the number of views increases, current systems will struggle to meet the demand of delivering the MVV at a consistent quality level. To this end, this thesis addresses efficient coding, adaptive streaming, and loss-resilient delivery techniques for MVV. The first contribution of this thesis addresses the problem of cost-efficient transmission of MVV with a provided per-view depth map. The primary goal is to facilitate the delivery of the maximum possible number of MVV streams over the Internet in order to ensure that the MVV reconstruction quality is maximised. Accordingly, a novel view scalable MVV coding approach is introduced, which includes a new view discarding and reconstruction algorithm. The results of extensive experiments demonstrate that the proposed MVV coding scheme has considerably improved rate-distortion (R-D) performance compared to the state-of-the-art standards. The second contribution of this thesis is the design of an adaptive MVV streaming technique that offers uninterrupted high-quality delivery to users. In order to achieve this, a number of novel mechanisms are introduced that can adapt the MVV content to collaborative peer-to-peer (P2P) and server-client multimedia dissemination networks. Experiment results show that the suggested adaptation technique yields a superior playback performance over a broad range of network conditions. The final contribution of this thesis is the design of an error-resilient scheme that addresses packet losses for MVV streaming. The aim is to make the MVV streaming more reliable against communication failures. Simulation results clearly show that the proposed approach outperforms reference solutions by a significant margin, not only objectively, but through subjective testing as well.

7. Salvador, Marcos Jordi. "Surface reconstruction for multi-view video." Doctoral thesis, Universitat Politècnica de Catalunya, 2011. http://hdl.handle.net/10803/108907.

Abstract:
This thesis introduces a methodology for obtaining an alternative representation of video sequences captured by calibrated multi-camera systems in controlled environments with a known scene background. This representation consists of a 3D description of the surfaces of foreground objects, which allows recovering part of the 3D information of the original scene that is lost in the projection process in each camera. The choice of the type of representation and the design of the reconstruction techniques are driven by three requirements that appear in smart rooms or recording studios. In these scenarios, video sequences captured by a multi-camera rig are used both for analysis applications and interactive visualization methods. The requirements are: the reconstruction method must be fast in order to be usable in interactive applications; the surface representation must provide a compression of the multi-view data redundancies; and this representation must also provide all the relevant information to be used for analysis applications as well as for free-viewpoint video. Once foreground and background are segregated for each view, the reconstruction process is divided into two stages. The first one obtains a sampling of the foreground surfaces (including orientation and texture), whereas the second provides closed, continuous surfaces from the samples, through interpolation. The sampling process is interpreted as a search for 3D positions that result in feature matchings between different views. This search can be driven by different mechanisms: an image-based approach, one based on the deformation of a surface from frame to frame, or a statistical sampling approach where samples are searched around the positions of other detected samples, which is the fastest of the three and the easiest to parallelize. A meshing algorithm is also presented, which allows for the interpolation of surfaces between samples. Starting from an initial triangle connecting three coherently oriented points, an iterative expansion of the surface over the complete set of samples takes place. The proposed method produces a very accurate reconstruction and results in a correct topology. Furthermore, it is fast enough to be used interactively. The presented methodology for surface reconstruction permits obtaining a fast, compressed and complete representation of foreground elements in multi-view video, as reflected by the experimental results.
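
The statistical seed-and-expand sampling strategy can be paraphrased as follows. Here `consistent` is a placeholder for a multi-view photo-consistency test, and all parameters are illustrative rather than the thesis's settings:

```python
import numpy as np

def sample_surface(bounds, consistent, n_seeds=200, n_expand=30, step=0.01):
    """Randomly probe the working volume for seed samples, then expand
    around each confirmed sample, exploiting the spatial correlation of
    surface points (the most parallelizable of the three strategies)."""
    rng = np.random.default_rng(0)
    lo, hi = np.asarray(bounds[0]), np.asarray(bounds[1])
    samples = []
    for p in rng.uniform(lo, hi, size=(n_seeds, 3)):
        if consistent(p):  # feature match across views at point p?
            samples.append(p)
            # Expansion: search new samples in a small neighbourhood.
            for q in p + rng.normal(0.0, step, size=(n_expand, 3)):
                if consistent(q):
                    samples.append(q)
    return np.array(samples)
```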

8. Abdullah, Jan Mirza, and Mahmododfateh Ahsan. "Multi-View Video Transmission over the Internet." Thesis, Linköping University, Department of Electrical Engineering, 2010. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-57903.

Abstract:

3D television using multi-view rendering is receiving increasing interest. In this technology a number of video sequences are transmitted simultaneously and provide a larger view of the scene or a stereoscopic viewing experience. With two views, stereoscopic rendition is possible. Nowadays 3D displays are available that are capable of displaying several views simultaneously, and the user is able to see different views by moving his head.

The thesis work aims at implementing a demonstration system with a number of simultaneous views. The system will include two cameras, computers at both the transmitting and receiving end and a multi-view display. Besides setting up the hardware, the main task is to implement software so that the transmission can be done over an IP-network.

This thesis report includes an overview of and experiences from similar published systems; the implementation of real-time video capture, compression, encoding, and transmission over the Internet with the help of socket programming; and finally the multi-view display in 3D format. The report also describes the design considerations in more detail regarding video coding and network protocols.

9. Fecker, Ulrich. Coding techniques for multi-view video signals. Aachen: Shaker, 2009. http://d-nb.info/993283179/04.

10. Ozkalayci, Burak Oguz. "Multi-view Video Coding Via Dense Depth Field." Master's thesis, METU, 2006. http://etd.lib.metu.edu.tr/upload/12607517/index.pdf.

Abstract:
Emerging 3-D applications and 3-D display technologies raise transmission problems for next-generation multimedia data. Multi-view Video Coding (MVC) is one of the challenging topics in this area and is on its road to standardization via ISO MPEG. In this thesis, a 3-D geometry-based MVC approach is proposed and analyzed in terms of its compression performance. For this purpose, the overall study is partitioned into three parts. The first is dense depth estimation of a view from a fully calibrated multi-view set. The calibration information and smoothness assumptions are utilized for determining dense correspondences via a Markov Random Field (MRF) model, which is solved by the Belief Propagation (BP) method. In the second part, the estimated dense depth maps are utilized for generating (predicting) arbitrary (other camera) views of a scene, known as novel view generation. A 3-D warping algorithm, followed by an occlusion-compatible hole-filling process, is implemented for this aim. In order to suppress occlusion artifacts, an intermediate novel view generation method, which fuses two novel views generated from different source views, is developed. Finally, in the last part, the dense depth estimation and intermediate novel view generation tools are utilized in the proposed H.264-based MVC scheme for the removal of spatial redundancies between different views. The performance of the proposed approach is compared against simulcast coding and a recent MVC proposal, which is expected to be the standard recommendation for MPEG in the near future. The results show that geometric approaches in MVC can still be utilized, especially in certain 3-D applications, in addition to conventional temporal motion compensation techniques, although the rate-distortion performance of geometry-free approaches is quite superior.
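
The 3-D warping step at the heart of such geometry-based prediction is standard pinhole-camera algebra. A compact numpy sketch (forward warping only; occlusion-compatible hole filling, as the abstract notes, is a separate pass):

```python
import numpy as np

def warp_to_novel_view(depth, K_src, K_dst, R, t):
    """Back-project every source pixel with its depth, apply the rigid
    transform (R, t) to the novel camera, and re-project. Returns the
    (h, w, 2) pixel coordinates of each source pixel in the novel view."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T
    rays = np.linalg.inv(K_src) @ pix           # back-projected rays
    pts = rays * depth.reshape(1, -1)           # 3-D points, source frame
    proj = K_dst @ (R @ pts + t.reshape(3, 1))  # transform and project
    return (proj[:2] / proj[2]).T.reshape(h, w, 2)
```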

Books on the topic "MULTI VIEW VIDEOS"

1. Coelho, Alessandra Martins. Multimedia Networking and Coding: State-of-the-Art Motion Estimation in the Context of 3D TV. Cyprus: INTECH, 2013.


Book chapters on the topic "MULTI VIEW VIDEOS"

1. Yashika, B. L., and Vinod B. Durdi. "Image Fusion in Multi-view Videos Using SURF Algorithm." In Information and Communication Technology for Competitive Strategies (ICTCS 2020), 1061–71. Singapore: Springer Singapore, 2021. http://dx.doi.org/10.1007/978-981-16-0882-7_96.

2. Hossain, Emdad, Girija Chetty, and Roland Goecke. "Multi-view Multi-modal Gait Based Human Identity Recognition from Surveillance Videos." In Lecture Notes in Computer Science, 88–99. Berlin, Heidelberg: Springer Berlin Heidelberg, 2013. http://dx.doi.org/10.1007/978-3-642-37081-6_11.

3. Hossain, Emdad, and Girija Chetty. "Gait Based Human Identity Recognition from Multi-view Surveillance Videos." In Algorithms and Architectures for Parallel Processing, 319–28. Berlin, Heidelberg: Springer Berlin Heidelberg, 2012. http://dx.doi.org/10.1007/978-3-642-33065-0_34.

4. Wang, Chao, Yunhong Wang, Zhaoxiang Zhang, and Yiding Wang. "Model-Based Multi-view Face Construction and Recognition in Videos." In Lecture Notes in Computer Science, 280–87. Berlin, Heidelberg: Springer Berlin Heidelberg, 2012. http://dx.doi.org/10.1007/978-3-642-31576-3_37.

5. Nosrati, Masoud S., Jean-Marc Peyrat, Julien Abinahed, Osama Al-Alao, Abdulla Al-Ansari, Rafeef Abugharbieh, and Ghassan Hamarneh. "Efficient Multi-organ Segmentation in Multi-view Endoscopic Videos Using Pre-operative Priors." In Medical Image Computing and Computer-Assisted Intervention – MICCAI 2014, 324–31. Cham: Springer International Publishing, 2014. http://dx.doi.org/10.1007/978-3-319-10470-6_41.

6. He, Ming, Yong Ge, Le Wu, Enhong Chen, and Chang Tan. "Predicting the Popularity of DanMu-enabled Videos: A Multi-factor View." In Database Systems for Advanced Applications, 351–66. Cham: Springer International Publishing, 2016. http://dx.doi.org/10.1007/978-3-319-32049-6_22.

7. Santhoshkumar, R., and M. Kalaiselvi Geetha. "Emotion Recognition on Multi View Static Action Videos Using Multi Blocks Maximum Intensity Code (MBMIC)." In New Trends in Computational Vision and Bio-inspired Computing, 1143–51. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-41862-5_116.

8. Hossain, Emdad, and Girija Chetty. "Multi-view Gait Fusion for Large Scale Human Identification in Surveillance Videos." In Advanced Concepts for Intelligent Vision Systems, 527–37. Berlin, Heidelberg: Springer Berlin Heidelberg, 2012. http://dx.doi.org/10.1007/978-3-642-33140-4_46.

9. Guo, Lintao, Hunter Quant, Nikolas Lamb, Benjamin Lowit, Sean Banerjee, and Natasha Kholgade Banerjee. "Spatiotemporal 3D Models of Aging Fruit from Multi-view Time-Lapse Videos." In MultiMedia Modeling, 466–78. Cham: Springer International Publishing, 2018. http://dx.doi.org/10.1007/978-3-319-73603-7_38.

10. Guo, Lintao, Hunter Quant, Nikolas Lamb, Benjamin Lowit, Natasha Kholgade Banerjee, and Sean Banerjee. "Multi-camera Microenvironment to Capture Multi-view Time-Lapse Videos for 3D Analysis of Aging Objects." In MultiMedia Modeling, 381–85. Cham: Springer International Publishing, 2018. http://dx.doi.org/10.1007/978-3-319-73600-6_37.


Conference papers on the topic "MULTI VIEW VIDEOS"

1. Cai, Jia-Jia, Jun Tang, Qing-Guo Chen, Yao Hu, Xiaobo Wang, and Sheng-Jun Huang. "Multi-View Active Learning for Video Recommendation." In Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI-19). California: International Joint Conferences on Artificial Intelligence Organization, 2019. http://dx.doi.org/10.24963/ijcai.2019/284.

Abstract:
On many video websites, the recommendation is implemented as a prediction problem of video-user pairs, where the videos are represented by text features extracted from the metadata. However, the metadata is manually annotated by users and is usually missing for online videos. To train an effective recommender system with lower annotation cost, we propose an active learning approach to fully exploit the visual view of videos, while querying as few annotations as possible from the text view. On one hand, a joint model is proposed to learn the mapping from visual view to text view by simultaneously aligning the two views and minimizing the classification loss. On the other hand, a novel strategy based on prediction inconsistency and watching frequency is proposed to actively select the most important videos for metadata querying. Experiments on both classification datasets and real video recommendation tasks validate that the proposed approach can significantly reduce the annotation cost.
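
The selection strategy can be illustrated with a simple utility score. The sketch below uses a symmetric KL divergence between the two views' predictions as the inconsistency measure, which is an assumption; the paper's exact criterion may differ:

```python
import numpy as np

def select_queries(p_visual, p_text, watch_freq, budget):
    """p_visual, p_text: (N, C) class probabilities from the two views;
    watch_freq: (N,) normalized watching frequency. Returns indices of
    the `budget` videos whose metadata should be annotated next."""
    eps = 1e-12
    kl = lambda p, q: np.sum(p * np.log((p + eps) / (q + eps)), axis=1)
    inconsistency = kl(p_visual, p_text) + kl(p_text, p_visual)
    utility = inconsistency * watch_freq   # disagree often AND watched often
    return np.argsort(-utility)[:budget]
```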

2. Lin, Xinyu, Vlado Kitanovski, Qianni Zhang, and Ebroul Izquierdo. "Enhanced multi-view dancing videos synchronisation." In 2012 13th International Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS). IEEE, 2012. http://dx.doi.org/10.1109/wiamis.2012.6226773.

3. Wang, Xueting. "Viewing support system for multi-view videos." In ICMI '16: International Conference on Multimodal Interaction. New York, NY, USA: ACM, 2016. http://dx.doi.org/10.1145/2993148.2997613.

4. Lee, Ji-Tang, De-Nian Yang, and Wanjiun Liao. "Efficient Caching for Multi-View 3D Videos." In GLOBECOM 2016 - 2016 IEEE Global Communications Conference. IEEE, 2016. http://dx.doi.org/10.1109/glocom.2016.7841773.

5. Shuai, Qing, Chen Geng, Qi Fang, Sida Peng, Wenhao Shen, Xiaowei Zhou, and Hujun Bao. "Novel View Synthesis of Human Interactions from Sparse Multi-view Videos." In SIGGRAPH '22: Special Interest Group on Computer Graphics and Interactive Techniques Conference. New York, NY, USA: ACM, 2022. http://dx.doi.org/10.1145/3528233.3530704.

6. Panda, Rameswar, Abir Das, and Amit K. Roy-Chowdhury. "Embedded sparse coding for summarizing multi-view videos." In 2016 IEEE International Conference on Image Processing (ICIP). IEEE, 2016. http://dx.doi.org/10.1109/icip.2016.7532345.

7. Shimizu, Tomohiro, Kei Oishi, Hideo Saito, Hiroki Kajita, and Yoshifumi Takatsume. "Automatic Viewpoint Switching for Multi-view Surgical Videos." In 2019 IEEE International Symposium on Mixed and Augmented Reality Adjunct (ISMAR-Adjunct). IEEE, 2019. http://dx.doi.org/10.1109/ismar-adjunct.2019.00037.

8. Kuntintara, Wichukorn, Kanokphan Lertniphonphan, and Punnarai Siricharoen. "Multi-class Vehicle Counting System for Multi-view Traffic Videos." In 2022 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC). IEEE, 2022. http://dx.doi.org/10.23919/apsipaasc55919.2022.9980202.

9. Wei, Chen-Hao, Chen-Kuo Chiang, and Shang-Hong Lai. "Iterative depth recovery for multi-view video synthesis from stereo videos." In 2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA). IEEE, 2014. http://dx.doi.org/10.1109/apsipa.2014.7041695.

10. Davila, Daniel, Dawei Du, Bryon Lewis, Christopher Funk, Joseph Van Pelt, Roderic Collins, Kellie Corona, et al. "MEVID: Multi-view Extended Videos with Identities for Video Person Re-Identification." In 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV). IEEE, 2023. http://dx.doi.org/10.1109/wacv56688.2023.00168.


Reports on the topic "MULTI VIEW VIDEOS"

1. Tao, Yang, Amos Mizrach, Victor Alchanatis, Nachshon Shamir, and Tom Porter. Automated imaging broiler chick sexing for gender-specific and efficient production. United States Department of Agriculture, December 2014. http://dx.doi.org/10.32747/2014.7594391.bard.

Abstract:
Extending the previous two years of research results (Mizrach et al., 2012; Tao, 2011, 2012), the third year's efforts in both Maryland and Israel were directed towards the engineering of the system. The activities included robust chick handling and conveyor system development, optical system improvement, online dynamic motion imaging of chicks, multi-image-sequence optimal feather extraction and detection, and pattern recognition. Mechanical System Engineering: The third model of the mechanical chick handling system with a high-speed imaging system was built, as shown in Fig. 1. This system has improved chick holding cups and motion mechanisms that enable chicks to open their wings through the view section. The mechanical system has achieved a speed of 4 chicks per second, which exceeds the design spec of 3 chicks per second. In the center of the conveyor, a high-speed camera with a UV-sensitive optical system (Fig. 2) was installed, which captures chick images at multiple frames (45 images, system selectable) as the chick passes through the view area. Through intensive discussions and efforts, the PIs of Maryland and ARO created a joint hardware and software protocol that uses sequential images of the chick in its fall motion to capture the opening wings and extract the optimal opening positions. This approach enables reliable feather feature extraction in dynamic motion and pattern recognition. Improving Chick Wing Deployment: The mechanical system for chick conveying, and especially the section that causes chicks to deploy their wings wide open under the fast video camera and the UV light, was investigated during the third study year. As a natural behavior, chicks tend to deploy their wings as a means of balancing their body when a sudden change in vertical movement is applied. In the previous two years, this was achieved by causing the chicks to move in free fall, under earth gravity (g), along a short vertical distance. The chicks always tended to deploy their wings, but not always in a wide, horizontally open position; such a position is required in order to get a successful image under the video camera. Besides, the cells with chicks bumped suddenly at the end of the free-fall path, which caused the chicks' legs to collapse inside the cells and the image of the wings to become blurred. To improve the movement and prevent the chicks' legs from collapsing, a slowing-down mechanism was designed and tested. This was done by installing a plastic block, printed with a predesigned variable slope (Fig. 3), at the end of the path of the falling cells (Fig. 4). The cells move down at a variable velocity according to the block slope and reach zero velocity at the end of the path. The slope was designed so that the deceleration becomes 0.8g instead of the free-fall gravity (g) present without the block. The tests showed better deployment and wider wing opening, as well as better balance along the movement. Design of additional block slopes is under investigation: slopes that create decelerations of 0.7g and 0.9g, as well as variable decelerations, are being designed to improve the movement path and images.
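
The reported 0.8g deceleration fixes the ramp length at h/0.8 for a free-fall drop of height h; a quick check with an assumed drop height:

```python
# Back-of-the-envelope check of the deceleration ramp. The drop height
# h is an assumed value; the report does not state it.
g = 9.81                       # m/s^2
h = 0.10                       # m, assumed free-fall distance of the cells
a = 0.8 * g                    # target deceleration from the abstract
v = (2 * g * h) ** 0.5         # entry velocity after the free fall
d = v ** 2 / (2 * a)           # ramp length to reach zero velocity (= h/0.8)
print(f"entry velocity {v:.2f} m/s, ramp length {d:.3f} m")
```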

2. Anderson, Gerald L., and Kalman Peleg. Precision Cropping by Remotely Sensed Prototype Plots and Calibration in the Complex Domain. United States Department of Agriculture, December 2002. http://dx.doi.org/10.32747/2002.7585193.bard.

Abstract:
This research report describes a methodology whereby multi-spectral and hyperspectral imagery from remote sensing is used for deriving predicted field maps of selected plant growth attributes which are required for precision cropping. A major task in precision cropping is to establish areas of the field that differ from the rest of the field and share a common characteristic. Yield distribution maps can be prepared by yield monitors, which are available for some harvester types. Other field attributes of interest in precision cropping, e.g. soil properties, leaf nitrate, biomass etc., are obtained by manual sampling of the field in a grid pattern. Maps of various field attributes are then prepared from these samples by the "Inverse Distance" interpolation method or by Kriging. An improved interpolation method was developed which is based on minimizing the overall curvature of the resulting map. Such maps are the ground truth reference, used for training the algorithm that generates the predicted field maps from remote sensing imagery. Both the reference and the predicted maps are stratified into "Prototype Plots", e.g. 15x15 blocks of 2 m pixels whereby the block size is 30x30 m. This averaging reduces the datasets to manageable size and significantly improves the typically poor repeatability of remote sensing imaging systems. In the first two years of the project we used the Normalized Difference Vegetation Index (NDVI) for generating predicted yield maps of sugar beets and corn. The NDVI was computed from image cubes of three spectral bands, generated by an optically filtered three-camera video imaging system. A two-dimensional FFT-based regression model Y=f(X) was used, wherein Y was the reference map and X=NDVI was the predictor. The FFT regression method applies the "Wavelet Based", "Pixel Block" and "Image Rotation" transforms to the reference and remote images, prior to the Fast Fourier Transform (FFT) regression method with the "Phase Lock" option. A complex-domain map Yfft is derived by least-squares minimization between the amplitude matrices of X and Y, via the 2D FFT. For one-time predictions, the phase matrix of Y is combined with the amplitude matrix of Yfft, whereby an improved predicted map Yplock is formed. Usually, the residuals of Yplock versus Y are about half of the values of Yfft versus Y. For long-term predictions, the phase matrix of a "field mask" is combined with the amplitude matrices of the reference image Y and the predicted image Yfft. The field mask is a binary image of a pre-selected region of interest in X and Y. The resultant maps Ypref and Ypred are modified versions of Y and Yfft respectively. The residuals of Ypred versus Ypref are even lower than the residuals of Yplock versus Y. The maps Ypref and Ypred represent a close consensus of two independent imaging methods which "view" the same target. In the last two years of the project our remote sensing capability was expanded by the addition of a CASI II airborne hyperspectral imaging system and an ASD hyperspectral radiometer. Unfortunately, the cross-noise and poor repeatability problem we had in multi-spectral imaging was exacerbated in hyperspectral imaging. We have been able to overcome this problem by over-flying each field twice in rapid succession and developing the Repeatability Index (RI). The RI quantifies the repeatability of each spectral band in the hyperspectral image cube. Thereby, it is possible to select the bands of higher repeatability for inclusion in the prediction model while bands of low repeatability are excluded. Further segregation of high and low repeatability bands takes place in the prediction model algorithm, which is based on a combination of a "Genetic Algorithm" and "Partial Least Squares" (PLS-GA). In summary, a modus operandi was developed for deriving important plant growth attribute maps (yield, leaf nitrate, biomass and sugar percent in beets) from remote sensing imagery, with sufficient accuracy for precision cropping applications. This achievement is remarkable, given the inherently high cross-noise between the reference and remote imagery as well as the highly non-repeatable nature of remote sensing systems. The above methodologies may be readily adopted by commercial companies which specialize in providing remotely sensed data to farmers.
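
The NDVI computation and the "prototype plot" stratification are standard operations; a short numpy sketch with an assumed 15-pixel block size:

```python
import numpy as np

def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and red bands."""
    return (nir - red) / (nir + red + 1e-12)

def prototype_plots(field_map, block=15):
    """Stratify a field map into prototype plots by averaging
    block x block pixel tiles (e.g., 15 x 15 blocks of 2 m pixels)."""
    h, w = field_map.shape
    h, w = h - h % block, w - w % block        # crop to whole blocks
    tiles = field_map[:h, :w].reshape(h // block, block, w // block, block)
    return tiles.mean(axis=(1, 3))
```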