Selected scientific literature on the topic "Semantic video coding"

Cite a source in APA, MLA, Chicago, Harvard, and many other citation styles


Browse the list of current articles, books, theses, conference proceedings, and other scientific sources relevant to the topic "Semantic video coding".

Next to every source in the reference list there is an "Add to bibliography" button. Click it, and we will automatically generate the bibliographic citation of the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the scientific publication as a .pdf and read its abstract online, when one is available in the metadata.

Journal articles on the topic "Semantic video coding":

1. Essel, Daniel Danso, Ben-Bright Benuwa, and Benjamin Ghansah. "Video Semantic Analysis". International Journal of Computer Vision and Image Processing 11, no. 2 (April 2021): 1–21. http://dx.doi.org/10.4018/ijcvip.2021040101.

Abstract:
Sparse Representation (SR) and Dictionary Learning (DL) based classifiers have shown promising results in classification tasks, with impressive recognition rates on image data. In Video Semantic Analysis (VSA), however, the local structure of video data contains significant discriminative information required for classification. To the best of our knowledge, this has not been fully explored by recent DL-based approaches. Further, video features belonging to the same video category do not yield similar coding results. Based on the foregoing, a novel learning algorithm, Sparsity based Locality-Sensitive Discriminative Dictionary Learning (SLSDDL) for VSA, is proposed in this paper. In the proposed algorithm, a discriminant loss function for the category, based on sparse coding of the sparse coefficients, is introduced into the structure of the Locality-Sensitive Dictionary Learning (LSDL) algorithm. Finally, the sparse coefficients for the testing video feature sample are solved by the optimized method of SLSDDL, and the video semantic classification result is obtained by minimizing the error between the original and reconstructed samples. The experimental results show that the proposed SLSDDL significantly improves the performance of video semantic detection compared with state-of-the-art approaches. The proposed approach also shows robustness to diverse video environments, proving the universality of the novel approach.
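
The classification rule this abstract describes (code a test sample over a learned dictionary, then assign the class whose dictionary reconstructs it with the smallest error) can be illustrated in a few lines of numpy. This is a generic sketch, not SLSDDL itself: the locality-sensitive and discriminative terms are omitted, and omp, classify, and the toy per-class dictionaries are illustrative names.

```python
import numpy as np

def omp(D, x, n_nonzero=5):
    """Orthogonal Matching Pursuit: find a sparse code a with x ~= D @ a.
    D has unit-norm columns (atoms); n_nonzero caps the sparsity level."""
    residual, support = x.copy(), []
    a = np.zeros(D.shape[1])
    for _ in range(n_nonzero):
        # Greedily pick the atom most correlated with the current residual.
        idx = int(np.argmax(np.abs(D.T @ residual)))
        if idx not in support:
            support.append(idx)
        # Re-fit the coefficients on the chosen support by least squares.
        coef, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ coef
    a[support] = coef
    return a

def classify(dictionaries, x, n_nonzero=5):
    """Assign x to the class whose dictionary reconstructs it best."""
    err = {c: np.linalg.norm(x - D @ omp(D, x, n_nonzero))
           for c, D in dictionaries.items()}
    return min(err, key=err.get)

# Toy usage: two random class dictionaries with unit-norm atoms.
rng = np.random.default_rng(0)
dicts = {c: rng.normal(size=(64, 128)) for c in ("action", "dialogue")}
dicts = {c: D / np.linalg.norm(D, axis=0) for c, D in dicts.items()}
x = dicts["action"][:, 3]      # a sample lying in the "action" span
print(classify(dicts, x))      # -> "action"
```
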
2. Chen, Sovann, Supavadee Aramvith, and Yoshikazu Miyanaga. "Learning-Based Rate Control for High Efficiency Video Coding". Sensors 23, no. 7 (March 30, 2023): 3607. http://dx.doi.org/10.3390/s23073607.

Abstract:
High efficiency video coding (HEVC) has dramatically enhanced coding efficiency compared to the previous video coding standard, H.264/AVC. However, the existing rate control updates its parameters according to a fixed initialization, which can cause errors in the prediction of bit allocation to each coding tree unit (CTU) in a frame. This paper proposes a learning-based mapping between rate control parameters and video content to achieve an accurate target bit rate and good video quality. The proposed framework contains two main structural coding components: spatial and temporal coding. We apply an effective learning-based particle swarm optimization to spatial and temporal coding to determine the optimal parameters at the CTU level. For temporal coding at the picture level, we introduce semantic residual information into the parameter updating process to regulate the bits correctly for the actual picture. Experimental results indicate that the proposed algorithm is effective for HEVC and outperforms the state-of-the-art rate control in the HEVC reference software (HM-16.10) by 0.19 dB on average and up to 0.41 dB for the low-delay P coding structure.
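
As a rough picture of the particle swarm machinery the abstract leans on, here is a minimal, self-contained PSO loop applied to a toy rate-distortion objective. The rd_cost model, its parameters, and the search bounds are invented stand-ins; the paper's actual CTU-level objective and its coupling to the HM rate control are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

def pso(objective, dim, lo, hi, n_particles=20, iters=100,
        w=0.7, c1=1.5, c2=1.5):
    """Plain particle swarm optimization over a box-constrained space."""
    x = rng.uniform(lo, hi, (n_particles, dim))   # particle positions
    v = np.zeros_like(x)                          # particle velocities
    pbest, pbest_f = x.copy(), np.apply_along_axis(objective, 1, x)
    g = pbest[np.argmin(pbest_f)].copy()          # global best position
    for _ in range(iters):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = np.clip(x + v, lo, hi)
        f = np.apply_along_axis(objective, 1, x)
        improved = f < pbest_f
        pbest[improved], pbest_f[improved] = x[improved], f[improved]
        g = pbest[np.argmin(pbest_f)].copy()
    return g

# Toy CTU-level objective: hit a bit budget while keeping a distortion
# proxy low. alpha and beta stand in for rate-model parameters.
TARGET_BITS = 1000.0

def rd_cost(params):
    alpha, beta = params
    bits = 800.0 + 200.0 * alpha       # invented bit-production model
    distortion = 1.0 / (0.1 + beta)    # invented distortion proxy
    return abs(bits - TARGET_BITS) / TARGET_BITS + 0.5 * distortion

best_params = pso(rd_cost, dim=2, lo=0.0, hi=5.0)
```
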
3. Antoszczyszyn, P. M., J. M. Hannah, and P. M. Grant. "Reliable tracking of facial features in semantic-based video coding". IEE Proceedings - Vision, Image, and Signal Processing 145, no. 4 (1998): 257. http://dx.doi.org/10.1049/ip-vis:19982153.

4. NOMURA, Yoshihiko, Ryutaro MATSUDA, Ryota Sakamoto, Tokuhiro SUGIURA, Hirokazu Matsui, and Norihiko KATO. "2301 Low Bit-Rate Semantic Coding Technology for Lecture Video". Proceedings of the JSME annual meeting 2005.7 (2005): 89–90. http://dx.doi.org/10.1299/jsmemecjo.2005.7.0_89.

5. Benuwa, Ben-Bright, Yongzhao Zhan, Benjamin Ghansah, Ernest K. Ansah, and Andriana Sarkodie. "Sparsity Based Locality-Sensitive Discriminative Dictionary Learning for Video Semantic Analysis". Mathematical Problems in Engineering 2018 (August 5, 2018): 1–11. http://dx.doi.org/10.1155/2018/9312563.

Abstract:
Dictionary learning (DL) and sparse representation (SR) based classifiers have greatly improved classification performance and achieve good recognition rates on image data. In video semantic analysis (VSA), the local structure of video data contains vital discriminative information needed for classification. However, this has not been fully exploited by current DL-based approaches. Besides, video features belonging to the same video category do not yield similar coding results. Based on the issues stated above, a novel learning algorithm, called sparsity based locality-sensitive discriminative dictionary learning (SLSDDL) for VSA, is proposed in this paper. In the proposed algorithm, a discriminant loss function for the category, based on sparse coding of the sparse coefficients, is introduced into the structure of the locality-sensitive dictionary learning (LSDL) algorithm. Finally, the sparse coefficients for the testing video feature sample are solved by the optimized method of SLSDDL, and the video semantic classification result is obtained by minimizing the error between the original and reconstructed samples. The experimental results show that the proposed SLSDDL significantly improves the performance of video semantic detection compared with state-of-the-art approaches. Moreover, robustness to diverse video environments is also demonstrated, proving the universality of the novel approach.
6. Pimentel-Niño, M. A., Paresh Saxena, and M. A. Vazquez-Castro. "Reliable Adaptive Video Streaming Driven by Perceptual Semantics for Situational Awareness". Scientific World Journal 2015 (2015): 1–16. http://dx.doi.org/10.1155/2015/394956.

Abstract:
A novel cross-layer optimized video adaptation driven by perceptual semantics is presented. The design target is live streamed video that enhances situational awareness in challenging communications conditions. Conventional solutions for recreational applications are inadequate, so a novel quality of experience (QoE) framework is proposed that allows fully controlled adaptation and enables perceptual semantic feedback. The framework relies on temporal/spatial abstraction for video applications serving beyond recreational purposes. An underlying cross-layer optimization technique takes into account feedback on network congestion (time) and erasures (space) to best distribute the available (scarce) bandwidth. Systematic random linear network coding (SRNC) adds reliability while preserving perceptual semantics. Objective metrics of the perceptual features in QoE show homogeneous high performance when using the proposed scheme. Finally, the proposed scheme is in line with content-aware trends, complying with the information-centric networking philosophy and architecture.
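
The SRNC component mentioned above can be pictured with a minimal systematic random linear network coding encoder. For brevity this sketch works over GF(2), so repair packets are XORs of random subsets of the source packets; practical SRNC typically operates over GF(2^8), and srnc_encode and its packet layout are illustrative rather than the paper's implementation.

```python
import os
import random

def srnc_encode(source_packets, n_repair, seed=0):
    """Systematic RLNC over GF(2): emit the k source packets unmodified
    (the systematic part), then n_repair packets, each the XOR of a
    random subset of the sources. Every output carries its coefficient
    vector so a receiver can decode by Gaussian elimination."""
    rnd = random.Random(seed)
    k = len(source_packets)
    coded = [([int(i == j) for j in range(k)], pkt)
             for i, pkt in enumerate(source_packets)]
    for _ in range(n_repair):
        coeffs = [rnd.randint(0, 1) for _ in range(k)]
        if not any(coeffs):
            coeffs[rnd.randrange(k)] = 1  # avoid the useless all-zero combo
        mixed = bytes(len(source_packets[0]))
        for c, pkt in zip(coeffs, source_packets):
            if c:
                mixed = bytes(a ^ b for a, b in zip(mixed, pkt))
        coded.append((coeffs, mixed))
    return coded

# Four 16-byte source packets plus two repair packets.
packets = [os.urandom(16) for _ in range(4)]
stream = srnc_encode(packets, n_repair=2)
```
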
7. Guo, Jia, Xiangyang Gong, Wendong Wang, Xirong Que, and Jingyu Liu. "SASRT: Semantic-Aware Super-Resolution Transmission for Adaptive Video Streaming over Wireless Multimedia Sensor Networks". Sensors 19, no. 14 (July 15, 2019): 3121. http://dx.doi.org/10.3390/s19143121.

Abstract:
Network resources are scarce in wireless multimedia sensor networks (WMSNs). Compressing media data can reduce the dependence of the user's Quality of Experience (QoE) on network resources. Existing video coding standards, such as H.264 and H.265, focus only on spatial and short-term information redundancy, yet video usually also contains redundancy over long periods of time. Compressing this long-term redundancy without compromising the user experience, while delivering adaptively, is therefore a challenge in WMSNs. In this paper, a semantic-aware super-resolution transmission system for adaptive video streaming (SASRT) in WMSNs is presented. In SASRT, deep learning algorithms are used to extract video semantic information and enrich video quality. On the multimedia sensor, semantic information and video data are encoded at different bit rates and uploaded to the user. Semantic information can also be identified on the user side, further reducing the amount of data that needs to be transferred, although this may increase the user side's computational cost. On the user side, video quality is enriched with super-resolution technologies. The major challenges faced by SASRT include where the semantic information is identified, how to choose the bit rates of the semantic and video information, and how network resources should be allocated to video and semantic information. The optimization problem is formulated as a complexity-constrained nonlinear NP-hard problem, and three adaptive strategies and a heuristic algorithm are proposed to solve it. Simulation results demonstrate that SASRT can effectively compress long-term video redundancy and enrich the user experience with limited network resources while simultaneously improving the utilization of those resources.
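
The joint rate-selection problem formulated in the abstract can be pictured with a toy brute-force allocator over small bit-rate ladders. Everything below (the ladders, the bandwidth budget, and the concave utility) is an invented stand-in for the paper's NP-hard optimization and its heuristics.

```python
def allocate_bitrates(bandwidth, video_ladder, semantic_ladder, utility):
    """Brute-force the joint choice of video and semantic-stream bit
    rates under a shared bandwidth budget; utility(v, s) scores a
    combination. A toy stand-in for the paper's formulation."""
    best, best_u = None, float("-inf")
    for v in video_ladder:
        for s in semantic_ladder:
            if v + s <= bandwidth:
                u = utility(v, s)
                if u > best_u:
                    best, best_u = (v, s), u
    return best

# Invented ladders (kbps) and a toy concave utility favoring balance.
choice = allocate_bitrates(
    bandwidth=3000,
    video_ladder=[500, 1000, 2000, 2800],
    semantic_ladder=[50, 200, 400],
    utility=lambda v, s: v ** 0.5 + 2 * s ** 0.5,
)
```
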
8. Stivaktakis, Radamanthys, Grigorios Tsagkatakis, and Panagiotis Tsakalides. "Semantic Predictive Coding with Arbitrated Generative Adversarial Networks". Machine Learning and Knowledge Extraction 2, no. 3 (August 25, 2020): 307–26. http://dx.doi.org/10.3390/make2030017.

Abstract:
In spatio-temporal predictive coding problems, like next-frame prediction in video, determining the content of plausible future frames is primarily based on the image dynamics of previous frames. We establish an alternative approach based on their underlying semantic information when considering data that do not necessarily incorporate a temporal aspect, but instead comply with some form of associative ordering. In this work, we introduce the notion of semantic predictive coding by proposing a novel generative adversarial modeling framework which incorporates the arbiter classifier as a new component. While the generator is primarily tasked with the anticipation of possible next frames, the arbiter's principal role is the assessment of their credibility. Taking into account that the denotative meaning of each forthcoming element can be encapsulated in a generic label descriptive of its content, a classification loss is introduced along with the adversarial loss. As supported by our experimental findings in a next-digit and a next-letter scenario, the utilization of the arbiter not only results in enhanced GAN performance, but it also broadens the network's creative capabilities in terms of the diversity of the generated symbols.
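
A hedged PyTorch sketch of the training signal described here: the generator is updated with an adversarial loss plus a classification loss computed by the arbiter on the generated frame. gen, disc, and arbiter are assumed user-supplied modules (the discriminator emitting one logit per sample), and the weight lam is a guess; the paper's exact losses and scheduling may differ.

```python
import torch
import torch.nn.functional as F

def generator_step(gen, disc, arbiter, context, target_labels, opt_g, lam=1.0):
    """One generator update with the combined objective described above:
    an adversarial realism term plus an arbiter classification term that
    ties the generated 'next frame' to its expected content label."""
    opt_g.zero_grad()
    fake = gen(context)
    # Non-saturating adversarial loss: try to make the discriminator
    # (assumed to emit one logit per sample) label the fake as real.
    real_tgt = torch.ones(fake.size(0), 1, device=fake.device)
    adv = F.binary_cross_entropy_with_logits(disc(fake), real_tgt)
    # Arbiter loss: the generated frame should be classified as the
    # label expected for the next symbol in the sequence.
    cls = F.cross_entropy(arbiter(fake), target_labels)
    loss = adv + lam * cls  # lam is an assumed weighting, not the paper's
    loss.backward()
    opt_g.step()
    return loss.item()
```
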
9. Herranz, Luis. "Integrating semantic analysis and scalable video coding for efficient content-based adaptation". Multimedia Systems 13, no. 2 (June 30, 2007): 103–18. http://dx.doi.org/10.1007/s00530-007-0090-0.

10. Motlicek, Petr, Stefan Duffner, Danil Korchagin, Hervé Bourlard, Carl Scheffler, Jean-Marc Odobez, Giovanni Del Galdo, Markus Kallinger, and Oliver Thiergart. "Real-Time Audio-Visual Analysis for Multiperson Videoconferencing". Advances in Multimedia 2013 (2013): 1–21. http://dx.doi.org/10.1155/2013/175745.

Abstract:
We describe the design of a system consisting of several state-of-the-art real-time audio and video processing components enabling multimodal stream manipulation (e.g., automatic online editing for multiparty videoconferencing applications) in open, unconstrained environments. The underlying algorithms are designed to allow multiple people to enter, interact, and leave the observable scene with no constraints. They comprise continuous localisation of audio objects and its application to spatial audio object coding; detection and tracking of faces; estimation of head poses and visual focus of attention; detection and localisation of verbal and paralinguistic events; and the association and fusion of these different events. Taken together, they represent multimodal streams with audio objects and semantic video objects and provide semantic information for stream manipulation systems (like a virtual director). Various experiments have been performed to evaluate the performance of the system. The obtained results demonstrate the effectiveness of the proposed design, the various algorithms, and the benefit of fusing different modalities in this scenario.

Theses on the topic "Semantic video coding":

1. Al-Qayedi, Ali. "Internet video-conferencing using model-based image coding with agent technology". Thesis, University of Essex, 1999. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.298836.

2. Mitrica, Iulia. "Video compression of airplane cockpit screens content". Electronic thesis or dissertation, Institut polytechnique de Paris, 2021. http://www.theses.fr/2021IPPAT042.

Abstract:
This thesis addresses the problem of encoding the video of airplane cockpits. The cockpit of modern airliners consists of one or more screens displaying the status of the plane instruments (e.g., the plane location as reported by the GPS, the fuel level as read by the sensors in the tanks, etc.), often superimposed over natural images (e.g., navigation maps, outdoor cameras, etc.). Plane sensors are usually inaccessible for security reasons, so recording the cockpit is often the only way to log vital plane data in the event of, e.g., an accident. Constraints on the recording storage available on board require the cockpit video to be coded at low to very low bitrates, whereas safety reasons require the textual information to remain intelligible after decoding. In addition, constraints on the power envelope of avionic devices limit the complexity of the cockpit recording subsystem.

Over the years, a number of schemes for coding images or videos with mixed computer-generated and natural contents have been proposed. Text and other computer-generated graphics yield high-frequency components in the transform domain, so the loss due to compression may hinder the readability of the video and thus its usefulness. For example, the recently standardized Screen Content Coding (SCC) extension of the H.265/HEVC standard includes tools designed explicitly for screen content compression. Our experiments show, however, that artifacts persist at the low bitrates targeted by our application, prompting schemes where the video is not encoded in the pixel domain.

This thesis proposes methods for low-complexity screen coding where text and graphical primitives are encoded in terms of their semantics rather than as blocks of pixels. At the encoder side, characters are detected and read using a convolutional neural network. Detected characters are then removed from the screen via pixel inpainting, yielding a smoother residual video with fewer high frequencies. The residual video is encoded with a standard video codec and is transmitted to the receiver side together with the text and graphics semantics as side information. At the decoder side, text and graphics are synthesized using the decoded semantics and superimposed over the residual video, eventually recovering the original frame. Our experiments show that an AVC/H.264 encoder retrofitted with our method has better rate-distortion performance than H.265/HEVC and approaches that of its SCC extension. If the complexity constraints allow inter-frame prediction, we also exploit the fact that co-located characters in neighboring frames are strongly correlated: misclassified symbols are recovered using a proposed method based on a low-complexity model of transition probabilities for characters and graphics. Concerning character recognition, the error rate drops by up to 18 times in the easiest cases and at least 1.5 times in the most difficult sequences, despite complex occlusions. By exploiting temporal redundancy, our scheme further improves in rate-distortion terms and enables quasi-errorless character decoding. Experiments with real cockpit video footage show large rate-distortion gains for the proposed method with respect to video compression standards.
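
A schematic of the encoder/decoder split the abstract describes, with every helper (ocr, inpaint, video_encoder, video_decoder, draw_text) left as a hypothetical callable. The point is the data flow: semantics travel as side information while only the inpainted residual goes through a standard codec.

```python
from dataclasses import dataclass

@dataclass
class TextItem:
    text: str   # recognized character string
    x: int      # screen position of the string
    y: int

def encode_frame(frame, ocr, inpaint, video_encoder):
    """Encoder side: read text off the screen, erase it, and compress
    the smooth remainder. ocr, inpaint, and video_encoder are
    hypothetical callables standing in for the thesis components."""
    items = ocr(frame)                    # CNN-based detection + reading
    residual = inpaint(frame, items)      # fewer high frequencies remain
    bitstream = video_encoder(residual)   # e.g., a standard AVC encode
    side_info = [(it.text, it.x, it.y) for it in items]
    return bitstream, side_info

def decode_frame(bitstream, side_info, video_decoder, draw_text):
    """Decoder side: decode the residual, then re-synthesize the text
    from the transmitted semantics to recover the original frame."""
    frame = video_decoder(bitstream)
    for text, x, y in side_info:
        frame = draw_text(frame, text, x, y)
    return frame
```
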

Book chapters on the topic "Semantic video coding":

1. Mezaris, Vasileios, Nikolaos Thomos, Nikolaos V. Boulgouris, and Ioannis Kompatsiaris. "Knowledge-Assisted Analysis of Video for Content-Adaptive Coding and Transmission". In Advances in Semantic Media Adaptation and Personalization, 221–40. Berlin, Heidelberg: Springer Berlin Heidelberg, 2008. http://dx.doi.org/10.1007/978-3-540-76361_11.

2. Lin, Yu-Tzu, and Chia-Hu Chang. "User-aware Video Coding Based on Semantic Video Understanding and Enhancing". In Recent Advances on Video Coding. InTech, 2011. http://dx.doi.org/10.5772/16498.

3. Thomas-Kerr, Joseph, Ian Burnett, and Christian Ritz. "What Are You Trying to Say? Format-Independent Semantic-Aware Streaming and Delivery". In Recent Advances on Video Coding. InTech, 2011. http://dx.doi.org/10.5772/16763.

4. Cavallaro, Andrea, and Stefan Winkler. "Perceptual Semantics". In Multimedia Technologies, 1441–55. IGI Global, 2008. http://dx.doi.org/10.4018/978-1-59904-953-3.ch105.

Abstract:
The design of image and video compression or transmission systems is driven by the need for reducing the bandwidth and storage requirements of the content while maintaining its visual quality. Therefore, the objective is to define codecs that maximize perceived quality as well as automated metrics that reliably measure perceived quality. One of the common shortcomings of traditional video coders and quality metrics is the fact that they treat the entire scene uniformly, assuming that people look at every pixel of the image or video. In reality, we focus only on particular areas of the scene. In this chapter, we prioritize the visual data accordingly in order to improve the compression performance of video coders and the prediction performance of perceptual quality metrics. The proposed encoder and quality metric incorporate visual attention and use a semantic segmentation stage, which takes into account certain aspects of the cognitive behavior of people when watching a video. This semantic model corresponds to a specific human abstraction, which need not necessarily be characterized by perceptual uniformity. In particular, we concentrate on segmenting moving objects and faces, and we evaluate the perceptual impact on video coding and on quality evaluation.
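
The chapter's core idea, weighting quality assessment by where people actually look, can be sketched as a semantically weighted PSNR; the 4:1 region weighting below is an arbitrary illustrative choice, not the authors' calibration.

```python
import numpy as np

def semantic_weighted_psnr(ref, dist, mask, w_roi=4.0, w_bg=1.0, peak=255.0):
    """PSNR with per-pixel semantic weights: pixels inside the mask
    (e.g., faces, moving objects) count more than the background."""
    ref = ref.astype(np.float64)
    dist = dist.astype(np.float64)
    w = np.where(mask, w_roi, w_bg)       # per-pixel weight map
    wmse = float(np.sum(w * (ref - dist) ** 2) / np.sum(w))
    return float("inf") if wmse == 0 else 10.0 * np.log10(peak ** 2 / wmse)
```
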
5. Cavallaro, Andrea, and Stefan Winkler. "Perceptual Semantics". In Digital Multimedia Perception and Design, 1–20. IGI Global, 2006. http://dx.doi.org/10.4018/978-1-59140-860-4.ch001.

Abstract:
Identical to the abstract of the Multimedia Technologies version of this chapter, above.

Conference papers on the topic "Semantic video coding":

1. Décombas, M., F. Capman, E. Renan, F. Dufaux, and B. Pesquet-Popescu. "Seam carving for semantic video coding". In SPIE Optical Engineering + Applications, edited by Andrew G. Tescher. SPIE, 2011. http://dx.doi.org/10.1117/12.895317.

2. Silva, Michel M., Mario F. M. Campos, and Erickson R. Nascimento. "Semantic Hyperlapse: a Sparse Coding-based and Multi-Importance Approach for First-Person Videos". In XXXII Conference on Graphics, Patterns and Images. Sociedade Brasileira de Computação - SBC, 2019. http://dx.doi.org/10.5753/sibgrapi.est.2019.8302.

Abstract:
The availability of low-cost, high-quality personal wearable cameras, combined with the unlimited storage capacity of video-sharing websites, has evoked a growing interest in First-Person Videos (FPVs). Such videos are usually composed of long-running unedited streams captured by a device attached to the user's body, which makes them tedious and visually unpleasant to watch, so there is a growing need to provide quick access to the information therein. To address this need, efforts have been applied to the development of techniques such as Hyperlapse and Semantic Hyperlapse, which aim to create visually pleasant shorter videos and to emphasize the semantic portions of the video, respectively. SSFF, the state-of-the-art Semantic Hyperlapse method, neglects the level of importance of the relevant information, evaluating only whether it is significant or not. Other limitations of SSFF are the number of input parameters, the scalability in the number of visual features used to describe the frames, and the abrupt changes in speed-up rate between consecutive video segments. In this dissertation, we propose a parameter-free Sparse Coding based methodology to adaptively fast-forward First-Person Videos that emphasizes semantic portions through a multi-importance approach. Experimental evaluations show that the proposed method creates shorter videos that retain more semantic information, have fewer abrupt speed-up transitions, and are more stable than the output of SSFF. Visual results and a graphical explanation of the methodology can be viewed at: https://youtu.be/8uStih8P5-Y.
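
As a toy picture of adaptive fast-forwarding, the greedy sampler below plays high-importance spans at near-normal speed and skips aggressively elsewhere. It is a stand-in under invented parameters (min_skip, max_skip, importance scores in [0, 1]), not the dissertation's multi-importance optimization.

```python
def adaptive_fast_forward(scores, min_skip=1, max_skip=8):
    """Select frame indices so that semantically important spans
    (score near 1) play at near-normal speed while unimportant spans
    (score near 0) are skipped aggressively."""
    selected, i = [], 0
    while i < len(scores):
        selected.append(i)
        # High importance -> small skip; low importance -> large skip.
        skip = max_skip - (max_skip - min_skip) * scores[i]
        i += max(min_skip, round(skip))
    return selected

# Example: a semantically important burst in the middle of a clip.
importance = [0.1] * 10 + [0.9] * 10 + [0.1] * 10
kept_frames = adaptive_fast_forward(importance)
```
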
3. Zhu, Chen, Guo Lu, Rong Xie, and Li Song. "Perceptual Video Coding Based on Semantic-Guided Texture Detection and Synthesis". In 2022 Picture Coding Symposium (PCS). IEEE, 2022. http://dx.doi.org/10.1109/pcs56426.2022.10018028.

4. Silva, Michel M., Mario F. M. Campos, and Erickson R. Nascimento. "Semantic Hyperlapse: a Sparse Coding-based and Multi-Importance Approach for First-Person Videos". In Concurso de Teses e Dissertações da SBC. Sociedade Brasileira de Computação - SBC, 2020. http://dx.doi.org/10.5753/ctd.2020.11364.

Abstract:
The availability of low-cost, high-quality wearable cameras, combined with the unlimited storage capacity of video-sharing websites, has evoked a growing interest in First-Person Videos. Such videos are usually composed of long-running unedited streams captured by a device attached to the user's body, which makes them tedious and visually unpleasant to watch and raises the need to provide quick access to the information therein. We propose a Sparse Coding based methodology to adaptively fast-forward First-Person Videos. Experimental evaluations show that the shorter video produced by the proposed method is more stable and retains more semantic information than the state of the art. Visual results and a graphical explanation of the methodology can be viewed at: https://youtu.be/rTEZurH64ME
5. Yang, Jianping, Jie Zhang, and Xiangjun Chen. "Semantic-preload video model based on VOP coding". In 2012 International Conference on Graphic and Image Processing, edited by Zeng Zhu. SPIE, 2013. http://dx.doi.org/10.1117/12.2012827.

6. Galteri, Leonardo, Marco Bertini, Lorenzo Seidenari, Tiberio Uricchio, and Alberto Del Bimbo. "Increasing Video Perceptual Quality with GANs and Semantic Coding". In MM '20: The 28th ACM International Conference on Multimedia. New York, NY, USA: ACM, 2020. http://dx.doi.org/10.1145/3394171.3413508.

7. Xie, Guangqi, Xin Li, Shiqi Lin, Zhibo Chen, Li Zhang, Kai Zhang, and Yue Li. "Hierarchical Reinforcement Learning Based Video Semantic Coding for Segmentation". In 2022 IEEE International Conference on Visual Communications and Image Processing (VCIP). IEEE, 2022. http://dx.doi.org/10.1109/vcip56404.2022.10008806.

8. Hu, Yujie, Youmin Xu, Jianhui Chang, and Jian Zhang. "Semantic Neural Rendering-based Video Coding: Towards Ultra-Low Bitrate Video Conferencing". In 2022 Data Compression Conference (DCC). IEEE, 2022. http://dx.doi.org/10.1109/dcc52660.2022.00067.

9. Zhang Liang, Wen Xiangming, Wang Bo, and Zheng Wei. "A concept-based approach to video semantic analysis and coding". In 2010 2nd International Conference on Information Science and Engineering (ICISE). IEEE, 2010. http://dx.doi.org/10.1109/icise.2010.5689076.

10. Mezaris, Vasileios, Nikolaos Boulgouris, and Ioannis Kompatsiaris. "Knowledge-Assisted Video Analysis for Content-Adaptive Coding and Transmission". In 2006 First International Workshop on Semantic Media Adaptation and Personalization (SMAP'06). IEEE, 2006. http://dx.doi.org/10.1109/smap.2006.22.

