
Journal articles on the topic 'Encoded video stream'

Consult the top 50 journal articles for your research on the topic 'Encoded video stream.'


1

Al-Tamimi, Abdel-Karim, Raj Jain, and Chakchai So-In. "High-Definition Video Streams Analysis, Modeling, and Prediction." Advances in Multimedia 2012 (2012): 1–13. http://dx.doi.org/10.1155/2012/539396.

Abstract:
High-definition video streams' unique statistical characteristics and their high bandwidth requirements are considered to be a challenge in both network scheduling and resource allocation fields. In this paper, we introduce an innovative way to model and predict high-definition (HD) video traces encoded with H.264/AVC encoding standard. Our results are based on our compilation of over 50 HD video traces. We show that our model, simplified seasonal ARIMA (SAM), provides an accurate representation for HD videos, and it provides significant improvements in prediction accuracy. Such accuracy is vital to provide better dynamic resource allocation for video traffic. In addition, we provide a statistical analysis of HD videos, including both factor and cluster analysis to support a better understanding of video stream workload characteristics and their impact on network traffic. We discuss our methodology to collect and encode our collection of HD video traces. Our video collection, results, and tools are available for the research community.
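The SAM idea of combining seasonal differencing (at the GoP period, where large I-frames recur) with a low-order autoregression can be sketched minimally. The GoP length of 12, the AR(1) fit, and the synthetic trace below are illustrative assumptions, not the paper's actual model or data.

```python
import numpy as np

def sam_predict(sizes, season=12):
    # Seasonal differencing at the GoP period removes the periodic
    # I/P/B-frame size pattern; an AR(1) term models what remains.
    x = np.asarray(sizes, dtype=float)
    d = x[season:] - x[:-season]                          # seasonal differencing
    phi = np.dot(d[:-1], d[1:]) / np.dot(d[:-1], d[:-1])  # least-squares AR(1)
    return x[-season] + phi * d[-1]                       # undo the differencing

# Synthetic GoP-periodic trace: a large I-frame every 12 frames.
rng = np.random.default_rng(0)
trace = np.tile([9000] + [1500] * 11, 20) + rng.normal(0, 50, 240)
pred = sam_predict(trace)   # next frame index falls on an I-frame
```

On the synthetic trace the predictor anticipates the large upcoming I-frame rather than the much smaller inter-frame average, which is the property that matters for dynamic resource allocation.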
2

Reljin, Irini, and Branimir Reljin. "Fractal and multifractal analyses of compressed video sequences." Facta universitatis - series: Electronics and Energetics 16, no. 3 (2003): 401–14. http://dx.doi.org/10.2298/fuee0303401r.

Abstract:
The paper considers compressed video streams from the fractal and multifractal (MF) points of view. Video traces in H.263 and MPEG-4 formats, generated at the Technical University Berlin and publicly available, were investigated. It was shown that all compressed videos exhibit a fractal (long-range dependent) nature and that higher compression ratios provoke more variability in the encoded video stream. This conclusion is confirmed by the MF spectra of frame-size video traces. Analysis of individual frames and their MF spectra also confirms the additive nature of the streams.
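Long-range dependence of the kind reported here is commonly summarized by a Hurst exponent H > 0.5. A minimal aggregated-variance estimator (a standard textbook method, not the authors' multifractal machinery) looks like this:

```python
import numpy as np

def hurst_aggvar(x, scales=(1, 2, 4, 8, 16, 32)):
    # For long-range-dependent series the variance of m-aggregated means
    # decays as m**(2H - 2); H follows from the log-log slope.
    x = np.asarray(x, dtype=float)
    log_m, log_v = [], []
    for m in scales:
        n = len(x) // m
        agg = x[: n * m].reshape(n, m).mean(axis=1)
        log_m.append(np.log(m))
        log_v.append(np.log(agg.var()))
    slope = np.polyfit(log_m, log_v, 1)[0]
    return 1.0 + slope / 2.0

rng = np.random.default_rng(1)
h_iid = hurst_aggvar(rng.normal(size=4096))   # short-memory noise: H near 0.5
```

Applied to a frame-size trace instead of i.i.d. noise, an estimate well above 0.5 would indicate the long-range dependency the paper describes.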
3

Grois, Dan, Evgeny Kaminsky, and Ofer Hadar. "Efficient Real-Time Video-in-Video Insertion into a Pre-Encoded Video Stream." ISRN Signal Processing 2011 (February 14, 2011): 1–11. http://dx.doi.org/10.5402/2011/975462.

Abstract:
This work develops and implements an efficient method and system for fast real-time Video-in-Video (ViV) insertion, enabling a video sequence to be inserted efficiently into a predefined location within a pre-encoded video stream. The proposed method and system divide the video insertion process into two steps. The first step (the Video-in-Video Constrained Format (ViVCF) encoder) modifies the conventional H.264/AVC video encoder to support the visual content insertion Constrained Format (CF), including generation of isolated regions without using Flexible Macroblock Ordering (FMO) slicing, and to support fast real-time insertion of overlays. Although the first step is computationally intensive, it needs to be performed only once, even if different overlays have to be inserted (e.g., for different users). The second step, performing the ViV insertion (the ViVCF inserter), is relatively simple (operating mostly in the bit domain) and is performed separately for each overlay. The performance of the presented method and system is demonstrated and compared with the H.264/AVC reference software (JM 12); according to our experimental results, the bit-rate overhead is very low, while there is substantially no degradation in PSNR quality.
4

Stankowski, Jakub, Damian Karwowski, Tomasz Grajek, Krzysztof Wegner, Jakub Siast, Krzysztof Klimaszewski, Olgierd Stankiewicz, and Marek Domański. "Analysis of Compressed Data Stream Content in HEVC Video Encoder." International Journal of Electronics and Telecommunications 61, no. 2 (June 1, 2015): 121–27. http://dx.doi.org/10.1515/eletel-2015-0015.

Abstract:
In this paper, a detailed analysis of the content of the bitstream produced by the HEVC video encoder is presented. Using the HM 10.0 reference software, the following statistics were investigated: 1) the amount of data in the encoded stream related to individual frame types, 2) the relationship between the value of the QP and the size of the bitstream at the output of the encoder, and 3) the contribution of individual types of data to I and B frames. The above-mentioned aspects have been thoroughly explored for a wide range of target bitrates. The obtained results became the basis for guidelines that allow for efficient bitrate control in the HEVC encoder.
5

Politis, Ilias, Michail Tsagkaropoulos, Thomas Pliakas, and Tasos Dagiuklas. "Distortion Optimized Packet Scheduling and Prioritization of Multiple Video Streams over 802.11e Networks." Advances in Multimedia 2007 (2007): 1–11. http://dx.doi.org/10.1155/2007/76846.

Abstract:
This paper presents a generic framework for minimizing the video distortion of multiple video streams transmitted over 802.11e wireless networks, including intelligent packet scheduling and channel access differentiation mechanisms. A distortion prediction model designed to capture the multireferenced frame coding characteristic of H.264/AVC encoded videos is used to predetermine the distortion importance of each video packet in all streams. Two intelligent scheduling algorithms are proposed: "even-loss distribution" packet scheduling, where each video sender experiences the same loss, and "greedy-loss distribution" packet scheduling, where selected packets are dropped across all streams, ensuring that the most significant video stream in terms of picture context and quality characteristics experiences minimum losses. The proposed model has been verified with actual distortion measurements and has been found more accurate than the "additive distortion" model, which omits the correlation among lost frames. The paper includes analytical and simulation results from the comparison of both schemes and from their comparison to the simplified additive model, for different video sequences and channel conditions.
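The "greedy-loss distribution" idea — dropping the least important packets across all streams until the channel can carry the rest — can be sketched as follows. The packet representation and field names are illustrative assumptions, not the paper's data structures.

```python
def greedy_loss_schedule(streams, capacity):
    # streams: per-sender lists of (size_bytes, distortion_if_lost) packets.
    # Drop the packets whose loss hurts total distortion least, across all
    # streams, until what remains fits the channel capacity.
    packets = [(dist, size, sid, pid)
               for sid, pkts in enumerate(streams)
               for pid, (size, dist) in enumerate(pkts)]
    packets.sort()                       # least distortion impact first
    total = sum(size for _, size, _, _ in packets)
    dropped = []
    for dist, size, sid, pid in packets:
        if total <= capacity:
            break
        dropped.append((sid, pid))
        total -= size
    return dropped

# Two senders, three 100-byte packets, channel carries only 200 bytes:
streams = [[(100, 5.0), (100, 1.0)], [(100, 3.0)]]
drops = greedy_loss_schedule(streams, 200)   # sheds the least-important packet
```

The "even-loss" alternative described in the abstract would instead impose the same drop budget on each sender rather than pooling packets globally.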
6

Yang, Fu Zheng, Jia Run Song, and Shu Ai Wan. "A No-Reference Quality Assessment System for Video Streaming over RTP." Advanced Materials Research 179-180 (January 2011): 243–48. http://dx.doi.org/10.4028/www.scientific.net/amr.179-180.243.

Abstract:
In this paper a no-reference system for quality assessment of video streaming over RTP is proposed for monitoring the quality of networked video. The proposed system is composed of four modules, where the quality assessment module utilizes information extracted from the bit-stream by the modules for RTP header analysis, frame header analysis and display buffer simulation. Taking an MPEG-4 encoded video stream over RTP as an example, the process of video quality assessment using the proposed system is described. The proposed system achieves high efficiency without resorting to the original video or to video decoding, and is therefore well suited for real-time networked video applications.
7

Wang, Ke, Xuejing Li, Jianhua Yang, Jun Wu, and Ruifeng Li. "Temporal action detection based on two-stream You Only Look Once network for elderly care service robot." International Journal of Advanced Robotic Systems 18, no. 4 (July 1, 2021): 172988142110383. http://dx.doi.org/10.1177/17298814211038342.

Abstract:
Human action segmentation and recognition from a continuous untrimmed sensor data stream is a challenging issue known as temporal action detection. This article provides a two-stream You Only Look Once-based network method, which fuses video and skeleton streams captured by a Kinect sensor; our data encoding method turns spatiotemporal action detection into a one-dimensional object detection problem in a constantly augmented feature space. The proposed approach extracts spatial-temporal three-dimensional convolutional neural network features from the video stream and view-invariant features from the skeleton stream, respectively. Furthermore, these two streams are encoded into three-dimensional feature spaces, which are represented as red, green, and blue images for subsequent network input. The proposed two-stream You Only Look Once-based networks are capable of fusing video and skeleton information by using a processing pipeline that provides two fusion strategies, boxes-fusion or layers-fusion. We test the temporal action detection performance of the two-stream You Only Look Once network on our data set High-Speed Interplanetary Tug/Cocoon Vehicles-v1, which contains seven activities in the home environment, and achieve a particularly high mean average precision. We also test our model on the public data set PKU-MMD, which contains 51 activities, and our method also performs well on this data set. To prove that our method can work efficiently on robots, we transplanted it to a robotic platform and conducted an online fall-detection experiment.
8

Hamza, Ahmed M., Mohamed Abdelazim, Abdelrahman Abdelazim, and Djamel Ait-Boudaoud. "HEVC Rate-Distortion Optimization with Source Modeling." Electronic Imaging 2021, no. 10 (January 18, 2021): 259–1. http://dx.doi.org/10.2352/issn.2470-1173.2021.10.ipas-259.

Abstract:
The rate-distortion adaptive mechanisms of MPEG-HEVC (High Efficiency Video Coding) and its derivatives are an incremental improvement in the software reference encoder, providing a selective Lagrangian parameter choice which varies by encoding mode (intra or inter) and picture reference level. Since this weighting factor and the balanced cost functions it impacts are crucial to the RD optimization process, affecting several encoder decisions as well as both the coding efficiency and the quality of the encoded stream, we investigate an improvement by modern reinforcement learning methods. We develop a neural-based agent that learns a real-valued control policy to maximize rate savings by input signal pattern, mapping pixel intensity values from the picture at the coding-tree-unit level to the appropriate weighting parameter. Our testing on the reference software yields coding-efficiency improvements across different video sequences, in multiple classes of video.
9

Mohammed, Dhrgham Hani, and Laith Ali Abdul-Rahaim. "A Proposed of Multimedia Compression System Using Three - Dimensional Transformation." Webology 18, SI05 (October 30, 2021): 816–31. http://dx.doi.org/10.14704/web/v18si05/web18264.

Abstract:
Video compression has become especially important with the increase in data transmitted over transmission channels; the size of videos must be reduced without affecting their quality. This is done by cutting the video stream into frames of specific lengths and converting them into three-dimensional matrices. The proposed compression scheme uses the traditional red-green-blue color space representation and applies a three-dimensional discrete Fourier transform (3D-DFT) or three-dimensional discrete wavelet transform (3D-DWT) to the signal matrix after the video stream has been converted to three-dimensional matrices. The resulting transform coefficients are encoded using the EZW encoder algorithm. The performance of the proposed video compression system is tested against three main criteria: compression ratio (CR), peak signal-to-noise ratio (PSNR) and processing time (PT). Experiments showed high compression efficiency for videos using the proposed technique at the required bit rate. The 3D discrete wavelet transform offers a high frame rate with natural spatial resolution and scalability, along with quality and other advantages over current conventional systems in complexity, power consumption, throughput, latency and storage requirements. All proposed systems were implemented using MATLAB R2020b.
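The transform-and-threshold core of such a coder can be sketched with a 3-D DFT over a short stack of frames; the EZW entropy-coding stage and the DWT variant are omitted, and the clip below is synthetic, so this is only an illustration of the energy-compaction idea.

```python
import numpy as np

def transform_compress(clip, keep=0.05):
    # 3-D DFT over a stack of frames, then keep only the largest `keep`
    # fraction of coefficients; EZW coding of the survivors is omitted.
    coeffs = np.fft.fftn(clip)
    mags = np.abs(coeffs).ravel()
    thresh = np.sort(mags)[int(len(mags) * (1 - keep))]
    sparse = np.where(np.abs(coeffs) >= thresh, coeffs, 0)
    recon = np.fft.ifftn(sparse).real
    return recon, (np.abs(sparse) > 0).mean()

# Smooth 8-frame "clip" (a horizontal ramp) plus mild sensor noise.
rng = np.random.default_rng(4)
clip = np.tile(np.linspace(0, 255, 16), (8, 16, 1)) + rng.normal(0, 1, (8, 16, 16))
recon, density = transform_compress(clip)
```

Because the smooth content concentrates its energy in a few coefficients, keeping roughly 5% of them reconstructs the clip to within the noise floor.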
10

Yamagiwa, Shinichi, and Yuma Ichinomiya. "Stream-Based Visually Lossless Data Compression Applying Variable Bit-Length ADPCM Encoding." Sensors 21, no. 13 (July 5, 2021): 4602. http://dx.doi.org/10.3390/s21134602.

Abstract:
Video applications have become one of the major services in the engineering field, implemented by server-client systems connected via the Internet, broadcasting services for mobile devices such as smartphones, and surveillance cameras for security. Recently, the majority of video encoding mechanisms to reduce the data rate are lossy compression methods such as the MPEG formats. However, when we consider special needs for high-speed communication, such as display applications and high-accuracy object detection from the video stream, we need an encoding mechanism without any loss of pixel information, called visually lossless compression. This paper focuses on Adaptive Differential Pulse Code Modulation (ADPCM), which encodes a data stream into a constant bit length per data element. However, the conventional ADPCM has no mechanism to control the encoding bit length dynamically. We propose a novel ADPCM that provides a variable bit-length control mechanism, called ADPCM-VBL, for encoding/decoding. Furthermore, since we expect the encoded data from ADPCM to maintain low entropy, we expect to reduce the amount of data by applying lossless data compression. Applying ADPCM-VBL and lossless data compression, this paper proposes a video transfer system that autonomously controls throughput in the communication data path. Through evaluations focusing on encoding performance and image quality, we confirm that the proposed mechanisms work effectively in applications that need visually lossless compression, by encoding the video stream with low latency.
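A fixed-bit ADPCM core (previous-sample predictor, quantized residual, adaptive step size) can be sketched as below. The paper's ADPCM-VBL additionally varies the bit length per block to control throughput, which this sketch omits; the constants (initial step 16, 4-bit codes) are illustrative.

```python
def adpcm_codec_step(pred, step, code, hi):
    # Shared reconstruction/adaptation used by both encoder and decoder.
    pred += code * step
    if abs(code) >= hi:
        step = min(step * 2, 32768)   # large residual: widen the quantizer
    elif code == 0:
        step = max(step // 2, 1)      # tiny residual: narrow the quantizer
    return pred, step

def adpcm_encode(samples, bits=4):
    pred, step, codes = 0, 16, []
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    for s in samples:
        code = max(lo, min(hi, round((s - pred) / step)))
        codes.append(code)
        pred, step = adpcm_codec_step(pred, step, code, hi)
    return codes

def adpcm_decode(codes, bits=4):
    pred, step, out = 0, 16, []
    hi = (1 << (bits - 1)) - 1
    for code in codes:
        pred, step = adpcm_codec_step(pred, step, code, hi)
        out.append(pred)
    return out
```

Because the decoder replays exactly the encoder's predictor and step adaptation, the reconstruction error stays bounded by the local step size, which is what a variable-bit-length controller would then trade against rate.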
11

Abdullah, Miran Taha, Najmadin Wahid Abdulrahman, Aree Ali Mohammed, and Diary Nawzad Hama. "Impact of Wireless Network Packet Loss on Real-Time Video Streaming Application: A Comparative Study of H.265 and H.266 Codecs." Kurdistan Journal of Applied Research 9, no. 2 (September 22, 2024): 23–41. http://dx.doi.org/10.24017/science.2024.2.3.

Abstract:
The transmission of real-time video over wireless networks is prone to packet loss and delay, which can degrade video quality during streaming. These impairments can lead to interruptions, buffering, and degradation of visual and auditory elements, resulting in an unsatisfactory user experience. In this paper, we aim to address the challenges associated with packet loss and delay in wireless networks and propose an approach to alleviate their impact on real-time video transmission. The proposed approach utilizes the H.265/H.266 video coding standards. For Versatile Video Coding (VVC), a patch adding VVdeC and VVenC support to FFmpeg (Fast Forward Moving Picture Expert Group) is applied; as a result, FFmpeg is used to encode, stream and decode all videos. Raw videos of 2K quality are encoded using adaptive quantization parameters (QP) for the above-mentioned codecs. By selecting optimal transmission data based on various network conditions, this approach enhances the Quality of Experience (QoE) for end users while minimizing resource usage in the wireless network. Furthermore, the proposed approach selects the codec standard according to bitrate and frame rate. Simulation results indicate that the proposed approach yields a significant improvement for real-time video streaming over wireless networks, satisfying the end-user experience. The approach also outperforms other related work, gaining a PSNR of +12 dB for H.265 and +13 dB for H.266 when the network packet loss is 1%.
12

Baba, Marius, Vasile Gui, Cosmin Cernazanu, and Dan Pescaru. "A Sensor Network Approach for Violence Detection in Smart Cities Using Deep Learning." Sensors 19, no. 7 (April 8, 2019): 1676. http://dx.doi.org/10.3390/s19071676.

Abstract:
Citizen safety in modern urban environments is an important aspect of life quality. Implementation of a smart city approach to video surveillance depends heavily on the capability of gathering and processing huge amounts of live urban data. Analyzing data from high bandwidth surveillance video streams provided by large size distributed sensor networks is particularly challenging. We propose here an efficient method for automatic violent behavior detection designed for video sensor networks. Known solutions to real-time violence detection are not suitable for implementation in a resource-constrained environment due to the high processing power requirements. Our algorithm achieves real-time processing on a Raspberry PI-embedded architecture. To ensure separation of temporal and spatial information processing we employ a computationally effective cascaded approach. It consists of a deep neural network followed by a time domain classifier. In contrast with current approaches, the deep neural network input is fed exclusively with motion vector features extracted directly from the MPEG encoded video stream. As proven by results, we achieve state-of-the-art performance, while running on a low computational resources embedded architecture.
13

Barannik, Volodymyr, Yuriy Ryabukha, Pavlo Hurzhii, Vitalii Tverdokhlib, and Oleh Kulitsa. "TRANSFORMANTS CODING TECHNOLOGY IN THE CONTROL SYSTEM OF VIDEO STREAMS BIT RATE." Cybersecurity: Education, Science, Technique 3, no. 7 (2020): 63–71. http://dx.doi.org/10.28925/2663-4023.2020.7.6371.

Abstract:
The conceptual foundations of constructing an effective encoding method within the bit-rate control module of video traffic in a video data processing system at the source level are considered. The essence of using the proposed method for controlling the video stream bit rate is disclosed, namely the principles of constructing the code representation of a frame fragment and approaches for determining the structural units of an individual video frame within which control is performed. The method focuses on processing the bit representation of the DCT transformants; at this processing stage a transformant is considered the structural component of the video stream frame at which encoding is performed. At the same time, to ensure flexible control of the video traffic bit rate, each transformant is decomposed to the level of a plurality of bit planes. It is argued that the proposed approach is potentially capable of reducing the video stream bit rate under the worst conditions, that is, when component coding is performed. In addition, this principle of forming the code representation of a video stream fragment makes it possible to control the level of error introduced in the bit-rate control process. Moreover, when the bit representation of the transformant is encoded, the method is able to provide higher compression rates, because the detection probability of binary series lengths, and the detected lengths themselves within a bit plane, will be greater than in the case of component coding. This is explained by the structural features of the distribution of binary elements within each of the bit planes that together form the DCT transformant. In particular, high-frequency regions of the transformant are most often formed by chains of zero elements.
The solutions proposed in the development of the encoding method are able to provide sufficient flexibility to control the bit rate of the video stream, as well as the ability to change the bit rate quickly over a wide range of values.
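The decomposition of a quantized DCT transformant into bit planes, and the zero-run structure the method exploits in high-frequency planes, can be illustrated with a short sketch. The coding and rate-control policy themselves are omitted, and 8-bit non-negative coefficients are assumed for simplicity.

```python
import numpy as np

def bit_planes(transformant, bits=8):
    # Split a quantized transformant (non-negative ints) into bit planes,
    # most significant bit first.
    t = np.asarray(transformant, dtype=np.uint8)
    return [(t >> b) & 1 for b in range(bits - 1, -1, -1)]

def zero_runs(plane):
    # Lengths of zero runs in a flattened bit plane (run-length view).
    runs, n = [], 0
    for v in plane.ravel():
        if v == 0:
            n += 1
        elif n:
            runs.append(n)
            n = 0
    if n:
        runs.append(n)
    return runs
```

In a typical transformant, only the low-frequency corner carries large values, so most planes are dominated by long zero runs, which is exactly the structure the described run-length coding benefits from.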
14

Barannik, Vladimir, Yuriy Ryabukha, Pavlo Gurzhiy, Vitaliy Tverdokhlib, and Igor Shevchenko. "TRANSFORMANTS BIT REPRESENTATION ENCODING WITHIN VIDEO BIT RATE CONTROL." Information systems and technologies security, no. 1 (1) (2019): 52–56. http://dx.doi.org/10.17721/ists.2019.1.52-56.

Abstract:
The conceptual foundations of constructing an effective encoding method within the bit-rate control module of video traffic in a video data processing system at the source level are considered. The essence of using the proposed method for controlling the video stream bit rate is disclosed, namely the principles of constructing the code representation of a frame fragment and approaches for determining the structural units of an individual video frame within which control is performed. The method focuses on processing the bit representation of the DCT transformants; at this processing stage a transformant is considered the structural component of the video stream frame at which encoding is performed. At the same time, to ensure flexible control of the video traffic bit rate, each transformant is decomposed to the level of a plurality of bit planes. It is argued that the proposed approach is potentially capable of reducing the video stream bit rate under the worst conditions, that is, when component coding is performed. In addition, this principle of forming the code representation of a video stream fragment makes it possible to control the level of error introduced in the bit-rate control process. Moreover, when the bit representation of the transformant is encoded, the method is able to provide higher compression rates, because the detection probability of binary series lengths, and the detected lengths themselves within a bit plane, will be greater than in the case of component coding. This is explained by the structural features of the distribution of binary elements within each of the bit planes that together form the DCT transformant. In particular, high-frequency regions of the transformant are most often formed by chains of zero elements.
The solutions proposed in the development of the encoding method are able to provide sufficient flexibility to control the bit rate of the video stream, as well as the ability to change the bit rate quickly over a wide range of values.
15

Al-Abbasi, Abubakr O., and Vaneet Aggarwal. "VidCloud." ACM Transactions on Modeling and Performance Evaluation of Computing Systems 5, no. 4 (March 2021): 1–32. http://dx.doi.org/10.1145/3442187.

Abstract:
As video-streaming services have expanded and improved, cloud-based video has evolved into a necessary feature of any successful business for reaching internal and external audiences. In this article, video streaming over distributed storage is considered, where the video segments are encoded using an erasure code for better reliability. We consider a representative system architecture for a realistic (typical) content delivery network (CDN). Given multiple parallel streams/links between each server and the edge router, we need to determine, for each client request, the subset of servers to stream the video from, as well as one of the parallel streams from each chosen server. To achieve this scheduling, this article proposes two-stage probabilistic scheduling. The video quality is also selected with a certain probability distribution that is optimized in our algorithm. With these parameters, the playback time of video segments is determined by characterizing the download time of each coded chunk for each video segment. Using the playback times, a bound on the moment generating function of the stall duration is used to bound the mean stall duration. Based on this, we formulate an optimization problem to jointly optimize a convex combination of mean stall duration and average video quality for all requests, where the two-stage probabilistic scheduling, video quality selection, bandwidth split among parallel streams, and auxiliary bound parameters can be chosen. This non-convex problem is solved using an efficient iterative algorithm. Based on the offline version of our proposed algorithm, an online policy is developed in which server selection, quality, bandwidth split, and parallel streams are chosen in an online manner. Experimental results show significant improvement in QoE metrics for cloud-based video as compared to the considered baselines.
16

Di Laura, Christian, Diego Pajuelo, and Guillermo Kemper. "A Novel Steganography Technique for SDTV-H.264/AVC Encoded Video." International Journal of Digital Multimedia Broadcasting 2016 (2016): 1–9. http://dx.doi.org/10.1155/2016/6950592.

Abstract:
Today, eavesdropping is becoming a common issue in the rapidly growing digital network and has created the need for secret communication channels embedded in digital media. In this paper, a novel steganography technique designed for Standard Definition Digital Television (SDTV) H.264/AVC encoded video sequences is presented. The algorithm introduced here makes use of the compression properties of the Context Adaptive Variable Length Coding (CAVLC) entropy encoder to achieve a low-complexity, real-time insertion method. The chosen scheme hides the private message directly in the H.264/AVC bit stream by modifying the AC-frequency quantized residual luminance coefficients of intra-predicted I-frames. In order to avoid error propagation in adjacent blocks, an interlaced embedding strategy is applied. Likewise, the proposed steganography technique allows self-detection of the hidden message at the target destination. The source code was implemented by combining the MATLAB 2010b and Java development environments. Finally, experimental results have been assessed through objective and subjective quality measures and reveal that less visible artifacts are produced with the proposed technique, which reaches PSNR values above 40.0 dB and an average embedding bit rate per secret communication channel of 425 bits/sec. This exemplifies that steganography is affordable in digital television.
17

Feng, Wu‐chi. "On the efficacy of quality, frame rate, and buffer management for video streaming across best‐effort networks." Journal of High Speed Networks 11, no. 3-4 (January 1, 2002): 199–214. https://doi.org/10.3233/hsn-2002-220.

Abstract:
In this paper, we propose a mechanism that supports the high quality streaming and adaptation of stored, constant‐quality video across best‐effort networks. The difficulty in the delivery of such traffic over best‐effort networks is the fact that both the bandwidth required from the stream and the bandwidth available across the network vary considerably over time. Our proposed approach has a number of useful features. First, it uses the a priori information from the video stream to drive that adaptation policy. Second, it does not rely on any special handling or support from the network itself, although any additional support from the network will indirectly help increase the video quality. Third, it can be built on top of TCP, which effectively separates the adaptation and streaming from the transport protocol. This makes the question of TCP‐friendliness much easier to answer. Using actual MPEG encoded video data at multiple qualities, we show through experimentation that this approach provides a viable alternative for streaming media across best‐effort networks.
18

Grecos, Christos. "A Spatially Enhanced Error Concealment Technique and its Potential Alternative Application to Reduce H.264 Stream Sizes." Journal of Communications Software and Systems 4, no. 4 (December 21, 2008): 266. http://dx.doi.org/10.24138/jcomss.v4i4.216.

Abstract:
With more and more video content being transmitted digitally and with user expectations continually rising, error concealment is becoming an increasingly important part of streaming media. Often overlooked in the past, even now manufacturers often do only the bare minimum necessary to avoid complexity. This paper first presents a combination of simple techniques that together produce an extremely effective concealment method that maintains spatially correlated edges throughout any lost data; this in turn gives an increase in both mathematical and visual performance when compared against the commonly used bilinear concealment technique. Secondly, this paper looks at an alternative use of the bilinear passive error concealment algorithm that is often used by H.264 decoders. Occasionally a concealed macroblock is mathematically closer to the original than an encoded and decoded one; by removing these from the stream at the encoder and thus forcing the decoder to conceal the missing data, a significant reduction in bit-stream size (up to 5%) can be achieved with almost no loss in quality.
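The baseline bilinear concealment the paper compares against can be sketched as follows: each lost pixel is interpolated from the pixel rows and columns bordering the missing macroblock (availability of all four neighbours is assumed).

```python
import numpy as np

def bilinear_conceal(frame, top, left, size=16):
    # Interpolate each lost pixel from the row above, the row below, and
    # the columns to the left/right of the missing macroblock.
    above = frame[top - 1, left:left + size].astype(float)
    below = frame[top + size, left:left + size].astype(float)
    lft = frame[top:top + size, left - 1].astype(float)
    rgt = frame[top:top + size, left + size].astype(float)
    out = np.empty((size, size))
    for i in range(size):
        for j in range(size):
            wv = (i + 1) / (size + 1)              # vertical weight
            wh = (j + 1) / (size + 1)              # horizontal weight
            vert = (1 - wv) * above[j] + wv * below[j]
            horiz = (1 - wh) * lft[i] + wh * rgt[i]
            out[i, j] = (vert + horiz) / 2
    frame[top:top + size, left:left + size] = out
    return frame
```

The encoder-side trick described in the abstract would run this same routine on intact macroblocks and drop from the stream any block whose concealed version is already close enough to the original.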
19

Neima, Hikmat Z. "Fast Video Compression Method based on Wavelet Transform and Simple Efficient Search Algorithm." University of Thi-Qar Journal of Science 4, no. 3 (June 5, 2014): 143–50. http://dx.doi.org/10.32792/utq/utjsci/v4i3.646.

Abstract:
In this paper, a fast video compression method is proposed. First, frames are transformed using the Discrete Wavelet Transform (DWT) in order to reduce computation time. The disparity between each pair of adjacent frames is estimated by the Simple and Efficient Search (SES) algorithm. The resulting Motion Vector (MV) is encoded into a bit stream by Huffman encoding, while the remaining part is compressed in the same way as a still image. The proposed method produced good results in terms of Peak Signal-to-Noise Ratio (PSNR), compression ratio (CR), and computation time.
20

Zhang, Zirui, Ping Chen, Weijun Li, Xiaoming Xiong, Qianxue Wang, Heping Wen, Songbin Liu, and Shuting Cai. "Design and ARM-Based Implementation of Bitstream-Oriented Chaotic Encryption Scheme for H.264/AVC Video." Entropy 23, no. 11 (October 29, 2021): 1431. http://dx.doi.org/10.3390/e23111431.

Abstract:
In actual application scenarios of real-time video confidential communication, encrypted videos must meet three performance indicators: security, real-time operation, and format compatibility. To satisfy these requirements, an improved bitstream-oriented encryption (BOE) method based on chaotic encryption for H.264/AVC video is proposed, and an ARM-embedded remote real-time video confidential communication system is built for experimental verification in this paper. Firstly, a 4-D self-synchronous chaotic stream cipher algorithm with cosine anti-controllers (4-D SCSCA-CAC) is designed to enhance security. The algorithm closes the security loopholes of existing self-synchronous chaotic stream cipher algorithms applied to actual video confidential communication, and can effectively resist the combined effect of the chosen-ciphertext attack and the divide-and-conquer attack. Secondly, syntax elements of the H.264 bitstream are analyzed in real time. Motion vector difference (MVD) coefficients and direct-current (DC) components in the Residual syntax element are extracted through Exponential-Golomb decoding and entropy decoding based on the context-based adaptive variable length coding (CAVLC) mode, respectively. Thirdly, the DC components and MVD coefficients are encrypted by the 4-D SCSCA-CAC, and the encrypted syntax elements are re-encoded to replace those of the original H.264 bitstream, preserving format compatibility. Besides, hardware codecs and multi-core multi-threading technology are employed to improve the real-time performance of the hardware system. Finally, experimental results show that the proposed scheme, with the advantage of high efficiency and flexibility, can fulfill the requirements of security, real-time operation, and format compatibility simultaneously.
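The selective-encryption principle — XORing only the extracted DC/MVD values with a chaotic keystream so the bitstream format survives — can be illustrated with a one-dimensional logistic map. This is a textbook stand-in, not the paper's 4-D SCSCA-CAC cipher, and the parameters and toy byte values are arbitrary.

```python
def logistic_keystream(n, x0=0.3141592, r=3.99):
    # Byte keystream from the logistic map x <- r*x*(1-x).
    x, out = x0, []
    for _ in range(n):
        x = r * x * (1.0 - x)
        out.append(int(x * 256) & 0xFF)
    return bytes(out)

def xor_encrypt(data, key):
    # XOR is an involution: applying the same keystream twice decrypts.
    return bytes(d ^ k for d, k in zip(data, key))

coeffs = bytes([12, 0, 255, 7, 130])     # toy DC/MVD byte values
ks = logistic_keystream(len(coeffs))
ct = xor_encrypt(coeffs, ks)             # encrypted syntax elements
pt = xor_encrypt(ct, ks)                 # recovered on the receiver side
```

In the actual scheme the ciphered values are then re-encoded (Exponential-Golomb/CAVLC) back into the bitstream, which is what keeps standard decoders format-compatible.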
APA, Harvard, Vancouver, ISO, and other styles
21

Hussien, Marwa Kamel, and Hameed Abdul-Kareem Younis. "DWT Based-Video Compression Using (4SS) Matching Algorithm." Journal of University of Human Development 1, no. 4 (September 30, 2015): 427. http://dx.doi.org/10.21928/juhd.v1n4y2015.pp427-432.

Full text
Abstract:
Currently, multimedia technology is widely used. Video encoding compression technology can save storage space and can also improve the transmission efficiency of network communications. In video compression methods, the first frame of a video is independently compressed as a still image; this is called an intra coded frame. The remaining successive frames are compressed by estimating the disparity between two adjacent frames, and these are called inter coded frames. In this paper, the Discrete Wavelet Transform (DWT) is used as a powerful tool in video compression. Our coder achieves a good trade-off between compression ratio and quality of the reconstructed video. Motion estimation and compensation, an essential part of the compression, is based on segment movements. The disparity between each pair of frames was estimated by the Four Step Search (4SS) algorithm. The resulting Motion Vectors (MVs) were encoded into a bit stream by Huffman encoding, while the remaining part was compressed in the same way as the intra frame. Experimental results showed good performance in terms of Peak Signal-to-Noise Ratio (PSNR), Compression Ratio (CR), and processing time.
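The Four Step Search used here is a classic block-matching strategy and is small enough to illustrate. The following is a simplified SAD-based sketch; the block size, search budget, and early-termination details are this sketch's assumptions, not taken from the paper:

```python
import numpy as np

def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return int(np.abs(a.astype(np.int32) - b.astype(np.int32)).sum())

def four_step_search(ref, cur, bx, by, bs=8):
    """Estimate the motion vector (dx, dy) of the bs x bs block at (bx, by) in cur."""
    h, w = ref.shape
    block = cur[by:by + bs, bx:bx + bs]

    def cost(dx, dy):
        x, y = bx + dx, by + dy
        if x < 0 or y < 0 or x + bs > w or y + bs > h:
            return float("inf")  # candidate falls outside the frame
        return sad(ref[y:y + bs, x:x + bs], block)

    cx, cy, step, searches = 0, 0, 2, 0
    while True:
        searches += 1
        best = min(((cx + i, cy + j) for i in (-step, 0, step) for j in (-step, 0, step)),
                   key=lambda p: cost(*p))
        if step == 1:                       # final 3x3 refinement is done
            return best
        if best == (cx, cy) or searches == 3:
            step = 1                        # centre won (or budget spent): refine
        cx, cy = best

# Synthetic check: shift a random frame by (2, 2) and recover the vector
rng = np.random.default_rng(0)
ref = rng.integers(0, 256, size=(32, 32), dtype=np.uint8)
cur = np.roll(ref, shift=(-2, -2), axis=(0, 1))
mv = four_step_search(ref, cur, 8, 8)       # (2, 2) for this synthetic shift
```

The 9-point pattern with a step of 2 followed by a single-step 3x3 refinement is what keeps 4SS to at most four search stages.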
APA, Harvard, Vancouver, ISO, and other styles
22

Chen, Pei, Zhuo Zhang, Yang Lei, Ke Niu, and Xiaoyuan Yang. "A Multi-Domain Embedding Framework for Robust Reversible Data Hiding Scheme in Encrypted Videos." Electronics 11, no. 16 (August 15, 2022): 2552. http://dx.doi.org/10.3390/electronics11162552.

Full text
Abstract:
For easier cloud management, reversible data hiding is performed in an encrypted domain to embed label information. However, the existing schemes are not robust and may cause the loss of label information during transmission. Enhancing robustness while maintaining reversibility in data hiding is a challenge. In this paper, a multi-domain embedding framework in encrypted videos is proposed to achieve both robustness and reversibility. In the framework, the multi-domain characteristic of encrypted video is fully used. The element for robust embedding is encrypted through Logistic chaotic scrambling, which is marked as element-I. To further improve robustness, the label information will be encoded with the Bose–Chaudhuri–Hocquenghem code. Then, the label information will be robustly embedded into element-I by modulating the amplitude of element-I, in which the auxiliary information is generated for lossless recovery of element-I. The element for reversible embedding is marked as element-II, the sign of which will be encrypted by a stream cipher. The auxiliary information will be reversibly embedded into element-II through traditional histogram shifting. To verify the feasibility of the framework, an anti-recompression RDH-EV based on the framework is proposed. The experimental results show that the proposed scheme outperforms the current representative ones in terms of robustness, while achieving reversibility. In the proposed scheme, video encryption and data hiding are commutative and the original video bitstream can be recovered fully. These demonstrate the feasibility of the multi-domain embedding framework in encrypted videos.
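Histogram shifting, the reversible step used for the auxiliary information, can be illustrated on a 1-D sequence of residual values. This is a bare-bones sketch with an assumed peak bin at 0, not the paper's exact construction:

```python
def hs_embed(values, bits, peak=0):
    """Shift values > peak up by 1, then hide one bit at every peak-valued sample."""
    out, it = [], iter(bits)
    for v in values:
        if v > peak:
            out.append(v + 1)            # make room in the bin next to the peak
        elif v == peak:
            out.append(v + next(it, 0))  # the peak bin carries one payload bit
        else:
            out.append(v)
    return out

def hs_extract(marked, peak=0):
    """Recover the payload bits and losslessly restore the original values."""
    bits, restored = [], []
    for v in marked:
        if v == peak or v == peak + 1:
            bits.append(v - peak)        # marked peak/peak+1 came from a hidden bit
            restored.append(peak)
        elif v > peak + 1:
            restored.append(v - 1)       # undo the shift
        else:
            restored.append(v)
    return bits, restored

marked = hs_embed([0, 2, 0, 1, -1, 0, 3], [1, 0, 1])
bits, restored = hs_extract(marked)      # bits == [1, 0, 1], restored == original
```

Because every shifted sample can be shifted back and every peak sample is restored exactly, the embedding is fully reversible, which is the property the abstract relies on.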
APA, Harvard, Vancouver, ISO, and other styles
23

Aubry, Sophie, Sohaib Laraba, Joëlle Tilmanne, and Thierry Dutoit. "Action recognition based on 2D skeletons extracted from RGB videos." MATEC Web of Conferences 277 (2019): 02034. http://dx.doi.org/10.1051/matecconf/201927702034.

Full text
Abstract:
In this paper, a methodology to recognize actions based on RGB videos is proposed which takes advantage of the recent breakthroughs made in deep learning. Following the development of Convolutional Neural Networks (CNNs), research was conducted on the transformation of skeletal motion data into 2D images. In this work, a solution is proposed requiring only the use of RGB videos instead of RGB-D videos. This work is based on multiple works studying the conversion of RGB-D data into 2D images. From a video stream (RGB images), a two-dimensional skeleton of 18 joints for each detected body is extracted with a DNN-based human pose estimator called OpenPose. The skeleton data are encoded into the Red, Green and Blue channels of images. Different ways of encoding motion data into images were studied. We successfully use state-of-the-art deep neural networks designed for image classification to recognize actions. Based on a study of the related works, we chose to use image classification models: SqueezeNet, AlexNet, DenseNet, ResNet, Inception, VGG, and retrained them to perform action recognition. For all the tests, the NTU RGB+D database is used. The highest accuracy is obtained with ResNet: 83.317% cross-subject and 88.780% cross-view, which outperforms most state-of-the-art results.
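The idea of packing skeletal motion into an image can be sketched concretely. Below is an illustrative variant of such an encoding, not the authors' exact mapping: each column is one video frame, each row one of the 18 joints, and the R, G, B channels hold the normalised x, y, and confidence values:

```python
import numpy as np

def skeleton_to_image(joints):
    """Encode a (T, 18, 3) array of (x, y, conf) per frame into an 18 x T RGB image."""
    joints = np.asarray(joints, dtype=np.float64)
    t, j, _ = joints.shape
    img = np.empty((j, t, 3), dtype=np.uint8)
    for c in range(3):
        ch = joints[:, :, c]
        lo, hi = ch.min(), ch.max()
        # min-max normalise each channel to 0..255 (epsilon guards a flat channel)
        scaled = (ch - lo) / (hi - lo + 1e-9) * 255.0
        img[:, :, c] = scaled.T.astype(np.uint8)
    return img

# 40 frames of synthetic OpenPose-style output: 18 joints, (x, y, confidence)
rng = np.random.default_rng(0)
image = skeleton_to_image(rng.random((40, 18, 3)))
```

Once motion is flattened into a fixed-size image like this, any off-the-shelf image classifier (ResNet, VGG, etc.) can be retrained on it, which is the paper's central trick.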
APA, Harvard, Vancouver, ISO, and other styles
24

Ukommi, Ubong. "Assessing the Impact of Media Stream Packet Size Adaptation on Wireless Multimedia Applications." ABUAD Journal of Engineering Research and Development (AJERD) 7, no. 1 (June 5, 2024): 221–30. http://dx.doi.org/10.53982/ajerd.2024.0701.23-j.

Full text
Abstract:
Multimedia applications constitute a large percentage of traffic in wireless networks and thus require investigation of the factors influencing effective delivery of media content in the future, which will include not only conventional multimedia broadcast but also video streaming to users on demand while meeting the expected quality requirements. In this article, an analysis of the effect of media packet size adaptation on the quality performance of multimedia applications is presented. Experiments were performed using standard test media sequences. The encoded media streams at different packet sizes were transmitted over a wireless channel under different channel conditions. The quality performance of the received media streams was measured using a Peak Signal-to-Noise Ratio (PSNR) software tool to assess the impact of media packet adaptation on the quality performance of multimedia applications. A comparative quality performance under the same poor channel condition shows that the small media packet size of 256 bytes recorded the highest received quality of 22.52dB, compared to 21.87dB for 384 bytes, 21.37dB for 512 bytes, 20.68dB for 640 bytes and 19.47dB for 768 bytes, respectively. The findings show that media packet size and channel conditions have a significant impact on the quality performance of wireless multimedia applications.
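The PSNR figure of merit quoted in dB throughout this abstract has a standard definition, and a generic implementation (not the authors' tool) fits in a few lines:

```python
import numpy as np

def psnr(reference, degraded, peak=255.0):
    """Peak Signal-to-Noise Ratio in dB between two same-sized 8-bit frames."""
    mse = np.mean((reference.astype(np.float64) - degraded.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")              # identical frames: infinite PSNR
    return 10.0 * np.log10(peak ** 2 / mse)

# Toy example: an 8-bit frame and a copy corrupted by small uniform noise
rng = np.random.default_rng(0)
frame = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
noise = rng.integers(-5, 6, size=frame.shape)
noisy = np.clip(frame.astype(int) + noise, 0, 255).astype(np.uint8)
quality = psnr(frame, noisy)             # roughly 38 dB for this noise level
```

Differences of a decibel or two, as between the packet sizes reported above, correspond to visible changes in reconstruction quality, which is why PSNR is the usual comparison metric here.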
APA, Harvard, Vancouver, ISO, and other styles
25

Razavi, R., M. Fleury, and M. Ghanbari. "Enabling Cognitive Load-Aware AR with Rateless Coding on a Wearable Network." Advances in Multimedia 2008 (2008): 1–12. http://dx.doi.org/10.1155/2008/853816.

Full text
Abstract:
Augmented reality (AR) on a head-mounted display is conveniently supported by a wearable wireless network. If, in addition, the AR display is moderated to take account of the cognitive load of the wearer, then additional biosensors form part of the network. In this paper, the impact of these additional traffic sources is assessed. Rateless coding is proposed to not only protect the fragile encoded video stream from wireless noise and interference but also to reduce coding overhead. The paper proposes a block-based form of rateless channel coding in which the unit of coding is a block within a packet. The contribution of this paper is that it minimizes energy consumption by reducing the overhead from forward error correction (FEC), while error correction properties are conserved. Compared to simple packet-based rateless coding, with this form of block-based coding, data loss is reduced and energy efficiency is improved. Cross-layer organization of piggy-backed response blocks must take place in response to feedback, as detailed in the paper. Compared also to variants of its default FEC scheme, results from a Bluetooth (IEEE 802.15.1) wireless network show a consistent improvement in energy consumption, packet arrival latency, and video quality at the AR display.
APA, Harvard, Vancouver, ISO, and other styles
26

Suman, Saurabh, Nilay Naharas, Badri Narayan Subudhi, and Vinit Jakhetiya. "Two-Streams: Dark and Light Networks with Graph Convolution for Action Recognition from Dark Videos (Student Abstract)." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 13 (June 26, 2023): 16340–41. http://dx.doi.org/10.1609/aaai.v37i13.27030.

Full text
Abstract:
In this article, we propose a two-stream action recognition technique for recognizing human actions from dark videos. The proposed action recognition network consists of an image enhancement network with a Self-Calibrated Illumination (SCI) module, followed by a two-stream action recognition network. We have used R(2+1)D as a feature extractor for both streams with shared weights. A Graph Convolutional Network (GCN), a temporal graph encoder, is utilized to enhance the obtained features, which are then further fed to a classification head to recognize the actions in a video. The experimental results are presented on the recent benchmark "ARID" dark-video database.
APA, Harvard, Vancouver, ISO, and other styles
27

Barradas, Diogo, Nuno Santos, and Luís Rodrigues. "DeltaShaper: Enabling Unobservable Censorship-resistant TCP Tunneling over Videoconferencing Streams." Proceedings on Privacy Enhancing Technologies 2017, no. 4 (October 1, 2017): 5–22. http://dx.doi.org/10.1515/popets-2017-0037.

Full text
Abstract:
This paper studies the possibility of using the encrypted video channel of widely used videoconferencing applications, such as Skype, as a carrier for unobservable covert TCP/IP communications. We propose and evaluate different alternatives to encode information in the video stream in order to increase available throughput while preserving the packet-level characteristics of the video stream. We have built a censorship-resistant system, named DeltaShaper, which offers a data-link interface and supports TCP/IP applications that tolerate low throughput / high latency links. Our results show that it is possible to run standard protocols such as FTP, SMTP, or HTTP over Skype video streams.
APA, Harvard, Vancouver, ISO, and other styles
28

Allouche, Mohamed, Elliot Cole, Mateo Zoughebi, Carl De Sousa Trias, and Mihai Mitrea. "Stream encoder identification in green video context." Electronic Imaging 37, no. 10 (February 2, 2025): 234–1. https://doi.org/10.2352/ei.2025.37.10.ipas-234.

Full text
APA, Harvard, Vancouver, ISO, and other styles
29

Saleem, Gulshan, Usama Ijaz Bajwa, Rana Hammad Raza, and Fan Zhang. "Edge-Enhanced TempoFuseNet: A Two-Stream Framework for Intelligent Multiclass Video Anomaly Recognition in 5G and IoT Environments." Future Internet 16, no. 3 (February 29, 2024): 83. http://dx.doi.org/10.3390/fi16030083.

Full text
Abstract:
Surveillance video analytics encounters unprecedented challenges in 5G and IoT environments, including complex intra-class variations, short-term and long-term temporal dynamics, and variable video quality. This study introduces Edge-Enhanced TempoFuseNet, a cutting-edge framework that strategically reduces spatial resolution to allow the processing of low-resolution images. A dual upscaling methodology based on bicubic interpolation and an encoder–bank–decoder configuration is used for anomaly classification. The two-stream architecture combines the power of a pre-trained Convolutional Neural Network (CNN) for spatial feature extraction from RGB imagery in the spatial stream, while the temporal stream focuses on learning short-term temporal characteristics, reducing the computational burden of optical flow. To analyze long-term temporal patterns, the extracted features from both streams are combined and routed through a Gated Recurrent Unit (GRU) layer. The proposed framework (TempoFuseNet) outperforms the encoder–bank–decoder model in terms of performance metrics, achieving a multiclass macro average accuracy of 92.28%, an F1-score of 69.29%, and a false positive rate of 4.41%. This study presents a significant advancement in the field of video anomaly recognition and provides a comprehensive solution to the complex challenges posed by real-world surveillance scenarios in the context of 5G and IoT.
APA, Harvard, Vancouver, ISO, and other styles
30

Mel, Bartlett W. "SEEMORE: Combining Color, Shape, and Texture Histogramming in a Neurally Inspired Approach to Visual Object Recognition." Neural Computation 9, no. 4 (May 1, 1997): 777–804. http://dx.doi.org/10.1162/neco.1997.9.4.777.

Full text
Abstract:
Severe architectural and timing constraints within the primate visual system support the conjecture that the early phase of object recognition in the brain is based on a feedforward feature-extraction hierarchy. To assess the plausibility of this conjecture in an engineering context, a difficult three-dimensional object recognition domain was developed to challenge a pure feedforward, receptive-field based recognition model called SEEMORE. SEEMORE is based on 102 viewpoint-invariant nonlinear filters that as a group are sensitive to contour, texture, and color cues. The visual domain consists of 100 real objects of many different types, including rigid (shovel), nonrigid (telephone cord), and statistical (maple leaf cluster) objects and photographs of complex scenes. Objects were individually presented in color video images under normal room lighting conditions. Based on 12 to 36 training views, SEEMORE was required to recognize unnormalized test views of objects that could vary in position, orientation in the image plane and in depth, and scale (factor of 2); for nonrigid objects, recognition was also tested under gross shape deformations. Correct classification performance on a test set consisting of 600 novel object views was 97 percent (chance was 1 percent) and was comparable for the subset of 15 nonrigid objects. Performance was also measured under a variety of image degradation conditions, including partial occlusion, limited clutter, color shift, and additive noise. Generalization behavior and classification errors illustrate the emergence of several striking natural shape categories that are not explicitly encoded in the dimensions of the feature space. It is concluded that in the light of the vast hardware resources available in the ventral stream of the primate visual system relative to those exercised here, the appealingly simple feature-space conjecture remains worthy of serious consideration as a neurobiological model.
APA, Harvard, Vancouver, ISO, and other styles
31

Anegekuh, Louis, Lingfen Sun, Emmanuel Jammeh, Is-Haka Mkwawa, and Emmanuel Ifeachor. "Content-Based Video Quality Prediction for HEVC Encoded Videos Streamed Over Packet Networks." IEEE Transactions on Multimedia 17, no. 8 (August 2015): 1323–34. http://dx.doi.org/10.1109/tmm.2015.2444098.

Full text
APA, Harvard, Vancouver, ISO, and other styles
32

Gao, Lianli, Pengpeng Zeng, Jingkuan Song, Yuan-Fang Li, Wu Liu, Tao Mei, and Heng Tao Shen. "Structured Two-Stream Attention Network for Video Question Answering." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 6391–98. http://dx.doi.org/10.1609/aaai.v33i01.33016391.

Full text
Abstract:
To date, visual question answering (VQA) (i.e., image QA and video QA) is still a holy grail in vision and language understanding, especially for video QA. Compared with image QA that focuses primarily on understanding the associations between image region-level details and corresponding questions, video QA requires a model to jointly reason across both spatial and long-range temporal structures of a video as well as text to provide an accurate answer. In this paper, we specifically tackle the problem of video QA by proposing a Structured Two-stream Attention network, namely STA, to answer a free-form or open-ended natural language question about the content of a given video. First, we infer rich long-range temporal structures in videos using our structured segment component and encode text features. Then, our structured two-stream attention component simultaneously localizes important visual instances, reduces the influence of background video and focuses on the relevant text. Finally, the structured two-stream fusion component incorporates different segments of query and video aware context representation and infers the answers. Experiments on the large-scale video QA dataset TGIF-QA show that our proposed method significantly surpasses the best counterpart (i.e., with one representation for the video input) by 13.0%, 13.5%, 11.0% and 0.3 for the Action, Trans., FrameQA and Count tasks. It also outperforms the best competitor (i.e., with two representations) on the Action, Trans., FrameQA tasks by 4.1%, 4.7%, and 5.1%.
APA, Harvard, Vancouver, ISO, and other styles
33

Hou, Yanzhao, Nan Hu, Qimei Cui, and Xiaofeng Tao. "Performance analysis of scalable video transmission in machine-type-communication caching network." International Journal of Distributed Sensor Networks 15, no. 1 (January 2019): 155014771881585. http://dx.doi.org/10.1177/1550147718815851.

Full text
Abstract:
In this article, different from traditional Device-to-Device caching wireless cellular networks, we consider scalable video coding performance in a cache-based machine-type communication network, where popular videos encoded by the scalable video coding method can be cached at machine-type devices with limited memory space. We conduct a comprehensive analysis of the caching hit probability using stochastic geometry, which measures the probability of requested video files being cached by nearby local devices, and of the user satisfaction index, which is essential for delay-sensitive video streams. Simulation results, using the Random cache method and the Popularity Priority cache method, prove the derivation of the performance metrics to be correct. It is also demonstrated that the scalable video coding-based caching method can be applied according to different user requirements as well as video-type requests, to achieve a better performance.
APA, Harvard, Vancouver, ISO, and other styles
34

Zhou, Tianfei, Shunzhou Wang, Yi Zhou, Yazhou Yao, Jianwu Li, and Ling Shao. "Motion-Attentive Transition for Zero-Shot Video Object Segmentation." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 07 (April 3, 2020): 13066–73. http://dx.doi.org/10.1609/aaai.v34i07.7008.

Full text
Abstract:
In this paper, we present a novel Motion-Attentive Transition Network (MATNet) for zero-shot video object segmentation, which provides a new way of leveraging motion information to reinforce spatio-temporal object representation. An asymmetric attention block, called Motion-Attentive Transition (MAT), is designed within a two-stream encoder, which transforms appearance features into motion-attentive representations at each convolutional stage. In this way, the encoder becomes deeply interleaved, allowing for closely hierarchical interactions between object motion and appearance. This is superior to the typical two-stream architecture, which treats motion and appearance separately in each stream and often suffers from overfitting to appearance information. Additionally, a bridge network is proposed to obtain a compact, discriminative and scale-sensitive representation for multi-level encoder features, which is further fed into a decoder to achieve segmentation results. Extensive experiments on three challenging public benchmarks (i.e., DAVIS-16, FBMS and Youtube-Objects) show that our model achieves compelling performance against the state-of-the-arts. Code is available at: https://github.com/tfzhou/MATNet.
APA, Harvard, Vancouver, ISO, and other styles
35

Wang, Ming Wei, Jian Zhong Lin, and Qi Wang. "Research on MPEG Video Stream Transmission Performance Based on NS-2." Applied Mechanics and Materials 443 (October 2013): 412–16. http://dx.doi.org/10.4028/www.scientific.net/amm.443.412.

Full text
Abstract:
Following the deployment of network multimedia applications, such as video conferencing, simulation research on video stream characteristics is increasingly important. The MPEG standard encodes video sequences into I, P, and B frames, and has a high compression rate and bandwidth-saving characteristics. MPEG video traffic characteristics and simulation methods are researched in this paper. The Network Simulator version 2 (NS-2) kernel is extended using the C/C++ programming language. An MPEG video traffic generator module and simulation interfaces are designed in the NS-2 kernel. Simulation results show that the proposed extension method is feasible and effective.
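The I/P/B frame structure that such a traffic generator must reproduce can be sketched independently of NS-2. The following is an illustrative GOP-pattern source in Python (the GOP string, the per-type mean sizes, and the exponential scatter are assumptions for the sketch, not the paper's module):

```python
import itertools
import random

# Frames are emitted in the fixed pattern IBBPBBPBB..., with type-dependent
# mean sizes reflecting MPEG's compression hierarchy (I > P > B).
GOP = "IBBPBBPBB"
MEAN_BYTES = {"I": 12000, "P": 5000, "B": 2000}   # assumed means, bytes/frame

def mpeg_frames(seed=0):
    """Yield an endless stream of (frame_type, frame_size_bytes) pairs."""
    rng = random.Random(seed)
    for ftype in itertools.cycle(GOP):
        # exponential scatter around the type mean keeps sizes positive
        yield ftype, max(1, int(rng.expovariate(1.0 / MEAN_BYTES[ftype])))

gen = mpeg_frames()
trace = [next(gen) for _ in range(9)]             # one full GOP
```

An NS-2 extension would wrap the same schedule in a C++ `Application` subclass and convert each frame size into packets at the configured frame rate; the Python version just makes the GOP logic explicit.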
APA, Harvard, Vancouver, ISO, and other styles
36

Stensland, Håkon Kvale, Martin Alexander Wilhelmsen, Vamsidhar Reddy Gaddam, Asgeir Mortensen, Ragnar Langseth, Carsten Griwodz, and Pål Halvorsen. "Using a Commodity Hardware Video Encoder for Interactive Applications." International Journal of Multimedia Data Engineering and Management 6, no. 3 (July 2015): 17–31. http://dx.doi.org/10.4018/ijmdem.2015070102.

Full text
Abstract:
Over the last years, video streaming has become one of the most dominant Internet services. Due to the increased availability of high-speed Internet access, multimedia services are becoming more interactive. Examples of such applications are both cloud gaming (OnLive, 2014) and systems where users can interact with high-resolution content (Gaddam et al., 2014). During the last few years, programmable hardware video encoders have been built into commodity hardware such as CPUs and GPUs. One of these encoders is evaluated in a scenario where individual streams are delivered to the end users. The results show that the visual video quality and the frame size of the hardware-based encoder are comparable to those of a software-based approach. To evaluate a complete system, a proposed streaming pipeline has been implemented into Quake III. It was found that running the game on a remote server and streaming the video output to a client web browser located in a typical home environment is possible and enjoyable. The interaction latency is measured to be less than 90 ms, which is below what is reported for OnLive in a similar environment.
APA, Harvard, Vancouver, ISO, and other styles
37

Wu, Xiao, and Qingge Ji. "TBRNet: Two-Stream BiLSTM Residual Network for Video Action Recognition." Algorithms 13, no. 7 (July 15, 2020): 169. http://dx.doi.org/10.3390/a13070169.

Full text
Abstract:
Modeling spatiotemporal representations is one of the most essential yet challenging issues in video action recognition. Existing methods lack the capacity to accurately model either the correlations between spatial and temporal features or the global temporal dependencies. Inspired by the two-stream network for video action recognition, we propose an encoder–decoder framework named Two-Stream Bidirectional Long Short-Term Memory (LSTM) Residual Network (TBRNet) which takes advantage of the interaction between spatiotemporal representations and global temporal dependencies. In the encoding phase, the two-stream architecture, based on the proposed Residual Convolutional 3D (Res-C3D) network, extracts features with residual connections inserted between the two pathways, and then the features are fused to become the short-term spatiotemporal features of the encoder. In the decoding phase, those short-term spatiotemporal features are first fed into a temporal attention-based bidirectional LSTM (BiLSTM) network to obtain long-term bidirectional attention-pooling dependencies. Subsequently, those temporal dependencies are integrated with short-term spatiotemporal features to obtain global spatiotemporal relationships. On two benchmark datasets, UCF101 and HMDB51, we verified the effectiveness of our proposed TBRNet by a series of experiments, and it achieved competitive or even better results compared with existing state-of-the-art approaches.
APA, Harvard, Vancouver, ISO, and other styles
38

Ye, Xi En, and Yin Feng Zhu. "The Research of Video Signal Capture and Processing Technology in Microteching System." Applied Mechanics and Materials 229-231 (November 2012): 1582–85. http://dx.doi.org/10.4028/www.scientific.net/amm.229-231.1582.

Full text
Abstract:
This paper describes a video signal processing system based on research into the DirectShow streaming media processing platform, H.264/AVC video coding technology, and the multi-channel video encoding chip SC8919, which includes an H.264 hardware video encoder. The system can compress four channels of D1 (720*576) image data collected by cameras at a rate of 30fps into a standard H.264 stream, and upload it to the control room for storage and decoded playback over the existing campus Ethernet without laying additional lines.
APA, Harvard, Vancouver, ISO, and other styles
39

H, Sivalingan. "CLOUD-SMART SURVEILLANCE: ENHANCING ANOMALY DETECTION IN VIDEO STREAMS WITH DF-CONVLSTM-BASED VAE-GAN." Kufa Journal of Engineering 15, no. 4 (November 1, 2024): 125–40. http://dx.doi.org/10.30572/2018/kje/150409.

Full text
Abstract:
Anomaly detection in computer vision is crucial, and manual identification of irregularities in videos is resource-intensive. Autonomous systems are essential for efficiently analysing and detecting anomalies in diverse video datasets. Video surveillance relies heavily on anomaly detection for monitoring equipment states through time-series data. Presently, deep learning methods, particularly those based on Generative Adversarial Networks (GAN), have gained prominence in time-series anomaly detection. This paper proposes a novel solution: the double-flow convolutional Long Short-Term Memory (DF-ConvLSTM) - based Variational Autoencoder-Generative Adversarial Network (VAE-GAN) method. By co-training the encoder, generator, and discriminator, this approach leverages the encoder's mapping skills and the discriminator's discrimination capabilities simultaneously. The proposed strategy is compared with LSTM-VAE, LSTM-VAE-Attention, and VAE, and is evaluated using metrics for recall, accuracy, precision, and F1 score. With a classification accuracy of 91% on the University of Central Florida (UCF) crime dataset, the experimental results outperformed alternative techniques. Furthermore, the analysis of the ROC curve revealed that the suggested method performed better than the others, as evidenced by its higher ROC (Receiver Operating Characteristic) values. Experimental results demonstrate the proposed method's ability to rapidly and accurately detect anomalies in surveillance videos, ensuring efficient and reliable anomaly detection. However, challenges include high computational costs, affecting the practicality of implementation for real-time anomaly detection.
APA, Harvard, Vancouver, ISO, and other styles
40

Li, Tong, Xinyue Chen, Fushun Zhu, Zhengyu Zhang, and Hua Yan. "Two-stream deep spatial-temporal auto-encoder for surveillance video abnormal event detection." Neurocomputing 439 (June 2021): 256–70. http://dx.doi.org/10.1016/j.neucom.2021.01.097.

Full text
APA, Harvard, Vancouver, ISO, and other styles
41

Wydrych, Piotr, Krzysztof Rusek, and Piotr Cholda. "Efficient Modelling of Traffic and Quality of Scalable Video Coding (SVC) Encoded Streams." IEEE Communications Letters 17, no. 12 (December 2013): 2372–75. http://dx.doi.org/10.1109/lcomm.2013.110613.132163.

Full text
APA, Harvard, Vancouver, ISO, and other styles
42

Boyadjis, Benoit, Cyril Bergeron, Beatrice Pesquet-Popescu, and Frederic Dufaux. "Extended Selective Encryption of H.264/AVC (CABAC)- and HEVC-Encoded Video Streams." IEEE Transactions on Circuits and Systems for Video Technology 27, no. 4 (April 2017): 892–906. http://dx.doi.org/10.1109/tcsvt.2015.2511879.

Full text
APA, Harvard, Vancouver, ISO, and other styles
43

Xu, Heng, Cheng Hua Fu, and Yun Jin Yang. "A Design and Implementation of Video Monitoring System at Smart Home." Applied Mechanics and Materials 568-570 (June 2014): 1162–67. http://dx.doi.org/10.4028/www.scientific.net/amm.568-570.1162.

Full text
Abstract:
The traditional real-time video surveillance system for the smart home tends to occupy a lot of resources. In order to solve this problem, a design for an infrared-sensor-triggered video monitoring system is proposed in this paper. The system uses ARM9-Linux as the platform and an infrared sensor as the trigger device, and uses the MPEG-4 algorithm to encode the video stream. The article mainly introduces how to build the hardware and software platform and tests the feasibility of the system.
APA, Harvard, Vancouver, ISO, and other styles
44

Chen, Baoju, Simin Yu, Zeqing Zhang, David Day-Uei Li, and Jinhu Lü. "Design and Smartphone Implementation of Chaotic Duplex H.264-Codec Video Communications." International Journal of Bifurcation and Chaos 31, no. 03 (March 15, 2021): 2150045. http://dx.doi.org/10.1142/s0218127421500450.

Full text
Abstract:
In this paper, a chaotic duplex H.264-codec-based secure video communication scheme is designed and its smartphone implementation is also carried out. First, an improved self-synchronous chaotic stream cipher algorithm equipped with a sinusoidal modulation, a multiplication, a modulo operation and a round down operation (SCSCA-SMMR) is developed. Using the sinusoidal modulation and multiplication, the improved algorithm can resist the divide-and-conquer attack by traversing multiple nonzero component initial conditions (DCA-TMNCIC). Meanwhile, also by means of the round down operation and modulo operation, on the premise that the DCA-TMNCIC does not work, the original keys cannot be further deciphered only by the known-plaintext attack, the chosen-plaintext attack and the chosen-ciphertext attack, respectively. Then, the Android low-level multimedia support infrastructure MediaCodec class is used to access low-level media encoder/decoder components and the H.264 hardware encoding/decoding is performed on real-time videos, so the chaotic video encryption and decryption can be realized in real-time by smartphones. Security analysis and smartphone experimental results verify the effectiveness of the proposed method.
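The general pattern behind such chaotic stream ciphers is iterating a chaotic map and XOR-ing quantised state bytes into the payload. The toy below uses a plain 1-D logistic map for illustration only; it is NOT the paper's 4-D self-synchronous cipher, and the logistic map alone is known to be cryptographically weak:

```python
def logistic_keystream(x0=0.37, r=3.99, n=16, skip=100):
    """Generate n keystream bytes by quantising iterates of x -> r*x*(1-x)."""
    x, out = x0, []
    for i in range(skip + n):          # burn `skip` iterates to decorrelate from x0
        x = r * x * (1.0 - x)
        if i >= skip:
            out.append(int(x * 256) & 0xFF)
    return out

def xor_bytes(data: bytes, key_x0=0.37) -> bytes:
    """XOR data with the keystream; applying it twice restores the input."""
    ks = logistic_keystream(x0=key_x0, n=len(data))
    return bytes(b ^ k for b, k in zip(data, ks))

cipher = xor_bytes(b"NAL payload bits")
plain = xor_bytes(cipher)              # same keystream, so XOR inverts itself
```

A production scheme like the one described above replaces the map with a 4-D self-synchronous system and encrypts only selected H.264 syntax elements, so the bitstream stays decodable by a standard parser.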
APA, Harvard, Vancouver, ISO, and other styles
45

Mallik, Bruhanth, Akbar Sheikh-Akbari, Pooneh Bagheri Zadeh, and Salah Al-Majeed. "HEVC Based Frame Interleaved Coding Technique for Stereo and Multi-View Videos." Information 13, no. 12 (November 25, 2022): 554. http://dx.doi.org/10.3390/info13120554.

Full text
Abstract:
The standard HEVC codec and its extension for coding multiview videos, known as MV-HEVC, have proven to deliver improved visual quality compared to its predecessor, H.264/MPEG-4 AVC's multiview extension, H.264-MVC, for the same frame resolution with up to 50% bitrate savings. MV-HEVC's framework is similar to that of H.264-MVC, which uses a multi-layer coding approach. Hence, MV-HEVC would require all frames from other reference layers to be decoded prior to decoding a new layer. Thus, the multi-layer coding architecture would be a bottleneck when it comes to quicker frame streaming across different views. In this paper, an HEVC-based Frame Interleaved Stereo/Multiview Video Codec (HEVC-FISMVC) that uses a single-layer encoding approach to encode stereo and multiview video sequences is presented. The frames of stereo or multiview video sequences are interleaved in such a way that encoding the resulting monoscopic video stream would maximize the exploitation of temporal, inter-view, and cross-view correlations, thus improving the overall coding efficiency. The coding performance of the proposed HEVC-FISMVC codec is assessed and compared with that of the standard MV-HEVC for three standard multi-view video sequences, namely: "Poznan_Street", "Kendo" and "Newspaper1". Experimental results show that the proposed codec provides more substantial coding gains than the anchor MV-HEVC for coding both stereo and multi-view video sequences.
APA, Harvard, Vancouver, ISO, and other styles
46

Li, Xiao Ni, He Xin Chen, and Da Zhong Wang. "Research on Audio-Video Synchronization Coding Based on Mode Selection in H.264." Applied Mechanics and Materials 182-183 (June 2012): 701–5. http://dx.doi.org/10.4028/www.scientific.net/amm.182-183.701.

Full text
Abstract:
An embedded audio-video synchronization compression coding approach is presented. The proposed method takes advantage of the different mode types used by the H.264 encoder during the inter-prediction stage: different modes carry corresponding audio information, and the audio is embedded into the video stream through mode selection during inter prediction; synchronization coding is then applied to the mixed video and audio. We have verified the synchronization processing method on H.264/AVC using the JM reference model, and experimental results show that this method achieves synchronization between audio and video at a small embedding cost; at the same time, the audio signal can be extracted without distortion, and the method has hardly any effect on the quality of the video image.
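The mechanism is a form of data hiding: the encoder's choice among inter-prediction partition modes doubles as a channel for audio bits, and the decoder recovers them by reading the modes back. The paper does not specify the mode-to-bit mapping, so the sketch below assumes a direct two-bits-per-macroblock index into four H.264 partition modes, ignoring the rate-distortion cost a real encoder would also weigh.

```python
# H.264 macroblock-level inter-prediction partition modes
MODES = ["16x16", "16x8", "8x16", "8x8"]

def embed_bits(audio_bits):
    """Choose one partition mode per macroblock so it carries two audio bits.

    The direct bits->mode-index mapping is an assumption for illustration;
    the paper's scheme balances embedding against coding efficiency.
    """
    chosen = []
    for i in range(0, len(audio_bits), 2):
        idx = audio_bits[i] * 2 + audio_bits[i + 1]
        chosen.append(MODES[idx])
    return chosen

def extract_bits(modes):
    """Decoder side: recover the audio bits from the observed modes."""
    bits = []
    for m in modes:
        idx = MODES.index(m)
        bits += [idx >> 1, idx & 1]
    return bits
```

Because the modes are part of the compressed bitstream itself, the audio is extracted losslessly and stays frame-accurately synchronized with the video it rides in.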
APA, Harvard, Vancouver, ISO, and other styles
47

Cheng-Hsin Hsu and Mohamed M. Hefeeda. "Broadcasting Video Streams Encoded With Arbitrary Bit Rates in Energy-Constrained Mobile TV Networks." IEEE/ACM Transactions on Networking 18, no. 3 (June 2010): 681–94. http://dx.doi.org/10.1109/tnet.2009.2033058.

Full text
APA, Harvard, Vancouver, ISO, and other styles
48

Koumaras, Harilaos, Charalampos Skianis, and Anastasios Kourtis. "Analysis and Modeling of H.264 Unconstrained VBR Video Traffic." International Journal of Mobile Computing and Multimedia Communications 1, no. 4 (October 2009): 14–31. http://dx.doi.org/10.4018/jmcmc.2009072802.

Full text
Abstract:
In future communication networks, video is expected to represent a large portion of the total traffic, given that video services, especially variable bit rate (VBR) coded video streams, are becoming increasingly popular. Consequently, traffic modeling and characterization of such video services are essential for efficient traffic control and resource management. Besides providing insight into video coding mechanisms, traffic models can be used as a tool for the allocation of network resources, the design of efficient networks for streaming services, and the assurance of specific QoS characteristics to end users. The new H.264/AVC standard, proposed by the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG), is expected to dominate upcoming multimedia services, since it outperforms previous coding standards in many respects. This article presents both a frame-level and a layer-level (i.e., I, P, and B frames) analysis of H.264 encoded sources. Analysis of the data suggests that the video traffic can be considered a stationary stochastic process with an exponentially fast decaying autocorrelation function and a marginal frame-size distribution of approximately Gamma form. Finally, based on the statistical analysis, an efficient model of H.264 video traffic is proposed.
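The statistical claim — a roughly Gamma-shaped marginal frame-size distribution — can be checked with a simple method-of-moments fit, since a Gamma's shape and scale follow directly from the sample mean and variance. The sketch below is a generic illustration of that fitting step (plus a sample autocorrelation estimator), not the paper's full traffic model; the synthetic parameters are arbitrary.

```python
import random

def fit_gamma(frame_sizes):
    """Method-of-moments fit of a Gamma marginal to frame sizes.

    For Gamma(shape k, scale s): mean = k*s, variance = k*s^2, so
    k = mean^2/var and s = var/mean.
    """
    n = len(frame_sizes)
    mean = sum(frame_sizes) / n
    var = sum((x - mean) ** 2 for x in frame_sizes) / n
    return mean ** 2 / var, var / mean    # (shape, scale)

def autocorr(xs, lag):
    """Sample autocorrelation of a series at a given lag."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    cov = sum((xs[t] - mean) * (xs[t + lag] - mean)
              for t in range(n - lag)) / n
    return cov / var

# Synthetic check with made-up parameters: draw Gamma "frame sizes"
# (shape 2.0, scale 1500 bytes) and recover them from the moments.
random.seed(1)
sizes = [random.gammavariate(2.0, 1500.0) for _ in range(10000)]
shape, scale = fit_gamma(sizes)
```

On real traces one would fit per frame type (I, P, B), since the layer-level analysis in the article shows each layer has its own marginal; the exponentially decaying autocorrelation would then motivate an autoregressive component on top of the Gamma marginal.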
APA, Harvard, Vancouver, ISO, and other styles
49

Jin, Yao, Guocheng Niu, Xinyan Xiao, Jian Zhang, Xi Peng, and Jun Yu. "Knowledge-Constrained Answer Generation for Open-Ended Video Question Answering." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 7 (June 26, 2023): 8141–49. http://dx.doi.org/10.1609/aaai.v37i7.25983.

Full text
Abstract:
Open-ended video question answering (open-ended VideoQA) aims to understand video content and question semantics to generate correct answers. Most of the best-performing models define the problem as a discriminative multi-label classification task. In real-world scenarios, however, it is difficult to define a candidate set that includes all possible answers. In this paper, we propose a Knowledge-constrained Generative VideoQA Algorithm (KcGA) with an encoder-decoder pipeline, which enables out-of-domain answer generation through an adaptive external knowledge module and a multi-stream information control mechanism. We use ClipBERT to extract the video-question features, extract frame-wise object-level external knowledge from a commonsense knowledge base, compute context-aware episodic memory units via an attention-based GRU to form the external knowledge features, and exploit the multi-stream information control mechanism to fuse the video-question and external knowledge features so that semantic complementation and alignment are well achieved. We evaluate our model on two open-ended benchmark datasets to demonstrate that we can effectively and robustly generate high-quality answers without the restrictions of the training data.
APA, Harvard, Vancouver, ISO, and other styles
50

Wang, Xiu Qing, Xu Qie, Chun Xia Zhang, and Na Zhao. "Research of Video Surveillance and Diagnosis System for Plant Diseases Based on DM6446." Applied Mechanics and Materials 373-375 (August 2013): 892–96. http://dx.doi.org/10.4028/www.scientific.net/amm.373-375.892.

Full text
Abstract:
The TMS320DM6446 was used as a platform to build a video monitoring and control system for plant diseases. The TMS320DM6446 captures and encodes the plant video and delivers the video stream to the PC monitoring center using streaming technology. The monitoring center displays the received video, monitors plant growth and disease conditions, and then obtains the disease image and its relevant characteristics. It uses a combination of neural networks and expert systems to diagnose plant diseases and provide control methods. The experimental results show that the proposed system can collect information about plant growth status and disease in a timely manner, diagnose the type of disease correctly, and thus achieve effective prevention and control of plant diseases.
APA, Harvard, Vancouver, ISO, and other styles