Dissertations / Theses on the topic 'Video retrieval'
Consult the top 50 dissertations / theses for your research on the topic 'Video retrieval.'
Pickering, Marcus Jerome. "Video retrieval and summarisation." Thesis, Imperial College London, 2004. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.411790.
Chen, Juan. "Content-based Digital Video Processing. Digital Videos Segmentation, Retrieval and Interpretation." Thesis, University of Bradford, 2009. http://hdl.handle.net/10454/4256.
Faichney, Jolon. "Content-Based Retrieval of Digital Video." Thesis, Griffith University, 2005. http://hdl.handle.net/10072/365697.
Thesis (PhD Doctorate), Doctor of Philosophy (PhD), School of Information Technology, Science, Environment, Engineering and Technology.
Banda, Nagamani. "Adaptive video segmentation." Morgantown, W. Va. : [West Virginia University Libraries], 2004. https://etd.wvu.edu/etd/controller.jsp?moduleName=documentdata&jsp%5FetdId=3520.
Title from document title page. Document formatted into pages; contains vi, 52 p. : ill. (some col.). Includes abstract. Includes bibliographical references (p. 50-52).
Vrochidis, Stefanos. "Interactive video retrieval using implicit user feedback." Thesis, Queen Mary, University of London, 2013. http://qmro.qmul.ac.uk/xmlui/handle/123456789/8729.
Zhang, Lelin. "Scalable Content-Based Image and Video Retrieval." Thesis, The University of Sydney, 2016. http://hdl.handle.net/2123/15439.
Aytar, Yusuf. "Semantic Video Retrieval Using High Level Context." Master's thesis, University of Central Florida, 2008. http://digital.library.ucf.edu/cdm/ref/collection/ETD/id/3455.
M.S., School of Electrical Engineering and Computer Science, Engineering and Computer Science, Computer Science MS.
Volkmer, Timo. "Semantics of Video Shots for Content-based Retrieval." RMIT University, Computer Science and Information Technology, 2007. http://adt.lib.rmit.edu.au/adt/public/adt-VIT20090220.122213.
Fernández Beltrán, Rubén. "Characterisation and adaptive learning in interactive video retrieval." Doctoral thesis, Universitat Jaume I, 2016. http://hdl.handle.net/10803/387220.
In this work, we are interested in the use of latent topics to overcome the current limitations in CBVR. Despite the potential of topic models to uncover the hidden structure of a collection, they have traditionally been unable to provide a competitive advantage in CBVR because of the high computational cost of their algorithms and the complexity of the latent space in the visual domain. Throughout this thesis we focus on designing new models and tools based on topic models to take advantage of the latent space in CBVR. Specifically, we have worked in four different areas within the retrieval process: vocabulary reduction, encoding, modelling and ranking, with our most important contributions relating to modelling and ranking.
Demirdizen, Goncagul. "An Ontology-driven Video Annotation And Retrieval System." Master's thesis, METU, 2010. http://etd.lib.metu.edu.tr/upload/12612592/index.pdf.
The system supports the "(", ")", "AND" and "OR" operators. For all these query types, the system supports both general and video-specific query processing. By this means, the user is able to pose queries on all videos in the video databases as well as on the details of a specific video of interest.
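The operator set above suggests a simple boolean query evaluator. As an illustration only (the function names and the left-to-right evaluation order are assumptions, not the thesis's ontology-driven implementation), a query over a video's annotation keywords might be evaluated like this:

```python
# Minimal boolean annotation-query evaluator supporting "(", ")", "AND", "OR".
# Hypothetical sketch; operators are applied left to right with no precedence.

def tokenize(query):
    return query.replace("(", " ( ").replace(")", " ) ").split()

def evaluate(query, annotations):
    """Return True if the annotation set satisfies the boolean query."""
    tokens = tokenize(query)
    pos = 0

    def parse_expr():
        nonlocal pos
        result = parse_term()
        while pos < len(tokens) and tokens[pos] in ("AND", "OR"):
            op = tokens[pos]; pos += 1
            rhs = parse_term()
            result = (result and rhs) if op == "AND" else (result or rhs)
        return result

    def parse_term():
        nonlocal pos
        if tokens[pos] == "(":
            pos += 1
            inner = parse_expr()
            pos += 1  # skip the closing ")"
            return inner
        term = tokens[pos]; pos += 1
        return term in annotations

    return parse_expr()
```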
Mohanna, Farahnaz. "Content based video database retrieval using shape features." Thesis, University of Surrey, 2002. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.250764.
Thomas, Naveen Moham. "Motion based video object detection for event retrieval." Thesis, University of Bristol, 2007. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.441380.
Markatopoulou, Foteini. "Machine learning architectures for video annotation and retrieval." Thesis, Queen Mary, University of London, 2018. http://qmro.qmul.ac.uk/xmlui/handle/123456789/44693.
Davis, Marc Eliot. "Media streams--representing video for retrieval and repurposing." Thesis, Massachusetts Institute of Technology, 1995. http://hdl.handle.net/1721.1/29088.
Kolonias, I. "Cognitive vision systems for video understanding and retrieval." Thesis, University of Surrey, 2007. http://epubs.surrey.ac.uk/843661/.
Lin, Ming. "Automated Lecture Video Segmentation: Facilitate Content Browsing and Retrieval." Diss., The University of Arizona, 2006. http://hdl.handle.net/10150/193843.
Park, Dong-Jun. "Video event detection framework on large-scale video data." Diss., University of Iowa, 2011. https://ir.uiowa.edu/etd/2754.
Akpinar, Samet. "Ontology Based Semantic Retrieval Of Video Contents Using Metadata." Master's thesis, METU, 2007. http://etd.lib.metu.edu.tr/upload/12608772/index.pdf.
Wang, Lei. "Content based video retrieval via spatial-temporal information discovery." Thesis, Robert Gordon University, 2013. http://hdl.handle.net/10059/1119.
Durak, Nurcan. "Semantic Video Modeling And Retrieval With Visual, Auditory, Textual Sources." Master's thesis, METU, 2004. http://etd.lib.metu.edu.tr/upload/12605438/index.pdf.
Ren, Jinchang. "Semantic content analysis for effective video segmentation, summarisation and retrieval." Thesis, University of Bradford, 2009. http://hdl.handle.net/10454/4251.
Ren, Cathy Wei. "A novel approach to video retrieval using spatio-temporal information." Thesis, University of Exeter, 2005. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.425494.
Anjulan, Arasanathan. "Intelligent content based video retrieval based on local region tracks." Thesis, University of Bristol, 2007. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.445827.
Roux, Matthew John. "ATOM : a distributed system for video retrieval via ATM networks." Master's thesis, University of Cape Town, 1999. http://hdl.handle.net/11427/21327.
Guan, Genliang. "Novel perspectives and approaches to video summarization." Thesis, The University of Sydney, 2014. http://hdl.handle.net/2123/13550.
Yusoff, Yusseri. "Automatic detection of shot boundaries in digital video." Thesis, University of Surrey, 2002. http://epubs.surrey.ac.uk/843079/.
Hopfgartner, Frank. "Personalised video retrieval : application of implicit feedback and semantic user profiles." Thesis, University of Glasgow, 2010. http://theses.gla.ac.uk/2132/.
Wang, Shih-Han (王詩涵). "Video-based Clothing Retrieval." Thesis, 2013. http://ndltd.ncl.edu.tw/handle/93002261881198212631.
National Taiwan University
Graduate Institute of Networking and Multimedia
Academic year 102 (ROC calendar)
Nowadays, clothing retrieval has become a thriving demand for online clothing shopping websites. Beyond keyword-based clothing search, image-based clothing retrieval has generated interest in recent research papers. It enables more interesting clothing recommendation systems and may improve identity or occupation recognition. In this paper, we present a brand-new video-based clothing retrieval system. The system offers an intuitive clothing recommendation interface in a smart home with the following application scenario: with a TV remote control, one can select an impressive shot in which a character is wearing a fascinating piece of clothing, and learn the clothing style from the character. However, there are still major challenges in this topic, such as human pose estimation and the complex backgrounds of online shopping datasets, which often cause inaccurate retrieval results. Our research focuses on two issues. First, we propose a human pose estimation mechanism that uses a video clip of consecutive frames to refine inaccurate human poses. Second, we explore an automatic foreground segmentation method based on the GrabCut algorithm to tackle the complex background problem. In our experiments, we collect several video clips and different kinds of online shopping datasets. The experimental results demonstrate that our mechanism improves inaccurate pose estimation and can tackle the complex background problem.
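The temporal pose-refinement idea above can be sketched with a median filter over per-frame joint positions. This is only a plausible reading of "using a video clip of frames to refine inaccurate poses", not the thesis's actual mechanism:

```python
# Hedged sketch: refine noisy per-frame pose estimates with a temporal median
# filter over a sliding window of frames, damping single-frame outliers.
from statistics import median

def smooth_pose(track, window=5):
    """track: list of (x, y) positions of one joint, one entry per frame.
    Returns a list of the same length with each position replaced by the
    median over a centred window (clipped at the clip boundaries)."""
    half = window // 2
    smoothed = []
    for i in range(len(track)):
        lo, hi = max(0, i - half), min(len(track), i + half + 1)
        xs = [p[0] for p in track[lo:hi]]
        ys = [p[1] for p in track[lo:hi]]
        smoothed.append((median(xs), median(ys)))
    return smoothed
```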
Lin, Chia-Hsuan (林家玄). "Video Database Retrieval System." Thesis, 2006. http://ndltd.ncl.edu.tw/handle/39317009085440223435.
National Sun Yat-sen University
Department of Computer Science and Engineering
Academic year 94 (ROC calendar)
In the digital era, more and more people use digital video. As the number of users and the amount of video data grow, the management of video data becomes a significant issue, so there is a growing body of work on video database systems that allow users to search for and retrieve videos. In this paper, a novel method for video scene-change detection and video database retrieval is proposed. Fractal orthonormal bases are used to guarantee that similar indices correspond to similar images, in combination with support vector clustering: a video is split into a sequence of shots, and a few representative frames (key-frames) are extracted from each shot to serve as the video database index. During retrieval, features of the query image are extracted according to MIL, videos in the database with similar features are located, their similarity is computed, and the results are ranked accordingly.
Lu, Kai-Hui (陸凱暉). "Color-based Video Retrieval." Thesis, 2005. http://ndltd.ncl.edu.tw/handle/29683920492533261928.
Bai, Yannan. "Video analytics system for surveillance videos." Thesis, 2018. https://hdl.handle.net/2144/30739.
Su, Chih-Wen (蘇志文). "Content-based Video Retrieval Techniques for MPEG Video." Thesis, 2006. http://ndltd.ncl.edu.tw/handle/9davy5.
National Central University
Institute of Computer Science and Information Engineering
Academic year 94 (ROC calendar)
Gradual shot change detection is one of the most important research issues in the field of video indexing/retrieval. Among the numerous types of gradual transitions, the dissolve-type gradual transition is considered the most common one, but it is also the most difficult one to detect. In most of the existing dissolve detection algorithms, the false/miss detection problem caused by motion is very serious. In this thesis, we present a novel dissolve-type transition detection algorithm that can correctly distinguish dissolves from disturbances caused by motion. We carefully model a dissolve based on its nature and then use the model to filter out possible confusion caused by the effect of motion. Furthermore, we propose the use of motion vectors embedded in MPEG bitstreams to generate so-called "motion flows", which are applied to perform quick video retrieval. By using the motion vectors directly, we do not need to consider the shape of a moving object and its corresponding trajectory. Instead, we simply "link" the local motion vectors across consecutive video frames to form motion flows, which are then annotated and stored in a video database. In the video retrieval phase, we propose a new matching strategy to execute the video retrieval task. Motions that do not belong to the mainstream motion flows are filtered out by our proposed algorithm. The retrieval process can be triggered by query-by-sketch (QBS) or query-by-example (QBE). The experimental results show that our method is indeed efficient and accurate in the video retrieval process.
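The "linking" of local motion vectors into a motion flow might look roughly like the following; the block grid and the snap-to-grid chaining rule are assumptions for illustration, not the thesis's exact algorithm:

```python
# Illustrative sketch: chain per-frame block motion vectors into a motion flow.

def link_motion_flow(fields, start, block=16):
    """fields: one dict per frame mapping block centre (x, y) -> motion
    vector (dx, dy). Starting from `start`, follow each frame's vector and
    snap the displaced position to the block grid of the next frame."""
    flow = [start]
    pos = start
    for field in fields:
        if pos not in field:          # no vector here: the flow ends
            break
        dx, dy = field[pos]
        pos = (round((pos[0] + dx) / block) * block,
               round((pos[1] + dy) / block) * block)
        flow.append(pos)
    return flow
```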
Chin, Kuo-Hao (秦國豪). "An integrated approach to video retrieval." Thesis, 2005. http://ndltd.ncl.edu.tw/handle/98338076773433550159.
Fu Jen Catholic University
Department of Computer Science and Information Engineering
Academic year 93 (ROC calendar)
In this paper, a novel approach is proposed for video clip (containing more than one shot) matching and retrieval. In contrast to the traditional key-frame-based shot matching approach and frame sequence matching approach, our method analyzes all frames within a shot to fully exploit the spatio-temporal information, while generating only one representation for each visual feature used in each shot. The visual features utilized in our scheme are color, fused with spatial distribution information, and motion. Video clip matching is performed at two levels: first at the shot level, then at the sequence level. A set of similarity measures is defined to evaluate the similarity between the query and the database video. Experimental results show that the proposed approach is effective and feasible in retrieving and ranking similar video clips with a variety of video contents. Our approach also achieves superior performance in comparison to a key-frame-based retrieval system that uses only a color histogram for feature representation and matching. Furthermore, to refine the search results, a technique called relevance feedback is implemented that takes into account the user's own opinion on whether two clips are similar during the retrieval process. In some cases, an improvement in retrieval performance is observed.
Chen, Yi-Chen (陳怡真). "Video Retrieval Based on Temporal Texture." Thesis, 2004. http://ndltd.ncl.edu.tw/handle/68432539684728164592.
I-Shou University
Department of Information Engineering
Academic year 92 (ROC calendar)
It is well known that textural characteristics play a very important role in human visual perception. Over the last few decades, hundreds of papers have been published on the analysis of static textures, but only a few of them studied the dynamic nature of textures. Textures evolving over time are called temporal textures and are very common in everyday life. Flowing smoke or the wavy water of a river are good examples of temporal textures. A temporal texture can be defined in terms of temporal-spatial pixel value interactions within digital video signals. In general, a video can be viewed as a combination of different temporal texture segments. Therefore, the properties of temporal textures play a fundamental role in content-based video retrieval. In this thesis, we focus on temporal texture extraction, including a directional Markov random field model of temporal texture and temporal texture features. The overall performance of video retrieval using temporal texture features is also evaluated.
Chen, Chi-Horng (陳志弘). "Data Retrieval in Video on Demand." Thesis, 1996. http://ndltd.ncl.edu.tw/handle/37440304698553521333.
National Chung Hsing University
Department of Computer Science
Academic year 84 (ROC calendar)
While the need for applications that access video data stored in digital form is growing rapidly, video systems that retrieve hundreds of videos from disks are becoming increasingly important. A challenge for such systems is to devise schemes that distribute the workload evenly across disks and provide good response times to client requests for video data. Different from previous research that concentrates on specific functions, what we consider in this thesis for interactive video browsing is a full-function system composed of various playout methods, buffer management, disk striping and disk scheduling. In buffer management, we consider not only various policies such as allocation, replacement, prefetching and load control, but also their interactions with the quality of service at both the client and server sites. In disk striping, we propose the frame-object striping method to distribute retrieval requests evenly across disks. In disk scheduling, we address the issue of disk throughput. Taking all these factors into consideration, a mathematical model is set up and analyzed. Equations are derived to compute the minimum buffer size required for smooth playback and the extra buffer needed to support VCR-like functions. According to various qualities of service, several equations are also derived to tune the buffer size. We observe that good throughput is not based only on balancing the disk workload at any browsing rate: cooperating with disk scheduling, appropriate buffer sizes associated with different data access patterns can also achieve good disk throughput. In this case, the number of disks required for an access pattern is inversely proportional to the greatest common divisor of the total number of disks and the browsing rate. Based on these analytical results, we may design an efficient data retrieval mechanism for video-on-demand systems.
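The stated relation between disk count, browsing rate and their greatest common divisor can be checked with a small simulation, assuming round-robin frame striping (the striping layout here is an illustrative assumption):

```python
# With frames striped round-robin over D disks, browsing at rate r touches
# frames 0, r, 2r, ..., which land on exactly D / gcd(D, r) distinct disks.
from math import gcd

def disks_touched(num_disks, rate, num_frames=10_000):
    """Count the distinct disks hit when every rate-th frame is read."""
    return len({(i * rate) % num_disks for i in range(num_frames)})

# The simulation agrees with the closed-form D / gcd(D, r).
for D in (4, 6, 8):
    for r in (1, 2, 3, 4):
        assert disks_touched(D, r) == D // gcd(D, r)
```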
Weng, Pei-Chih (温培志). "Stroke-based Broadcast Basketball Video Retrieval." Thesis, 2013. http://ndltd.ncl.edu.tw/handle/88379461752427588908.
National Chiao Tung University
Institute of Computer Science and Engineering
Academic year 102 (ROC calendar)
We present a stroke-based system that allows users to retrieve basketball video clips easily and intuitively. In contrast to current retrieval systems, which mainly rely on keywords, users can draw player trajectories on our defined basketball court coordinate system and specify corresponding events, such as a shot made or a shot missed, to provide a more specific search condition and prevent unwanted results during retrieval. Since players are perspectively projected onto each video frame and the cameras in broadcast videos are dynamic, the specified strokes and the extracted player trajectories are not directly comparable; we therefore map player positions in each frame to the defined basketball court coordinate system using camera calibration. To achieve a robust mapping, our system considers the whole video clip and reconstructs a panoramic basketball court, then rectifies the panoramic court to our defined court using a homography. This reconstruction maps pixels from a video frame to our defined court coordinate system, and likewise maps player trajectories between the two coordinate systems. To obtain the event of a video clip, we extract the game time using optical character recognition and map it to the event logs defined in a play-by-play text that is available online. Thanks to these two types of semantic information, our system is very helpful to coaches and to a tremendous number of spectators. The retrieved videos with their corresponding search conditions, together with our accompanying video, verify the feasibility of our technique.
Lin, Fang-Ju (林芳如). "H.264 Based Entertainment Video Retrieval." Thesis, 2007. http://ndltd.ncl.edu.tw/handle/96780571006938018308.
National Chiao Tung University
Institute of Multimedia Engineering
Academic year 95 (ROC calendar)
In this thesis, we provide a video retrieval system based on the H.264 video compression format that retrieves complete desired videos from a short video clip. Without completely decompressing a video stream, we parse the H.264 video stream once to extract two useful features: lumas and motions. In the luma feature extraction process, a luma calibration is proposed and similar luma features are removed. In the motion feature extraction process, a statistical scheme is proposed to extract and combine dominant object motions and camera motions. In the retrieval process, the most relevant results are returned to users based on these two features. Our test videos are all entertainment TV programs.
Yu, Chung-Yang (游重陽). "Video retrieval using 2D M-string." Thesis, 2010. http://ndltd.ncl.edu.tw/handle/57175422747260556334.
Tamkang University
Master's Program, Department of Information Management
Academic year 98 (ROC calendar)
In image database systems, Content-Based Image Retrieval (CBIR) is an important approach to image retrieval. How to use strings to express the spatial relationships between objects, and how to perform inference and similarity retrieval over them, have been widely discussed. The concept of 2D B-string notation is used to indicate the changing spatial relationships between moving objects in a video. Each object is represented as a focal point, and the point is marked at the "initiation position" and the "end position" of the object in the video. The section from the initiation position to the end position is defined as the "displacement" of the object in the video. The 2D M-string is created based on the displacements of the objects in the video. Using the 2D M-string for spatial reasoning and search with an index structure, a large number of dissimilar clips can be filtered out, which reduces the number of string matchings, so efficient retrieval of similar videos can be achieved.
Huang, Wei-Hsun (黃煒勛). "Video Retrieval based on spatial events." Thesis, 2012. http://ndltd.ncl.edu.tw/handle/53847328751206122468.
Tamkang University
Master's Program, Department of Information Management
Academic year 100 (ROC calendar)
A video is composed of a sequence of frames, and a frame may contain multiple objects. A single frame captures the spatial relations between the objects in that frame, but it carries no information about how those spatial relations change over the course of the video. In this paper, we define a change in the spatial relations between objects across two adjacent frames as a spatial event, and propose the spatial event string to represent it. The changes in the spatial relations between objects can then be derived from this string, and video queries based on spatial events can be processed efficiently. We propose an algorithm for generating the spatial event string and a way to process video queries based on spatial events. Generally speaking, the number of objects involved in spatial events is relatively small, and the spatial event string contains only the objects involved in spatial events. Compared with other strings, its length can therefore be reduced dramatically. Consequently, both the storage requirement and the processing time of the string can be reduced.
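A minimal sketch of spatial-event detection, assuming a toy left/right/same relation on x-coordinates (the paper's relation alphabet is richer; this is illustrative only):

```python
# Detect "spatial events": frames where the spatial relation between two
# objects differs from the previous frame.

def relation(a, b):
    if a < b:  return "<"   # a is left of b
    if a > b:  return ">"   # a is right of b
    return "="

def spatial_events(xs_a, xs_b):
    """xs_a, xs_b: per-frame x-coordinates of objects A and B.
    Returns the frame indices where the A-B relation changes."""
    rels = [relation(a, b) for a, b in zip(xs_a, xs_b)]
    return [i for i in range(1, len(rels)) if rels[i] != rels[i - 1]]
```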
Chen, Duan-Yu (陳敦裕). "Towards High-Level Content-Based Video Retrieval and Video Structuring." Thesis, 2004. http://ndltd.ncl.edu.tw/handle/22704705540427885349.
National Chiao Tung University
Department of Computer Science and Information Engineering
Academic year 93 (ROC calendar)
With increasing digital video in education, entertainment and other multimedia applications, there is an urgent demand for tools that allow users to acquire desired video data efficiently. Content-based searching, browsing and retrieval is more natural, friendly and semantically meaningful to users. With video compression techniques maturing, many videos are stored in compressed form, and accordingly more and more research focuses on feature extraction from compressed videos, especially in MPEG format. This thesis investigates high-level semantic video features in the compressed domain for efficient video retrieval and browsing. We propose an approach to video abstraction that generates semantically meaningful video clips and associated metadata. Based on the concept of long-term consistency of the spatial-temporal relationship between objects in consecutive P-frames, a multi-object tracking algorithm is designed to locate the objects and generate the trajectory of each object without size constraints. Utilizing the object trajectories coupled with domain knowledge, the event inference module detects and identifies events, applied here to tennis video. Consequently, the event information and the metadata of the associated video clips are extracted, and the abstraction of video streams is accomplished. A novel mechanism is proposed to automatically parse sports videos in the compressed domain and construct a concise table of video content employing superimposed closed captions and the semantic classes of video shots. An efficient closed-caption localization approach first detects caption frames in meaningful shots; caption frames, rather than every frame, are then selected as targets for detecting closed captions based on long-term consistency without size constraints.
Besides, in order to discriminate captions of interest automatically, a novel tool, a font-size detector, is proposed to recognize the font size of closed captions using compressed data in MPEG videos. For effective video retrieval, we propose a high-level motion activity descriptor, the object-based transformed 2D histogram (T2D-Histogram), which exploits both spatial and temporal features to characterize video sequences in a semantics-based manner. The Discrete Cosine Transform (DCT) is applied to convert the object-based 2D-histogram sequences from the time domain to the frequency domain. Using this transform, the high-dimensional time-domain features used to represent successive frames are significantly reduced to a set of low-dimensional features in the frequency domain. The energy concentration property of the DCT allows us to use only a few DCT coefficients to effectively capture the variations of moving objects. With this efficient video representation, one can perform video retrieval accurately and efficiently. Furthermore, we propose a high-level compact motion-pattern descriptor, temporal motion intensity of moving blobs (MIMB) moments, which exploits both spatial invariants and temporal features to characterize video sequences; again, the energy concentration property of the DCT allows a few coefficients to precisely capture the variations of moving blobs. Compared to the motion activity descriptors RLD and SAH of MPEG-7, the proposed descriptor yields 40% and 21% average performance gains, respectively. Comprehensive experiments have been conducted to assess the performance of the proposed methods, and the empirical results show that they outperform state-of-the-art methods on various datasets with different characteristics.
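The DCT energy-compaction argument above can be demonstrated with a plain DCT-II; the signal and the number of retained coefficients are illustrative choices, not values from the thesis:

```python
# A per-frame feature sequence built from low-frequency components is captured
# by its first few DCT coefficients; real activity curves are only roughly
# low-frequency, so truncation would introduce a small error instead of none.
import math

def dct(x):
    """Unnormalised DCT-II."""
    N = len(x)
    return [sum(x[n] * math.cos(math.pi * k * (2 * n + 1) / (2 * N))
                for n in range(N)) for k in range(N)]

def idct(X):
    """Inverse of the unnormalised DCT-II above (a scaled DCT-III)."""
    N = len(X)
    return [X[0] / N + (2 / N) * sum(
                X[k] * math.cos(math.pi * k * (2 * n + 1) / (2 * N))
                for k in range(1, N)) for n in range(N)]

N = 32
signal = [1.0 + math.cos(math.pi * 3 * (2 * n + 1) / (2 * N)) for n in range(N)]
coeffs = dct(signal)
truncated = coeffs[:8] + [0.0] * (N - 8)   # keep 8 of 32 coefficients
approx = idct(truncated)
err = max(abs(a - b) for a, b in zip(signal, approx))
```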
邱敬昌. "Compressed-domain video object extraction for content-based video retrieval." Thesis, 2003. http://ndltd.ncl.edu.tw/handle/89261594499871864116.
Ho, Yu-Hsuan (何宥萱). "Key-Frame Extraction for Video Summarization and Shot-Based Video Retrieval." Thesis, 2004. http://ndltd.ncl.edu.tw/handle/73943136238044640470.
National Chung Cheng University
Institute of Computer Science and Information Engineering
Academic year 92 (ROC calendar)
In this paper, we present an adaptive rate-constrained key-frame selection scheme for channel-aware real-time video streaming and shot-based video retrieval. First, the streaming server dynamically determines the target number of key-frames by estimating the channel conditions according to feedback information. Under the constraint of the target key-frame number, a two-step sequential key-frame selection scheme is adopted: first the optimal allocation of key-frames among the video shots in a clip is found, and then the most representative key-frames in each shot are selected according to that allocation to guide the temporal-downscaling transcoding. After extracting the key-frames, we propose a multi-pass video retrieval scheme using spatio-temporal statistics. In the first pass, the probability distributions of object motion for each shot of the query video clip are extracted and compared with those of the shots in the database using the Bhattacharyya distance. In the second pass, two consecutive shots are employed to introduce a "causality" effect. Finally, in the refinement pass, we extract one key-frame from each shot using our key-frame selection method and calculate the color histogram of each key-frame. We then use the Bhattacharyya distance to compare the similarity of the two key-frames' color histograms and accumulate the second-stage distance to obtain the similarity of two video shots. For both the two-step key-frame selection and the multi-pass video retrieval, our experimental results show that the proposed methods are efficient and satisfactory.
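The Bhattacharyya distance used above to compare motion and colour histograms, in one common formulation (the thesis may normalise differently):

```python
# Bhattacharyya distance between two discrete distributions: 0 for identical
# distributions, growing as they diverge.
import math

def bhattacharyya(p, q):
    """p, q: histograms of the same length (raw counts are fine)."""
    sp, sq = sum(p), sum(q)                 # normalise to probabilities
    bc = sum(math.sqrt((a / sp) * (b / sq)) for a, b in zip(p, q))
    return -math.log(max(bc, 1e-12))        # clamp to avoid log(0)
```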
Teng, Shang-Ju (鄧尚儒). "Motion Trajectory Based Video Indexing and Retrieval." Thesis, 2002. http://ndltd.ncl.edu.tw/handle/07158040655742557732.
National Tsing Hua University
Department of Computer Science
Academic year 90 (ROC calendar)
This thesis presents a technique to efficiently index and retrieve video clips in terms of motion-trajectory-based similarity. We describe a motion trajectory in three representations: the horizontal and vertical movements of the trajectory, and the motion trail that indicates the shape of the trajectory. Each representation is approximated by a polynomial function, and the polynomial coefficients are then indexed for retrieval. To measure the matching distance, we combine different spatio-temporal characteristics to provide flexible retrieval processes. In addition, we also propose a multiscale mode to improve retrieval efficiency. A unified framework is developed to deal with various query types: query-by-example, query-by-sketch, and query-by-specification. We have performed many experiments to confirm the effectiveness and efficiency of our method, and the results indicate satisfactory performance.
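The polynomial approximation of one trajectory component can be sketched as a least-squares fit via the normal equations; degree 2 is an assumed choice here, not the thesis's:

```python
# Fit x(t) ~ c[0] + c[1]*t + ... + c[d]*t**d by least squares, so the small
# coefficient vector can be indexed instead of the full trajectory.

def polyfit(ts, xs, degree=2):
    n = degree + 1
    # Normal equations A^T A c = A^T b for the Vandermonde matrix A.
    ata = [[sum(t ** (i + j) for t in ts) for j in range(n)] for i in range(n)]
    atb = [sum(x * t ** i for t, x in zip(ts, xs)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(ata[r][col]))
        ata[col], ata[piv] = ata[piv], ata[col]
        atb[col], atb[piv] = atb[piv], atb[col]
        for r in range(col + 1, n):
            f = ata[r][col] / ata[col][col]
            for c in range(col, n):
                ata[r][c] -= f * ata[col][c]
            atb[r] -= f * atb[col]
    coeffs = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = atb[r] - sum(ata[r][c] * coeffs[c] for c in range(r + 1, n))
        coeffs[r] = s / ata[r][r]
    return coeffs
```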
Wang, Jhih-Huang (王志煌). "Video Retrieval Based on Color Center Descriptor." Thesis, 2005. http://ndltd.ncl.edu.tw/handle/60820913335568072148.
National Taipei Teachers College
Graduate Institute of Natural Science Education
Academic year 93 (ROC calendar)
In a previous paper, fully digital course material was designed to combine TV programs, a video database distributed through the internet, and real activities conducted in each class; that research found that the course promoted children's learning. With the rapid development of internet technology, the amount of video is growing larger and larger, and efficient retrieval tools are necessary to find the videos we need. MPEG-7, formally named the "Multimedia Content Description Interface", provides a comprehensive set of audiovisual description tools to extract features from audiovisual content and describe audiovisual information, and the color histogram is used to describe the color features of videos. A new descriptor based on the concept of the mass center in physics is proposed in this paper to improve the performance of video retrieval. The experimental results show that the method proposed in this paper is better than the other methods proposed in previous papers.
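A hedged guess at a "mass centre" descriptor in the spirit of the abstract (the thesis's exact definition is not given there): treat one channel's intensities as masses and take the weighted centroid of pixel positions:

```python
# Illustrative "mass centre" of an intensity distribution, borrowing the
# centre-of-mass formula from physics. One channel only, for simplicity.

def color_mass_center(image):
    """image: 2D list of intensities. Returns the (row, col) centre of mass."""
    total = rsum = csum = 0.0
    for r, row in enumerate(image):
        for c, v in enumerate(row):
            total += v
            rsum += r * v
            csum += c * v
    if total == 0:
        return (0.0, 0.0)
    return (rsum / total, csum / total)
```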
Lin, Chih-long (林志隆). "Content-based Video Retrieval with Multi Features." Thesis, 2014. http://ndltd.ncl.edu.tw/handle/ed84t6.
National Taiwan University of Science and Technology
Department of Electrical Engineering
Academic year 102 (ROC calendar)
With the advance of multimedia codec technology and communications, multimedia communication has become one of the major information media, aided by the prevalence of the internet. Under these circumstances, image and video data over the Internet contribute to a sea of media, and how to search for user-desired media content within it becomes important. Content-Based Video Retrieval (CBVR) methods have been proposed to search for video clips of interest precisely and quickly. Among these studies, extracting image features for similarity measurement is widely adopted. However, adopting only one kind of feature to describe video content cannot provide satisfactory retrieval results; in general, more than one kind of image/video feature is extracted for efficient video retrieval. How to efficiently integrate different kinds of image/video features is critical and challenging for improving video retrieval performance. In this thesis, we propose integrating color, texture and SIFT-BoW (bag of words) image features to describe a video clip. These three features describe not only global image characteristics but also local regions. In our experiments, the color histogram difference is used to measure similarity for video scene cuts, and these scene cuts (video clips) are used as the basic media unit for description and retrieval. The average of the image features within one media unit is used as the representative feature for the video clip. To perform retrieval, the feature of a query image/video is extracted, and its similarity to the representative feature of each video clip in the database is calculated for similarity ranking. For comparison, video retrieval using only a single feature is implemented, and the method proposed by Y. Deng [10], which adopts more than one feature for video retrieval, is also carried out.
Experiments show that the proposed CBVR method outperforms the previous method by 38.7% in the PR rate. Performing CBVR with multiple features also improves PR performance compared with retrieval using a single feature.
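The colour-histogram-difference scene-cut step mentioned above can be sketched as follows; the bin count and threshold are illustrative choices, not the thesis's values:

```python
# Declare a scene cut wherever the normalised L1 distance between consecutive
# frame histograms exceeds a threshold.

def histogram(frame, bins=8, levels=256):
    h = [0] * bins
    for v in frame:
        h[v * bins // levels] += 1
    return h

def detect_cuts(frames, threshold=0.5):
    """frames: list of flat pixel-intensity lists of equal length.
    Returns indices i where a cut occurs between frame i-1 and frame i."""
    cuts = []
    prev = None
    for i, frame in enumerate(frames):
        h = histogram(frame)
        if prev is not None:
            # Normalised L1 distance lies in [0, 1].
            diff = sum(abs(a - b) for a, b in zip(prev, h)) / (2 * len(frame))
            if diff > threshold:
                cuts.append(i)
        prev = h
    return cuts
```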
Lai, Yuan-hao, and 賴沅壕. "Video Object Retrieval by Trajectory and Appearance." Thesis, 2013. http://ndltd.ncl.edu.tw/handle/26725122356557592907.
Full text
National Taiwan University of Science and Technology
Department of Information Management
101
The prevalence of video recording capability, whether on surveillance or mobile devices, has contributed to the popularity of video data. As a result, video management has become more important than before, and video retrieval in particular has been one of the main issues in this regard. Traditional video retrieval systems take text as the input and look for similar information in the title, annotation or embedded textual data of a video, in a way very similar to the keyword search adopted by a common search engine. However, the lack of visual information in such a search often makes the result inaccurate or even useless. For this reason, video retrieval systems whose inputs are images or videos have also been proposed; nevertheless, the associated ambiguity and complexity have made the implementation of such systems relatively difficult, and thus not as successful as desired. To address this, in this thesis we propose to retrieve a desired object through inputs specifying its trajectory and/or appearance; with the help of a 3D graphical user interface for more intuitive interaction, more satisfactory results can be achieved. We believe that such a framework could serve as a foundation for behavior analysis in many surveillance systems.
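This abstract does not spell out the trajectory-matching metric. Dynamic time warping (DTW) over 2D points is one common generic choice for comparing a sketched query trajectory against stored object trajectories; the sketch below is a hypothetical stand-in, not the thesis's actual method:

```python
import math

def dtw_distance(traj_a, traj_b):
    """Dynamic time warping distance between two trajectories,
    each given as a list of (x, y) points."""
    n, m = len(traj_a), len(traj_b)
    INF = float("inf")
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = math.dist(traj_a[i - 1], traj_b[j - 1])
            # extend the cheapest of the three admissible warping steps
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]

def rank_by_trajectory(query, database):
    """Return database indices sorted by DTW distance to the query trajectory."""
    return sorted(range(len(database)),
                  key=lambda k: dtw_distance(query, database[k]))
```

DTW tolerates speed variations between query and stored trajectories, which matters when the query is hand-drawn in a GUI.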
Chang, Hao-Wei, and 張皓崴. "A Study on Content-Based Video Retrieval." Thesis, 2002. http://ndltd.ncl.edu.tw/handle/01089217808521692359.
Full text
National Chiao Tung University
Department of Computer and Information Science
90
In this paper, a content-based video retrieval method that does not rely on key-frames is proposed. Unlike key-frame based approaches, the proposed method uses the whole information of a shot instead of selecting several key-frames to represent it. The method extracts features based on the concepts of primitives of color moments and dominant colors. To extract the primitives of color moments, each frame in a shot is first divided into several blocks. The color moments of all blocks are then extracted and clustered into several classes, and the mean moments of each class are taken as a primitive of the shot. To extract the dominant colors, all pixel colors in a shot are clustered into several classes, and the center of the colors in each class is treated as a dominant color. After extracting the feature vectors for each shot, we propose two measures to compute the similarity between two different shots, using primitives of color moments and dominant colors as features, respectively. Furthermore, since no single feature is suitable for all kinds of shots, a relevance feedback algorithm is also provided to automatically determine the best method according to the user's response.
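The primitive-extraction step described above, block-wise color moments clustered into a few classes whose means represent the shot, might be sketched as follows. The grid size, number of clusters, and use of plain k-means are illustrative assumptions:

```python
import numpy as np

def block_color_moments(frame, grid=4):
    """Split a frame into grid x grid blocks and compute per-block,
    per-channel mean and standard deviation (first two color moments)."""
    h, w, c = frame.shape
    bh, bw = h // grid, w // grid
    feats = []
    for by in range(grid):
        for bx in range(grid):
            block = frame[by*bh:(by+1)*bh, bx*bw:(bx+1)*bw].reshape(-1, c)
            feats.append(np.concatenate([block.mean(axis=0), block.std(axis=0)]))
    return np.array(feats)  # shape (grid*grid, 2*c)

def shot_primitives(frames, k=4, iters=10, seed=0):
    """Cluster all block moments of a shot with k-means; the cluster
    means serve as the shot's primitives."""
    X = np.vstack([block_color_moments(f) for f in frames])
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # assign each block-moment vector to its nearest center
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers
```

Shot similarity would then compare two shots' primitive sets, e.g. by matching each primitive to its nearest counterpart in the other shot.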
Lee, Gin Song, and 李金松. "Semantic Video Model for Content-based Retrieval." Thesis, 1998. http://ndltd.ncl.edu.tw/handle/76910006565384416570.
Full text
Liu, Chih-Chin, and 劉志俊. "Content-Based Video and Music Data Retrieval." Thesis, 1998. http://ndltd.ncl.edu.tw/handle/29623888052889787839.
Full text
National Tsing Hua University
Department of Computer Science
86
Abstract
In this dissertation, we discuss important issues in content-based video and music data retrieval. First, we describe the features used to model the content of image, video and music data. Based on the object model, we propose a multimedia framework and an object-level spatial/temporal model to represent the spatial/temporal relationships between media objects. Three new types of aggregation relationships, composed of the composition, temporal, and spatial relationships, are considered in the framework. To support content-based data retrieval, we propose a multimedia query language and two kinds of query interfaces for users to specify content-based queries. Second, since many content-based multimedia data retrieval problems can be transformed into the near-neighbor searching problem in a multidimensional feature space, an efficient near-neighbor searching algorithm is needed when developing a multimedia database system. We propose an approach to efficiently solve the near-neighbor searching problem. In this approach, an index is constructed along each dimension according to the values of the feature points of the multimedia objects. A user can pose a content-based query by specifying a multimedia query example and a similarity threshold. The specified query example is transformed into a query point in the multidimensional feature space. The possible result points in each dimension are then retrieved by searching for the value of the query point in the corresponding dimension. The sets of possible result points are merged one by one, removing the points which are not within the query radius. The result points and their distances from the query point form the answer to the query. Third, we propose a video query model based on the content of video and iconic indexing.
The notion of two-dimensional strings is extended to three-dimensional strings (3D-Strings) for representing the spatial and temporal relationships among the symbols in both a video and a video query. The problem of video query processing is then transformed into a problem of three-dimensional pattern matching. To efficiently match the 3D-Strings, a data structure called the 3D-List and its related algorithms are proposed. In this approach, the symbols of a video in the video database are retrieved from the video index and organized as a 3D-List according to the 3D-String of the video query. The related algorithms are then applied on the 3D-List to determine whether this video is an answer to the video query. Fourth, we propose an approach for content-based music data retrieval. In this approach, thematic feature strings, such as melody strings, rhythm strings, and chord strings, are extracted from the original music objects and treated as meta data to represent their contents. The problem of content-based music data retrieval is then transformed into the string matching problem. A new approximate string matching algorithm is proposed for content-based music data retrieval.
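The transformation of music retrieval into approximate string matching can be illustrated with a plain Levenshtein filter over melody strings. This is a generic stand-in for illustration only, not the new approximate matching algorithm the dissertation proposes:

```python
def edit_distance(a, b):
    """Classic Levenshtein distance between two feature strings,
    computed with a single rolling row of the DP table."""
    n, m = len(a), len(b)
    D = list(range(m + 1))
    for i in range(1, n + 1):
        prev, D[0] = D[0], i
        for j in range(1, m + 1):
            cur = D[j]
            D[j] = min(D[j] + 1,                        # deletion
                       D[j - 1] + 1,                    # insertion
                       prev + (a[i - 1] != b[j - 1]))   # substitution
            prev = cur
    return D[m]

def retrieve_music(query_string, melody_db, max_errors=2):
    """Ids of melody strings within max_errors edits of the query,
    e.g. strings over an up/down/repeat contour alphabet {U, D, R}."""
    return [i for i, s in enumerate(melody_db)
            if edit_distance(query_string, s) <= max_errors]
```

Allowing a small edit budget tolerates the note insertions, omissions, and errors typical of hummed or remembered queries.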
Yu, Shang-Li, and 于尚立. "Motion-based Video Retrieval by Trajectory Matching." Thesis, 2003. http://ndltd.ncl.edu.tw/handle/22209157147694698298.
Full text
Yuan Ze University
Department of Electrical Engineering
91
This thesis proposes a motion-based video retrieval system that retrieves desired videos from a video database through trajectory matching. First, in order to extract the trajectories of moving objects, a camera compensation method is proposed for estimating possible camera motions from time-varying backgrounds. The thesis uses an affine transform to model all possible global camera motions. Then, through feature extraction, feature matching, and a voting technique, camera motions can be accurately estimated from pairs of image frames, and the trajectories of moving objects can be found by image differencing and trajectory tracking. Once the trajectories of moving objects have been extracted, a set of control points is sampled and recorded for each video subsequence for further indexing. Before retrieval, each set of control points is first transformed into a Bezier curve. Similarity comparisons between different video contents can then be performed by comparing sample points extracted along the Bezier curves. Based on the Bezier representation, a novel indexing framework is proposed to retrieve desired video sequences from video databases regardless of the scaling and translation differences among the sequences. In addition, the proposed system handles partial matching of curves well; even when an incomplete query trajectory is given, all desired video sequences can be accurately retrieved and returned to users. A great variety of experiments were conducted to verify the efficiency, effectiveness, and robustness of the proposed system, and the experimental results demonstrate the superiority of the proposed method.
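The Bezier-based matching idea, sampling points along curves built from control points and comparing them in a way that tolerates scaling and translation, can be sketched as follows. The centroid/scale normalization scheme and the sample count are illustrative assumptions, not the thesis's exact indexing framework:

```python
import math

def bezier_point(ctrl, t):
    """De Casteljau evaluation of the Bezier curve defined by control points."""
    pts = list(ctrl)
    while len(pts) > 1:
        pts = [((1 - t) * x0 + t * x1, (1 - t) * y0 + t * y1)
               for (x0, y0), (x1, y1) in zip(pts, pts[1:])]
    return pts[0]

def sample_bezier(ctrl, n=32):
    """n points sampled uniformly in parameter t along the curve."""
    return [bezier_point(ctrl, i / (n - 1)) for i in range(n)]

def normalize(points):
    """Remove translation (centroid) and scale (max centroid distance)
    so that matching tolerates both."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    scale = max(math.dist((x, y), (cx, cy)) for x, y in points) or 1.0
    return [((x - cx) / scale, (y - cy) / scale) for x, y in points]

def trajectory_distance(ctrl_a, ctrl_b, n=32):
    """Mean distance between normalized samples of two Bezier trajectories."""
    pa = normalize(sample_bezier(ctrl_a, n))
    pb = normalize(sample_bezier(ctrl_b, n))
    return sum(math.dist(a, b) for a, b in zip(pa, pb)) / n
```

Because a Bezier curve is an affine combination of its control points, scaling and translating a trajectory scales and translates the sampled curve identically, so the normalized samples, and hence the distance, are unchanged.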