Journal articles on the topic 'Video annotation'

Consult the top 50 journal articles for your research on the topic 'Video annotation.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse journal articles from a wide variety of disciplines and organise your bibliography correctly.

1

Chen, Jia, Cui-xia Ma, Hong-an Wang, Hai-yan Yang, and Dong-xing Teng. "Sketch Based Video Annotation and Organization System in Distributed Teaching Environment." International Journal of Distributed Systems and Technologies 1, no. 4 (October 2010): 27–41. http://dx.doi.org/10.4018/jdst.2010100103.

Abstract:
As the use of instructional video is becoming a key component of e-learning, there is an increasing need for a distributed system which supports collaborative video annotation and organization. In this paper, the authors construct a distributed environment on top of NaradaBrokering to support collaborative operations on video material when users are located in different places. The concept of video annotation is enriched, making it a powerful medium for improving the organization and viewing of instructional video. With panorama-based and interpolation-based methods, all related users can annotate or organize videos simultaneously. With these annotations, a video organization structure is then built by linking them with other video clips or annotations. Finally, an informal user study was conducted; the results show that this system improves the efficiency of video organization and viewing and enhances users' participation in the design process with a good user experience.
2

Groh, Florian, Dominik Schörkhuber, and Margrit Gelautz. "A tool for semi-automatic ground truth annotation of traffic videos." Electronic Imaging 2020, no. 16 (January 26, 2020): 200–1. http://dx.doi.org/10.2352/issn.2470-1173.2020.16.avm-150.

Abstract:
We have developed a semi-automatic annotation tool – “CVL Annotator” – for bounding box ground truth generation in videos. Our research is particularly motivated by the need for reference annotations of challenging nighttime traffic scenes with highly dynamic lighting conditions due to reflections, headlights and halos from oncoming traffic. Our tool incorporates a suite of different state-of-the-art tracking algorithms in order to minimize the amount of human input necessary to generate high-quality ground truth data. We focus our user interface on the premise of minimizing user interaction and visualizing all information relevant to the user at a glance. We perform a preliminary user study to measure the amount of time and clicks necessary to produce ground truth annotations of video traffic scenes and evaluate the accuracy of the final annotation results.
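To make the tracker-assisted workflow concrete, here is a minimal sketch of the general idea rather than the CVL Annotator itself: an annotator draws one box at a keyframe, an off-the-shelf tracker propagates it forward, and only frames where tracking drifts need manual correction. It assumes an OpenCV build that exposes cv2.TrackerCSRT_create (e.g. opencv-contrib-python); the file name and initial box are placeholders.

```python
# Sketch: propagate a manually drawn bounding box with an off-the-shelf tracker.
import cv2

video_path = "night_traffic.mp4"       # hypothetical input video
initial_box = (412, 230, 64, 48)       # x, y, w, h drawn by the annotator

cap = cv2.VideoCapture(video_path)
ok, frame = cap.read()                 # frame containing the manual annotation
tracker = cv2.TrackerCSRT_create()
tracker.init(frame, initial_box)

annotations = [initial_box]            # per-frame bounding-box proposals
while True:
    ok, frame = cap.read()
    if not ok:
        break
    success, box = tracker.update(frame)
    # Frames where tracking fails would be flagged for manual correction.
    annotations.append(tuple(int(v) for v in box) if success else None)
cap.release()
```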
3

Rich, Peter J., and Michael Hannafin. "Video Annotation Tools." Journal of Teacher Education 60, no. 1 (November 26, 2008): 52–67. http://dx.doi.org/10.1177/0022487108328486.

4

Von Wachter, Jana-Kristin, and Doris Lewalter. "Video Annotation as a Supporting Tool for Video-based Learning in Teacher Training – A Systematic Literature Review." International Journal of Higher Education 12, no. 2 (February 27, 2023): 1. http://dx.doi.org/10.5430/ijhe.v12n2p1.

Abstract:
Digital video annotation tools, which allow users to add synchronized comments to video content, have gained significant attention in teacher education in recent years. However, there is no overview of the research on the use of annotations, their implementation in teacher training, and their effect on the development of professional competencies when video annotations are used as a supporting tool for video-based learning. To fill this gap, this paper reports the results of a systematic literature review carried out to determine 1) how video annotations were implemented in studies in educational settings, 2) which professional competencies these studies sought to develop further with the aid of video annotations, and 3) which learning outcomes were reported in the selected studies. A total of 18 eligible studies, published between 2014 and 2022, were identified via database search and cross-referencing. A qualitative content analysis of these studies showed that video annotations were generally used to perform one or more of three functions, these being feedback, communication, and documentation, while they also enabled deeper content knowledge of teaching, reflective skills, and professional vision, and facilitated social integration and recognition. The convincing evidence of the positive effect of video annotation as a supporting tool in video-based teacher training proves it to be a powerful means of supporting the development of professional vision and other teaching skills. The findings also point towards further research on the use of video annotation tools in educational settings.
5

Balamurugan, R. "Semi-Automatic Context-Aware Video Annotation for Searching Educational Video Resources." Indian Journal of Applied Research 3, no. 6 (October 1, 2011): 108–10. http://dx.doi.org/10.15373/2249555x/june2013/35.

6

Sánchez-Carballido, Sergio, Orti Senderos, Marcos Nieto, and Oihana Otaegui. "Semi-Automatic Cloud-Native Video Annotation for Autonomous Driving." Applied Sciences 10, no. 12 (June 23, 2020): 4301. http://dx.doi.org/10.3390/app10124301.

Abstract:
An innovative solution named Annotation as a Service (AaaS) has been specifically designed to integrate heterogeneous video annotation workflows into containers and take advantage of a cloud-native, highly scalable and reliable design based on Kubernetes workloads. Using the AaaS as a foundation, the execution of automatic video annotation workflows is addressed in the broader context of a semi-automatic video annotation business logic for ground truth generation for Autonomous Driving (AD) and Advanced Driver Assistance Systems (ADAS). The paper presents design decisions, innovative developments, and tests conducted to provide scalability to this cloud-native ecosystem for semi-automatic annotation. The solution has proven to be efficient and resilient at AD/ADAS scale, specifically in an experiment with 25 TB of input data to annotate, 4000 concurrent annotation jobs, and 32 worker nodes forming a high-performance computing cluster with a total of 512 cores and 2048 GB of RAM. Automatic pre-annotations with the proposed strategy reduce the time of human participation in the annotation by up to 80%, and by 60% on average.
7

Garcia, Manuel B., and Ahmed Mohamed Fahmy Yousef. "Cognitive and affective effects of teachers’ annotations and talking heads on asynchronous video lectures in a web development course." Research and Practice in Technology Enhanced Learning 18 (December 5, 2022): 020. http://dx.doi.org/10.58459/rptel.2023.18020.

Abstract:
When it comes to asynchronous online learning, the literature recommends multimedia content like videos of lectures and demonstrations. However, the lack of emotional connection and the absence of teacher support in these video materials can be detrimental to student success. We proposed incorporating talking heads and annotations to alleviate these weaknesses. In this study, we investigated the cognitive and affective effects of integrating these solutions in asynchronous video lectures. Guided by the theoretical lens of the Cognitive Theory of Multimedia Learning and the Cognitive-Affective Theory of Learning with Media, we produced a total of 72 videos (average = four videos per subtopic) with a mean duration of 258 seconds (range = 193 to 318 seconds). To comparatively assess our video treatments (i.e., regular videos, videos with face, videos with annotation, or videos with face and annotation), we conducted an education-based cluster randomized controlled trial within a 14-week academic period with four cohorts of students enrolled in an introductory web design and development course. We recorded a total of 42,425 page views (212.13 page views per student) for all web browsing activities within the online learning platform. Moreover, 39.92% (16,935 views) of these page views were attributed to the video pages, accumulating a total of 47,665 minutes of watch time. Our findings suggest that combining talking heads and annotations in asynchronous video lectures yielded the highest learning performance, longest watch time, and highest satisfaction, engagement, and attitude scores. These discoveries have significant implications for designing video lectures for online education to support students' activities and engagement. Therefore, we concluded that academic institutions, curriculum developers, instructional designers, and educators should consider these findings before relocating face-to-face courses to online learning systems to maximize the benefits of video-based learning.
8

Islam, Md Anwarul, Md Azher Uddin, and Young-Koo Lee. "A Distributed Automatic Video Annotation Platform." Applied Sciences 10, no. 15 (July 31, 2020): 5319. http://dx.doi.org/10.3390/app10155319.

Abstract:
In the era of digital devices and the Internet, thousands of videos are taken and shared through the Internet. Similarly, CCTV cameras in the digital city produce a large amount of video data that carry essential information. To handle this growing volume of video data and generate knowledge from it, there is an increasing demand for distributed video annotation. In this paper, we therefore propose a novel distributed video annotation platform that exploits both spatial and temporal information and provides higher-level semantic information. The proposed framework is divided into two parts: spatial annotation and spatiotemporal annotation. To this end, we propose a spatiotemporal descriptor, namely volume local directional ternary pattern-three orthogonal planes (VLDTP–TOP), implemented in a distributed manner using Spark. Moreover, we implemented several state-of-the-art appearance-based and spatiotemporal feature descriptors on top of Spark. We also provide distributed video annotation services and APIs so that end-users can easily annotate videos and develop new video annotation algorithms. Due to the lack of a spatiotemporal video annotation dataset that provides ground truth for both spatial and temporal information, we introduce such a dataset, namely STAD. An extensive experimental analysis was performed to validate the performance and scalability of the proposed feature descriptors, demonstrating the effectiveness of our approach.
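As a rough illustration of the distributed-processing pattern such a platform builds on (not the authors' VLDTP–TOP code), the following PySpark sketch fans per-clip descriptor extraction out across a cluster; extract_descriptor and the HDFS paths are hypothetical placeholders.

```python
# Sketch: distribute per-clip descriptor extraction with Spark.
from pyspark.sql import SparkSession

def extract_descriptor(clip_path: str) -> tuple:
    # Placeholder: a real implementation would decode the clip and compute a
    # spatiotemporal descriptor (e.g. on three orthogonal planes).
    return clip_path, [0.0] * 128

spark = SparkSession.builder.appName("video-annotation").getOrCreate()
clip_paths = ["hdfs:///videos/clip_%04d.mp4" % i for i in range(1000)]  # assumed layout

descriptors = (
    spark.sparkContext.parallelize(clip_paths, numSlices=64)
    .map(extract_descriptor)
    .collect()
)
spark.stop()
```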
9

Sigurdsson, Gunnar, Olga Russakovsky, Ali Farhadi, Ivan Laptev, and Abhinav Gupta. "Much Ado About Time: Exhaustive Annotation of Temporal Data." Proceedings of the AAAI Conference on Human Computation and Crowdsourcing 4 (September 21, 2016): 219–28. http://dx.doi.org/10.1609/hcomp.v4i1.13290.

Abstract:
Large-scale annotated datasets allow AI systems to learn from and build upon the knowledge of the crowd. Many crowdsourcing techniques have been developed for collecting image annotations. These techniques often implicitly rely on the fact that a new input image takes a negligible amount of time to perceive. In contrast, we investigate and determine the most cost-effective way of obtaining high-quality multi-label annotations for temporal data such as videos. Watching even a short 30-second video clip requires a significant time investment from a crowd worker; thus, requesting multiple annotations following a single viewing is an important cost-saving strategy. But how many questions should we ask per video? We conclude that the optimal strategy is to ask as many questions as possible in a HIT (up to 52 binary questions after watching a 30-second video clip in our experiments). We demonstrate that while workers may not correctly answer all questions, the cost-benefit analysis nevertheless favors consensus from multiple such cheap-yet-imperfect iterations over more complex alternatives. When compared with a one-question-per-video baseline, our method is able to achieve a 10% improvement in recall (76.7% ours versus 66.7% baseline) at comparable precision (83.8% ours versus 83.0% baseline) in about half the annotation time (3.8 minutes ours compared to 7.1 minutes baseline). We demonstrate the effectiveness of our method by collecting multi-label annotations of 157 human activities on 1,815 videos.
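The cost argument can be made concrete with a back-of-the-envelope calculation: watching the clip is a fixed cost per viewing, so asking more questions per viewing amortizes it. The timing figures below are illustrative assumptions, not the paper's measured parameters.

```python
# Sketch: cost per label as a function of questions asked per viewing.
watch_seconds = 30.0        # fixed cost of watching the clip (assumed)
answer_seconds = 2.0        # assumed time per binary question

def seconds_per_label(questions_per_viewing: int) -> float:
    total = watch_seconds + answer_seconds * questions_per_viewing
    return total / questions_per_viewing

for q in (1, 10, 52):
    print(f"{q:>2} questions per viewing -> {seconds_per_label(q):.1f} s per label")
```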
10

Gligorov, Riste, Michiel Hildebrand, Jacco Van Ossenbruggen, Lora Aroyo, and Guus Schreiber. "Topical Video Search: Analysing Video Concept Annotation through Crowdsourcing Games." Human Computation 4, no. 1 (April 26, 2017): 47–70. http://dx.doi.org/10.15346/hc.v4i1.77.

Abstract:
Games with a purpose (GWAPs) are increasingly used in audio-visual collections as a mechanism for annotating videos through tagging. One such GWAP is Waisda?, a video labeling game where players tag streaming video and win points by reaching consensus on tags with other players. The open-ended and unconstrained manner of tagging in the fast-paced setting of the game has a fundamental impact on the resulting tags. We find that Waisda? tags predominantly describe visual objects and rarely refer to the topics of the videos. In this study we evaluate to what extent the tags entered by players can be regarded as topical descriptors of the video material. Moreover, we characterize the quality of the user tags as topical descriptors with the aim of detecting and filtering out the bad ones. Our results show that after filtering, game tags perform equally well compared to the manually crafted metadata when it comes to accessing the videos based on topic. An important consequence of this finding is that tagging games can provide a cost-effective alternative in situations where manual annotation by professionals is too costly.
11

Ramadhan, Faisal, and Aina Musdholifah. "Online Learning Video Recommendation System Based on Course and Sylabus Using Content-Based Filtering." IJCCS (Indonesian Journal of Computing and Cybernetics Systems) 15, no. 3 (July 31, 2021): 265. http://dx.doi.org/10.22146/ijccs.65623.

Abstract:
Learning using video media such as watching videos on YouTube is an alternative method of learning that is often used. However, there are so many learning videos available that finding videos with the right content is difficult and time-consuming. Therefore, this study builds a recommendation system that can recommend videos based on courses and syllabus. The recommendation system works by looking for similarity between courses and syllabus with video annotations using the cosine similarity method. The video annotation is the title and description of the video captured in real-time from YouTube using the YouTube API. This recommendation system will produce recommendations in the form of five videos based on the selected courses and syllabus. The test results show that the average performance percentage is 81.13% in achieving the recommendation system goals, namely relevance, novelty, serendipity and increasing recommendation diversity.
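The matching step described above can be sketched with a standard TF-IDF plus cosine-similarity pipeline. This is an illustration rather than the authors' implementation: the texts are invented, and the real system retrieves video titles and descriptions through the YouTube API.

```python
# Sketch: rank video annotations (title + description) against a syllabus entry.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

syllabus = "dynamic programming: knapsack, longest common subsequence"
video_annotations = [
    "Dynamic Programming explained - knapsack problem walkthrough",
    "Cooking pasta in 10 minutes",
    "Longest common subsequence tutorial with examples",
]

vectorizer = TfidfVectorizer(stop_words="english")
matrix = vectorizer.fit_transform([syllabus] + video_annotations)
scores = cosine_similarity(matrix[0:1], matrix[1:]).ravel()

# The system would return the top five videos; here we print the ranking.
for score, title in sorted(zip(scores, video_annotations), reverse=True)[:5]:
    print(f"{score:.3f}  {title}")
```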
12

Liang, Chao, Changsheng Xu, and Hanqing Lu. "Personalized Sports Video Customization Using Content and Context Analysis." International Journal of Digital Multimedia Broadcasting 2010 (2010): 1–20. http://dx.doi.org/10.1155/2010/836357.

Abstract:
We present an integrated framework on personalized sports video customization, which addresses three research issues: semantic video annotation, personalized video retrieval and summarization, and system adaptation. Sports video annotation serves as the foundation of the video customization system. To acquire detailed description of video content, external web text is adopted to align with the related sports video according to their semantic correspondence. Based on the derived semantic annotation, a user-participant multiconstraint 0/1 Knapsack model is designed to model the personalized video customization, which can unify both video retrieval and summarization with different fusion parameters. As a measure to make the system adaptive to the particular user, a social network based system adaptation algorithm is proposed to learn latent user preference implicitly. Both quantitative and qualitative experiments conducted on twelve broadcast basketball and football videos validate the effectiveness of the proposed method.
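The 0/1 Knapsack formulation at the core of the summarization step can be illustrated with a plain dynamic-programming sketch that selects video segments maximizing total relevance under a duration budget. The segment scores and durations below are invented, and the paper's model adds user-dependent constraints and fusion parameters on top of this basic form.

```python
# Sketch: choose segments maximizing relevance subject to a duration budget.
def knapsack_summary(durations, scores, budget):
    n = len(durations)
    best = [[0.0] * (budget + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for b in range(budget + 1):
            best[i][b] = best[i - 1][b]
            if durations[i - 1] <= b:
                candidate = best[i - 1][b - durations[i - 1]] + scores[i - 1]
                best[i][b] = max(best[i][b], candidate)
    chosen, b = [], budget          # backtrack to recover the selected segments
    for i in range(n, 0, -1):
        if best[i][b] != best[i - 1][b]:
            chosen.append(i - 1)
            b -= durations[i - 1]
    return sorted(chosen)

print(knapsack_summary(durations=[30, 45, 20, 60], scores=[3.0, 5.5, 2.0, 4.0], budget=90))
```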
13

Cebrián-de-la-Serna, Manuel, María Jesús Gallego-Arrufat, and Violeta Cebrián-Robles. "Multimedia Annotations for Practical Collaborative Reasoning." Journal of New Approaches in Educational Research 10, no. 2 (July 15, 2021): 264. http://dx.doi.org/10.7821/naer.2021.7.664.

Abstract:
University education requires students to be trained both at university and at external internship centres. Because of Covid-19, the availability of multimedia resources and examples of practical contexts has become vital. Multimedia annotation can help students reflect on the professional world, collaborating and interacting with colleagues online. This study aims to encourage collaborative practical thinking by using new video annotation technologies. A total of 274 students participated in a task-design experiment focusing on the analysis of a technology-based, award-winning educational innovation project. Using a mixed research design, qualitative and quantitative data exported from the video annotation platform were collected and analysed. The results show differences in the quality and quantity of the answers: in the tasks with broad Folksonomy they are more numerous but more dispersed in their analysis, and vice versa. The quality of the answers given with narrow Folksonomy is also higher in both text and video modes. Producing multimedia annotations is a practical way to encourage students to practise reflective reasoning about the professional reality.
14

Farooq, Muhammad, Abul Doulah, Jason Parton, Megan McCrory, Janine Higgins, and Edward Sazonov. "Validation of Sensor-Based Food Intake Detection by Multicamera Video Observation in an Unconstrained Environment." Nutrients 11, no. 3 (March 13, 2019): 609. http://dx.doi.org/10.3390/nu11030609.

Abstract:
Video observations have been widely used for providing ground truth for wearable systems for monitoring food intake in controlled laboratory conditions; however, video observation requires participants be confined to a defined space. The purpose of this analysis was to test an alternative approach for establishing activity types and food intake bouts in a relatively unconstrained environment. The accuracy of a wearable system for assessing food intake was compared with that from video observation, and inter-rater reliability of annotation was also evaluated. Forty participants were enrolled. Multiple participants were simultaneously monitored in a 4-bedroom apartment using six cameras for three days each. Participants could leave the apartment overnight and for short periods of time during the day, during which time monitoring did not take place. A wearable system (Automatic Ingestion Monitor, AIM) was used to detect and monitor participants’ food intake at a resolution of 30 s using a neural network classifier. Two different food intake detection models were tested, one trained on the data from an earlier study and the other on current study data using leave-one-out cross validation. Three trained human raters annotated the videos for major activities of daily living including eating, drinking, resting, walking, and talking. They further annotated individual bites and chewing bouts for each food intake bout. Results for inter-rater reliability showed that, for activity annotation, the raters achieved an average (±standard deviation (STD)) kappa value of 0.74 (±0.02) and for food intake annotation the average kappa (Light’s kappa) of 0.82 (±0.04). Validity results showed that AIM food intake detection matched human video-annotated food intake with a kappa of 0.77 (±0.10) and 0.78 (±0.12) for activity annotation and for food intake bout annotation, respectively. Results of one-way ANOVA suggest that there are no statistically significant differences among the average eating duration estimated from raters’ annotations and AIM predictions (p-value = 0.19). These results suggest that the AIM provides accuracy comparable to video observation and may be used to reliably detect food intake in multi-day observational studies.
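For readers unfamiliar with the agreement statistic used throughout this study, the sketch below computes Cohen's kappa between two raters' per-epoch activity labels with scikit-learn; the labels are toy data, not the study's annotations.

```python
# Sketch: inter-rater agreement on activity labels via Cohen's kappa.
from sklearn.metrics import cohen_kappa_score

rater_a = ["eating", "eating", "resting", "walking", "talking", "eating"]
rater_b = ["eating", "drinking", "resting", "walking", "talking", "eating"]

print(f"Cohen's kappa: {cohen_kappa_score(rater_a, rater_b):.2f}")
```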
15

Jurgens, David, and Roberto Navigli. "It’s All Fun and Games until Someone Annotates: Video Games with a Purpose for Linguistic Annotation." Transactions of the Association for Computational Linguistics 2 (December 2014): 449–64. http://dx.doi.org/10.1162/tacl_a_00195.

Abstract:
Annotated data is a prerequisite for many NLP applications. Acquiring large-scale annotated corpora is a major bottleneck, requiring significant time and resources. Recent work has proposed turning annotation into a game to increase its appeal and lower its cost; however, current games are largely text-based and closely resemble traditional annotation tasks. We propose a new linguistic annotation paradigm that produces annotations from playing graphical video games. The effectiveness of this design is demonstrated using two video games: one to create a mapping from WordNet senses to images, and a second game that performs Word Sense Disambiguation. Both games produce accurate results. The first game yields annotation quality equal to that of experts and a cost reduction of 73% over equivalent crowdsourcing; the second game provides a 16.3% improvement in accuracy over current state-of-the-art sense disambiguation games with WordNet.
16

Hayat, Hassan, Carles Ventura, and Agata Lapedriza. "Modeling Subjective Affect Annotations with Multi-Task Learning." Sensors 22, no. 14 (July 13, 2022): 5245. http://dx.doi.org/10.3390/s22145245.

Abstract:
In supervised learning, the generalization capabilities of trained models are based on the available annotations. Usually, multiple annotators are asked to annotate the dataset samples and, then, the common practice is to aggregate the different annotations by computing average scores or majority voting, and train and test models on these aggregated annotations. However, this practice is not suitable for all types of problems, especially when the subjective information of each annotator matters for the task modeling. For example, emotions experienced while watching a video or evoked by other sources of content, such as news headlines, are subjective: different individuals might perceive or experience different emotions. The aggregated annotations in emotion modeling may lose the subjective information and actually represent an annotation bias. In this paper, we highlight the weaknesses of models that are trained on aggregated annotations for modeling tasks related to affect. More concretely, we compare two generic Deep Learning architectures: a Single-Task (ST) architecture and a Multi-Task (MT) architecture. While the ST architecture models single emotional perception each time, the MT architecture jointly models every single annotation and the aggregated annotations at once. Our results show that the MT approach can more accurately model every single annotation and the aggregated annotations when compared to methods that are directly trained on the aggregated annotations. Furthermore, the MT approach achieves state-of-the-art results on the COGNIMUSE, IEMOCAP, and SemEval_2007 benchmarks.
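A minimal sketch of the multi-task idea, under assumptions rather than the paper's exact architecture: a shared encoder feeds one regression head per annotator plus one head for the aggregated label, so individual and aggregated annotations are modeled jointly.

```python
# Sketch: shared encoder with per-annotator heads plus an aggregate head.
import torch
import torch.nn as nn

class MultiAnnotatorRegressor(nn.Module):
    def __init__(self, in_dim: int, n_annotators: int, hidden: int = 64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        # One head per individual annotator, plus one for the aggregated label.
        self.heads = nn.ModuleList([nn.Linear(hidden, 1) for _ in range(n_annotators + 1)])

    def forward(self, x):
        z = self.encoder(x)
        return torch.cat([head(z) for head in self.heads], dim=1)

model = MultiAnnotatorRegressor(in_dim=128, n_annotators=3)
features = torch.randn(8, 128)      # toy batch of content features
targets = torch.randn(8, 4)         # 3 individual annotations + 1 aggregate
loss = nn.functional.mse_loss(model(features), targets)
loss.backward()
```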
17

Le, Trung-Nghia, Tam V. Nguyen, Quoc-Cuong Tran, Lam Nguyen, Trung-Hieu Hoang, Minh-Quan Le, and Minh-Triet Tran. "Interactive Video Object Mask Annotation." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 18 (May 18, 2021): 16067–70. http://dx.doi.org/10.1609/aaai.v35i18.18014.

Abstract:
In this paper, we introduce a practical system for interactive video object mask annotation, which can support multiple back-end methods. To demonstrate the generalization of our system, we introduce a novel approach for video object annotation. Our proposed system takes scribbles at a chosen key-frame from the end-users via a user-friendly interface and produces masks of corresponding objects at the key-frame via the Control-Point-based Scribbles-to-Mask (CPSM) module. The object masks at the key-frame are then propagated to other frames and refined through the Multi-Referenced Guided Segmentation (MRGS) module. Last but not least, the user can correct wrong segmentation at some frames, and the corrected mask is continuously propagated to other frames in the video via the MRGS to produce the object masks at all video frames.
18

Ward, Thomas M., Danyal M. Fer, Yutong Ban, Guy Rosman, Ozanan R. Meireles, and Daniel A. Hashimoto. "Challenges in surgical video annotation." Computer Assisted Surgery 26, no. 1 (January 1, 2021): 58–68. http://dx.doi.org/10.1080/24699322.2021.1937320.

19

De Clercq, Armand, Ann Buysse, Herbert Roeyers, William Ickes, Koen Ponnet, and Lesley Verhofstadt. "VIDANN: A video annotation system." Behavior Research Methods, Instruments, & Computers 33, no. 2 (May 2001): 159–66. http://dx.doi.org/10.3758/bf03195361.

20

El‐Khoury, Vanessa, Martin Jergler, Getnet Abebe Bayou, David Coquil, and Harald Kosch. "Fine‐granularity semantic video annotation." International Journal of Pervasive Computing and Communications 9, no. 3 (August 30, 2013): 243–69. http://dx.doi.org/10.1108/ijpcc-07-2013-0019.

21

Martin, Matthew, James Charlton, and Andrew Miles Connor. "Mainstreaming Video Annotation Software for Critical Video Analysis." Journal of Technologies and Human Usability 11, no. 3 (2015): 1–13. http://dx.doi.org/10.18848/2381-9227/cgp/v11i03/56430.

22

Borth, Damian, Adrian Ulges, Christian Schulze, and Thomas M. Breuel. "Keyframe Extraktion für Video-Annotation und Video-Zusammenfassung." Informatik-Spektrum 32, no. 1 (October 15, 2008): 50–53. http://dx.doi.org/10.1007/s00287-008-0264-y.

23

Zimšek, Danilo, Luka Banfi, and Mirjam Sepesy Maučec. "Video annotation with metadata: A case study of soccer game video annotation with players." Anali PAZU 9, no. 1-2 (June 9, 2022): 30–37. http://dx.doi.org/10.18690/analipazu.9.1-2.30-37.2019.

Abstract:
In recent years, the use of convolutional neural networks for video processing has become very attractive. The reason lies in the computational power for data processing which is available today. There are many well-defined research areas where neural networks have brought higher reliability than other conventional approaches; for example, traffic sign recognition and isolated number recognition. In this paper, we will describe the architecture and the implementation of the process of soccer game annotation. The game is annotated with data about players. The technology of convolutional neural networks is used for number recognition. The process runs in real-time on a streaming video. Content enriched with metadata is given to the user in parallel with the real-time video. In the paper, we will describe in some detail the following modules: image binarization, shot localization, and the selection and recognition of numbers on players' jerseys.
24

Man, Guangyi, and Xiaoyan Sun. "Interested Keyframe Extraction of Commodity Video Based on Adaptive Clustering Annotation." Applied Sciences 12, no. 3 (January 30, 2022): 1502. http://dx.doi.org/10.3390/app12031502.

Abstract:
Keyframe recognition in video is very important for extracting pivotal information from videos. Numerous studies have been successfully carried out on identifying frames with motion objectives as keyframes. The definition of “keyframe” can be quite different for different requirements. In the field of E-commerce, the keyframes of product videos should be those that interest a customer and help the customer make correct and quick decisions, which is greatly different from the existing studies. Accordingly, we first define the key interested frame of commodity video from the viewpoint of user demand. As there are no annotations on the interested frames, we develop a fast and adaptive clustering strategy to cluster the preprocessed videos into several clusters according to this definition and annotate them. These annotated samples are utilized to train a deep neural network to obtain the features of key interested frames and achieve the goal of recognition. The performance of the proposed algorithm in effectively recognizing the key interested frames is demonstrated by applying it to commodity videos fetched from an E-commerce platform.
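The cluster-then-annotate strategy can be sketched generically as follows, with k-means standing in for the paper's fast adaptive clustering; the frame features are random placeholders, and in practice one labeled representative per cluster would seed the training set for the deep network.

```python
# Sketch: cluster frame features so each cluster needs only one annotation.
import numpy as np
from sklearn.cluster import KMeans

frame_features = np.random.rand(500, 64)     # e.g. embeddings of preprocessed frames

kmeans = KMeans(n_clusters=8, n_init=10, random_state=0).fit(frame_features)
cluster_of_frame = kmeans.labels_

# Pick the frame closest to each centroid as the cluster representative;
# its annotation is propagated to every frame in the cluster.
representatives = [int(np.argmin(np.linalg.norm(frame_features - c, axis=1)))
                   for c in kmeans.cluster_centers_]
print(representatives)
```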
25

WANG, MENG, XIAN-SHENG HUA, TAO MEI, JINHUI TANG, GUO-JUN QI, YAN SONG, and LI-RONG DAI. "INTERACTIVE VIDEO ANNOTATION BY MULTI-CONCEPT MULTI-MODALITY ACTIVE LEARNING." International Journal of Semantic Computing 01, no. 04 (December 2007): 459–77. http://dx.doi.org/10.1142/s1793351x0700024x.

Abstract:
Active learning has been demonstrated to be an effective approach to reducing human labeling effort in multimedia annotation tasks. However, most of the existing active learning methods for video annotation are studied in a relatively simple context where concepts are sequentially annotated with fixed effort and only a single modality is applied. In practice, we usually have to deal with multiple modalities, and sequentially annotating concepts without preference cannot suitably assign annotation effort. To address these two issues, in this paper we propose a multi-concept multi-modality active learning method for video annotation in which multiple concepts and multiple modalities can be simultaneously taken into consideration. In each round of active learning, this method selects the concept that is expected to get the highest performance gain and a batch of suitable samples to be annotated for this concept. Then, graph-based semi-supervised learning is conducted on each modality for the selected concept. The proposed method is able to make full use of human effort by considering both the learnabilities of different concepts and the potentials of different modalities. Experimental results on the TRECVID 2005 benchmark have demonstrated its effectiveness and efficiency.
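As background for the selection loop described above, here is a generic uncertainty-sampling sketch on synthetic data; the paper's method additionally chooses which concept and which modality to spend effort on, and uses graph-based semi-supervised learning rather than the plain classifier used here.

```python
# Sketch: active learning by repeatedly labeling the most uncertain samples.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_pool = rng.normal(size=(1000, 32))       # unlabeled video-shot features (synthetic)
y_pool = (X_pool[:, 0] > 0).astype(int)    # stand-in oracle labels

labeled = list(range(20))                  # small seed set
for _ in range(5):                         # five annotation rounds
    clf = LogisticRegression(max_iter=1000).fit(X_pool[labeled], y_pool[labeled])
    probs = clf.predict_proba(X_pool)[:, 1]
    uncertainty = -np.abs(probs - 0.5)     # closest to 0.5 = most uncertain
    labeled_set = set(labeled)
    batch = [int(i) for i in np.argsort(uncertainty)[::-1] if i not in labeled_set][:10]
    labeled.extend(batch)                  # "annotate" the selected batch
print(f"labeled {len(labeled)} samples after 5 rounds")
```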
26

Xie, Feng, and Zheng Xu. "Semantic Based Annotation for Surveillance Big Data Using Domain Knowledge." International Journal of Cognitive Informatics and Natural Intelligence 9, no. 1 (January 2015): 16–29. http://dx.doi.org/10.4018/ijcini.2015010102.

Abstract:
Video surveillance technology is playing an increasingly important role in traffic monitoring. A vehicle's static properties are crucial information in examining criminal and traffic violations. Image and video resources play an important role in the analysis of traffic events. With the rapid growth of video surveillance devices, a large number of image and video resources is increasingly being created. It is crucial to explore, share, reuse, and link these multimedia resources for better organizing traffic events. As video surveillance technology has developed, it has been widely used in traffic monitoring, and there is a trend towards using it for intelligent analysis of vehicles. Software tools that analyze vehicles in videos are already used in smart-card and electronic-eye systems, helping police extract useful information such as plate number and speed. The key technology is obtaining various properties of the vehicle. This paper provides an overview of the algorithms and technologies used in extracting static properties of vehicles in video.
27

Zhang, Jingwei, Christos Chatzichristos, Kaat Vandecasteele, Lauren Swinnen, Victoria Broux, Evy Cleeren, Wim Van Paesschen, and Maarten De Vos. "Automatic annotation correction for wearable EEG based epileptic seizure detection." Journal of Neural Engineering 19, no. 1 (February 1, 2022): 016038. http://dx.doi.org/10.1088/1741-2552/ac54c1.

Abstract:
Objective. Video-electroencephalography (vEEG), which defines the ground truth for the detection of epileptic seizures, is inadequate for long-term home monitoring. Thanks to advantages in comfort and unobtrusiveness, wearable EEG devices have been suggested as a solution for home monitoring. However, one of the challenges in data-driven automated seizure detection with wearable EEG data is to have reliable seizure annotations. Seizure annotations on the gold-standard 25-channel vEEG recordings may not be optimal to delineate seizure activity on the concomitantly recorded wearable EEG, due to artifacts or absence of ictal activity on the limited set of electrodes of the wearable EEG. This paper aims to develop an automatic approach to correct for imperfect annotations of seizure activity on wearable EEG, which can be used to train seizure detection algorithms. Approach. This paper first investigates the effectiveness of correcting the seizure annotations for the training set with a visual annotation correction. Then a novel approach has been proposed to automatically remove non-seizure data from wearable EEG in epochs annotated as seizures in gold-standard video-EEG recordings. The performance of the automatic annotation correction approach was evaluated by comparing the seizure detection models trained with (a) original vEEG seizure annotations, (b) visually corrected seizure annotations, and (c) automatically corrected seizure annotations. Main results. The automated seizure detection approach trained with automatically corrected seizure annotations was more sensitive and had fewer false-positive detections compared to the approach trained with visually corrected seizure annotations, and the approach trained with the original seizure annotations from gold-standard vEEG. Significance. The wearable EEG seizure detection approach performs better when trained with automatic seizure annotation correction.
28

Zheng, Changliang, Zhiqian Zhang, Zhaoxin Liu, Hepeng Zang, Hongli Huang, and Jie Ren. "Research on Video Game Scene Annotation in Basketball Video." International Journal of Multimedia and Ubiquitous Engineering 12, no. 1 (January 31, 2017): 281–90. http://dx.doi.org/10.14257/ijmue.2017.12.1.24.

29

Gutierrez Becker, B., E. Giuffrida, M. Mangia, F. Arcadu, V. Whitehill, D. Guaraglia, M. Schwartz, et al. "P069 Artificial intelligence (AI)-filtered Videos for Accelerated Scoring of Colonoscopy Videos in Ulcerative Colitis Clinical Trials." Journal of Crohn's and Colitis 15, Supplement_1 (May 1, 2021): S173—S174. http://dx.doi.org/10.1093/ecco-jcc/jjab076.198.

Abstract:
Background Endoscopic assessment is a critical procedure to assess the improvement of mucosa and response to therapy, and therefore a pivotal component of clinical trial endpoints for IBD. Central scoring of endoscopic videos is challenging and time consuming. We evaluated the feasibility of using an Artificial Intelligence (AI) algorithm to automatically produce filtered videos where the non-readable portions of the video are removed, with the aim of accelerating the scoring of endoscopic videos. Methods The AI algorithm was based on a Convolutional Neural Network trained to perform a binary classification task. This task consisted of assigning the frames in a colonoscopy video to one of two classes: “readable” or “unreadable.” The algorithm was trained using annotations performed by two data scientists (BG, FA). The criteria to consider a frame “readable” were: i) the colon walls were within the field of view; ii) contrast and sharpness of the frame were sufficient to visually inspect the mucosa, and iii) no presence of artifacts completely obstructing the visibility of the mucosa. The frames were extracted randomly from 351 colonoscopy videos of the etrolizumab EUCALYPTUS (NCT01336465) Phase II ulcerative colitis clinical trial. Evaluation of the performance of the AI algorithm was performed on colonoscopy videos obtained as part of the etrolizumab HICKORY (NCT02100696) and LAUREL (NCT02165215) Phase III ulcerative colitis clinical trials. Each video was filtered using the AI algorithm, resulting in a shorter video where the sections considered unreadable by the AI algorithm were removed. Each of three annotators (EG, MM and MD) was randomly assigned an equal number of AI-filtered videos and raw videos. The gastroenterologist was tasked to score temporal segments of the video according to the Mayo Clinic Endoscopic Subscore (MCES). Annotations were performed by means of an online annotation platform (Virgo Surgical Video Solutions, Inc). Results We measured the time it took the annotators to score raw and AI-filtered videos. We observed a statistically significant reduction (Mann-Whitney U test p-value=0.039) in the median time spent by the annotators scoring raw videos (10.59 ± 0.94 minutes) with respect to the time spent scoring AI-filtered videos (9.51 ± 0.92 minutes), with a substantial intra-rater agreement when evaluating highlight and raw videos (Cohen’s kappa 0.92 and 0.55 for experienced and junior gastroenterologists respectively). Conclusion Our analysis shows that AI can be used reliably as an assisting tool to automatically remove non-readable time segments from full colonoscopy videos. The use of our proposed algorithm can lead to reduced annotation times in the task of centrally reading colonoscopy videos.
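The reported timing comparison relies on a Mann-Whitney U test; the sketch below shows how such a comparison is computed with SciPy, using made-up per-video scoring times rather than the trial's data.

```python
# Sketch: compare scoring times for raw vs. AI-filtered videos.
from scipy.stats import mannwhitneyu

raw_minutes = [10.1, 11.3, 9.8, 12.0, 10.6, 11.1]
filtered_minutes = [9.2, 9.9, 8.8, 10.1, 9.4, 9.7]

stat, p_value = mannwhitneyu(raw_minutes, filtered_minutes, alternative="two-sided")
print(f"U = {stat:.1f}, p = {p_value:.3f}")
```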
30

Ardley, Jillian, and Jacqueline Johnson. "Video Annotation Software in Teacher Education: Researching University Supervisor’s Perspective of a 21st-Century Technology." Journal of Educational Technology Systems 47, no. 4 (November 27, 2018): 479–99. http://dx.doi.org/10.1177/0047239518812715.

Abstract:
Video recordings for student teaching field experiences have been utilized with student teachers (also known as teacher candidates) to (a) capture the demonstration of their lesson plans, (b) critique their abilities within the performance, and (c) share and rate experiences for internal and external evaluations by the state and other organizations. Many times, the recording, saving, grading, and sharing process was not efficient. Thus, the feedback cycle from the university supervisor to the teacher candidate was negatively impacted. However, one communication technology tool that has the potential to facilitate the feedback process is video annotation software. This communication technology uses storage on a remote server, also known as a cloud, to store videos that include typed commentary in sync with the corresponding portion of the recorded video. A group of university supervisors piloted a video annotation tool during student teaching to rate its effectiveness. Through a survey, the participants addressed how they perceived the implementation of the video annotation tool within the student teaching experience. Results suggest that a video annotation technology-based supervision method is feasible and effective if paired with effective training and technical support.
31

Chen, Hua-Tsung, Wen-Jiin Tsai, and Suh-Yin Lee. "Sports Information Retrieval for Video Annotation." International Journal of Digital Library Systems 1, no. 1 (2010): 62–88. http://dx.doi.org/10.4018/jdls.2010102704.

32

Dorado, A., J. Calic, and E. Izquierdo. "A Rule-Based Video Annotation System." IEEE Transactions on Circuits and Systems for Video Technology 14, no. 5 (May 2004): 622–33. http://dx.doi.org/10.1109/tcsvt.2004.826764.

33

Meng Wang, Xian-Sheng Hua, Richang Hong, Jinhui Tang, Guo-Jun Qi, and Yan Song. "Unified Video Annotation via Multigraph Learning." IEEE Transactions on Circuits and Systems for Video Technology 19, no. 5 (May 2009): 733–46. http://dx.doi.org/10.1109/tcsvt.2009.2017400.

34

Vondrick, Carl, Donald Patterson, and Deva Ramanan. "Efficiently Scaling up Crowdsourced Video Annotation." International Journal of Computer Vision 101, no. 1 (September 5, 2012): 184–204. http://dx.doi.org/10.1007/s11263-012-0564-1.

35

Chou, Chien-Li, Hua-Tsung Chen, and Suh-Yin Lee. "Multimodal Video-to-Near-Scene Annotation." IEEE Transactions on Multimedia 19, no. 2 (February 2017): 354–66. http://dx.doi.org/10.1109/tmm.2016.2614426.

36

Kankanhalli, M. S., and Tat-Seng Chua. "Video modeling using strata-based annotation." IEEE Multimedia 7, no. 1 (2000): 68–74. http://dx.doi.org/10.1109/93.839313.

37

Fu, Xin, John C. Schaefer, Gary Marchionini, and Xiangming Mu. "Video Annotation in a Learning Environment." Proceedings of the American Society for Information Science and Technology 43, no. 1 (October 10, 2007): 1–22. http://dx.doi.org/10.1002/meet.14504301175.

38

GAYATRI, T. R., and S. RAMAN. "Natural language interface to video database." Natural Language Engineering 7, no. 1 (March 2001): 1–27. http://dx.doi.org/10.1017/s1351324901002601.

Abstract:
In this paper, we discuss a natural language interface to a database of structured textual descriptions in the form of annotations of video objects. The interface maps the natural language query input on to the annotation structures. The language processing is done in three phases of expectations and implications from the input word, disambiguation of noun implications and slot-filling of prepositional expectations, and finally, disambiguation of verbal expectations. The system has been tested with different types of user inputs, including ill-formed sentences, and studied for erroneous inputs and for different types of portability issues.
39

Xu, Zheng, Fenglin Zhi, Chen Liang, Lin Mei, and Xiangfeng Luo. "Generating Semantic Annotation of Video for Organizing and Searching Traffic Resources." International Journal of Cognitive Informatics and Natural Intelligence 8, no. 1 (January 2014): 51–66. http://dx.doi.org/10.4018/ijcini.2014010104.

Abstract:
Image and video resources play an important role in traffic event analysis. With the rapid growth of video surveillance devices, a large number of image and video resources is increasingly being created. It is crucial to explore, share, reuse, and link these multimedia resources for better organizing traffic events. Most video resources are currently annotated in an isolated way, which means that they lack semantic connections. Thus, providing facilities for annotating these video resources is in high demand. These facilities create semantic connections among video resources and allow their metadata to be understood globally. Adopting semantic technologies, this paper introduces a video annotation platform. The platform enables users to semantically annotate video resources using vocabularies defined by traffic event ontologies. Moreover, the platform provides a search interface for annotated video resources. The results of initial development demonstrate the benefits of applying semantic technologies in terms of reusability, scalability and extensibility.
40

Boldrini, Elena, Alberto Cattaneo, and Alessia Evi-Colombo. "Was it worth the effort? An exploratory study on the usefulness and acceptance of video annotation for in-service teachers training in VET sector." Research on Education and Media 11, no. 1 (June 1, 2019): 100–108. http://dx.doi.org/10.2478/rem-2019-0014.

Abstract:
In the field of teacher training at different levels (primary and secondary) and of different types (in-service and pre-service), exploiting video support for the analysis of teaching practices is a well-established training method to foster reflection on professional practices, self- and hetero-observation, and ultimately to improve teaching. While video has long been used to capture microteaching episodes, illustrate classroom cases and practices, and review teaching practices, recent developments in video annotation tools may help to extend and augment the potentialities of video viewing. A limited number of studies have explored this field of research, especially with respect to in-service teacher training; however, this is less the case for Vocational Education and Training. The study presented here is a pilot experience in the field of in-service teacher training in the vocational sector. A two-year training programme using video annotation has been evaluated and analysed. The dimensions investigated are teachers’ perceptions of the usefulness, acceptance and sustainability of video annotation in the analysis of teaching practices. Results show very good acceptance and usefulness of video annotation for reflecting on practice and for delivering feedback. Implications for the integration of a structural programme of analysis of practices based on video annotation are presented.
41

Biel, Joan-Isaac, and Daniel Gatica-Perez. "The Good, the Bad, and the Angry: Analyzing Crowdsourced Impressions of Vloggers." Proceedings of the International AAAI Conference on Web and Social Media 6, no. 1 (August 3, 2021): 407–10. http://dx.doi.org/10.1609/icwsm.v6i1.14304.

Abstract:
We address the study of interpersonal perception in social conversational video based on multifaceted impressions collected from short video-watching. First, we crowdsourced the annotation of personality, attractiveness, and mood impressions for a dataset of YouTube vloggers, generating a corpus that has the potential to support the development of automatic techniques for vlogger characterization. Then, we provide an analysis of the crowdsourced annotations focusing on the level of agreement among annotators, as well as the interplay between different impressions. Overall, this work provides interesting new insights on vlogger impressions and the use of crowdsourcing to collect behavioral annotations from multimodal data.
42

Johnston, Trevor. "The reluctant oracle: using strategic annotations to add value to, and extract value from, a signed language corpus." Corpora 9, no. 2 (November 2014): 155–89. http://dx.doi.org/10.3366/cor.2014.0056.

Abstract:
In this paper, I discuss the ways in which multimedia annotation software is being used to transform an archive of Auslan recordings into a true machine-readable language corpus. After the basic structure of the annotation files in the Auslan corpus is described and the exercise differentiated from transcription, the glossing and annotation conventions are explained. Following this, I exemplify the searching and pattern-matching at different levels of linguistic organisation that these annotations make possible. The paper shows how, in the creation of signed language corpora, it is important to be clear about the difference between transcription and annotation. Without an awareness of this distinction – and despite time consuming and expensive processing of the video recordings – we may not be able to discern the types of patterns in our corpora that we hope to. The conventions are designed to ensure that the annotations really do enable researchers to identify regularities at different levels of linguistic organisation in the corpus and, thus, to test, or build on, existing descriptions of the language.
43

Gil de Gómez Pérez, David, and Roman Bednarik. "POnline: An Online Pupil Annotation Tool Employing Crowd-sourcing and Engagement Mechanisms." Human Computation 6 (December 10, 2019): 176–91. http://dx.doi.org/10.15346/hc.v6i1.99.

Abstract:
Pupil center and pupil contour are two of the most important features in the eye image used for video-based eye-tracking. Well-annotated databases are needed in order to allow benchmarking of available and new pupil detection and gaze estimation algorithms. Unfortunately, the creation of such a data set is costly and requires a lot of effort, including manual work by the annotators. In addition, the reliability of manual annotations is hard to establish with a low number of annotators. In order to facilitate progress in gaze-tracking algorithm research, we created an online pupil annotation tool that engages many users through gamification and allows the power of the crowd to be utilized to create reliable annotations. We describe the tool and the mechanisms employed, and report results on the annotation of a publicly available data set. Finally, we demonstrate an example utilization of the new high-quality annotations in a comparison of two state-of-the-art pupil center algorithms.
44

Rolf, Rüdiger, Hannah Reuter, Martin Abel, and Kai-Christoph Hamborg. "Requirements of students for video-annotations in lecture recordings." Interactive Technology and Smart Education 11, no. 3 (September 9, 2014): 223–34. http://dx.doi.org/10.1108/itse-07-2014-0021.

Abstract:
Purpose – Improving the use of annotations in lecture recordings. Design/methodology/approach – Requirements analysis with scenario-based design (SBD) in focus groups. Findings – These seven points have been extracted from the feedback of the focus groups: (1) Control of the annotation feature (turn on/turn off). (2) An option to decide who is able to see their comments (groups, lecturer, friends). (3) An easy and paper-like experience in creating a comment. (4) An option to discuss comments. (5) An option to import already existing comments. (6) Color-coding of the different types of comments. (7) An option to print their annotations within the context of the recording. Research limitations/implications – The study was performed to improve the open-source lecture recording system Opencast Matterhorn. Originality/value – Annotations can help students who use lecture recordings to move from passive watching to active viewing and reflecting.
45

Chamaseman, Fereshteh Falah, Lilly Suriani Affendey, Norwati Mustapha, and Fatimah Khalid. "Automatic Video Annotation Framework Using Concept Detectors." Journal of Applied Sciences 15, no. 2 (January 15, 2015): 256–63. http://dx.doi.org/10.3923/jas.2015.256.263.

46

Howard, Craig D. "Participatory Media Literacy in Collaborative Video Annotation." TechTrends 65, no. 5 (July 14, 2021): 860–73. http://dx.doi.org/10.1007/s11528-021-00632-6.

47

Yamamoto, Daisuke, and Katashi Nagao. "Web-based Video Annotation and its Applications." Transactions of the Japanese Society for Artificial Intelligence 20 (2005): 67–75. http://dx.doi.org/10.1527/tjsai.20.67.

48

Yamamoto, Daisuke, Tomoki Masuda, Shigeki Ohira, and Katashi Nagao. "Collaborative Video Annotation by Sharing Tag Clouds." Transactions of the Japanese Society for Artificial Intelligence 25 (2010): 243–51. http://dx.doi.org/10.1527/tjsai.25.243.

49

Cortes, L., and Y. Amit. "Efficient Annotation of Vesicle Dynamics Video Microscopy." IEEE Transactions on Pattern Analysis and Machine Intelligence 30, no. 11 (November 2008): 1998–2010. http://dx.doi.org/10.1109/tpami.2008.84.

50

Wang, Han, Xinxiao Wu, and Yunde Jia. "Heterogeneous domain adaptation method for video annotation." IET Computer Vision 11, no. 2 (November 29, 2016): 181–87. http://dx.doi.org/10.1049/iet-cvi.2016.0148.
