
Dissertations / Theses on the topic 'Video text'


Consult the top 50 dissertations / theses for your research on the topic 'Video text.'


1

Sidevåg, Emmilie. "Användarmanual text vs video." Thesis, Linnéuniversitetet, Institutionen för datavetenskap, fysik och matematik, DFM, 2011. http://urn.kb.se/resolve?urn=urn:nbn:se:lnu:diva-17617.

2

Salway, Andrew. "Video annotation : the role of specialist text." Thesis, University of Surrey, 1999. http://epubs.surrey.ac.uk/843350/.

Abstract:
Digital video is among the most information-intensive modes of communication. The retrieval of video from digital libraries, along with sound and text, is a major challenge for the computing community in general and for the artificial intelligence community specifically. The advent of digital video has set some old questions in a new light. Questions relating to aesthetics and to the role of surrogates (image for reality and text for image) invariably touch upon the link between vision and language. Dealing with this link computationally is important for the artificial intelligence enterprise. Interesting images to consider both aesthetically and for research in video retrieval include those which are constrained and patterned, and which convey rich meanings; for example, dance. These are specialist images for us and require a special language for description and interpretation. Furthermore, they require specialist knowledge to be understood since there is usually more than meets the untrained eye: this knowledge may also be articulated in the language of the specialism. In order to be retrieved effectively and efficiently, video has to be annotated, particularly so for specialist moving images. Annotation involves attaching keywords from the specialism along with, for us, commentaries produced by experts, including those written and spoken specifically for annotation and those obtained from a corpus of extant texts. A system that processes such collateral text for video annotation should perhaps be grounded in an understanding of the link between vision and language. This thesis attempts to synthesise ideas from artificial intelligence, multimedia systems, linguistics, cognitive psychology and aesthetics. The link between vision and language is explored by focusing on moving images of dance and the special language used to describe and interpret them. We have developed an object-oriented system, KAB, which helps to annotate a digital video library with a collateral corpus of texts and terminology. User evaluation has been encouraging. The system is now available on the WWW.
3

Smith, Gregory. "Video Scene Detection Using Closed Caption Text." VCU Scholars Compass, 2009. http://scholarscompass.vcu.edu/etd/1932.

Abstract:
Issues in automatic video biography editing are similar to those in video scene detection and Topic Detection and Tracking (TDT). The techniques of video scene detection and TDT can be applied to interviews to reduce the time necessary to edit a video biography. The system addresses the problems of video text extraction, story segmentation, and correlation. This thesis project was divided into three parts: extraction, scene detection, and correlation. The project successfully detected scene breaks in series television episodes and displayed scenes that had similar content.
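
To make the caption-based detection concrete, here is a minimal sketch that flags a scene break where consecutive closed-caption windows share little vocabulary. The bag-of-words cosine similarity, window granularity and threshold are illustrative assumptions, not the thesis's actual method.

    # Sketch: scene breaks from low lexical overlap between caption windows.
    from collections import Counter
    import math

    def cosine(a, b):
        dot = sum(a[t] * b[t] for t in set(a) & set(b))
        na = math.sqrt(sum(v * v for v in a.values()))
        nb = math.sqrt(sum(v * v for v in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    def scene_breaks(caption_windows, threshold=0.15):
        """caption_windows: caption text chunks in temporal order."""
        bags = [Counter(w.lower().split()) for w in caption_windows]
        return [i for i in range(1, len(bags))
                if cosine(bags[i - 1], bags[i]) < threshold]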
4

Zhang, Jing. "Extraction of Text Objects in Image and Video Documents." Scholar Commons, 2012. http://scholarcommons.usf.edu/etd/4266.

Abstract:
The popularity of digital image and video is increasing rapidly. To help users navigate libraries of image and video, Content Based Information Retrieval (CBIR) systems that can automatically index image and video documents are needed. However, due to the semantic gap between low-level machine descriptors and high-level semantic descriptors, existing CBIR systems are still far from perfect. Text embedded in multimedia data, as a well-defined model of concepts for human communication, contains much semantic information related to the content. This text information can provide a much truer form of content-based access to image and video documents if it can be extracted and harnessed efficiently. This dissertation addresses the problems of detecting text objects in images and video and tracking text events in video. For the text detection problem, we propose a new unsupervised text detection algorithm. A new text model is constructed to describe text objects using a pictorial structure. Each character is a part in the model and every two neighboring characters are connected by a spring-like link. Two characters and the link connecting them are defined as a text unit. We localize candidate parts by extracting closed boundaries and initialize the links by connecting two neighboring candidate parts based on the spatial relationship of characters. For every candidate part, we compute character energy using three new character features: averaged angle difference of corresponding pairs, fraction of non-noise pairs, and vector of stroke width. They are extracted based on our observation that the edge of a character can be divided into two sets with high similarities in length, curvature, and orientation. For every candidate link, we compute link energy based on our observation that the characters of a text typically align along a certain direction with similar color, size, and stroke width. For every candidate text unit, we combine character and link energies to compute a text unit energy which indicates the probability that the candidate text model is a real text object. The final text detection results are generated using text unit energy based thresholding. For the text tracking problem, we construct a text event model by using a pictorial structure as well. In this model, the detected text object in each video frame is a part and two neighboring text objects of a text event are connected by a spring-like link. Inter-frame link energy is computed for each link based on the character energy, similarity of neighboring text objects, and motion information. After refining the model using inter-frame link energy, the remaining text event models are marked as text events. At the character level, because the proposed method is based on the assumption that the strokes of a character have uniform thickness, it can detect and localize characters from different languages in different styles, such as typewritten or handwritten text, if the characters have approximately uniform stroke thickness. At the text level, however, because the spatial relationship between two neighboring characters is used to localize text objects, the proposed method may fail to detect and localize characters with multiple separate strokes or connected characters. For example, characters of some East Asian languages, such as Chinese, Japanese, and Korean, consist of many separate strokes; we need to group the strokes first to form single characters and then group characters to form text objects.
In contrast, the characters of some languages, such as Arabic and Hindi, are connected together, so we cannot extract spatial information between neighboring characters since they are detected as a single character. Therefore, at the current stage the proposed method can detect and localize text objects that are composed of separate characters with connected strokes of approximately uniform thickness. We evaluated our method comprehensively using three English language-based image and video datasets: the ICDAR 2003/2005 text locating dataset (258 training images and 251 test images), the Microsoft Street View text detection dataset (307 street view images), and the VACE video dataset (50 broadcast news videos from CNN and ABC). The experimental results demonstrate that the proposed text detection method can capture the inherent properties of text and discriminate text from other objects efficiently.
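
The energy formulation lends itself to a compact sketch. The weights and threshold below are placeholders for illustration; the dissertation derives its energies from the stroke and alignment features described in the abstract.

    # Sketch: a text unit = two candidate characters plus the link between
    # them; its energy combines character and link energies.
    def text_unit_energy(char_a, char_b, link, w_char=0.5, w_link=0.5):
        """Higher energy = more likely a real text unit."""
        return w_char * 0.5 * (char_a + char_b) + w_link * link

    def detect_text_units(units, threshold=0.6):
        """units: iterable of (char_energy_a, char_energy_b, link_energy)."""
        return [u for u in units if text_unit_energy(*u) >= threshold]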
5

Sjölund, Jonathan. "Detection of Frozen Video Subtitles Using Machine Learning." Thesis, Linköpings universitet, Datorseende, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-158239.

Abstract:
When subtitles are burned into a video, an error can sometimes occur in the encoder that results in the same subtitle being burned into several frames, resulting in subtitles becoming frozen. This thesis provides a way to detect frozen video subtitles with the help of an implemented text detector and classifier. Two types of classifiers, naïve classifiers and machine learning classifiers, are tested and compared on a variety of different videos to see how much a machine learning approach can improve the performance. The naïve classifiers are evaluated using ground truth data to gain an understanding of the importance of good text detection. To understand the difficulty of the problem, two different machine learning classifiers are tested, logistic regression and random forests. The result shows that machine learning improves the performance over using naïve classifiers by improving the specificity from approximately 87.3% to 95.8% and improving the accuracy from 93.3% to 95.5%. Random forests achieve the best overall performance, but the difference compared to when using logistic regression is small enough that more computationally complex machine learning classifiers are not necessary. Using the ground truth shows that the weaker naïve classifiers would be improved by at least 4.2% accuracy, thus a better text detector is warranted. This thesis shows that machine learning is a viable option for detecting frozen video subtitles.
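
The reported comparison can be sketched with scikit-learn. The per-frame feature vectors (for example, pixel differences inside the detected subtitle box across frames) are an assumption; the thesis's actual feature set may differ.

    # Sketch: compare the two classifiers the thesis evaluates.
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import accuracy_score, recall_score

    def compare_classifiers(X_train, y_train, X_test, y_test):
        models = (LogisticRegression(max_iter=1000),
                  RandomForestClassifier(n_estimators=100))
        for model in models:
            model.fit(X_train, y_train)
            pred = model.predict(X_test)
            # Specificity = recall of the negative ("not frozen") class.
            spec = recall_score(y_test, pred, pos_label=0)
            print(type(model).__name__,
                  "accuracy:", accuracy_score(y_test, pred),
                  "specificity:", spec)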
6

Chen, Datong. "Text detection and recognition in images and video sequences /." [S.l.] : [s.n.], 2003. http://library.epfl.ch/theses/?display=detail&nr=2863.

7

Štindlová, Marie. "Museli to založit." Master's thesis, Vysoké učení technické v Brně. Fakulta výtvarných umění, 2015. http://www.nusl.cz/ntk/nusl-232451.

8

Bird, Paul. "Elementary students' comprehension of computer presented text." Thesis, University of British Columbia, 1990. http://hdl.handle.net/2429/29187.

Abstract:
The study investigated grade 6 students' comprehension of narrative text when presented on a computer and as printed words on paper. A set of comprehension tests was developed for three stories of varying length (382, 1047 and 1933 words) using a skills hierarchy protocol. The text for each story was prepared for presentation on a Macintosh computer, using a program written for the study, and in print in the form of exact copies of the computer screen. Students from two grade 6 classes in a suburban elementary school were randomly assigned to read one of the stories either in print form or on the computer, and subsequently completed a comprehension test as well as a questionnaire concerning attitude and personal information. The responses from the comprehension tests were evaluated by graduate students in Language Education. The data from the tests and questionnaires were analysed to determine measures of test construct validity, inter-rater reliability, and any significant difference in the means of comprehension scores between the two experimental groups for each story. The results indicated small but non-significant differences between the means of the three comprehension test scores for computer and print. A number of students reading from the computer complained of eye fatigue, and the scores of subjects reading the longest story and complaining of eye fatigue were significantly lower.
9

Sharma, Nabin. "Multi-lingual Text Processing from Videos." Thesis, Griffith University, 2015. http://hdl.handle.net/10072/367489.

Abstract:
Advances in digital technology have produced low-priced, highly portable imaging devices such as digital cameras attached to mobile phones, camcorders, and PDAs. These devices can be used to capture videos and images with ease, which can then be shared through the internet and other communication media. In the commercial domain, cameras are used to create news, advertisement videos and other forms of material for information communication. The use of multiple languages to create information for targeted audiences is quite common in countries having multiple official languages. Transmission of news, advertisement videos and images across various communication channels has created large databases of videos, and these are increasing exponentially. Effective management of such databases requires proper indexing for the retrieval of relevant information. Text information is dominant in most videos and images and can be used as keywords for the retrieval of relevant videos and images. Automatic annotation of videos and images to extract keywords requires the text to be converted to an editable form. This thesis addresses the problem of multi-lingual text processing from video frames. Multi-lingual text processing involves text detection, word segmentation, script identification, and text recognition; additionally, text frame classification is required to avoid processing a video frame which does not contain text information. A new multi-lingual video word dataset was created and published as a part of the current research. The dataset comprises words of ten scripts, namely English (Roman), Hindi (Devanagari), Bengali (Bangla), Arabic, Oriya, Gujrathi, Punjabi, Kannada, Tamil and Telugu. This dataset was created to facilitate future research on multi-lingual text recognition.
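
The processing chain named in the abstract can be sketched as a pipeline. Every callable below is a placeholder (an assumption), shown only to make the order of the stages concrete.

    # Sketch: text frame classification -> detection -> word segmentation
    # -> script identification -> script-specific recognition.
    def process_frame(frame, is_text_frame, detect, segment_words,
                      identify_script, recognizers):
        if not is_text_frame(frame):      # skip frames without text
            return []
        words = []
        for region in detect(frame):
            for word_img in segment_words(region):
                script = identify_script(word_img)   # e.g. 'Devanagari'
                words.append(recognizers[script](word_img))
        return words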
10

Fraz, Muhammad. "Video content analysis for intelligent forensics." Thesis, Loughborough University, 2014. https://dspace.lboro.ac.uk/2134/18065.

Abstract:
The networks of surveillance cameras installed in public places and private territories continuously record video data with the aim of detecting and preventing unlawful activities. This enhances the importance of video content analysis applications, either for real time (i.e. analytic) or post-event (i.e. forensic) analysis. In this thesis, the primary focus is on four key aspects of video content analysis, namely; 1. Moving object detection and recognition, 2. Correction of colours in the video frames and recognition of colours of moving objects, 3. Make and model recognition of vehicles and identification of their type, 4. Detection and recognition of text information in outdoor scenes. To address the first issue, a framework is presented in the first part of the thesis that efficiently detects and recognizes moving objects in videos. The framework targets the problem of object detection in the presence of complex background. The object detection part of the framework relies on background modelling technique and a novel post processing step where the contours of the foreground regions (i.e. moving object) are refined by the classification of edge segments as belonging either to the background or to the foreground region. Further, a novel feature descriptor is devised for the classification of moving objects into humans, vehicles and background. The proposed feature descriptor captures the texture information present in the silhouette of foreground objects. To address the second issue, a framework for the correction and recognition of true colours of objects in videos is presented with novel noise reduction, colour enhancement and colour recognition stages. The colour recognition stage makes use of temporal information to reliably recognize the true colours of moving objects in multiple frames. The proposed framework is specifically designed to perform robustly on videos that have poor quality because of surrounding illumination, camera sensor imperfection and artefacts due to high compression. In the third part of the thesis, a framework for vehicle make and model recognition and type identification is presented. As a part of this work, a novel feature representation technique for distinctive representation of vehicle images has emerged. The feature representation technique uses dense feature description and mid-level feature encoding scheme to capture the texture in the frontal view of the vehicles. The proposed method is insensitive to minor in-plane rotation and skew within the image. The capability of the proposed framework can be enhanced to any number of vehicle classes without re-training. Another important contribution of this work is the publication of a comprehensive up to date dataset of vehicle images to support future research in this domain. The problem of text detection and recognition in images is addressed in the last part of the thesis. A novel technique is proposed that exploits the colour information in the image for the identification of text regions. Apart from detection, the colour information is also used to segment characters from the words. The recognition of identified characters is performed using shape features and supervised learning. Finally, a lexicon based alignment procedure is adopted to finalize the recognition of strings present in word images. Extensive experiments have been conducted on benchmark datasets to analyse the performance of proposed algorithms. 
The results show that the proposed moving object detection and recognition technique outperformed well-known baseline techniques. The proposed framework for the correction and recognition of object colours in video frames achieved all the aforementioned goals. The performance analysis of the vehicle make and model recognition framework on multiple datasets has shown the strength and reliability of the technique when used within various scenarios. Finally, the experimental results for the text detection and recognition framework on benchmark datasets have revealed the potential of the proposed scheme for accurate detection and recognition of text in the wild.
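
The first stage (moving object detection against a modelled background) can be illustrated with OpenCV's stock MOG2 subtractor. This is a minimal sketch only; the thesis's own background modelling and edge-segment contour refinement are more elaborate.

    # Sketch: foreground masks for moving objects via background modelling.
    import cv2

    def moving_object_masks(video_path):
        cap = cv2.VideoCapture(video_path)
        subtractor = cv2.createBackgroundSubtractorMOG2(detectShadows=True)
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            mask = subtractor.apply(frame)     # raw foreground mask
            mask = cv2.medianBlur(mask, 5)     # crude noise reduction
            yield frame, mask
        cap.release()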
11

Zheng, Yilin. "Text-Based Speech Video Synthesis from a Single Face Image." The Ohio State University, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=osu1572168353691788.

12

Gokturk, Ozkan Ziya. "Metadata Extraction From Text In Soccer Domain." Master's thesis, METU, 2008. http://etd.lib.metu.edu.tr/upload/12609871/index.pdf.

Abstract:
Video databases and content-based retrieval in these databases have become popular with the improvements in technology. Metadata extraction techniques are used for providing data about video content. One popular metadata extraction technique for multimedia is information extraction from text. For some domains it is possible to find accompanying text with the video, such as the soccer domain, the movie domain and the news domain. In this thesis, we present an approach to metadata extraction from match reports for the soccer domain. The UEFA Cup and UEFA Champions League match reports are downloaded from the web site of UEFA by a web crawler. These match reports are preprocessed by using regular expressions and then important events are extracted by using hand-written rules. In addition to hand-written rules, two different machine learning techniques are applied on the match corpus to learn event patterns and automatically extract match events. Extracted events are saved in an MPEG-7 file. A user interface is implemented to query the events in the MPEG-7 match corpus and view the corresponding video segments.
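
The hand-written-rule stage can be sketched with regular expressions. The patterns and report phrasing below are invented for illustration; the thesis's rules target the actual wording of UEFA match reports.

    # Sketch: extract goal and booking events from a match report.
    import re

    GOAL = re.compile(r"(?P<player>[A-Z][\w'-]+) scored .*? in the "
                      r"(?P<minute>\d+)(?:st|nd|rd|th) minute")
    CARD = re.compile(r"(?P<player>[A-Z][\w'-]+) was booked")

    def extract_events(report):
        events = [{"type": "goal", **m.groupdict()}
                  for m in GOAL.finditer(report)]
        events += [{"type": "booking", "player": m.group("player")}
                   for m in CARD.finditer(report)]
        return events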
13

Hekimoglu, M. Kadri. "Video-text processing by using Motorola 68020 CPU and its environment." Thesis, Monterey, California. Naval Postgraduate School, 1991. http://hdl.handle.net/10945/26833.

14

Tarczyńska, Anna. "Methods of Text Information Extraction in Digital Videos." Thesis, Blekinge Tekniska Högskola, Sektionen för datavetenskap och kommunikation, 2012. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-2656.

Abstract:
Context: The huge amount of existing digital video files needs indexing to make it available to customers (easier searching). The indexing can be provided by text information extraction. In this thesis we have analysed and compared methods of text information extraction in digital videos. Furthermore, we have evaluated them in the new context proposed by us, namely usefulness in sports news indexing and information retrieval. Objectives: The objectives of this thesis are as follows: providing a better understanding of the nature of text extraction; performing a systematic literature review on various methods of text information extraction in digital videos of TV sports news; designing and executing an experiment in the testing environment; evaluating available and promising methods of text information extraction from digital video files in the proposed context associated with video sports news indexing and retrieval; providing an adequate solution in the proposed context described above. Methods: This thesis uses three research methods: a Systematic Literature Review, Video Content Analysis with a checklist, and an Experiment. The Systematic Literature Review has been used to study the nature of text information extraction, to establish the methods and challenges, and to specify an effective way of conducting the experiment. The Video Content Analysis has been used to establish the context for the experiment. Finally, the experiment has been conducted to answer the main research question: how useful are the methods of text information extraction for indexation of video sports news and information retrieval? Results: Through the Systematic Literature Review we identified 29 challenges of the text information extraction methods, and 10 chains between them. We extracted 21 tools and 105 different methods, and analyzed the relations between them. Through Video Content Analysis we specified three groups of probability of text extraction from video, and 14 categories for providing video sports news indexation with the taxonomy hierarchy. We conducted the experiment on three video files, with 127 frames, 8,970 characters, and 1,814 words, using the only available tool, MoCA. As a result, we reported 10 errors and proposed recommendations for each of them. We evaluated the tool according to the categories mentioned above and identified four advantages and nine disadvantages of the tool. Conclusions: It is hard to compare the methods described in the literature, because the tools are not available for testing and they are not compared with each other. Furthermore, the values of recall and precision measures highly depend on the quality of the text contained in the video, so performing the experiments on the same indexed database is necessary. However, text information extraction is time-consuming (because of the huge number of frames in video), and even a high character recognition rate gives a low word recognition rate. Therefore, the usefulness of text information extraction for video indexation is still low. Because most of the text information contained in video news is inserted in post-processing, the text extraction could be provided at the root: during the processing of the original video, by the broadcasting company (e.g. by automatically saving inserted text in a separate file). Then text information extraction would not be necessary for managing new video files.
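
The observation that a high character recognition rate still yields a low word recognition rate follows from simple arithmetic: if character errors were independent, an n-character word would be fully correct with probability p^n. The numbers below are illustrative, not the thesis's data.

    # Sketch: word-level accuracy implied by a 95% character accuracy.
    char_rate = 0.95
    for n in (4, 6, 8):
        print(n, "chars:", round(char_rate ** n, 3))  # 0.815, 0.735, 0.663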
15

Demirtas, Kezban. "Automatic Video Categorization And Summarization." Master's thesis, METU, 2009. http://etd.lib.metu.edu.tr/upload/3/12611113/index.pdf.

Abstract:
In this thesis, we perform automatic video categorization and summarization using the subtitles of videos. We propose two methods for video categorization. The first method performs unsupervised categorization by applying natural language processing techniques to video subtitles and uses the WordNet lexical database and WordNet domains. The method starts with text preprocessing. Then a keyword extraction algorithm and a word sense disambiguation method are applied. The WordNet domains that correspond to the correct senses of keywords are extracted, and the video is assigned a category label based on the extracted domains. The second method has the same steps for extracting the WordNet domains of a video but performs categorization by using a learning module. Experiments with documentary videos give promising results in discovering the correct categories of videos. Video summarization algorithms present condensed versions of a full-length video by identifying its most significant parts. We propose a video summarization method using the subtitles of videos and text summarization techniques. We identify significant sentences in the subtitles of a video by using text summarization techniques and then we compose a video summary by finding the video parts corresponding to these summary sentences.
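
The summarization step can be sketched as scoring subtitle sentences and returning the time spans of the top-scoring ones. The term-frequency scoring is an assumption; it simplifies the text summarization techniques the thesis applies.

    # Sketch: pick video segments whose subtitles score highest.
    from collections import Counter

    def summarize(subtitles, k=5):
        """subtitles: list of (start_sec, end_sec, sentence) tuples."""
        tf = Counter(w for _, _, s in subtitles for w in s.lower().split())
        def score(sentence):
            words = sentence.lower().split()
            return sum(tf[w] for w in words) / max(len(words), 1)
        ranked = sorted(subtitles, key=lambda t: score(t[2]), reverse=True)
        return sorted(ranked[:k])  # selected clips in temporal order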
16

Hay, Richard. "Views and perceptions of the use of text and video in English teaching." Thesis, Högskolan i Gävle, Avdelningen för humaniora, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:hig:diva-25400.

Abstract:
This essay investigates how students studying at upper secondary level perceive the use of text and video as teaching aids in their English studies. Students from both vocational and preparatory programmes completed an online survey, and seven of them subsequently took part in interviews. Both the survey and the interviews show that a large majority of students would like much more video-based teaching material as part of their English courses; some would even prefer all the course material to be video based. The results also show that even though the students want more video, opinion is divided on how much, and in what way, video is best used or incorporated into English teaching. Many of the students that asked for more video said that they found it difficult to read and understand longer texts; furthermore, they found texts boring. They pointed out that video was more interesting and motivating. Video was generally seen as the preferred choice when it came to authentic language, help with pronunciation and access to the culture of different English-speaking countries. Text, on the other hand, was seen to provide much richer and more detailed information, which was especially helpful when it came to spelling and grammar. It was also clear that the preference for video was greater among the students from the vocational classes. There was also general agreement that, although video is used as a teaching aid, teachers more usually use it as a time filler or reward. Finally, even if learning English continues to be based on text and course books, there is a broad consensus among the students that more video should be used, as it is seen as a valuable and effective complement to traditional text-based material.
17

Schwarz, Katharina [Verfasser], and Hendrik P. A. [Akademischer Betreuer] Lensch. "Text–to–Video : Image Semantics and NLP / Katharina Schwarz ; Betreuer: Hendrik P. A. Lensch." Tübingen : Universitätsbibliothek Tübingen, 2019. http://d-nb.info/1182985963/34.

18

Yousfi, Sonia. "Embedded Arabic text detection and recognition in videos." Thesis, Lyon, 2016. http://www.theses.fr/2016LYSEI069/document.

Abstract:
This thesis focuses on embedded Arabic text detection and recognition in videos. Different approaches robust to Arabic text variability (fonts, scales, sizes, etc.) as well as to environmental and acquisition condition challenges (contrast, degradation, complex background, etc.) are proposed. We introduce different machine learning-based solutions for robust text detection without relying on any pre-processing. The first method is based on Convolutional Neural Networks (ConvNets), while the others use a specific boosting cascade to select relevant hand-crafted text features. For text recognition, our methodology is segmentation-free: text images are transformed into sequences of features using a multi-scale scanning scheme. Standing out from the dominant methodology of hand-crafted features, we propose to learn relevant text representations from data using different deep learning methods, namely deep auto-encoders, ConvNets and an unsupervised learning model. Each one leads to a specific OCR (Optical Character Recognition) solution. Sequence labeling is performed without any prior segmentation using a recurrent connectionist learning model. The proposed solutions are compared to other methods based on non-connectionist and hand-crafted features. In addition, we propose to enhance the recognition results using Recurrent Neural Network-based language models that are able to capture long-range linguistic dependencies. Both OCR and language model probabilities are incorporated in a joint decoding scheme where additional hyper-parameters are introduced to boost recognition results and reduce the response time. Given the lack of public multimedia Arabic datasets, we propose novel annotated datasets built from Arabic TV streams. The OCR dataset, called ALIF, comprises 6,532 annotated text images and is publicly available for research purposes; to the best of our knowledge, it is the first public dataset dedicated to Arabic video OCR. Our proposed solutions were extensively evaluated. The obtained results highlight the genericity and efficiency of our approaches, reaching a text detection rate of more than 97% and a word recognition rate of 88.63% on the ALIF dataset, and outperforming a well-known commercial OCR engine by more than 36 points.
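
The joint decoding can be sketched as rescoring each OCR hypothesis with the language model; alpha and beta stand for the kind of hyper-parameters the thesis tunes (the values here are placeholders).

    # Sketch: combine OCR and language-model scores per hypothesis.
    def joint_score(log_p_ocr, log_p_lm, n_words, alpha=0.7, beta=0.1):
        return log_p_ocr + alpha * log_p_lm + beta * n_words

    def best_hypothesis(hypotheses):
        """hypotheses: list of (text, log_p_ocr, log_p_lm, n_words)."""
        return max(hypotheses, key=lambda h: joint_score(*h[1:]))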
19

Uggerud, Nils. "AnnotEasy: A gesture and speech-to-text based video annotation tool for note taking in pre-recorded lectures in higher education." Thesis, Linnéuniversitetet, Institutionen för datavetenskap och medieteknik (DM), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:lnu:diva-105962.

Abstract:
This paper investigates students' attitudes towards using gestures and speech-to-text (GaST) to take notes while watching recorded lectures. A literature review on video-based learning, an expert interview, and a background survey of students' note-taking habits led to the creation of the prototype AnnotEasy, a tool that allows students to use GaST to take notes. AnnotEasy was tested in three iterations with 18 students and was updated after each iteration. The students watched a five-minute lecture and took notes using AnnotEasy. The participants' perceived ease of use (PEU) and perceived usefulness (PU) were evaluated based on the TAM, and their general attitudes were evaluated in semi-structured interviews. The results showed that the students had a high PEU and PU of AnnotEasy and were mainly positive towards taking notes by using GaST. Further, the results suggest that AnnotEasy could facilitate the process of structuring a lecture's content. Lastly, even though students had positive attitudes towards using speech to create notes, observations showed that this was problematic when users attempted to create longer notes. This indicates that speech could be more beneficial for taking shorter notes.
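
The speech-to-note step can be sketched with the open-source SpeechRecognition package; this is an assumption, since the abstract does not name the speech engine used. A note here is simply a video timestamp plus the recognized text.

    # Sketch: turn a dictated snippet into a timestamped lecture note.
    import speech_recognition as sr

    def dictate_note(audio_file, video_time_sec):
        recognizer = sr.Recognizer()
        with sr.AudioFile(audio_file) as source:
            audio = recognizer.record(source)
        return {"time": video_time_sec,
                "text": recognizer.recognize_google(audio)}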
20

Stokes, Charlotte Ellenor. "Investigating the Efficacy of Video versus Text Instruction for the Recall of Food Safety Information." Digital Archive @ GSU, 2009. http://digitalarchive.gsu.edu/nutrition_theses/28.

Abstract:
Purpose: Teaching consumers proper home food safety practices is an important strategy to combat foodborne illness. Food safety educators with limited resources must do a cost-versus-benefit analysis before choosing the optimum medium to reach their target audiences. The objectives of this research were to determine whether presenting food safety information in a video format was more effective than text-only in terms of audience recall of the information one week later; to determine whether an intervention in text or video form increased students’ knowledge of food safety information as compared to no intervention at all; and to identify certain demographic factors that could have influenced performance on a food safety quiz. Methods: One hundred thirty-three Georgia State University undergraduate students were assigned to one of three groups. One group viewed a food safety video (n=59), a second group received the same information in text-only form (n=41), and the third group (n=33) served as the control and received no intervention. Students filled out a demographic questionnaire and completed a pre-intervention and post-intervention food safety knowledge test. Average scores were calculated, and the data were analyzed using SPSS 16.0 for Windows. Results: There was no significant difference between pre-intervention test scores among the three groups (p=.057). The video group scored significantly higher on the post-intervention test (p=.006) than the text group and the control group (p<.001). The video group (p<.001) and text group (p<.001) both scored significantly higher on the post-intervention quiz than the pre-intervention quiz, but the control group did not (p=.466). Video was superior to text overall and in conveying basic food safety principles; however, students in the text group demonstrated a better recall of more detailed food safety information such as proper internal cooking temperatures for poultry and ground beef. Previous food safety education in the classroom or online was found to be the only significant predictor of better performance on the pre-intervention test (p=.004). Conclusion: Video is more effective than text when used to deliver simple, direct food safety messages. More detailed information, such as proper internal cooking temperatures, might be best delivered in text form. Consumers are likely to benefit most from a multimedia approach to food safety education that includes videos, accompanying brochures, and Web site content.
21

Tran, Anh Xuan. "Identifying latent attributes from video scenes using knowledge acquired from large collections of text documents." Thesis, The University of Arizona, 2014. http://pqdtopen.proquest.com/#viewpdf?dispub=3634275.

Abstract:

Peter Drucker, a well-known influential writer and philosopher in the field of management theory and practice, once claimed that “the most important thing in communication is hearing what isn't said.” It is not difficult to see that a similar concept also holds in the context of video scene understanding. In almost every non-trivial video scene, most important elements, such as the motives and intentions of the actors, can never be seen or directly observed, yet the identification of these latent attributes is crucial to our full understanding of the scene. That is to say, latent attributes matter.

In this work, we explore the task of identifying latent attributes in video scenes, focusing on the mental states of participant actors. We propose a novel approach to the problem based on the use of large text collections as background knowledge and minimal information about the videos, such as activity and actor types, as query context. We formalize the task and a measure of merit that accounts for the semantic relatedness of mental state terms, as well as their distribution weights. We develop and test several largely unsupervised information extraction models that identify the mental state labels of human participants in video scenes given some contextual information about the scenes. We show that these models produce complementary information and their combination significantly outperforms the individual models, and improves performance over several baseline methods on two different datasets. We present an extensive analysis of our models and close with a discussion of our findings, along with a roadmap for future research.
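
One of the largely unsupervised models described above can be approximated by a co-occurrence count: score each mental-state term by how often it appears near the scene's context terms (activity, actor type) in the background corpus. The scoring below is a deliberate simplification for illustration, not the dissertation's actual model.

    # Sketch: rank mental-state labels by corpus co-occurrence with context.
    from collections import Counter

    def mental_state_scores(corpus_sentences, context_terms, lexicon):
        scores = Counter()
        context = set(context_terms)
        lexicon = set(lexicon)
        for sentence in corpus_sentences:
            words = set(sentence.lower().split())
            if context & words:
                for term in lexicon & words:
                    scores[term] += 1
        return scores.most_common()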

22

Ulvbäck, Gustav, and Rickard Eriksson Wingårdh. "Förmedla information med animerad text : Blir textbaserad information på sociala medier mer intressant om det sker i rörlig bild med animerad text?" Thesis, Södertörns högskola, Institutionen för naturvetenskap, miljö och teknik, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:sh:diva-34509.

Abstract:
The purpose of this study is to investigate whether text-based information on social media becomes more interesting when presented as moving images with animated text. The study focuses on a target group of young Facebook users, examining the users and their digital consumption on the social platform. Empirical data were collected through surveys distributed digitally to the selected target group. In-depth, qualitative follow-up interviews with randomly selected respondents were conducted to secure a qualitative dimension in the study. The results show an even split in preferred form of news reporting between moving images and still images with accompanying text, with distinct strengths emerging for each alternative. The survey shows that the different options for conveying information are clearly linked to different motivations: accessibility and interest are associated with moving images, while informativeness and clarity are associated with still images with accompanying text. The study also shows that this is an area requiring further research.
23

Wells, Emily Jean. "The effects of luminance contrast, raster modulation, and ambient illumination on text readability and subjective image quality." Thesis, Virginia Tech, 1994. http://scholar.lib.vt.edu/theses/available/etd-07102009-040235/.

24

Jaroňová, Eva. "Od ideálu k utopii (zítřek, co už byl)." Master's thesis, Vysoké učení technické v Brně. Fakulta výtvarných umění, 2012. http://www.nusl.cz/ntk/nusl-232359.

Abstract:
My work consists of five videos. The installation presents them on five separate TV screens, and each video lasts at most 3 minutes. The videos are based on text. The text exists as an installation in space and continues in the form of a performance. The content of the text is simple wisdom twisted into nonsense. The realisation contains an element of ephemerality: the earlier work is destroyed by nature or by human intervention.
25

Macindoe, Annie C. "Melancholy and the memorial: Representing loss, grief and affect in contemporary visual art." Thesis, Queensland University of Technology, 2018. https://eprints.qut.edu.au/119695/1/Annie_Macindoe_Thesis.pdf.

Abstract:
Melancholy and the Memorial: Representing Loss, Grief and Affect in Contemporary Visual Art is a practice-led project that explores how contemporary art can respond to the limitations of traditional forms of language in the representation of trauma, loss and grief. The project reflects on the work of theorists and artists who also explore the ineffability of these memories and experiences. The creative outcomes have investigated how text, moving image, sound and space can be combined to reframe the dialogue around public and private expressions of trauma and open up discussion of the potential for shared, affectual experiences through art.
26

Ryrå, Landgren Isabella. "Samspel i det berättartekniska : text, bild och effekter i musikvideor." Thesis, Högskolan Väst, Avd för medier och design, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:hv:diva-8965.

Abstract:
Music videos have over the last 50 years become a form of entertainment in our society. Some are created to reflect feelings while others are a kind of showcase for the artist. There are also those based on the lyrics, which thus create a short film or an illustration of the lyrics. Through the use of technologies such as visual effects it is possible to bring impossible worlds and stories to life. Videos with these effects are the kind of videos I have analyzed in this essay, with the purpose of exploring how much the visual effects affect the narration. To achieve this I have chosen to make a semiotic study focused on analysis and interpretation of five chosen music videos created during or after the year 2000. CGI, slow motion and metaphors are the techniques I have looked at, and they have all proved to contribute to how the story of the video is told and how it is understood. The interplay between image and text is another thing I have studied; in the chosen videos it varies between interpretation and literal translation of one into the other.
27

Saracoglu, Ahmet. "Localization And Recognition Of Text In Digital Media." Master's thesis, METU, 2007. http://etd.lib.metu.edu.tr/upload/2/12609028/index.pdf.

Abstract:
Textual information within digital media can be used in many areas, such as indexing and structuring of media databases, aiding the visually impaired, translation of foreign signs and many more. Text in digital media can be separated into two main categories: overlay text and scene text. In this thesis, localization and recognition of video text in digital media, regardless of its category, is investigated. As a necessary first step, the framework of a complete system is discussed. Next, a comparative analysis of feature vector and classification method pairs is presented. Furthermore, the multi-part nature of text is exploited by proposing a novel Markov Random Field approach for the classification of text/non-text regions. Additionally, better localization of text is achieved by introducing a bounding-box extraction method. For the recognition of text regions, a handprint-based Optical Character Recognition system is thoroughly investigated. During the investigation of text recognition, a multi-hypothesis approach for the segmentation of background is proposed by incorporating k-Means clustering. Furthermore, a novel dictionary-based ranking mechanism is proposed for spelling correction of recognition results, and the overall system is simulated on a challenging data set. Also, a thorough survey on scene-text localization and recognition is presented, in which challenges are identified and discussed alongside related work. Scene-text localization simulations on a public competition data set are also provided. Lastly, in order to improve the recognition performance of scene text on signs affected by perspective projection distortion, a rectification method is proposed and simulated.
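
The k-Means multi-hypothesis segmentation can be sketched directly: cluster the pixel colours of a localized text box and hand each cluster-as-foreground choice to the OCR stage as one hypothesis. k=3 is an illustrative choice, not the thesis's setting.

    # Sketch: k binary segmentation hypotheses from colour clustering.
    import numpy as np
    from sklearn.cluster import KMeans

    def segmentation_hypotheses(text_box_rgb, k=3):
        h, w, _ = text_box_rgb.shape
        labels = KMeans(n_clusters=k, n_init=10).fit_predict(
            text_box_rgb.reshape(-1, 3))
        labels = labels.reshape(h, w)
        return [(labels == i).astype(np.uint8) * 255 for i in range(k)]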
28

Бикова, О. Д. "Відеовербальний текст німецькомовного вербального дискурсу." Thesis, Сумський державний університет, 2013. http://essuir.sumdu.edu.ua/handle/123456789/30524.

Abstract:
Having investigated the question of the video-verbal text of German-language advertising discourse, we can conclude that advertising is a multifaceted concept and that no single definition of this phenomenon exists.
29

Tapaswi, Makarand Murari [Verfasser], and R. [Akademischer Betreuer] Stiefelhagen. "Story Understanding through Semantic Analysis and Automatic Alignment of Text and Video / Makarand Murari Tapaswi. Betreuer: R. Stiefelhagen." Karlsruhe : KIT-Bibliothek, 2016. http://d-nb.info/1108450725/34.

30

Sartini, Emily C. "Effects of Explicit Instruction and Self-Directed Video Prompting on Text Comprehension of Students with Autism Spectrum Disorder." UKnowledge, 2016. http://uknowledge.uky.edu/edsrc_etds/24.

Abstract:
The purpose of this study was to investigate the effects of explicit instruction combined with video prompting to teach text comprehension skills to students with autism spectrum disorder. Participants included 4 elementary school students with autism. A multiple probe across participants design was used to evaluate the intervention’s effectiveness. Results indicated that the intervention was successful for all participants. All participants mastered the comprehension skills; however, data were highly variable during the acquisition phase. Implications for researchers and practitioners are discussed.
31

Hansen, Simon. "TEXTILE - Augmenting Text in Virtual Space." Thesis, Malmö högskola, Fakulteten för kultur och samhälle (KS), 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:mau:diva-23172.

Abstract:
Three-dimensional literature is a virtually non-existent, or in any case very rare and emergent, digital art form, defined by the author as a unit of text which is not confined to the two-dimensional layout of print literature but is instead mediated across all three axes of a virtual space. In collaboration with two artists, the author explores through a bodystorming workshop how writers and readers could create and experience three-dimensional literature in mixed reality, using mobile devices equipped with motion sensors, which enable users to perform embodied interactions as an integral part of the literary experience. For documenting the workshop, the author used body-mounted action cameras in order to record the point of view of the participants. This choice turned out to generate promising knowledge on using point-of-view footage as an integral part of the methodological approach: the author has found that by engaging creatively with such footage, the designer gains a profound understanding and vivid memory of complex design activities. As the outcome of the various design activities, the author developed a concept for an app called TEXTILE. It enables users to build three-dimensional texts by positioning words in a virtual bubble of space around the user and to share them, either on an online platform or at site-specific places. A key finding of this thesis is that the creation of three-dimensional literature on a platform such as TEXTILE is not just an act of writing: it is an act of sculpture and an act of social performance.
32

Escobar, Mayte. "The Body As Border: El Cuerpo Como Frontera." CSUSB ScholarWorks, 2015. https://scholarworks.lib.csusb.edu/etd/247.

Abstract:
Being a first-generation-born Mexican American, I am looking into the blend of the two cultures and the disparity between them. The border is the core of my investigation; by traveling across the border I have become conscious of the differences between the two sides and the duality within myself. My identity has developed from a synthesis of these two cultures, and my work explores these two factions that cannot be one without the other. This fusion is apparent in my self-portraits, where I dress up in the colors from both sides of the border. I also take a personal look into understanding the history and identity of each nation. I create a juxtaposition of these two identities that become one and explore the social, cultural, and political issues we face every day. I recreate my "investigation" by trying to dig deeper, exposing the layers, and facing my own identity crisis in the process.
33

Antonér, Jakob. "Hur svårt ska det vara med lite text? : En självobservationsstudie av textinlärning i sång." Thesis, Karlstads universitet, Institutionen för konstnärliga studier, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:kau:diva-46386.

Abstract:
The aim of this project is to identify different types of resources for learning. This independent project describes three strategies for learning songs, with a focus on memorizing lyrics; a new song was used for each new strategy. The study is based on self-observation in the form of a logbook and video recordings made over three weeks in the autumn of 2015, comprising 15 logbook entries and six video recordings. The work takes a multimodal, design-theoretical perspective. The results show how I use different resources to learn three different song lyrics, each studied with a different learning strategy. Finally, the results are discussed in relation to the design-theoretical perspective and previous research.
34

Hung, Yu-Wan. "The use of communication strategies by learners of English and learners of Chinese in text-based and video-based synchronous computer-mediated communication (SCMC)." Thesis, Durham University, 2012. http://etheses.dur.ac.uk/4426/.

Abstract:
The use of communication strategies (CSs) has been of interest in research on second language acquisition (SLA), since CS use can help learners to attain mutual comprehension effectively and develops understanding of interaction in SLA research. This study investigates and clarifies a wide range of CSs that learners of English and learners of Chinese use to solve language problems as well as to facilitate problem-free discourse in both text-based and video-based SCMC environments. Seven Chinese-speaking learners of English and seven English-speaking learners of Chinese were paired up as tandem (reciprocal) learning dyads in this study. Each dyad participated in four interactions, namely text-based SCMC in English, text-based SCMC in Chinese, video-based SCMC in English and video-based SCMC in Chinese. The interaction data were analysed along with an after-task questionnaire and stimulated reflection to explore systematically and comprehensively the differences between text-based and video-based SCMC and the differences between learners of English and learners of Chinese. The results showed that learners used CSs differently in text-based and video-based SCMC compared with their own performance in the other mode, indicating different learning opportunities provided by the two modes of SCMC. Although the difference in language was less salient than the medium effect, learners of English and learners of Chinese tended to have their own preferences for particular CSs. When these preferences appear to reflect an appropriate communicative style in one particular culture, learners might need to raise their awareness of some strategies during intercultural communication to avoid possible misunderstanding or offence. Some possible advantages of tandem learning interaction were also identified in this study, such as the potential to develop sociocultural and intercultural competence through the opportunity to practise culturally appropriate language use with native speakers in a social context.
APA, Harvard, Vancouver, ISO, and other styles
35

Hermanová, Petra. "Falešná vzpomínka." Master's thesis, Vysoké učení technické v Brně. Fakulta výtvarných umění, 2012. http://www.nusl.cz/ntk/nusl-232362.

Full text
APA, Harvard, Vancouver, ISO, and other styles
36

Diaz, Leanna Marie. "Usage of Emotes and Emoticons in a Massively Multiplayer Online Role-Playing Game." University of Toledo / OhioLINK, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=toledo1533228651012048.

Full text
APA, Harvard, Vancouver, ISO, and other styles
37

D'Angelo, John J. "A Study of the Relationship Between the Use of Color for Text in Computer Screen Design and the Age of the Computer User." Thesis, University of North Texas, 1991. https://digital.library.unt.edu/ark:/67531/metadc663711/.

Full text
Abstract:
This study addresses individual performance, relating it to age-related changes in eyesight and to color computer screens used for computer-based instruction that were not designed specifically for older students. It determines how existing research in gerontology, human-computer interface, and color use in visual graphics can be applied to the design of computer screen displays containing colored text and backgrounds, and how various color combinations affect the performance of adult learners forty years of age and older. The results provide software developers and instructional designers with guidelines for designing computer screen displays for instructional computing settings involving older adults.
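A present-day way to make such color-combination guidelines concrete is the WCAG contrast-ratio formula; the sketch below is purely illustrative (the 1991 study predates WCAG, and the 4.5:1 threshold shown is today's guideline, not the dissertation's):

# Python sketch: WCAG 2.x contrast ratio between text and background colors.
def relative_luminance(rgb):
    """Relative luminance of an sRGB color given as 0-255 integers."""
    def linearize(c):
        c /= 255.0
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
    r, g, b = (linearize(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(fg, bg):
    """Ratio from 1:1 to 21:1; WCAG AA asks for at least 4.5:1 for body text."""
    lighter, darker = sorted((relative_luminance(fg), relative_luminance(bg)), reverse=True)
    return (lighter + 0.05) / (darker + 0.05)

print(round(contrast_ratio((0, 0, 255), (255, 255, 255)), 1))  # blue on white: 8.6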
APA, Harvard, Vancouver, ISO, and other styles
38

Strömberg, Per. "Kan jag öva utan att sjunga? : en självobservationsstudie av instudering i sång." Thesis, Karlstads universitet, Institutionen för konstnärliga studier (from 2013), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kau:diva-71363.

Full text
Abstract:
The purpose of this self-observation study is to explore approaches to learning songs with a focus on the lyrics. The work mainly describes three methods -- listening, the voice, and writing -- both on their own and in combination. The study draws on a sociocultural perspective and other relevant research on song learning. The material consists of video recordings of practice sessions and a logbook entry written after each session: over two weeks in the autumn of 2017 I made 14 logbook entries and two video recordings, working actively for 20 minutes per session. The results show how I used the three methods -- my voice, writing, and listening -- when learning two different songs. Finally, the results are discussed in relation to the background chapter.
APA, Harvard, Vancouver, ISO, and other styles
39

Ferguson, Ralph. "Multimodal Literacy as a form of Communication : What is the state of the students at Dalarna University multimodal literacy?" Thesis, Högskolan Dalarna, Ljud- och musikproduktion, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:du-16835.

Full text
Abstract:
Literacy is an invaluable asset that has enabled communication, documentation and the spreading of ideas since the beginning of written language. With technological advancements and new possibilities to communicate, it is important to question the degree to which people's abilities to utilise these new methods have developed in relation to the emerging technologies. The purpose of this bachelor's thesis is to analyse the state of the multimodal literacy of students at Dalarna University, as well as their experience of multimodality in their education. This leads to the two main research questions: What is the state of the multimodal literacy of the students at Dalarna University? And: How have the students at Dalarna University experienced multimodality in education? The paper is based on a mixed-method study incorporating both quantitative and qualitative aspects. The main thrust of the research is, however, a quantitative survey that was conducted online and emailed to students via their programme coordinators. The scope of the research is audio-visual modes, i.e. audio, video and images, while textual literacy is presumed and serves as an inspiration for the study. The study revealed that the students at Dalarna University are most skilled in image editing, while not being very literate in audio or video editing. The students appear to have had mediocre experience of creating meaning through multimodality, both in private use and in their respective educational institutions. The study also reveals that students prefer learning by means of video (rather than text or audio), yet are not able to create meaning (communicate) through it.
APA, Harvard, Vancouver, ISO, and other styles
40

Nguyen, Chu Duc. "Localization and quality enhancement for automatic recognition of vehicle license plates in video sequences." Thesis, Ecully, Ecole centrale de Lyon, 2011. http://www.theses.fr/2011ECDL0018.

Full text
Abstract:
Automatic reading of vehicle license plates is considered a mass-surveillance approach: through detection/localization and optical character recognition, it identifies a vehicle in images or image sequences. Many applications, such as traffic monitoring, detection of stolen vehicles, electronic tolling, and management of parking entrances and exits, use this method. Yet despite significant progress since the first prototypes appeared in 1979, with sometimes impressive recognition rates thanks to advances in research and sensor technology, the constraints imposed for such systems to work well limit their scope. Indeed, optimal use of license plate localization and recognition techniques in operational scenarios requires controlled lighting conditions and restrictions on pose, speed, or simply plate type. Automatic license plate reading therefore remains an open research problem. The major contribution of this thesis is threefold. First, a new robust approach to license plate localization in images or image sequences is proposed. Second, the quality of the localized plates is improved by adapting a super-resolution technique. Finally, a unified localization and super-resolution model is proposed that reduces the time complexity of the two combined approaches.
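To make the localization step concrete, here is a minimal sketch of a common baseline approach (vertical-edge density plus contour filtering, using OpenCV 4); this is a generic illustration assumed for the example, not the robust method the thesis proposes:

# Python sketch: a generic license-plate localization baseline, not the thesis's method.
import cv2

def candidate_plates(image_bgr):
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    # Plate regions are dense in vertical edges (character strokes).
    edges = cv2.Sobel(gray, cv2.CV_8U, 1, 0, ksize=3)
    _, binary = cv2.threshold(edges, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    # Close the gaps between characters so each plate becomes one blob.
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (17, 3))
    closed = cv2.morphologyEx(binary, cv2.MORPH_CLOSE, kernel)
    contours, _ = cv2.findContours(closed, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    boxes = []
    for c in contours:
        x, y, w, h = cv2.boundingRect(c)
        # Keep plate-like boxes: wide aspect ratio, non-trivial area.
        if h > 0 and 2.0 < w / h < 6.0 and w * h > 1000:
            boxes.append((x, y, w, h))
    return boxes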
APA, Harvard, Vancouver, ISO, and other styles
41

Janssen, Michael. "Balansering av ett rundbaserat strategispelsystem : En reflekterande text rörande arbetet att skapa ett verktyg för balansering av precautionstridsystemet i spelet Dreamlords – The Reawakening." Thesis, University of Skövde, School of Humanities and Informatics, 2008. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-1114.

Full text
Abstract:

The following work is a reflective text about how I created a tool for balancing the turn-based precaution combat system of the game Dreamlords – The Reawakening. It begins by explaining the purpose and goal of the work, followed by a research question connected to them. I also describe the various theories on game balancing that I looked into. To introduce the reader to Dreamlords, I explain the general gameplay principles of the game as a whole. I then describe my work process, that is, how I structured the work. The text also explains how the precaution combat system works, including all the mathematical calculations, and then presents my tool as the practical result of the work. The report ends with a discussion that summarises the work and addresses the problems that arose along the way. Taken as a whole, the text describes how one can go about balancing a game system such as Dreamlords' precaution combat system. The results of the work are the balancing tool itself and this reflective text, which will hopefully inspire the reader and others interested in balancing computer games.
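To give a flavour of what such a balancing tool computes, the toy sketch below runs a Monte Carlo duel between two hypothetical unit configurations and reports a win rate; the stats and damage rules are invented for illustration and are not the actual formulas of Dreamlords' precaution combat system:

# Python sketch: a Monte Carlo balance check between two invented unit configs.
import random

def duel(a_hp, a_dmg, b_hp, b_dmg):
    """Alternating attacks with +/-20% damage variance; True if A wins."""
    while True:
        b_hp -= a_dmg * random.uniform(0.8, 1.2)
        if b_hp <= 0:
            return True
        a_hp -= b_dmg * random.uniform(0.8, 1.2)
        if a_hp <= 0:
            return False

trials = 10_000
wins = sum(duel(100, 12, 120, 10) for _ in range(trials))
print(f"unit A win rate: {wins / trials:.1%}")  # near 50% would suggest balance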

APA, Harvard, Vancouver, ISO, and other styles
42

Templeton, Joey. "RECREATIONAL TECHNOLOGY AND ITS IMPACT ON THE LEARNING DEVELOPMENT OF CHILDREN AGES 4-8: A META-ANALYSIS FOR THE 21ST CENTURY CL." Doctoral diss., University of Central Florida, 2007. http://digital.library.ucf.edu/cdm/ref/collection/ETD/id/4297.

Full text
Abstract:
This research focuses on technology (specifically video games and interactive software games) and its effects on the cognitive development of children ages 4-8. The research is conducted as a meta-analysis combining research and theory in order to determine whether the educational approach to this age group needs to change and adapt to learners who have been affected by this technology. I focus on both the physical and mental aspects of their development and present a comprehensive review of current educational theory and practice. By examining current curriculum goals and cross-referencing them with research conducted in fields other than education (i.e. technology, child development, media literacy, etc.), I hope to demonstrate a need for change and, at the end of my research, to be able to make recommendations for curriculum adaptations that will work within the current educational structure. These recommendations will be made with respect to budget and time constraints.
Ph.D.
Department of English
Arts and Humanities
Texts and Technology PhD
APA, Harvard, Vancouver, ISO, and other styles
43

Ma, Zhenyu. "Semi-synchronous video for Deaf Telephony with an adapted synchronous codec." Thesis, University of the Western Cape, 2009. http://etd.uwc.ac.za/index.php?module=etd&action=viewtitle&id=gen8Srv25Nme4_2950_1370593938.

Full text
Abstract:

Communication tools such as text-based instant messaging, voice and video relay services, real-time video chat and mobile SMS and MMS have successfully been used among Deaf people. Several years of field research with a local Deaf community revealed that disadvantaged South African Deaf people preferred to communicate with both Deaf and hearing peers in South African Sign Language as opposed to text. Synchronous video chat and video relay services provided such opportunities. Both types of services are commonly available in developed regions, but not in developing countries like South Africa. This thesis reports on a workaround approach to design and develop an asynchronous video communication tool that adapted synchronous video codecs to store-and-forward video delivery. This novel asynchronous video tool provided high quality South African Sign Language video chat at the expense of some additional latency. Synchronous video codec adaptation consisted of comparing codecs, and choosing one to optimise in order to minimise latency and preserve video quality. Traditional quality of service metrics only addressed real-time video quality and related services. There was no such standard for asynchronous video communication. Therefore, we also enhanced traditional objective video quality metrics with subjective assessment metrics conducted with the local Deaf community.
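A typical traditional objective metric of the kind this work complements with subjective Deaf-community assessment is PSNR; a minimal sketch, assuming decoded frames are available as 8-bit NumPy arrays (illustrative only, not the thesis's evaluation code):

# Python sketch: PSNR between a reference frame and its encoded/decoded version.
import numpy as np

def psnr(reference, degraded, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two same-sized uint8 frames."""
    mse = np.mean((reference.astype(np.float64) - degraded.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical frames
    return 10.0 * np.log10((max_value ** 2) / mse)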

APA, Harvard, Vancouver, ISO, and other styles
44

Bull, Hannah. "Learning sign language from subtitles." Electronic Thesis or Diss., université Paris-Saclay, 2023. http://www.theses.fr/2023UPASG013.

Full text
Abstract:
Sign languages are an essential means of communication for deaf communities. They are visuo-gestural languages that use hand gestures, facial expressions, gaze and body movements as their modalities. They possess rich grammatical structures and lexicons that differ considerably from those found in spoken languages. The uniqueness of the transmission medium, structure and grammar of sign languages requires distinct methodologies. The performance of automatic translation systems between high-resource written or spoken languages is currently sufficient for many daily use cases, such as translating videos, websites, emails and documents. By contrast, automatic translation systems for sign languages do not exist outside of very specific use cases with limited vocabulary. Automatic sign language translation is challenging for two main reasons. First, sign languages are low-resource languages with little available training data. Second, sign languages are visual-spatial languages with no written form, naturally represented as video rather than audio or text. To tackle the first challenge, we contribute large datasets for training and evaluating automatic sign language translation systems, with both interpreted and original sign language video content as well as written text subtitles. Whilst interpreted data allows us to collect large numbers of hours of video, original sign language video is more representative of sign language usage within deaf communities. Written subtitles can be used as weak supervision for various sign language understanding tasks. To address the second challenge, we develop methods to better understand visual cues from sign language video. Whilst sentence segmentation is mostly trivial for written languages, segmenting sign language video into sentence-like units relies on detecting subtle semantic and prosodic cues. We use prosodic cues to learn to automatically segment sign language video into sentence-like units, determined by subtitle boundaries. Expanding upon this segmentation method, we then learn to align text subtitles to sign language video segments using both semantic and prosodic cues, in order to create sentence-level pairs between sign language video and text. This task is particularly important for interpreted TV data, where subtitles are generally aligned to the audio and not to the signing. Using these automatically aligned video-text pairs, we develop and improve multiple different methods to densely annotate lexical signs by querying words in the subtitle text and searching for visual cues in the sign language video for the corresponding signs.
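As a toy illustration of the segmentation step, the sketch below turns per-frame boundary scores (such as might come from a learned prosodic-cue model; the scores, threshold and minimum length here are invented) into sentence-like segments:

# Python sketch: from per-frame boundary probabilities to sentence-like segments.
# The boundary model itself is assumed; this only shows the post-processing idea.
import numpy as np

def segment_video(boundary_probs, threshold=0.5, min_len=25):
    """Split frame indices into segments wherever the score peaks.

    boundary_probs: 1-D array with one boundary score per frame.
    min_len: minimum segment length in frames (e.g. 1 second at 25 fps).
    """
    segments, start = [], 0
    for t in range(1, len(boundary_probs)):
        if boundary_probs[t] >= threshold and t - start >= min_len:
            segments.append((start, t))
            start = t
    segments.append((start, len(boundary_probs)))
    return segments

probs = np.zeros(300)
probs[100] = probs[210] = 0.9
print(segment_video(probs))  # [(0, 100), (100, 210), (210, 300)]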
APA, Harvard, Vancouver, ISO, and other styles
45

Dias, Laura Lima. "Análise de abordagens automáticas de anotação semântica para textos ruidosos e seus impactos na similaridade entre vídeos." Universidade Federal de Juiz de Fora (UFJF), 2017. https://repositorio.ufjf.br/jspui/handle/ufjf/6473.

Full text
Abstract:
CAPES - Coordenação de Aperfeiçoamento de Pessoal de Nível Superior
With the accumulation of digital information stored over time, some effort is needed to facilitate the search and indexing of content. Resources such as video and audio, in turn, are harder for search engines to handle. Video annotation is an important aid for video summarisation, search and classification. The share of videos whose annotations were provided by the author is usually very small and not very meaningful, and annotating videos manually is very laborious for legacy collections. For this reason, automating this process has long been a goal in the field of Information Retrieval. In video lecture repositories, where most of the information is concentrated in the teacher's speech, this process can be performed through automatic annotation of transcripts generated by Automatic Speech Recognition systems. However, this technique produces noisy texts, making automatic semantic annotation difficult. Among the many Natural Language Processing techniques used for annotation, choosing the most appropriate technique for a given scenario is not trivial, especially when annotating noisy texts. This research analyses a set of different techniques used for automatic annotation and verifies their impact in a single scenario: the similarity between videos.
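One simple baseline for the final scenario is to vectorise the (noisy) ASR transcripts with TF-IDF and compare videos by cosine similarity; a minimal sketch using scikit-learn with made-up transcripts (a generic baseline for illustration, not the specific annotation techniques analysed in the dissertation):

# Python sketch: transcript-based video similarity via TF-IDF + cosine similarity.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

transcripts = [
    "today we derive the gradient of the loss function",           # ASR output, video A
    "in this lecture we compute gradients for the cost function",  # ASR output, video B
]
tfidf = TfidfVectorizer().fit_transform(transcripts)
print(cosine_similarity(tfidf[0], tfidf[1]))  # similarity score in [0, 1]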
APA, Harvard, Vancouver, ISO, and other styles
46

Mengoli, Chiara. "Plagiarism." Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2020. http://amslaurea.unibo.it/20928/.

Full text
Abstract:
This document provides an overview of the current state of plagiarism. The survey presents a taxonomy of plagiarism forms, with a discussion of each, and reviews the various tools and systems available for detecting plagiarism. Easy access to the Web, its increasing use, and the development of new technologies have created a need to adapt plagiarism laws, implement new anti-plagiarism and detection mechanisms, and raise awareness of the problem of plagiarism; a dedicated section is devoted to each of these aspects.
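At the core of many of the detection systems such a survey covers is word n-gram "shingling" compared with set overlap; the toy sketch below illustrates the general technique (it is not taken from any specific tool discussed in the thesis):

# Python sketch: n-gram shingling with Jaccard overlap, the basic idea behind
# many text-plagiarism detectors.
def shingles(text, n=3):
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def jaccard(a, b, n=3):
    sa, sb = shingles(a, n), shingles(b, n)
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

print(jaccard("the quick brown fox jumps over the lazy dog",
              "a quick brown fox jumps over a sleeping dog"))  # ~0.27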
APA, Harvard, Vancouver, ISO, and other styles
47

Філіппова, Є. С. "Мультимедійний текст: структурний, функціональний та перекладацький аспекти." Master's thesis, Сумський державний університет, 2019. http://essuir.sumdu.edu.ua/handle/123456789/75847.

Full text
Abstract:
The relevance of the study lies in the fact that multimedia texts have been explored mainly through the lens of sociological or marketing issues, while their linguistic content has largely been ignored. The study reveals the linguistic potential of multimedia texts using the examples of Internet discourse, advertising texts, and video content on the YouTube platform.
APA, Harvard, Vancouver, ISO, and other styles
48

Murnane, Owen D., and Kristal M. Riska. "The Video Head Impulse Test." Digital Commons @ East Tennessee State University, 2018. https://dc.etsu.edu/etsu-works/1978.

Full text
Abstract:
Book Summary: Dizziness comes in many forms in each age group – some specific to an age group (e.g., benign paroxysmal vertigo of childhood) while others span the age spectrum (e.g., migraine-associated vertigo). This book organizes the evaluation and management of the dizzy patient by age, bringing a fresh perspective to these often difficult patients.
APA, Harvard, Vancouver, ISO, and other styles
49

Murnane, Owen D., Stephanie M. Byrd, C. Kidd, and Faith W. Akin. "The Video Head Impulse Test." Digital Commons @ East Tennessee State University, 2013. https://dc.etsu.edu/etsu-works/1883.

Full text
APA, Harvard, Vancouver, ISO, and other styles
50

Murnane, Owen D., H. Mabrey, A. Pearson, Stephanie M. Byrd, and Faith W. Akin. "The Video Head Impulse Test." Digital Commons @ East Tennessee State University, 2012. https://dc.etsu.edu/etsu-works/1888.

Full text
APA, Harvard, Vancouver, ISO, and other styles