
Dissertations / Theses on the topic 'Visual speech information'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the top 30 dissertations / theses for your research on the topic 'Visual speech information.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate a bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online, whenever these are available in the metadata.

Browse dissertations / theses from a wide variety of disciplines and organise your bibliography correctly.

1

Le Cornu, Thomas. "Reconstruction of intelligible audio speech from visual speech information." Thesis, University of East Anglia, 2016. https://ueaeprints.uea.ac.uk/67012/.

Full text
Abstract:
The aim of the work conducted in this thesis is to reconstruct audio speech signals using information which can be extracted solely from a visual stream of a speaker's face, with application for surveillance scenarios and silent speech interfaces. Visual speech is limited to that which can be seen of the mouth, lips, teeth, and tongue, where the visual articulators convey considerably less information than in the audio domain, leading to the task being difficult. Accordingly, the emphasis is on the reconstruction of intelligible speech, with less regard given to quality. A speech production model is used to reconstruct audio speech, where methods are presented in this work for generating or estimating the necessary parameters for the model. Three approaches are explored for producing spectral-envelope estimates from visual features as this parameter provides the greatest contribution to speech intelligibility. The first approach uses regression to perform the visual-to-audio mapping, and then two further approaches are explored using vector quantisation techniques and classification models, with long-range temporal information incorporated at the feature and model-level. Excitation information, namely fundamental frequency and aperiodicity, is generated using artificial methods and joint-feature clustering approaches. Evaluations are first performed using mean squared error analyses and objective measures of speech intelligibility to refine the various system configurations, and then subjective listening tests are conducted to determine word-level accuracy, giving real intelligibility scores, of reconstructed speech. The best performing visual-to-audio domain mapping approach, using a clustering-and-classification framework with feature-level temporal encoding, is able to achieve audio-only intelligibility scores of 77 %, and audiovisual intelligibility scores of 84 %, on the GRID dataset. Furthermore, the methods are applied to a larger and more continuous dataset, with less favourable results, but with the belief that extensions to the work presented will yield a further increase in intelligibility.
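As a rough illustration of the regression step this abstract describes, here is a minimal sketch of a visual-to-audio mapping with feature-level temporal context. The dimensions, the feature choices and the use of ridge regression are illustrative assumptions, not the thesis's actual configuration.

```python
# Minimal sketch: map visual features (e.g. mouth-region descriptors) to
# spectral-envelope parameters (e.g. mel-filterbank energies) by regression.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

# Stand-in training data: one row per video frame (dimensions are assumptions).
X_visual = rng.normal(size=(5000, 40))    # 40-D visual features per frame
Y_spectral = rng.normal(size=(5000, 25))  # 25-D spectral-envelope targets

def stack_context(X, width=3):
    """Feature-level temporal encoding: stack +/- `width` neighbouring frames
    (np.roll wraps at the edges, which is acceptable for a sketch)."""
    pads = [np.roll(X, shift, axis=0) for shift in range(-width, width + 1)]
    return np.concatenate(pads, axis=1)

model = Ridge(alpha=1.0)
model.fit(stack_context(X_visual), Y_spectral)

# Predict spectral envelopes for unseen visual frames; a vocoder would then
# combine these with excitation parameters (F0, aperiodicity) to produce audio.
Y_hat = model.predict(stack_context(rng.normal(size=(100, 40))))
print(Y_hat.shape)  # (100, 25)
```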
2

Andrews, Brandie. "Auditory and visual information facilitating speech integration." Connect to resource, 2007. http://hdl.handle.net/1811/25202.

Full text
Abstract:
Thesis (Honors)--Ohio State University, 2007.
Title from first page of PDF file. Document formatted into pages: contains 43 p.; also includes graphics. Includes bibliographical references (p. 27-28). Available online via Ohio State University's Knowledge Bank.
3

Fixmer, Eric Norbert Charles. "Grouping of auditory and visual information in speech." Thesis, University of Cambridge, 2008. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.612553.

Full text
4

Keintz, Constance Kay. "Influence of visual information on the intelligibility of dysarthric speech." Diss., The University of Arizona, 2005. http://hdl.handle.net/10150/280714.

Full text
Abstract:
Purpose. The purpose of this study was to examine the influence of visual information on the intelligibility of dysarthric speech. The two research questions posed by this study were: (1) Does the presentation mode (auditory-only versus auditory-visual) influence the intelligibility of a homogeneous group of speakers with dysarthria? and (2) Does the experience of the listener (experienced versus inexperienced with dysarthric speech) influence the intelligibility scores of these speakers? Background. Investigations of speakers with hearing impairment and laryngectomy have indicated that intelligibility scores are higher in an auditory-visual mode compared to an auditory-only mode of presentation. Studies of speakers with dysarthria have resulted in mixed findings. Methodological issues such as heterogeneity of speaker groups and factors related to the stimuli may have contributed to these mixed findings. Method. Eight speakers with dysarthria related to Parkinson disease were audio- and video-recorded while reading sentences. Movie files were created for an auditory-only condition, containing the speaker's voice but no visual image of the speaker, and an auditory-visual condition, containing the speaker's voice and a view of his/her face. Two groups of listeners (experienced and inexperienced with dysarthric speech) completed listening sessions in which they listened to (auditory-only) and watched and listened to (auditory-visual) the movies and transcribed what they heard each speaker say. Results. Although auditory-visual scores were significantly higher than auditory-only intelligibility scores, the difference between these scores was influenced by the order in which the two conditions were presented. A speaker effect was found across presentation modes, with less intelligible speakers demonstrating greater benefit from the inclusion of visual information. No statistically significant difference was found between the two listener groups in this study. Conclusions. These findings suggest that clinicians should include both auditory-only and auditory-visual intelligibility measures when assessing speakers with Parkinson disease. Management of intelligibility impairment in these individuals should consider whether visual information is beneficial to listeners.
5

Hagrot, Joel. "A Data-Driven Approach For Automatic Visual Speech In Swedish Speech Synthesis Applications." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-246393.

Full text
Abstract:
This project investigates the use of artificial neural networks for visual speech synthesis. The objective was to produce a framework for animated chat bots in Swedish. A survey of the literature on the topic revealed that the state-of-the-art approach was to use ANNs with either audio or phoneme sequences as input. Three subjective surveys were conducted, in the context of the final product as well as in a more neutral context with less post-processing. They compared the ground truth, captured using the depth-sensing camera of the iPhone X, against both the ANN model and a baseline model. The statistical analysis used mixed effects models to find any statistically significant differences. Also, the temporal dynamics and the error were analyzed. The results show that a relatively simple ANN was capable of learning a mapping from phoneme sequences to blend shape weight sequences with satisfactory results, except for the fact that certain consonant requirements were unfulfilled. The issues with certain consonants were also observed in the ground truth, to some extent. Post-processing with consonant-specific overlays made the ANN's animations indistinguishable from the ground truth, and the subjects perceived them as more realistic than the baseline model's animations. The ANN model proved useful in learning the temporal dynamics and coarticulation effects for vowels, but may have needed more data to properly satisfy the requirements of certain consonants. For the purposes of the intended product, these requirements can be satisfied using consonant-specific overlays.
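For a concrete picture of the phoneme-to-blend-shape mapping described here, the following is a minimal sketch, assuming one-hot phoneme frames as input and per-frame blend shape weights as output; the layer sizes and the recurrent architecture are illustrative assumptions, not the thesis's actual network.

```python
# Minimal sketch: phoneme frames in, blend shape weight sequences out.
import torch
import torch.nn as nn

N_PHONEMES, N_BLENDSHAPES = 40, 20  # assumed sizes, for illustration only

class VisualSpeechNet(nn.Module):
    def __init__(self):
        super().__init__()
        # A bidirectional GRU gives each frame context in both directions,
        # one simple way to capture coarticulation effects.
        self.rnn = nn.GRU(N_PHONEMES, 64, batch_first=True, bidirectional=True)
        self.out = nn.Sequential(nn.Linear(128, N_BLENDSHAPES), nn.Sigmoid())

    def forward(self, phoneme_frames):      # (batch, time, N_PHONEMES)
        h, _ = self.rnn(phoneme_frames)
        return self.out(h)                  # blend shape weights in [0, 1]

net = VisualSpeechNet()
dummy = torch.zeros(1, 100, N_PHONEMES)     # 100 frames of one-hot phonemes
dummy[0, :, 3] = 1.0
weights = net(dummy)
print(weights.shape)                        # torch.Size([1, 100, 20])
```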
6

Bergmann, Kirsten, and Stefan Kopp. "Verbal or visual? How information is distributed across speech and gesture in spatial dialog." Universität Potsdam, 2006. http://opus.kobv.de/ubp/volltexte/2006/1037/.

Full text
Abstract:
In spatial dialog, such as direction giving, humans make frequent use of speech-accompanying gestures. Some gestures convey largely the same information as speech, while others complement speech.
This paper reports a study on how speakers distribute meaning across speech and gesture, and on which factors this distribution depends. Utterance meaning and the wider dialog context were tested by statistically analyzing a corpus of direction-giving dialogs. Problems of speech production (as indicated by discourse markers and disfluencies), the communicative goals, and the information status were found to be influential, while feedback signals from the addressee had no influence.
7

Erdener, Vahit Doğu. "The effect of auditory, visual and orthographic information on second language acquisition." View thesis, 2002. http://library.uws.edu.au/adt-NUWS/public/adt-NUWS20030408.114825/index.html.

Full text
Abstract:
Thesis (MA (Hons)) -- University of Western Sydney, 2002.
"A thesis submitted in partial fulfillment of the requirements for the degree of Masters of Arts (Honours), MARCS Auditory Laboratories & School of Psychology, University of Western Sydney, May 2002" Bibliography : leaves 83-93.
8

Patterson, Robert W. "The effects of inaccurate speech information on performance in a visual search and identification task." Thesis, Georgia Institute of Technology, 1987. http://hdl.handle.net/1853/30481.

Full text
9

Erdener, Vahit Dogu, University of Western Sydney, College of Arts, Education and Social Sciences, and School of Psychology. "The effect of auditory, visual and orthographic information on second language acquisition." THESIS_CAESS_PSY_Erdener_V.xml, 2002. http://handle.uws.edu.au:8081/1959.7/685.

Full text
Abstract:
The current study investigates the effect of auditory and visual speech information and orthographic information on second/foreign language (L2) acquisition. To test this, native speakers of Turkish (a language with a transparent orthography) and native speakers of Australian English (a language with an opaque orthography) were exposed to Spanish (transparent orthography) and Irish (opaque orthography) legal non-word items in four experimental conditions: auditory-only, auditory-visual, auditory-orthographic, and auditory-visual-orthographic. On each trial, Turkish and Australian English speakers were asked to produce each of the Spanish and Irish legal non-words. In terms of phoneme errors, it was found that Turkish participants generally made fewer errors in Spanish than their Australian counterparts, and visual speech information generally facilitated performance. Orthographic information had an overriding effect, such that there was no visual advantage once it was provided. In the orthographic conditions, Turkish speakers performed better than their Australian English counterparts with Spanish items and worse with Irish items. In terms of native speakers' ratings of participants' productions, it was found that orthographic input improved accent. Overall, the results confirm findings that visual information enhances speech production in L2, and additionally show the facilitative effects of orthographic input in L2 acquisition as a function of orthographic depth. Inter-rater reliability measures revealed that the native speaker rating procedure may be prone to individual and socio-cultural influences that may stem from internal criteria for native accents. This suggests that native speaker ratings should be treated with caution.
Master of Arts (Hons)
10

Ostroff, Wendy Louise. "Non-linguistic Influences on Infants' Nonnative Phoneme Perception: Exaggerated prosody and Visual Speech Information Aid Discrimination." Diss., Virginia Tech, 2000. http://hdl.handle.net/10919/27640.

Full text
Abstract:
Research indicates that infants lose the capacity to perceive distinctions in nonnative sounds as they become sensitive to the speech sounds of their native language (i.e., by 10- to 12-months of age). However, investigations into the decline in nonnative phonetic perception have neglected to examine the role of non-linguistic information. Exaggerated prosodic intonation and facial input are prominent in the infants' language-learning environment, and both have been shown to ease the task of speech perception. The current investigation was designed to examine the impact of infant-directed (ID) speech and facial input on infants' ability to discriminate phonemes that do not contrast in their native language. Specifically, 11-month-old infants were tested for discrimination of both a native phoneme contrast and a nonnative phoneme contrast across four conditions, including an auditory manipulation (ID speech vs. AD speech) and a visual manipulation (Face vs. Geometric Form). The results indicated that infants could discriminate the native phonemes across any of the four conditions. Furthermore, the infants could discriminate the nonnative phonemes if they had enhanced auditory and visual information available to them (i.e., if they were presented in ID speech with a synchronous facial display), and if the nonnative discrimination task was the infants' first test session. These results suggest that infants do not lose the capacity to discriminate nonnative phonemes by the end of the first postnatal year, but that they rely on certain language-relevant and non-linguistic sources of information to discriminate nonnative sounds.
Ph. D.
11

Abdalla, Marwa. "Can participants extract subtle information from gesturelike visual stimuli that are coordinated with speech without using any other cues?" Thesis, University of Iowa, 2012. https://ir.uiowa.edu/etd/2805.

Full text
Abstract:
Embodied cognition is the reflection of an organism's interaction with its environment on its cognitive processes. We explored whether participants are able to pick up on subtle cues from gestures, using the Tower of Hanoi task. Previous research has shown that listeners are sensitive to the height of the gestures that they observe, and reflect this knowledge in their mouse movements (Cook & Tanenhaus, 2009). Participants in our study watched a modified video of someone explaining the Tower of Hanoi puzzle solution, in which they only saw a black background with two moving dots representing the hand positions from the original explanation in space and time. We parametrically manipulated the location of the dots to examine whether listeners were sensitive to this subtle variation. We selected the transfer gestures from the original explanation, and tracked the hand positions with dots at varying heights relative to the original gesture height. The experimental gesture heights reflected 0%, 25%, 50%, 75% and 100% of this original height. We predicted, based on previous research (Cook in prep), that participants would be able to extract the difference in gesture height and reflect this in their mouse movements when solving the problem. Using a linear model for our analysis, we found that the starting trajectory confirmed our hypothesis. However, when looking at the averaged first 15 moves (the minimum to solve the puzzle) across the five conditions, the ordered effect of the gesture heights was lost, although there were apparent differences between the gesture heights. This is an important finding because it shows that participants are able to glean subtle height information from gestures. Listeners truly interpret iconic gestures iconically.
12

Alharbi, Saad T. "Graphical and Non-speech Sound Metaphors in Email Browsing: An Empirical Approach. A Usability Based Study Investigating the Role of Incorporating Visual and Non-Speech Sound Metaphors to Communicate Email Data and Threads." Thesis, University of Bradford, 2009. http://hdl.handle.net/10454/4244.

Full text
Abstract:
This thesis investigates the effect of incorporating various information visualisation techniques and non-speech sounds (i.e. auditory icons and earcons) in email browsing. This empirical work consisted of three experimental phases. The first experimental phase aimed at finding the most usable visualisation techniques for presenting email information. This experiment involved the development of two experimental email visualisation approaches, called LinearVis and MatrixVis. These approaches visualised email messages based on a dateline together with various types of email information, such as the time and the senders. The findings of this experiment were used as a basis for the development of a further email visualisation approach, called LinearVis II. This novel approach presented email data based on multi-coordinated views. The usability of message retrieval in this approach was investigated and compared to a typical email client in the second experimental phase. Users were required to retrieve email messages in the two experiments with the provided relevant information, such as the subject, status and priority. The third experimental phase aimed at exploring the usability of retrieving email messages by using other types of email data, particularly email threads. This experiment investigated the synergistic use of graphical representations with non-speech sounds (multimodal metaphors), graphical representations and textual display to present email threads and to communicate contextual information about email threads. The findings of this empirical study demonstrated that there is high potential for using information visualisation techniques and non-speech sounds (i.e. auditory icons and earcons) to improve the usability of email message retrieval. Furthermore, the thesis concludes with a set of empirically derived guidelines for the use of information visualisation techniques and non-speech sound to improve email browsing.
Taibah University in Medina and the Ministry of Higher Education in Saudi Arabia.
13

Alharbi, Saad Talal. "Graphical and non-speech sound metaphors in email browsing : an empirical approach : a usability based study investigating the role of incorporating visual and non-speech sound metaphors to communicate email data and threads." Thesis, University of Bradford, 2009. http://hdl.handle.net/10454/4244.

Full text
Abstract:
This thesis investigates the effect of incorporating various information visualisation techniques and non-speech sounds (i.e. auditory icons and earcons) in email browsing. This empirical work consisted of three experimental phases. The first experimental phase aimed at finding the most usable visualisation techniques for presenting email information. This experiment involved the development of two experimental email visualisation approaches, called LinearVis and MatrixVis. These approaches visualised email messages based on a dateline together with various types of email information, such as the time and the senders. The findings of this experiment were used as a basis for the development of a further email visualisation approach, called LinearVis II. This novel approach presented email data based on multi-coordinated views. The usability of message retrieval in this approach was investigated and compared to a typical email client in the second experimental phase. Users were required to retrieve email messages in the two experiments with the provided relevant information, such as the subject, status and priority. The third experimental phase aimed at exploring the usability of retrieving email messages by using other types of email data, particularly email threads. This experiment investigated the synergistic use of graphical representations with non-speech sounds (multimodal metaphors), graphical representations and textual display to present email threads and to communicate contextual information about email threads. The findings of this empirical study demonstrated that there is high potential for using information visualisation techniques and non-speech sounds (i.e. auditory icons and earcons) to improve the usability of email message retrieval. Furthermore, the thesis concludes with a set of empirically derived guidelines for the use of information visualisation techniques and non-speech sound to improve email browsing.
14

Kühnapfel, Thorsten. "Audio networks for speech enhancement and indexing." Thesis, Curtin University, 2009. http://hdl.handle.net/20.500.11937/206.

Full text
Abstract:
For humans, hearing is the second most important sense, after sight. Therefore, acoustic information greatly contributes to observing and analysing an area of interest. For this reason, combining audio and video cues for surveillance enhances understanding of the scene and the observed events. However, when combining different sensors, their measurements need to be correlated, which is done by either knowing the exact relative sensor alignment or learning a mapping function. Most deployed systems assume a known relative sensor alignment, making them susceptible to sensor drifts. Additionally, audio recordings are generally a mixture of several source signals and therefore need to be processed to extract a desired sound source, such as the speech of a target person. In this thesis a generic framework is described that captures, indexes and extracts surveillance events from coordinated audio and video cues. It presents a dynamic joint-sensor calibration approach that uses audio-visual sensor measurements to dynamically and incrementally learn the calibration function, making the sensor calibration resilient to independent drifts in the sensor suite. Experiments demonstrate the use of such a framework for enhancing surveillance. Furthermore, a speech enhancement approach is presented based on a distributed network of microphones, increasing the effectiveness of acoustic surveillance of large areas. This approach is able to detect and enhance speech in the presence of rapidly changing environmental noise. Spectral subtraction, a single-channel speech enhancement approach, is modified to adapt quickly to rapid noise changes of two common noise sources by incorporating multiple noise models. The result of the cross-correlation-based noise classification approach is also utilised to improve the voice activity detection by minimising false detection based on rapid noise changes. Experiments with real-world noise consisting of scooter and café noise have proven the advantage of multiple noise models, especially when the noise changes during speech. The modified spectral subtraction approach is then extended to real-world scenarios by introducing more, and highly non-stationary, noise types. Thus, the focus is directed to implementing a more sophisticated noise classification approach by extracting a variety of acoustic features and applying a PCA transformation to compute the Mahalanobis distance to each noise class. This distance measurement is also included in the voice activity detection algorithm to reduce false detection for highly non-stationary noise types. However, using spectral subtraction in non-stationary noise environments, such as street noise, reduces the performance of the speech enhancement. For that reason, the speech enhancement approach is further improved by using the sound information of the entire network to update the noise model of the detected noise type during speech. This adjustment considerably improved the speech enhancement performance in non-stationary noise environments. Experiments conducted under diverse real-world conditions, including rapid noise changes and non-stationary noise sources, demonstrate the effectiveness of the presented method.
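The following is a minimal sketch of spectral subtraction with multiple noise models, in the spirit of the approach this abstract describes; the correlation-based model selection and all constants are illustrative assumptions, not the thesis's actual implementation.

```python
# Minimal sketch: per-frame spectral subtraction that first picks the
# best-matching stored noise model, then subtracts it with a spectral floor.
import numpy as np

def spectral_subtract(noisy_frame_fft, noise_models, floor=0.01):
    mag = np.abs(noisy_frame_fft)
    # Classify: choose the stored noise magnitude spectrum most correlated
    # with the current frame (a stand-in for the thesis's classifier).
    scores = [np.corrcoef(mag, m)[0, 1] for m in noise_models]
    noise_mag = noise_models[int(np.argmax(scores))]
    # Subtract the noise estimate; the floor limits "musical noise" artefacts.
    clean_mag = np.maximum(mag - noise_mag, floor * mag)
    phase = np.angle(noisy_frame_fft)   # reuse the noisy phase, as is standard
    return clean_mag * np.exp(1j * phase)

# Toy usage with two pre-learned noise models (e.g. scooter and cafe noise).
rng = np.random.default_rng(1)
models = [np.abs(rng.normal(size=256)), np.abs(rng.normal(size=256))]
frame = np.fft.rfft(rng.normal(size=510))   # 256-bin one-sided spectrum
enhanced = spectral_subtract(frame, models)
print(enhanced.shape)
```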
15

Navarathna, Rajitha Dharshana Bandara. "Robust recognition of human behaviour in challenging environments." Thesis, Queensland University of Technology, 2014. https://eprints.qut.edu.au/66235/1/Rajitha%20Dharshana%20Bandara_Navarathna_Thesis.pdf.

Full text
Abstract:
Novel techniques have been developed for the automatic recognition of human behaviour in challenging environments, using information from visual and infra-red camera feeds. The techniques have been applied to two interesting scenarios: recognising drivers' speech using lip movements, and recognising audience behaviour, while watching a movie, using facial features and body movements. The outcomes of the research in these two areas will be useful in improving the performance of voice recognition in automobiles for voice-based control, and in obtaining accurate movie interest ratings based on live audience response analysis.
16

Ponto, Jessica J. "Speech is a Mouth, Text is a Body." Miami University / OhioLINK, 2008. http://rave.ohiolink.edu/etdc/view?acc_num=miami1218076653.

Full text
17

Kalantari, Shahram. "Improving spoken term detection using complementary information." Thesis, Queensland University of Technology, 2015. https://eprints.qut.edu.au/90074/1/Shahram_Kalantari_Thesis.pdf.

Full text
Abstract:
This research has made contributions to the area of spoken term detection (STD), defined as the process of finding all occurrences of a specified search term in a large collection of speech segments. The use of visual information, in the form of the speaker's lip movements, in addition to audio, and the use of the topic of the speech segments and the expected frequency of words in the target speech domain are proposed. By using this complementary information, improvements in STD performance have been achieved, enabling efficient search for keywords in large collections of multimedia documents.
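A minimal sketch of the general idea of combining complementary evidence for spoken term detection, assuming probability-like scores from acoustic, visual and topic models; the log-linear fusion and its weights are illustrative assumptions, not the thesis's actual method.

```python
# Minimal sketch: fuse acoustic, visual (lip-movement) and topic/word-frequency
# evidence for one candidate detection with a weighted log-linear combination.
import math

def fused_std_score(acoustic, visual, topic_prior, w=(0.6, 0.2, 0.2)):
    """Each input is a probability-like score in (0, 1] for one detection."""
    log_score = (w[0] * math.log(acoustic)
                 + w[1] * math.log(visual)
                 + w[2] * math.log(topic_prior))
    return math.exp(log_score)

# A detection that is acoustically weak but visually and topically supported
# can outrank one that relies on acoustics alone.
print(fused_std_score(0.55, 0.90, 0.80))  # ~0.65
print(fused_std_score(0.60, 0.50, 0.30))  # ~0.50
```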
18

Fong, Katherine KaYan. "IR-Depth Face Detection and Lip Localization Using Kinect V2." DigitalCommons@CalPoly, 2015. https://digitalcommons.calpoly.edu/theses/1425.

Full text
Abstract:
Face recognition and lip localization are two main building blocks in the development of audio-visual automatic speech recognition systems (AV-ASR). In many earlier works, face recognition and lip localization were conducted in uniform lighting conditions with simple backgrounds. However, such conditions are seldom the case in real-world applications. In this paper, we present an approach to face recognition and lip localization that is invariant to lighting conditions. This is done by employing infrared and depth images captured by the Kinect V2 device. First, we present the use of infrared images for face detection. Second, we use the face's inherent depth information to reduce the search area for the lips by developing a nose point detection method. Third, we further reduce the search area by using a depth segmentation algorithm to separate the face from its background. Finally, with the reduced search range, we present a method for lip localization based on depth gradients. Experimental results demonstrated an accuracy of 100% for face detection, and 96% for lip localization.
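The following is a minimal sketch of the depth-driven search-space reduction this abstract outlines: take the closest pixel in the face region as a nose point, segment away the background by depth, and keep the area below the nose as the lip search region. The thresholds, box format and array sizes are illustrative assumptions, not the thesis's actual pipeline.

```python
# Minimal sketch: reduce the lip search area using a Kinect-style depth map.
import numpy as np

def lip_search_region(depth, face_box, bg_margin_mm=150):
    x0, y0, x1, y1 = face_box                 # face box from the IR detector
    face = depth[y0:y1, x0:x1].astype(float)
    face[face == 0] = np.inf                  # zero depth = missing sensor data
    # Nose point: the pixel nearest to the camera inside the face box.
    ny, nx = np.unravel_index(np.argmin(face), face.shape)
    nose_depth = face[ny, nx]
    # Depth segmentation: drop pixels much farther than the nose (background).
    mask = face < nose_depth + bg_margin_mm
    # Lips lie below the nose: keep only the lower part of the face region.
    region = np.zeros_like(mask)
    region[ny:, :] = mask[ny:, :]
    return (ny + y0, nx + x0), region

depth = np.full((424, 512), 2000, dtype=np.uint16)  # Kinect V2 depth frame (mm)
depth[200:300, 220:300] = 800                       # a face nearer the camera
depth[240, 260] = 750                               # nose tip, nearest point
nose, region = lip_search_region(depth, (220, 180, 300, 310))
print(nose, region.sum())                           # (240, 260) and region size
```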
19

Verma, Prabhat [author]. "Speech as Interface in Web Applications for Visually Challenged." Munich: GRIN Verlag, 2015. http://d-nb.info/1097585689/34.

Full text
20

Ceder, Maria, and Camilla Hellström. "Det maskerande brusljudets påverkan på inlärningen av visuell information : om effekten av maskerande brusljud i öppna kontorslandskap." Thesis, Högskolan i Gävle, Avdelningen för socialt arbete och psykologi, 2012. http://urn.kb.se/resolve?urn=urn:nbn:se:hig:diva-12404.

Full text
Abstract:
This study examined whether masking irrelevant speech with white noise affects the encoding of visual information. An experiment was carried out in a laboratory with 32 participants. The participants were presented with a series of written words and were prompted to recall these words in any order. While the participants studied the written words, irrelevant speech from the same semantic category was presented, with or without a masking noise. The participants were told to ignore the irrelevant speech. The results showed that the number of intrusions from the irrelevant speech decreases and the number of recalled written words increases when the irrelevant speech is masked by white noise, compared to irrelevant speech without a masking noise. The findings of this study could be applied in the acoustic design of open-plan offices, where cognitive tasks such as reading comprehension and proofreading are performed in a noisy environment. White noise can reduce the intelligibility of office noise and irrelevant speech, which has a positive effect on work performance.
21

Bannani, Rouaissia Sabrina. "Pour une prise en charge des difficultés de la compréhension orale en FLE : cas des collégiens tunisiens issus des milieux défavorisés." Thesis, Aix-Marseille, 2018. http://www.theses.fr/2018AIXM0466.

Full text
Abstract:
This research fits in the field of oral didactics and aims to study verbal interactions in Tunisian middle-school classes for learners in difficulty from underprivileged backgrounds. Despite the efforts invested by teachers, because those efforts are individual and unsystematic they are in vain, and demotivation is of such magnitude that it inhibits any act of learning, however small. Teachers are aware of the need to develop oral competence in these learners, considering it attainable, but they sometimes forget that, unlike in privileged areas, the majority of learners from underprivileged areas speak a foreign language that they never use outside of school. What do FFL methodologies concretely offer for teaching oral skills, taking students in difficulty into account? How can students be trained in oral comprehension and production skills, given the particular context of FFL classes in underprivileged areas? What support plan can be proposed to prevent failure and to lead learners in difficulty to hold objective and positive representations of themselves, on the one hand, and of school, of learning in general, and of French in particular, on the other? We thus seek to determine which contexts can favour the emancipation of learners in difficulty through their speaking up, in order to contribute to the didactics of the oral and to propose some didactic paths that would make these learners active in the classroom, giving them the opportunity to assert themselves through participation, and through commitment to constructing the knowledge taught to them.
22

Simoncini, Claudio. "Intégration spatio-temporelle de l'information visuelle pour les mouvements oculaires et la perception = Spatio-temporal integration of visual information for eye movements and perception." Thesis, Aix-Marseille, 2013. http://www.theses.fr/2013AIXM5065/document.

Full text
Abstract:
We focused on the impact of the statistical distributions of visual information on different behavioural responses. We first asked how motion information is integrated to estimate speed, either to perform a speed discrimination task or to control reflexive tracking eye movements. Next, we investigated how the spatial distribution of information in textures affects both pattern recognition and fixational eye movements. To do so, we used a set of artificial stimuli: naturalistic textures whose information content, such as their spatio-temporal frequency bandwidth, can be tightly controlled. The first studies compared speed-information decoding for ocular following eye movements and perceptual speed discrimination. We found a strong dissociation: ocular following takes full advantage of an enlarged spatio-temporal frequency bandwidth, while perceptual speed discrimination is largely impaired for large-bandwidth stimuli. This dissociation persists over a large temporal integration window. We propose an adaptive gain control mechanism to explain these opposite dependencies. The second series of experimental studies investigated the properties of fixational eye movements (microsaccades and saccades) as a function of the mean and variance of the spatial frequency content of static visual textures. We show that several characteristics of fixational saccades (location, direction and amplitude) varied systematically with the distribution of spatial frequencies. The spatial distribution of the fixation zones could be best predicted from the saliency maps of the stimuli.
23

Hill, Brian, and 廖峻廷. "Robust Speech Recognition Integrating Visual Information." Thesis, 1997. http://ndltd.ncl.edu.tw/handle/97538191028447078081.

Full text
24

Kao, Jen-ching, and 高仁璟. "Effects of Audio-visual Information on the Intelligibility of Esophageal Speech." Thesis, 2015. http://ndltd.ncl.edu.tw/handle/64143533157423675383.

Full text
Abstract:
Master's thesis, National Taipei University of Nursing and Health Sciences, Graduate Institute of Speech-Language Pathology and Audiology, ROC year 104 (2015).
The purpose of this study was to determine the effects of visual information on speech intelligibility for esophageal speakers, and further to examine how the degree of the speaker's auditory speech intelligibility influences the effect of visual information. In addition, to investigate the role of visual information in speech perception, intelligibility scores for phonemes were compared after visual information was added. The subjects of this study were 6 esophageal speakers and 60 listeners. Speakers were divided into two groups (3 per group): a Good esophageal group, whose auditory speech intelligibility was above 85%, and a Moderate group, whose auditory speech intelligibility was between 50% and 75%. Speakers were recorded while they read sentences. Listeners transcribed sentences while watching and listening to videotapes of the speakers (audio-visual mode) and while only listening to the speakers (auditory-only mode). Scores of sentence intelligibility and phoneme intelligibility were determined based on the listeners' transcriptions. The results showed a statistically significant higher sentence intelligibility score for the audio-visual mode compared to the auditory-only mode, as well as a significant interaction effect between mode of presentation and degree of auditory speech intelligibility. Within degree of auditory speech intelligibility, the Good esophageal group showed significantly greater benefit from the inclusion of visual information compared to the Moderate esophageal group. In addition, 17 out of 21 phonemes benefited significantly from the inclusion of visual cues. A significant difference was found between the 7 places of articulation, with the greatest improvement found for bilabial sounds. No significant difference was found between the 7 manners of articulation. The findings suggest that facial visual information increases the intelligibility of esophageal speech, and that auditory speech intelligibility is an important variable, in that less intelligible speakers' scores can increase more when visual information is added. Moreover, audio-visual processing is more effective than auditory processing for two reasons. First, some articulatory movements can be seen clearly from the mouth. Second, the audible and visible patterns are highly correlated; many features of an utterance can be seen from facial information.
25

Erdener, Vahit Dogu, University of Western Sydney, College of Arts, School of Psychology. "Development of auditory-visual speech perception in young children." 2007. http://handle.uws.edu.au:8081/1959.7/13783.

Full text
Abstract:
Unlike auditory-only speech perception, little is known about the development of auditory-visual speech perception. Recent studies show that pre-linguistic infants perceive auditory-visual speech phonetically in the absence of any phonological experience. In addition, while an increase in visual speech influence over age is observed in English speakers, particularly between six and eight years, this is not the case in Japanese speakers. This thesis aims to investigate the factors that lead to an increase in visual speech influence in English speaking children aged between 3 and 8 years. The general hypothesis of this thesis is that age-related, language-specific factors will be related to auditory-visual speech perception. Three experiments were conducted here. Results show that in linguistically challenging periods, such as school onset and reading acquisition, there is a strong link between auditory visual and language specific speech perception, and that this link appears to help cope with new linguistic challenges. However this link does not seem to be present in adults or preschool children, for whom auditory visual speech perception is predictable from auditory speech perception ability alone. Implications of these results in relation to existing models of auditory-visual speech perception and directions for future studies are discussed.
Doctor of Philosophy (PhD)
26

Erdener, Dogu. "Development of auditory-visual speech perception in young children." Thesis, 2007. http://handle.uws.edu.au:8081/1959.7/13783.

Full text
Abstract:
Unlike auditory-only speech perception, little is known about the development of auditory-visual speech perception. Recent studies show that pre-linguistic infants perceive auditory-visual speech phonetically in the absence of any phonological experience. In addition, while an increase in visual speech influence over age is observed in English speakers, particularly between six and eight years, this is not the case in Japanese speakers. This thesis aims to investigate the factors that lead to an increase in visual speech influence in English speaking children aged between 3 and 8 years. The general hypothesis of this thesis is that age-related, language-specific factors will be related to auditory-visual speech perception. Three experiments were conducted here. Results show that in linguistically challenging periods, such as school onset and reading acquisition, there is a strong link between auditory visual and language specific speech perception, and that this link appears to help cope with new linguistic challenges. However this link does not seem to be present in adults or preschool children, for whom auditory visual speech perception is predictable from auditory speech perception ability alone. Implications of these results in relation to existing models of auditory-visual speech perception and directions for future studies are discussed.
27

Erdener, Dogu. "The effect of auditory, visual and orthographic information on second language acquisition." Thesis, 2002. http://handle.uws.edu.au:8081/1959.7/685.

Full text
Abstract:
The current study investigates the effect of auditory and visual speech information and orthographic information on second/foreign language (L2) acquisition. To test this, native speakers of Turkish (a language with a transparent orthography) and native speakers of Australian English (a language with an opaque orthography) were exposed to Spanish (transparent orthography) and Irish (opaque orthography) legal non-word items in four experimental conditions: auditory-only, auditory-visual, auditory-orthographic, and auditory-visual-orthographic. On each trial, Turkish and Australian English speakers were asked to produce each of the Spanish and Irish legal non-words. In terms of phoneme errors, it was found that Turkish participants generally made fewer errors in Spanish than their Australian counterparts, and visual speech information generally facilitated performance. Orthographic information had an overriding effect, such that there was no visual advantage once it was provided. In the orthographic conditions, Turkish speakers performed better than their Australian English counterparts with Spanish items and worse with Irish items. In terms of native speakers' ratings of participants' productions, it was found that orthographic input improved accent. Overall, the results confirm findings that visual information enhances speech production in L2, and additionally show the facilitative effects of orthographic input in L2 acquisition as a function of orthographic depth. Inter-rater reliability measures revealed that the native speaker rating procedure may be prone to individual and socio-cultural influences that may stem from internal criteria for native accents. This suggests that native speaker ratings should be treated with caution.
28

Lapchak, Marion Cone. "Exploring the effects of age, early-onset otitis media, and articulation errors on the integration of auditory and visual information in speech perception /." Diss., 2005. http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqdiss&rft_dat=xri:pqdiss:3188497.

Full text
29

Parsons, Brendan. "The neuroscience of cognitive enhancement : enhanced attention, working memory and visual information processing speed using 3D-MOT." Thèse, 2015. http://hdl.handle.net/1866/16316.

Full text
Abstract:
Cognitive enhancement is an area of burgeoning interest in many fields, including neuropsychology. While various methods exist for achieving cognitive enhancement, few are supported by research. The current work examines the state of cognitive enhancement interventions. It first outlines the weaknesses observed in these practices and then proposes a standard template for assessing cognitive enhancement tools. A research study is then presented that examines a novel cognitive enhancement tool, 3-dimensional multiple object tracking (3D-MOT), and weighs the current evidence for 3D-MOT against the proposed standard template. The results of the current work demonstrate that 3D-MOT is effective in enhancing attention, working memory and visual information processing speed, and represent a first step toward establishing 3D-MOT as a cognitive enhancement tool.
30

CHEN, TING-YU, and 陳亭妤. "The evaluation and design of the visual information discrimination for the transportation tickets : A case of Taiwan High Speed Rail." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/rf77zf.

Full text
Abstract:
Master's thesis, Feng Chia University, Master's Program in Creative Design, ROC year 106 (2018).
With the convenience and expansion of transport systems, the growing demands of traffic operations, and the increase in foreign tourists visiting Taiwan in recent years, the number of trains has gradually increased to serve more passengers. At the same time, the design of ticket information content should be examined and improved for travelers of all ages. So that passengers can board the right train at the right station, the information on the transport ticket is important. In view of this, this study uses the Taiwan High Speed Rail ticket as a sample for analysis, user evaluation and redesign. In the research process, content analysis grounded in graphic design theory was used to analyse the visual elements of the current high-speed rail ticket; the experiences of different age groups with the current ticket were then gathered through a questionnaire survey. A Tobii T120 eye tracker was used in task-based tests to establish more precisely how subjects view the current ticket in different situations, and the problems of the current ticket and design recommendations were then summarised. The improved design covers the message framework and the visual architecture of the ticket, and a second round of task tests with participants checked whether the problems of the current ticket were resolved. Finally, based on the analysis results, a new version of the high-speed rail ticket is proposed, together with guidelines for its message architecture and interface visual design.
