Dissertations / Theses on the topic 'Sound data'

To see the other types of publications on this topic, follow the link: Sound data.

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the top 50 dissertations / theses for your research on the topic 'Sound data.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate a bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse dissertations and theses from a wide variety of disciplines and organise your bibliography correctly.

1

Hebden, John Edward. "Acquisition and analysis of heart sound data." Thesis, University of Sussex, 1997. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.360518.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Bearman, N. "Using sound to represent uncertainty in spatial data." Thesis, University of East Anglia, 2013. https://ueaeprints.uea.ac.uk/52676/.

Abstract:
There is a limit to the amount of spatial data that can be shown visually in an effective manner, particularly when the data sets are extensive or complex. Using sound to represent some of these data (sonification) is a way of avoiding visual overload. This thesis creates a conceptual model showing how sonification can be used to represent spatial data and evaluates a number of elements within the conceptual model. These are examined in three different case studies to assess the effectiveness of the sonifications. Current methods of using sonification to represent spatial data have been restricted by the technology available and have had very limited user testing. While existing research shows that sonification can be done, it does not show whether it is an effective and useful method of representing spatial data to the end user. A number of prototypes show how spatial data can be sonified, but only a small handful of these have performed any user testing beyond the authors' immediate colleagues (where n > 4). This thesis creates and evaluates sonification prototypes, which represent uncertainty using three different case studies of spatial data. Each case study is evaluated by a significant user group (between 45 and 71 individuals) who completed a task-based evaluation with the sonification tool, as well as reporting qualitatively their views on the effectiveness and usefulness of the sonification method. For all three case studies, using sound to reinforce information shown visually resulted in more effective performance for the majority of participants than traditional visual methods. Participants who were familiar with the dataset were much more effective at using the sonification than those who were not, and an interactive sonification that required significant involvement from the user was much more effective than a static sonification, which did not provide significant user engagement.
Using sounds with a clear and easily understood scale (such as piano notes) was important to achieve an effective sonification. These findings are used to improve the conceptual model developed earlier in this thesis and highlight areas for future research.
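The point about a clearly ordered scale can be illustrated with a minimal sketch (not Bearman's implementation; the linear mapping and the note range are assumptions): normalised uncertainty values are mapped onto MIDI piano-key numbers, so higher uncertainty sounds higher.

```python
# Illustrative sketch: map normalised uncertainty onto a familiar,
# discretely ordered musical scale (MIDI piano-key numbers).

def uncertainty_to_midi(u, low=48, high=84):
    """Map uncertainty u in [0, 1] to a MIDI note number.

    Higher uncertainty -> higher pitch; the piano's discrete,
    familiar scale makes the ordering easy to hear.
    """
    if not 0.0 <= u <= 1.0:
        raise ValueError("uncertainty must be normalised to [0, 1]")
    return low + round(u * (high - low))

notes = [uncertainty_to_midi(u) for u in (0.0, 0.5, 1.0)]
print(notes)  # [48, 66, 84]
```

A real sonification tool would send these note numbers to a synthesizer; the sketch only shows the mapping itself.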
3

Diaz Merced, Wanda Liz. "Sound for the exploration of space physics data." Thesis, University of Glasgow, 2013. http://theses.gla.ac.uk/5804/.

Abstract:
Current analysis techniques for space physics 2D numerical data are based on scrutinising the data with the eyes. Space physics data sets acquired from the natural lab of the interstellar medium may contain events that are masked by noise, making them difficult to identify. This thesis presents research on the use of sound as an adjunct to current data visualisation techniques to explore, analyse and augment signatures in space physics data. This research presents a new sonification technique to decompose a space physics data set into different components of interest (frequency, oscillatory modes, etc.), and its use as an adjunct to data visualisation to explore and analyse space science data sets which are characterised by non-linearity (a system which does not satisfy the superposition principle, or whose output is not proportional to its input). Integrating aspects of multisensory perceptualization and human attention mechanisms, the question addressed by this dissertation is: does sound, used as an adjunct to current data visualisation, augment the perception of signatures in space physics data masked by noise? To answer this question, the following additional questions had to be answered: a) Is sound used as an adjunct to visualisation effective in increasing sensitivity to signals occurring at attended, unattended, or unexpected locations, extended in space, when the signal occurs in the presence of a dynamically changing competing cognitive load (noise) that makes the signal visually ambiguous? b) How can multimodal perceptualization (sound as an adjunct to visualisation) and attention control mechanisms be combined to help allocate attention to identify visually ambiguous signals? One aim of these questions is to investigate the effectiveness of the use of sound together with visual display to increase sensitivity to signal detection in the presence of visual noise in the data, as compared to visual display only.
Radio, particle, wave and high energy data are explored using a sonification technique developed as part of this research; the technique, its application and results are numerically validated and presented. This thesis presents the results of three experiments and of a training experiment. In all four experiments, the volunteers used sound as an adjunct to data visualisation to identify changes in graphical visual and audio representations, and these results are compared with those of using audio rendering only and visual rendering only. In the first experiment, audio rendering did not result in significant benefits when used alone or with a visual display. In the second and third experiments, audio as an adjunct to visual rendering became significant when a fourth cue was added to the spectra. The fourth cue consisted of a red line sweeping across the visual display at the rate the sound was played, to synchronise the audio and visual presentation. The results show that a third congruent multimodal stimulus in synchrony with the sound helps space scientists identify events masked by noise in 2D data. Results of the training experiments are also reported.
4

Durey, Adriane Swalm. "Melody spotting using hidden Markov models." Diss., Georgia Institute of Technology, 2003. Available online: http://etd.gatech.edu/theses/available/etd-04082004-180126/unrestricted/durey%5Fadriane%5Fs%5F200312%5Fphd.pdf.

5

Smith, Adrian Wilfrid. "A distributed approach to surround sound production." Thesis, Rhodes University, 1999. http://hdl.handle.net/10962/d1004855.

Abstract:
The requirement for multi-channel surround sound in audio production applications is growing rapidly. Audio processing in these applications can be costly, particularly in multi-channel systems. A distributed approach is proposed for the development of a realtime spatialization system for surround sound music production, using Ambisonic surround sound methods. The latency in the system is analyzed, with a focus on the audio processing and network delays, in order to ascertain the feasibility of an enhanced, distributed real-time spatialization system.
6

Stensholt, Håkon Meyer. "Sound Meets Type : Exploring the form generating qualities of sound as input for a new typography." Thesis, Konstfack, Grafisk Design & Illustration, 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:konstfack:diva-4761.

Abstract:
How can you create new letterforms using sound as input? In Sound Meets Type, I have studied the form-generating qualities of sound as input for a new typography. Throughout history, technological development has provoked new approaches to type design, which in turn have evolved letterforms. By using generative systems to search for letterforms in a contemporary and technological context, I have created custom software that uses the data inherent in sound as a form generator for possible new letterforms. The software is developed in JavaScript. The thesis consists of a written part and a creative part. The creative part is documented within this thesis.
7

Manekar, Vedang V. M. S. "Development of a Low-cost Data Acquisition System using a Sound Card." University of Cincinnati / OhioLINK, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1554121267971882.

8

Berglund, Alexander, Fredrik Herbai, and Jonas Wedén. "Sound Propagation Through Walls." Thesis, Uppsala universitet, Avdelningen för beräkningsvetenskap, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-444632.

Abstract:
Infrasound is undetectable by the human ear and excessive exposure may be a substantial health risk. Low frequency sound propagates through walls with minimal attenuation, making it difficult to avoid. This study interprets the results from both analytical calculations and simulations of pressure waves propagating through a wall in one dimension. The wall is thin compared to the wavelength; the model implements properties of three materials commonly used in walls. The results indicate that the geometry of the wall, most importantly the small ratio between wall width and wavelength, is the prime reason for the low levels of attenuation observed in transmitted amplitudes of low frequency sounds, and that damping is negligible for infrasound. Furthermore, a one-dimensional homogeneous wall model gives rise to periodicity in the transmitted amplitude, which is not observed in experiments. Future studies should prioritize the introduction of at least one more dimension to the model, to allow for variable angles of incidence.
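The observation that attenuation is governed by the small ratio between wall width and wavelength is consistent with the classic normal-incidence mass law for a thin wall. The sketch below is an illustration, not the thesis's one-dimensional wave simulation; the 10 kg/m² surface mass is an assumed value for a single sheet of plasterboard.

```python
import math

RHO_C = 1.21 * 343.0  # characteristic impedance of air [Pa*s/m], ~415

def mass_law_tl(freq_hz, surface_mass):
    """Normal-incidence mass-law transmission loss [dB] for a thin wall.

    surface_mass: wall mass per unit area [kg/m^2].
    Valid only when the wall is thin compared to the wavelength,
    as in the study above.
    """
    ratio = math.pi * freq_hz * surface_mass / RHO_C
    return 10.0 * math.log10(1.0 + ratio ** 2)

# Infrasound passes almost unattenuated; audible frequencies do not.
for f in (10.0, 100.0, 1000.0):
    print(f"{f:6.1f} Hz  TL = {mass_law_tl(f, 10.0):5.1f} dB")
```

For a 10 kg/m² panel this gives roughly 2 dB of loss at 10 Hz against tens of dB at 1 kHz, matching the abstract's conclusion that damping is negligible for infrasound.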
9

Ashraf, Pouya. "Improving Spatial Sound Capturing on Mobile Devices Through Fusion of Inertial Measurement Data." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-372151.

Abstract:
Through the use of sensor arrays it is possible to extract spatial information about signals present in the environment, for instance position, velocity, or distance. In this thesis we focus on the use of microphone arrays with the aim of accurately determining, and tracking over time, the Direction of Arrival (DoA) of nearby sound signals impinging on the array. As modern mobile devices like smartphones and tablets are commonly equipped with two or more microphones, these constitute a simple microphone array, which gives us the opportunity to attain the above aim. However, when the microphone array is not stationary, a number of problems arise. In this thesis we implement an algorithm to estimate the microphone array's orientation in three-dimensional space, with the aim of using these estimates to cancel the effect of the array's orientation on the DoA estimates. The cancellation is done with a dynamical model, which we use in a modified Kalman Filter (KF) capable of tracking an arbitrary number of sound sources over time. Additionally, we estimate the computational cost of the mentioned algorithms. The simulation results show satisfactory performance of the modified KF with respect to handling crossing trajectories and noise in the measurements. The chosen algorithm for orientation estimation proves susceptible to magnetic disturbances to the extent that its use in the context of mobile devices is undesirable. Due to this, the orientation estimates are provided by a proprietary algorithm. Experiments in which the DoA of the sound sources is computed using actual sound, together with orientation estimates, confirm the correctness of the proposed model.
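The orientation-compensation idea described above can be sketched as follows. This is a toy single-source version, not the thesis's implementation: the constant-velocity model and the noise covariances are assumptions. Each measured DoA has the device's estimated yaw subtracted before the Kalman update, so the filter tracks the source in a world-fixed frame.

```python
import numpy as np

F = np.array([[1.0, 1.0], [0.0, 1.0]])   # state transition (angle, rate)
H = np.array([[1.0, 0.0]])               # we observe the angle only
Q = 0.01 * np.eye(2)                     # process noise (assumed)
R = np.array([[4.0]])                    # measurement noise (assumed)

def kf_step(x, P, z):
    # Predict
    x = F @ x
    P = F @ P @ F.T + Q
    # Update
    y = z - H @ x
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ y
    P = (np.eye(2) - K @ H) @ P
    return x, P

x = np.array([[0.0], [0.0]])
P = np.eye(2)
# (measured DoA, device yaw) pairs; the true world-frame DoA is 90 deg.
for measured_doa, device_yaw in [(95.0, 5.0), (92.0, 2.0), (89.0, -1.0)]:
    z = np.array([[measured_doa - device_yaw]])   # cancel orientation
    x, P = kf_step(x, P, z)
print(float(x[0, 0]))  # estimate converges toward the true 90 deg DoA
```

The thesis's modified KF additionally handles multiple sources and data association; this sketch only shows the compensation and the filter recursion.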
10

Kesterton, Anthony James. "The synthesis of sound with application in a MIDI environment." Thesis, Rhodes University, 1991. http://hdl.handle.net/10962/d1006701.

Abstract:
Options for experimentation with the synthesis of sound are usually expensive, difficult to obtain, or limiting for the experimenter. The work described in this thesis shows how the IBM PC and software can be combined to provide a suitable platform for experimentation with different synthesis techniques. This platform is based on the PC, the Musical Instrument Digital Interface (MIDI) and a musical instrument called a digital sampler. The fundamental concepts of sound are described, with reference to digital sound reproduction. A number of synthesis techniques are described and evaluated according to the criteria of generality, efficiency and control. The techniques discussed are additive synthesis, frequency modulation synthesis, subtractive synthesis, granular synthesis, resynthesis, wavetable synthesis, and sampling. Spiral synthesis, physical modelling, waveshaping and spectral interpolation are discussed briefly. The Musical Instrument Digital Interface is a standard method of connecting digital musical instruments together, and it is the MIDI standard and equipment conforming to that standard that make this implementation of synthesis techniques possible. As a demonstration of the PC platform, additive synthesis, frequency modulation synthesis, granular synthesis and spiral synthesis have been implemented in software. A PC equipped with a MIDI interface card is used to perform the synthesis, and the MIDI protocol is used to transmit the resultant sound to a digital sampler. The INMOS transputer is used as an accelerator, as the calculation of a waveform in software is a computationally intensive process. It is concluded that sound synthesis can be performed successfully using a PC and the appropriate software, utilizing the facilities provided by a MIDI environment including a digital sampler.
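One of the techniques the thesis implements, frequency modulation synthesis, is simple enough to sketch in a few lines. This is a generic Chowning-style two-operator example, not the thesis's transputer-accelerated code; all parameter values are illustrative.

```python
import math

def fm_samples(fc, fm, index, dur, sr=44100):
    """Two-operator FM synthesis: a carrier at fc whose phase is
    modulated by a sine at fm, scaled by the modulation index."""
    n = int(dur * sr)
    return [
        math.sin(2 * math.pi * fc * t / sr
                 + index * math.sin(2 * math.pi * fm * t / sr))
        for t in range(n)
    ]

# 100 ms of a 440 Hz carrier modulated at 220 Hz with index 2 --
# a harmonic spectrum, since fc/fm is a simple integer ratio.
samples = fm_samples(440.0, 220.0, 2.0, 0.1)
print(len(samples))  # 4410
```

Varying the modulation index changes the brightness of the tone, which is what makes FM attractive for efficient synthesis of complex spectra.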
11

Hill, Robert M. "Model-data comparison of shallow water acoustic reverberation in the East China Sea." Thesis, Monterey, Calif. : Springfield, Va. : Naval Postgraduate School ; Available from National Technical Information Service, 2003. http://library.nps.navy.mil/uhtbin/hyperion-image/03sep%5FHill.pdf.

Abstract:
Thesis (M.S. in Engineering Acoustics)--Naval Postgraduate School, September 2003.
Thesis advisor(s): Kevin B. Smith, Daphne Kapolka. Includes bibliographical references (p. 69-71). Also available online.
12

Roemer, Jake. "Practical High-Coverage Sound Predictive Race Detection." The Ohio State University, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=osu1563505463237874.

13

Worrall, David. "SONIFICATION AND INFORMATION CONCEPTS, INSTRUMENTS AND TECHNIQUES." University of Canberra. Communication, 2009. http://erl.canberra.edu.au./public/adt-AUC20090818.142345.

Abstract:
This thesis is a study of sonification and information: what they are and how they relate to each other. The pragmatic purpose of the work is to support a new generation of software tools that can play an active role in research and practice involving the understanding of information structures found in potentially very large multivariate datasets. The theoretical component of the work involves a review of the way the concept of information has changed through Western culture, from the Ancient Greeks to recent collaborations between cognitive science and the philosophy of mind, with a particular emphasis on the phenomenology of immanent abstractions and how they might be supported and enhanced using sonification techniques. A new software framework is presented, together with several examples of its use in presenting sonifications of financial information, including that from a high-frequency securities-exchange trading-engine.
14

Lawrence, Daniel. "Sound change and social meaning : the perception and production of phonetic change in York, Northern England." Thesis, University of Edinburgh, 2018. http://hdl.handle.net/1842/31327.

Abstract:
This thesis investigates the relationship between social meaning and linguistic change. An important observation regarding spoken languages is that they are constantly changing: the way we speak differs from generation to generation. A second important observation is that spoken utterances convey social as well as denotational meaning: the way we speak communicates something about who we are. How, if at all, are these two characteristics of spoken languages related? Many sociolinguistic studies have argued that the social meaning of linguistic features is central to explaining the spread of linguistic innovations. A novel form might be heard as more prestigious than the older form, or it may become associated with specific social stereotypes relevant to the community in which the change occurs. It is argued that this association between a linguistic variant and social meaning leads speakers to adopt or reject the innovation, inhibiting or facilitating the spread of the change. In contrast, a number of scholars have argued that social meaning is epiphenomenal to many linguistic changes, which are instead driven by an automatic process of convergence in face-to-face interaction. The issue that such arguments raise is that many studies proposing a role of social meaning in the spread of linguistic innovations rely on production data as their primary source of evidence. Observing the variable adoption of innovations across different groups of speakers (e.g. by gender, ethnicity, or socioeconomic status), a researcher might draw on their knowledge of the social history of the community under study to infer the role of social meaning in that change. In many cases, the observed patterns could equally be explained by the social structure of the community under study, which constrains who speaks to whom. Are linguistic changes facilitated and inhibited by social meaning?
Or is it rather the case that social meaning arises as a consequence of linguistic change, without necessarily influencing the change itself? This thesis explores these questions through a study of vocalic change in York, Northern England, focusing on the fronting and diphthongization of the tense back vowels /u/ and /o/. It presents a systematic comparison of the social meanings listeners assign to innovations (captured using perceptual methods), their social attitudes with regard to those meanings (captured through sociolinguistic interviews), and their use of those forms in production (captured through acoustic analysis). It is argued that evidence of a consistent relationship between these factors would support the proposal that social meaning plays a role in linguistic change. The results of this combined analysis of sociolinguistic perception, social attitudes and speech production provide clear evidence of diachronic /u/ and /o/ fronting in this community, and show that variation in these two vowels is associated with a range of social meanings in perception. These meanings are underpinned by the notion of 'Broad Yorkshire' speech, a socially-recognized speech register linked to notions of authentic local identity and social class. Monophthongal /o/, diphthongal /u/, and back variants of both vowels are shown to be associated with this register, implying that a speaker who adopts an innovative form will likely be heard as less 'Broad'. However, there is no clear evidence that speakers' attitudes toward regional identity or social class have any influence on their adoption of innovations, nor that their ability to recognise the social meaning of fronting in perception is related to their production behaviour. The fronting of /u/ is spreading in a socially-uniform manner in production, unaffected by any social factor tested except for age.
The fronting of /o/ is conditioned by social network structure: speakers with more diverse social networks are more likely to adopt the innovative form, while speakers with closer social ties to York are more likely to retain a back variant. These findings demonstrate that York speakers hear back forms of /u/ and /o/ as more 'local' and 'working class' than fronter realizations, and express strong attitudes toward the values and practices associated with regional identity and social class. However, these factors do not appear to influence their adoption of linguistic innovations in any straightforward manner, in contrast to the predictions of an account of linguistic change in which social meaning plays a central role in facilitating or inhibiting the propagation of linguistic innovations. Based on these results, the thesis argues that many linguistic changes may spread through the production patterns of a speech community without the direct influence of social meaning, and advocates for the combined analysis of sociolinguistic perception, social attitudes and speech production in future work.
15

Kallionpää, Roosa. "Reciprocal sound transformations for computer supported collaborative jamming." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-280684.

Abstract:
Collaborative jamming with digital musical instruments (DMI) exposes a need for output synchronization. While temporal solutions have been established, a better understanding of how live sound transformations could be balanced across instruments is required. In this work, a technology probe for reciprocal sound transformations was designed and developed by networking the instruments of four musicians and employing layered mapping between a shared interface, high-level sound attributes, and the sound synthesis parameters of each instrument. The probe was designed and used during a series of participatory design workshops, where seven high-level attributes were constructed according to the spectromorphology framework. The analysis, where the notion of sonic narrative and the concept of flow were applied, reveals how live controlling reciprocal sound transformations facilitates collaboration by supporting role-taking, motivating the ensemble, and directing the focus of its members. While generality of the implemented attributes cannot be claimed, challenges of the chosen mapping strategy and requirements for the user interface were identified.
16

Wang, Xun. "Sound source localization with data and model uncertainties using the EM and Evidential EM algorithms." Thesis, Compiègne, 2014. http://www.theses.fr/2014COMP2164/document.

Abstract:
This work addresses the problem of multiple sound source localization for both deterministic and random signals measured by an array of microphones. The problem is solved in a statistical framework via maximum likelihood. The pressure measured by a microphone is interpreted as a mixture of latent signals emitted by the sources; then, both the sound source locations and strengths can be estimated using an expectation-maximization (EM) algorithm. In this thesis, two kinds of uncertainties are also considered: on the microphone locations and on the wave number. These uncertainties are transposed to the data in the belief functions framework. Then, the source locations and strengths can be estimated using a variant of the EM algorithm, known as Evidential EM (E2M) algorithm. The first part of this work begins with the deterministic signal model without consideration of uncertainty. The EM algorithm is then used to estimate the source locations and strengths : the update equations for the model parameters are provided. Furthermore, experimental results are presented and compared with the beamforming and the statistically optimized near-field holography (SONAH), which demonstrates the advantage of the EM algorithm. The second part raises the issue of model uncertainty and shows how the uncertainties on microphone locations and wave number can be taken into account at the data level. In this case, the notion of the likelihood is extended to the uncertain data. Then, the E2M algorithm is used to solve the sound source estimation problem. In both the simulation and real experiment, the E2M algorithm proves to be more robust in the presence of model and data uncertainty. The third part of this work considers the case of random signals, in which the amplitude is modeled by a Gaussian random variable. Both the certain and uncertain cases are investigated. In the former case, the EM algorithm is employed to estimate the sound sources. 
In the latter case, microphone location and wave number uncertainties are quantified similarly to the second part of the thesis. Finally, the source locations and the variance of the random amplitudes are estimated using the E2M algorithm
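The EM idea underlying this abstract can be illustrated with a toy one-dimensional analogue (not the thesis's acoustic model; equal weights and unit variances are assumed): observations are treated as a mixture of two latent "sources", and EM alternates between softly assigning samples to sources (E-step) and re-estimating the source locations (M-step).

```python
import math, random

random.seed(0)
# Toy data: two "sources" at locations -2 and 3, observed with unit noise.
data = [random.gauss(-2.0, 1.0) for _ in range(200)] + \
       [random.gauss(3.0, 1.0) for _ in range(200)]

def em_two_means(x, mu, iters=50):
    """EM for a two-component, equal-weight, unit-variance Gaussian
    mixture: a toy analogue of estimating two source locations."""
    m1, m2 = mu
    for _ in range(iters):
        # E-step: responsibility of component 1 for each sample
        r = [1.0 / (1.0 + math.exp(-((xi - m2) ** 2 - (xi - m1) ** 2) / 2.0))
             for xi in x]
        # M-step: responsibility-weighted means
        m1 = sum(ri * xi for ri, xi in zip(r, x)) / sum(r)
        m2 = sum((1 - ri) * xi for ri, xi in zip(r, x)) / sum(1 - ri for ri in r)
    return m1, m2

m1, m2 = em_two_means(data, (0.0, 1.0))
print(round(m1, 2), round(m2, 2))  # close to the true locations -2 and 3
```

The E2M variant used in the thesis replaces the precise observations with belief functions, but the alternating E/M structure is the same.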
17

Laubscher, Robert Alan. "An investigation into the use of IEEE 1394 for audio and control data distribution in music studio environments." Thesis, Rhodes University, 1999. http://hdl.handle.net/10962/d1006483.

Abstract:
This thesis investigates the feasibility of using a new digital interconnection technology, the IEEE-1394 High Performance Serial Bus, for audio and control data distribution in local and remote music recording studio environments. Current methods for connecting studio devices are described, and the need for a new digital interconnection technology explained. It is shown how this new interconnection technology and developing protocol standards make provision for multi-channel audio and control data distribution, routing, copyright protection, and device synchronisation. Feasibility is demonstrated by the implementation of a custom hardware and software solution. Remote music studio connectivity is considered, and the emerging standards and technologies for connecting future music studio utilising this new technology are discussed.
18

Asplund, Ingeborg. "Songs of Transistor : A study of sound design in video games." Thesis, Södertörns högskola, Medieteknik, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:sh:diva-32944.

Abstract:
While there is a lot of research about other aspects of game design, there are fairly few studies about music and sound in video games. Since music and sound are components of nearly all games, it is interesting to investigate how this aspect affects the perceived immersion of gamers. The aim of this study is to investigate how sound and music affect players' sense of presence in a video game, Transistor [19], which was chosen due to its distinct and strongly emotional music and sound. Five video prototypes were made using gameplay and sound from the game. The videos presented different variations of the soundscape. These were tested in a web survey with questions from the PENS questionnaire [15], providing the users with a seven-point Likert scale by which they could rate their experience. The answers were analyzed with a mixed-model regression and compared with an estimate of the degree of immersion that would be experienced for each of the videos. The result of the study showed that the complete soundscape was significantly more immersive than all the other soundscapes, while silence was significantly less immersive than the other soundscapes. The conclusions were that the more complete the soundscape is, the more immersive it is, and that even a small part of the total soundscape is more immersive than complete silence.
APA, Harvard, Vancouver, ISO, and other styles
19

Ching, Kai-Sang. "Priority CSMA schemes for integrated voice and data transmission." Thesis, University of British Columbia, 1988. http://hdl.handle.net/2429/28372.

Full text
Abstract:
Priority schemes employing the inherent properties of carrier-sense multiple-access (CSMA) schemes are investigated and then applied to the integrated transmission of voice and data. A priority scheme composed of 1-persistent and non-persistent CSMA protocols is proposed. The throughput and delay characteristics of this protocol are evaluated by mathematical analysis and simulation, respectively. The throughput analysis is further extended to a more general case, p-persistent CSMA with two persistency factors, whose throughput performance had not been analyzed before. Simulations are carried out to study the delay characteristics of this protocol. After careful consideration of the features of the priority schemes studied, two protocols are proposed for integrated voice and data transmission. While their ultimate purpose is integrated services, they have different applications: one is applied to local area networks; the other is suitable for packet radio networks. The distinctive features of the former are simplicity and flexibility. The latter differs from other studies in that collision detection is not required, and in that it has a small mean and variance of voice packet delay. Performance characteristics of both protocols are examined by simulations under various system parameter values.
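The trade-off between the two constituent protocols can be illustrated with the standard Kleinrock-Tobagi throughput expressions. This is a textbook sketch, not the thesis' extended two-factor analysis; the propagation-delay parameter `a` and the load values are illustrative:

```python
import math

def throughput_nonpersistent(G, a=0.01):
    """Throughput S of non-persistent CSMA (Kleinrock-Tobagi) for offered
    load G and normalized propagation delay a."""
    return G * math.exp(-a * G) / (G * (1.0 + 2.0 * a) + math.exp(-a * G))

def throughput_1persistent(G):
    """Throughput S of 1-persistent CSMA for negligible propagation delay."""
    return G * (1.0 + G) * math.exp(-G) / (G + math.exp(-G))

# 1-persistent wins at light load, non-persistent at heavy load:
for G in (0.5, 1.0, 2.0, 5.0):
    print(f"G={G:3.1f}  non-persistent={throughput_nonpersistent(G):.3f}  "
          f"1-persistent={throughput_1persistent(G):.3f}")
```

The crossover in throughput between the two schemes is what makes a hybrid of them usable as a priority mechanism.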
Applied Science, Faculty of
Electrical and Computer Engineering, Department of
Graduate
APA, Harvard, Vancouver, ISO, and other styles
20

Gruhn, Michael [Verfasser], and Felix [Gutachter] Freiling. "Forensically Sound Data Acquisition in the Age of Anti-Forensic Innocence / Michael Gruhn ; Gutachter: Felix Freiling." Erlangen : Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), 2016. http://d-nb.info/1122350279/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
21

Wijnans, Hortense. "The body as a spatial sound generating instrument : defining the three dimensional data interpreting methodology (3DIM)." Thesis, Bath Spa University, 2010. http://researchspace.bathspa.ac.uk/1483/.

Full text
APA, Harvard, Vancouver, ISO, and other styles
22

Holmes, Jason. "Measuring the accuracy of four attributes of sound for conveying changes in a large data set." Thesis, University of North Texas, 2003. https://digital.library.unt.edu/ark:/67531/metadc4154/.

Full text
Abstract:
Human auditory perception is suited to receiving and interpreting information from the environment, but this knowledge has not been used extensively in designing computer-based information exploration tools. It is not known which aspects of sound are useful for accurately conveying information in an auditory display. An auditory display was created using PD, a graphical programming language used primarily to manipulate digital sound. The interface for the auditory display was a blank window. When the cursor was moved around in this window, the sound generated would change based on the underlying data value at any given point. An experiment was conducted to determine which attribute of sound most accurately represents data values in an auditory display. The four attributes of sound tested were frequency-sine waveform, frequency-sawtooth waveform, loudness, and tempo. 24 subjects were given the task of finding the highest data point using sound alone under each of the four sound treatments. Three dependent variables were measured: distance accuracy, numeric accuracy, and time on task. Repeated-measures ANOVA procedures conducted on these variables did not rise to the level of statistical significance (α=.05). None of the sound treatments was more accurate than the others at representing the underlying data values. 52% of the trials were accurate within 50 pixels of the highest data point (target). An interesting finding was the tendency for the frequency-sine waveform to be used in the least accurate trial attempts (38%). Loudness, on the other hand, accounted for very few (12.5%) of the least accurate trial attempts. In completing the experimental task, four different search techniques were employed by the subjects: perimeter, parallel sweep, sector, and quadrant. The perimeter technique was the most commonly used.
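Each treatment maps the data value under the cursor to a single sound attribute. A minimal sketch of such a mapping; the output ranges are illustrative assumptions, as the abstract does not report the exact ranges used in the PD patch:

```python
def map_value_to_sound(value, attribute):
    """Map a normalized data value in [0, 1] to one of the four sound
    attributes tested. Ranges are illustrative assumptions, not the
    thesis' actual parameters."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("value must be normalized to [0, 1]")
    if attribute in ("frequency-sine", "frequency-sawtooth"):
        # exponential pitch mapping over two octaves, 220-880 Hz
        return 220.0 * 2.0 ** (2.0 * value)
    if attribute == "loudness":
        # linear amplitude, 0 (silent) to 1 (full scale)
        return value
    if attribute == "tempo":
        # repetition rate in beats per minute, 60-240
        return 60.0 + 180.0 * value
    raise ValueError(f"unknown attribute: {attribute}")
```

An exponential pitch mapping is chosen here because pitch perception is roughly logarithmic in frequency; a linear mapping would compress the perceived range at the high end.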
APA, Harvard, Vancouver, ISO, and other styles
23

Ballora, Mark. "Data analysis through auditory display : applications in heart rate variability." Thesis, McGill University, 2000. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=35463.

Full text
Abstract:
This thesis draws from music technology to create novel sonifications of heart rate information that may be of clinical utility to physicians. Current visually-based methods of analysis involve filtering the data, so that by definition some aspects are illuminated at the expense of others, which are decimated. However, earlier research has demonstrated the suitability of the auditory system for following multiple streams of information. With this in mind, sonification may offer a means to display a potentially unlimited number of signal processing operations simultaneously, allowing correlations among various analytical techniques to be observed. This study proposes a flexible listening environment in which a cardiologist or researcher may adjust the rate of playback and relative levels of several parallel sonifications that represent different processing operations. Each sonification "track" is meant to remain perceptually segregated so that the listener may create an optimal audio mix. A distinction is made between parameters that are suited for illustrating information and parameters that carry less perceptual weight, which are employed as stream separators. The proposed sonification model is assessed with a perception test in which participants are asked to identify four different cardiological conditions by auditory and visual displays. The results show a higher degree of accuracy in the identification of obstructive sleep apnea by the auditory displays than by visual displays. The sonification model is then fine-tuned to reflect unambiguously the oscillatory characteristics of sleep apnea that may not be evident from a visual representation. Since the identification of sleep apnea through the heart rate is a current priority in cardiology, it is thus feasible that sonification could become a valuable component in apnea diagnosis.
APA, Harvard, Vancouver, ISO, and other styles
24

Klinkradt, Bradley Hugh. "An investigation into the application of the IEEE 1394 high performance serial bus to sound installation control." Thesis, Rhodes University, 2003. http://hdl.handle.net/10962/d1004899.

Full text
Abstract:
This thesis investigates the feasibility of using existing IP-based control and monitoring protocols within professional audio installations utilising IEEE 1394 technology. Current control and monitoring technologies are examined, and the characteristics common to all are extracted and compiled into an object model. This model forms the foundation for a set of evaluation criteria against which current and future control and monitoring protocols may be measured. Protocols considered include AV/C, MIDI, QSC-24, and those utilised within the UPnP architecture. As QSC-24 and the UPnP architecture are IP-based, the facilities required to transport IP datagrams over the IEEE 1394 bus are investigated and implemented. Example QSC-24 and UPnP architecture implementations are described, which permit the control and monitoring of audio devices over the IEEE 1394 network using these IP-based technologies. The way forward for the control and monitoring of professional audio devices within installations is considered, and recommendations are provided.
APA, Harvard, Vancouver, ISO, and other styles
25

Andersson, López Lisa. "SENSITIV – Mapping Design of Movement Data to Sound Parameters when Creating a Sonic Interaction Design Tool for Interactive Dance." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-280699.

Full text
Abstract:
Technology has during the last decades been adopted into the dance art form, appearing as interactive dance. Many studies and performances have investigated this merging of dance and technology and the mapping of motion data to other modalities. But none have previously explored how the introduction of technology affects the mutually interdependent relationship, the co-play, between a dancer and a live musician in a completely live setting. This thesis explores this novel setting by investigating which sound parameters of a live drummer’s sound a dancer should be able to manipulate, through the use of motion tracking sensors, to alter the dancer’s experience positively compared with not using the tool. For this purpose, two studies were conducted. First, a development study was conducted to create a prototype from the first-person perspective of a professional dancer and choreographer. Second, an evaluative study was conducted to evaluate the applicability of the prototype, and the experience of manipulating the chosen sound parameters, with a larger group of professional dancers. The studies showed that the sound parameters of delay and pitch altered a dance experience most positively. This thesis further shows that it is important for the user to get enough time to truly get to know the interactions allowed by the system, to be able to evaluate the experience of the sound parameters.
APA, Harvard, Vancouver, ISO, and other styles
26

Rutz, Hanns Holger. "Tracing the compositional process : sound art that rewrites its own past : formation, praxis and a computer framework." Thesis, University of Plymouth, 2014. http://hdl.handle.net/10026.1/3116.

Full text
Abstract:
The domain of this thesis is electroacoustic computer-based music and sound art. It investigates a facet of composition which is often neglected or ill-defined: the process of composing itself and its embedding in time. Previous research mostly focused on instrumental composition or, when electronic music was included, the computer was treated as a tool which would eventually be subtracted from the equation. The aim was either to explain a resultant piece of music by reconstructing the intention of the composer, or to explain human creativity by building a model of the mind. Our aim instead is to understand composition as an irreducible unfolding of material traces which takes place in its own temporality. This understanding is formalised as a software framework that traces creation time as a version graph of transactions. The instantiation and manipulation of any musical structure implemented within this framework is thereby automatically stored in a database. Not only can it be queried ex post by an external researcher—providing a new quality for the empirical analysis of the activity of composing—but it is an integral part of the composition environment. Therefore it can recursively become a source for the ongoing composition and introduce new ways of aesthetic expression. The framework aims to unify creation and performance time, fixed and generative composition, human and algorithmic “writing”, a writing that includes indeterminate elements which condense as concurrent vertices in the version graph. The second major contribution is a critical epistemological discourse on the question of observability and the function of observation. Our goal is to explore a new direction of artistic research which is characterised by a mixed methodology of theoretical writing, technological development and artistic practice.
The form of the thesis is an exercise in becoming process-like itself, wherein the epistemic thing is generated by translating the gaps between these three levels. This is my idea of the new aesthetics: That through the operation of a re-entry one may establish a sort of process “form”, yielding works which go beyond a categorical either “sound-in-itself” or “conceptualism”. Exemplary processes are revealed by deconstructing a series of existing pieces, as well as through the successful application of the new framework in the creation of new pieces.
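The version graph of transactions described above can be sketched as a small data structure in which every change is committed as a new vertex recording which earlier versions it derives from. Names and API here are hypothetical illustrations, not taken from the actual framework:

```python
import itertools

class VersionGraph:
    """Minimal sketch of a version graph of transactions. Every mutation of
    a musical structure is committed as a new vertex that points at the
    versions it derives from, so the creation process itself is queryable."""

    def __init__(self):
        self._ids = itertools.count()
        self.vertices = {}  # version id -> (payload, parent ids)

    def commit(self, payload, parents=()):
        """Record one transaction and return its version id."""
        vid = next(self._ids)
        self.vertices[vid] = (payload, tuple(parents))
        return vid

    def history(self, vid):
        """The full set of ancestors of a version (including itself),
        i.e. the trace of its creation, queryable ex post."""
        seen, stack = set(), [vid]
        while stack:
            v = stack.pop()
            if v not in seen:
                seen.add(v)
                stack.extend(self.vertices[v][1])
        return seen
```

A vertex with two parents models a merge, which is one way the concurrent, indeterminate branches mentioned in the abstract could condense in the graph.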
APA, Harvard, Vancouver, ISO, and other styles
27

Björklund, Staffan. "Normativa data för samband mellan subglottalt tryck och ljudtrycksnivå." Thesis, Uppsala universitet, Logopedi, 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-217727.

Full text
Abstract:
The purpose of this study was to examine the relationship between subglottal pressure and sound pressure level (SPL), and to study the importance of gender and fundamental frequency in this relationship. Vocal loudness is strongly dependent on subglottal pressure. The relation between them has been analyzed in several investigations, all showing a linear relationship between the SPL and the log of the pressure. For example, Schutte (1980) analyzed the relation in 21 female and 24 male subjects who produced a great number of samples at different degrees of vocal loudness and at the subjects' preferred pitch. Pressure was measured by means of an esophageal balloon. Tanaka and Gould (1982) analyzed 10 subjects, each producing vowels at three loudness levels at comfortable pitch. Pressure data were obtained from a plethysmograph, with the subject sitting in an airtight box. Pressed phonation is characterized by a high subglottal pressure producing a comparatively low SPL, so the pressure-SPL relationship would be affected by glottal adduction and possibly also by F0. Therefore, normative data from healthy voices should be of interest. In the present study, 16 female and 15 male subjects with normal voices were asked to produce diminuendo and crescendo sequences of the syllable [pæ] at four pitches, equidistantly spaced within an octave. Trendlines were used to approximate the relation between SPL and the log of subglottal pressure. The resulting regression equations were used to calculate the average SPL increase for a doubling of pressure and the SPL produced by a pressure of 10 cm H2O. The results showed average correlation coefficients of 0.835 and 0.826 for female and male subjects, respectively. A doubled pressure produced an SPL increase of 11.5 dB (SD 3.8) and 10.0 dB (SD 2.7) for the female and the male voices. The difference between female and male voices was significant, which supports the use of separate normative values for female and male voices. 
On average, a subglottal pressure of 10 cm H2O produced an SPL at 0.15 m of 83.6 dB (SD 3.9) and 82.2 dB (SD 4.6) for the female and the male voices. The relationship between subglottal pressure and SPL depended somewhat on fundamental frequency, but the difference was not significant. In spite of the relatively high standard errors, the results indicate that it would be worth studying to what extent deviations from the potentially normative values of this study may be a sign of some sort of phonatory dysfunction.
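The reported regression can be restated as a log-linear predictor: SPL rises by a fixed number of dB per doubling of subglottal pressure. A minimal sketch using the reported group means (the intercept and slope come from the abstract; the function name and form are our restatement):

```python
import math

def predicted_spl(pressure_cm_h2o, spl_at_10=83.6, db_per_doubling=11.5):
    """SPL (dB at 0.15 m) predicted from subglottal pressure (cm H2O) using
    the reported log-linear relation. Defaults are the female group means;
    pass 82.2 and 10.0 for the male group means."""
    return spl_at_10 + db_per_doubling * math.log2(pressure_cm_h2o / 10.0)

print(predicted_spl(20.0))              # doubling from 10 cm H2O: about 95.1 dB
print(predicted_spl(10.0, 82.2, 10.0))  # male group mean at 10 cm H2O
```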
APA, Harvard, Vancouver, ISO, and other styles
28

Melih, Kathy. "Audio Source Separation Using Perceptual Principles for Content-Based Coding and Information Management." Griffith University. School of Information Technology, 2004. http://www4.gu.edu.au:8080/adt-root/public/adt-QGU20050114.081327.

Full text
Abstract:
The information age has brought with it a dual problem. In the first place, the ready access to mechanisms to capture and store vast amounts of data in all forms (text, audio, image and video), has resulted in a continued demand for ever more efficient means to store and transmit this data. In the second, the rapidly increasing store demands effective means to structure and access the data in an efficient and meaningful manner. In terms of audio data, the first challenge has traditionally been the realm of audio compression research that has focused on statistical, unstructured audio representations that obfuscate the inherent structure and semantic content of the underlying data. This has only served to further complicate the resolution of the second challenge resulting in access mechanisms that are either impractical to implement, too inflexible for general application or too low level for the average user. Thus, an artificial dichotomy has been created from what is in essence a dual problem. The founding motivation of this thesis is that, although the hypermedia model has been identified as the ideal, cognitively justified method for organising data, existing audio data representations and coding models provide little, if any, support for, or resemblance to, this model. It is the contention of the author that any successful attempt to create hyperaudio must resolve this schism, addressing both storage and information management issues simultaneously. In order to achieve this aim, an audio representation must be designed that provides compact data storage while, at the same time, revealing the inherent structure of the underlying data. Thus it is the aim of this thesis to present a representation designed with these factors in mind. Perhaps the most difficult hurdle in the way of achieving the aims of content-based audio coding and information management is that of auditory source separation. 
The MPEG committee has noted this requirement during the development of its MPEG-7 standard, however, the mechanics of "how" to achieve auditory source separation were left as an open research question. This same committee proposed that MPEG-7 would "support descriptors that can act as handles referring directly to the data, to allow manipulation of the multimedia material." While meta-data tags are a part solution to this problem, these cannot allow manipulation of audio material down to the level of individual sources when several simultaneous sources exist in a recording. In order to achieve this aim, the data themselves must be encoded in such a manner that allows these descriptors to be formed. Thus, content-based coding is obviously required. In the case of audio, this is impossible to achieve without effecting auditory source separation. Auditory source separation is the concern of computational auditory scene analysis (CASA). However, the findings of CASA research have traditionally been restricted to a limited domain. To date, the only real application of CASA research to what could loosely be classified as information management has been in the area of signal enhancement for automatic speech recognition systems. In these systems, a CASA front end serves as a means of separating the target speech from the background "noise". As such, the design of a CASA-based approach, as presented in this thesis, to one of the most significant challenges facing audio information management research represents a significant contribution to the field of information management. Thus, this thesis unifies research from three distinct fields in an attempt to resolve some specific and general challenges faced by all three. It describes an audio representation that is based on a sinusoidal model from which low-level auditory primitive elements are extracted. 
The use of a sinusoidal representation is somewhat contentious with the modern trend in CASA research tending toward more complex approaches in order to resolve issues relating to co-incident partials. However, the choice of a sinusoidal representation has been validated by the demonstration of a method to resolve many of these issues. The majority of the thesis contributes several algorithms to organise the low-level primitives into low-level auditory objects that may form the basis of nodes or link anchor points in a hyperaudio structure. Finally, preliminary investigations in the representation’s suitability for coding and information management tasks are outlined as directions for future research.
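The sinusoidal model underlying the representation starts from spectral peak picking on windowed analysis frames. A bare-bones sketch of that front end; the window, FFT size and threshold are illustrative assumptions, not the thesis' actual analysis parameters:

```python
import numpy as np

def sinusoidal_peaks(frame, sr, threshold_db=-60.0):
    """Pick spectral peaks from one audio frame as (frequency, magnitude)
    pairs: a minimal sinusoidal-model front end. frame is a 1-D array of
    samples, sr the sample rate in Hz."""
    n = len(frame)
    spectrum = np.fft.rfft(frame * np.hanning(n))
    mag = np.abs(spectrum)
    mag_db = 20.0 * np.log10(mag + 1e-12)
    peaks = []
    for k in range(1, len(mag) - 1):
        # a peak is a local maximum above the magnitude threshold
        if mag_db[k] > threshold_db and mag[k] > mag[k - 1] and mag[k] >= mag[k + 1]:
            peaks.append((k * sr / n, mag[k]))
    return peaks
```

Tracking such peaks across frames yields the partial trajectories from which low-level auditory primitives can be grouped; resolving co-incident partials, as the abstract notes, requires further machinery beyond this sketch.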
APA, Harvard, Vancouver, ISO, and other styles
29

Melih, Kathy. "Audio Source Separation Using Perceptual Principles for Content-Based Coding and Information Management." Thesis, Griffith University, 2004. http://hdl.handle.net/10072/366279.

Full text
Abstract:
The information age has brought with it a dual problem. In the first place, the ready access to mechanisms to capture and store vast amounts of data in all forms (text, audio, image and video), has resulted in a continued demand for ever more efficient means to store and transmit this data. In the second, the rapidly increasing store demands effective means to structure and access the data in an efficient and meaningful manner. In terms of audio data, the first challenge has traditionally been the realm of audio compression research that has focused on statistical, unstructured audio representations that obfuscate the inherent structure and semantic content of the underlying data. This has only served to further complicate the resolution of the second challenge resulting in access mechanisms that are either impractical to implement, too inflexible for general application or too low level for the average user. Thus, an artificial dichotomy has been created from what is in essence a dual problem. The founding motivation of this thesis is that, although the hypermedia model has been identified as the ideal, cognitively justified method for organising data, existing audio data representations and coding models provide little, if any, support for, or resemblance to, this model. It is the contention of the author that any successful attempt to create hyperaudio must resolve this schism, addressing both storage and information management issues simultaneously. In order to achieve this aim, an audio representation must be designed that provides compact data storage while, at the same time, revealing the inherent structure of the underlying data. Thus it is the aim of this thesis to present a representation designed with these factors in mind. Perhaps the most difficult hurdle in the way of achieving the aims of content-based audio coding and information management is that of auditory source separation. 
The MPEG committee has noted this requirement during the development of its MPEG-7 standard, however, the mechanics of "how" to achieve auditory source separation were left as an open research question. This same committee proposed that MPEG-7 would "support descriptors that can act as handles referring directly to the data, to allow manipulation of the multimedia material." While meta-data tags are a part solution to this problem, these cannot allow manipulation of audio material down to the level of individual sources when several simultaneous sources exist in a recording. In order to achieve this aim, the data themselves must be encoded in such a manner that allows these descriptors to be formed. Thus, content-based coding is obviously required. In the case of audio, this is impossible to achieve without effecting auditory source separation. Auditory source separation is the concern of computational auditory scene analysis (CASA). However, the findings of CASA research have traditionally been restricted to a limited domain. To date, the only real application of CASA research to what could loosely be classified as information management has been in the area of signal enhancement for automatic speech recognition systems. In these systems, a CASA front end serves as a means of separating the target speech from the background "noise". As such, the design of a CASA-based approach, as presented in this thesis, to one of the most significant challenges facing audio information management research represents a significant contribution to the field of information management. Thus, this thesis unifies research from three distinct fields in an attempt to resolve some specific and general challenges faced by all three. It describes an audio representation that is based on a sinusoidal model from which low-level auditory primitive elements are extracted. 
The use of a sinusoidal representation is somewhat contentious with the modern trend in CASA research tending toward more complex approaches in order to resolve issues relating to co-incident partials. However, the choice of a sinusoidal representation has been validated by the demonstration of a method to resolve many of these issues. The majority of the thesis contributes several algorithms to organise the low-level primitives into low-level auditory objects that may form the basis of nodes or link anchor points in a hyperaudio structure. Finally, preliminary investigations in the representation’s suitability for coding and information management tasks are outlined as directions for future research.
Thesis (PhD Doctorate)
Doctor of Philosophy (PhD)
School of Information Technology
Full Text
APA, Harvard, Vancouver, ISO, and other styles
30

Reis, Sofia Ester Pereira. "Expanding the magic circle in pervasive casual play." Doctoral thesis, Faculdade de Ciências e Tecnologia, 2013. http://hdl.handle.net/10362/11352.

Full text
Abstract:
Dissertation for obtaining the degree of Doctor in Informatics
In this document we present proposals for merging the fictional game world with the real world, taking into account the profile of casual players. To merge games with reality we resorted to the creation of games that explore diverse real world elements. We focused on sound, video, physiological data, accelerometer data, weather and location. We chose these real world elements because data about them can be acquired using functionality already available, or foreseen in the near future, in devices like computers or mobile phones, thus fitting the profile of casual players, who are usually not willing to invest in expensive or specialized hardware just for the sake of playing a game. By resorting to real world elements, the screen is no longer the only focus of the player’s attention because reality also influences the outcome of the game. Here, we describe how the insertion of real world elements affected the role of the screen as the primary focus of the player’s attention. Games happen inside a magic circle that spatially and temporally delimits the game from the ordinary world. J. Huizinga, the inventor of the magic circle concept, also leaves implicit a social demarcation, separating who is playing the game from who is not playing the game [1]. In this document, we show how the insertion of real world elements blurred the spatial, temporal and social limits in our games. Through this fusion with the ordinary world, the fictional game world integrates with reality, instead of being isolated from it. We also present an analysis of integration with the real world and context data in casual entertainment.
Fundação para a Ciência e a Tecnologia - grant SFRH/BD/61085/2009
APA, Harvard, Vancouver, ISO, and other styles
31

Fonseca, Eduardo. "Training sound event classifiers using different types of supervision." Doctoral thesis, Universitat Pompeu Fabra, 2021. http://hdl.handle.net/10803/673067.

Full text
Abstract:
The automatic recognition of sound events has gained attention in the past few years, motivated by emerging applications in fields such as healthcare, smart homes, or urban planning. When the work for this thesis started, research on sound event classification was mainly focused on supervised learning using small datasets, often carefully annotated with vocabularies limited to specific domains (e.g., urban or domestic). However, such small datasets do not support training classifiers able to recognize hundreds of sound events occurring in our everyday environment, such as kettle whistles, bird tweets, cars passing by, or different types of alarms. At the same time, large amounts of environmental sound data are hosted in websites such as Freesound or YouTube, which can be convenient for training large-vocabulary classifiers, particularly using data-hungry deep learning approaches. To advance the state-of-the-art in sound event classification, this thesis investigates several strands of dataset creation as well as supervised and unsupervised learning to train large-vocabulary sound event classifiers, using different types of supervision in novel and alternative ways. Specifically, we focus on supervised learning using clean and noisy labels, as well as self-supervised representation learning from unlabeled data. The first part of this thesis focuses on the creation of FSD50K, a large-vocabulary dataset with over 100h of audio manually labeled using 200 classes of sound events. We provide a detailed description of the creation process and a comprehensive characterization of the dataset. In addition, we explore architectural modifications to increase shift invariance in CNNs, improving robustness to time/frequency shifts in input spectrograms. In the second part, we focus on training sound event classifiers using noisy labels. First, we propose a dataset that supports the investigation of real label noise. 
Then, we explore network-agnostic approaches to mitigate the effect of label noise during training, including regularization techniques, noise-robust loss functions, and strategies to reject noisy labeled examples. Further, we develop a teacher-student framework to address the problem of missing labels in sound event datasets. In the third part, we propose algorithms to learn audio representations from unlabeled data. In particular, we develop self-supervised contrastive learning frameworks, where representations are learned by comparing pairs of examples computed via data augmentation and automatic sound separation methods. Finally, we report on the organization of two DCASE Challenge Tasks on automatic audio tagging with noisy labels. By providing data resources as well as state-of-the-art approaches and audio representations, this thesis contributes to the advancement of open sound event research, and to the transition from traditional supervised learning using clean labels to other learning strategies less dependent on costly annotation efforts.
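The contrastive frameworks mentioned above learn representations by pulling the embeddings of two views of the same clip together while pushing apart the rest of the batch. The following is a minimal stdlib-only sketch of an NT-Xent-style objective, offered as an illustration of the general technique rather than the thesis's actual implementation; the batch data and embedding dimensions are invented:

```python
import math
import random

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(a * a for a in v))
    return dot / (nu * nv)

def nt_xent_loss(views_a, views_b, temperature=0.1):
    """NT-Xent loss over a batch of positive pairs.

    views_a[i] and views_b[i] are embeddings of two 'views' of the same
    audio clip (e.g. two augmentations, or two separated sources).
    Every other embedding in the batch serves as a negative.
    """
    n = len(views_a)
    embeddings = views_a + views_b
    loss = 0.0
    for i in range(n):
        for anchor, positive in ((i, n + i), (n + i, i)):
            denom = sum(math.exp(cosine(embeddings[anchor], embeddings[j]) / temperature)
                        for j in range(2 * n) if j != anchor)
            pos = math.exp(cosine(embeddings[anchor], embeddings[positive]) / temperature)
            loss += -math.log(pos / denom)
    return loss / (2 * n)

# Toy batch: pairs of nearly identical vectors should give a low loss.
random.seed(0)
base = [[random.gauss(0, 1) for _ in range(8)] for _ in range(4)]
jitter = [[x + random.gauss(0, 0.01) for x in v] for v in base]
print(round(nt_xent_loss(base, jitter), 3))
```

In practice the two views would come from data augmentation or automatic sound separation applied to the same clip, and the loss would be minimized by training a deep encoder rather than evaluated on fixed vectors.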
APA, Harvard, Vancouver, ISO, and other styles
32

Thibault, François. "High-level control of singing voice timbre transformations." Thesis, McGill University, 2004. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=81514.

Full text
Abstract:
The sustained increase in computing performance over the last decades has brought enough computing power to perform significant audio processing in affordable personal computers. Following this revolution, we have witnessed a series of improvements in sound transformation techniques and the introduction of numerous digital audio effects to modify effectively the time, pitch, and loudness dimensions of audio signals. Due to the complex and multi-dimensional nature of timbre however, it is significantly more difficult to achieve meaningful and convincing qualitative transformations. The tools currently available for timbre modifications (e.g. equalizers) do not operate along perceptually meaningful axes of singing voice timbre (e.g. breathiness, roughness, etc.) resulting in a transformation control problem. One of the goals of this work is to examine more intuitive procedures to achieve high-fidelity qualitative transformations explicitly controlling certain dimensions of singing voice timbre. Quantitative measurements (i.e. voice timbre descriptors) are introduced and used as high-level controls in an adaptive processing system dependent on the characteristics observed in the input signal.
The transformation methods use a harmonic plus noise representation from which voice timbre descriptors are derived. This higher-level representation, closer to our perception of voice timbre, offers more intuitive controls over timbre transformations. The topics of parametric voice modeling and timbre descriptor computation are first introduced, followed by a study of the acoustical impacts of voice breathiness variations. A timbre transformation system operating specifically on the singing voice quality is then introduced with accompanying software implementations, including an example digital audio effect for the control and modification of the breathiness quality on normal voices.
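Because the descriptors are derived from a harmonic plus noise representation, a crude example of such a measurement is the harmonic-to-noise energy ratio, which loosely tracks breathiness. This is an illustrative stand-in, not one of the thesis's actual descriptors, and the decomposition itself is assumed to have been performed already:

```python
import math

def rms(signal):
    # Root-mean-square energy of a frame.
    return math.sqrt(sum(s * s for s in signal) / len(signal))

def harmonic_to_noise_ratio_db(harmonic, noise):
    """Energy ratio (in dB) between the harmonic and noise components of a
    harmonic-plus-noise decomposition of a voice frame. Lower values loosely
    correspond to a breathier voice quality."""
    return 20.0 * math.log10(rms(harmonic) / rms(noise))

# Toy frame: a 220 Hz partial as the 'harmonic' part, a small constant
# residual standing in for the 'noise' part of the decomposition.
harmonic = [math.sin(2 * math.pi * 220 * t / 8000) for t in range(1000)]
noise = [0.01] * 1000
print(round(harmonic_to_noise_ratio_db(harmonic, noise), 1))
```

An adaptive breathiness transformation in the spirit of the abstract would measure such a descriptor on the input, compare it against the user's high-level target, and rescale the noise component accordingly.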
APA, Harvard, Vancouver, ISO, and other styles
33

Alharbi, Saad Talal. "Graphical and non-speech sound metaphors in email browsing : an empirical approach : a usability based study investigating the role of incorporating visual and non-speech sound metaphors to communicate email data and threads." Thesis, University of Bradford, 2009. http://hdl.handle.net/10454/4244.

Full text
Abstract:
This thesis investigates the effect of incorporating various information visualisation techniques and non-speech sounds (i.e. auditory icons and earcons) in email browsing. This empirical work consisted of three experimental phases. The first experimental phase aimed at finding out the most usable visualisation techniques for presenting email information. This experiment involved the development of two experimental email visualisation approaches which were called LinearVis and MatrixVis. These approaches visualised email messages based on a dateline together with various types of email information such as the time and the senders. The findings of this experiment were used as a basis for the development of a further email visualisation approach which was called LinearVis II. This novel approach presented email data based on multi-coordinated views. The usability of message retrieval in this approach was investigated and compared to a typical email client in the second experimental phase. Users were required to retrieve email messages in the two experiments with the provided relevant information such as the subject, status and priority. The third experimental phase aimed at exploring the usability of retrieving email messages by using other types of email data, particularly email threads. This experiment investigated the synergistic use of graphical representations with non-speech sounds (Multimodal Metaphors), graphical representations and textual display to present email threads and to communicate contextual information about email threads. The findings of this empirical study demonstrated that there is a high potential for using information visualisation techniques and non-speech sounds (i.e. auditory icons and earcons) to improve the usability of email message retrieval. Furthermore, the thesis concludes with a set of empirically derived guidelines for the use of information visualisation techniques and non-speech sound to improve email browsing.
APA, Harvard, Vancouver, ISO, and other styles
35

Tsiros, Augoustinos. "A multidimensional sketching interface for visual interaction with corpus-based concatenative sound synthesis." Thesis, Edinburgh Napier University, 2016. http://researchrepository.napier.ac.uk/Output/463438.

Full text
Abstract:
The present research sought to investigate the correspondence between auditory and visual feature dimensions and to utilise this knowledge in order to inform the design of audio-visual mappings for visual control of sound synthesis. The first stage of the research involved the design and implementation of Morpheme, a novel interface for interaction with corpus-based concatenative synthesis. Morpheme uses sketching as a model for interaction between the user and the computer. The purpose of the system is to facilitate the expression of sound design ideas by describing the qualities of the sound to be synthesised in visual terms, using a set of perceptually meaningful audio-visual feature associations. The second stage of the research involved the preparation of two multidimensional mappings for the association between auditory and visual dimensions. The third stage of this research involved the evaluation of the Audio-Visual (A/V) mappings and of Morpheme's user interface. The evaluation comprised two controlled experiments, an online study and a user study. Our findings suggest that the strength of the perceived correspondence between the A/V associations prevails over the timbre characteristics of the sounds used to render the complementary polar features. Hence, the empirical evidence gathered by previous research is generalizable/applicable to different contexts and the overall dimensionality of the sound used to render should not have a very significant effect on the comprehensibility and usability of an A/V mapping. However, the findings of the present research also show that there is a non-linear interaction between the harmonicity of the corpus and the perceived correspondence of the audio-visual associations. For example, strongly correlated cross-modal cues such as size-loudness or vertical position-pitch are affected less by the harmonicity of the audio corpus in comparison to more weakly correlated dimensions (e.g. texture granularity-sound dissonance).
No significant differences were revealed as a result of musical/audio training. The third study consisted of an evaluation of Morpheme's user interface, where participants were asked to use the system to design a sound for given video footage. The usability of the system was found to be satisfactory. An interface for drawing visual queries was developed for high-level control of the retrieval and signal processing algorithms of concatenative sound synthesis. This thesis elaborates on previous research findings and proposes two methods for empirically driven validation of audio-visual mappings for sound synthesis. These methods could be applied to a wide range of contexts in order to inform the design of cognitively useful multi-modal interfaces and the representation and rendering of multimodal data. Moreover, this research contributes to the broader understanding of multimodal perception by gathering empirical evidence about the correspondence between auditory and visual feature dimensions and by investigating which factors affect the perceived congruency between aural and visual structures.
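The kind of perceptually motivated A/V mapping evaluated here can be sketched as a function from stroke features to audio descriptor targets, followed by the unit-selection step of concatenative synthesis. The feature names, value ranges, and corpus below are invented for illustration, although the individual correspondences (position-pitch, size-loudness, granularity-dissonance) are those discussed in the abstract:

```python
def visual_to_audio_targets(stroke):
    """Map perceptual features of a sketched stroke to audio descriptor
    targets for corpus-based retrieval. All values are normalized to [0, 1];
    the exact mappings used in Morpheme are not reproduced here."""
    return {
        "pitch": stroke["y_position"],        # higher on canvas -> higher pitch
        "loudness": stroke["size"],           # bigger stroke -> louder
        "dissonance": stroke["granularity"],  # grainier texture -> more dissonant
    }

def nearest_unit(targets, corpus):
    """Retrieve the corpus unit whose descriptors are closest to the
    targets (the selection step of concatenative synthesis)."""
    def dist(unit):
        return sum((unit[k] - v) ** 2 for k, v in targets.items())
    return min(corpus, key=dist)

corpus = [
    {"name": "unit_a", "pitch": 0.9, "loudness": 0.8, "dissonance": 0.1},
    {"name": "unit_b", "pitch": 0.2, "loudness": 0.3, "dissonance": 0.7},
]
targets = visual_to_audio_targets({"y_position": 0.85, "size": 0.7, "granularity": 0.2})
print(nearest_unit(targets, corpus)["name"])  # -> unit_a
```

A real system would run this selection per analysis frame over a large descriptor-annotated corpus and concatenate the retrieved units into the output sound.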
APA, Harvard, Vancouver, ISO, and other styles
36

Lui, Siu-Hang. "MIDI to SP-MIDI and I-melody transcoding using phrase stealing /." View abstract or full-text, 2005. http://library.ust.hk/cgi/db/thesis.pl?COMP%202005%20LUI.

Full text
APA, Harvard, Vancouver, ISO, and other styles
37

Perry, Michael D. "Value aided satellite altimetry data for weapon presets." Thesis, Monterey, Calif. : Springfield, Va. : Naval Postgraduate School ; Available from National Technical Information Service, 2003. http://library.nps.navy.mil/uhtbin/hyperion-image/03Jun%5FPerry.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
38

Watkins, Gregory Shroll. "A framework for interpreting noisy, two-dimensional images, based on a fuzzification of programmed, attributed graph grammars." Thesis, Rhodes University, 1998. http://hdl.handle.net/10962/d1004862.

Full text
Abstract:
This thesis investigates a fuzzy syntactic approach to the interpretation of noisy two-dimensional images. This approach is based on a modification of the attributed graph grammar formalism to utilise fuzzy membership functions in the applicability predicates. As far as we are aware, this represents the first such modification of graph grammars. Furthermore, we develop a method for programming the resultant fuzzy attributed graph grammars through the use of non-deterministic control diagrams. To do this, we modify the standard programming mechanism to allow it to cope with the fuzzy certainty values associated with productions in our grammar. Our objective was to develop a flexible framework which can be used for the recognition of a wide variety of image classes, and which is adept at dealing with noise in these images. Programmed graph grammars are specifically chosen for the ease with which they allow one to specify a new two-dimensional image class. We implement a prototype system for Optical Music Recognition using our framework. This system allows us to test the capabilities of the framework for coping with noise in the context of handwritten music score recognition. Preliminary results from the prototype system show that the framework copes well with noisy images.
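The fuzzification described in the abstract replaces crisp applicability predicates with graded ones. In the generic sketch below (attribute names and thresholds are invented for illustration), a trapezoidal membership function scores how well a noisy attribute fits, and a production's predicate combines scores with min, the usual fuzzy AND:

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal fuzzy membership: 0 below a, rising to 1 on [b, c],
    falling back to 0 at d."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)

def applicability(attrs):
    """Fuzzy applicability predicate for a hypothetical production that
    matches a note head in a noisy score image: attribute scores are
    combined with min (fuzzy AND), yielding a certainty in [0, 1]
    instead of a crisp yes/no."""
    roundness = trapezoid(attrs["roundness"], 0.5, 0.7, 1.0, 1.1)
    area = trapezoid(attrs["area_px"], 20, 40, 80, 120)
    return min(roundness, area)

print(applicability({"roundness": 0.9, "area_px": 60}))   # -> 1.0
print(applicability({"roundness": 0.6, "area_px": 100}))  # a partial, 'noisy' match
```

In a programmed grammar such certainty values would be propagated along the control diagram, so that a derivation degraded by noise lowers the overall confidence of the interpretation rather than failing outright.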
APA, Harvard, Vancouver, ISO, and other styles
39

Eisenberg, Gunnar. "Identifikation und Klassifikation von Musikinstrumentenklängen in monophoner und polyphoner Musik." Göttingen Cuvillier, 2008. http://d-nb.info/992262933/04.

Full text
APA, Harvard, Vancouver, ISO, and other styles
40

Mott, Ryan. "Music in motion : the synthesis of album design and motion graphics for downloadable music /." Online version of thesis, 2009. http://hdl.handle.net/1850/10942.

Full text
APA, Harvard, Vancouver, ISO, and other styles
41

Kursu, Sami. "Adaptiv nivåreglering: Dynamisk expansion av ljudsignaler i en reell arbetsmiljö." Thesis, Interactive Institute Piteå, 2013. http://urn.kb.se/resolve?urn=urn:nbn:se:ri:diva-24267.

Full text
APA, Harvard, Vancouver, ISO, and other styles
42

Luck, Rodney K. "On the use of two-dimensional orthogonal function expansions to model ocean bathymetric and sound-speed data in the recursive ray acoustics algorithm." Thesis, Monterey, Calif. : Springfield, Va. : Naval Postgraduate School ; Available from National Technical Information Service, 1995. http://handle.dtic.mil/100.2/ADA303056.

Full text
APA, Harvard, Vancouver, ISO, and other styles
43

Крючко, Є. В., Юрій Олександрович Зубань, Юрий Александрович Зубань, and Yurii Oleksandrovych Zuban. "Аналіз методів стиску аудіо даних та засобів їх реалізації." Thesis, Видавництво СумДУ, 2010. http://essuir.sumdu.edu.ua/handle/123456789/4015.

Full text
APA, Harvard, Vancouver, ISO, and other styles
44

Boyle, John K. "Performance Metrics for Depth-based Signal Separation Using Deep Vertical Line Arrays." PDXScholar, 2015. https://pdxscholar.library.pdx.edu/open_access_etds/2198.

Full text
Abstract:
Vertical line arrays (VLAs) deployed below the critical depth in the deep ocean can exploit reliable acoustic path (RAP) propagation, which provides low transmission loss (TL) for targets at moderate ranges, and increased TL for distant interferers. However, sound from nearby surface interferers also undergoes RAP propagation, and without horizontal aperture, a VLA cannot separate these interferers from submerged targets. A recent publication by McCargar and Zurk (2013) addressed this issue, presenting a transform-based method for passive, depth-based separation of signals received on deep VLAs based on the depth-dependent modulation caused by the interference between the direct and surface-reflected acoustic arrivals. This thesis expands on that work by quantifying the transform-based depth estimation method performance in terms of the resolution and ambiguity in the depth estimate. Then, the depth discrimination performance is quantified in terms of the number of VLA elements.
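The depth-dependent modulation exploited here comes from the interference between the direct arrival and its surface-reflected image (the Lloyd's mirror effect). The toy sketch below is not the published transform: it generates the interference ripple for a hypothetical source depth and recovers that depth with a brute-force matched search over candidates, with all geometry and band parameters invented:

```python
import cmath
import math

def lloyds_mirror_spectrum(depth_m, freqs_hz, grazing_rad, c=1500.0):
    """Magnitude of the interference between the direct arrival and the
    sign-inverted surface reflection. The ripple period in frequency is
    c / (2 * depth * sin(grazing)), so the ripple rate encodes depth."""
    tau = 2.0 * depth_m * math.sin(grazing_rad) / c  # extra delay of the surface bounce
    return [abs(1.0 - cmath.exp(-2j * math.pi * f * tau)) for f in freqs_hz]

def estimate_depth(spectrum, freqs_hz, grazing_rad, c=1500.0, max_depth_m=400):
    """Brute-force stand-in for the transform: correlate the observed
    ripple against candidate depths and return the best match."""
    best, best_score = None, -1.0
    for d in range(1, max_depth_m):
        cand = lloyds_mirror_spectrum(float(d), freqs_hz, grazing_rad, c)
        score = sum(a * b for a, b in zip(spectrum, cand))
        score /= math.sqrt(sum(b * b for b in cand))  # normalize candidate energy
        if score > best_score:
            best, best_score = float(d), score
    return best

freqs = [50.0 + k for k in range(400)]  # a 50-449 Hz analysis band
spec = lloyds_mirror_spectrum(120.0, freqs, math.radians(15))  # source at 120 m
print(estimate_depth(spec, freqs, math.radians(15)))
```

The resolution and ambiguity questions the thesis quantifies show up directly in this sketch: a shallower source or a narrower band yields fewer ripple cycles, flattening the correlation peak over candidate depths.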
APA, Harvard, Vancouver, ISO, and other styles
45

Mattei, Pietro, and Stefan Stolica. "SoundCubes prototyp : Tillhandahålla ny stimulerande hörselträning för personer med hörselnedsättningar." Thesis, Södertörns högskola, Medieteknik, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:sh:diva-32111.

Full text
Abstract:
The aim of the present project was to create a functional sound-training prototype which may be useful for the further development of training systems for individuals with impaired hearing. The idea is based on the concept of "The Music Puzzle" (Hansen et al. 2012). The prototype proposed here was termed SoundCubes and is composed of three cubes with different motifs (Fiducial markers) on four of the six sides. A soundtrack was divided into three parts, each assigned to one of the cubes. Each part was in turn fragmented into four levels, where the first plays a single instrument and further instruments are added until all of them can be heard. These levels were assigned to the Fiducial markers on the four sides of each cube. The cubes are randomly placed horizontally in front of a camera attached to a computer. Thanks to the specific motifs attached to them, each of the four marked sides triggers the reproduction of a fragment of the soundtrack when exposed to the camera's field of view. Only one specific combination of sides and horizontal order of the cubes leads to the complete set of instrumental components and to a clapping sound that signals the successful reproduction of the initial soundtrack. The SoundCubes prototype was implemented using TUIO, reacTIVision and Pure Data. SoundCubes was tested on 15 healthy individuals without hearing impairments to determine whether the prototype's interaction works. The test individuals were introduced to SoundCubes one at a time and were allowed to hear the complete soundtrack. Thereafter, they were given the task of reproducing the complete track by finding the right side and horizontal order of the cubes. Out of a total of 15 individuals, 12 completed the test successfully within the maximum given time frame (ten minutes). We could therefore conclude that the system works properly, and we also observed that the test was experienced as very entertaining and engaging. This first test therefore lays the basis for further development of the SoundCubes concept to train people's hearing in an interactive and entertaining way.
APA, Harvard, Vancouver, ISO, and other styles
46

Chen, Howard. "AZIP, audio compression system: Research on audio compression, comparison of psychoacoustic principles and genetic algorithms." CSUSB ScholarWorks, 2005. https://scholarworks.lib.csusb.edu/etd-project/2617.

Full text
Abstract:
The purpose of this project is to investigate the differences between psychoacoustic principles and genetic algorithms (GA). These will be discussed separately. The review will also compare the compression ratio and the quality of the decompressed files decoded by these two methods.
APA, Harvard, Vancouver, ISO, and other styles
47

Ferroudj, Meriem. "Detection of rain in acoustic recordings of the environment using machine learning techniques." Thesis, Queensland University of Technology, 2015. https://eprints.qut.edu.au/82848/1/Meriem_Ferroudj_Thesis.pdf.

Full text
Abstract:
This thesis is concerned with the detection and prediction of rain in environmental recordings using different machine learning algorithms. The results obtained in this research will help ecologists to efficiently analyse environmental data and monitor biodiversity.
APA, Harvard, Vancouver, ISO, and other styles
48

Saeed, Nausheen. "Automated Gravel Road Condition Assessment : A Case Study of Assessing Loose Gravel using Audio Data." Licentiate thesis, Högskolan Dalarna, Institutionen för information och teknik, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:du-36402.

Full text
Abstract:
Gravel roads connect sparse populations and provide highways for agriculture and the transport of forest goods. Gravel roads are an economical choice where traffic volume is low. In Sweden, 21% of all public roads are state-owned gravel roads, covering over 20,200 km. In addition, there are some 74,000 km of gravel roads and 210,000 km of forest roads that are owned by the private sector. The Swedish Transport Administration (Trafikverket) rates the condition of gravel roads according to the severity of irregularities (e.g. corrugations and potholes), dust, loose gravel, and gravel cross-sections. This assessment is carried out during the summertime when roads are free of snow. One of the essential parameters for gravel road assessment is loose gravel. Loose gravel can cause a tire to slip, leading to a loss of driver control.  Assessment of gravel roads is carried out subjectively by taking images of road sections and adding some textual notes. A cost-effective, intelligent, and objective method for road assessment is lacking. Expensive methods, such as laser profiler trucks, are available and can offer road profiling with high accuracy. These methods are not applied to gravel roads, however, because of the need to maintain cost-efficiency.  In this thesis, we explored the idea that, in addition to machine vision, we could also use machine hearing to classify the condition of gravel roads in relation to loose gravel. Several suitable classical supervised learning and convolutional neural networks (CNN) were tested. When people drive on gravel roads, they can make sense of the road condition by listening to the gravel hitting the bottom of the car. The more we hear gravel hitting the bottom of the car, the more we can sense that there is a lot of loose gravel and, therefore, the road might be in a bad condition. Based on this idea, we hypothesized that machines could also undertake such a classification when trained with labeled sound data. 
Machines can identify gravel and non-gravel sounds. In this thesis, we used traditional machine learning algorithms, such as support vector machines (SVM), decision trees, and ensemble classification methods. We also explored CNN for classifying spectrograms of audio sounds and images in gravel roads. Both supervised learning and CNN were used, and results were compared for this study. In classical algorithms, when compared with other classifiers, ensemble bagged tree (EBT)-based classifiers performed best for classifying gravel and non-gravel sounds. EBT performance is also useful in reducing the misclassification of non-gravel sounds. The use of CNN also showed a 97.91% accuracy rate. Using CNN makes the classification process more intuitive because the network architecture takes responsibility for selecting the relevant training features. Furthermore, the classification results can be visualized on road maps, which can help road monitoring agencies assess road conditions and schedule maintenance activities for a particular road.
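The pipeline the abstract describes, extracting features from audio frames and training a classifier to separate gravel from non-gravel sounds, can be caricatured in a few lines. This stdlib-only sketch substitutes synthetic signals for real recordings and a nearest-centroid rule for the ensemble bagged trees and CNNs actually evaluated in the thesis:

```python
import math
import random

def features(frame):
    """Two cheap frame-level features often used in audio classification:
    zero-crossing rate (noisiness) and log energy (loudness)."""
    zcr = sum(1 for a, b in zip(frame, frame[1:]) if a * b < 0) / (len(frame) - 1)
    energy = math.log10(sum(s * s for s in frame) / len(frame) + 1e-12)
    return (zcr, energy)

class NearestCentroid:
    """Minimal stand-in for the classifiers used in the thesis: each class
    is summarized by the mean of its training feature vectors."""
    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            rows = [x for x, l in zip(X, y) if l == label]
            self.centroids[label] = tuple(sum(col) / len(rows) for col in zip(*rows))
        return self

    def predict(self, x):
        return min(self.centroids,
                   key=lambda l: sum((a - b) ** 2 for a, b in zip(x, self.centroids[l])))

# Synthetic stand-ins: 'gravel' frames are noisy bursts, 'non-gravel' are tonal hum.
random.seed(1)
gravel = [[random.gauss(0, 1.0) for _ in range(256)] for _ in range(20)]
hum = [[0.5 * math.sin(0.05 * t) for t in range(256)] for _ in range(20)]
X = [features(f) for f in gravel + hum]
y = ["gravel"] * 20 + ["non-gravel"] * 20
clf = NearestCentroid().fit(X, y)
print(clf.predict(features([random.gauss(0, 1.0) for _ in range(256)])))
```

A real system would use richer features (e.g. MFCCs or spectrogram patches) and a stronger learner, but the overall shape of the pipeline, framing, feature extraction, supervised classification, is the same.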


APA, Harvard, Vancouver, ISO, and other styles
49

Jonsson, Mårten. "Digital tools for the blind : How to increase navigational capabilities for visually impaired persons." Thesis, Högskolan i Skövde, Institutionen för kommunikation och information, 2012. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-9735.

Full text
Abstract:
The development of human-computer interaction (HCI) systems usable by people with visual impairments is a progressing field of research. Similarly, the creation of audio-only games and digital tools has been investigated somewhat thoroughly, with many interesting results. This thesis aims to combine the two fields in the creation of an audio-only digital tool aimed at aiding visually impaired persons to navigate unknown areas. This is done by looking at the field of HCI systems and games for the blind, and by looking at the concept of mental maps and spatial orientation within cognitive science. An application is created, evaluated and tested based on a set number of criteria. An experiment is performed and the results are evaluated and compared to another digital tool in order to learn more about how to increase the usability and functionality of digital tools for the visually impaired. The results give a strong indication towards how to best proceed with future research.
APA, Harvard, Vancouver, ISO, and other styles
50

Magnani, Alessandro. "Sonificazione: stato dell'arte e casi di studio." Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2021. http://amslaurea.unibo.it/24697/.

Full text
Abstract:
In recent years, advances in digital signal processing technologies have encouraged the use of sound in multimedia systems, not only for its musical qualities but also as a means of representing more or less complex information and data. Using hearing in combination with sight comes so naturally in everyday life that we hardly notice how complex this system is. Being able to exploit this faculty when interacting with technology is now crucial. In these terms, sonification has opened new horizons in many scientific fields, through the effectiveness and efficiency demonstrated in various case studies. The aim of this thesis is to give the reader an understanding of the concept of sonification, focusing in particular on its technical aspects and on the advantages and disadvantages it can offer. Beyond the notions already present in the state of the art, two recent case studies are presented, in an attempt to illustrate how this discipline can have a fundamental impact on everyday life.
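As a concrete taste of the parameter-mapping sonification surveyed in the thesis (the data values and frequency range below are invented for illustration), each data point can be rescaled to a pitch and rendered as a short tone:

```python
import math

def pitch_map(values, f_lo=220.0, f_hi=880.0):
    """Parameter-mapping sonification: linearly rescale each data point
    into a frequency range (here two octaves starting at A3)."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for constant data
    return [f_lo + (v - lo) / span * (f_hi - f_lo) for v in values]

def synthesize(freqs, sr=8000, dur=0.2):
    """Render each mapped frequency as a short sine tone, returning the
    concatenated samples."""
    samples = []
    for f in freqs:
        samples.extend(math.sin(2 * math.pi * f * t / sr)
                       for t in range(int(sr * dur)))
    return samples

temperatures = [12.0, 14.5, 13.1, 19.0, 17.2]  # arbitrary example data
print([round(f, 1) for f in pitch_map(temperatures)])
```

The sample list returned by `synthesize` could be quantized and written out with the standard `wave` module to obtain an audible result; higher data values are heard as higher pitches.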
APA, Harvard, Vancouver, ISO, and other styles