Journal articles on the topic "Audio data"


Consult the top 50 journal articles for your research on the topic "Audio data".


Browse journal articles from many different disciplines and organise your bibliography correctly.

1

Matsunuma, Yasuhiro. "Audio data processing apparatus and audio data distributing apparatus". Journal of the Acoustical Society of America 124, no. 4 (2008): 1903. http://dx.doi.org/10.1121/1.3001094.

Full text available
APA, Harvard, Vancouver, ISO, and other styles
2

Schuller, Gerald, Matthias Gruhne, and Tobias Friedrich. "Fast Audio Feature Extraction From Compressed Audio Data". IEEE Journal of Selected Topics in Signal Processing 5, no. 6 (October 2011): 1262–71. http://dx.doi.org/10.1109/jstsp.2011.2158802.

3

Wylie, F. "Digital audio data compression". Electronics & Communication Engineering Journal 7, no. 1 (February 1, 1995): 5–10. http://dx.doi.org/10.1049/ecej:19950103.

4

Seok, Jong Won, and Jin Woo Hong. "Audio watermarking for copyright protection of digital audio data". Electronics Letters 37, no. 1 (2001): 60. http://dx.doi.org/10.1049/el:20010029.

5

Patil, Adwait. "Covid Classification Using Audio Data". International Journal for Research in Applied Science and Engineering Technology 9, no. 10 (October 31, 2021): 1633–37. http://dx.doi.org/10.22214/ijraset.2021.38675.

Abstract:
The coronavirus outbreak has affected the entire world adversely. This project was developed to help the general public assess their chances of being COVID-positive using only a coughing sound and basic patient data. Audio classification is one of the most interesting applications of deep learning. Like image data, audio data is stored as bits; to understand and analyze it, we used Mel-frequency cepstral coefficients (MFCCs), which make it possible to feed the audio to our neural network. We used Coughvid, a crowdsourced dataset consisting of 27,000 audio files and metadata for the same number of patients, and a 1D convolutional neural network (CNN) to process the audio and metadata. Future work will be a model that rates how likely it is that a person is infected instead of performing binary classification. Keywords: audio classification, Mel-frequency cepstral coefficients, convolutional neural network, deep learning, Coughvid
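The MFCC front end this abstract relies on can be sketched in plain NumPy. This is a generic sketch, not code from the cited paper; the sample rate, FFT size, hop, and filter counts below are common illustrative defaults, not values the authors report.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    # Triangular filters spaced evenly on the mel scale.
    mels = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mels) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        for k in range(l, c):
            fb[i - 1, k] = (k - l) / max(c - l, 1)
        for k in range(c, r):
            fb[i - 1, k] = (r - k) / max(r - c, 1)
    return fb

def mfcc(signal, sr=16000, n_fft=512, hop=256, n_filters=26, n_coeffs=13):
    # 1) Frame the signal and apply a Hann window.
    frames = [signal[s:s + n_fft] * np.hanning(n_fft)
              for s in range(0, len(signal) - n_fft + 1, hop)]
    # 2) Power spectrum of each frame.
    power = np.abs(np.fft.rfft(frames, n=n_fft)) ** 2
    # 3) Mel filterbank energies, then log compression.
    fb = mel_filterbank(n_filters, n_fft, sr)
    logmel = np.log(power @ fb.T + 1e-10)
    # 4) DCT-II decorrelates the log energies -> cepstral coefficients.
    n = np.arange(n_filters)
    dct = np.cos(np.pi * np.outer(np.arange(n_coeffs), (2 * n + 1) / (2.0 * n_filters)))
    return logmel @ dct.T  # shape: (num_frames, n_coeffs)
```

The resulting (frames × coefficients) matrix is the kind of fixed-size feature map a 1D CNN can then consume.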
6

BASYSTIUK, Oleh, and Nataliia MELNYKOVA. "MULTIMODAL SPEECH RECOGNITION BASED ON AUDIO AND TEXT DATA". Herald of Khmelnytskyi National University. Technical sciences 313, no. 5 (October 27, 2022): 22–25. http://dx.doi.org/10.31891/2307-5732-2022-313-5-22-25.

Abstract:
Systems for machine translation of texts from one language to another simulate the work of a human translator. Their performance depends on the ability to understand the grammar rules of the language. In translation, the basic units are not individual words but word combinations or phraseological units that express different concepts; only by using them can more complex ideas be expressed in the translated text. The main feature of machine translation is that input and output have different lengths, and the ability to work with different input and output lengths is provided by recurrent neural networks. A recurrent neural network (RNN) is a class of artificial neural network with connections between nodes, where a connection runs from a more distant node to a less distant one. These connections allow an RNN to remember and reproduce an entire sequence of reactions to a single stimulus. From the point of view of programming, such networks are analogous to cyclic execution; from the point of view of the system, they are equivalent to a state machine. RNNs are commonly used to process word sequences in natural language processing, where a hidden Markov model (HMM) and an N-gram language model were traditionally used. Deep learning has completely changed the approach to machine translation: researchers in the field have created simple machine-learning solutions that outperform the best expert systems. This paper reviews the main features of machine translation based on recurrent neural networks and highlights the advantages of RNN systems using the sequence-to-sequence model over statistical translation systems. Two machine translation systems based on the sequence-to-sequence model were built with the Keras and PyTorch machine learning libraries. Based on the results obtained, the libraries were analyzed and their performance compared.
7

Wu, S., J. Huang, D. Huang, and Y. Q. Shi. "Efficiently Self-Synchronized Audio Watermarking for Assured Audio Data Transmission". IEEE Transactions on Broadcasting 51, no. 1 (March 2005): 69–76. http://dx.doi.org/10.1109/tbc.2004.838265.

8

Struthers, Allan. "Radioactive Decay: Audio Data Collection". PRIMUS 19, no. 4 (June 12, 2009): 388–95. http://dx.doi.org/10.1080/10511970802238829.

9

LIN, RUEI-SHIANG, and LING-HWEI CHEN. "A NEW APPROACH FOR CLASSIFICATION OF GENERIC AUDIO DATA". International Journal of Pattern Recognition and Artificial Intelligence 19, no. 01 (February 2005): 63–78. http://dx.doi.org/10.1142/s0218001405003958.

Abstract:
Existing audio retrieval systems fall into one of two categories: single-domain systems that accept data of only a single type (e.g. speech) and multiple-domain systems that offer content-based retrieval for multiple types of audio data. Since a single-domain system has limited applications, a multiple-domain system is more useful. However, different types of audio data have different properties, which makes a multiple-domain system harder to develop. If audio information can be classified in advance, this problem is solved. In this paper, we propose a real-time classification method that classifies audio signals into several basic types, such as pure speech, music, song, speech with music background, and speech with environmental noise background. To make the proposed method robust across a variety of audio sources, we use a Bayesian decision function for multivariate Gaussian distributions instead of manually adjusting a threshold for each discriminator. The proposed approach can be applied to content-based audio/video retrieval. In the experiment, the efficiency and effectiveness of the method are shown by an accuracy rate of more than 96% for general audio data classification.
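The Bayesian decision rule for multivariate Gaussians that this abstract mentions can be sketched generically: fit one Gaussian per audio class and pick the class with the highest log density (equal priors assumed). This is not the authors' implementation; the class names and features in the usage below are hypothetical.

```python
import numpy as np

class GaussianBayesClassifier:
    """One multivariate Gaussian per class; a feature vector is assigned
    to the class with the highest log density (equal priors assumed)."""

    def fit(self, X_by_class):
        # X_by_class: dict mapping class name -> (n_samples, n_features) array
        self.params = {}
        for label, X in X_by_class.items():
            mu = X.mean(axis=0)
            # Regularize the covariance so it stays invertible.
            cov = np.cov(X, rowvar=False) + 1e-6 * np.eye(X.shape[1])
            self.params[label] = (mu, np.linalg.inv(cov),
                                  np.linalg.slogdet(cov)[1])
        return self

    def _log_density(self, x, mu, cov_inv, logdet):
        d = x - mu
        return -0.5 * (d @ cov_inv @ d + logdet)

    def predict(self, x):
        return max(self.params,
                   key=lambda c: self._log_density(x, *self.params[c]))
```

Replacing per-discriminator thresholds with this rule is what makes the decision adapt to each class's feature spread.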
10

Alderete, John, and Monica Davies. "Investigating Perceptual Biases, Data Reliability, and Data Discovery in a Methodology for Collecting Speech Errors From Audio Recordings". Language and Speech 62, no. 2 (April 6, 2018): 281–317. http://dx.doi.org/10.1177/0023830918765012.

Abstract:
This work describes a methodology of collecting speech errors from audio recordings and investigates how some of its assumptions affect data quality and composition. Speech errors of all types (sound, lexical, syntactic, etc.) were collected by eight data collectors from audio recordings of unscripted English speech. Analysis of these errors showed that: (i) different listeners find different errors in the same audio recordings, but (ii) the frequencies of error patterns are similar across listeners; (iii) errors collected “online” using on the spot observational techniques are more likely to be affected by perceptual biases than “offline” errors collected from audio recordings; and (iv) datasets built from audio recordings can be explored and extended in a number of ways that traditional corpus studies cannot be.
11

Premjith B., Neethu Mohan, Prabaharan Poornachandran, and Soman K.P. "Audio Data Authentication with PMU Data and EWT". Procedia Technology 21 (2015): 596–603. http://dx.doi.org/10.1016/j.protcy.2015.10.066.

12

Geiger, Ralf. "Apparatus and method for coding a time-discrete audio signal to obtain coded audio data and for decoding coded audio data". Journal of the Acoustical Society of America 123, no. 3 (2008): 1233. http://dx.doi.org/10.1121/1.2901358.

13

Manoharan, J. Samuel. "Audio Tagging Using CNN Based Audio Neural Networks for Massive Data Processing". December 2021 3, no. 4 (December 24, 2021): 365–74. http://dx.doi.org/10.36548/jaicn.2021.4.008.

Abstract:
Sound event detection, speech emotion classification, music classification, acoustic scene classification, audio tagging, and several other audio pattern recognition applications are largely dependent on the growing machine learning technology. Audio pattern recognition issues have also been addressed by neural networks in recent days. Existing systems operate within limited durations on specific datasets. Systems pretrained on large datasets have performed well on several natural language processing and computer vision tasks in recent years; however, audio pattern recognition research with large-scale datasets is limited in the current scenario. In this paper, a large-scale audio dataset is used for training a pretrained audio neural network, which is then transferred to several audio-related tasks. Several convolutional neural networks are used for modeling the proposed audio neural network, and the computational complexity and performance of this system are analyzed. The waveform and log-mel spectrogram are used as input features in this architecture. During audio tagging, the proposed system outperforms existing systems with a mean average precision of 0.45. The performance of the proposed model is demonstrated by applying the audio neural network to five specific audio pattern recognition tasks.
14

Kadiri, Sudarsana Reddy, and Paavo Alku. "Subjective Evaluation of Basic Emotions from Audio–Visual Data". Sensors 22, no. 13 (June 29, 2022): 4931. http://dx.doi.org/10.3390/s22134931.

Abstract:
Understanding of the perception of emotions or affective states in humans is important to develop emotion-aware systems that work in realistic scenarios. In this paper, the perception of emotions in naturalistic human interaction (audio–visual data) is studied using perceptual evaluation. For this purpose, a naturalistic audio–visual emotion database collected from TV broadcasts such as soap-operas and movies, called the IIIT-H Audio–Visual Emotion (IIIT-H AVE) database, is used. The database consists of audio-alone, video-alone, and audio–visual data in English. Using data of all three modes, perceptual tests are conducted for four basic emotions (angry, happy, neutral, and sad) based on category labeling and for two dimensions, namely arousal (active or passive) and valence (positive or negative), based on dimensional labeling. The results indicated that the participants’ perception of emotions was remarkably different between the audio-alone, video-alone, and audio–video data. This finding emphasizes the importance of emotion-specific features compared to commonly used features in the development of emotion-aware systems.
15

Wang, Peng, Xia Wang, and Xia Liu. "Selection of Audio Learning Resources Based on Big Data". International Journal of Emerging Technologies in Learning (iJET) 17, no. 06 (March 29, 2022): 23–38. http://dx.doi.org/10.3991/ijet.v17i06.30013.

Abstract:
Currently, audio learning resources account for a large proportion of online learning resources, so designing and implementing a method for selecting audio learning resources based on educational big data is of great significance for learning-resource recommendation. This paper therefore studies such a method, with music learning as an example. First, the audio signals are converted into mel spectrograms, from which the mel-frequency cepstral coefficient features of the audio learning resources are obtained. Then, building on the conventional content-based audio recommendation algorithm, the interest-degree vector of target students with respect to music learning is expanded, and a collaborative filtering hybrid algorithm for audio learning resources that incorporates the interest degrees of neighbouring students is proposed, which effectively improves the accuracy and stability of predicting students' interest in music learning. Finally, experimental results verify the feasibility and prediction accuracy of the proposed algorithm.
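The "interest degrees of neighbouring students" idea can be illustrated with a minimal user-based collaborative filtering sketch: missing entries in a student's interest vector are filled from the most similar neighbours, weighted by cosine similarity. This is a generic sketch under my own assumptions, not the authors' hybrid algorithm, and the scores in the usage are invented.

```python
import numpy as np

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

def predict_interest(target, neighbours, k=2):
    """Fill the target student's missing interest scores (marked NaN)
    from the k most similar neighbours, weighted by cosine similarity."""
    known = ~np.isnan(target)
    # Rank neighbours by similarity on the dimensions the target has rated.
    sims = sorted(((cosine(target[known], n[known]), n) for n in neighbours),
                  key=lambda p: p[0], reverse=True)[:k]
    filled = target.copy()
    for i in np.flatnonzero(np.isnan(target)):
        w = sum(s for s, _ in sims)
        filled[i] = sum(s * n[i] for s, n in sims) / (w + 1e-12)
    return filled
```

A hybrid system would blend these neighbour-based predictions with content features such as the MFCCs mentioned above.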
16

Lee, Jae-Woo. "Design and Construction of Fiber Optical Link Application System for Multi-Video Audio Data Transmission". Journal of the Korea Academia-Industrial cooperation Society 10, no. 10 (October 31, 2009): 2691–95. http://dx.doi.org/10.5762/kais.2009.10.10.2691.

17

Samudra, Yoga. "Mirroring-Based Data Hiding in Audio". International Journal of Intelligent Engineering and Systems 14, no. 5 (October 31, 2021): 550–58. http://dx.doi.org/10.22266/ijies2021.1031.48.

18

Mande, Ameya Ajit. "EMOTION DETECTION USING AUDIO DATA SAMPLES". International Journal of Advanced Research in Computer Science 10, no. 6 (December 20, 2019): 13–20. http://dx.doi.org/10.26483/ijarcs.v10i6.6489.

19

Sakthisudhan, K., S. Gayathri Priya, P. Prabhu, and P. Thangaraj. "Secure Data Transmission Using Audio Steganography". i-manager's Journal on Electronics Engineering 2, no. 3 (May 15, 2012): 1–6. http://dx.doi.org/10.26634/jele.2.3.1763.

20

Guarino, Joe, Wes Orme, and Wayne Fischer. "Audio enhancement of biomechanical impact data." Journal of the Acoustical Society of America 125, no. 4 (April 2009): 2731. http://dx.doi.org/10.1121/1.4784508.

21

Yamasaki, Yoshio, and Itaru Kaneko. "MPEG. 2-2. Audio Data Coding." Journal of the Institute of Television Engineers of Japan 49, no. 4 (1995): 422–30. http://dx.doi.org/10.3169/itej1978.49.422.

22

Tan, E., and B. Vermuelen. "Digital audio tape for data storage". IEEE Spectrum 26, no. 10 (October 1989): 34–38. http://dx.doi.org/10.1109/6.40682.

23

Ikeda, Mikio, Ryouzoh Toyoshima, Kazuya Takeda, and Fumitada Itakura. "Audio data hiding using band elimination". Electronics and Communications in Japan (Part II: Electronics) 86, no. 2 (January 15, 2003): 57–67. http://dx.doi.org/10.1002/ecjb.10120.

24

Chen, Lieu-Hen, Pin-Chieh Cheng, Hao-Ming Hung, Wei-Fen Hsieh, and Yasufumi Takama. "An Audio-Visual Information Visualization System for Time-Varying Big Data". SIJ Transactions on Computer Science Engineering & its Applications (CSEA) 03, no. 05 (October 20, 2015): 13–19. http://dx.doi.org/10.9756/sijcsea/v3i5/03080260402.

25

Xu, Yanping, and Sen Xu. "A Clustering Analysis Method for Massive Music Data". Modern Electronic Technology 5, no. 1 (May 6, 2021): 24. http://dx.doi.org/10.26549/met.v5i1.6763.

Abstract:
Clustering analysis plays a very important role in the fields of data mining, image segmentation, and pattern recognition. Cluster analysis is introduced here to analyze NetEYun music data, and different types of music data are clustered to find what the same kind of music has in common. A clustering analysis method oriented to music data is proposed: first, the audio file data is read and the emotional features of the audio are extracted; second, the audio beat period is calculated by Fourier transform; finally, a clustering algorithm is designed to obtain the clustering results of the music data.
26

Teh, Do-Hui. "Apparatus and method for stereo audio encoding of digital audio signal data". Journal of the Acoustical Society of America 103, no. 1 (January 1998): 21. http://dx.doi.org/10.1121/1.423157.

27

Karamchandani, Sunil H., Krutarth J. Gandhi, Siddharth R. Gosalia, Vinod K. Madan, Shabbir N. Merchant, and Uday B. Desai. "PCA Encrypted Short Acoustic Data Inculcated in Digital Color Images". International Journal of Computers Communications & Control 10, no. 5 (July 1, 2015): 678. http://dx.doi.org/10.15837/ijccc.2015.5.2029.

Abstract:
We propose a generalized algorithm for hiding an audio signal using image steganography. The authors suggest transmitting short audio messages camouflaged in digital images, using Principal Component Analysis (PCA) as an encryption technique. The number of principal components required to represent the audio signal after removing redundancies is a measure of the magnitude of the eigenvalues. The technique thus performs the dual task of encrypting and compressing the audio data, enough for it to be buried in the image. A 57 KB audio signal is deciphered from the stego image with a high PSNR of 47.49 and a correspondingly low MSE of 3.3266 × 10^(-6), with a high-quality equalized audio output. Consistent and comparable experimental results across a series of images demonstrate that PCA-based encryption can be adopted as a universal rule for a specific payload and the desired compression ratio.
28

Kang, Yu, Tianqiao Liu, Hang Li, Yang Hao, and Wenbiao Ding. "Self-Supervised Audio-and-Text Pre-training with Extremely Low-Resource Parallel Data". Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 10 (June 28, 2022): 10875–83. http://dx.doi.org/10.1609/aaai.v36i10.21334.

Abstract:
Multimodal pre-training for audio-and-text has recently been shown to be effective and has significantly improved the performance of many downstream speech understanding tasks. However, these state-of-the-art pre-trained audio-text models work well only when provided with large amounts of parallel audio-and-text data, which poses challenges for many languages that are rich in unimodal corpora but lack a parallel cross-modal corpus. In this paper, we investigate whether it is possible to pre-train an audio-text multimodal model with extremely low-resource parallel data and extra non-parallel unimodal data. Our pre-training framework consists of the following components: (1) Intra-modal Denoising Auto-Encoding (IDAE), which reconstructs input text (audio) representations from a noisy version of itself; (2) Cross-modal Denoising Auto-Encoding (CDAE), which is pre-trained to reconstruct the input text (audio) given both a noisy version of the input text (audio) and the corresponding translated noisy audio features (text embeddings); and (3) an Iterative Denoising Process (IDP), which iteratively translates raw audio (text) and the text embeddings (audio features) produced in the previous iteration into new, less noisy text embeddings (audio features). We adapt a dual cross-modal Transformer as our backbone model, consisting of two unimodal encoders for IDAE and two cross-modal encoders for CDAE and IDP. Our method achieves performance comparable to a model pre-trained on fully parallel data across multiple downstream speech understanding tasks, demonstrating the great potential of the proposed method.
29

Budiman, Gelar, Andriyan Bayu Suksmono, and Donny Danudirdjo. "Compressive Sampling with Multiple Bit Spread Spectrum-Based Data Hiding". Applied Sciences 10, no. 12 (June 24, 2020): 4338. http://dx.doi.org/10.3390/app10124338.

Abstract:
We propose a novel data hiding method for an audio host with a compressive sampling technique. An over-complete dictionary represents a group of watermarks; each row of the dictionary is a Hadamard sequence representing multiple bits of the watermark. The singular values of the segment-based host audio in a diagonal matrix are multiplied by the over-complete dictionary, producing a smaller matrix, and at the same time the watermark is embedded into the compressed audio. In the detector, we detect the watermark and reconstruct the audio. The proposed method thus offers not only information hiding but also compression of the audio host. Its applications include broadcast monitoring and biomedical signal recording: we can mark and secure the signal content by hiding the watermark inside the signal while compressing the signal for memory efficiency. We evaluate the performance in terms of payload, compression ratio, audio quality, and watermark quality. The proposed method can hide the data imperceptibly, in the range of 729–5292 bps, with a compression ratio of 1.47–4.84 and a perfectly detected watermark.
30

BAKIR, Çiğdem. "Compressing English Speech Data with Hybrid Methods without Data Loss". International Journal of Applied Mathematics Electronics and Computers 10, no. 3 (September 30, 2022): 68–75. http://dx.doi.org/10.18100/ijamec.1166951.

Abstract:
Understanding the mechanism of speech formation is of great importance for the successful coding of the speech signal. Speech coding is used in various applications, from authenticating audio files to connecting a speech recording to the data acquisition device (e.g. a microphone), and it is of vital importance in the acquisition, analysis, and evaluation of sound and in the investigation of criminal events in forensics. For the collection, processing, analysis, extraction, and evaluation of speech recorded as audio files, which plays an important role in crime detection, it is necessary to compress the audio without data loss. Since much voice-changing software is available today, the number of recorded speech files and their correct interpretation play an important role in establishing originality. This involves improving an incomprehensible speech recording with techniques such as signal processing, noise removal, and filtering to make it comprehensible; determining whether the recording has been manipulated, whether it is original, and whether material has been added or removed; and coding, decoding, and transcribing the sounds. This study first presents what speech coding is, its purposes and areas of use, and a classification of speech coding by features and techniques. Speech coding was then performed on English audio data, a real dataset consisting of approximately 100,000 voice recordings, using waveform, vocoder, and hybrid methods, and the success of each method on our system was measured. Hybrid models gave more successful results than the others. The results obtained will serve as an example for our future work.
31

NOVAMIZANTI, LEDYA, GELAR BUDIMAN, and BHISMA ADI WIBOWO. "Optimasi Sistem Penyembunyian Data pada Audio menggunakan Sub-band Stasioner dan Manipulasi Rata-rata Statistik". ELKOMIKA: Jurnal Teknik Energi Elektrik, Teknik Telekomunikasi, & Teknik Elektronika 6, no. 2 (July 9, 2018): 165. http://dx.doi.org/10.26760/elkomika.v6i2.165.

Abstract:
Copyright infringement of music or songs has become a serious problem for the music industry in Indonesia. Audio watermarking is one solution for protecting the copyright of digital audio against illegal acts by hiding a watermark, in the form of the owner's identity, inside the audio. In this study, the host audio is converted into a 1-dimensional matrix for the framing process. A Stationary Wavelet Transform (SWT) is then used to obtain the selected stationary sub-bands into which the watermark is inserted. The Statistical Mean Manipulation (SMM) method calculates the average of the host audio in one frame and performs the bit-insertion process. Optimization is done by evaluating the parameters that produce the highest BER after the system is subjected to an attack. The optimization yields an audio watermarking system that is robust and resistant to signal interference, with an average BER of 0.113, SNR of 31 dB, ODG of -0.6, and MOS of 4.6. Keywords: audio watermarking, SWT, SMM, optimization
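The statistical-mean-manipulation idea can be sketched in a strongly simplified form: embed one bit per frame by shifting the frame mean to a positive or negative reference value, and extract by checking the sign of the mean. This is a time-domain toy under my own assumptions (the paper operates on SWT sub-bands and optimizes the parameters against attacks); `frame_len` and `delta` below are illustrative.

```python
import numpy as np

def embed_bits(audio, bits, frame_len=1024, delta=0.01):
    """Embed one bit per frame by shifting the frame mean to +delta
    (bit 1) or -delta (bit 0). Time-domain simplification of SMM."""
    out = audio.astype(float).copy()
    for i, bit in enumerate(bits):
        frame = out[i * frame_len:(i + 1) * frame_len]
        target = delta if bit else -delta
        frame += target - frame.mean()   # updates `out` through the view
    return out

def extract_bits(audio, n_bits, frame_len=1024):
    # The sign of each frame's mean recovers the embedded bit.
    return [1 if audio[i * frame_len:(i + 1) * frame_len].mean() > 0 else 0
            for i in range(n_bits)]
```

Because only the per-frame DC offset changes, the distortion is small; robustness to attacks is exactly what the paper's SWT sub-band choice and parameter optimization address.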
32

Huang, Xinchao, Zihan Liu, Wei Lu, Hongmei Liu, and Shijun Xiang. "Fast and Effective Copy-Move Detection of Digital Audio Based on Auto Segment". International Journal of Digital Crime and Forensics 11, no. 2 (April 2019): 47–62. http://dx.doi.org/10.4018/ijdcf.2019040104.

Abstract:
Detecting digital audio forgeries is a significant research focus in the field of audio forensics. In this article, the authors focus on a special form of digital audio forgery, copy-move, and propose a fast and effective method to detect doctored audio. First, the input audio data is segmented into syllables by voice activity detection and syllable detection. Second, points in the frequency domain are selected as features by applying the discrete Fourier transform (DFT) to each audio segment. The segments are then sorted according to these features to obtain a sorted list of audio segments. Finally, each segment is compared only with a few adjacent segments in the sorted list, which decreases the time complexity. Comparisons with other state-of-the-art methods show that the proposed method can verify the authenticity of the input audio and locate the forged position quickly and effectively.
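The segment → DFT feature → sort → compare-adjacent pipeline can be sketched generically. This sketch is not the authors' method: it uses fixed-length segments instead of the paper's voice-activity and syllable detection, and the segment length, feature count, and tolerance are illustrative assumptions.

```python
import numpy as np

def find_copy_move(audio, seg_len=512, n_feat=16, tol=1e-6):
    """Flag pairs of segments whose low-frequency DFT magnitudes match,
    a sign that one segment was copied and pasted elsewhere."""
    n_segs = len(audio) // seg_len
    feats = []
    for i in range(n_segs):
        seg = audio[i * seg_len:(i + 1) * seg_len]
        mag = np.abs(np.fft.rfft(seg))[:n_feat]     # frequency-domain feature
        feats.append((tuple(np.round(mag, 6)), i))
    feats.sort(key=lambda p: p[0])   # near-identical features become adjacent
    pairs = []
    for (fa, ia), (fb, ib) in zip(feats, feats[1:]):
        if np.max(np.abs(np.array(fa) - np.array(fb))) < tol:
            pairs.append(tuple(sorted((ia, ib))))
    return pairs
```

Sorting is what removes the quadratic all-pairs comparison: only neighbours in the sorted list need to be checked.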
33

Xu, Xin, and Su Mei Xi. "Cross-Media Retrieval Method Based on Space Mapping". Advanced Materials Research 756-759 (September 2013): 1898–902. http://dx.doi.org/10.4028/www.scientific.net/amr.756-759.1898.

Abstract:
This paper puts forward a novel cross-media retrieval approach that can process multimedia data of different modalities and measure cross-media similarity, such as image-audio similarity. Both image and audio data are selected for experiments and comparisons. Given the same visual and auditory features, the new approach outperforms the ICA, PCA, and PLS methods in both precision and recall. Overall cross-media retrieval results between images and audio are very encouraging.
34

Jang, Miso, and Dong-Chul Park. "Application of Classifier Integration Model with Confusion Table to Audio Data Classification". International Journal of Machine Learning and Computing 9, no. 3 (June 2019): 368–73. http://dx.doi.org/10.18178/ijmlc.2019.9.3.812.

35

Leban, Roy. "System and method for communicating audio data signals via an audio communications medium". Journal of the Acoustical Society of America 119, no. 2 (2006): 694. http://dx.doi.org/10.1121/1.2174528.

36

HARAHAP, HANNAN, GELAR BUDIMAN, and LEDYA NOVAMIZANTI. "Implementasi Teknik Watermarking menggunakan FFT dan Spread Spectrum Watermark pada Data Audio Digital". ELKOMIKA: Jurnal Teknik Energi Elektrik, Teknik Telekomunikasi, & Teknik Elektronika 4, no. 1 (May 2, 2018): 98. http://dx.doi.org/10.26760/elkomika.v4i1.98.

Abstract:
The use of technology and the internet has grown rapidly, leading to widespread forgery and illegal distribution of digital data. A technology is therefore needed that can protect the copyright of multimedia data such as audio. The most common technique for copyright protection is watermarking, because it meets three main criteria of data security: robustness, imperceptibility, and safety. This research creates a scheme that can protect the copyright of audio data. The method used is the Fast Fourier Transform, which converts the original audio data into the frequency domain before the watermark embedding and extraction processes. The watermark is spread over the most significant components of the magnitude spectrum of the host audio. This technique achieves a Signal-to-Noise Ratio above 20 dB and a Bit Error Rate below 5%. Keywords: audio watermarking, copyright protection, Fast Fourier Transform, magnitude spectrum
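The FFT-based spread-spectrum embedding described in this abstract can be sketched as follows. This is a minimal, non-blind illustration under assumed parameters: the embedding strength `alpha`, the chip generation, and the exact component-selection rule are not specified in the abstract, so the choices below are illustrative. Each watermark bit is spread with a pseudo-random ±1 chip over one of the largest-magnitude FFT components of the host.

```python
import numpy as np

def embed_watermark(host, bits, alpha=0.01, seed=7):
    """Spread each bit with a pseudo-random +/-1 chip over one of the
    largest-magnitude FFT components of the host (multiplicative embedding)."""
    X = np.fft.rfft(host)
    mag, phase = np.abs(X), np.angle(X)
    chips = np.random.default_rng(seed).choice([-1.0, 1.0], size=len(bits))
    idx = np.argsort(mag[1:])[::-1][:len(bits)] + 1   # top components, skip DC
    symbols = 2.0 * np.asarray(bits) - 1.0            # {0,1} -> {-1,+1}
    mag[idx] *= 1.0 + alpha * symbols * chips
    return np.fft.irfft(mag * np.exp(1j * phase), n=len(host)), idx, chips

def extract_watermark(host, watermarked, idx, chips):
    """Non-blind detection: compare magnitudes against the original host."""
    m0 = np.abs(np.fft.rfft(host))[idx]
    m1 = np.abs(np.fft.rfft(watermarked))[idx]
    return ((m1 - m0) * chips > 0).astype(int)

def snr_db(clean, processed):
    """Signal-to-Noise Ratio of the embedding distortion, in dB."""
    noise = clean - processed
    return 10.0 * np.log10(np.sum(clean ** 2) / np.sum(noise ** 2))
```

With a noise-like host and `alpha=0.01` the bits are recovered exactly and the embedding SNR stays comfortably above the 20 dB the paper reports, since each selected component is perturbed by only 1%.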
37

Fejfar, Jiří, Jiří Šťastný, Martin Pokorný, Jiří Balej and Petr Zach. "Analysis of sound data streamed over the network". Acta Universitatis Agriculturae et Silviculturae Mendelianae Brunensis 61, no. 7 (2013): 2105–10. http://dx.doi.org/10.11118/actaun201361072105.

Abstract:
In this paper we inspect the difference between an original sound recording and the signal captured after streaming that recording over a network under heavy traffic. Several kinds of failures occur in the captured recording due to network congestion, and we look for a method to evaluate the correctness of streamed audio. The usual metrics are based on human perception of the signal, such as "the signal is clear, without audible failures", "the signal has some failures but is understandable", or "the signal is inarticulate". These approaches must be statistically evaluated over a broad set of respondents, which is time- and resource-consuming. We instead propose metrics based on signal properties that allow us to compare the original and captured recordings, using the Dynamic Time Warping algorithm (Müller, 2007), commonly applied to time-series comparison; other time-series exploration approaches can be found in (Fejfar, 2011) and (Fejfar, 2012). The data was acquired in our network laboratory, simulating network traffic by downloading files and streaming audio and video simultaneously. Our earlier experiment inspected Quality of Service (QoS) and its impact on failures in the received audio data stream; this experiment focuses on the comparison of sound recordings rather than on network mechanisms. We focus on real-time audio streams such as telephone calls, where it is not possible to stream audio in advance into a "pool"; instead, the delay between the speaker's voice being recorded and the listener's replay must be as small as possible. We use the RTP protocol for streaming audio.
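The Dynamic Time Warping comparison the authors use (Müller, 2007) can be sketched with the classic dynamic-programming recurrence. This toy version operates on raw 1-D sequences, whereas a practical comparison of recordings would typically run on frame-level features:

```python
import numpy as np

def dtw_distance(a, b):
    """Dynamic Time Warping distance between two 1-D sequences via the
    classic O(len(a) * len(b)) dynamic-programming recurrence."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match.
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]
```

Unlike a sample-by-sample difference, DTW tolerates the local time stretching that jitter and buffering introduce: `dtw_distance([0, 1, 2, 3], [0, 0, 1, 2, 3])` is 0 even though the sequences differ element-wise.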
38

Shen, Jiaxing, Jiannong Cao, Oren Lederman, Shaojie Tang and Alex “Sandy” Pentland. "User Profiling Based on Nonlinguistic Audio Data". ACM Transactions on Information Systems 40, no. 1 (January 31, 2022): 1–23. http://dx.doi.org/10.1145/3474826.

Abstract:
User profiling refers to inferring people's attributes of interest (AoIs), such as gender and occupation, which enables applications ranging from personalized services to collective analyses. Massive nonlinguistic audio data brings a novel opportunity for user profiling due to the prevalence of studying spontaneous face-to-face communication. Nonlinguistic audio is coarse-grained audio data without linguistic content; it is collected this way because of privacy concerns in private situations like doctor-patient dialogues. This opportunity facilitates optimized organizational management and personalized healthcare, especially for chronic diseases. In this article, we are the first to build a user profiling system that infers gender and personality from nonlinguistic audio. Since linguistic and acoustic features cannot be extracted from such data, we focus on conversational features that can reflect AoIs. We first develop an adaptive voice activity detection algorithm that addresses individual differences in voice and false-positive voice activities caused by people nearby. Second, we propose a gender-assisted multi-task learning method that combats dynamics in human behavior by integrating gender differences and the correlation of personality traits. In an experimental evaluation with 100 people in 273 meetings, we achieved F1-scores of 0.759 and 0.652 for gender identification and personality recognition, respectively.
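The paper's adaptive voice activity detection algorithm is not specified in the abstract. As a hedged illustration of the general idea it names (adapting the detection threshold to each wearer's own statistics instead of using one fixed value, to absorb individual differences in voice), an energy-based sketch might look like this; the 20th-percentile noise floor and the `k` factor are illustrative assumptions:

```python
import numpy as np

def adaptive_vad(frame_energies, k=1.5):
    """Flag frames as speech when their energy exceeds a threshold adapted
    to this recording: a noise floor estimated from the quietest frames
    plus k standard deviations (both choices are illustrative assumptions)."""
    e = np.asarray(frame_energies, dtype=float)
    noise_floor = np.percentile(e, 20)   # assume the quietest 20% is noise
    return e > noise_floor + k * e.std()
```

Because the threshold is derived from the recording itself, a quiet speaker and a loud speaker each get a threshold matched to their own energy distribution.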
39

Chen, Ke, Xingjian Du, Bilei Zhu, Zejun Ma, Taylor Berg-Kirkpatrick and Shlomo Dubnov. "Zero-Shot Audio Source Separation through Query-Based Learning from Weakly-Labeled Data". Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 4 (June 28, 2022): 4441–49. http://dx.doi.org/10.1609/aaai.v36i4.20366.

Abstract:
Deep learning techniques for separating audio into different sound sources face several challenges. Standard architectures require training separate models for different types of audio sources. Although some universal separators employ a single model to target multiple sources, they have difficulty generalizing to unseen sources. In this paper, we propose a three-component pipeline to train a universal audio source separator from a large, but weakly-labeled dataset: AudioSet. First, we propose a transformer-based sound event detection system for processing weakly-labeled training data. Second, we devise a query-based audio separation model that leverages this data for model training. Third, we design a latent embedding processor to encode queries that specify audio targets for separation, allowing for zero-shot generalization. Our approach uses a single model for source separation of multiple sound types, and relies solely on weakly-labeled data for training. In addition, the proposed audio separator can be used in a zero-shot setting, learning to separate types of audio sources that were never seen in training. To evaluate the separation performance, we test our model on MUSDB18, while training on the disjoint AudioSet. We further verify the zero-shot performance by conducting another experiment on audio source types that are held-out from training. The model achieves comparable Source-to-Distortion Ratio (SDR) performance to current supervised models in both cases.
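As an illustration of the query-based separation idea, the sketch below conditions a spectrogram mask on a latent query embedding (FiLM-style modulation). All weights, shapes, and the conditioning mechanism here are illustrative assumptions, not the paper's actual architecture:

```python
import numpy as np

def query_conditioned_mask(mixture_spec, query_emb, W_scale, W_shift):
    """Estimate a target source by masking the mixture spectrogram with a
    sigmoid mask whose logits are modulated, per frequency bin, by linear
    projections of the query embedding (FiLM-style conditioning)."""
    gamma = W_scale @ query_emb            # per-frequency scale from the query
    beta = W_shift @ query_emb             # per-frequency shift from the query
    logits = gamma[:, None] * mixture_spec + beta[:, None]
    mask = 1.0 / (1.0 + np.exp(-logits))   # values in (0, 1)
    return mask * mixture_spec             # estimated target spectrogram
```

The key property this toy version shares with the paper's design is that the *same* separator weights serve every source type; only the query embedding changes, which is what makes zero-shot separation of unseen source types possible.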
40

Widyastuti, Nadia. "PENERAPAN MEDIA AUDIO VISUAL DALAM PEMBELAJARAN BAHASA INGGRIS KELAS VII DI SMPN 1 SYAMTALIRA BAYU ACEH UTARA". Hudan Lin Naas: Jurnal Ilmu Sosial dan Humaniora 3, no. 2 (December 14, 2022): 59. http://dx.doi.org/10.28944/hudanlinnaas.v3i2.690.

Abstract:
This study discusses the application of audio-visual media in English lessons, motivated by the continuing technological advances in education, one of which is the use of audio-visual media, namely English-language animated videos, short films, and music with English lyrics, as learning media. The study aims to describe the application of audio-visual media in English learning in grade VII and to set out the supporting and inhibiting factors in applying audio-visual media to teaching. It is a descriptive qualitative study whose subjects were grade VII teachers; data were collected through online interviews via WhatsApp. The results show that English lessons using audio-visual media proceeded according to the lesson plans (RPP) prepared by the teacher, and that the media had a positive impact, with learning scores improving from before to after their use; audio-visual media are therefore an appropriate choice of learning medium.
41

MATSUO, Yuichi, and Kazuyo SUEMATSU. "Audio-Visual Technique of Numerical Simulation Data". Journal of the Visualization Society of Japan 20, no. 78 (2000): 197–202. http://dx.doi.org/10.3154/jvs.20.197.

42

AliSabir, Firas. "Hiding Encrypted Data in Audio Wave File". International Journal of Computer Applications 91, no. 4 (April 18, 2014): 6–9. http://dx.doi.org/10.5120/15867-4809.

43

Mudusu, Rambabu, A. Nagesh and M. Sadanandam. "Enhancing Data Security Using Audio-Video Steganography". International Journal of Engineering & Technology 7, no. 2.20 (April 18, 2018): 276. http://dx.doi.org/10.14419/ijet.v7i2.20.14777.

Abstract:
Steganography is a method of concealing secret data such as text, images, or audio behind a different cover file. In this paper we propose a combination of image steganography and audio steganography with face-recognition technology as a tool for authentication. The aim is to hide the secret data behind the audio and the receiver's face image of a video, since a video is composed of many still image frames plus audio. In this technique we choose a frame of the video to hide the receiver's face image, and use the audio to hide the secret data. Suitable algorithms, namely enhanced LSB and RSA, are used to hide the secret text and picture, and the PCA algorithm is used for face recognition. The security and authentication parameters obtained at the receiver and transmitter sides are exactly identical, hence data security is increased.
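The abstract mentions an "enhanced LSB" algorithm without specifying the enhancement. A plain LSB sketch for 16-bit audio samples shows the underlying idea: each carrier sample's least significant bit is overwritten with one message bit, so no sample changes by more than one quantization step.

```python
def lsb_embed(samples, message_bits):
    """Hide one message bit in the least significant bit of each 16-bit
    audio sample; every carrier sample changes by at most 1."""
    stego = list(samples)
    for i, bit in enumerate(message_bits):
        stego[i] = (stego[i] & ~1) | bit   # clear LSB, then set it to the bit
    return stego

def lsb_extract(samples, n_bits):
    """Read the hidden bits back out of the first n_bits samples."""
    return [s & 1 for s in samples[:n_bits]]
```

A ±1 change in a 16-bit sample is roughly 96 dB below full scale, which is why LSB embedding is inaudible; the secret payload would typically be RSA-encrypted before embedding, as in the paper.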
44

H. Kridalaksana, Awang, Andi Yushika Rangan and Asfami Ansharie. "ENKRIPSI DATA AUDIO MENGGUNAKAN METODE KRIPTOGRAFI RSA". Sebatik 17, no. 1 (January 1, 2017): 6–10. http://dx.doi.org/10.46984/sebatik.v17i1.79.

Abstract:
The application of the RSA method to audio data encryption is a study intended to show that cryptographic methods can be used to solve problems of data confidentiality. The aim of this research is to design and build an application, written in Visual Basic .NET, that encrypts data to keep it secret using two keys: a public key for the encryption process and a private key for decryption. Data for this study were collected through a literature review. Testing used the White-Box method to verify the encryption and decryption code, and the Black-Box method to check that the application runs with the correct key algorithm and to test the robustness of the encrypted output against re-encryption with other cryptographic methods. The Prototype development stages (interface design, implementation, and system testing) were followed so that the RSA audio-data encryption application was built in a structured way. The application can serve as an alternative medium for data security.
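The public-key/private-key roles described above can be made concrete with textbook RSA on toy parameters. This is only an illustration of the mathematics; a real implementation (such as the paper's Visual Basic .NET application) would use a cryptographic library, proper key sizes, and padding, since per-byte textbook RSA is deterministic and insecure.

```python
def rsa_keys():
    """Toy textbook-RSA key pair (p and q are far too small for real use)."""
    p, q = 61, 53
    n = p * q                      # modulus: 3233
    phi = (p - 1) * (q - 1)        # 3120
    e = 17                         # public exponent, coprime with phi
    d = pow(e, -1, phi)            # private exponent via modular inverse (Python 3.8+)
    return (e, n), (d, n)

def rsa_encrypt(audio_bytes, public_key):
    """Encrypt each audio byte separately: c = b^e mod n."""
    e, n = public_key
    return [pow(b, e, n) for b in audio_bytes]

def rsa_decrypt(cipher, private_key):
    """Decrypt with the private key: b = c^d mod n."""
    d, n = private_key
    return bytes(pow(c, d, n) for c in cipher)
```

Anyone holding the public key `(e, n)` can encrypt the audio, but only the holder of the private key `(d, n)` can decrypt it, which is exactly the two-key split the abstract describes.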
45

Toyama, Akira. "Apparatus for reproducing digital audio waveform data". Journal of the Acoustical Society of America 103, no. 1 (January 1998): 17. http://dx.doi.org/10.1121/1.423136.

46

Levine, Scott N. "A malleable audio representation for data compression". Journal of the Acoustical Society of America 107, no. 5 (May 2000): 2875. http://dx.doi.org/10.1121/1.428679.

47

Warner, Paul. "System for transmitting data simultaneously with audio". Journal of the Acoustical Society of America 81, no. 1 (January 1987): 212. http://dx.doi.org/10.1121/1.394926.

48

Colasito, Marco, Jeremy Straub and Pratap Kotala. "Correlated lip motion and voice audio data". Data in Brief 21 (December 2018): 856–60. http://dx.doi.org/10.1016/j.dib.2018.10.043.

49

Sophiya, E., and S. Jothilakshmi. "Large scale data based audio scene classification". International Journal of Speech Technology 21, no. 4 (September 4, 2018): 825–36. http://dx.doi.org/10.1007/s10772-018-9552-3.

50

Alhassan, Salamudeen, Mohammed Muniru Iddrisu and Mohammed Ibrahim Daabo. "Securing audio data using K-shuffle technique". Multimedia Tools and Applications 78, no. 23 (October 10, 2019): 33985–97. http://dx.doi.org/10.1007/s11042-019-08151-6.
