Journal articles on the topic 'Speaker embedding'

Author: Grafiati

Published: 28 June 2021

Last updated: 10 February 2022

Consult the top 50 journal articles for your research on the topic 'Speaker embedding.'

1

Kang, Woo Hyun, Sung Hwan Mun, Min Hyun Han, and Nam Soo Kim. "Disentangled Speaker and Nuisance Attribute Embedding for Robust Speaker Verification." IEEE Access 8 (2020): 141838–49. http://dx.doi.org/10.1109/access.2020.3012893.

2

Lee, Kong Aik, Qiongqiong Wang, and Takafumi Koshinaka. "Xi-Vector Embedding for Speaker Recognition." IEEE Signal Processing Letters 28 (2021): 1385–89. http://dx.doi.org/10.1109/lsp.2021.3091932.

3

Sečujski, Milan, Darko Pekar, Siniša Suzić, Anton Smirnov, and Tijana Nosek. "Speaker/Style-Dependent Neural Network Speech Synthesis Based on Speaker/Style Embedding." JUCS - Journal of Universal Computer Science 26, no. 4 (April 28, 2020): 434–53. http://dx.doi.org/10.3897/jucs.2020.023.

Abstract:

The paper presents a novel architecture and method for training neural networks to produce synthesized speech in a particular voice and speaking style, based on a small quantity of target speaker/style training data. The method is based on neural network embedding, i.e. mapping of discrete variables into continuous vectors in a low-dimensional space, which has been shown to be a very successful universal deep learning technique. In this particular case, different speaker/style combinations are mapped into different points in a low-dimensional space, which enables the network to capture the similarities and differences between speakers and speaking styles more efficiently. The initial model from which speaker/style adaptation was carried out was a multi-speaker/multi-style model based on 8.5 hours of American English speech data which corresponds to 16 different speaker/style combinations. The results of the experiments show that both versions of the obtained system, one using 10 minutes and the other as little as 30 seconds of target data, outperform the state of the art in parametric speaker/style-dependent speech synthesis. This opens a wide range of applications for speaker/style-dependent speech synthesis based on small quantities of training data, in domains ranging from customer interaction in call centers to robot-assisted medical therapy.

4

Bae, Ara, and Wooil Kim. "Speaker Verification Employing Combinations of Self-Attention Mechanisms." Electronics 9, no. 12 (December 21, 2020): 2201. http://dx.doi.org/10.3390/electronics9122201.

Abstract:

One of the most recent speaker recognition methods that demonstrates outstanding performance in noisy environments involves extracting the speaker embedding using an attention mechanism instead of average or statistics pooling. In the attention method, the speaker recognition performance is improved by employing multiple heads rather than a single head. In this paper, we propose advanced methods to extract a new embedding by compensating for the disadvantages of the single-head and multi-head attention methods. The combination method comprising single-head and split-based multi-head attentions shows a 5.39% Equal Error Rate (EER). When the single-head and projection-based multi-head attention methods are combined, the speaker recognition performance improves by 4.45%, which is the best performance in this work. Our experimental results demonstrate that the attention mechanism reflects the speaker’s properties more effectively than average or statistics pooling, and the speaker verification system could be further improved by employing combinations of different attention techniques.

5

Bahmaninezhad, Fahimeh, Chunlei Zhang, and John H. L. Hansen. "An investigation of domain adaptation in speaker embedding space for speaker recognition." Speech Communication 129 (May 2021): 7–16. http://dx.doi.org/10.1016/j.specom.2021.01.001.

6

Li, Wenjie, Pengyuan Zhang, and Yonghong Yan. "TEnet: target speaker extraction network with accumulated speaker embedding for automatic speech recognition." Electronics Letters 55, no. 14 (July 2019): 816–19. http://dx.doi.org/10.1049/el.2019.1228.

7

Mingote, Victoria, Antonio Miguel, Alfonso Ortega, and Eduardo Lleida. "Supervector Extraction for Encoding Speaker and Phrase Information with Neural Networks for Text-Dependent Speaker Verification." Applied Sciences 9, no. 16 (August 11, 2019): 3295. http://dx.doi.org/10.3390/app9163295.

Abstract:

In this paper, we propose a new differentiable neural network with an alignment mechanism for text-dependent speaker verification. Unlike previous works, we do not extract the embedding of an utterance from the global average pooling of the temporal dimension. Our system replaces this reduction mechanism by a phonetic phrase alignment model to keep the temporal structure of each phrase since the phonetic information is relevant in the verification task. Moreover, we can apply a convolutional neural network as front-end, and, thanks to the alignment process being differentiable, we can train the network to produce a supervector for each utterance that will be discriminative to the speaker and the phrase simultaneously. This choice has the advantage that the supervector encodes the phrase and speaker information providing good performance in text-dependent speaker verification tasks. The verification process is performed using a basic similarity metric. The new model using alignment to produce supervectors was evaluated on the RSR2015-Part I database, providing competitive results compared to similar size networks that make use of the global average pooling to extract embeddings. Furthermore, we also evaluated this proposal on the RSR2015-Part II. To our knowledge, this system achieves the best published results obtained on this second part.

8

Liang, Chunyan, Lin Yang, Qingwei Zhao, and Yonghong Yan. "Factor Analysis of Neighborhood-Preserving Embedding for Speaker Verification." IEICE Transactions on Information and Systems E95.D, no. 10 (2012): 2572–76. http://dx.doi.org/10.1587/transinf.e95.d.2572.

9

Lin, Weiwei, Man-Wai Mak, Na Li, Dan Su, and Dong Yu. "A Framework for Adapting DNN Speaker Embedding Across Languages." IEEE/ACM Transactions on Audio, Speech, and Language Processing 28 (2020): 2810–22. http://dx.doi.org/10.1109/taslp.2020.3030499.

10

Byun, Jaeuk, and Jong Won Shin. "Monaural Speech Separation Using Speaker Embedding From Preliminary Separation." IEEE/ACM Transactions on Audio, Speech, and Language Processing 29 (2021): 2753–63. http://dx.doi.org/10.1109/taslp.2021.3101617.

11

Wang, Shuai, Yexin Yang, Zhanghao Wu, Yanmin Qian, and Kai Yu. "Data Augmentation Using Deep Generative Models for Embedding Based Speaker Recognition." IEEE/ACM Transactions on Audio, Speech, and Language Processing 28 (2020): 2598–609. http://dx.doi.org/10.1109/taslp.2020.3016498.

12

Wang, Shuai, Zili Huang, Yanmin Qian, and Kai Yu. "Discriminative Neural Embedding Learning for Short-Duration Text-Independent Speaker Verification." IEEE/ACM Transactions on Audio, Speech, and Language Processing 27, no. 11 (November 2019): 1686–96. http://dx.doi.org/10.1109/taslp.2019.2928128.

13

Seo, Soonshin, and Ji-Hwan Kim. "Self-Attentive Multi-Layer Aggregation with Feature Recalibration and Deep Length Normalization for Text-Independent Speaker Verification System." Electronics 9, no. 10 (October 17, 2020): 1706. http://dx.doi.org/10.3390/electronics9101706.

Abstract:

One of the most important parts of a text-independent speaker verification system is speaker embedding generation. Previous studies demonstrated that shortcut connections-based multi-layer aggregation improves the representational power of a speaker embedding system. However, model parameters are relatively large in number, and unspecified variations increase in the multi-layer aggregation. Therefore, in this study, we propose a self-attentive multi-layer aggregation with feature recalibration and deep length normalization for a text-independent speaker verification system. To reduce the number of model parameters, we set the ResNet with the scaled channel width and layer depth as a baseline. To control the variability in the training, we apply a self-attention mechanism to perform multi-layer aggregation with dropout regularizations and batch normalizations. Subsequently, we apply a feature recalibration layer to the aggregated feature using fully-connected layers and nonlinear activation functions. Further, deep length normalization is used on a recalibrated feature in the training process. Experimental results using the VoxCeleb1 evaluation dataset showed that the performance of the proposed methods was comparable to that of state-of-the-art models (equal error rate of 4.95% and 2.86%, using the VoxCeleb1 and VoxCeleb2 training datasets, respectively).

14

Byun, Sung-Woo, and Seok-Pil Lee. "Design of a Multi-Condition Emotional Speech Synthesizer." Applied Sciences 11, no. 3 (January 26, 2021): 1144. http://dx.doi.org/10.3390/app11031144.

Abstract:

Recently, researchers have developed text-to-speech models based on deep learning, which have produced results superior to those of previous approaches. However, because those systems only mimic the generic speaking style of reference audio, it is difficult to assign user-defined emotional types to synthesized speech. This paper proposes an emotional speech synthesizer constructed by embedding not only speaking styles but also emotional styles. We extend speaker embedding to multi-condition embedding by adding emotional embedding in Tacotron, so that the synthesizer can generate emotional speech. An evaluation of the results showed the superiority of the proposed model to a previous model, in terms of emotional expressiveness.

15

You, Mingyu, Guo-Zheng Li, Jack Y. Yang, and Mary Qu Yang. "An Enhanced Lipschitz Embedding Classifier for Multi-Emotion Speech Analysis." International Journal of Pattern Recognition and Artificial Intelligence 23, no. 08 (December 2009): 1685–700. http://dx.doi.org/10.1142/s0218001409007764.

Abstract:

This paper proposes an Enhanced Lipschitz Embedding based Classifier (ELEC) for the classification of multi-emotions from speech signals. ELEC adopts geodesic distance to preserve the intrinsic geometry at all scales of speech corpus, instead of Euclidean distance. Based on the minimal geodesic distance to vectors of different emotions, ELEC maps the high dimensional feature vectors into a lower space. Through analyzing the class labels of the neighbor training vectors in the compressed low space, ELEC classifies the test data into six archetypal emotional states, i.e. neutral, anger, fear, happiness, sadness and surprise. Experimental results on clear and noisy data set demonstrate that compared with the traditional methods of dimensionality reduction and classification, ELEC achieves 15% improvement on average for speaker-independent emotion recognition and 11% for speaker-dependent.

16

Viñals, Ignacio, Alfonso Ortega, Antonio Miguel, and Eduardo Lleida. "An Analysis of the Short Utterance Problem for Speaker Characterization." Applied Sciences 9, no. 18 (September 5, 2019): 3697. http://dx.doi.org/10.3390/app9183697.

Abstract:

Speaker characterization has always been conditioned by the length of the evaluated utterances. Despite performing well with large amounts of audio, significant degradations in performance are obtained when short utterances are considered. In this work we present an analysis of the short utterance problem providing an alternative point of view. From our perspective the performance in the evaluation of short utterances is highly influenced by the phonetic similarity between enrollment and test utterances. Both enrollment and test should contain similar phonemes to properly discriminate, being degraded otherwise. In this study we also interpret short utterances as incomplete long utterances where some acoustic units are either unbalanced or just missing. These missing units make the speaker representations unreliable. These unreliable representations are biased with respect to the reference counterparts, obtained from long utterances. These undesired shifts increase the intra-speaker variability, causing a significant loss of performance. According to our experiments, short utterances (3–60 s) can perform as accurately as long utterances by just reassuring the phonetic distributions. This analysis is determined by the current embedding extraction approach, based on the accumulation of local short-time information. Thus it is applicable to most of the state-of-the-art embeddings, including traditional i-vectors and Deep Neural Network (DNN) x-vectors.

17

Kang, Woo Hyun, and Nam Soo Kim. "Unsupervised Learning of Total Variability Embedding for Speaker Verification with Random Digit Strings." Applied Sciences 9, no. 8 (April 17, 2019): 1597. http://dx.doi.org/10.3390/app9081597.

Abstract:

Recently, the increasing demand for voice-based authentication systems has encouraged researchers to investigate methods for verifying users with short randomized pass-phrases with constrained vocabulary. The conventional i-vector framework, which has been proven to be a state-of-the-art utterance-level feature extraction technique for speaker verification, is not considered to be an optimal method for this task since it is known to suffer from severe performance degradation when dealing with short-duration speech utterances. More recent approaches that implement deep-learning techniques for embedding the speaker variability in a non-linear fashion have shown impressive performance in various speaker verification tasks. However, since most of these techniques are trained in a supervised manner, which requires speaker labels for the training data, it is difficult to use them when a scarce amount of labeled data is available for training. In this paper, we propose a novel technique for extracting an i-vector-like feature based on the variational autoencoder (VAE), which is trained in an unsupervised manner to obtain a latent variable representing the variability within a Gaussian mixture model (GMM) distribution. The proposed framework is compared with the conventional i-vector method using the TIDIGITS dataset. Experimental results showed that the proposed method could cope with the performance deterioration caused by the short duration. Furthermore, the performance of the proposed approach improved significantly when applied in conjunction with the conventional i-vector framework.

18

Kang, Woo Hyun, and Nam Soo Kim. "Adversarially Learned Total Variability Embedding for Speaker Recognition with Random Digit Strings." Sensors 19, no. 21 (October 30, 2019): 4709. http://dx.doi.org/10.3390/s19214709.

Abstract:

Over the recent years, various research has been conducted to investigate methods for verifying users with a short randomized pass-phrase due to the increasing demand for voice-based authentication systems. In this paper, we propose a novel technique for extracting an i-vector-like feature based on an adversarially learned inference (ALI) model which summarizes the variability within the Gaussian mixture model (GMM) distribution through a nonlinear process. Analogous to the previously proposed variational autoencoder (VAE)-based feature extractor, the proposed ALI-based model is trained to generate the GMM supervector according to the maximum likelihood criterion given the Baum–Welch statistics of the input utterance. However, to prevent the potential loss of information caused by the Kullback–Leibler divergence (KL divergence) regularization adopted in the VAE-based model training, the newly proposed ALI-based feature extractor exploits a joint discriminator to ensure that the generated latent variable and the GMM supervector are more realistic. The proposed framework is compared with the conventional i-vector and VAE-based methods using the TIDIGITS dataset. Experimental results show that the proposed method can represent the uncertainty caused by the short duration better than the VAE-based method. Furthermore, the proposed approach has shown great performance when applied in association with the standard i-vector framework.

19

Claridge, Claudia, Ewa Jonsson, and Merja Kytö. "Entirely innocent: a historical sociopragmatic analysis of maximizers in the Old Bailey Corpus." English Language and Linguistics 24, no. 4 (December 23, 2019): 855–74. http://dx.doi.org/10.1017/s1360674319000388.

Abstract:

Based on an investigation of the Old Bailey Corpus, this article explores the development and usage patterns of maximizers in Late Modern English (LModE). The maximizers to be considered for inclusion in the study are based on the lists provided in Quirk et al. (1985) and Huddleston & Pullum (2002). The aims of the study were to (i) document the frequency development of maximizers, (ii) investigate the sociolinguistic embedding of maximizer usage (gender, class) and (iii) analyze the sociopragmatics of maximizers based on the speakers’ roles, such as judge or witness, in the courtroom. Of the eleven maximizer types focused on in the investigation, perfectly and entirely were found to dominate in frequency. The whole group was found to rise over the period 1720 to 1913. In terms of gender, social class and speaker roles, there was variation in the use of maximizers across the different speaker groups. Prominently, defendants, but also judges and lawyers, maximized more than witnesses and victims; further, male speakers and higher-ranking speakers used more maximizers. The results were interpreted taking into account the courtroom context and its dialogue dynamics.

20

Alexandropoulou, Stavroula, Jakub Dotlačil, and Rick Nouwen. "At least ignorance inferences come at a processing cost: Support from eye movements." Semantics and Linguistic Theory 26 (October 25, 2016): 795. http://dx.doi.org/10.3765/salt.v26i0.3944.

Abstract:

We present results of an eye-tracking reading study that directly probes ignorance effects of the superlative numeral modifier at least in embedding and unembedding environments. We find that interpreting a numeral (phrase) modified by at least in a context with an ignorant speaker is costlier than in a context with a knowledgeable speaker, regardless of whether at least is in an embedding environment or not. In line with online studies testing scalar implicatures using a similar paradigm, this finding is taken to suggest that the observed processing cost is due to the derivation of ignorance interpretations via a pragmatic mechanism. Our results, given the paradigm we employ, further enable us to adjudicate not only between semantic and pragmatic accounts of ignorance, but also among various pragmatic proposals, favouring neo-Gricean accounts that derive ignorance as a quantity implicature (Büring 2008; Cummins & Katsos 2010; Schwarz 2013; Kennedy 2015). We find no evidence indicating that ignorance with at least in interaction with a universal modal involves an extra operation, like covert movement.

21

Gomez-Alanis, Alejandro, Jose A. Gonzalez-Lopez, S. Pavankumar Dubagunta, Antonio M. Peinado, and Mathew Magimai.-Doss. "On Joint Optimization of Automatic Speaker Verification and Anti-Spoofing in the Embedding Space." IEEE Transactions on Information Forensics and Security 16 (2021): 1579–93. http://dx.doi.org/10.1109/tifs.2020.3039045.

22

Keydana, Götz. "‘Finite’ infinitives in Ancient Greek." Indo-European Linguistics 5, no. 1 (2017): 49–76. http://dx.doi.org/10.1163/22125892-00501003.

Abstract:

In this paper I argue that the unembedded Accusativus cum Infinitivo in Ancient Greek is a case of hearer-induced grammaticalization. AcI embedded under verba dicendi or δοκεῖν in deontic contexts are ambiguous: while the speaker intends the deontic reading as a side meaning of the embedding verb, it can also be attributed to the AcI. If the hearer opts for the latter analysis, a new function of the AcI emerges, which ultimately leads to its de-embedding. I show that this grammaticalization process is parallel to similar reanalyses in sound change. In a short outlook the analysis is extended to the emergence of absolute constructions.

23

Vujović, Mia. "POREĐENJE SISTEMA ZA SINTEZU EKSPRESIVNOG GOVORA SA MOGUĆNOŠĆU KONTROLE JAČINE EMOCIJE." Zbornik radova Fakulteta tehničkih nauka u Novom Sadu 36, no. 01 (December 26, 2020): 103–6. http://dx.doi.org/10.24867/11be18vujovic.

Abstract:

In expressive speech synthesis, it is important to generate emotionally colored speech that reflects the complexity of emotional states. Many TTS systems model emotions in synthesized speech as discrete sets, but only when the variations that exist within emotional states are also taken into account can the generated speech resemble human speech. This paper comprises a theoretical analysis and comparison of two innovative systems for expressive speech synthesis that model the complexity of emotions as continuous vectors that can be manipulated. The results show that the approach based on t-SNE embedding vectors is applicable only to specific databases, while the other approach, based on interpolation of points in the embedding space of a multi-speaker, multi-style model, is more general but requires additional analysis.

25

Fennell, Christopher, and Krista Byers-Heinlein. "You sound like Mommy." International Journal of Behavioral Development 38, no. 4 (June 4, 2014): 309–16. http://dx.doi.org/10.1177/0165025414530631.

Abstract:

Previous research indicates that monolingual infants have difficulty learning minimal pairs (i.e., words differing by one phoneme) produced by a speaker uncharacteristic of their language environment and that bilinguals might share this difficulty. To clearly reveal infants’ underlying phonological representations, we minimized task demands by embedding target words in naming phrases, using a fully crossed, between-subjects experimental design. We tested 17-month-old French-English bilinguals’ (N = 30) and English monolinguals’ (N = 31) learning of a minimal pair (/kɛm/ – /gɛm/) produced by an adult bilingual or monolingual. Infants learned the minimal pair only when the speaker matched their language environment. This vulnerability to subtle changes in word pronunciation reveals that neither monolingual nor bilingual 17-month-olds possess fully generalizable phonological representations.

26

van Duijn, Max, and Arie Verhagen. "Recursive embedding of viewpoints, irregularity, and the role for a flexible framework." Pragmatics. Quarterly Publication of the International Pragmatics Association (IPrA) 29, no. 2 (March 26, 2019): 198–225. http://dx.doi.org/10.1075/prag.18049.van.

Abstract:

This paper discusses several conventional perspective operators at the lexical, grammatical, and narrative levels. When combined with each other and with particular contexts, these operators can amount to unexpected viewpoint arrangements. Traditional conceptualisations in terms of viewpoint embedding and the regular shifting from one viewpoint to the other are argued to be insufficient for describing these arrangements in all their nuances and details. We present an analysis of three cases in which viewpoints of speaker, addressee, and third parties are mutually coordinated: (i) global and local perspective structure in Nabokov’s novel Lolita, (ii) postposed reporting constructions in Dutch, and (iii) the Russian apprehensive construction, which has a seemingly redundant negation marker in the subordinate clause. For each of these three cases, we discuss how traditional conceptualisations fall short. We discuss an alternative model of viewpoint construction which allows for the conceptual juxtaposition and mixing of different and simultaneously activated viewpoints.

27

Gan, Zibang, Biqing Zeng, Lianglun Cheng, Shuai Liu, Heng Yang, Mayi Xu, and Meirong Ding. "RoRePo: Detecting the role information and relative position information for contexts in multi-turn dialogue generation." Journal of Intelligent & Fuzzy Systems 40, no. 5 (April 22, 2021): 10003–15. http://dx.doi.org/10.3233/jifs-202641.

Abstract:

In multi-turn dialogue generation, dialogue contexts have been shown to have an important influence on the reasoning of the next round of dialogue. A multi-turn dialogue between two people should be able to give a reasonable response according to the relevant context. However, the widely used hierarchical recurrent encoder-decoder model and the latest model that detects the relevant contexts with self-attention face the same problem: their responses do not match the identity of the current speaker, which we call role ambiguity. In this paper, we propose a new model, named RoRePo, to tackle this problem by detecting the role information and relative position information. First, as a part of the decoder input, we add a role embedding to identify different speakers. Second, we incorporate a self-attention mechanism with relative position representation for dialogue context understanding. In addition, the design of our model architecture considers the influence of latent variables in generating more diverse responses. Experimental results of our evaluations on the DailyDialog and DSTC7_AVSD datasets show that our proposed model advances multi-turn dialogue generation.

28

Dishar, Inst Iqbal Sahib. "Assimilation in Selected Texts of Holy Quran: A Phonological Study." Alustath Journal for Human and Social Sciences 223, no. 1 (December 1, 2017): 103–20. http://dx.doi.org/10.36473/ujhss.v223i1.315.

Abstract:

Assimilation is a phonological, linguistic phenomenon. It is a change that may occur between words or within a word when one speech sound comes to resemble or become identical with a neighboring sound. In swift speech, adjoining consonant sounds often influence one another to produce changes involving modification in voicing, place of articulation, or in both voicing and place (Majeed & Mahmad, 1997: 124). The main reason behind this change is the tendency of the speaker towards ease of articulation and/or economy of effort. The study aims to contrast assimilation in Classical Arabic and Standard English in order to identify the different and similar patterns in the two languages.

29

Rühlemann, Christoph, and Matthew Brook O'Donnell. "Introducing a corpus of conversational stories. Construction and annotation of the Narrative Corpus." Corpus Linguistics and Linguistic Theory 8, no. 2 (October 26, 2012): 313–50. http://dx.doi.org/10.1515/cllt-2012-0015.

Abstract:

Although widely seen as critical both in terms of its frequency and its social significance as a prime means of encoding and perpetuating moral stance and configuring self and identity, conversational narrative has received little attention in corpus linguistics. In this paper we describe the construction and annotation of a corpus that is intended to advance the linguistic theory of this fundamental mode of everyday social interaction: the Narrative Corpus (NC). The NC contains narratives extracted from the demographically-sampled subcorpus of the British National Corpus (BNC) (XML version). It includes more than 500 narratives, socially balanced in terms of participant sex, age, and social class. We describe the extraction techniques, selection criteria, and sampling methods used in constructing the NC. Further, we describe four levels of annotation implemented in the corpus: speaker (social information on speakers), text (text IDs, title, type of story, type of embedding etc.), textual components (pre-/post-narrative talk, narrative, and narrative-initial/final utterances), and utterance (participation roles, quotatives and reporting modes). A brief rationale is given for each level of annotation, and possible avenues of research facilitated by the annotation are sketched out.

30

Geric, Michelle. "Reading Maud's Remains: Tennyson, Geological Processes, and Palaeontological Reconstructions." Victorian Literature and Culture 42, no. 1 (February 19, 2014): 59–79. http://dx.doi.org/10.1017/s1060150313000260.

Abstract:

As Tennyson's “little Hamlet,” Maud (1855) posits a speaker who, like Hamlet, confronts the ignominious fate of dead remains. Maud's speaker contemplates such remains as bone, hair, shell, and he experiences his world as one composed of hard inorganic matter, such things as rocks, gems, flint, stone, coal, and gold. While Maud's imagery of “stones, and hard substances” has been read as signifying the speaker's desire “unnaturally to harden himself into insensibility” (Killham 231, 235), I argue that these substances benefit from being read in the context of Tennyson's wider understanding of geological processes. Along with highlighting these materials, the text's imagery focuses on processes of fossilisation, while Maud's characters appear to be in the grip of an insidious petrification. Despite the preoccupation with geological materials and processes, the poem has received little critical attention in these terms. Dennis R. Dean, for example, whose Tennyson and Geology (1985) is still the most rigorous study of the sources of Tennyson's knowledge of geology, does not detect a geological register in the poem, arguing that by the time Tennyson began to write Maud, he was “relatively at ease with the geological world” (Dean 21). I argue, however, that Maud reveals that Tennyson was anything but “at ease” with geology. While In Memoriam (1851) wrestles with religious doubt that is both initiated, and, to some extent, alleviated by geological theories, it finally affirms the transcendence of spirit over matter. Maud, conversely, gravitates towards the ground, concerning itself with the corporal remains of life and with the agents of change that operate on all matter.
Influenced by his reading of geology, and particularly Charles Lyell's provocative writings on the embedding and fossilisation of organic material in strata in his Principles of Geology (1830–33) volume 2, Tennyson's poem probes the taphonomic processes that result in the incorporation of dead remains and even living flesh into the geological system.

31

Häusler, Christian Olaf, and Michael Hanke. "A studyforrest extension, an annotation of spoken language in the German dubbed movie “Forrest Gump” and its audio-description." F1000Research 10 (January 28, 2021): 54. http://dx.doi.org/10.12688/f1000research.27621.1.

Abstract:

Here we present an annotation of speech in the audio-visual movie “Forrest Gump” and its audio-description for a visually impaired audience, as an addition to a large public functional brain imaging dataset (studyforrest.org). The annotation provides information about the exact timing of each of the more than 2500 spoken sentences, 16,000 words (including 202 non-speech vocalizations), 66,000 phonemes, and their corresponding speaker. Additionally, for every word, we provide lemmatization, a simple part-of-speech-tagging (15 grammatical categories), a detailed part-of-speech tagging (43 grammatical categories), syntactic dependencies, and a semantic analysis based on word embedding which represents each word in a 300-dimensional semantic space. To validate the dataset’s quality, we build a model of hemodynamic brain activity based on information drawn from the annotation. Results suggest that the annotation’s content and quality enable independent researchers to create models of brain activity correlating with a variety of linguistic aspects under conditions of near-real-life complexity.

32

Wang, Yansen, Ying Shen, Zhun Liu, Paul Pu Liang, Amir Zadeh, and Louis-Philippe Morency. "Words Can Shift: Dynamically Adjusting Word Representations Using Nonverbal Behaviors." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 7216–23. http://dx.doi.org/10.1609/aaai.v33i01.33017216.

Abstract:

Humans convey their intentions through the usage of both verbal and nonverbal behaviors during face-to-face communication. Speaker intentions often vary dynamically depending on different nonverbal contexts, such as vocal patterns and facial expressions. As a result, when modeling human language, it is essential to not only consider the literal meaning of the words but also the nonverbal contexts in which these words appear. To better model human language, we first model expressive nonverbal representations by analyzing the fine-grained visual and acoustic patterns that occur during word segments. In addition, we seek to capture the dynamic nature of nonverbal intents by shifting word representations based on the accompanying nonverbal behaviors. To this end, we propose the Recurrent Attended Variation Embedding Network (RAVEN) that models the fine-grained structure of nonverbal subword sequences and dynamically shifts word representations based on nonverbal cues. Our proposed model achieves competitive performance on two publicly available datasets for multimodal sentiment analysis and emotion recognition. We also visualize the shifted word representations in different nonverbal contexts and summarize common patterns regarding multimodal variations of word representations.

33

Boursinos, Dimitrios, and Xenofon Koutsoukos. "Assurance monitoring of learning-enabled cyber-physical systems using inductive conformal prediction based on distance learning." Artificial Intelligence for Engineering Design, Analysis and Manufacturing 35, no. 2 (May 2021): 251–64. http://dx.doi.org/10.1017/s089006042100010x.

Abstract:

Machine learning components such as deep neural networks are used extensively in cyber-physical systems (CPS). However, such components may introduce new types of hazards that can have disastrous consequences and need to be addressed for engineering trustworthy systems. Although deep neural networks offer advanced capabilities, they must be complemented by engineering methods and practices that allow effective integration in CPS. In this paper, we propose an approach for assurance monitoring of learning-enabled CPS based on the conformal prediction framework. In order to allow real-time assurance monitoring, the approach employs distance learning to transform high-dimensional inputs into lower size embedding representations. By leveraging conformal prediction, the approach provides well-calibrated confidence and ensures a bounded small error rate while limiting the number of inputs for which an accurate prediction cannot be made. We demonstrate the approach using three datasets of mobile robot following a wall, speaker recognition, and traffic sign recognition. The experimental results demonstrate that the error rates are well-calibrated while the number of alarms is very small. Furthermore, the method is computationally efficient and allows real-time assurance monitoring of CPS.

34

Anthonissen, Lynn, Astrid De Wit, and Tanja Mortelmans. "(Inter)subjective uses of the Dutch progressive constructions." Linguistics 57, no. 5 (September 25, 2019): 1111–59. http://dx.doi.org/10.1515/ling-2019-0019.

Abstract:

This paper addresses the (inter)subjective functions of progressive aspect in Dutch. While the aspectual profile of the various Dutch progressive constructions has received considerable attention in the last few years, much less attention has been paid to their non-aspectual uses. As we will demonstrate in this paper on the basis of a corpus study of spoken Dutch, complemented with native-speaker elicitations, the Dutch progressive constructions can be specifically recruited to express (inter)subjective meanings such as surprise, irritation and intensity, and they differ in this respect from their simplex counterparts. Our analysis of progressive aspect in terms of backgrounded boundaries provides an explanation for (i) the general association of progressive aspect with (inter)subjectivity and (ii) our observation that some Dutch progressive constructions are more prone to such (inter)subjective exploitation than others. This semantic account also underlies the last part of this contribution, in which we discuss cases of what we call “(inter)subjective reinforcement” in complex progressive constructions, that is, the embedding of progressive constructions in other constructions that are also semantically affiliated to (inter)subjectivity (e.g. the perfect, gaan/komen ‘go/come’, modals and the bare infinitive construction), which has been largely neglected in the literature.

35

Kaler, Jasmeet. "7 ASAS-EAAP Exchange Speaker Talk: Increasing animal welfare and efficiency on-farm with the use of precision technology tools: technology development, application, and use." Journal of Animal Science 97, Supplement_3 (December 2019): 12–13. http://dx.doi.org/10.1093/jas/skz258.023.

Abstract:

Recent advances in bio-telemetry technology have made it possible to generate a lot of data through sensors, which could be used to monitor welfare and classify behavioural activities in many different farm animals. However, little has been done with regards to evaluating predictive ability and comparing various machine learning approaches for ‘big data’ and also evaluating how this changes depending on sampling frequencies and position of sensors. In this talk, I will discuss technological development covering a range of sensor technologies utilising state-of-the-art computation and transmission protocols we have co-developed as part of our research and on how we used these technologies to build machine learning algorithms for lameness in, and drinking behaviour in cows, with an ultimate aim to improve animal welfare. Algorithms could classify behaviours with overall accuracy above 95%; however, the accuracy varied by number of features used, choice of algorithm and window size used for feature generation. The talk will focus on challenges and approaches to build smart systems that are not only technologically advanced, have good accuracy, algorithms that continue to learn and versatile but also energy efficient and practical. While precision livestock farming has been a growing area for the past decade and has huge potential to improve livestock health and welfare, technology adoption has not occurred at the same pace. We need to understand farmers’ perceptions and understanding around technology, its use on farms and in farming. Results from our research with farmers suggest a few key areas are important for embedding and adoption of technology on farms: first, utility of the technology, lack of validation and its ability to fit with existing structures and practices and the beliefs held by farmers that the use of the device may result in a loss of skill in future, that of the farmer knowing his animals.

36

Ahmad, Rehan, and Syed Zubair. "Unsupervised deep feature embeddings for speaker diarization." Turkish Journal of Electrical Engineering & Computer Sciences 27, no. 4 (July 26, 2019): 3138–49. http://dx.doi.org/10.3906/elk-1901-125.

37

Sun, Guangzhi, Chao Zhang, and Philip C. Woodland. "Combination of deep speaker embeddings for diarisation." Neural Networks 141 (September 2021): 372–84. http://dx.doi.org/10.1016/j.neunet.2021.04.020.

38

Keriven, Nicolas, Anthony Bourrier, Rémi Gribonval, and Patrick Pérez. "Sketching for large-scale learning of mixture models." Information and Inference: A Journal of the IMA 7, no. 3 (December 22, 2017): 447–508. http://dx.doi.org/10.1093/imaiai/iax015.

Abstract:

Learning parameters from voluminous data can be prohibitive in terms of memory and computational requirements. We propose a ‘compressive learning’ framework, where we estimate model parameters from a sketch of the training data. This sketch is a collection of generalized moments of the underlying probability distribution of the data. It can be computed in a single pass on the training set and is easily computable on streams or distributed datasets. The proposed framework shares similarities with compressive sensing, which aims at drastically reducing the dimension of high-dimensional signals while preserving the ability to reconstruct them. To perform the estimation task, we derive an iterative algorithm analogous to sparse reconstruction algorithms in the context of linear inverse problems. We exemplify our framework with the compressive estimation of a Gaussian mixture model (GMM), providing heuristics on the choice of the sketching procedure and theoretical guarantees of reconstruction. We experimentally show on synthetic data that the proposed algorithm yields results comparable to the classical expectation-maximization technique while requiring significantly less memory and fewer computations when the number of database elements is large. We further demonstrate the potential of the approach on real large-scale data (over $10^{8}$ training samples) for the task of model-based speaker verification. Finally, we draw some connections between the proposed framework and approximate Hilbert space embedding of probability distributions using random features. We show that the proposed sketching operator can be seen as an innovative method to design translation-invariant kernels adapted to the analysis of GMMs. We also use this theoretical framework to derive preliminary information preservation guarantees, in the spirit of infinite-dimensional compressive sensing.

39

Villalba, Jesús, Nanxin Chen, David Snyder, Daniel Garcia-Romero, Alan McCree, Gregory Sell, Jonas Borgstrom, et al. "State-of-the-art speaker recognition with neural network embeddings in NIST SRE18 and Speakers in the Wild evaluations." Computer Speech & Language 60 (March 2020): 101026. http://dx.doi.org/10.1016/j.csl.2019.101026.

40

Pinheiro, Hector N. B., Tsang Ing Ren, André G. Adami, and George D. C. Cavalcanti. "Variational DNN embeddings for text-independent speaker verification." Pattern Recognition Letters 148 (August 2021): 100–106. http://dx.doi.org/10.1016/j.patrec.2021.05.003.

41

Li, Qingbiao, Chunhua Wu, Zhe Wang, and Kangfeng Zheng. "Hierarchical Transformer Network for Utterance-Level Emotion Recognition." Applied Sciences 10, no. 13 (June 28, 2020): 4447. http://dx.doi.org/10.3390/app10134447.

Abstract:

While there have been significant advances in detecting emotions in text, in the field of utterance-level emotion recognition (ULER), there are still many problems to be solved. In this paper, we address some challenges in ULER in dialog systems. (1) The same utterance can deliver different emotions when it is in different contexts. (2) Long-range contextual information is hard to effectively capture. (3) Unlike in the traditional text classification problem, most datasets for this task contain inadequate conversations or speech. (4) To better model the emotional interaction between speakers, speaker information is necessary. To address the problems of (1) and (2), we propose a hierarchical transformer framework (apart from the description of other studies, the “transformer” in this paper usually refers to the encoder part of the transformer) with a lower-level transformer to model the word-level input and an upper-level transformer to capture the context of utterance-level embeddings. For problem (3), we use bidirectional encoder representations from transformers (BERT), a pretrained language model, as the lower-level transformer, which is equivalent to introducing external data into the model and solves the problem of data shortage to some extent. For problem (4), we add speaker embeddings to the model for the first time, which enables our model to capture the interaction between speakers. Experiments on three dialog emotion datasets, Friends, EmotionPush, and EmoryNLP, demonstrate that our proposed hierarchical transformer network models obtain competitive results compared with the state-of-the-art methods in terms of the macro-averaged F1-score (macro-F1).

42

Shim, Hye-jin, Jee-weon Jung, Ju-ho Kim, and Ha-jin Yu. "Integrated Replay Spoofing-Aware Text-Independent Speaker Verification." Applied Sciences 10, no. 18 (September 10, 2020): 6292. http://dx.doi.org/10.3390/app10186292.

Abstract:

A number of studies have successfully developed speaker verification or presentation attack detection systems. However, studies integrating the two tasks remain in the preliminary stages. In this paper, we propose two approaches for building an integrated system of speaker verification and presentation attack detection: an end-to-end monolithic approach and a back-end modular approach. The first approach simultaneously trains speaker identification, presentation attack detection, and the integrated system using multi-task learning using a common feature. However, through experiments, we hypothesize that the information required for performing speaker verification and presentation attack detection might differ because speaker verification systems try to remove device-specific information from speaker embeddings, while presentation attack detection systems exploit such information. Therefore, we propose a back-end modular approach using a separate deep neural network (DNN) for speaker verification and presentation attack detection. This approach has three input components: two speaker embeddings (for enrollment and test each) and prediction of presentation attacks. Experiments are conducted using the ASVspoof 2017-v2 dataset, which includes official trials on the integration of speaker verification and presentation attack detection. The proposed back-end approach demonstrates a relative improvement of 21.77% in terms of the equal error rate for integrated trials compared to a conventional speaker verification system.

43

Zhang, Chunlei, Kazuhito Koishida, and John H. L. Hansen. "Text-Independent Speaker Verification Based on Triplet Convolutional Neural Network Embeddings." IEEE/ACM Transactions on Audio, Speech, and Language Processing 26, no. 9 (September 2018): 1633–44. http://dx.doi.org/10.1109/taslp.2018.2831456.

44

Travadi, Ruchir, and Shrikanth Narayanan. "Total Variability Layer in Deep Neural Network Embeddings for Speaker Verification." IEEE Signal Processing Letters 26, no. 6 (June 2019): 893–97. http://dx.doi.org/10.1109/lsp.2019.2910400.

45

Watts, Carys, Katie Wray, Ciara Kennedy, Paul Freeman, and Gareth Trainer. "Embedding Enterprise in Biosciences." Industry and Higher Education 24, no. 6 (December 2010): 487–94. http://dx.doi.org/10.5367/ihe.2010.0009.

Abstract:

Enterprise education at Newcastle University, UK, is embedded in the fabric of the curriculum via the Newcastle University Graduate Skills Framework. An example of this is the ‘Business for the Bioscientist’ module. The authors discuss this module with regard to good practice, enterprise development and the wider arena of graduate careers and employer expectations. The paper illustrates how a combination of academics, curriculum developers, enterprise educators and guest speakers can result in an innovative and interactive enterprise module. Feedback from employers has reinforced the importance of embedding enterprise skills in the curriculum: the authors examine the methodology used at Newcastle to achieve this, the approach adopted and responses from learners. They assess how such an initiative can establish enterprise as a norm in the skills sets of graduates. The paper proposes and highlights various factors that universities need to address if they are to realize fully the concept of entrepreneurial learning.

46

Tsipas, Nikolaos, Lazaros Vrysis, Konstantinos Konstantoudakis, and Charalampos Dimoulas. "Semi-supervised audio-driven TV-news speaker diarization using deep neural embeddings." Journal of the Acoustical Society of America 148, no. 6 (December 2020): 3751–61. http://dx.doi.org/10.1121/10.0002924.

47

Assunção, Gustavo, Nuno Gonçalves, and Paulo Menezes. "Bio-Inspired Modality Fusion for Active Speaker Detection." Applied Sciences 11, no. 8 (April 10, 2021): 3397. http://dx.doi.org/10.3390/app11083397.

Abstract:

Human beings have developed fantastic abilities to integrate information from various sensory sources exploring their inherent complementarity. Perceptual capabilities are therefore heightened, enabling, for instance, the well-known "cocktail party" and McGurk effects, i.e., speech disambiguation from a panoply of sound signals. This fusion ability is also key in refining the perception of sound source location, as in distinguishing whose voice is being heard in a group conversation. Furthermore, neuroscience has successfully identified the superior colliculus region in the brain as the one responsible for this modality fusion, with a handful of biological models having been proposed to approach its underlying neurophysiological process. Deriving inspiration from one of these models, this paper presents a methodology for effectively fusing correlated auditory and visual information for active speaker detection. Such an ability can have a wide range of applications, from teleconferencing systems to social robotics. The detection approach initially routes auditory and visual information through two specialized neural network structures. The resulting embeddings are fused via a novel layer based on the superior colliculus, whose topological structure emulates spatial neuron cross-mapping of unimodal perceptual fields. The validation process employed two publicly available datasets, with achieved results confirming and greatly surpassing initial expectations.

48

Jacobs, Bart. "Embedding Papiamentu in the mixed language debate." Journal of Historical Linguistics 2, no. 1 (July 25, 2012): 52–82. http://dx.doi.org/10.1075/jhl.2.1.05jac.

Abstract:

This paper takes as a point of departure the hypothesis that Papiamentu descends from Upper Guinea Portuguese Creole (a term covering the sister varieties of the Cape Verde Islands and Guinea-Bissau and Casamance), speakers of which arrived on Curaçao in the second half of the 17th century, subsequently shifted their basic content vocabulary towards Spanish, but maintained the original morphosyntax. This scenario raises the question of whether, in addition to being a creole, Papiamentu can be analyzed as a so-called mixed (or intertwined) language. The present paper positively answers this question by drawing parallels between (the emergence of) Papiamentu and recognized mixed languages.

49

Rahimi, Zahra, and Diane Litman. "Entrainment2Vec: Embedding Entrainment for Multi-Party Dialogues." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 05 (April 3, 2020): 8681–88. http://dx.doi.org/10.1609/aaai.v34i05.6393.

Abstract:

Entrainment is the propensity of speakers to begin behaving like one another in conversation. While most entrainment studies have focused on dyadic interactions, researchers have also started to investigate multi-party conversations. In these studies, multi-party entrainment has typically been estimated by averaging the pairs' entrainment values or by averaging individuals' entrainment to the group. While such multi-party measures utilize the strength of dyadic entrainment, they have not yet exploited different aspects of the dynamics of entrainment relations in multi-party groups. In this paper, utilizing an existing pairwise asymmetric entrainment measure, we propose a novel graph-based vector representation of multi-party entrainment that incorporates both strength and dynamics of pairwise entrainment relations. The proposed kernel approach and weakly-supervised representation learning method show promising results at the downstream task of predicting team outcomes. Also, examining the embedding, we found interesting information about the dynamics of the entrainment relations. For example, teams with more influential members have more process conflict.

50

Malcolm, Ian G. "Embedding cultural conceptualization within an adopted language." Cultural Linguistic Contributions to World Englishes 4, no. 2 (December 14, 2017): 149–69. http://dx.doi.org/10.1075/ijolc.4.2.02mal.

Abstract:

Although a minority of Indigenous Australians still use their heritage languages, English has been largely adopted by Aboriginal and Torres Strait Islander people as their medium of communication both within and beyond their communities. In the period since English first reached Australia in 1788, a dialect has emerged, drawing on English, contact language, and Indigenous language sources, to enable Aboriginal and Torres Strait Islander speakers to maintain cultural conceptual continuity while communicating in a dramatically changed environment. In the perspective of Cultural Linguistics it can be shown that many of the modifications in the lexicon, grammar, phonology, and discourse of English as used by Indigenous Australians can be related to cultural/conceptual principles, of which five are illustrated here: interconnectedness, embodiment, group reference, orientation to motion, and orientation to observation. This is demonstrated here with data from varieties of Aboriginal English spoken in diverse Australian locations. The understanding of Aboriginal English this gives has implications for cross-cultural communication and for education.
