
Journal articles on the topic "Automatic Text Recognition (ATR)"

Create a correct reference in APA, MLA, Chicago, Harvard, and many other styles

Check the top 50 scholarly journal articles on the topic "Automatic Text Recognition (ATR)".

An "Add to bibliography" button is available next to each work in the bibliography. Use it, and we will automatically create a bibliographic reference for the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the scientific publication in .pdf format and read its abstract online, if the relevant parameters are available in the metadata.

Browse journal articles from a wide range of disciplines and compile accurate bibliographies.

1

Prebor, Gila. "From Digitization and Images to Text and Content: Transkribus as a Case Study". Proceedings of the Association for Information Science and Technology 60, no. 1 (October 2023): 1102–3. http://dx.doi.org/10.1002/pra2.958.

Abstract:
This poster explores the potential of using technological tools, specifically the Transkribus platform, for the transcription of Hebrew manuscripts. The digitization of historical resources has made them accessible, but the textual content of the scanned images remains inaccessible. Transkribus, an AI-powered platform, offers tools for text recognition, transcription, and search of historical documents. The poster discusses the process of automatic text recognition (ATR) and the challenges it faces, particularly in handling handwritten texts and Hebrew letters. It provides an overview of the Transkribus platform, its functionalities, and the training process for creating transcription models. The author presents a case study of transcribing a 15th-century Sephardic semi-cursive Hebrew manuscript using the Transkribus platform and evaluates the performance of different models. The poster concludes by discussing the implications and possibilities of using Transkribus for automatic transcription of historical Hebrew manuscripts. While the results show promising improvements in accuracy, further challenges and solutions are also discussed. Overall, Transkribus offers significant potential for the study and transcription of Hebrew manuscripts, revolutionizing the field of Jewish studies and historical research.
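The abstract evaluates competing transcription models; the usual yardstick for such ATR/HTR output is the character error rate (CER). A minimal, self-contained sketch (not Transkribus code; the example strings are hypothetical):

```python
# Character error rate (CER): the standard metric for comparing an ATR/HTR
# model's output against a ground-truth transcription.
def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]

def cer(reference: str, hypothesis: str) -> float:
    """CER = edit distance / number of reference characters."""
    return levenshtein(reference, hypothesis) / max(len(reference), 1)

if __name__ == "__main__":
    # Hypothetical example: a ground-truth manuscript line vs. a model output.
    print(f"CER = {cer('בראשית ברא אלהים', 'בראשית ברא אלהם'):.3f}")
```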
2

R.V, Shalini, Sangamithra G, Shamna A.S, Priyadharshini B and Raguram M. "Digital Prescription for Hospital Database Management using ASR". International Journal of Computer Communication and Informatics 6, no. 1 (25.05.2024): 58–69. http://dx.doi.org/10.34256/ijcci2414.

Abstract:
According to the American Medical Association (AMA), handwritten prescriptions are associated with a higher risk of pharmaceutical errors than electronic prescriptions. The solution to this problem is to create a digital prescription. This application combines automated speech recognition (ASR) technology with digital prescriptions to produce flawless and legible prescriptions. Automatic speech recognition reduces transcribing errors and speeds up prescription processing, as well as ensuring a smooth interface with hospital database management by translating spoken instructions into text in real time. This innovation not only simplifies clinical workflows but also improves patient safety and database management by providing a reliable and automated method for prescription documentation. This paper presents a digital prescription system for hospital database management using automatic speech recognition (ASR) technology, integrated with MySQL for database management and JavaScript for application development. This approach aims to streamline the prescription process, minimize pharmaceutical errors and improve overall patient care.
3

Kit, Chunyu, and Xiaoyue Liu. "Measuring mono-word termhood by rank difference via corpus comparison". Terminology 14, no. 2 (12.12.2008): 204–29. http://dx.doi.org/10.1075/term.14.2.05kit.

Abstract:
Terminology as a set of concept carriers crystallizes our special knowledge about a subject. Automatic term recognition (ATR) plays a critical role in the processing and management of various kinds of information, knowledge and documents, e.g., knowledge acquisition via text mining. Measuring termhood properly is one of the core issues involved in ATR. This article presents a novel approach to termhood measurement for mono-word terms via corpus comparison, which quantifies the termhood of a term candidate as its rank difference in a domain and a background corpus. Our ATR experiments to identify legal terms in Hong Kong (HK) legal texts with the British National Corpus (BNC) as background corpus provide evidence to confirm the validity and effectiveness of this approach. Without any prior knowledge and ad hoc heuristics, it achieves a precision of 97.0% on the top 1000 candidates and a precision of 96.1% on the top 10% candidates that are most highly ranked by the termhood measure, illustrating a state-of-the-art performance on mono-word ATR in the field.
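A minimal sketch of the rank-difference idea described above: rank every word by frequency in the domain corpus and in the background corpus, and score candidates by how much better they rank in the domain. The normalization and the toy corpora below are illustrative assumptions, not the paper's exact formulation.

```python
from collections import Counter

def rank_by_frequency(tokens):
    """Map each word to its frequency rank (1 = most frequent) in a corpus."""
    counts = Counter(tokens)
    ranked = sorted(counts, key=counts.get, reverse=True)
    return {w: r for r, w in enumerate(ranked, start=1)}

def termhood_by_rank_difference(domain_tokens, background_tokens):
    """Score candidates by normalized rank difference: words that rank much
    higher in the domain corpus than in the background corpus score highest."""
    dom = rank_by_frequency(domain_tokens)
    bg = rank_by_frequency(background_tokens)
    n_dom, n_bg = len(dom), len(bg)
    scores = {}
    for w, r_dom in dom.items():
        r_bg = bg.get(w, n_bg + 1)          # unseen in background -> worst rank
        scores[w] = r_bg / (n_bg + 1) - r_dom / n_dom
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Toy usage: "legal-domain" text vs. a general background text.
domain = "the plaintiff filed a writ and the defendant filed a counterclaim".split()
background = "the cat sat on the mat and the dog filed past".split()
print(termhood_by_rank_difference(domain, background)[:5])
```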
4

Hu, Jinge. "Automatic Target Recognition of SAR Images Using Collaborative Representation". Computational Intelligence and Neuroscience 2022 (24.05.2022): 1–7. http://dx.doi.org/10.1155/2022/3100028.

Abstract:
Synthetic aperture radar (SAR) automatic target recognition (ATR) is one of the key technologies for SAR image interpretation. This paper proposes a SAR target recognition method based on collaborative representation-based classification (CRC). The collaborative coding adopts the global dictionary constructed by training samples of all categories to optimally reconstruct the test samples and determines the target category according to the reconstruction error of each category. Compared with the sparse representation methods, the collaborative representation strategy can improve the representation ability of a small number of training samples for test samples. For SAR target recognition, the resources of training samples are very limited. Therefore, the collaborative representation is more suitable. Based on the MSTAR dataset, the experiments are carried out under a variety of conditions and the proposed method is compared with other classifiers. Experimental results show that the proposed method can achieve superior recognition performance under the standard operating condition (SOC), configuration variances, depression angle variances, and a small number of training samples, which proves its effectiveness.
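A minimal NumPy sketch of collaborative representation-based classification as summarized above: code the test sample over the global dictionary with a ridge (l2-regularized) solution and pick the class with the smallest class-wise reconstruction error. The data, dimensions and regularization weight are illustrative; some CRC variants also normalize the residual by the class coefficient norm.

```python
import numpy as np

def crc_classify(D, labels, y, lam=0.01):
    """Collaborative representation classification (CRC) sketch.
    D: (d, n) dictionary whose columns are training samples of all classes,
    labels: length-n array of class labels for the columns of D,
    y: (d,) test sample, lam: ridge regularization weight."""
    # Collaborative coding: alpha = (D^T D + lam*I)^-1 D^T y (ridge solution).
    n = D.shape[1]
    alpha = np.linalg.solve(D.T @ D + lam * np.eye(n), D.T @ y)
    # Decide by the smallest class-wise reconstruction error.
    errors = {}
    for c in np.unique(labels):
        mask = labels == c
        errors[c] = np.linalg.norm(y - D[:, mask] @ alpha[mask])
    return min(errors, key=errors.get), errors

# Toy usage with random "training chips" from two classes (illustrative only).
rng = np.random.default_rng(0)
D = rng.normal(size=(64, 20))                # 20 vectorized training samples
labels = np.array([0] * 10 + [1] * 10)
y = D[:, 3] + 0.05 * rng.normal(size=64)     # noisy copy of a class-0 sample
pred, _ = crc_classify(D, labels, y)
print("predicted class:", pred)
```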
5

Weng, Wenbo. "Multitask Sparse Representation of Two-Dimensional Variational Mode Decomposition Components for SAR Target Recognition". Scientific Programming 2023 (25.04.2023): 1–12. http://dx.doi.org/10.1155/2023/8846287.

Abstract:
A synthetic aperture radar (SAR) automatic target recognition (ATR) method is developed based on the two-dimensional variational mode decomposition (2D-VMD). 2D-VMD decomposes original SAR images into multiscale components, which depict the time-frequency properties of the targets. The original image and its 2D-VMD components are highly correlated, so the multitask sparse representation is chosen to jointly represent them. According to the resulting reconstruction errors of the different classes, the target label of the test sample can be determined. The moving and stationary target acquisition and recognition (MSTAR) dataset is used to set up the standard operating condition (SOC) and several extended operating conditions (EOCs), including configuration variants, depression angle variances, noise corruption, and partial occlusion, to test and validate the proposed method. The results confirm the effectiveness and robustness of the proposed method compared with several state-of-the-art SAR ATR references.
6

Araujo, Gustavo F., Renato Machado and Mats I. Pettersson. "Non-Cooperative SAR Automatic Target Recognition Based on Scattering Centers Models". Sensors 22, no. 3 (8.02.2022): 1293. http://dx.doi.org/10.3390/s22031293.

Abstract:
This article proposes an Automatic Target Recognition (ATR) algorithm to classify non-cooperative targets in Synthetic Aperture Radar (SAR) images. The scarcity or nonexistence of measured SAR data demands that classification algorithms rely only on synthetic data for training purposes. Based on a model represented by the set of scattering centers extracted from purely synthetic data, the proposed algorithm generates hypotheses for the set of scattering centers extracted from the target under test belonging to each class. A Goodness of Fit test is considered to verify each hypothesis, where the Likelihood Ratio Test is modified by a scattering center-weighting function common to both the model and target. Some algorithm variations are assessed for scattering center extraction and hypothesis generation and verification. The proposed solution is the first model-based classification algorithm to address the recently released Synthetic and Measured Paired Labeled Experiment (SAMPLE) dataset on a 100% synthetic training data basis. As a result, an accuracy of 91.30% was obtained in a 10-target within-class experiment under Standard Operating Conditions (SOCs). The algorithm also pioneered testing of the SAMPLE dataset under Extended Operating Conditions (EOCs), assuming noise contamination and different target configurations. The proposed algorithm was shown to be robust for SNRs greater than −5 dB.
7

Zhang, Likun, Xiaoyan Li, Yi Tang, Fangbin Song, Tian Xia and Wei Wang. "Contemporary Advertising Text Art Design and Effect Evaluation by IoT Deep Learning under the Smart City". Security and Communication Networks 2022 (22.07.2022): 1–14. http://dx.doi.org/10.1155/2022/5161398.

Abstract:
This work addresses the problems that current artistic typeface generation methods rely too heavily on manual intervention and lack novelty, and that single local or global font feature extraction methods cannot fully describe font features. Firstly, it proposes a handwritten word recognition model based on generalized search trees (GIST) and the pyramid histogram of oriented gradient (PHOG), in which the local and global features of the font are fused. Secondly, a model of automatic artistic typeface generation based on generative adversarial networks (GAN) is constructed, which can use hand-drawn fonts to automatically generate artistic typefaces in the desired style through training as needed. Finally, the generation of the huaniao typeface is used as an example. By constructing the dataset, the effectiveness of the two models is verified. The experimental results show the following: (1) The proposed handwritten character recognition model based on GIST and PHOG achieves a recognition rate for different fonts that is more than 5.8% higher than with the single GIST or PHOG features. The total recognition time is reduced by more than 49.4%, and the performance is improved significantly. (2) Compared with other popular algorithms, the constructed GAN-based automatic artistic typeface generation model produces the highest-quality huaniao generation on both the pencil sketch and the calligraphy character image datasets. The models have broad application prospects in contemporary advertising text art design. This study aims to provide important technical support for the automation of contemporary advertising text art design and the improvement of overall efficiency.
8

Salamun, Sukri, Khairul Amin, Luluk Elvitaria and Liza Trisnawati. "Artificial Intelligence Automatic Speech Recognition (ASR) untuk pencarian potongan ayat Al-Qu’ran". Jurnal Komputer Terapan, Vol. 8 No. 1 (2022) (31.05.2022): 36–45. http://dx.doi.org/10.35143/jkt.v8i1.5299.

Abstract:
Indonesia is the country with the largest Muslim population in the world, so recitations of Qur'anic verses are often heard in public places such as mosques and prayer rooms and at various events. Automatic Speech Recognition (ASR) is used here for word recognition with the aim of identifying the Qur'anic verses being recited, adding knowledge about the verses and other supporting information as one means of preaching and conveying knowledge about the verses of the Qur'an. This Automatic Speech Recognition (ASR) system is designed using the Python programming language and the Django framework to display information about the recited verses in a web-based interface. The research aims to create a technique and a system for entering voice commands into a machine, so that the machine can understand what a human says and carry out what it is instructed to do. The application converts voice data into text data using a speech recognition system that works automatically by matching the digitized audio pattern of the spoken words against a computer model of speech patterns, producing a final text output that is stored in a database.
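A rough sketch of the verse-lookup idea in Python (the paper's implementation uses Django and its own database; here the third-party SpeechRecognition package and a tiny in-memory verse table stand in for it, and all names and file paths are hypothetical):

```python
import difflib
import speech_recognition as sr

# Hypothetical miniature "database" of verse transcripts (a stand-in for the
# paper's Django/database layer).
VERSES = {
    "بسم الله الرحمن الرحيم": ("Al-Fatihah", 1),
    "الحمد لله رب العالمين": ("Al-Fatihah", 2),
}

def transcribe(wav_path: str, language: str = "ar-SA") -> str:
    """Transcribe a WAV file with the free Google Web Speech API (needs network)."""
    recognizer = sr.Recognizer()
    with sr.AudioFile(wav_path) as source:
        audio = recognizer.record(source)       # read the whole recording
    return recognizer.recognize_google(audio, language=language)

def lookup_verse(transcript: str):
    """Fuzzy-match the transcript against the verse table."""
    match = difflib.get_close_matches(transcript, list(VERSES), n=1, cutoff=0.5)
    return VERSES[match[0]] if match else None

if __name__ == "__main__":
    text = transcribe("recitation.wav")         # hypothetical input recording
    print(text, "->", lookup_verse(text))
```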
9

Yu, Xuelian, Hailong Yu, Yi Liu and Haohao Ren. "Enhanced Prototypical Network with Customized Region-Aware Convolution for Few-Shot SAR ATR". Remote Sensing 16, no. 19 (25.09.2024): 3563. http://dx.doi.org/10.3390/rs16193563.

Abstract:
With the prosperous development and successful application of deep learning technologies in the field of remote sensing, numerous deep-learning-based methods have emerged for synthetic aperture radar (SAR) automatic target recognition (ATR) tasks over the past few years. Generally, most deep-learning-based methods can achieve outstanding recognition performance on the condition that an abundance of labeled samples are available to train the model. However, in real application scenarios, it is difficult and costly to acquire and to annotate abundant SAR images due to the imaging mechanism of SAR, which poses a big challenge to existing SAR ATR methods. Therefore, SAR target recognition in the situation of few-shot, where only a scarce few labeled samples are available, is a fundamental problem that needs to be solved. In this paper, a new method named enhanced prototypical network with customized region-aware convolution (CRCEPN) is put forward to specially tackle the few-shot SAR ATR tasks. To be specific, a feature-extraction network based on a customized and region-aware convolution is first developed. This network can adaptively adjust convolutional kernels and their receptive fields according to each SAR image’s own characteristics as well as the semantical similarity among spatial regions, thus augmenting its capability to extract more informative and discriminative features. To achieve accurate and robust target identity prediction under the few-shot condition, an enhanced prototypical network is proposed. This network can improve the representation ability of the class prototype by properly making use of training and test samples together, thus effectively raising the classification accuracy. Meanwhile, a new hybrid loss is designed to learn a feature space with both inter-class separability and intra-class tightness as much as possible, which can further upgrade the recognition performance of the proposed method. Experiments performed on the moving and stationary target acquisition and recognition (MSTAR) dataset, the OpenSARShip dataset, and the SAMPLE+ dataset demonstrate that the proposed method is competitive with some state-of-the-art methods for few-shot SAR ATR tasks.
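For reference, the basic prototypical-network decision rule that the paper enhances can be sketched in a few lines: class prototypes are support-set means and queries go to the nearest prototype. The customized region-aware convolution, prototype refinement and hybrid loss from the paper are not shown; the embeddings below are random placeholders.

```python
import numpy as np

def prototypical_classify(support_feats, support_labels, query_feats):
    """Assign each query embedding to the class whose prototype (the mean of
    that class's support embeddings) is nearest in Euclidean distance."""
    classes = np.unique(support_labels)
    prototypes = np.stack([support_feats[support_labels == c].mean(axis=0)
                           for c in classes])
    # Pairwise squared Euclidean distances, shape (n_query, n_class).
    d2 = ((query_feats[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=-1)
    return classes[d2.argmin(axis=1)]

# Toy 2-way 5-shot episode with random embeddings (illustrative only).
rng = np.random.default_rng(1)
support = np.vstack([rng.normal(0, 1, (5, 16)), rng.normal(3, 1, (5, 16))])
labels = np.array([0] * 5 + [1] * 5)
queries = rng.normal(3, 1, (4, 16))
print(prototypical_classify(support, labels, queries))   # expected mostly class 1
```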
10

Eck, J. Thomas, and Frank Y. Shih. "An automatic text-free speaker recognition system based on an enhanced Art 2 neural architecture". Information Sciences 76, no. 3-4 (January 1994): 233–53. http://dx.doi.org/10.1016/0020-0255(94)90011-6.

11

Zhao, Xin, Xiaoling Lv, Jinlei Cai, Jiayi Guo, Yueting Zhang, Xiaolan Qiu and Yirong Wu. "Few-Shot SAR-ATR Based on Instance-Aware Transformer". Remote Sensing 14, no. 8 (14.04.2022): 1884. http://dx.doi.org/10.3390/rs14081884.

Abstract:
Few-shot synthetic aperture radar automatic target recognition (SAR-ATR) aims to recognize the targets of the images (query images) based on a few annotated images (support images). Such a task requires modeling the relationship between the query and support images. In this paper, we propose the instance-aware transformer (IAT) model. The IAT exploits the power of all instances by constructing the attention map based on the similarities between the query feature and all support features. The query feature aggregates the support features based on the attention values. To align the features of the query and support images in IAT, the shared cross-transformer keeps all the projections in the module shared across all features. Instance cosine distance is used in training to minimize the distance between the query feature and the support features. In testing, to fuse the support features of the same class into the class representation, Euclidean (Cosine) Loss is used to calculate the query-class distances. Experiments on the two proposed few-shot SAR-ATR test sets based on MSTAR demonstrate the superiority of the proposed method.
12

Nga, Yan Zun, Zuhayr Rymansaib, Alfie Anthony Treloar and Alan Hunter. "Automated Recognition of Submerged Body-like Objects in Sonar Images Using Convolutional Neural Networks". Remote Sensing 16, no. 21 (30.10.2024): 4036. http://dx.doi.org/10.3390/rs16214036.

Abstract:
The Police Robot for Inspection and Mapping of Underwater Evidence (PRIME) is an uncrewed surface vehicle (USV) currently being developed for underwater search and recovery teams to assist in crime scene investigation. The USV maps underwater scenes using sidescan sonar (SSS). Test exercises use a clothed mannequin lying on the seafloor as a target object to evaluate system performance. A robust, automated method for detecting human body-shaped objects is required to maximise operational functionality. The use of a convolutional neural network (CNN) for automatic target recognition (ATR) is proposed. SSS image data acquired from four different locations during previous missions were used to build a dataset consisting of two classes, i.e., a binary classification problem. The target object class consisted of 166 image snippets (196 × 196 pixels) of the underwater mannequin, whereas the non-target class consisted of 13,054 examples. Due to the large class imbalance in the dataset, CNN models were trained with six different imbalance ratios. Two different pre-trained models (ResNet-50 and Xception) were compared, and trained via transfer learning. This paper presents results from the CNNs and details the training methods used. Larger datasets are shown to improve CNN performance despite class imbalance, achieving average F1 scores of 97% in image classification. Average F1 scores for target vs background classification with unseen data are only 47%, but the end result is enhanced by combining multiple weak classification results in an ensemble average. The combined output, represented as a georeferenced heatmap, accurately indicates the target object location with a high detection confidence and one false positive of low confidence. The CNN approach shows improved object detection performance when compared to the currently used ATR method.
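A minimal transfer-learning sketch in the spirit of the setup described above, assuming torchvision's pre-trained ResNet-50 with a new two-class head; the weighted loss for the 166 vs. 13,054 imbalance is an illustrative choice, whereas the paper instead trains on several imbalance ratios.

```python
import torch
import torch.nn as nn
from torchvision import models

def build_model(num_classes: int = 2) -> nn.Module:
    # Pre-trained backbone (downloads ImageNet weights on first use).
    model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
    for p in model.parameters():          # freeze the backbone
        p.requires_grad = False
    model.fc = nn.Linear(model.fc.in_features, num_classes)  # new 2-class head
    return model

model = build_model()
# Up-weight the rare "target" class (roughly 13,054 / 166 background:target).
criterion = nn.CrossEntropyLoss(weight=torch.tensor([1.0, 13054 / 166]))
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)

# One illustrative training step on a dummy batch of 196 x 196 "snippets"
# (three channels, since the ImageNet backbone expects RGB input).
images, targets = torch.randn(8, 3, 196, 196), torch.randint(0, 2, (8,))
loss = criterion(model(images), targets)
loss.backward()
optimizer.step()
print(float(loss))
```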
13

Jiang, Chunyun, Huiqiang Zhang, Ronghui Zhan, Wenyu Shu and Jun Zhang. "Open-Set Recognition Model for SAR Target Based on Capsule Network with the KLD". Remote Sensing 16, no. 17 (26.08.2024): 3141. http://dx.doi.org/10.3390/rs16173141.

Abstract:
Synthetic aperture radar (SAR) automatic target recognition (ATR) technology has seen significant advancements. Despite these advancements, the majority of research still operates under the closed-set assumption, wherein all test samples belong to classes seen during the training phase. In real-world applications, however, it is common to encounter targets not previously seen during training, posing a significant challenge to the existing methods. Ideally, an ATR system should not only accurately identify known target classes but also effectively reject those belonging to unknown classes, giving rise to the concept of open set recognition (OSR). To address this challenge, we propose a novel approach that leverages the unique capabilities of the Capsule Network and the Kullback-Leibler divergence (KLD) to distinguish unknown classes. This method begins by deeply mining the features of SAR targets using the Capsule Network and enhancing the separability between different features through a specially designed loss function. Subsequently, the KLD of features between a testing sample and the center of each known class is calculated. If the testing sample exhibits a significantly larger KLD compared to all known classes, it is classified as an unknown target. The experimental results of the SAR-ACD dataset demonstrate that our method can maintain a correct identification rate of over 95% for known classes while effectively recognizing unknown classes. Compared to existing techniques, our method exhibits significant improvements.
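The rejection rule itself is simple to sketch: compare a normalized feature vector with each known-class center via the Kullback-Leibler divergence and declare the sample unknown if even the nearest class is too far. Feature dimensions, class centers and the threshold below are illustrative assumptions, not values from the paper.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for discrete (normalized) feature vectors."""
    p = np.clip(p, eps, None); q = np.clip(q, eps, None)
    return float(np.sum(p * np.log(p / q)))

def open_set_predict(feature, class_centers, threshold=0.3):
    """Return the closest known class, or 'unknown' if even the smallest
    divergence exceeds the (assumed) threshold calibrated on validation data."""
    divs = {c: kl_divergence(feature, center) for c, center in class_centers.items()}
    best = min(divs, key=divs.get)
    return ("unknown", divs) if divs[best] > threshold else (best, divs)

# Toy usage with 4-dimensional normalized feature vectors.
centers = {"tank": np.array([0.7, 0.1, 0.1, 0.1]),
           "truck": np.array([0.1, 0.7, 0.1, 0.1])}
known = np.array([0.65, 0.15, 0.1, 0.1])
novel = np.array([0.25, 0.25, 0.25, 0.25])
print(open_set_predict(known, centers)[0])   # -> "tank"
print(open_set_predict(novel, centers)[0])   # -> "unknown"
```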
14

Xing, Xiangwei, Kefeng Ji, Huanxin Zou and Jixiang Sun. "Sparse Representation Based SAR Vehicle Recognition along with Aspect Angle". Scientific World Journal 2014 (2014): 1–10. http://dx.doi.org/10.1155/2014/834140.

Abstract:
As a method of representing the test sample with few training samples from an overcomplete dictionary, sparse representation classification (SRC) has attracted much attention in synthetic aperture radar (SAR) automatic target recognition (ATR) recently. In this paper, we develop a novel SAR vehicle recognition method based on sparse representation classification along with aspect information (SRCA), in which the correlation between the vehicle’s aspect angle and the sparse representation vector is exploited. The detailed procedure presented in this paper can be summarized as follows. Initially, the sparse representation vector of a test sample is solved by sparse representation algorithm with a principle component analysis (PCA) feature-based dictionary. Then, the coefficient vector is projected onto a sparser one within a certain range of the vehicle’s aspect angle. Finally, the vehicle is classified into a certain category that minimizes the reconstruction error with the novel sparse representation vector. Extensive experiments are conducted on the moving and stationary target acquisition and recognition (MSTAR) dataset and the results demonstrate that the proposed method performs robustly under the variations of depression angle and target configurations, as well as incomplete observation.
15

Lee, Sumi, and Sang-Wan Kim. "Recognition of Targets in SAR Images Based on a WVV Feature Using a Subset of Scattering Centers". Sensors 22, no. 21 (5.11.2022): 8528. http://dx.doi.org/10.3390/s22218528.

Abstract:
This paper proposes a robust method for feature-based matching with potential for application to synthetic aperture radar (SAR) automatic target recognition (ATR). The scarcity of measured SAR data available for training classification algorithms leads to the replacement of such data with synthetic data. As attributed scattering centers (ASCs) extracted from the SAR image reflect the electromagnetic phenomenon of the SAR target, this is effective for classifying targets when purely synthetic SAR images are used as the template. In the classification stage, following preparation of the extracted template ASC dataset, some of the template ASCs were subsampled by the amplitude and the neighbor matching algorithm to focus on the related points of the test ASCs. Then, the subset of ASCs were reconstructed to the world view vector feature set, considering the point similarity and structure similarity simultaneously. Finally, the matching scores between the two sets were calculated using weighted bipartite graph matching and then combined with several weights for overall similarity. Experiments on synthetic and measured paired labeled experiment datasets, which are publicly available, were conducted to verify the effectiveness and robustness of the proposed method. The proposed method can be used in practical SAR ATR systems trained using simulated images.
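A small sketch of the weighted bipartite matching step using SciPy's Hungarian solver; the Gaussian point-similarity and the toy scattering-center coordinates are assumptions, and the paper's structure-similarity and world view vector terms are omitted.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_scattering_centers(test_sc, template_sc, sigma=2.0):
    """Pair extracted test scattering centers with template (synthetic) ones by
    maximizing a Gaussian point similarity, and report the mean similarity of
    the matched pairs as a crude class score."""
    # Pairwise similarity between test points (n, 2) and template points (m, 2).
    diff = test_sc[:, None, :] - template_sc[None, :, :]
    sim = np.exp(-np.linalg.norm(diff, axis=-1) ** 2 / (2 * sigma ** 2))
    rows, cols = linear_sum_assignment(-sim)        # maximize total similarity
    return float(sim[rows, cols].mean()), list(zip(rows.tolist(), cols.tolist()))

# Toy usage: a template and a slightly perturbed, reordered copy of it.
rng = np.random.default_rng(2)
template = rng.uniform(0, 30, size=(6, 2))          # (range, cross-range) positions
test = template[::-1] + rng.normal(0, 0.3, size=(6, 2))
score, pairs = match_scattering_centers(test, template)
print(f"matching score: {score:.3f}", pairs)
```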
16

Geng, Zhe, Ying Xu, Bei-Ning Wang, Xiang Yu, Dai-Yin Zhu and Gong Zhang. "Target Recognition in SAR Images by Deep Learning with Training Data Augmentation". Sensors 23, no. 2 (13.01.2023): 941. http://dx.doi.org/10.3390/s23020941.

Abstract:
Mass production of high-quality synthetic SAR training imagery is essential for boosting the performance of deep-learning (DL)-based SAR automatic target recognition (ATR) algorithms in an open-world environment. To address this problem, we exploit both the widely used Moving and Stationary Target Acquisition and Recognition (MSTAR) SAR dataset and the Synthetic and Measured Paired Labeled Experiment (SAMPLE) dataset, which consists of selected samples from the MSTAR dataset and their computer-generated synthetic counterparts. A series of data augmentation experiments are carried out. First, the sparsity of the scattering centers of the targets is exploited for new target pose synthesis. Additionally, training data with various clutter backgrounds are synthesized via clutter transfer, so that the neural networks are better prepared to cope with background changes in the test samples. To effectively augment the synthetic SAR imagery in the SAMPLE dataset, a novel contrast-based data augmentation technique is proposed. To improve the robustness of neural networks against out-of-distribution (OOD) samples, the SAR images of ground military vehicles collected by the self-developed MiniSAR system are used as the training data for the adversarial outlier exposure procedure. Simulation results show that the proposed data augmentation methods are effective in improving both the target classification accuracy and the OOD detection performance. The purpose of this work is to establish the foundation for large-scale, open-field implementation of DL-based SAR-ATR systems, which is not only of great value in the sense of theoretical research, but is also potentially meaningful in the aspect of military application.
17

Baquero-Arnal, Pau, Javier Jorge, Adrià Giménez, Javier Iranzo-Sánchez, Alejandro Pérez, Gonçal Vicent Garcés Díaz-Munío, Joan Albert Silvestre-Cerdà, Jorge Civera, Albert Sanchis and Alfons Juan. "MLLP-VRAIN Spanish ASR Systems for the Albayzín-RTVE 2020 Speech-to-Text Challenge: Extension". Applied Sciences 12, no. 2 (13.01.2022): 804. http://dx.doi.org/10.3390/app12020804.

Abstract:
This paper describes the automatic speech recognition (ASR) systems built by the MLLP-VRAIN research group of Universitat Politècnica de València for the Albayzín-RTVE 2020 Speech-to-Text Challenge, and includes an extension of the work consisting of building and evaluating equivalent systems under the closed data conditions from the 2018 challenge. The primary system (p-streaming_1500ms_nlt) was a hybrid ASR system using streaming one-pass decoding with a context window of 1.5 seconds. This system achieved 16.0% WER on the test-2020 set. We also submitted three contrastive systems. From these, we highlight the system c2-streaming_600ms_t which, following a similar configuration as the primary system with a smaller context window of 0.6 s, scored 16.9% WER points on the same test set, with a measured empirical latency of 0.81 ± 0.09 s (mean ± stdev). That is, we obtained state-of-the-art latencies for high-quality automatic live captioning with a small WER degradation of 6% relative. As an extension, the equivalent closed-condition systems obtained 23.3% WER and 23.5% WER, respectively. When evaluated with an unconstrained language model, we obtained 19.9% WER and 20.4% WER; i.e., not far behind the top-performing systems with only 5% of the full acoustic data and with the extra ability of being streaming-capable. Indeed, all of these streaming systems could be put into production environments for automatic captioning of live media streams.
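For readers who want to reproduce the headline metric, the word error rate quoted throughout can be computed with a short edit-distance routine (the example sentences are made up, not taken from the RTVE test sets):

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference length,
    computed with a word-level edit distance."""
    ref, hyp = reference.split(), hypothesis.split()
    prev = list(range(len(hyp) + 1))
    for i, rw in enumerate(ref, start=1):
        curr = [i]
        for j, hw in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (rw != hw)))  # substitution
        prev = curr
    return prev[-1] / max(len(ref), 1)

# Illustrative sentences only.
ref = "el presidente anunció nuevas medidas económicas"
hyp = "el presidente anuncio nuevas medidas"
print(f"WER = {word_error_rate(ref, hyp):.2%}")   # 2 errors / 6 words ≈ 33%
```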
18

Khalil, Driss, Amrutha Prasad, Petr Motlicek, Juan Zuluaga-Gomez, Iuliia Nigmatulina, Srikanth Madikeri and Christof Schuepbach. "An Automatic Speaker Clustering Pipeline for the Air Traffic Communication Domain". Aerospace 10, no. 10 (10.10.2023): 876. http://dx.doi.org/10.3390/aerospace10100876.

Abstract:
In air traffic management (ATM), voice communications are critical for ensuring the safe and efficient operation of aircraft. The pertinent voice communications—air traffic controller (ATCo) and pilot—are usually transmitted in a single channel, which poses a challenge when developing automatic systems for air traffic management. Speaker clustering is one of the challenges when applying speech processing algorithms to identify and group the same speaker among different speakers. We propose a pipeline that deploys (i) speech activity detection (SAD) to identify speech segments, (ii) an automatic speech recognition system to generate the text for audio segments, (iii) text-based speaker role classification to detect the role of the speaker—ATCo or pilot in our case—and (iv) unsupervised speaker clustering to create a cluster of each individual pilot speaker from the obtained speech utterances. The speech segments obtained by SAD are input into an automatic speech recognition (ASR) engine to generate the automatic English transcripts. The speaker role classification system takes the transcript as input and uses it to determine whether the speech was from the ATCo or the pilot. As the main goal of this project is to group the speakers in pilot communication, only pilot data acquired from the classification system is employed. We present a method for separating the speech parts of pilots into different clusters based on the speaker’s voice using agglomerative hierarchical clustering (AHC). The performance of the speaker role classification and speaker clustering is evaluated on two publicly available datasets: the ATCO2 corpus and the Linguistic Data Consortium Air Traffic Control Corpus (LDC-ATCC). Since the pilots’ real identities are unknown, the ground truth is generated based on logical hypotheses regarding the creation of each dataset, timing information, and the information extracted from associated callsigns. In the case of speaker clustering, the proposed algorithm achieves an accuracy of 70% on the LDC-ATCC dataset and 50% on the more noisy ATCO2 dataset.
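A minimal sketch of the final stage, agglomerative hierarchical clustering (AHC) over per-utterance speaker embeddings, using scikit-learn; the random embeddings and the distance threshold are placeholders for the real speaker-embedding model and a threshold tuned on development data.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

rng = np.random.default_rng(3)
# Pretend we have 3 pilots with 4 utterances each, embedded in 32 dimensions.
embeddings = np.vstack([rng.normal(loc=c, scale=0.2, size=(4, 32)) for c in (0, 1, 2)])

ahc = AgglomerativeClustering(
    n_clusters=None,
    distance_threshold=5.0,   # assumed threshold; in practice tuned on dev data
    linkage="average",
)
cluster_ids = ahc.fit_predict(embeddings)
print(cluster_ids)            # utterances from the same "pilot" share a cluster id
```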
19

Altheneyan, Alaa, and Mohamed El Bachir Menai. "Evaluation of State-of-the-Art Paraphrase Identification and Its Application to Automatic Plagiarism Detection". International Journal of Pattern Recognition and Artificial Intelligence 34, no. 04 (22.08.2019): 2053004. http://dx.doi.org/10.1142/s0218001420530043.

Abstract:
Paraphrase identification is a natural language processing (NLP) problem that involves the determination of whether two text segments have the same meaning. Various NLP applications rely on a solution to this problem, including automatic plagiarism detection, text summarization, machine translation (MT), and question answering. The methods for identifying paraphrases found in the literature fall into two main classes: similarity-based methods and classification methods. This paper presents a critical study and an evaluation of existing methods for paraphrase identification and its application to automatic plagiarism detection. It presents the classes of paraphrase phenomena, the main methods, and the sets of features used by each particular method. All the methods and features used are discussed and enumerated in a table for easy comparison. Their performances on benchmark corpora are also discussed and compared via tables. Automatic plagiarism detection is presented as an application of paraphrase identification. The performances on benchmark corpora of existing plagiarism detection systems able to detect paraphrases are compared and discussed. The main outcome of this study is the identification of word overlap, structural representations, and MT measures as feature subsets that lead to the best performance results for support vector machines in both paraphrase identification and plagiarism detection on corpora. The performance results achieved by deep learning techniques highlight that these techniques are the most promising research direction in this field.
20

Tatman, Rachael. "Why ASR + NLP isn't enough for commercial language technology". Journal of the Acoustical Society of America 150, no. 4 (October 2021): A347. http://dx.doi.org/10.1121/10.0008537.

Abstract:
With an increasing commercial demand for speech interfaces to be integrated into language technology, many technologists have made an unfortunate discovery: combining existing automatic speech recognition (ASR) and natural language processing (NLP) systems often leads to disappointing results. This talk will discuss two factors that contribute to this disparity and make some general suggestions for language technologists and researchers looking to work with them. The first is the greater degree of variation in speech than text (at least in languages like English), which can lead to higher error rates overall. The second is a mismatch in domain. Modern machine learning approaches to language technology are very sensitive to differences between datasets and (due in part to the disciplinary division between researchers working on language technology for speech and text) most NLP applications have not been trained on speech data.
21

Chraibi, Khaoula, Ilham Chaker and Azeddine Zahi. "Predicting personality traits from Arabic text: an investigation of textual and demographic features with feature selection analysis". International Journal of Electrical and Computer Engineering (IJECE) 15, no. 1 (1.02.2025): 970. http://dx.doi.org/10.11591/ijece.v15i1.pp970-979.

Abstract:
Automatic personality recognition (APR) utilizes machine learning to predict personality traits from various data sources. This study aims to predict the big five personality traits from modern standard Arabic (MSA) texts, using both textual and demographic features. The “MSAPersonality” dataset is employed to conduct a comprehensive analysis of features and feature selection methods to evaluate their impact on APR model performance. We compared feature selection algorithms from the filter, wrapper, and embedded-based categories through a systematic experimental design that consisted of feature engineering, feature selection, and regression. This study showed that each trait was more accurately predicted using a distinct set of features. However, age and study level were the most common features among the five traits. Moreover, although there were no statistically significant differences in performance between the feature selection techniques, embedded-based methods offered the best compromise between performance, time, and interpretability. These findings contribute to the understanding of APR in general and among Arabic speakers.
22

Stefaniak, Paweł, Maria Stachowiak, Wioletta Koperska, Artur Skoczylas and Paweł Śliwiński. "Application of Wearable Computer and ASR Technology in an Underground Mine to Support Mine Supervision of the Heavy Machinery Chamber". Sensors 22, no. 19 (8.10.2022): 7628. http://dx.doi.org/10.3390/s22197628.

Abstract:
Systems that use automatic speech recognition in industry are becoming more and more popular. They bring benefits especially in cases when the user's hands are often busy or the environment does not allow the use of a keyboard. However, the accuracy of the algorithms is still a big challenge. The article describes an attempt to use ASR in the underground mining industry to improve a foreman's records of work in the heavy machinery chamber. Particular attention was paid to the factors that in this case have a negative impact on speech recognition: the influence of the environment, specialized mining vocabulary, and the learning curve. First, the foreman's workflow and documentation were analysed. This allowed for the selection of functionalities that should be included in the application. A dictionary of specialized mining vocabulary and a source database were developed which, in combination with string matching algorithms, improve correct speech recognition. Text mining analysis and machine learning methods were used to create functionalities that provide assistance in registering information. Finally, the prototype of the application was tested in the mining environment and the accuracy of the results was presented.
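The string-matching correction step can be illustrated with Python's standard difflib: snap a possibly misrecognized ASR phrase to the closest entry of a specialized dictionary if the match is close enough. The dictionary entries and cutoff below are hypothetical.

```python
import difflib

# Hypothetical specialized mining vocabulary (stand-in for the paper's dictionary).
MINING_DICTIONARY = ["wheel loader", "haul truck", "hydraulic hose",
                     "gearbox", "drive axle", "bucket cylinder"]

def correct_phrase(asr_output: str, vocabulary=MINING_DICTIONARY, cutoff=0.6) -> str:
    """Replace the ASR phrase with its best dictionary match if one is close enough."""
    match = difflib.get_close_matches(asr_output.lower(), vocabulary, n=1, cutoff=cutoff)
    return match[0] if match else asr_output

print(correct_phrase("heel loader"))      # -> "wheel loader"
print(correct_phrase("unrelated words"))  # left unchanged
```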
23

Gondi, Santosh, and Vineel Pratap. "Performance and Efficiency Evaluation of ASR Inference on the Edge". Sustainability 13, no. 22 (10.11.2021): 12392. http://dx.doi.org/10.3390/su132212392.

Abstract:
Automatic speech recognition, a process of converting speech signals to text, has improved a great deal in the past decade thanks to deep-learning-based systems. With the latest transformer-based models, the recognition accuracy, measured as word error rate (WER), is even below the human annotator error (4%). However, most of these advanced models run on big servers with large amounts of memory and CPU/GPU resources and have a huge carbon footprint. This server-based architecture of ASR is not viable in the long run given the inherent lack of privacy for user data and the reliability and latency issues of the network connection. On the other hand, on-device ASR (meaning, speech-to-text conversion on the edge device itself) solutions will fix deep-rooted privacy issues while at the same time being more reliable and performant by avoiding network connectivity to the back-end server. On-device ASR can also lead to a more sustainable solution by considering the energy vs. accuracy trade-off and choosing the right model for specific use cases/applications of the product. Hence, in this paper we evaluate the energy-accuracy trade-off of ASR with a typical transformer-based speech recognition model on an edge device. We have run evaluations on a Raspberry Pi with an off-the-shelf USB meter for measuring energy consumption. We conclude that, in the case of CPU-based ASR inference, the energy consumption grows exponentially as the word error rate improves linearly. Additionally, based on our experiment we deduce that, with PyTorch mobile optimization and quantization, the typical transformer-based ASR on the edge performs reasonably well in terms of accuracy and latency and comes close to the accuracy of server-based inference.
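The kind of PyTorch optimization mentioned at the end can be sketched with dynamic int8 quantization of linear layers; the tiny stand-in network below is not the transformer ASR model evaluated in the paper.

```python
import torch
import torch.nn as nn

# Placeholder "acoustic model" standing in for the transformer ASR network.
model = nn.Sequential(
    nn.Linear(80, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, 32),
)

# Dynamically quantize the linear layers to int8 for faster CPU inference
# and a smaller memory footprint on the edge device.
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

features = torch.randn(1, 80)          # e.g., one frame of log-mel features
with torch.no_grad():
    print(quantized(features).shape)   # same interface, lighter weights
```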
24

Guo, Dongyue, Zichen Zhang, Peng Fan, Jianwei Zhang and Bo Yang. "A Context-Aware Language Model to Improve the Speech Recognition in Air Traffic Control". Aerospace 8, no. 11 (16.11.2021): 348. http://dx.doi.org/10.3390/aerospace8110348.

Abstract:
Recognizing isolated digits of the flight callsign is an important and challenging task for automatic speech recognition (ASR) in air traffic control (ATC). Fortunately, the flight callsign is a kind of prior ATC knowledge and is available from dynamic contextual information. In this work, we attempt to utilize this prior knowledge to improve the performance of the callsign identification by integrating it into the language model (LM). The proposed approach is named context-aware language model (CALM), which can be applied for both the ASR decoding and rescoring phase. The proposed model is implemented with an encoder–decoder architecture, in which an extra context encoder is proposed to consider the contextual information. A shared embedding layer is designed to capture the correlations between the ASR text and contextual information. The context attention is introduced to learn discriminative representations to support the decoder module. Finally, the proposed approach is validated with an end-to-end ASR model on a multilingual real-world corpus (ATCSpeech). Experimental results demonstrate that the proposed CALM outperforms other baselines for both the ASR and callsign identification task, and can be practically migrated to a real-time environment.
25

Olatunji, Tobi, Tejumade Afonja, Aditya Yadavalli, Chris Chinenye Emezue, Sahib Singh, Bonaventure F. P. Dossou, Joanne Osuchukwu et al. "AfriSpeech-200: Pan-African Accented Speech Dataset for Clinical and General Domain ASR". Transactions of the Association for Computational Linguistics 11 (2023): 1669–85. http://dx.doi.org/10.1162/tacl_a_00627.

Abstract:
Africa has a very poor doctor-to-patient ratio. At very busy clinics, doctors could see 30+ patients per day (a heavy patient burden compared with developed countries), but productivity tools such as clinical automatic speech recognition (ASR) are lacking for these overworked clinicians. However, clinical ASR is mature, even ubiquitous, in developed nations, and clinician-reported performance of commercial clinical ASR systems is generally satisfactory. Furthermore, the recent performance of general domain ASR is approaching human accuracy. However, several gaps exist. Several publications have highlighted racial bias with speech-to-text algorithms and performance on minority accents lags significantly. To our knowledge, there is no publicly available research or benchmark on accented African clinical ASR, and speech data is non-existent for the majority of African accents. We release AfriSpeech, 200hrs of Pan-African English speech, 67,577 clips from 2,463 unique speakers across 120 indigenous accents from 13 countries for clinical and general domain ASR, a benchmark test set, with publicly available pre-trained models with SOTA performance on the AfriSpeech benchmark.
26

Zuluaga-Gomez, Juan, Amrutha Prasad, Iuliia Nigmatulina, Petr Motlicek and Matthias Kleinert. "A Virtual Simulation-Pilot Agent for Training of Air Traffic Controllers". Aerospace 10, no. 5 (22.05.2023): 490. http://dx.doi.org/10.3390/aerospace10050490.

Abstract:
In this paper we propose a novel virtual simulation-pilot engine for speeding up air traffic controller (ATCo) training by integrating different state-of-the-art artificial intelligence (AI)-based tools. The virtual simulation-pilot engine receives spoken communications from ATCo trainees, and it performs automatic speech recognition and understanding. Thus, it goes beyond only transcribing the communication and can also understand its meaning. The output is subsequently sent to a response generator system, which resembles the spoken read-back that pilots give to the ATCo trainees. The overall pipeline is composed of the following submodules: (i) an automatic speech recognition (ASR) system that transforms audio into a sequence of words; (ii) a high-level air traffic control (ATC)-related entity parser that understands the transcribed voice communication; and (iii) a text-to-speech submodule that generates a spoken utterance that resembles a pilot based on the situation of the dialogue. Our system employs state-of-the-art AI-based tools such as Wav2Vec 2.0, Conformer, BERT and Tacotron models. To the best of our knowledge, this is the first work fully based on open-source ATC resources and AI tools. In addition, we develop a robust and modular system with optional submodules that can enhance the system’s performance by incorporating real-time surveillance data, metadata related to exercises (such as sectors or runways), or even a deliberate read-back error to train ATCo trainees to identify them. Our ASR system can reach as low as 5.5% and 15.9% absolute word error rates (WER) on high- and low-quality ATC audio. We also demonstrate that adding surveillance data into the ASR can yield a callsign detection accuracy of more than 96%.
27

Song, Peng, Han Zhang, Cheng Wang, Bin Luo and Jun Xiong Zhang. "Design and Experiment of a Sorting System for Haploid Maize Kernel". International Journal of Pattern Recognition and Artificial Intelligence 32, no. 03 (22.11.2017): 1855002. http://dx.doi.org/10.1142/s0218001418550029.

Abstract:
This study designed an automatic sorting system that can separate haploid maize kernels from cross-breeding kernels that are marked with the Navajo label. This system comprises the seed feeding, image acquisition, sorting and system control units. The seed feeding unit distributes the maize kernels over the synchronous belt. The image acquisition unit acquires the images of maize kernels, as well as distinguishes the heterozygous from haploid kernels based on the color feature of the endosperm. The sorting unit contains mechanical arms and solenoid valves that can select the heterozygote kernels using air suction. Lastly, the system control unit is compatible with other units. Four maize varieties, namely, 958-CAU, 1050-37, LC0990-LN75 and LC0995-CAU, were provided by China Agricultural University, National Maize Improvement Center. These varieties were selected for the experiments conducted. The successful unloading rates of the system are 85.56%, 91.39%, 87.22% and 86.39%, respectively. The color index (R − G) × (G − B)/B was employed to distinguish the pigmented endosperm area. The accuracy rates of identification for the heterozygous kernels are 92.91%, 95.04%, 88.82% and 92.65% for 958-CAU, 1050-37, LC0990-LN75 and LC0995-CAU, respectively.
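Assuming the colour index reconstructed above, (R − G) × (G − B)/B, the per-pixel computation and a simple pigmented-area check could look like the following sketch; the body index and tolerance values are illustrative, not the paper's calibration.

```python
import numpy as np

def endosperm_color_index(image_rgb: np.ndarray) -> np.ndarray:
    """Per-pixel index (R - G) * (G - B) / B for an H x W x 3 array; a small
    epsilon guards against division by zero in dark pixels."""
    r, g, b = (image_rgb[..., i].astype(float) for i in range(3))
    return (r - g) * (g - b) / (b + 1e-6)

def pigmented_area_fraction(image_rgb, body_index=15.0, tol=10.0) -> float:
    """Fraction of pixels whose index departs from a typical kernel-body value
    by more than tol (body_index and tol are illustrative settings)."""
    deviation = np.abs(endosperm_color_index(image_rgb) - body_index)
    return float(np.mean(deviation > tol))

# Toy usage: a synthetic yellow kernel image with a darker pigmented patch.
kernel = np.full((100, 100, 3), (210, 180, 120), dtype=float)
kernel[40:60, 40:60] = (120, 60, 90)
print(pigmented_area_fraction(kernel))   # about 0.04 for this toy image
```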
28

Xiao, Jun, Jianping Xian, Song Li and Shuai Zou. "Research on Elevation Survey Method of Sea-Crossing Bridge under Adverse Conditions". Sustainability 14, no. 18 (16.09.2022): 11641. http://dx.doi.org/10.3390/su141811641.

Abstract:
Aiming at survey scenarios of offshore projects with difficult horizontal elevation transmission and long-distance, all-weather elevation monitoring operations, a long-distance total station trigonometric leveling method based on dynamic compensation is proposed. The feasibility of this method was verified by an outdoor survey experiment, and the range of transverse coverage and the accuracy reached by this method were quantitatively analyzed. The results indicate that this method shows a good correction effect on the survey results of test points under different environmental conditions, which proves that the method is feasible. The correction effect of this method is affected by the distance between the test point and the datum point; within a range of 60 m horizontally from the datum point, an assurance rate of about 90% can be achieved for an error range of 20 mm. Combined with the built-in ATR (Automatic Target Recognition) technology of the total station, this method can bring the elevation survey results to millimeter-level accuracy over a range of about 1000 m by obtaining multiple groups of data and then calculating the mean value. This paper provides a new method for the elevation transfer of sea-crossing bridges under long-distance conditions and harsh environmental conditions.
29

Perero-Codosero, Juan M., Fernando M. Espinoza-Cuadros and Luis A. Hernández-Gómez. "A Comparison of Hybrid and End-to-End ASR Systems for the IberSpeech-RTVE 2020 Speech-to-Text Transcription Challenge". Applied Sciences 12, no. 2 (17.01.2022): 903. http://dx.doi.org/10.3390/app12020903.

Abstract:
This paper describes a comparison between hybrid and end-to-end Automatic Speech Recognition (ASR) systems, which were evaluated on the IberSpeech-RTVE 2020 Speech-to-Text Transcription Challenge. Deep Neural Networks (DNNs) are becoming the most promising technology for ASR at present. In the last few years, traditional hybrid models have been evaluated and compared to other end-to-end ASR systems in terms of accuracy and efficiency. We contribute two different approaches: a hybrid ASR system based on a DNN-HMM and two state-of-the-art end-to-end ASR systems, based on Lattice-Free Maximum Mutual Information (LF-MMI). To address the high difficulty in the speech-to-text transcription of recordings with different speaking styles and acoustic conditions from TV studios to live recordings, data augmentation and Domain Adversarial Training (DAT) techniques were studied. Multi-condition data augmentation applied to our hybrid DNN-HMM demonstrated WER improvements in noisy scenarios (about 10% relatively). In contrast, the results obtained using an end-to-end PyChain-based ASR system were far from our expectations. Nevertheless, we found that when including DAT techniques, a relative WER improvement of 2.87% was obtained as compared to the PyChain-based system.
30

Patil, Prof Shital. "Air Handwriting using AI and ML". International Journal of Scientific Research in Engineering and Management 08, no. 05 (14.05.2024): 1–5. http://dx.doi.org/10.55041/ijsrem33918.

Abstract:
Air-writing refers to virtually writing linguistic characters through hand gestures in three-dimensional space with six degrees of freedom. In this paper, a generic video-camera-dependent convolutional neural network (CNN) based air-writing framework is proposed. Gestures are performed using a marker of fixed color in front of a generic video camera, followed by color-based segmentation to identify the marker and track the trajectory of the marker tip. A pre-trained CNN is then used to classify the gesture. The recognition accuracy is further improved using transfer learning with the newly acquired data. The performance of the system depends greatly on the illumination conditions due to the color-based segmentation. Under less fluctuating illumination conditions the system is able to recognize isolated unistroke numerals of multiple languages. The proposed framework achieved a 97.7% recognition rate in person-independent evaluation over English, Bengali and Devanagari numerals. Object tracking is considered an important task within the field of Computer Vision. The invention of faster computers, the availability of inexpensive and good-quality video cameras and the demand for automated video analysis have given popularity to object tracking techniques. Generally, the video analysis procedure has three major steps: first, detecting the object; second, tracking its movement from frame to frame; and lastly, analysing the behaviour of that object. For object tracking, four different issues are taken into account: selection of a suitable object representation, feature selection for tracking, object detection and object tracking. In the real world, object tracking algorithms are a primary part of different applications such as automatic surveillance, video indexing and vehicle navigation. The generated text can also be used for various purposes, such as sending messages, emails, etc. It will be a powerful means of communication for the deaf. It is an effective communication method that reduces mobile and laptop usage by eliminating the need to write. Key Words: Air Writing, Character Recognition, Object Detection, Real-Time Gesture Control System, Computer Vision, Hand Tracking.
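The colour-segmentation front end can be sketched with OpenCV: threshold the marker colour in HSV space and take the topmost point of the largest blob as the pen tip for the trajectory. The HSV bounds below assume a blue marker and would need tuning; the CNN classification stage is omitted.

```python
import cv2
import numpy as np

LOWER_HSV = np.array([100, 120, 70])    # assumed bounds for a blue marker
UPPER_HSV = np.array([130, 255, 255])

def track_marker_tip(frame_bgr: np.ndarray):
    """Return (x, y) of the marker tip in this frame, or None if no marker found."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, LOWER_HSV, UPPER_HSV)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    marker = max(contours, key=cv2.contourArea)            # largest coloured blob
    tip = min(marker.reshape(-1, 2), key=lambda p: p[1])   # topmost point = tip
    return int(tip[0]), int(tip[1])

# Toy frame: a black image with a blue square standing in for the marker.
frame = np.zeros((240, 320, 3), dtype=np.uint8)
frame[100:120, 150:170] = (255, 0, 0)                      # blue in BGR
print(track_marker_tip(frame))                             # roughly (150, 100)
# Tips collected over successive frames form the trajectory that is rasterized
# and passed to the CNN classifier (omitted here).
```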
31

Liu, Xinwei. "Research on the Impact of ASR Technology on Intelligent Interconnection based on Embedded Experiments". Applied and Computational Engineering 121, no. 1 (10.01.2025): 209–14. https://doi.org/10.54254/2755-2721/2025.20233.

Abstract:
With the rapid development of artificial intelligence technology, automatic speech recognition (ASR) technology has become an important breakthrough in the field of human-computer interaction. ASR technology can convert human speech into text, thereby revolutionizing the way human-computer interaction occurs and significantly improving the efficiency of information processing. Therefore, it is of great research significance to observe the impact of ASR technology on social interaction and public discourse space from the perspective of embedded experiments. This article comprehensively reviews ASR technology, covering its working principles, development history, technical methods and application scope. The basic principles, application scope, and related technologies of embedded experiments are introduced, and directions for future development are outlined. In addition, this article explores the specific application of embedded experiments in ASR technology, analyzes in detail how ASR technology is combined with embedded experiments through steps such as preprocessing, feature extraction, and recognition, and gives three specific cases to illustrate its practical application. The research significance of this article is to reveal the important role of ASR technology in various fields, and how embedded experiments can further promote the optimization and progress of ASR technology.
32

Pei, Jifang, Weibo Huo, Chenwei Wang, Yulin Huang, Yin Zhang, Junjie Wu and Jianyu Yang. "Multiview Deep Feature Learning Network for SAR Automatic Target Recognition". Remote Sensing 13, no. 8 (9.04.2021): 1455. http://dx.doi.org/10.3390/rs13081455.

Abstract:
Multiview synthetic aperture radar (SAR) images contain much richer information for automatic target recognition (ATR) than a single-view one. It is desirable to establish a reasonable multiview ATR scheme and design an effective ATR algorithm to thoroughly learn and extract that classification information, so that superior SAR ATR performance can be achieved. Hence, a general processing framework applicable to a multiview SAR ATR pattern is first given in this paper, which can provide an effective approach to ATR system design. Then, a new ATR method using a multiview deep feature learning network is designed based on the proposed multiview ATR framework. The proposed neural network has a multiple-input parallel topology and several distinct deep feature learning modules, with which the significant classification features, namely the intra-view and inter-view features existing in the input multiview SAR images, are learned simultaneously and thoroughly. Therefore, the proposed multiview deep feature learning network can achieve an excellent SAR ATR performance. Experimental results have shown the superiorities of the proposed multiview SAR ATR method under various operating conditions.
33

Kageura, Kyo, Masaharu Yoshioka, Koichi Takeuchi, Teruo Koyama, Keita Tsuji and Fuyuki Yoshikane. "Recent advances in automatic term recognition". Terminology 6, no. 2 (31.12.2000): 151–73. http://dx.doi.org/10.1075/term.6.2.03kag.

Abstract:
This article provides basic background information on the articles included in this special issue on Japanese term extraction, by (i) clarifying the basic background of research into automatic term recognition, (ii) explaining briefly the ‘contest’-style workshop we organised in 1999, and (iii) briefly summarising the ATR methodologies proposed in the articles, and positioning their ideas, philosophies and methodologies within ATR from a unified perspective. Through this information, we intend to consolidate the contributions of the NTCIR TMREC workshop, and, hopefully, clarify a basic framework for discussion which different researchers can use to constructively communicate with each other about automatic term extraction and beyond.
34

Kim, Seoyoung, Yeon Su Park, Dakyeom Ahn, Jin Myung Kwak, and Juho Kim. "Is the Same Performance Really the Same?: Understanding How Listeners Perceive ASR Results Differently According to the Speaker's Accent". Proceedings of the ACM on Human-Computer Interaction 8, CSCW1 (April 17, 2024): 1–22. http://dx.doi.org/10.1145/3641008.

Abstract:
Research suggests that automatic speech recognition (ASR) systems, which automatically convert speech to text, show different performance across input classes (e.g., accent, age), requiring attention to building fairer AI systems that perform similarly across those classes. However, would an AI system with the same performance regardless of input class really be perceived as fair enough? To this end, we investigate how listeners perceive the same ASR result differently according to whether the speaker is a native speaker (NS) or a non-native speaker (NNS), which may lead to unfair situations. We conducted a study (n = 420) in which participants were given one of ten speech recordings of the same script, spoken with various accents, along with the same captions. We found that even with the same ASR output, listeners perceive the ASR results differently. They found captions to be more useful for NNS speech and blamed NNS more for the errors than NS. Based on the findings, we present design implications suggesting that we should take a step further than just achieving the same performance across various input classes to build a fair ASR system.
35

Zubair, Abdul Rasak, Emmanuel Sinmiloluwa Olu-Flourish, and Martins Obinna Nnaukwu. "Smart Home, Support At Old Age And Support For Persons With Disabilities: Speech Processing For Control Of Energy". International Journal of Advanced Networking and Applications 13, no. 05 (2022): 5134–42. http://dx.doi.org/10.35444/ijana.2022.13507.

Abstract:
Generally, conventional home wiring systems use simple latching switches connected to the power supply to control electrical appliances such as fans, lights, washing machines, air conditioners, and televisions. The switch is usually located on the wall near the electrical appliance, which requires the user to move to the location of the switch to control the appliance. There is a rapid increase in the number of people with special needs, such as the elderly and the disabled. Smart houses are considered a good alternative for the independent life of older persons and persons with disabilities. A smart home is a home that provides its residents comfort, convenience, and ease of operation of devices at all times, irrespective of where the resident actually is within the house. Smart homes include devices that have automatic functions and systems that can be remotely controlled by the user. The primary objective of a smart house is to enhance comfort, energy saving, and security for the residents, and to support independent living for people at old age and people with disabilities. A low-cost prototype of a voice-controlled smart home system controlling four devices through an Arduino microcontroller via a four-channel relay is presented. Voice control is one of the easiest methods for giving input commands and is a more personalized form of control, since it can be adapted and customized to a particular speaker's voice. Voice recognition is computer software embedded in a hardware device with the ability to decode the human voice. Most voice recognition systems require "training" (also called "enrolment"), where an individual speaker reads text or isolated vocabulary into the system. The system analyses the person's specific voice and uses it to fine-tune the recognition of that person's commands. Upon successful recognition of a voice command, the microcontroller drives the corresponding load with the help of the relay circuit. Voice or speech processing has thus been applied successfully to control the supply of energy to home appliances.
36

Ram Kumar, R. P., A. Chandra Prasad, K. Vishnuvardhan, K. Bhuvanesh, and Sanjeev Dhama. "Automated Handwritten Text Recognition". E3S Web of Conferences 430 (2023): 01022. http://dx.doi.org/10.1051/e3sconf/202343001022.

Abstract:
A computer's capacity to recognize and convert handwritten inputs from sources like photographs and paper documents into digital format is known as Automated Handwritten Text Recognition (AHTR). Handwriting recognition systems are frequently employed in a variety of fields, including banking, finance, and the healthcare industry. In this paper, we take on the problem of categorizing any handwritten text, whether in block lettering or cursive. There are many different types of handwritten characters, including digits, symbols, and scripts in English and other languages, which makes handwriting recognition more complex. It is difficult to train an Optical Character Recognition (OCR) system under these requirements. In order to convert handwritten material into digital form, this work aims to categorize each unique handwritten word. Because Convolutional Neural Networks (CNNs) are so effective at this task, they are the method of choice for a handwriting recognition system. The method will be used to identify writing in various formats.
37

Cuellar, Adam, Daniel Brignac, Abhijit Mahalanobis, and Wasfy Mikhael. "Simultaneous Classification of Objects with Unknown Rejection (SCOUR) Using Infra-Red Sensor Imagery". Sensors 25, no. 2 (January 16, 2025): 492. https://doi.org/10.3390/s25020492.

Abstract:
Recognizing targets in infra-red images is an important problem for defense and security applications. A deployed network must not only recognize the known classes, but it must also reject any new or unknown objects without confusing them with one of the known classes. Our goal is to enhance the ability of existing (or pretrained) classifiers to detect and reject unknown classes. Specifically, we do not alter the training strategy of the main classifier, so its performance on known classes remains unchanged. Instead, we introduce a second network (trained using regression) that uses the decision of the primary classifier to produce a class-conditional score indicating whether an input object is indeed a known object. This is performed in a Bayesian framework in which the classification confidence of the primary network is combined with the class-conditional score of the secondary network to accurately separate the unknown objects from the known target classes. Most importantly, our method does not require any examples of OOD imagery for training the second network. For illustrative purposes, we demonstrate the effectiveness of the proposed method using the CIFAR-10 dataset. Ultimately, our goal is to classify known targets in infra-red images while improving the ability to reject unknown classes. Towards this end, we train and test our method on a public-domain medium-wave infra-red (MWIR) dataset provided by the US Army for the development of automatic target recognition (ATR) algorithms. The results of this experiment show that the proposed method outperforms other state-of-the-art methods in rejecting the unknown target types while accurately classifying the known ones.
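The score combination described above can be pictured with a small numerical sketch (illustrative only, with hypothetical scores and threshold; the paper's exact Bayesian formulation is not reproduced here):

    # Reject unknown inputs by combining the primary classifier's confidence
    # with a class-conditional "knownness" score from the secondary network.
    import numpy as np

    def accept_or_reject(class_probs, known_scores, threshold=0.5):
        k = int(np.argmax(class_probs))                # decision of the primary classifier
        combined = class_probs[k] * known_scores[k]    # simple product combination
        return (k, combined) if combined >= threshold else ("unknown", combined)

    print(accept_or_reject(np.array([0.7, 0.2, 0.1]), np.array([0.9, 0.4, 0.5])))  # kept as class 0
    print(accept_or_reject(np.array([0.7, 0.2, 0.1]), np.array([0.3, 0.4, 0.5])))  # rejected as unknown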
38

Zhao, Pengfei, Kai Liu, Hao Zou, and Xiantong Zhen. "Multi-Stream Convolutional Neural Network for SAR Automatic Target Recognition". Remote Sensing 10, no. 9 (September 14, 2018): 1473. http://dx.doi.org/10.3390/rs10091473.

Abstract:
Despite the fact that automatic target recognition (ATR) in Synthetic aperture radar (SAR) images has been extensively researched due to its practical use in both military and civil applications, it remains an unsolved problem. The major challenges of ATR in SAR stem from severe data scarcity and great variation of SAR images. Recent work started to adopt convolutional neural networks (CNNs), which, however, remain unable to handle the aforementioned challenges due to their high dependency on large quantities of data. In this paper, we propose a novel deep convolutional learning architecture, called Multi-Stream CNN (MS-CNN), for ATR in SAR by leveraging SAR images from multiple views. Specifically, we deploy a multi-input architecture that fuses information from multiple views of the same target in different aspects; therefore, the elaborated multi-view design of MS-CNN enables it to make full use of limited SAR image data to improve recognition performance. We design a Fourier feature fusion framework derived from kernel approximation based on random Fourier features which allows us to unravel the highly nonlinear relationship between images and classes. More importantly, MS-CNN is qualified with the desired characteristic of easy and quick manoeuvrability in real SAR ATR scenarios, because it only needs to acquire real-time GPS information from airborne SAR to calculate aspect differences used for constructing testing samples. The effectiveness and generalization ability of MS-CNN have been demonstrated by extensive experiments under both the Standard Operating Condition (SOC) and Extended Operating Condition (EOC) on the MSTAR dataset. Experimental results have shown that our proposed MS-CNN can achieve high recognition rates and outperform other state-of-the-art ATR methods.
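The random Fourier features mentioned in this abstract approximate a radial basis function (RBF) kernel with an explicit finite-dimensional map, so that a simple fusion layer can capture a highly nonlinear relationship. A standalone sketch follows (not the paper's implementation; gamma and the feature dimension are arbitrary):

    # z(x)^T z(y) approximates the RBF kernel exp(-gamma * ||x - y||^2).
    import numpy as np

    def random_fourier_features(X, n_features=256, gamma=1.0, seed=0):
        rng = np.random.default_rng(seed)
        W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(X.shape[1], n_features))
        b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
        return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

    X = np.random.randn(5, 10)
    Z = random_fourier_features(X)
    approx_kernel = Z @ Z.T  # close to the exact RBF kernel matrix for moderate n_features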
39

Cui, Zongyong, Zongjie Cao, Jianyu Yang, and Hongliang Ren. "Hierarchical Recognition System for Target Recognition from Sparse Representations". Mathematical Problems in Engineering 2015 (2015): 1–6. http://dx.doi.org/10.1155/2015/527095.

Abstract:
A hierarchical recognition system (HRS) based on a constrained Deep Belief Network (DBN) is proposed for SAR Automatic Target Recognition (SAR ATR). As a classical deep learning method, the DBN has shown great performance in data reconstruction, big data mining, and classification. However, few works have applied deep learning to small-data problems such as SAR ATR. In HRS, the deep structure and a pattern classifier are combined to solve small-data classification problems. After building the DBN with multiple Restricted Boltzmann Machines (RBMs), hierarchical features can be obtained and then fed to the classifier directly. To obtain a more natural sparse feature representation, the Constrained RBM (CRBM) is proposed by solving a generalized optimization problem. Three RBM variants, L1-RBM, L2-RBM, and L1/2-RBM, are presented and introduced to HRS in this paper. Experiments on the MSTAR public dataset show that the proposed HRS with CRBM outperforms current pattern recognition methods in SAR ATR, such as PCA + SVM, LDA + SVM, and NMF + SVM.
40

Wang, Li, Xueru Bai, and Feng Zhou. "SAR ATR of Ground Vehicles Based on ESENet". Remote Sensing 11, no. 11 (June 1, 2019): 1316. http://dx.doi.org/10.3390/rs11111316.

Abstract:
In recent studies, synthetic aperture radar (SAR) automatic target recognition (ATR) algorithms that are based on the convolutional neural network (CNN) have achieved high recognition rates in the moving and stationary target acquisition and recognition (MSTAR) dataset. However, in a SAR ATR task, the feature maps with little information automatically learned by CNN will disturb the classifier. We design a new enhanced squeeze and excitation (enhanced-SE) module to solve this problem, and then propose a new SAR ATR network, i.e., the enhanced squeeze and excitation network (ESENet). When compared to the available CNN structures that are designed for SAR ATR, the ESENet can extract more effective features from SAR images and obtain better generalization performance. In the MSTAR dataset containing pure targets, the proposed method achieves a recognition rate of 97.32% and it exceeds the available CNN-based SAR ATR algorithms. Additionally, it has shown robustness to large depression angle variation, configuration variants, and version variants.
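For context, the squeeze-and-excitation idea that the enhanced-SE module builds on reweights feature maps channel-wise so that uninformative maps are suppressed. A standard SE block is sketched below (generic PyTorch code, not the ESENet variant itself):

    # Standard squeeze-and-excitation block: global pooling -> channel weights -> rescaling.
    import torch
    import torch.nn as nn

    class SEBlock(nn.Module):
        def __init__(self, channels, reduction=16):
            super().__init__()
            self.fc = nn.Sequential(
                nn.Linear(channels, channels // reduction), nn.ReLU(),
                nn.Linear(channels // reduction, channels), nn.Sigmoid())

        def forward(self, x):                              # x: (B, C, H, W)
            w = x.mean(dim=(2, 3))                         # squeeze: (B, C)
            w = self.fc(w).unsqueeze(-1).unsqueeze(-1)     # excitation: per-channel weights in (0, 1)
            return x * w                                   # suppress low-information feature maps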
41

Li, Yan Peng, Yu Liang Qin, and Hong Qiang Wang. "Performance Evaluation for an Automatic Target Recognition System with the Application of Sugeno Fuzzy Integration". Applied Mechanics and Materials 738-739 (March 2015): 311–15. http://dx.doi.org/10.4028/www.scientific.net/amm.738-739.311.

Abstract:
Automatic Target Recognition (ATR) systems are widely utilized in engineering. However, methods for evaluating the performance of an ATR system are limited. This paper addresses the problem based on the theory of Sugeno fuzzy integration. The performance metrics are first measured, and then a performance evaluation model is developed. Simulation results show that, compared to existing techniques, the proposed method can offer more objective performance conclusions for an ATR system.
42

Pei, Jifang, Zhiyong Wang, Xueping Sun, Weibo Huo, Yin Zhang, Yulin Huang, Junjie Wu, and Jianyu Yang. "FEF-Net: A Deep Learning Approach to Multiview SAR Image Target Recognition". Remote Sensing 13, no. 17 (September 2, 2021): 3493. http://dx.doi.org/10.3390/rs13173493.

Abstract:
Synthetic aperture radar (SAR) is an advanced microwave imaging system of great importance. The recognition of real-world targets from SAR images, i.e., automatic target recognition (ATR), is an attractive but challenging issue. The majority of existing SAR ATR methods are designed for single-view SAR images. However, multiview SAR images contain more abundant classification information than single-view SAR images, which benefits automatic target classification and recognition. This paper proposes an end-to-end deep feature extraction and fusion network (FEF-Net) that can effectively exploit recognition information from multiview SAR images and can boost the target recognition performance. The proposed FEF-Net is based on a multiple-input network structure with some distinct and useful learning modules, such as deformable convolution and squeeze-and-excitation (SE). Multiview recognition information can be effectively extracted and fused with these modules. Therefore, excellent multiview SAR target recognition performance can be achieved by the proposed FEF-Net. The superiority of the proposed FEF-Net was validated based on experiments with the moving and stationary target acquisition and recognition (MSTAR) dataset.
43

Cao, Changjie, Zongyong Cui, Zongjie Cao, Liying Wang, and Jianyu Yang. "An Integrated Counterfactual Sample Generation and Filtering Approach for SAR Automatic Target Recognition with a Small Sample Set". Remote Sensing 13, no. 19 (September 27, 2021): 3864. http://dx.doi.org/10.3390/rs13193864.

Abstract:
Although automatic target recognition (ATR) models based on data-driven algorithms have achieved excellent performance in recent years, the synthetic aperture radar (SAR) ATR model often suffered from performance degradation when it encountered a small sample set. In this paper, an integrated counterfactual sample generation and filtering approach is proposed to alleviate the negative influence of a small sample set. The proposed method consists of a generation component and a filtering component. First, the proposed generation component utilizes the overfitting characteristics of generative adversarial networks (GANs), which ensures the generation of counterfactual target samples. Second, the proposed filtering component is built by learning different recognition functions. In the proposed filtering component, multiple SVMs trained by different SAR target sample sets provide pseudo-labels to the other SVMs to improve the recognition rate. Then, the proposed approach improves the performance of the recognition model dynamically while it continuously generates counterfactual target samples. At the same time, counterfactual target samples that are beneficial to the ATR model are also filtered. Moreover, ablation experiments demonstrate the effectiveness of the various components of the proposed method. Experimental results based on the Moving and Stationary Target Acquisition and Recognition (MSTAR) and OpenSARship dataset also show the advantages of the proposed approach. Even though the size of the constructed training set was 14.5% of the original training set, the recognition performance of the ATR model reached 91.27% with the proposed approach.
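The filtering component described above can be pictured as a consensus check among several classifiers. The sketch below (scikit-learn assumed; the data shapes, agreement rule, and helper name are hypothetical, not the paper's exact procedure) keeps a generated sample only when enough SVMs agree on its pseudo-label:

    # Consensus-based filtering of generated samples using several SVMs.
    import numpy as np
    from sklearn.svm import SVC

    def filter_generated(train_sets, generated, min_agreement=3):
        # train_sets: list of (X, y) pairs, one per SVM; generated: (N, D) candidate samples.
        svms = [SVC(kernel="rbf").fit(X, y) for X, y in train_sets]
        preds = np.stack([clf.predict(generated) for clf in svms])  # (n_svms, N)
        keep, labels = [], []
        for j in range(generated.shape[0]):
            values, counts = np.unique(preds[:, j], return_counts=True)
            if counts.max() >= min_agreement:      # enough SVMs agree on a pseudo-label
                keep.append(j)
                labels.append(values[counts.argmax()])
        return generated[keep], np.array(labels)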
44

Yamany, Sameh M., Aly A. Farag, and Shin-Yi Hsu. "A fuzzy hyperspectral classifier for automatic target recognition (ATR) systems". Pattern Recognition Letters 20, no. 11-13 (November 1999): 1431–38. http://dx.doi.org/10.1016/s0167-8655(99)00116-6.

45

Gunay, Banihan. "A methodology on the automatic recognition of poor lane keeping". Journal of Advanced Transportation 42, no. 2 (April 2008): 129–49. http://dx.doi.org/10.1002/atr.5670420203.

46

Ospangaziyeva, N., A. Zhalalova, and V. Askaroğlu. "PHONETIC WAYS OF AUTOMATIC TEXT RECOGNITION". Tiltanym 96, no. 4 (January 9, 2025): 155–63. https://doi.org/10.55491/2411-6076-2024-4-155-163.

Abstract:
Phonetic-phonological methods and models of automatic text recognition are considered in the article, and the phonemic principle of the Kazakh language is shown. An overview is also given of the work of the scientists who first discussed adapting Kazakh texts to the computer, substantiated the theory, and demonstrated it in practice. Today, computer and information technologies and artificial intelligence are constantly developing in line with new trends. Such information technologies include computers, smartphones, and other gadgets, and these devices are actively used by consumers in everyday life. This is why automatic text recognition needs to be continuously improved and updated. Today, sound automation opens a new facet of Kazakh science. In the written language, the phonemic principle is of special importance: according to this principle, only one grapheme should always be used for each phoneme in the writing system. Such an approach will undoubtedly make writing more coherent and easier to read, and thanks to this, new possibilities of the Kazakh language will be considered. The main results of automatic text recognition can already be seen in the development of the National Corpus of the Kazakh language, which is ready to satisfy the various requests of any reader.
47

Yu, Cuilin, Yikui Zhai, Haifeng Huang, Qingsong Wang, and Wenlve Zhou. "Capsule Broad Learning System Network for Robust Synthetic Aperture Radar Automatic Target Recognition with Small Samples". Remote Sensing 16, no. 9 (April 26, 2024): 1526. http://dx.doi.org/10.3390/rs16091526.

Abstract:
The utilization of deep learning in Synthetic Aperture Radar (SAR) Automatic Target Recognition (ATR) has witnessed a recent surge owing to its remarkable feature extraction capabilities. Nonetheless, deep learning methodologies are often encumbered by inadequacies in labeled data and the protracted nature of training processes. To address these challenges and offer an alternative avenue for accurately extracting image features, this paper puts forth a novel and distinctive network dubbed the Capsule Broad Learning System Network for robust SAR ATR (CBLS-SARNET). This novel strategy is specifically tailored to cater to small-sample SAR ATR scenarios. On the one hand, we introduce a United Division Co-training (UDC) Framework as a feature filter, adeptly amalgamating CapsNet and the Broad Learning System (BLS) to enhance network efficiency and efficacy. On the other hand, we devise a Parameters Sharing (PS) network to facilitate secondary learning by sharing the weight and bias of BLS node layers, thereby augmenting the recognition capability of CBLS-SARNET. Experimental results unequivocally demonstrate that our proposed CBLS-SARNET outperforms other deep learning methods in terms of recognition accuracy and training time. Furthermore, experiments validate the generalization and robustness of our novel method under various conditions, including the addition of blur, Gaussian noise, noisy labels, and different depression angles. These findings underscore the superior generalization capabilities of CBLS-SARNET across diverse SAR ATR scenarios.
48

Cui, Zongyong, Cui Tang, Zongjie Cao, and Nengyuan Liu. "D-ATR for SAR Images Based on Deep Neural Networks". Remote Sensing 11, no. 8 (April 13, 2019): 906. http://dx.doi.org/10.3390/rs11080906.

Abstract:
Automatic target recognition (ATR) can obtain important information for target surveillance from Synthetic Aperture Radar (SAR) images. Thus, a direct automatic target recognition (D-ATR) method based on a deep neural network (DNN) is proposed in this paper. To recognize targets in large-scene SAR images, traditional SAR ATR methods comprise four major steps: detection, discrimination, feature extraction, and classification. However, the recognition performance is sensitive to each step, as the processing result from each step affects the following step. Meanwhile, these processes are independent, which means that there is still room for improvement in processing speed. The proposed D-ATR method can integrate these steps into a whole system and directly recognize targets in large-scene SAR images by encapsulating all of the computation in a single deep convolutional neural network (DCNN). Before the DCNN, a fast sliding method is proposed to partition the large image into sub-images, to avoid information loss when resizing the input images and to avoid the target being divided into several parts. After the DCNN, non-maximum suppression between sub-images (NMSS) is performed on the results of the sub-images to obtain an accurate result for the large-scene SAR image. Experiments on the MSTAR dataset and large-scene SAR images (with resolution 1478 × 1784) show that the proposed method can obtain high accuracy and fast processing speed, and outperforms other methods such as CFAR+SVM, Region-based CNN, and YOLOv2.
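The sliding partition and the cross-sub-image suppression described above can be illustrated with a short sketch (generic Python/NumPy code with hypothetical tile sizes and box formats, not the paper's implementation):

    # Tile a large scene with overlap, then merge detections with non-maximum suppression.
    import numpy as np

    def tile_image(img, tile=128, stride=96):
        # Overlapping tiles so that a target near a tile border appears whole in at least one tile.
        H, W = img.shape
        for y in range(0, max(H - tile, 0) + 1, stride):
            for x in range(0, max(W - tile, 0) + 1, stride):
                yield x, y, img[y:y + tile, x:x + tile]

    def nms(boxes, scores, iou_thr=0.5):
        # boxes: (N, 4) as [x1, y1, x2, y2] in scene coordinates; scores: (N,).
        order = np.argsort(scores)[::-1]
        keep = []
        while order.size:
            i = order[0]
            keep.append(i)
            xx1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
            yy1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
            xx2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
            yy2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
            inter = np.maximum(0, xx2 - xx1) * np.maximum(0, yy2 - yy1)
            area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
            area_rest = (boxes[order[1:], 2] - boxes[order[1:], 0]) * (boxes[order[1:], 3] - boxes[order[1:], 1])
            iou = inter / (area_i + area_rest - inter + 1e-8)
            order = order[1:][iou <= iou_thr]
        return keep

Detections from each tile would first be shifted by the tile's (x, y) offset into scene coordinates before the suppression step.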
49

Wang, Chenwei, Jifang Pei, Zhiyong Wang, Yulin Huang, Junjie Wu, Haiguang Yang, and Jianyu Yang. "When Deep Learning Meets Multi-Task Learning in SAR ATR: Simultaneous Target Recognition and Segmentation". Remote Sensing 12, no. 23 (November 25, 2020): 3863. http://dx.doi.org/10.3390/rs12233863.

Abstract:
With the recent advances of deep learning, automatic target recognition (ATR) of synthetic aperture radar (SAR) has achieved superior performance. By not being limited to the target category, the SAR ATR system could benefit from the simultaneous extraction of multifarious target attributes. In this paper, we propose a new multi-task learning approach for SAR ATR, which can obtain the accurate category and precise shape of the targets simultaneously. By introducing deep learning theory into multi-task learning, we first propose a novel multi-task deep learning framework with two main structures: an encoder and a decoder. The encoder is constructed to extract sufficient image features at different scales for the decoder, while the decoder is a task-specific structure which employs these extracted features adaptively and optimally to meet the different feature demands of recognition and segmentation. Therefore, the proposed framework has the ability to achieve superior recognition and segmentation performance. Based on the Moving and Stationary Target Acquisition and Recognition (MSTAR) dataset, experimental results show the superiority of the proposed framework in terms of recognition and segmentation.
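The encoder-decoder split with task-specific outputs can be pictured with a minimal sketch (generic PyTorch code with toy layer sizes; not the authors' framework, only an illustration of a shared encoder feeding a classification head and a segmentation head trained with a joint loss):

    # Minimal multi-task network: shared encoder, classification head, segmentation head.
    import torch
    import torch.nn as nn

    class MultiTaskNet(nn.Module):
        def __init__(self, n_classes=10):
            super().__init__()
            self.encoder = nn.Sequential(
                nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                nn.Conv2d(16, 32, 3, padding=1), nn.ReLU())
            self.cls_head = nn.Sequential(
                nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, n_classes))
            self.seg_head = nn.Conv2d(32, 1, 1)   # per-pixel target/background logits

        def forward(self, x):
            f = self.encoder(x)
            return self.cls_head(f), self.seg_head(f)

    # Joint objective: weighted sum of the recognition and segmentation losses.
    net = MultiTaskNet()
    images = torch.randn(2, 1, 64, 64)
    labels = torch.randint(0, 10, (2,))
    masks = torch.rand(2, 1, 64, 64).round()
    logits, seg = net(images)
    loss = nn.functional.cross_entropy(logits, labels) \
           + 0.5 * nn.functional.binary_cross_entropy_with_logits(seg, masks)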
50

Gunay, Banihan. "Using automatic number plate recognition technology to observe drivers' headway preferences". Journal of Advanced Transportation 46, no. 4 (June 18, 2012): 305–17. http://dx.doi.org/10.1002/atr.1197.
