Journal articles on the topic 'Deep semantic parsing'

Consult the top 50 journal articles for your research on the topic 'Deep semantic parsing.'


1

Laukaitis, Algirdas, Egidijus Ostašius, and Darius Plikynas. "Deep Semantic Parsing with Upper Ontologies." Applied Sciences 11, no. 20 (October 11, 2021): 9423. http://dx.doi.org/10.3390/app11209423.

Abstract:
This paper presents a new method for semantic parsing with upper ontologies using FrameNet annotations and BERT-based sentence context distributed representations. The proposed method leverages WordNet upper ontology mapping and PropBank-style semantic role labeling and it is designed for long text parsing. Given a PropBank, FrameNet and WordNet-labeled corpus, a model is proposed that annotates the set of semantic roles with upper ontology concept names. These annotations are used for the identification of predicates and arguments that are relevant for virtual reality simulators in a 3D world with a built-in physics engine. It is shown that state-of-the-art results can be achieved in relation to semantic role labeling with upper ontology concepts. Additionally, a manually annotated corpus was created using this new method and is presented in this study. It is suggested as a benchmark for future studies relevant to semantic parsing.
2

Ballesteros, Miguel, Bernd Bohnet, Simon Mille, and Leo Wanner. "Data-driven deep-syntactic dependency parsing." Natural Language Engineering 22, no. 6 (August 18, 2015): 939–74. http://dx.doi.org/10.1017/s1351324915000285.

Abstract:
‘Deep-syntactic’ dependency structures that capture the argumentative, attributive and coordinative relations between full words of a sentence have a great potential for a number of NLP-applications. The abstraction degree of these structures is in between the output of a syntactic dependency parser (connected trees defined over all words of a sentence and language-specific grammatical functions) and the output of a semantic parser (forests of trees defined over individual lexemes or phrasal chunks and abstract semantic role labels which capture the frame structures of predicative elements and drop all attributive and coordinative dependencies). We propose a parser that provides deep-syntactic structures. The parser has been tested on Spanish, English and Chinese.
3

Luo, Ling, Dingyu Xue, and Xinglong Feng. "EHANet: An Effective Hierarchical Aggregation Network for Face Parsing." Applied Sciences 10, no. 9 (April 30, 2020): 3135. http://dx.doi.org/10.3390/app10093135.

Abstract:
In recent years, benefiting from deep convolutional neural networks (DCNNs), face parsing has developed rapidly. However, it still has the following problems: (1) Existing state-of-the-art frameworks usually do not satisfy real-time while pursuing performance; (2) similar appearances cause incorrect pixel label assignments, especially in the boundary; (3) to promote multi-scale prediction, deep features and shallow features are used for fusion without considering the semantic gap between them. To overcome these drawbacks, we propose an effective and efficient hierarchical aggregation network called EHANet for fast and accurate face parsing. More specifically, we first propose a stage contextual attention mechanism (SCAM), which uses higher-level contextual information to re-encode the channel according to its importance. Secondly, a semantic gap compensation block (SGCB) is presented to ensure the effective aggregation of hierarchical information. Thirdly, the advantages of weighted boundary-aware loss effectively make up for the ambiguity of boundary semantics. Without any bells and whistles, combined with a lightweight backbone, we achieve outstanding results on both CelebAMask-HQ (78.19% mIoU) and Helen datasets (90.7% F1-score). Furthermore, our model can achieve 55 FPS on a single GTX 1080Ti card with 640 × 640 input and further reach over 300 FPS with a resolution of 256 × 256, which is suitable for real-world applications.
4

Abdelaziz, Ibrahim, Srinivas Ravishankar, Pavan Kapanipathi, Salim Roukos, and Alexander Gray. "A Semantic Parsing and Reasoning-Based Approach to Knowledge Base Question Answering." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 18 (May 18, 2021): 15985–87. http://dx.doi.org/10.1609/aaai.v35i18.17988.

Abstract:
Knowledge Base Question Answering (KBQA) is a task where existing techniques have faced significant challenges, such as the need for complex question understanding, reasoning, and large training datasets. In this work, we demonstrate Deep Thinking Question Answering (DTQA), a semantic parsing and reasoning-based KBQA system. DTQA (1) integrates multiple, reusable modules that are trained specifically for their individual tasks (e.g. semantic parsing, entity linking, and relationship linking), eliminating the need for end-to-end KBQA training data; (2) leverages semantic parsing and a reasoner for improved question understanding. DTQA is a system of systems that achieves state-of-the-art performance on two popular KBQA datasets.
5

Huang, Lili, Jiefeng Peng, Ruimao Zhang, Guanbin Li, and Liang Lin. "Learning deep representations for semantic image parsing: a comprehensive overview." Frontiers of Computer Science 12, no. 5 (August 30, 2018): 840–57. http://dx.doi.org/10.1007/s11704-018-7195-8.

6

Fernández-Martínez, Nicolás José, and Pamela Faber. "Who stole what from whom?" Languages in Contrast 20, no. 1 (June 5, 2019): 107–40. http://dx.doi.org/10.1075/lic.19002.fer.

Abstract:
Drawing on the Lexical Grammar Model, Frame Semantics and Corpus Pattern Analysis, we analyze and contrast verbs of stealing in English and Spanish from a lexico-semantic perspective. This involves looking at the lexical collocates and their corresponding semantic categories that fill the argument slots of verbs of stealing. Our corpus search is performed with the Word Sketch tool on Sketch Engine. To the best of our knowledge, no study has yet taken advantage of the Word Sketch tool in the study of the selection preferences of verbs of stealing, let alone a semantic, cross-linguistic study of those verbs. Our findings reveal that English and Spanish verbs of stealing map out the same underlying semantic space. This shared conceptual layer can thus be incorporated into an ontology based on deep semantics, which could in turn enhance NLP tasks such as word sense disambiguation, machine translation, semantic tagging, and semantic parsing.
7

Zhao, H., X. Zhang, and C. Kit. "Integrative Semantic Dependency Parsing via Efficient Large-scale Feature Selection." Journal of Artificial Intelligence Research 46 (February 20, 2013): 203–33. http://dx.doi.org/10.1613/jair.3717.

Abstract:
Semantic parsing, i.e., the automatic derivation of meaning representation such as an instantiated predicate-argument structure for a sentence, plays a critical role in deep processing of natural language. Unlike all other top systems of semantic dependency parsing that have to rely on a pipeline framework to chain up a series of submodels each specialized for a specific subtask, the one presented in this article integrates everything into one model, in hopes of achieving desirable integrity and practicality for real applications while maintaining a competitive performance. This integrative approach tackles semantic parsing as a word pair classification problem using a maximum entropy classifier. We leverage adaptive pruning of argument candidates and large-scale feature selection engineering to allow the largest feature space ever in use so far in this field. The system achieves state-of-the-art performance on the evaluation data set for the CoNLL-2008 shared task, on top of all but one top pipeline system, confirming its feasibility and effectiveness.
8

Zhang, Xun, Yantao Du, Weiwei Sun, and Xiaojun Wan. "Transition-Based Parsing for Deep Dependency Structures." Computational Linguistics 42, no. 3 (September 2016): 353–89. http://dx.doi.org/10.1162/coli_a_00252.

Abstract:
Derivations under different grammar formalisms allow extraction of various dependency structures. Particularly, bilexical deep dependency structures beyond surface tree representation can be derived from linguistic analysis grounded by CCG, LFG, and HPSG. Traditionally, these dependency structures are obtained as a by-product of grammar-guided parsers. In this article, we study the alternative data-driven, transition-based approach, which has achieved great success for tree parsing, to build general dependency graphs. We integrate existing tree parsing techniques and present two new transition systems that can generate arbitrary directed graphs in an incremental manner. Statistical parsers that are competitive in both accuracy and efficiency can be built upon these transition systems. Furthermore, the heterogeneous design of transition systems yields diversity of the corresponding parsing models and thus greatly benefits parser ensemble. Concerning the disambiguation problem, we introduce two new techniques, namely, transition combination and tree approximation, to improve parsing quality. Transition combination makes every action performed by a parser significantly change configurations. Therefore, more distinct features can be extracted for statistical disambiguation. With the same goal of extracting informative features, tree approximation induces tree backbones from dependency graphs and re-uses tree parsing techniques to produce tree-related features. We conduct experiments on CCG-grounded functor–argument analysis, LFG-grounded grammatical relation analysis, and HPSG-grounded semantic dependency analysis for English and Chinese. Experiments demonstrate that data-driven models with appropriate transition systems can produce high-quality deep dependency analysis, comparable to more complex grammar-driven models. Experiments also indicate the effectiveness of the heterogeneous design of transition systems for parser ensemble, transition combination, as well as tree approximation for statistical disambiguation.
9

Zhou, Fan, Enbo Huang, Zhuo Su, and Ruomei Wang. "Multiscale Meets Spatial Awareness: An Efficient Attention Guidance Network for Human Parsing." Mathematical Problems in Engineering 2020 (October 16, 2020): 1–12. http://dx.doi.org/10.1155/2020/5794283.

Abstract:
Human parsing, which aims at resolving the human body and clothes into semantic part regions from a human image, is a fundamental task in human-centric analysis. Recently, the approaches for human parsing based on deep convolutional neural networks (DCNNs) have made significant progress. However, hierarchically exploiting multiscale and spatial contexts as convolutional features is still a hurdle to overcome. In order to boost the scale and spatial awareness of a DCNN, we propose two effective structures, named “Attention SPP and Attention RefineNet,” to form a Mutual Attention operation, to exploit multiscale and spatial semantics different from the existing approaches. Moreover, we propose a novel Attention Guidance Network (AG-Net), a simple yet effective architecture without using bells and whistles (such as human pose and edge information), to address human parsing tasks. Comprehensive evaluations on two public datasets demonstrate that the AG-Net outperforms the state-of-the-art networks.
10

Yang, Haitong, Tao Zhuang, and Chengqing Zong. "Domain Adaptation for Syntactic and Semantic Dependency Parsing Using Deep Belief Networks." Transactions of the Association for Computational Linguistics 3 (December 2015): 271–82. http://dx.doi.org/10.1162/tacl_a_00138.

Abstract:
In current systems for syntactic and semantic dependency parsing, people usually define a very high-dimensional feature space to achieve good performance. But these systems often suffer severe performance drops on out-of-domain test data due to the diversity of features of different domains. This paper focuses on how to relieve this domain adaptation problem with the help of unlabeled target domain data. We propose a deep learning method to adapt both syntactic and semantic parsers. With additional unlabeled target domain data, our method can learn a latent feature representation (LFR) that is beneficial to both domains. Experiments on English data in the CoNLL 2009 shared task show that our method largely reduced the performance drop on out-of-domain test data. Moreover, we get a Macro F1 score that is 2.32 points higher than the best system in the CoNLL 2009 shared task in out-of-domain tests.
11

Xin, Peng, and Li Qiujun. "Semantic Dependency Graph Parsing of Financial Domain Questions Based on Deep Learning." Journal of Physics: Conference Series 1453 (January 2020): 012058. http://dx.doi.org/10.1088/1742-6596/1453/1/012058.

12

Yao, Xuchen, Gosse Bouma, and Yi Zhang. "Semantics-based Question Generation and Implementation." Dialogue & Discourse 3, no. 2 (March 16, 2012): 11–42. http://dx.doi.org/10.5087/dad.2012.202.

Abstract:
This paper presents a question generation system based on the approach of semantic rewriting. The state-of-the-art deep linguistic parsing and generation tools are employed to convert (back and forth) between the natural language sentences and their meaning representations in the form of Minimal Recursion Semantics (MRS). By carefully operating on the semantic structures, we show a principled way of generating questions without ad-hoc manipulation of the syntactic structures. Based on the (partial) understanding of the sentence meaning, the system generates questions which are semantically grounded and purposeful. And with the support of deep linguistic grammars, the grammaticality of the generation results is warranted. Further, with a specialized ranking model, the linguistic realizations from the general purpose generation model are further refined for our question generation task. The evaluation results from QGSTEC2010 show promising prospects of the proposed approach.
13

Zhou, Wujie, Shaohua Dong, Caie Xu, and Yaguan Qian. "Edge-Aware Guidance Fusion Network for RGB–Thermal Scene Parsing." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 3 (June 28, 2022): 3571–79. http://dx.doi.org/10.1609/aaai.v36i3.20269.

Abstract:
RGB–thermal scene parsing has recently attracted increasing research interest in the field of computer vision. However, most existing methods fail to perform good boundary extraction for prediction maps and cannot fully use high-level features. In addition, these methods simply fuse the features from RGB and thermal modalities but are unable to obtain comprehensive fused features. To address these problems, we propose an edge-aware guidance fusion network (EGFNet) for RGB–thermal scene parsing. First, we introduce a prior edge map generated using the RGB and thermal images to capture detailed information in the prediction map and then embed the prior edge information in the feature maps. To effectively fuse the RGB and thermal information, we propose a multimodal fusion module that guarantees adequate cross-modal fusion. Considering the importance of high-level semantic information, we propose a global information module and a semantic information module to extract rich semantic information from the high-level features. For decoding, we use simple elementwise addition for cascaded feature fusion. Finally, to improve the parsing accuracy, we apply multitask deep supervision to the semantic and boundary maps. Extensive experiments were performed on benchmark datasets to demonstrate the effectiveness of the proposed EGFNet and its superior performance compared with state-of-the-art methods. The code and results can be found at https://github.com/ShaohuaDong2021/EGFNet.
14

Yin, Zeyu, Jinsong Shao, Muhammad Jawad Hussain, Yajie Hao, Yu Chen, Xuefeng Zhang, and Li Wang. "DPG-LSTM: An Enhanced LSTM Framework for Sentiment Analysis in Social Media Text Based on Dependency Parsing and GCN." Applied Sciences 13, no. 1 (December 27, 2022): 354. http://dx.doi.org/10.3390/app13010354.

Abstract:
Sentiment analysis based on social media text is found to be essential for multiple applications such as project design, measuring customer satisfaction, and monitoring brand reputation. Deep learning models that automatically learn semantic and syntactic information have recently proved effective in sentiment analysis. Despite earlier studies’ good performance, these methods lack syntactic information to guide feature development for contextual semantic linkages in social media text. In this paper, we introduce an enhanced LSTM based on dependency parsing and a graph convolutional network (DPG-LSTM) for sentiment analysis. Our research aims to investigate the importance of syntactic information in the task of social media emotional processing. To fully utilize the semantic information of social media, we adopt a hybrid attention mechanism that combines dependency parsing to capture semantic contextual information. The hybrid attention mechanism redistributes higher attention scores to words with higher dependencies generated by dependency parsing. To validate the performance of the DPG-LSTM from different perspectives, experiments have been conducted on three tweet sentiment classification datasets, sentiment140, airline reviews, and self-driving car reviews with 1,604,510 tweets. The experimental results show that the proposed DPG-LSTM model outperforms the state-of-the-art model by 2.1% recall scores, 1.4% precision scores, and 1.8% F1 scores on sentiment140.
15

Zhao, Ruilin, Yanbing Xue, Jing Cai, and Zan Gao. "Parsing human image by fusing semantic and spatial features: A deep learning approach." Information Processing & Management 57, no. 6 (November 2020): 102306. http://dx.doi.org/10.1016/j.ipm.2020.102306.

16

Bauer, Daniel. "Understanding Descriptions of Visual Scenes Using Graph Grammars." Proceedings of the AAAI Conference on Artificial Intelligence 27, no. 1 (June 29, 2013): 1656–57. http://dx.doi.org/10.1609/aaai.v27i1.8498.

Abstract:
Automatic generation of 3D scenes from descriptions has applications in communication, education, and entertainment, but requires deep understanding of the input text. I propose thesis work on language understanding using graph-based meaning representations that can be decomposed into primitive spatial relations. The techniques used for analyzing text and transforming it into a scene representation are based on context-free graph grammars. The thesis develops methods for semantic parsing with graphs, acquisition of graph grammars, and satisfaction of spatial and world-knowledge constraints during parsing.
17

Ndongala, Nathan Manzambi. "Light RAT-SQL: A RAT-SQL with More Abstraction and Less Embedding of Pre-existing Relations." Texila International Journal of Academic Research 10, no. 2 (April 28, 2023): 1–11. http://dx.doi.org/10.21522/tijar.2014.10.02.art001.

Abstract:
RAT-SQL is among the popular frameworks used in Text-To-SQL challenges for jointly encoding the database relations and questions in order to improve the semantic parser. In this work, we propose a light version of RAT-SQL in which we dramatically reduce the number of pre-existing relations from 55 to 7 (Light RAT-SQL-7) while preserving the same parsing accuracy. To ensure the effectiveness of our approach, we trained a Light RAT-SQL-2 (with 2 embeddings) to show that there is a statistically significant difference between RAT-SQL and Light RAT-SQL-2, while Light RAT-SQL-7 can compete with RAT-SQL.
Keywords: Deep learning, Natural Language Processing, Neural Semantic Parsing, Relation Aware Transformer, RAT-SQL, Text-To-SQL, Transformer.
18

Li, Xiangtai, Houlong Zhao, Lei Han, Yunhai Tong, Shaohua Tan, and Kuiyuan Yang. "Gated Fully Fusion for Semantic Segmentation." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 07 (April 3, 2020): 11418–25. http://dx.doi.org/10.1609/aaai.v34i07.6805.

Abstract:
Semantic segmentation generates comprehensive understanding of scenes through densely predicting the category for each pixel. High-level features from Deep Convolutional Neural Networks already demonstrate their effectiveness in semantic segmentation tasks; however, the coarse resolution of high-level features often leads to inferior results for small/thin objects where detailed information is important. It is natural to consider importing low-level features to compensate for the lost detailed information in high-level features. Unfortunately, simply combining multi-level features suffers from the semantic gap among them. In this paper, we propose a new architecture, named Gated Fully Fusion (GFF), to selectively fuse features from multiple levels using gates in a fully connected way. Specifically, features at each level are enhanced by higher-level features with stronger semantics and lower-level features with more details, and gates are used to control the propagation of useful information, which significantly reduces the noise during fusion. We achieve state-of-the-art results on four challenging scene parsing datasets including Cityscapes, Pascal Context, COCO-stuff and ADE20K.
19

Qian, Rui, Yunchao Wei, Honghui Shi, Jiachen Li, Jiaying Liu, and Thomas Huang. "Weakly Supervised Scene Parsing with Point-Based Distance Metric Learning." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 8843–50. http://dx.doi.org/10.1609/aaai.v33i01.33018843.

Abstract:
Semantic scene parsing suffers from the fact that pixel-level annotations are hard to collect. To tackle this issue, we propose Point-based Distance Metric Learning (PDML) in this paper. PDML does not require dense annotated masks and only leverages several labeled points that are much easier to obtain to guide the training process. Concretely, we leverage the semantic relationship among the annotated points by encouraging the feature representations of the intra- and inter-category points to keep consistent, i.e. points within the same category should have more similar feature representations compared to those from different categories. We formulate such a characteristic into a simple distance metric loss, which collaborates with the point-wise cross-entropy loss to optimize the deep neural networks. Furthermore, to fully exploit the limited annotations, distance metric learning is conducted across different training images instead of simply adopting an image-dependent manner. We conduct extensive experiments on two challenging scene parsing benchmarks, PASCAL-Context and ADE 20K, to validate the effectiveness of our PDML, and competitive mIoU scores are achieved.
20

Ohta, Tomoko. "Semantic retrieval for the accurate identification of relational concepts based on deep syntactic parsing." Journal of Information Processing and Management 49, no. 10 (2007): 555–63. http://dx.doi.org/10.1241/johokanri.49.555.

21

Dai, Hongming. "LinGAN: an Advanced Model for Code Generating based on Linformer." Journal of Physics: Conference Series 2082, no. 1 (November 1, 2021): 012019. http://dx.doi.org/10.1088/1742-6596/2082/1/012019.

Abstract:
Parsing natural language into a corresponding programming language has attracted much attention in recent years. Natural Language to SQL (NL2SQL) widely appears in numerous practical Internet applications. A previous solution converted the input into a heterogeneous graph, which failed to learn good word representations for the question utterance. In this paper, we propose a Relation-Aware framework named LinGAN, which has powerful semantic parsing abilities and can jointly encode the question utterance and syntax information of the object language. We also propose the pre-norm residual shrinkage unit to solve the problem of deep degradation of Linformer. Experiments show that LinGAN achieves excellent performance on multiple code generation tasks.
22

Frisoni, Giacomo, Paolo Italiani, Stefano Salvatori, and Gianluca Moro. "Cogito Ergo Summ: Abstractive Summarization of Biomedical Papers via Semantic Parsing Graphs and Consistency Rewards." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 11 (June 26, 2023): 12781–89. http://dx.doi.org/10.1609/aaai.v37i11.26503.

Abstract:
The automatic synthesis of biomedical publications catalyzes a profound research interest elicited by literature congestion. Current sequence-to-sequence models mainly rely on the lexical surface and seldom consider the deep semantic interconnections between the entities mentioned in the source document. Such superficiality translates into fabricated, poorly informative, redundant, and near-extractive summaries that severely restrict their real-world application in biomedicine, where the specialized jargon and the convoluted facts further emphasize task complexity. To fill this gap, we argue that the summarizer should acquire semantic interpretation over input, exploiting structured and unambiguous representations to capture and conserve the most relevant parts of the text content. This paper presents CogitoErgoSumm, the first framework for biomedical abstractive summarization equipping large pre-trained language models with rich semantic graphs. Precisely, we infuse graphs from two complementary semantic parsing techniques with different goals and granularities—Event Extraction and Abstract Meaning Representation, also designing a reward signal to maximize information content preservation through reinforcement learning. Extensive quantitative and qualitative evaluations on the CDSR dataset show that our solution achieves competitive performance according to multiple metrics, despite using 2.5x fewer parameters. Results and ablation studies indicate that our joint text-graph model generates more enlightening, readable, and consistent summaries. Code available at: https://github.com/disi-unibo-nlp/cogito-ergo-summ.
23

Quispe, Rodolfo, and Helio Pedrini. "Improved person re-identification based on saliency and semantic parsing with deep neural network models." Image and Vision Computing 92 (December 2019): 103809. http://dx.doi.org/10.1016/j.imavis.2019.07.009.

24

Xia, Qingrong, Zhenghua Li, Min Zhang, Meishan Zhang, Guohong Fu, Rui Wang, and Luo Si. "Syntax-Aware Neural Semantic Role Labeling." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 7305–13. http://dx.doi.org/10.1609/aaai.v33i01.33017305.

Abstract:
Semantic role labeling (SRL), also known as shallow semantic parsing, is an important yet challenging task in NLP. Motivated by the close correlation between syntactic and semantic structures, traditional discrete-feature-based SRL approaches make heavy use of syntactic features. In contrast, deep-neural-network-based approaches usually encode the input sentence as a word sequence without considering the syntactic structures. In this work, we investigate several previous approaches for encoding syntactic trees, and make a thorough study on whether extra syntax-aware representations are beneficial for neural SRL models. Experiments on the benchmark CoNLL-2005 dataset show that syntax-aware SRL approaches can effectively improve performance over a strong baseline with external word representations from ELMo. With the extra syntax-aware representations, our approaches achieve new state-of-the-art 85.6 F1 (single model) and 86.6 F1 (ensemble) on the test data, outperforming the corresponding strong baselines with ELMo by 0.8 and 1.0, respectively. Detailed error analyses are conducted to gain more insights on the investigated approaches.
25

Zou, Nan, Zhiyu Xiang, Yiman Chen, Shuya Chen, and Chengyu Qiao. "Simultaneous Semantic Segmentation and Depth Completion with Constraint of Boundary." Sensors 20, no. 3 (January 23, 2020): 635. http://dx.doi.org/10.3390/s20030635.

Abstract:
As the core task of scene understanding, semantic segmentation and depth completion play a vital role in lots of applications such as robot navigation, AR/VR and autonomous driving. They are responsible for parsing scenes from the angle of semantics and geometry, respectively. While great progress has been made in both tasks through deep learning technologies, few works have been done on building a joint model by deeply exploring the inner relationship of the above tasks. In this paper, semantic segmentation and depth completion are jointly considered under a multi-task learning framework. By sharing a common encoder part and introducing boundary features as inner constraints in the decoder part, the two tasks can properly share the required information from each other. An extra boundary detection sub-task is responsible for providing the boundary features and constructing cross-task joint loss functions for network training. The entire network is implemented end-to-end and evaluated with both RGB and sparse depth input. Experiments conducted on synthesized and real scene datasets show that our proposed multi-task CNN model can effectively improve the performance of every single task.
26

Staykova, Kamenka, Petya Osenova, and Kiril Simov. "New Applications of “Ontology-to-Text Relation” Strategy for Bulgarian Language." Cybernetics and Information Technologies 12, no. 4 (December 1, 2012): 43–51. http://dx.doi.org/10.2478/cait-2012-0029.

Abstract:
The paper presents new applications of the Ontology-to-Text Relation Strategy to the Bulgarian Iconographic Domain. First the strategy itself is discussed within the triple ontology-terminological lexicon-annotation grammars, followed by the related work. Also, the specifics of the semantic annotation and evaluation over iconographic data are presented. A family of domain ontologies over the iconographic domain is created and used. The evaluation against a gold standard shows that this strategy is good enough for more precise, but shallow results, and can be supported further by deep parsing techniques.
27

Ma, Sai, Weibing Wan, Zedong Yu, and Yuming Zhao. "EDET: Entity Descriptor Encoder of Transformer for Multi-Modal Knowledge Graph in Scene Parsing." Applied Sciences 13, no. 12 (June 14, 2023): 7115. http://dx.doi.org/10.3390/app13127115.

Abstract:
In scene parsing, the model is required to be able to process complex multi-modal data such as images and contexts in real scenes, and discover their implicit connections from objects existing in the scene. As a storage method that contains entity information and the relationship between entities, a knowledge graph can well express objects and the semantic relationship between objects in the scene. In this paper, a new multi-phase process was proposed to solve scene parsing tasks; first, a knowledge graph was used to align the multi-modal information and then the graph-based model generates results. We also designed an experiment of feature engineering’s validation for a deep-learning model to preliminarily verify the effectiveness of this method. Hence, we proposed a knowledge representation method named Entity Descriptor Encoder of Transformer (EDET), which uses both the entity itself and its internal attributes for knowledge representation. This method can be embedded into the transformer structure to solve multi-modal scene parsing tasks. EDET can aggregate the multi-modal attributes of entities, and the results in the scene graph generation and image captioning tasks prove that EDET has excellent performance in multi-modal fields. Finally, the proposed method was applied to the industrial scene, which confirmed the viability of our method.
28

Manzambi Ndongala, Nathan. "Topological Relation Aware Transformer." Texila International Journal of Academic Research 11, no. 1 (January 31, 2024): 160–74. http://dx.doi.org/10.21522/tijar.2014.11.01.art015.

Abstract:
We present a Topological Relation Aware Transformer (T-RAT), a specialized head transformer to open sets, an element of the topology τ generated by the set S, the set of all pre-existing relations between input tokens of the model. From this topological space (S, τ), we present the way to spread each open set to one head of our Transformer. T-RAT improves exact match accuracy in Text-To-SQL challenge (62.09%) without any enhancement of large language models compared to the baseline models RAT-SQL (57.2%) and Light RAT-SQL (60.25%). Keywords: Deep learning, Natural Language Processing, Neural Semantic Parsing, Relation Aware Transformer, RAT-SQL, Text-To-SQL Transformer.
29

Costa, Marcus Vinícius Coelho Vieira da, Osmar Luiz Ferreira de Carvalho, Alex Gois Orlandi, Issao Hirata, Anesmar Olino de Albuquerque, Felipe Vilarinho e. Silva, Renato Fontes Guimarães, Roberto Arnaldo Trancoso Gomes, and Osmar Abílio de Carvalho Júnior. "Remote Sensing for Monitoring Photovoltaic Solar Plants in Brazil Using Deep Semantic Segmentation." Energies 14, no. 10 (May 20, 2021): 2960. http://dx.doi.org/10.3390/en14102960.

Abstract:
Brazil is a tropical country with continental dimensions and abundant solar resources that are still underutilized. However, solar energy is one of the most promising renewable sources in the country. The proper inspection of Photovoltaic (PV) solar plants is an issue of great interest for the Brazilian territory’s energy management agency, and advances in computer vision and deep learning allow automatic, periodic, and low-cost monitoring. The present research aims to identify PV solar plants in Brazil using semantic segmentation and a mosaicking approach for large image classification. We compared four architectures (U-net, DeepLabv3+, Pyramid Scene Parsing Network, and Feature Pyramid Network) with four backbones (Efficient-net-b0, Efficient-net-b7, ResNet-50, and ResNet-101). For mosaicking, we evaluated a sliding window with overlapping pixels using different stride values (8, 16, 32, 64, 128, and 256). We found that: (1) the models presented similar results, showing that the most relevant approach is to acquire high-quality labels rather than models in many scenarios; (2) U-net presented slightly better metrics, and the best configuration was U-net with the Efficient-net-b7 encoder (98% overall accuracy, 91% IoU, and 95% F-score); (3) mosaicking progressively increases results (precision-recall and receiver operating characteristic area under the curve) when decreasing the stride value, at the cost of a higher computational cost. The high trends of solar energy growth in Brazil require rapid mapping, and the proposed study provides a promising approach.
30

Şahin, Gözde Gül. "To Augment or Not to Augment? A Comparative Study on Text Augmentation Techniques for Low-Resource NLP." Computational Linguistics 48, no. 1 (2022): 5–42. http://dx.doi.org/10.1162/coli_a_00425.

Abstract:
Data-hungry deep neural networks have established themselves as the de facto standard for many NLP tasks, including the traditional sequence tagging ones. Despite their state-of-the-art performance on high-resource languages, they still fall behind their statistical counterparts in low-resource scenarios. One methodology to counterattack this problem is text augmentation, that is, generating new synthetic training data points from existing data. Although NLP has recently witnessed several new textual augmentation techniques, the field still lacks a systematic performance analysis on a diverse set of languages and sequence tagging tasks. To fill this gap, we investigate three categories of text augmentation methodologies that perform changes on the syntax (e.g., cropping sub-sentences), token (e.g., random word insertion), and character (e.g., character swapping) levels. We systematically compare the methods on part-of-speech tagging, dependency parsing, and semantic role labeling for a diverse set of language families using various models, including the architectures that rely on pretrained multilingual contextualized language models such as mBERT. Augmentation most significantly improves dependency parsing, followed by part-of-speech tagging and semantic role labeling. We find the experimented techniques to be effective on morphologically rich languages in general rather than analytic languages such as Vietnamese. Our results suggest that the augmentation techniques can further improve over strong baselines based on mBERT, especially for dependency parsing. We identify the character-level methods as the most consistent performers, while synonym replacement and syntactic augmenters provide inconsistent improvements. Finally, we discuss that the results most heavily depend on the task, language pair (e.g., syntactic-level techniques mostly benefit higher-level tasks and morphologically richer languages), and model type (e.g., token-level augmentation provides significant improvements for BPE, while character-level ones give generally higher scores for char and mBERT based models).
31

Tosic, D., S. Tuttas, L. Hoegner, and U. Stilla. "FUSION OF FEATURE BASED AND DEEP LEARNING METHODS FOR CLASSIFICATION OF MMS POINT CLOUDS." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLII-2/W16 (September 17, 2019): 235–42. http://dx.doi.org/10.5194/isprs-archives-xlii-2-w16-235-2019.

Abstract:
This work proposes an approach for semantic classification of an outdoor-scene point cloud acquired with a high precision Mobile Mapping System (MMS), with major goal to contribute to the automatic creation of High Definition (HD) Maps. The automatic point labeling is achieved by utilizing the combination of a feature-based approach for semantic classification of point clouds and a deep learning approach for semantic segmentation of images. Both, point cloud data, as well as the data from a multi-camera system are used for gaining spatial information in an urban scene. Two types of classification applied for this task are: 1) Feature-based approach, in which the point cloud is organized into a supervoxel structure for capturing geometric characteristics of points. Several geometric features are then extracted for appropriate representation of the local geometry, followed by removing the effect of local tendency for each supervoxel to enhance the distinction between similar structures. And lastly, the Random Forests (RF) algorithm is applied in the classification phase, for assigning labels to supervoxels and therefore to points within them. 2) The deep learning approach is employed for semantic segmentation of MMS images of the same scene. To achieve this, an implementation of Pyramid Scene Parsing Network is used. Resulting segmented images with each pixel containing a class label are then projected onto the point cloud, enabling label assignment for each point. At the end, experiment results are presented from a complex urban scene and the performance of this method is evaluated on a manually labeled dataset, for the deep learning and feature-based classification individually, as well as for the result of the labels fusion. The achieved overall accuracy with fusioned output is 0.87 on the final test set, which significantly outperforms the results of individual methods on the same point cloud. The labeled data is published on the TUM-PF Semantic-Labeling-Benchmark.
32

UZZAMAN, NAUSHAD, and JAMES F. ALLEN. "EVENT AND TEMPORAL EXPRESSION EXTRACTION FROM RAW TEXT: FIRST STEP TOWARDS A TEMPORALLY AWARE SYSTEM." International Journal of Semantic Computing 04, no. 04 (December 2010): 487–508. http://dx.doi.org/10.1142/s1793351x10001097.

Abstract:
Extracting temporal information from raw text is fundamental for deep language understanding, and key to many applications like question answering, information extraction, and document summarization. Our long-term goal is to build complete temporal structure of documents and use the temporal structure in other applications like textual entailment, question answering, visualization, or others. In this paper, we present a first step, a system for extracting events, event features, main events, temporal expressions and their normalized values from raw text. Our system is a combination of deep semantic parsing with extraction rules, Markov Logic Network classifiers and Conditional Random Field classifiers. To compare with existing systems, we evaluated our system on the TempEval-1 and TempEval-2 corpus. Our system outperforms or performs competitively with existing systems that evaluate on the TimeBank, TempEval-1 and TempEval-2 corpus and our performance is very close to inter-annotator agreement of the TimeBank annotators.
33

Pan, Qian, Maofang Gao, Pingbo Wu, Jingwen Yan, and Shilei Li. "A Deep-Learning-Based Approach for Wheat Yellow Rust Disease Recognition from Unmanned Aerial Vehicle Images." Sensors 21, no. 19 (September 30, 2021): 6540. http://dx.doi.org/10.3390/s21196540.

Abstract:
Yellow rust is a disease with a wide range that causes great damage to wheat. The traditional method of manually identifying wheat yellow rust is very inefficient. To improve this situation, this study proposed a deep-learning-based method for identifying wheat yellow rust from unmanned aerial vehicle (UAV) images. The method was based on the pyramid scene parsing network (PSPNet) semantic segmentation model to classify healthy wheat, yellow rust wheat, and bare soil in small-scale UAV images, and to investigate the spatial generalization of the model. In addition, it was proposed to use the high-accuracy classification results of traditional algorithms as weak samples for wheat yellow rust identification. The recognition accuracy of the PSPNet model in this study reached 98%. On this basis, this study used the trained semantic segmentation model to recognize another wheat field. The results showed that the method had certain generalization ability, and its accuracy reached 98%. In addition, the high-accuracy classification result of a support vector machine was used as a weak label by weak supervision, which better solved the labeling problem of large-size images, and the final recognition accuracy reached 94%. Therefore, the present study method facilitated timely control measures to reduce economic losses.
34

Panboonyuen, Teerapong, Kulsawasd Jitkajornwanich, Siam Lawawirojwong, Panu Srestasathiern, and Peerapon Vateekul. "Transformer-Based Decoder Designs for Semantic Segmentation on Remotely Sensed Images." Remote Sensing 13, no. 24 (December 15, 2021): 5100. http://dx.doi.org/10.3390/rs13245100.

Abstract:
Transformers have demonstrated remarkable accomplishments in several natural language processing (NLP) tasks as well as image processing tasks. Herein, we present a deep-learning (DL) model that is capable of improving the semantic segmentation network in two ways. First, utilizing the pre-training Swin Transformer (SwinTF) under Vision Transformer (ViT) as a backbone, the model weights downstream tasks by joining task layers upon the pretrained encoder. Secondly, decoder designs are applied to our DL network with three decoder designs, U-Net, pyramid scene parsing (PSP) network, and feature pyramid network (FPN), to perform pixel-level segmentation. The results are compared with other image labeling state of the art (SOTA) methods, such as global convolutional network (GCN) and ViT. Extensive experiments show that our Swin Transformer (SwinTF) with decoder designs reached a new state of the art on the Thailand Isan Landsat-8 corpus (89.8% F1 score), Thailand North Landsat-8 corpus (63.12% F1 score), and competitive results on ISPRS Vaihingen. Moreover, both our best-proposed methods (SwinTF-PSP and SwinTF-FPN) even outperformed SwinTF with supervised pre-training ViT on the ImageNet-1K in the Thailand, Landsat-8, and ISPRS Vaihingen corpora.
35

Iacob, Radu Cristian Alexandru, Vlad Cristian Monea, Dan Rădulescu, Andrei-Florin Ceapă, Traian Rebedea, and Ștefan Trăușan-Matu. "AlgoLabel: A Large Dataset for Multi-Label Classification of Algorithmic Challenges." Mathematics 8, no. 11 (November 9, 2020): 1995. http://dx.doi.org/10.3390/math8111995.

Abstract:
While semantic parsing has been an important problem in natural language processing for decades, recent years have seen a wide interest in automatic generation of code from text. We propose an alternative problem to code generation: labelling the algorithmic solution for programming challenges. While this may seem an easier task, we highlight that current deep learning techniques are still far from offering a reliable solution. The contributions of the paper are twofold. First, we propose a large multi-modal dataset of text and code pairs consisting of algorithmic challenges and their solutions, called AlgoLabel. Second, we show that vanilla deep learning solutions need to be greatly improved to solve this task and we propose a dual text-code neural model for detecting the algorithmic solution type for a programming challenge. While the proposed text-code model increases the performance of using the text or code alone, the improvement is rather small highlighting that we require better methods to combine text and code features.
36

Zhang, Zeyu, Honggui Deng, Yang Liu, Qiguo Xu, and Gang Liu. "A Semi-Supervised Semantic Segmentation Method for Blast-Hole Detection." Symmetry 14, no. 4 (March 23, 2022): 653. http://dx.doi.org/10.3390/sym14040653.

Abstract:
The goal of blast-hole detection is to help place charge explosives into blast-holes. This process is full of challenges, because it requires the ability to extract sample features in complex environments, and to detect a wide variety of blast-holes. Detection techniques based on deep learning with RGB-D semantic segmentation have emerged in recent years of research and achieved good results. However, implementing semantic segmentation based on deep learning usually requires a large amount of labeled data, which creates a large burden on the production of the dataset. To address the dilemma that there is very little training data available for explosive charging equipment to detect blast-holes, this paper extends the core idea of semi-supervised learning to RGB-D semantic segmentation, and devises an ERF-AC-PSPNet model based on a symmetric encoder–decoder structure. The model adds a residual connection layer and a dilated convolution layer for down-sampling, followed by an attention complementary module to acquire the feature maps, and uses a pyramid scene parsing network to achieve hole segmentation during decoding. A new semi-supervised learning method, based on pseudo-labeling and self-training, is proposed, to train the model for intelligent detection of blast-holes. The designed pseudo-labeling is based on the HOG algorithm and depth data, and proved to have good results in experiments. To verify the validity of the method, we carried out experiments on the images of blast-holes collected at a mine site. Compared to the previous segmentation methods, our method is less dependent on the labeled data and achieved IoU of 0.810, 0.867, 0.923, and 0.945, at labeling ratios of 1/8, 1/4, 1/2, and 1.
37

Li, Rui, Shili Shu, Shunli Wang, Yang Liu, Yanhao Li, and Mingjun Peng. "DAT-MT Accelerated Graph Fusion Dependency Parsing Model for Small Samples in Professional Fields." Entropy 25, no. 10 (October 12, 2023): 1444. http://dx.doi.org/10.3390/e25101444.

Abstract:
The rapid development of information technology has made the amount of information in massive texts far exceed human intuitive cognition, and dependency parsing can effectively deal with information overload. In the background of domain specialization, the migration and application of syntactic treebanks and the speed improvement in syntactic analysis models become the key to the efficiency of syntactic analysis. To realize domain migration of syntactic tree library and improve the speed of text parsing, this paper proposes a novel approach—the Double-Array Trie and Multi-threading (DAT-MT) accelerated graph fusion dependency parsing model. It effectively combines the specialized syntactic features from small-scale professional field corpus with the generalized syntactic features from large-scale news corpus, which improves the accuracy of syntactic relation recognition. Aiming at the problem of high space and time complexity brought by the graph fusion model, the DAT-MT method is proposed. It realizes the rapid mapping of massive Chinese character features to the model’s prior parameters and the parallel processing of calculation, thereby improving the parsing speed. The experimental results show that the unlabeled attachment score (UAS) and the labeled attachment score (LAS) of the model are improved by 13.34% and 14.82% compared with the model with only the professional field corpus and improved by 3.14% and 3.40% compared with the model only with news corpus; both indicators are better than DDParser and LTP 4 methods based on deep learning. Additionally, the method in this paper achieves a speedup of about 3.7 times compared to the method with a red-black tree index and a single thread. Efficient and accurate syntactic analysis methods will benefit the real-time processing of massive texts in professional fields, such as multi-dimensional semantic correlation, professional feature extraction, and domain knowledge graph construction.
38

Li, Wei, Junhua Gu, Benwen Chen, and Jungong Han. "Incremental Instance-Oriented 3D Semantic Mapping via RGB-D Cameras for Unknown Indoor Scene." Discrete Dynamics in Nature and Society 2020 (April 23, 2020): 1–10. http://dx.doi.org/10.1155/2020/2528954.

Abstract:
Scene parsing plays a crucial role when accomplishing human-robot interaction tasks. As the “eye” of the robot, RGB-D camera is one of the most important components for collecting multiview images to construct instance-oriented 3D environment semantic maps, especially in unknown indoor scenes. Although there are plenty of studies developing accurate object-level mapping systems with different types of cameras, these methods either process the instance segmentation problem in completed mapping or suffer from a critical real-time issue due to heavy computation processing required. In this paper, we propose a novel method to incrementally build instance-oriented 3D semantic maps directly from images acquired by the RGB-D camera. To ensure an efficient reconstruction of 3D objects with semantic and instance IDs, the input RGB images are operated by a real-time deep-learned object detector. To obtain accurate point cloud cluster, we adopt the Gaussian mixture model as an optimizer after processing 2D to 3D projection. Next, we present a data association strategy to update class probabilities across the frames. Finally, a map integration strategy fuses information about their 3D shapes, locations, and instance IDs in a faster way. We evaluate our system on different indoor scenes including offices, bedrooms, and living rooms from the SceneNN dataset, and the results show that our method not only builds the instance-oriented semantic map efficiently but also enhances the accuracy of the individual instance in the scene.
39

Sermet, Yusuf, and Ibrahim Demir. "A Semantic Web Framework for Automated Smart Assistants: A Case Study for Public Health." Big Data and Cognitive Computing 5, no. 4 (October 18, 2021): 57. http://dx.doi.org/10.3390/bdcc5040057.

Abstract:
The COVID-19 pandemic elucidated that knowledge systems will be instrumental in cases where accurate information needs to be communicated to a substantial group of people with different backgrounds and technological resources. However, several challenges and obstacles hold back the wide adoption of virtual assistants by public health departments and organizations. This paper presents the Instant Expert, an open-source semantic web framework to build and integrate voice-enabled smart assistants (i.e., chatbots) for any web platform regardless of the underlying domain and technology. The component allows non-technical domain experts to effortlessly incorporate an operational assistant with voice recognition capability into their websites. Instant Expert is capable of automatically parsing, processing, and modeling Frequently Asked Questions pages as an information resource as well as communicating with an external knowledge engine for ontology-powered inference and dynamic data use. The presented framework uses advanced web technologies to ensure reusability and reliability, and an inference engine for natural-language understanding powered by deep learning and heuristic algorithms. A use case for creating an informatory assistant for COVID-19 based on the Centers for Disease Control and Prevention (CDC) data is presented to demonstrate the framework’s usage and benefits.
40

Ma, Kaifeng, Mengshu Hao, Wenlong Shang, Jinping Liu, Junzhen Meng, Qingfeng Hu, Peipei He, and Shiming Li. "Study on the Influence of Label Image Accuracy on the Performance of Concrete Crack Segmentation Network Models." Sensors 24, no. 4 (February 6, 2024): 1068. http://dx.doi.org/10.3390/s24041068.

Abstract:
A high-quality dataset is a basic requirement to ensure the training quality and prediction accuracy of a deep learning network model (DLNM). To explore the influence of label image accuracy on the performance of a concrete crack segmentation network model in a semantic segmentation dataset, this study uses three labelling strategies, namely pixel-level fine labelling, outer contour widening labelling and topological structure widening labelling, respectively, to generate crack label images and construct three sets of crack semantic segmentation datasets with different accuracy. Four semantic segmentation network models (SSNMs), U-Net, High-Resolution Net (HRNet)V2, Pyramid Scene Parsing Network (PSPNet) and DeepLabV3+, were used for learning and training. The results show that the datasets constructed from the crack label images with pixel-level fine labelling are more conducive to improving the accuracy of the network model for crack image segmentation. The U-Net had the best performance among the four SSNMs. The Mean Intersection over Union (MIoU), Mean Pixel Accuracy (MPA) and Accuracy reached 85.47%, 90.86% and 98.66%, respectively. The average difference between the quantized width of the crack image segmentation obtained by U-Net and the real crack width was 0.734 pixels, the maximum difference was 1.997 pixels, and the minimum difference was 0.141 pixels. Therefore, to improve the segmentation accuracy of crack images, the pixel-level fine labelling strategy and U-Net are the best choices.
41

Song, Young Chol, and Henry Kautz. "A Testbed for Learning by Demonstration from Natural Language and RGB-Depth Video." Proceedings of the AAAI Conference on Artificial Intelligence 26, no. 1 (September 20, 2021): 2457–58. http://dx.doi.org/10.1609/aaai.v26i1.8430.

Abstract:
We are developing a testbed for learning by demonstration combining spoken language and sensor data in a natural real-world environment. Microsoft Kinect RGB-Depth cameras allow us to infer high-level visual features, such as the relative position of objects in space, with greater precision and less training than required by traditional systems. Speech is recognized and parsed using a “deep” parsing system, so that language features are available at the word, syntactic, and semantic levels. We collected an initial data set of 10 episodes of 7 individuals demonstrating how to “make tea”, and created a “gold standard” hand annotation of the actions performed in each. Finally, we are constructing “baseline” HMM-based activity recognition models using the visual and language features, in order to be ready to evaluate the performance of our future work on deeper and more structured models.
42

Nugraha, Deny Wiria, Amil Ahmad Ilham, Andani Achmad, and Ardiaty Arief. "Performance Improvement of Deep Convolutional Networks for Aerial Imagery Segmentation of Natural Disaster-Affected Areas." JOIV : International Journal on Informatics Visualization 7, no. 4 (December 31, 2023): 2321. http://dx.doi.org/10.30630/joiv.7.4.01383.

Abstract:
This study proposes a framework for improving performance and exploring the application of Deep Convolutional Networks (DCN) using the best parameters and criteria to accurately produce aerial imagery semantic segmentation of natural disaster-affected areas. This study utilizes two models: U-Net and Pyramid Scene Parsing Network (PSPNet). Extensive study results show that the Grid Search algorithm can improve the performance of the two models used, whereas previous research has not used the Grid Search algorithm to improve performance in aerial imagery segmentation of natural disaster-affected areas. The Grid Search algorithm performs parameter tuning on DCN, data augmentation criteria tuning, and dataset criteria tuning for pre-training. The most optimal DCN model is shown by PSPNet (152) (bpc), using the best parameters and criteria, with a mean Intersection over Union (mIoU) of 83.34%, a significant mIoU increase of 43.09% compared to using only the default parameters and criteria (baselines). The validation results using the k-fold cross-validation method on the most optimal DCN model produced an average accuracy of 99.04%. PSPNet(152) (bpc) can detect and identify various objects with irregular shapes and sizes, can detect and identify various important objects affected by natural disasters such as flooded buildings and roads, and can detect and identify objects with small shapes such as vehicles and pools, which are the most challenging task for semantic segmentation network models. This study also shows that increasing the network layers in the PSPNet-(18, 34, 50, 101, 152) model, which uses the best parameters and criteria, improves the model's performance. The results of this study indicate the need to utilize a special dataset from aerial imagery originating from the Unmanned Aerial Vehicle (UAV) during the pre-training stage for transfer learning to improve DCN performance for further research.
43

Mekhalfi, Mohamed Lamine, Mesay Belete Bejiga, Davide Soresina, Farid Melgani, and Begüm Demir. "Capsule Networks for Object Detection in UAV Imagery." Remote Sensing 11, no. 14 (July 17, 2019): 1694. http://dx.doi.org/10.3390/rs11141694.

Abstract:
Recent advances in Convolutional Neural Networks (CNNs) have attracted great attention in remote sensing due to their high capability to model high-level semantic content of Remote Sensing (RS) images. However, CNNs do not explicitly retain the relative position of objects in an image and, thus, the effectiveness of the obtained features is limited in the framework of the complex object detection problems. To address this problem, in this paper we introduce Capsule Networks (CapsNets) for object detection in Unmanned Aerial Vehicle-acquired images. Unlike CNNs, CapsNets extract and exploit the information content about objects’ relative position across several layers, which enables parsing crowded scenes with overlapping objects. Experimental results obtained on two datasets for car and solar panel detection problems show that CapsNets provide similar object detection accuracies when compared to state-of-the-art deep models with significantly reduced computational time. This is due to the fact that CapsNets emphasize dynamic routine instead of the depth.
44

KRIEGER, HANS-ULRICH. "From UBGs to CFGs A practical corpus-driven approach." Natural Language Engineering 13, no. 4 (December 2007): 317–51. http://dx.doi.org/10.1017/s1351324906004128.

Abstract:
We present a simple and intuitive unsound corpus-driven approximation method for turning unification-based grammars, such as HPSG, CLE, or PATR-II into context-free grammars (CFGs). Our research is motivated by the idea that we can exploit (large-scale), hand-written unification grammars not only for the purpose of describing natural language and obtaining a syntactic structure (and perhaps a semantic form), but also to address several other very practical topics. Firstly, to speed up deep parsing by having a cheap recognition pre-filter (the approximated CFG). Secondly, to obtain an indirect stochastic parsing model for the unification grammar through a trained PCFG, obtained from the approximated CFG. This gives us an efficient disambiguation model for the unification-based grammar. Thirdly, to generate domain-specific subgrammars for application areas such as information extraction or question answering. And finally, to compile context-free language models which assist the acoustic model of a speech recognizer. The approximation method is unsound in that it does not generate a CFG whose language is a true superset of the language accepted by the original unification-based grammar. It is a corpus-driven method in that it relies on a corpus of parsed sentences and generates broader CFGs when given more input samples. Our open approach can be fine-tuned in different directions, allowing us to monotonically come close to the original parse trees by shifting more information into the context-free symbols. The approach has been fully implemented in JAVA.
45

Qiao, Dalei, Guangzhong Liu, Taizhi Lv, Wei Li, and Juan Zhang. "Marine Vision-Based Situational Awareness Using Discriminative Deep Learning: A Survey." Journal of Marine Science and Engineering 9, no. 4 (April 8, 2021): 397. http://dx.doi.org/10.3390/jmse9040397.

Abstract:
The primary task of marine surveillance is to construct a perfect marine situational awareness (MSA) system that serves to safeguard national maritime rights and interests and to maintain blue homeland security. Progress in maritime wireless communication, developments in artificial intelligence, and automation of marine turbines together imply that intelligent shipping is inevitable in future global shipping. Computer vision-based situational awareness provides visual semantic information to human beings that approximates eyesight, which makes it likely to be widely used in the field of intelligent marine transportation. We describe how we combined the visual perception tasks required for marine surveillance with those required for intelligent ship navigation to form a marine computer vision-based situational awareness complex and investigated the key technologies they have in common. Deep learning was a prerequisite activity. We summarize the progress made in four aspects of current research: full scene parsing of an image, target vessel re-identification, target vessel tracking, and multimodal data fusion with data from visual sensors. The paper gives a summary of research to date to provide background for this work and presents brief analyses of existing problems, outlines some state-of-the-art approaches, reviews available mainstream datasets, and indicates the likely direction of future research and development. As far as we know, this paper is the first review of research into the use of deep learning in situational awareness of the ocean surface. It provides a firm foundation for further investigation by researchers in related fields.
46

Ji, Tianbo, Chenyang Lyu, Zhichao Cao, and Peng Cheng. "Multi-Hop Question Generation Using Hierarchical Encoding-Decoding and Context Switch Mechanism." Entropy 23, no. 11 (October 31, 2021): 1449. http://dx.doi.org/10.3390/e23111449.

Abstract:
Neural auto-regressive sequence-to-sequence models have been dominant in text generation tasks, especially the question generation task. However, neural generation models suffer from the global and local semantic drift problems. Hence, we propose the hierarchical encoding–decoding mechanism that aims at encoding rich structure information of the input passages and reducing the variance in the decoding phase. In the encoder, we hierarchically encode the input passages according to their structure at four granularity-levels: [word, chunk, sentence, document]-level. Second, we progressively select the context vector from the document-level representations to the word-level representations at each decoding time step. We also propose the context switch mechanism that enables the decoder to use the context vector from the last step when generating the current word at each time-step. It provides a means of improving the stability of the text generation process during the decoding phase when generating a set of consecutive words. Additionally, we inject syntactic parsing knowledge to enrich the word representations. Experimental results show that our proposed model substantially improves the performance and outperforms previous baselines according to both automatic and human evaluation. Besides, we implement a deep and comprehensive analysis of generated questions based on their types.
47

Mou, L., Y. Hua, P. Jin, and X. X. Zhu. "Global Message Passing in Networks via Task-Driven Random Walks for Semantic Segmentation of Remote Sensing Images." ISPRS Annals of Photogrammetry, Remote Sensing and Spatial Information Sciences V-2-2020 (August 3, 2020): 533–40. http://dx.doi.org/10.5194/isprs-annals-v-2-2020-533-2020.

Abstract:
The capability of globally modeling and reasoning about relations between image regions is crucial for complex scene understanding tasks such as semantic segmentation. Most current semantic segmentation methods fall back on deep convolutional neural networks (CNNs), while their use of convolutions with local receptive fields is typically inefficient at capturing long-range dependencies. Recent works on self-attention mechanisms and relational reasoning networks seek to address this issue by learning pairwise relations between every pair of entities and have showcased promising results. But such approaches have heavy computational and memory overheads, which makes them infeasible for dense prediction tasks, particularly on large images, e.g., aerial imagery. In this work, we propose an efficient method for global context modeling in which, at each position, a sparse set of features, instead of all features, over the spatial domain is adaptively sampled and aggregated. We further devise a highly efficient instantiation of the proposed method, namely learning RANdom walK samplIng aNd feature aGgregation (RANKING). The proposed module is lightweight and general, and can be used in a plug-and-play fashion with the existing fully convolutional neural network (FCN) framework. To evaluate RANKING-equipped networks, we conduct experiments on two aerial scene parsing datasets, and the networks achieve competitive results at significantly lower computational and memory cost.
48

Zhang, Ruidong, and Xinguang Zhang. "Geometric Constraint-Based and Improved YOLOv5 Semantic SLAM for Dynamic Scenes." ISPRS International Journal of Geo-Information 12, no. 6 (May 23, 2023): 211. http://dx.doi.org/10.3390/ijgi12060211.

Abstract:
When using deep learning networks for dynamic feature rejection in SLAM systems, problems such as a priori static object motion leading to disturbed build quality and accuracy and slow system runtime are prone to occur. In this paper, based on the ORB-SLAM2 system, we propose a method based on improved YOLOv5 networks combined with geometric constraint methods for SLAM map building in dynamic environments. First, this paper uses ShuffleNetV2 to lighten the YOLOv5 network, which increases the improved network’s operation speed without reducing the accuracy. At the same time, a pyramidal scene parsing network segmentation head is added to the head part of the YOLOv5 network to achieve semantic extraction in the environment, so that the improved YOLOv5 network has both target detection and semantic segmentation functions. In order to eliminate the objects with low dynamic features in the environment, this paper adopts the method of geometric constraints to extract and eliminate the dynamic features of the low dynamic objects. By combining the improved YOLOv5 network with the geometric constraint method, the robustness of the system is improved and the interference of dynamic targets in the construction of the SLAM system map is eliminated. Test results on the TUM dataset show that, compared with the traditional ORB-SLAM2 algorithm, the accuracy of map construction in a dynamic environment is significantly improved. The absolute trajectory error is reduced by 97.7% and the relative position error by 59.7% compared with ORB-SLAM2. Compared with DynaSLAM for dynamic scenes of the same type, the accuracy of map construction is slightly improved, but the maximum increase in keyframe processing time is 94.7%.
49

Gibril, Mohamed Barakat A., Helmi Zulhaidi Mohd Shafri, Abdallah Shanableh, Rami Al-Ruzouq, Aimrun Wayayok, and Shaiful Jahari Hashim. "Deep Convolutional Neural Network for Large-Scale Date Palm Tree Mapping from UAV-Based Images." Remote Sensing 13, no. 14 (July 15, 2021): 2787. http://dx.doi.org/10.3390/rs13142787.

Abstract:
Large-scale mapping of date palm trees is vital for their consistent monitoring and sustainable management, considering their substantial commercial, environmental, and cultural value. This study presents an automatic approach for the large-scale mapping of date palm trees from very-high-spatial-resolution (VHSR) unmanned aerial vehicle (UAV) datasets, based on a deep learning approach. A U-Shape convolutional neural network (U-Net), based on a deep residual learning framework, was developed for the semantic segmentation of date palm trees. A comprehensive set of labeled data was established to enable the training and evaluation of the proposed segmentation model and increase its generalization capability. The performance of the proposed approach was compared with those of various state-of-the-art fully convolutional networks (FCNs) with different encoder architectures, including U-Net (based on VGG-16 backbone), pyramid scene parsing network, and two variants of DeepLab V3+. Experimental results showed that the proposed model outperformed other FCNs in the validation and testing datasets. The generalizability evaluation of the proposed approach on a comprehensive and complex testing dataset exhibited higher classification accuracy and showed that date palm trees could be automatically mapped from VHSR UAV images with an F-score, mean intersection over union, precision, and recall of 91%, 85%, 0.91, and 0.92, respectively. The proposed approach provides an efficient deep learning architecture for the automatic mapping of date palm trees from VHSR UAV-based images.
50

Ma, Kaifeng, Xiang Meng, Mengshu Hao, Guiping Huang, Qingfeng Hu, and Peipei He. "Research on the Efficiency of Bridge Crack Detection by Coupling Deep Learning Frameworks with Convolutional Neural Networks." Sensors 23, no. 16 (August 19, 2023): 7272. http://dx.doi.org/10.3390/s23167272.

Abstract:
Bridge crack detection based on deep learning is a research area of great interest and difficulty in the field of bridge health detection. This study aimed to investigate the effectiveness of coupling a deep learning framework (DLF) with a convolutional neural network (CNN) for bridge crack detection. A dataset consisting of 2068 bridge crack images was randomly split into training, verification, and testing sets with a ratio of 8:1:1, respectively. Several CNN models, including Faster R-CNN, Single Shot MultiBox Detector (SSD), You Only Look Once (YOLO)-v5(x), U-Net, and Pyramid Scene Parsing Network (PSPNet), were used to conduct experiments using the PyTorch, TensorFlow2, and Keras frameworks. The experimental results show that the Harmonic Mean (F1) values of the detection results of the Faster R-CNN and SSD models under the Keras framework are relatively large (0.76 and 0.67, respectively, in the object detection model). The YOLO-v5(x) model of the TensorFlow2 framework achieved the highest F1 value of 0.67. In semantic segmentation models, the U-Net model achieved the highest detection result accuracy (AC) value of 98.37% under the PyTorch framework. The PSPNet model achieved the highest AC value of 97.86% under the TensorFlow2 framework. These experimental results provide optimal coupling efficiency parameters of a DLF and CNN for bridge crack detection. A more accurate and efficient DLF and CNN model for bridge crack detection has been obtained, which has significant practical application value.