
Journal articles on the topic 'Semantics - Data processing'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the top 50 journal articles for your research on the topic 'Semantics - Data processing.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse journal articles across a wide variety of disciplines and organise your bibliography correctly.

1

Thirunarayan, Krishnaprasad, and Amit Sheth. "Semantics-Empowered Big Data Processing with Applications." AI Magazine 36, no. 1 (March 25, 2015): 39–54. http://dx.doi.org/10.1609/aimag.v36i1.2566.

Abstract:
We discuss the nature of big data and address the role of semantics in analyzing and processing big data that arises in the context of physical-cyber-social systems. To handle volume, we advocate semantic perception that can convert low-level observational data to higher-level abstractions more suitable for decision-making. To handle variety, we resort to semantic models and annotations of data so that intelligent processing can be done independent of the heterogeneity of data formats and media. To handle velocity, we seek to use continuous semantics capability to dynamically create event- or situation-specific models and recognize relevant new concepts, entities and facts. To handle veracity, we explore trust models and approaches to glean trustworthiness. These four Vs of big data are harnessed by semantics-empowered analytics to derive value, supporting applications that transcend the physical-cyber-social continuum.
2

Jin, Min. "Semantics in XML Data Processing." Journal of the Korea Academia-Industrial cooperation Society 12, no. 3 (March 31, 2011): 1327–35. http://dx.doi.org/10.5762/kais.2011.12.3.1327.

3

Liu, Jun Qiang, and Xiao Ling Guan. "Composite Event Processing for Data Streams and Domain Knowledge." Advanced Materials Research 219-220 (March 2011): 927–31. http://dx.doi.org/10.4028/www.scientific.net/amr.219-220.927.

Abstract:
In recent years the processing of composite event queries over data streams has attracted a lot of research attention. Traditional database techniques were not designed for stream processing systems. Furthermore, continuous queries are often formulated in a declarative query language without precisely specified semantics. To overcome these deficiencies, this article presents the design, implementation, and evaluation of a system that processes data streams with semantic information. A set of optimization techniques is then proposed for query handling. Our approach thus not only makes it possible to express queries with a sound semantics, but also provides a solid foundation for query optimization. Experimental results show that our approach is effective and efficient for data streams and domain knowledge.
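To make the notion of a composite event query concrete, here is a minimal sketch, not the cited system: the event format, type names, and window length are all invented. It matches the pattern "an event of type A followed by an event of type B within a time window" over a timestamped stream.

```python
from collections import deque

def match_sequence(stream, first, second, window):
    """Yield (a, b) pairs where an event of type `second` follows an
    event of type `first` within `window` time units.
    Each event is a (timestamp, type, payload) tuple, ordered by time."""
    pending = deque()  # recent events of type `first`
    for ts, kind, payload in stream:
        # drop candidates that have fallen out of the time window
        while pending and ts - pending[0][0] > window:
            pending.popleft()
        if kind == first:
            pending.append((ts, kind, payload))
        elif kind == second:
            for a in list(pending):
                yield a, (ts, kind, payload)

events = [(1, "A", "x"), (3, "B", "y"), (9, "B", "z")]
print(list(match_sequence(events, "A", "B", window=5)))
# -> [((1, 'A', 'x'), (3, 'B', 'y'))]; the B at t=9 falls outside the window
```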
4

Xhafa, Fatos, and Leonard Barolli. "Semantics, intelligent processing and services for big data." Future Generation Computer Systems 37 (July 2014): 201–2. http://dx.doi.org/10.1016/j.future.2014.02.004.

5

Ramprasad, R., and C. Jayakumari. "A Novel Approach for Mining Big Data Using Multi-Model Fusion Mechanism (MMFM)." International Journal on Recent and Innovation Trends in Computing and Communication 11, no. 5s (June 9, 2023): 484–93. http://dx.doi.org/10.17762/ijritcc.v11i5s.7110.

Abstract:
Big data processing and analytics require sophisticated systems and cutting-edge methodologies to extract useful information from the available data. Visualizing the extracted data is challenging because the processing models depend on semantics and classification. To categorize and improve information-based semantics accumulated over time, this paper introduces the Multi-Model Fusion Mechanism for data mining (MMFM). Information dependencies are organized based on the links between the data models' attribute values. The method divides the attributes under consideration based on processing time, in order to handle complicated data within a controlled amount of time. The proposed MMFM's performance is assessed on a real-time weather prediction dataset in which the data are acquired from sensors (observations) and images. MMFM is used to conduct semantic analytics and similarity-based classification on this collection. Processing times for records and samples are investigated for various data sizes, instances, and entries. The proposed MMFM takes 70 seconds of processing time for 2 GB of data and 0.99 seconds to handle 5,000 records across various classification instances.
6

Nishimura, Susumu, and Atsushi Ohori. "Parallel functional programming on recursively defined data via data-parallel recursion." Journal of Functional Programming 9, no. 4 (July 1999): 427–62. http://dx.doi.org/10.1017/s0956796899003457.

Abstract:
This article proposes a new language mechanism for data-parallel processing of dynamically allocated, recursively defined data. Unlike conventional array-based data-parallelism, it allows parallel processing of general recursively defined data, such as lists or trees, in a functional way. This is achieved by representing a recursively defined datum as a system of equations, and by defining new language constructs for parallel transformation of a system of equations. By integrating them with a higher-order functional language, we obtain a functional programming language suitable for describing data-parallel algorithms on recursively defined data in a declarative way. The language has an ML-style polymorphic type system and a type-sound operational semantics that uniformly integrates the parallel evaluation mechanism with the semantics of a typed functional language. We also show the intended parallel execution model behind the formal semantics, assuming an idealized distributed-memory multicomputer.
7

Boyle, Mary. "Semantic Treatments for Word and Sentence Production Deficits in Aphasia." Seminars in Speech and Language 38, no. 01 (February 2017): 052–61. http://dx.doi.org/10.1055/s-0036-1597256.

Abstract:
The cognitive domains of language and memory are intrinsically connected and work together during language processing. This relationship is especially apparent in the area of semantics. Several disciplines have contributed to a rich store of data about semantic organization and processing, and several semantic treatments for aphasic word and sentence production impairments have been based on these data. This article reviews the relationships between semantics and memory as they relate to word and sentence production, describes the aphasic language impairments that result from deficits in these areas, and summarizes treatment approaches that capitalize on what we have learned about these domains and how they work together.
8

Sejdiu, Besmir, Florije Ismaili, and Lule Ahmedi. "Integration of Semantics Into Sensor Data for the IoT." International Journal on Semantic Web and Information Systems 16, no. 4 (October 2020): 1–25. http://dx.doi.org/10.4018/ijswis.2020100101.

Abstract:
The internet of things (IoT) as an evolving technology represents an active scientific research field, with research challenges associated with its application in various domains, ranging from consumer convenience, smart energy, and resource saving to IoT enterprises. Sensors are crucial components of the IoT that relay the collected data in the form of data streams for further processing. Interoperability of the various connected digital resources is a key challenge in IoT environments. Enriching raw sensor data with semantic annotations, using concept definitions from ontologies, enables a more expressive data representation that supports knowledge discovery. In this paper, a systematic review of the integration of semantics into sensor data for the IoT is provided. The review focuses on analyzing the main solutions for adding semantic annotations to sensor data, the standards that expose all types of sensor data via the web, existing models of stream data annotation, and the IoT trend domains that use semantics.
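As a small illustration of the kind of semantic annotation the review surveys, the sketch below attaches W3C SOSA/SSN vocabulary terms to a single sensor reading using rdflib; the `EX` namespace and all identifiers are invented for the example.

```python
from rdflib import Graph, Namespace, Literal
from rdflib.namespace import RDF, XSD

SOSA = Namespace("http://www.w3.org/ns/sosa/")
EX = Namespace("http://example.org/")  # invented namespace for the example

g = Graph()
g.bind("sosa", SOSA)

obs = EX["observation/42"]
g.add((obs, RDF.type, SOSA.Observation))
g.add((obs, SOSA.madeBySensor, EX["sensor/livingRoomThermometer"]))
g.add((obs, SOSA.observedProperty, EX["property/airTemperature"]))
g.add((obs, SOSA.hasSimpleResult, Literal(21.4, datatype=XSD.double)))
g.add((obs, SOSA.resultTime,
       Literal("2020-10-01T12:00:00Z", datatype=XSD.dateTime)))

print(g.serialize(format="turtle"))  # annotated reading, ready for linking
```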
9

Nguyen Mau Quoc, Hoan, Martin Serrano, Han Mau Nguyen, John G. Breslin, and Danh Le-Phuoc. "EAGLE—A Scalable Query Processing Engine for Linked Sensor Data." Sensors 19, no. 20 (October 9, 2019): 4362. http://dx.doi.org/10.3390/s19204362.

Abstract:
Recently, many approaches have been proposed to manage sensor data using semantic web technologies for effective heterogeneous data integration. However, our empirical observations revealed that these solutions primarily focused on semantic relationships and unfortunately paid less attention to spatio–temporal correlations. Most semantic approaches do not have spatio–temporal support. Some of them have attempted to provide full spatio–temporal support, but have poor performance for complex spatio–temporal aggregate queries. In addition, while the volume of sensor data is rapidly growing, the challenge of querying and managing the massive volumes of data generated by sensing devices still remains unsolved. In this article, we introduce EAGLE, a spatio–temporal query engine for querying sensor data based on the linked data model. The ultimate goal of EAGLE is to provide an elastic and scalable system which allows fast searching and analysis with respect to the relationships of space, time and semantics in sensor data. We also extend SPARQL with a set of new query operators in order to support spatio–temporal computing in the linked sensor data context.
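EAGLE's own operator extensions are not reproduced here, but a neutral sketch of a spatio-temporal query over linked sensor data, combining a temporal filter with a standard GeoSPARQL function, might look like the following. The endpoint URL and the prefixes on the example data are placeholders, and the polygon coordinates are deliberately elided.

```python
from SPARQLWrapper import SPARQLWrapper, JSON

endpoint = SPARQLWrapper("http://localhost:8890/sparql")  # placeholder URL
endpoint.setQuery("""
PREFIX sosa: <http://www.w3.org/ns/sosa/>
PREFIX geo:  <http://www.opengis.net/ont/geosparql#>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>
PREFIX xsd:  <http://www.w3.org/2001/XMLSchema#>

SELECT ?obs ?value WHERE {
  ?obs a sosa:Observation ;
       sosa:hasSimpleResult ?value ;
       sosa:resultTime ?t ;
       geo:hasGeometry/geo:asWKT ?wkt .
  FILTER (?t > "2019-01-01T00:00:00Z"^^xsd:dateTime)
  FILTER (geof:sfWithin(?wkt, "POLYGON((...))"^^geo:wktLiteral))
}
""")
endpoint.setReturnFormat(JSON)
results = endpoint.query().convert()  # observations in the area and period
```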
10

Kim, Hyeon Gyu. "Exploiting Window Query Semantics in Scalable Data Stream Processing." International Journal of Control and Automation 8, no. 11 (November 30, 2015): 13–20. http://dx.doi.org/10.14257/ijca.2015.8.11.02.

11

Hamaz, Kamal, and Fouzia Benchikha. "A novel method for providing relational databases with rich semantics and natural language processing." Journal of Enterprise Information Management 30, no. 3 (April 10, 2017): 503–25. http://dx.doi.org/10.1108/jeim-01-2015-0005.

Abstract:
Purpose: With the development of systems and applications, the number of users interacting with databases has increased considerably. The relational database model is still considered the most widely used model for data storage and manipulation. However, it does not offer any semantic support for the stored data that could facilitate data access for users. Indeed, a large number of users are intimidated when retrieving data because they are non-technical or have little technical knowledge. To overcome this problem, researchers are continuously developing new techniques for Natural Language Interfaces to Databases (NLIDB). Nowadays, the usage of existing NLIDBs is not widespread due to their deficiencies in understanding natural language (NL) queries. In this sense, the purpose of this paper is to propose a novel method for an intelligent understanding of NL queries using semantically enriched database sources. Design/methodology/approach: First, a reverse engineering process is applied to extract the relational database's hidden semantics. In the second step, the extracted semantics are enriched further using a domain ontology. After this, all semantics are stored in the same relational database. The NL query processing phase then uses the stored semantics to generate a semantic tree. Findings: The evaluation part of the work shows the advantages of using a semantically enriched database source to understand NL queries. Additionally, enriching a relational database has given more flexibility in understanding contextual and synonymous words that may be used in an NL query. Originality/value: Existing NLIDBs are not yet a standard option for interfacing a relational database due to their lack of understanding of NL queries. Indeed, the techniques used in the literature have their limits. This paper handles those limits by identifying NL elements by their semantic nature in order to generate a semantic tree, which is a key step towards an intelligent understanding of NL queries to relational databases.
12

Shastri, Shankarayya, Veeragangadhara Swamy Teligi Math, and Patil Nagaraja Siddalingappa. "Sensing complicated meanings from unstructured data: a novel hybrid approach." International Journal of Electrical and Computer Engineering (IJECE) 14, no. 1 (February 1, 2024): 711. http://dx.doi.org/10.11591/ijece.v14i1.pp711-720.

Abstract:
The majority of data on computers nowadays is in the form of unstructured data and unstructured text. The inherent ambiguity of natural language makes it incredibly difficult, but also highly profitable, to find hidden information or comprehend complex semantics in unstructured text. In this paper, we present a hybrid architecture combining natural language processing (NLP) and a convolutional neural network (CNN), called automated analysis of unstructured text using machine learning (AAUT-ML), for the detection of complex semantics in unstructured data, enabling different users to extract formal semantic knowledge from an unstructured text corpus. AAUT-ML has been evaluated on three datasets, covering data mining (DM), operating systems (OS), and databases (DB), and compared with existing models, i.e., YAKE, term frequency-inverse document frequency (TF-IDF), and Text-R. The results show better outcomes in terms of precision, recall, and macro-averaged F1-score. This work presents a novel method for identifying complex semantics in unstructured data.
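Of the baselines mentioned, TF-IDF is compact enough to show in a few lines. The sketch below is only a generic keyword-extraction baseline using scikit-learn, not the AAUT-ML system; the two toy documents are invented.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

docs = [
    "Unstructured text hides complex semantics behind ambiguous language.",
    "Neural networks can learn features from raw unstructured data.",
]

vec = TfidfVectorizer(stop_words="english")
weights = vec.fit_transform(docs).toarray()
terms = vec.get_feature_names_out()

# top-3 weighted terms per document as candidate keywords
for row in weights:
    top = sorted(zip(terms, row), key=lambda p: p[1], reverse=True)[:3]
    print([term for term, w in top if w > 0])
```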
13

Mukanova, Assel, Marek Milosz, Assem Dauletkaliyeva, Aizhan Nazyrova, Gaziza Yelibayeva, Dmitrii Kuzin, and Lazzat Kussepova. "LLM-Powered Natural Language Text Processing for Ontology Enrichment." Applied Sciences 14, no. 13 (July 4, 2024): 5860. http://dx.doi.org/10.3390/app14135860.

Abstract:
This paper describes a method and technology for processing natural language texts and extracting data from the text that correspond to the semantics of an ontological model. The proposed method is distinguished by the use of a Large Language Model algorithm for text analysis. The extracted data are stored in an intermediate format, after which individuals and properties that reflect the specified semantics are programmatically created in the ontology. The proposed technology is implemented using the example of an ontological model that describes the geographical configuration and administrative–territorial division of Kazakhstan. The proposed method and technology can be applied in any subject areas for which ontological models have been developed. The results of the study can significantly improve the efficiency of using knowledge bases based on semantic networks by converting texts in natural languages into semantically linked data.
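A minimal sketch of the final step, programmatically creating individuals and properties from the intermediate format, might look like the following with rdflib. The JSON layout and the `EX` namespace are hypothetical; the paper's actual intermediate schema and ontology are not shown here.

```python
import json
from rdflib import Graph, Namespace, Literal
from rdflib.namespace import RDF, RDFS

# Hypothetical intermediate format produced by the LLM extraction step.
llm_output = json.loads("""
[{"name": "Kokshetau", "type": "City", "locatedIn": "AkmolaRegion"}]
""")

EX = Namespace("http://example.org/geo#")  # invented ontology namespace
g = Graph()
for item in llm_output:
    individual = EX[item["name"]]
    g.add((individual, RDF.type, EX[item["type"]]))            # individual
    g.add((individual, EX.locatedIn, EX[item["locatedIn"]]))   # object property
    g.add((individual, RDFS.label, Literal(item["name"])))

print(g.serialize(format="turtle"))
```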
14

Novikov, Boris, Alice Pigul, and Anna Yarygina. "A Performance Analysis of Semantic Caching for XML Query Processing." International Journal of Knowledge-Based Organizations 3, no. 4 (October 2013): 40–60. http://dx.doi.org/10.4018/ijkbo.2013100103.

Abstract:
Caching is important for any system attempting to achieve high performance. Semantic caching is an approach that tries to benefit from knowledge of data semantics. The authors expect that this information may enable the reuse of data that is semantically close to, rather than exactly equal to, the data cached in a traditional system. However, the major obstacle to extensive application of semantic caching for any data model or query language is the computational complexity of the query containment problem, which is, in general, undecidable. In this article the authors introduce and compare three approximate conservative query matching algorithms for semantic caching of semi-structured queries. The authors then analyze their applicability to distributed query processing. Based on this analysis, the authors outline a few scenarios where semantic caching can be beneficial for query processing in a distributed system of heterogeneous semi-structured information resources.
15

Dell'Aglio, Daniele, Emanuele Della Valle, Jean-Paul Calbimonte, and Oscar Corcho. "RSP-QL Semantics." International Journal on Semantic Web and Information Systems 10, no. 4 (October 2014): 17–44. http://dx.doi.org/10.4018/ijswis.2014100102.

Abstract:
RDF and SPARQL are established standards for data interchange and querying on the Web. While they have been shown to be useful and applicable in many scenarios, they are not sufficiently adequate for dealing with streams of data and their intrinsically continuous nature. In recent years, data models and query languages have been proposed to extend both RDF and SPARQL for streams and continuous processing, under the name of RDF Stream Processing (RSP). These efforts resulted in several models and implementations that, at first glance, appear to propose alternative syntaxes but equivalent semantics. However, when asked to continuously answer the same queries on the same data streams, they provide different answers at disparate moments due to the heterogeneity of their operational semantics. These discrepancies render the process of understanding and comparing continuous query results complex and misleading. In this work, the authors propose RSP-QL, a comprehensive model that formally defines the semantics of an RSP system. RSP-QL makes explicit the hidden assumptions of currently available RSP systems, allows defining a formal notion of correctness for RSP query results and, thus, explains why available implementations provide different answers at disparate moments.
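One family of "hidden assumptions" that RSP-QL pins down is how time-based windows open and close. The generic Python sketch below is not RSP-QL's formal definition; it only illustrates how window width, slide, and starting time determine which stream elements each window contains, and hence why systems fixing these parameters differently return different answers.

```python
def time_windows(stream, width, slide, t0=0):
    """Group a timestamped stream into (possibly overlapping) windows
    [o, o + width) opened every `slide` time units, starting at t0.
    `stream` is an iterable of (timestamp, item) pairs ordered by time."""
    items = list(stream)
    t_max = max(ts for ts, _ in items)
    opening = t0
    while opening <= t_max:
        content = [item for ts, item in items if opening <= ts < opening + width]
        yield opening, opening + width, content
        opening += slide

stream = [(1, "t1"), (2, "t2"), (4, "t3"), (7, "t4")]
for lo, hi, content in time_windows(stream, width=4, slide=2):
    print(f"[{lo},{hi}) -> {content}")
# a width-4 window sliding by 2 reports t3 in two windows; other choices
# of (width, slide, t0) partition or sample the same stream differently
```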
16

Azabou, Maha, Ameen Banjar, and Jamel Omar Feki. "Enhancing the Diamond Document Warehouse Model." International Journal of Data Warehousing and Mining 16, no. 4 (October 2020): 1–25. http://dx.doi.org/10.4018/ijdwm.2020100101.

Abstract:
The data warehouse community has paid particular attention to the document warehouse (DocW) paradigm during the last two decades. However, some important issues related to semantics are still pending and therefore need deep research investigation. Indeed, the semantic exploitation of the DocW is not yet mature despite representing a main concern for decision-makers. This paper aims to enhance the multidimensional model called the Diamond Document Warehouse Model with semantic aspects; in particular, it suggests semantic OLAP (on-line analytical processing) operators for querying the DocW.
17

Teuscher, Balthasar, and Martin Werner. "Random Data Distribution for Efficient Parallel Point Cloud Processing." AGILE: GIScience Series 5 (May 30, 2024): 1–10. http://dx.doi.org/10.5194/agile-giss-5-15-2024.

Abstract:
Current point cloud data management systems and formats are heavily specialized, targeted solely towards visualization purposes, and fail to address the diverse needs of progressive point cloud workflows such as semantic segmentation using machine learning. We therefore propose a distributed data infrastructure for dynamic point cloud data management that can support interactive real-time visualization at scale while simultaneously serving as a platform for analytical tasks. By introducing random data distribution, we show that simple query fragmentation and efficient, effective parallelism at scale are possible. At the same time, arbitrary queries in space and time can be run efficiently over the infrastructure, including query semantics that return only a random sample of the query results, or preferred points based on an importance dimension calculated, for example, from local point density information, as is commonly done in point cloud visualization. To cope with an unknown number of user-specific attributes, and to support multiple ways of deciding the importance of a given point (ground-point removal, coverage of space, random subset), the system supports all of them transparently as multidimensional range queries backed by spatial indices.
18

Zhang, Jing, and Xiaoyan Liang. "ADLBiLSTM: A Semantic Generation Algorithm for Multi-Grammar Network Access Control Policies." Applied Sciences 14, no. 11 (May 25, 2024): 4555. http://dx.doi.org/10.3390/app14114555.

Abstract:
Semantic generation of network access control policies can help network administrators accurately implement policies to achieve desired security objectives. Current semantic generation research mainly focuses on single-grammar policies and lacks work on automatically generating semantics for policies in different grammars, a task that is tedious, inefficient, and non-scalable when done by hand. Inspired by sequence labeling in the field of natural language processing, this article models automatic semantic generation as a sequence labeling task. We propose a semantic generation algorithm named ADLBiLSTM. The algorithm uses a self-attention mechanism and a double-layer BiLSTM to extract the features of security policies from different aspects, so that it can flexibly adapt to policies of different complexity without frequent modification. Experimental results showed that the algorithm performs well, achieving high accuracy in semantic generation for access control list (ACL) and firewall data and accurately understanding and generating the semantics of network access control policies.
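To make the sequence-labeling framing concrete, here is a minimal PyTorch skeleton of a double-layer bidirectional LSTM tagger. It is a generic sketch, not ADLBiLSTM itself: the paper's self-attention component is omitted and all dimensions are invented.

```python
import torch
import torch.nn as nn

class BiLSTMTagger(nn.Module):
    """Double-layer bidirectional LSTM emitting one label per token;
    the paper's self-attention component is omitted for brevity."""
    def __init__(self, vocab_size, num_tags, emb_dim=64, hidden=128):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden, num_layers=2,
                            bidirectional=True, batch_first=True)
        self.out = nn.Linear(2 * hidden, num_tags)

    def forward(self, tokens):              # tokens: (batch, seq_len)
        h, _ = self.lstm(self.emb(tokens))  # (batch, seq_len, 2 * hidden)
        return self.out(h)                  # per-token tag scores

model = BiLSTMTagger(vocab_size=1000, num_tags=12)
scores = model(torch.randint(0, 1000, (2, 20)))  # toy batch of policy tokens
print(scores.shape)  # torch.Size([2, 20, 12])
```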
19

Cimminella, Francesco, Sergio Della Sala, and Moreno I. Coco. "Extra-foveal Processing of Object Semantics Guides Early Overt Attention During Visual Search." Attention, Perception, & Psychophysics 82, no. 2 (December 2, 2019): 655–70. http://dx.doi.org/10.3758/s13414-019-01906-1.

Abstract:
Eye-tracking studies using arrays of objects have demonstrated that some high-level processing of object semantics can occur in extra-foveal vision, but its role on the allocation of early overt attention is still unclear. This eye-tracking visual search study contributes novel findings by examining the role of object-to-object semantic relatedness and visual saliency on search responses and eye-movement behaviour across arrays of increasing size (3, 5, 7). Our data show that a critical object was looked at earlier and for longer when it was semantically unrelated than related to the other objects in the display, both when it was the search target (target-present trials) and when it was a target’s semantically related competitor (target-absent trials). Semantic relatedness effects manifested already during the very first fixation after array onset, were consistently found for increasing set sizes, and were independent of low-level visual saliency, which did not play any role. We conclude that object semantics can be extracted early in extra-foveal vision and capture overt attention from the very first fixation. These findings pose a challenge to models of visual attention which assume that overt attention is guided by the visual appearance of stimuli, rather than by their semantics.
20

Krachunov, Milko, Ognyan Kulev, Valeriya Simeonova, Maria Nisheva, and Dimitar Vassilev. "Manageable Workflows for Processing Parallel Sequencing Data." Serdica Journal of Computing 8, no. 1 (February 2, 2015): 1–14. http://dx.doi.org/10.55630/sjc.2014.8.1-14.

Abstract:
Data analysis after parallel sequencing is a process that uses combinations of software tools and is often subject to experimentation and on-the-fly substitution, with the necessary file conversions. This article presents a developing system for creating and managing workflows that aid the tasks one encounters after parallel sequencing, particularly in the area of metagenomics. The semantics, description language and software implementation aim to allow the creation of flexible, configurable workflows that are suitable for sharing and are easy to manipulate through software or by hand. The execution system design provides user-defined operations and interchangeability between an operation and a workflow. This allows significant extensibility, which can be further complemented with distributed computing and remote management interfaces.
21

Hausser, Roland. "Database Semantics." Cadernos de Linguística 2, no. 1 (December 14, 2021): e382. http://dx.doi.org/10.25189/2675-4916.2021.v2.n1.id382.

Abstract:
For long-term upscaling, the computational reconstruction of a complex natural mechanism must be input-output equivalent to the prototype, i.e. the reconstruction must take the same input and produce the same output in the same processing order as the original. Accordingly, the modeling of natural language communication in Database Semantics (DBS) uses a time-linear derivation order for the speaker’s output and the hearer’s input. The language-dependent surfaces serving as the vehicle of content transfer from speaker to hearer are raw data without meaning or any grammatical properties whatsoever, but measurable by natural science.
22

Fisher, Ingrid E., and Robert A. Nehmer. "Using Language Processing to Evaluate the Equivalency of the FASB and IASB Standards." Journal of Emerging Technologies in Accounting 13, no. 2 (September 1, 2016): 129–44. http://dx.doi.org/10.2308/jeta-51621.

Abstract:
The passage of the Data Transparency and Accountability Act in the United States Congress will necessitate that government agencies provide more data in transparent formats. The issue of how to interpret such data remains an open question. The accounting profession has continued to struggle with common formats since the inception of balance sheets and income statements. The original FASB Conceptual Framework was developed to help construct consistent GAAP standards. XBRL was developed to provide a consistent representation of the data contained in financial statements and other financial documents. This research explores the use of two codifications (U.S. GAAP and IFRS) of GAAP standards in both their syntactic representation through XBRL taxonomies and their semantics through their authoritative references back to their own standards and codification. The research uses language theory to model the codifications in terms of the strings used to represent lexical content in the financial statements and to provide a systematic mapping to the semantics of the related XBRL specifications. The immediate objectives of this research are to provide a means to compare the semantic richness of U.S. GAAP and IFRS and to determine the consistency of either standardization with respect to the emerging shared Conceptual Framework. Ultimately, to the extent that the system is able to model both the syntax and the semantics of the financial statements, it could provide a baseline on which to consider assurance over parts of the financial statements, rather than over the financial statements taken as a whole.
23

Tutzauer, P., and N. Haala. "PROCESSING OF CRAWLED URBAN IMAGERY FOR BUILDING USE CLASSIFICATION." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLII-1/W1 (May 31, 2017): 143–49. http://dx.doi.org/10.5194/isprs-archives-xlii-1-w1-143-2017.

Abstract:
Recent years have shown a shift from purely geometric 3D city models to data with semantics. This is induced by new applications (e.g., Virtual/Augmented Reality) and by the requirements of concepts like Smart Cities. However, essential urban semantic data, such as building use categories, is often not available. We present a first step in bridging this gap by proposing a pipeline that uses crawled urban imagery and links it with ground-truth cadastral data as input for automatic building use classification. We aim to extract this city-relevant semantic information automatically from Street View (SV) imagery. Convolutional Neural Networks (CNNs) have proved extremely successful for image interpretation but require a huge amount of training data. The main contribution of the paper is the automatic provision of such training datasets by linking semantic information, as already available from databases provided by national mapping agencies or city administrations, to the corresponding façade images extracted from SV. Finally, we present first investigations with a CNN and an alternative classifier as a proof of concept.
24

Villa, Ferdinando, Stefano Balbi, Ioannis N. Athanasiadis, and Caterina Caracciolo. "Semantics for interoperability of distributed data and models: Foundations for better-connected information." F1000Research 6 (May 17, 2017): 686. http://dx.doi.org/10.12688/f1000research.11638.1.

Abstract:
Correct and reliable linkage of independently produced information is a requirement to enable sophisticated applications and processing workflows. These can ultimately help address the challenges posed by complex systems (such as socio-ecological systems), whose many components can only be described through independently developed data and model products. We discuss the first outcomes of an investigation into the conceptual and methodological aspects of semantic annotation of data and models, aimed to enable a high standard of interoperability of information. The results, operationalized in the context of a long-term, active, large-scale project on ecosystem services assessment, include: a definition of interoperability based on semantics and scale; a conceptual foundation for the phenomenology underlying scientific observations, aimed to guide the practice of semantic annotation in domain communities; and a dedicated language and software infrastructure that operationalizes the findings and allows practitioners to reap the benefits of data and model interoperability. The work presented is the first detailed description of almost a decade of work with communities active in socio-ecological system modeling. After defining the boundaries of possible interoperability based on the understanding of scale, we discuss examples of the practical use of the findings to obtain consistent, interoperable and machine-ready semantic specifications that can integrate semantics across diverse domains and disciplines.
25

Beck, Edgar, Carsten Bockelmann, and Armin Dekorsy. "Semantic Information Recovery in Wireless Networks." Sensors 23, no. 14 (July 12, 2023): 6347. http://dx.doi.org/10.3390/s23146347.

Abstract:
Motivated by the recent success of Machine Learning (ML) tools in wireless communications, the idea of semantic communication by Weaver from 1949 has gained attention. It breaks with Shannon’s classic design paradigm by aiming to transmit the meaning of a message, i.e., semantics, rather than its exact version and, thus, enables savings in information rate. In this work, we extend the fundamental approach from Basu et al. for modeling semantics to the complete communications Markov chain. Thus, we model semantics by means of hidden random variables and define the semantic communication task as the data-reduced and reliable transmission of messages over a communication channel such that semantics is best preserved. We consider this task as an end-to-end Information Bottleneck problem, enabling compression while preserving relevant information. As a solution approach, we propose the ML-based semantic communication system SINFONY and use it for a distributed multipoint scenario; SINFONY communicates the meaning behind multiple messages that are observed at different senders to a single receiver for semantic recovery. We analyze SINFONY by processing images as message examples. Numerical results reveal a tremendous rate-normalized SNR shift up to 20 dB compared to classically designed communication systems.
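The Information Bottleneck framing can be stated compactly. The formula below is the standard IB objective (after Tishby et al.), not the paper's exact loss: X is the observed message, Z the compressed transmitted representation, Y the hidden semantic variable, and β trades compression against preserved semantic relevance.

```latex
\min_{p(z \mid x)} \; I(X;Z) \;-\; \beta \, I(Z;Y)
```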
26

Fan, Yunqian, Xiuying Wei, Ruihao Gong, Yuqing Ma, Xiangguo Zhang, Qi Zhang, and Xianglong Liu. "Selective Focus: Investigating Semantics Sensitivity in Post-training Quantization for Lane Detection." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 11 (March 24, 2024): 11936–43. http://dx.doi.org/10.1609/aaai.v38i11.29080.

Abstract:
Lane detection (LD) plays a crucial role in enhancing the L2+ capabilities of autonomous driving and has captured widespread attention. Post-Training Quantization (PTQ) can facilitate the practical application of LD models, enabling fast inference with limited memory and no labeled data. However, prior PTQ methods do not consider the complex LD outputs that contain physical semantics, such as offsets, locations, etc., and thus cannot be directly applied to LD models. In this paper, we investigate, for the first time, semantic sensitivity to post-processing for lane detection with a novel Lane Distortion Score. Moreover, we identify two main factors impacting LD performance after quantization, namely intra-head sensitivity and inter-head sensitivity, where a small quantization error in specific semantics can cause significant lane distortion. Thus, we propose a Selective Focus framework deploying Semantic Guided Focus and Sensitivity Aware Selection modules to incorporate post-processing information into PTQ reconstruction. Based on the observed intra-head sensitivity, Semantic Guided Focus is introduced to prioritize foreground-related semantics using a practical proxy. For inter-head sensitivity, we present Sensitivity Aware Selection, efficiently recognizing influential prediction heads and refining the optimization objectives at runtime. Extensive experiments have been done on a wide variety of models, including keypoint-, anchor-, curve-, and segmentation-based ones. Our method produces quantized models in minutes on a single GPU and can achieve a 6.4% F1 Score improvement on the CULane dataset. Code and a supplementary statement can be found at https://github.com/PannenetsF/SelectiveFocus.
27

Terziyan, Vagan, and Anton Nikulin. "Semantics of Voids within Data: Ignorance-Aware Machine Learning." ISPRS International Journal of Geo-Information 10, no. 4 (April 8, 2021): 246. http://dx.doi.org/10.3390/ijgi10040246.

Abstract:
Operating with ignorance is an important concern of geographical information science when the objective is to discover knowledge from the imperfect spatial data. Data mining (driven by knowledge discovery tools) is about processing available (observed, known, and understood) samples of data aiming to build a model (e.g., a classifier) to handle data samples that are not yet observed, known, or understood. These tools traditionally take semantically labeled samples of the available data (known facts) as an input for learning. We want to challenge the indispensability of this approach, and we suggest considering the things the other way around. What if the task would be as follows: how to build a model based on the semantics of our ignorance, i.e., by processing the shape of “voids” within the available data space? Can we improve traditional classification by also modeling the ignorance? In this paper, we provide some algorithms for the discovery and visualization of the ignorance zones in two-dimensional data spaces and design two ignorance-aware smart prototype selection techniques (incremental and adversarial) to improve the performance of the nearest neighbor classifiers. We present experiments with artificial and real datasets to test the concept of the usefulness of ignorance semantics discovery.
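As a toy rendering of the "shape of voids" idea, the sketch below marks empty cells of a coarse grid over two-dimensional data as ignorance zones. It is an invented simplification, not the paper's discovery or prototype-selection algorithms, and it assumes features scaled to the unit square.

```python
import numpy as np

def ignorance_zones(points, grid=10):
    """Mark cells of a grid over [0, 1)^2 that contain no samples as
    'voids', a crude stand-in for ignorance-zone discovery."""
    occupied = np.zeros((grid, grid), dtype=bool)
    idx = np.clip((points * grid).astype(int), 0, grid - 1)
    occupied[idx[:, 0], idx[:, 1]] = True
    return ~occupied  # True where nothing has been observed

rng = np.random.default_rng(0)
data = rng.random((200, 2))  # 200 samples in the unit square
voids = ignorance_zones(data)
print(f"{voids.sum()} empty cells out of {voids.size}")
```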
28

Poux, F., R. Neuville, P. Hallot, and R. Billen. "MODEL FOR SEMANTICALLY RICH POINT CLOUD DATA." ISPRS Annals of Photogrammetry, Remote Sensing and Spatial Information Sciences IV-4/W5 (October 23, 2017): 107–15. http://dx.doi.org/10.5194/isprs-annals-iv-4-w5-107-2017.

Abstract:
This paper proposes an interoperable model for managing high-dimensional point clouds while integrating semantics. Point clouds from sensors are a direct source of information physically describing the 3D state of the recorded environment. As such, they are an exhaustive representation of the real world at every scale: 3D reality-based spatial data. Their generation is increasingly fast, but processing routines and data models lack the knowledge to reason from information extraction rather than interpretation. The enhanced Smart Point Cloud model developed here brings intelligence to point clouds via three connected meta-models, while linking available knowledge and classification procedures to permit semantic injection. Interoperability drives the model's adaptation to potentially many applications through specialized domain ontologies. A first prototype, implemented in Python on a PostgreSQL database, allows semantic and spatial concepts to be combined for basic hybrid queries on different point clouds.
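Since the prototype is described as Python over a PostgreSQL database, a hybrid semantic-plus-spatial query might be sketched as follows. The table schema, column names, class labels, and connection settings are all invented, and PostGIS is assumed for the spatial predicate; the paper's actual schema is not reproduced.

```python
import psycopg2

# Invented schema: points(id, geom geometry(PointZ), class_label text)
conn = psycopg2.connect("dbname=pointclouds user=postgres")  # placeholder DSN
cur = conn.cursor()
cur.execute("""
    SELECT id
    FROM points
    WHERE class_label = %s                        -- semantic predicate
      AND ST_3DDWithin(geom,                      -- spatial predicate (PostGIS)
                       ST_MakePoint(%s, %s, %s), %s)
""", ("window", 1.0, 2.0, 0.0, 0.5))
print(cur.fetchall())  # ids of 'window' points within 0.5 m of the probe
```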
29

Chernova, Daria A. "Phonological and graphic representations of words in mental lexicon: Homophone processing while reading." Vestnik of Saint Petersburg University. Language and Literature 19, no. 1 (2022): 181–94. http://dx.doi.org/10.21638/spbu09.2022.110.

Abstract:
This article presents an experimental study of the role of the phonological representation of a word in lexical access during silent reading in Russian. The role of the phonological component in reading (i.e., whether semantics can be accessed via phonological decoding or directly from the orthographic image of the word) is actively discussed in modern psycholinguistics. Homophones can serve as a testing ground for these hypotheses: if graphemes are decoded into phonemes in silent reading in order to access semantics, then homophones will be processed like homonyms, but if semantics are accessed directly from the visual representation of the word, then homophones can be treated like all other orthographic neighbors. We address Russian homophones in order to investigate this question. In a self-paced reading experiment, we show that if a target word is substituted either by a homophone or by a spelling control (an orthographic neighbor), semantic incongruence slows down the processing of the post-target region. We show that both homophones and spelling controls cause this processing load, and homophony does not facilitate the processing of a semantically incongruent word. Our data give evidence for direct visual access to entries in the mental lexicon, as the dual-route model predicts for experienced readers.
30

Mari, Daniele, Elena Camuffo, and Simone Milani. "CACTUS: Content-Aware Compression and Transmission Using Semantics for Automotive LiDAR Data." Sensors 23, no. 12 (June 15, 2023): 5611. http://dx.doi.org/10.3390/s23125611.

Abstract:
Many recent cloud or edge computing strategies for automotive applications require transmitting huge amounts of Light Detection and Ranging (LiDAR) data from terminals to centralized processing units. The development of effective Point Cloud (PC) compression strategies that preserve semantic information, which is critical for scene understanding, thus proves crucial. Segmentation and compression have always been treated as two independent tasks; however, since not all semantic classes are equally important for the end task, this information can be used to guide data transmission. In this paper, we propose Content-Aware Compression and Transmission Using Semantics (CACTUS), a coding framework that exploits semantic information to optimize data transmission by partitioning the original point set into separate data streams. Experimental results show that, differently from traditional strategies, the independent coding of semantically consistent point sets preserves class information. Additionally, whenever semantic information needs to be transmitted to the receiver, the CACTUS strategy leads to gains in compression efficiency and, more generally, improves the speed and flexibility of the baseline codec used to compress the data.
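The core idea of partitioning the original point set into per-class streams can be sketched in a few lines. This illustrates the general idea only, not the CACTUS codec; the class labels and data are invented.

```python
import numpy as np

def partition_by_class(points, labels):
    """Split a labelled point cloud into one stream per semantic class,
    so each stream can be coded and prioritized independently."""
    return {c: points[labels == c] for c in np.unique(labels)}

points = np.random.rand(1000, 3)  # x, y, z coordinates
labels = np.random.choice(["road", "car", "vegetation"], size=1000)
for cls, pts in partition_by_class(points, labels).items():
    print(cls, pts.shape)  # e.g. send 'car' points at higher fidelity
```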
31

Simov, Kiril, and Petya Osenova. "Special Thematic Section on Semantic Models for Natural Language Processing (Preface)." Cybernetics and Information Technologies 18, no. 1 (March 1, 2018): 93–94. http://dx.doi.org/10.2478/cait-2018-0008.

Abstract:
With the availability of large language data online, cross-linked lexical resources (such as BabelNet, Predicate Matrix and UBY) and semantically annotated corpora (SemCor, OntoNotes, etc.), more and more applications in Natural Language Processing (NLP) have started to exploit various semantic models. The semantic models have been created on the basis of LSA, clustering, word embeddings, deep learning, neural networks, etc., and abstract logical forms, such as Minimal Recursion Semantics (MRS) or Abstract Meaning Representation (AMR). Additionally, the Linguistic Linked Open Data Cloud (LLOD Cloud) has been initiated, which interlinks linguistic data to improve NLP tasks. This cloud has been expanding enormously over the last four to five years. It includes corpora, lexicons, thesauri, and knowledge bases of various kinds, organized around appropriate ontologies, such as LEMON. The semantic models behind the data organization, as well as the representation of the semantic resources themselves, are a challenge to the NLP community. The NLP applications that extensively rely on the above-discussed models include Machine Translation, Information Extraction, Question Answering, Text Simplification, etc.
32

Amenta, Simona, Davide Crepaldi, and Marco Marelli. "Consistency measures individuate dissociating semantic modulations in priming paradigms: A new look on semantics in the processing of (complex) words." Quarterly Journal of Experimental Psychology 73, no. 10 (June 15, 2020): 1546–63. http://dx.doi.org/10.1177/1747021820927663.

Abstract:
In human language the mapping between form and meaning is arbitrary, as there is no direct connection between words and the objects that they represent. However, within a given language, it is possible to recognise systematic associations that support productivity and comprehension. In this work, we focus on the consistency between orthographic forms and meaning, and we investigate how the cognitive system may exploit it to process words. We take morphology as our case study, since it arguably represents one of the most notable examples of systematicity in form–meaning mapping. In a series of three experiments, we investigate the impact of form–meaning mapping in word processing by testing new consistency metrics as predictors of priming magnitude in primed lexical decision. In Experiment 1, we re-analyse data from five masked morphological priming studies and show that orthography–semantics–consistency explains independent variance in priming magnitude, suggesting that word semantics is accessed already at early stages of word processing and that crucially semantic access is constrained by word orthography. In Experiments 2 and 3, we investigate whether this pattern is replicated when looking at semantic priming. In Experiment 2, we show that orthography–semantics–consistency is not a viable predictor of priming magnitude with longer stimulus onset asynchrony (SOA). However, in Experiment 3, we develop a new semantic consistency measure based on the semantic density of target neighbourhoods. This measure is shown to significantly predict independent variance in semantic priming effect. Overall, our results indicate that consistency measures provide crucial information for the understanding of word processing. Specifically, the dissociation between measures and priming paradigms shows that different priming conditions are associated with the activation of different semantic cohorts.
33

Sampaio, Thiago Oliveira da Motta, and Aniela Improta França. "Event-duration semantics in online sentence processing." Letras de Hoje 53, no. 1 (June 5, 2018): 59. http://dx.doi.org/10.15448/1984-7726.2018.1.28695.

Abstract:
Several experiments in psycholinguistics have found evidence of iterative coercion, an effect related to the reanalysis of punctual events used in durative contexts that triggers an iterative meaning. We argue that this effect is not related to aspectual features, and that event-duration semantics is accessed during sentence processing. We ran a self-paced reading experiment in Brazilian Portuguese whose sentences contain events with an average duration of a few minutes. These sentences were inserted in durative contexts that formed the experiment's conditions following a Latin square design: a control condition (minutes), subtractive (seconds), iterative (hours) and habitual (days). Higher reading times were measured at the critical segments of all experimental conditions except the habitual context. The results corroborate our hypothesis while challenging the psychological reality of habitual coercion. To better observe the habitual coercion condition, we present a reanalysis of the data of Sampaio et al. (2014), which confirms the results of our test.
34

Nikitina, V. V., and A. M. Ivanova. "USING BIG DATA IN SEMANTIC RESEARCH: PERSPECTIVES AND APPROACHES." ВЕСТНИК ВОРОНЕЖСКОГО ГОСУДАРСТВЕННОГО ТЕХНИЧЕСКОГО УНИВЕРСИТЕТА, no. 3(42) (December 24, 2023): 25–34. http://dx.doi.org/10.36622/mlmdr.2023.95.56.003.

Abstract:
Statement of the problem. Digitization and technology are rapidly taking over various fields of humanitarian sciences, including cognitive linguistics, which is developing new algorithms and methods of language research, with linguists being encouraged to acquire new skills in data collection and processing as well as to expand their sources of language materials. In cognitive semantics and discourse studies, the biggest game-changer has been the ever-increasing (to the point of apparent limitlessness) volumes of various language data from digitalized texts available on the Internet that effectively accumulate in and function as Big Data: data sets that are too numerous, large and complex to be processed by traditional, ‘by-hand’ methods. Extensive language data requires specific mining and analysis tools and techniques and imposes certain restrictions on its processing and interpretation. In this paper, the authors discuss approaches to the processing (‘intellectualization’) of empirical linguistic data coming from various online sources via search engines and their overall reliability for cognitive semantics research. Results. The main argument is that to ensure the validity of the digital language data under study and credible study results, researchers should employ several data collection methods and tools, i.e., opt for methodological triangulation. The authors describe their methodology on a practical example and conclude that big data systems such as indexed digital texts can effectively replace native speakers as a source of credible information on natural language semantics, as well as improve the overall quality and verifiability of semantics studies. Conclusion. Considering the national segment of the Internet in a certain language (English, Russian, etc.) as a natural text corpus of high representativeness, the researcher can use it as a tool for testing various linguistic hypotheses. At the same time, it is necessary to reflect on the intrinsic qualities of big data, such as volume, velocity (swiftness of change), variability, lack of structure, value, and the presence of “noise”, as well as to carefully consider, design and test procedures for using search engines to collect big data in order to ensure the quality and verifiability of the results obtained.
35

Lin, Yi, Hongwei Ding, and Yang Zhang. "Prosody Dominates Over Semantics in Emotion Word Processing: Evidence From Cross-Channel and Cross-Modal Stroop Effects." Journal of Speech, Language, and Hearing Research 63, no. 3 (March 23, 2020): 896–912. http://dx.doi.org/10.1044/2020_jslhr-19-00258.

Abstract:
Purpose: Emotional speech communication involves multisensory integration of linguistic (e.g., semantic content) and paralinguistic (e.g., prosody and facial expressions) messages. Previous studies on linguistic versus paralinguistic salience effects in emotional speech processing have produced inconsistent findings. In this study, we investigated the relative perceptual saliency of emotion cues in a cross-channel auditory-alone task (i.e., a semantics–prosody Stroop task) and a cross-modal audiovisual task (i.e., a semantics–prosody–face Stroop task). Method: Thirty normal Chinese adults participated in two Stroop experiments with spoken emotion adjectives in Mandarin Chinese. Experiment 1 manipulated auditory pairing of emotional prosody (happy or sad) and lexical semantic content in congruent and incongruent conditions. Experiment 2 extended the protocol to cross-modal integration by introducing a visual facial expression during auditory stimulus presentation. Participants were asked to judge emotional information for each test trial according to the instruction of selective attention. Results: Accuracy and reaction time data indicated that, despite an increase in cognitive demand and task complexity in Experiment 2, prosody was consistently more salient than semantic content for emotion word processing and did not take precedence over facial expression. While congruent stimuli enhanced performance in both experiments, the facilitatory effect was smaller in Experiment 2. Conclusion: Together, the results demonstrate the salient role of paralinguistic prosodic cues in emotion word processing and the congruence facilitation effect in multisensory integration. Our study contributes tonal language data on how linguistic and paralinguistic messages converge in multisensory speech processing and lays a foundation for further exploring the brain mechanisms of cross-channel/modal emotion integration with potential clinical applications.
36

El Guemhioui, Karim, and Steven A. Demurjian. "Semantic Reconciliation of Electronic Health Records Using Semantic Web Technologies." International Journal of Information Technology and Web Engineering 12, no. 2 (April 2017): 26–48. http://dx.doi.org/10.4018/ijitwe.2017040102.

Abstract:
In this paper, the authors present an approach to reconcile the semantics of distinct medical terms found in personal health records (PHRs, which store data controlled by patients) and electronic medical records (EMRs, which store data controlled by providers) that are utilized to describe the same concept in different systems. The authors present a solution for semantic reconciliation based on RDF and related semantic web technologies. As part of the solution, the authors utilize a centralized repository of ontologies to uniformly interrogate the medical coding systems in which those terms are defined, extract all of their published synonyms, and save the results as RDF triples. The final step in the process is to employ a reasoner to infer non-explicit synonymy among those terms, hence exposing the underlying semantics to the PHR and EMR systems for possible further processing.
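A toy version of the "store synonyms as RDF triples, then infer non-explicit synonymy" pipeline might look like the following; the medical terms, the use of skos:exactMatch, and the hand-rolled closure loop are illustrative stand-ins for the paper's ontology repository and reasoner.

```python
from rdflib import Graph, Namespace
from rdflib.namespace import SKOS

EX = Namespace("http://example.org/terms#")  # invented PHR/EMR vocabulary
g = Graph()
g.add((EX.myocardialInfarction, SKOS.exactMatch, EX.heartAttack))
g.add((EX.heartAttack, SKOS.exactMatch, EX.MI))

# A tiny 'reasoner': close skos:exactMatch under symmetry and transitivity
# so that synonym pairs never stated explicitly become queryable.
changed = True
while changed:
    changed = False
    pairs = set(g.subject_objects(SKOS.exactMatch))
    for a, b in pairs:
        candidates = [(b, a)] + [(a, d) for c, d in pairs if c == b]
        for x, y in candidates:
            if x != y and (x, SKOS.exactMatch, y) not in g:
                g.add((x, SKOS.exactMatch, y))
                changed = True

print((EX.myocardialInfarction, SKOS.exactMatch, EX.MI) in g)  # True
```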
37

Mehta, Ansh. "Emotion Detection using Social Media Data." International Journal for Research in Applied Science and Engineering Technology 9, no. 11 (November 30, 2021): 1456–59. http://dx.doi.org/10.22214/ijraset.2021.39027.

Abstract:
Previous research on emotion recognition of Twitter users centered on the use of lexicons and basic classifiers over bag-of-words models, despite the recent accomplishments of deep learning in many disciplines of natural language processing. The study's main question is whether deep learning can help improve performance. Emotion analysis remains difficult because of the scant contextual information that most posts offer. The suggested method can capture more emotion semantics than existing models by projecting emoticons and words into an emoticon space, which improves the performance of emotion analysis. In a microblog setting, this aids the detection of subjectivity, polarity, and emotion. It accomplishes this by utilizing hashtags to create three large emotion-labeled data sets that can be compared across various emotion taxonomies. We then compare the results of word- and character-based recurrent and convolutional neural networks to those of bag-of-words and latent semantic indexing models. Furthermore, we examine the transferability of the latest hidden-state representations across distinct emotion classes and whether it is possible to construct a unified model for predicting each of them using a common representation. Recurrent neural networks, especially character-based ones, are shown to outperform bag-of-words and latent semantic indexing models. The semantics of each token must be considered when classifying tweet emotion; the semantics of the tokens recorded in the hash map can be searched directly. Despite these models' low transfer capacities, the recently presented training heuristic produces a unified model with performance comparable to the three standalone models. Keywords: Hashtags, Sentiment Analysis, Facial Recognition, Emotions.
38

Никитина, В. В., and А. М. Иванова. "USING BIG DATA IN SEMANTIC RESEARCH: PERSPERCIVES AND APPROACHES." НАУЧНЫЙ ЖУРНАЛ СОВРЕМЕННЫЕ ЛИНГВИСТИЧЕСКИЕ И МЕТОДИКО-ДИДАКТИЧЕСКИЕ ИССЛЕДОВАНИЯ, no. 3(59) (October 2, 2023): 38–49. http://dx.doi.org/10.36622/vstu.2023.32.12.003.

Abstract:
Statement of the problem. Digitization and technology are rapidly taking over various fields of the humanities, including cognitive linguistics, which is developing new algorithms and methods of language research; linguists are encouraged to acquire new skills in data collection and processing and to expand their sources of language material. In cognitive semantics and discourse studies, the biggest game-changer has been the ever-increasing (to the point of apparent limitlessness) volume of language data from digitalized texts available on the Internet, which effectively accumulates in and functions as Big Data: data sets too numerous, large and complex to be processed by traditional, 'by-hand' methods. Such extensive language data requires specific mining and analysis tools and techniques and imposes certain restrictions on its processing and interpretation. In this paper, the authors discuss approaches to the processing ('intellectualization') of empirical linguistic data coming from various online sources via search engines, and the overall reliability of such data for cognitive semantics research. Results. The main argument is that, to ensure the validity of the digital language data under study and credible study results, researchers should employ several data collection methods and tools, i.e., opt for methodological triangulation. The authors describe their methodology with a practical example and conclude that big data systems such as indexed digital texts can effectively replace native speakers as a source of credible information on natural language semantics, as well as improve the overall quality and verifiability of semantics studies. Conclusion. Treating the national segment of the Internet in a given language (English, Russian, etc.) as a natural text corpus of high representativeness, the researcher can use it as a tool for testing various linguistic hypotheses. At the same time, it is necessary to account for the intrinsic qualities of big data, such as volume, velocity (swiftness of change), variability, lack of structure, valuableness, and the presence of "noise", and to carefully design and test the procedures for using search engines to collect big data, in order to ensure the quality and verifiability of the results obtained.
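To make the triangulation procedure concrete, here is a minimal sketch in Python: the same candidate expressions are counted across several independent search backends, and a hypothesis is kept only if the backends agree on the ranking. The `hit_count` adapter and its toy index are assumptions for illustration, not the authors' tooling.

```python
from typing import Dict, List

# Toy counts standing in for real search-engine responses (assumption).
TOY_INDEX = {
    ("engineA", "strong tea"): 9200, ("engineA", "powerful tea"): 310,
    ("engineB", "strong tea"): 8700, ("engineB", "powerful tea"): 280,
}

def hit_count(backend: str, phrase: str) -> int:
    """Hypothetical adapter; a real study would wrap a search API or corpus index."""
    return TOY_INDEX.get((backend, phrase), 0)

def triangulate(phrases: List[str], backends: List[str]) -> Dict[str, List[float]]:
    """Relative frequency of each phrase, computed separately per backend."""
    result: Dict[str, List[float]] = {p: [] for p in phrases}
    for b in backends:
        counts = {p: hit_count(b, p) for p in phrases}
        total = sum(counts.values()) or 1
        for p in phrases:
            result[p].append(counts[p] / total)  # normalise per backend
    return result

# The collocational preference (strong vs. powerful tea) is accepted only
# because both backends produce the same ranking of the two variants.
print(triangulate(["strong tea", "powerful tea"], ["engineA", "engineB"]))
```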
APA, Harvard, Vancouver, ISO, and other styles
39

Chersoni, E., E. Santus, L. Pannitto, A. Lenci, P. Blache, and C. R. Huang. "A structured distributional model of sentence meaning and processing." Natural Language Engineering 25, no. 4 (July 2019): 483–502. http://dx.doi.org/10.1017/s1351324919000214.

Full text
Abstract:
Most compositional distributional semantic models represent sentence meaning with a single vector. In this paper, we propose a structured distributional model (SDM) that combines word embeddings with formal semantics and is based on the assumption that sentences represent events and situations. The semantic representation of a sentence is a formal structure derived from discourse representation theory and containing distributional vectors. This structure is dynamically and incrementally built by integrating knowledge about events and their typical participants, as they are activated by lexical items. Event knowledge is modelled as a graph extracted from parsed corpora, encoding roles and relationships between participants that are represented as distributional vectors. SDM is grounded in extensive psycholinguistic research showing that generalized knowledge about events stored in semantic memory plays a key role in sentence comprehension. We evaluate SDM on two recently introduced compositionality data sets, and our results show that combining a simple compositional model with event knowledge consistently improves performance, even with different types of word embeddings.
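A rough illustration of the model's central move, deriving an expectation for an unfilled thematic role as the weighted centroid of vectors of typical participants, is sketched below. The toy vectors, the event-graph contents and the weights are invented for the example and do not reproduce the authors' implementation.

```python
import numpy as np

# Toy 4-d embeddings (assumption; real models use pretrained vectors).
VEC = {
    "knife":    np.array([0.9, 0.1, 0.0, 0.2]),
    "scissors": np.array([0.8, 0.2, 0.1, 0.1]),
    "saw":      np.array([0.7, 0.1, 0.3, 0.2]),
}

# Illustrative event-knowledge graph: (verb, role) -> typical fillers
# with corpus-derived association weights (made-up numbers).
EVENT_GRAPH = {
    ("cut", "instrument"): [("knife", 0.6), ("scissors", 0.3), ("saw", 0.1)],
}

def role_expectation(verb: str, role: str) -> np.ndarray:
    """Weighted centroid of typical fillers = expectation for an unfilled role."""
    fillers = EVENT_GRAPH[(verb, role)]
    total = sum(w for _, w in fillers)
    return sum(w * VEC[word] for word, w in fillers) / total

def rank_candidates(verb: str, role: str, candidates: list) -> list:
    """Rank candidate fillers by cosine similarity to the expectation vector."""
    exp = role_expectation(verb, role)
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    return sorted(candidates, key=lambda w: -cos(VEC[w], exp))

print(rank_candidates("cut", "instrument", ["saw", "knife"]))
```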
APA, Harvard, Vancouver, ISO, and other styles
40

Yang, Fan. "A Computational Linguistic Approach to English Lexicography." Transactions on Computer Science and Intelligent Systems Research 2 (December 21, 2023): 39–44. http://dx.doi.org/10.62051/wepk6t89.

Full text
Abstract:
Focusing on computational linguistic approaches to English linguistics, this research explores how computational methods can be applied to analyse, understand and utilise the English language. We first look at text analysis and processing, delving into natural language processing techniques such as text categorisation, sentiment analysis and machine translation, and their application to social media and automated text processing. In the area of lexicography and semantics, we explore how techniques such as distributed word vectors, semantic role labelling and sentiment analysis can deepen our understanding of vocabulary and meaning, and we highlight the importance of these techniques in natural language processing tasks such as sentiment analysis and information retrieval. In addition, we focus on cross-language comparison and multilingual research, emphasising how big data and cross-language comparative studies can reveal similarities and differences between languages and their implications for global linguistics. Finally, we explore corpus linguistics and big data analytics, highlighting the rich linguistic data and tools they provide for linguistic research. Overall, this study underlines the importance of computational linguistic approaches to English linguistics and how they have transformed both the way linguistics is studied and the way language technology evolves. Future research will continue to drive the development of computational linguistics methods, leading to a closer integration of linguistics with big data analytics and computational methods and creating more opportunities for the field.
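Of the techniques listed, lexicon-based sentiment analysis is the simplest to sketch; the word lists and the scoring rule below are illustrative assumptions rather than any particular system.

```python
# Minimal lexicon-based sentiment scorer (illustrative; production
# systems use trained classifiers or transformer models).
POS = {"good", "great", "excellent", "clear", "useful"}
NEG = {"bad", "poor", "confusing", "useless", "slow"}

def sentiment(text: str) -> float:
    """Return a score in [-1, 1]: +1 all-positive words, -1 all-negative."""
    tokens = [t.strip(".,!?;:").lower() for t in text.split()]
    hits = [(1 if t in POS else -1) for t in tokens if t in POS | NEG]
    return sum(hits) / len(hits) if hits else 0.0

print(sentiment("The new dictionary interface is clear and useful, not slow."))
# -> 0.333...  (two positive hits, one negative)
```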
APA, Harvard, Vancouver, ISO, and other styles
41

Nicklas, Daniela, Thomas Schwarz, and Bernhard Mitschang. "A Schema-Based Approach to Enable Data Integration on the Fly." International Journal of Cooperative Information Systems 26, no. 01 (March 2017): 1650010. http://dx.doi.org/10.1142/s0218843016500106.

Full text
Abstract:
On-the-fly data integration, i.e. at query time, happens mostly in tightly coupled, homogeneous environments where the partitioning of the data can be controlled or is known in advance. During the process of data fusion, the information is homogenized and data inconsistencies are hidden from the application. Going beyond this, we propose in this paper the Nexus metadata model and a processing approach that support on-the-fly data integration in a loosely coupled federation of autonomous data providers, thereby advancing the status quo in terms of flexibility and expressive power. The model is able to represent data and schema inconsistencies such as multi-valued attributes and multi-typed objects. In an open environment, this best suits applications where the data processing infrastructure is not able to decide which attribute value is correct. The Nexus metadata model provides the foundation for integration schemata that are specific to a given application domain. The corresponding processing model provides four complementary query semantics in order to account for the subtleties of multi-valued and missing attributes. In this paper we show that these query semantics are sound, easy to implement, and build upon existing query processing techniques. Thus the Nexus metadata model provides a unique level of flexibility for on-the-fly data integration.
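The role of complementary query semantics can be pictured with a simplified two-reading reduction: a predicate over a multi-valued attribute is evaluated either existentially (some candidate value matches) or universally (all candidate values match), with missing attributes handled explicitly. The encoding and function names below are illustrative assumptions, not the Nexus processing model itself.

```python
from typing import Callable, List, Optional

# A multi-valued attribute is a list of candidate values contributed by
# different providers; None models a missing attribute (assumed encoding).
def eval_exists(values: Optional[List[str]], pred: Callable[[str], bool]) -> bool:
    """Existential reading: true if at least one candidate value matches."""
    return any(pred(v) for v in values) if values else False

def eval_all(values: Optional[List[str]], pred: Callable[[str], bool]) -> bool:
    """Universal reading: true only if every candidate value matches."""
    return all(pred(v) for v in values) if values else False

# Two providers disagree about a point of interest's category:
poi_type = ["restaurant", "cafe"]
is_restaurant = lambda v: v == "restaurant"

print(eval_exists(poi_type, is_restaurant))  # True: possibly a restaurant
print(eval_all(poi_type, is_restaurant))     # False: not certainly one
```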
APA, Harvard, Vancouver, ISO, and other styles
42

Hogrefe, Katharina, Georg Goldenberg, Ralf Glindemann, Madleen Klonowski, and Wolfram Ziegler. "Nonverbal Semantics Test (NVST)—A Novel Diagnostic Tool to Assess Semantic Processing Deficits: Application to Persons with Aphasia after Cerebrovascular Accident." Brain Sciences 11, no. 3 (March 11, 2021): 359. http://dx.doi.org/10.3390/brainsci11030359.

Full text
Abstract:
Assessment of semantic processing capacities often relies on verbal tasks which are, however, sensitive to impairments at several language processing levels. Especially for persons with aphasia there is a strong need for a tool that measures semantic processing skills independent of verbal abilities. Furthermore, in order to assess a patient’s potential for using alternative means of communication in cases of severe aphasia, semantic processing should be assessed in different nonverbal conditions. The Nonverbal Semantics Test (NVST) is a tool that captures semantic processing capacities through three tasks—Semantic Sorting, Drawing, and Pantomime. The main aim of the current study was to investigate the relationship between the NVST and measures of standard neurolinguistic assessment. Fifty-one persons with aphasia caused by left hemisphere brain damage were administered the NVST as well as the Aachen Aphasia Test (AAT). A principal component analysis (PCA) was conducted across all AAT and NVST subtests. The analysis resulted in a two-factor model that captured 69% of the variance of the original data, with all linguistic tasks loading high on one factor and the NVST subtests loading high on the other. These findings suggest that nonverbal tasks assessing semantic processing capacities should be administered alongside standard neurolinguistic aphasia tests.
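For readers unfamiliar with the statistical step, a principal component analysis over a subjects-by-subtests score matrix looks roughly like the sketch below; the synthetic data and the two-component choice merely mirror the study's design, not its actual scores.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Synthetic scores: 51 subjects x 9 subtests (e.g., 6 verbal + 3 nonverbal).
# A real analysis uses the measured test scores, standardised per subtest.
verbal = rng.normal(size=(51, 1))
nonverbal = rng.normal(size=(51, 1))
scores = np.hstack([
    verbal + 0.3 * rng.normal(size=(51, 6)),     # verbal (AAT-like) subtests
    nonverbal + 0.3 * rng.normal(size=(51, 3)),  # nonverbal (NVST-like) subtests
])
scores = (scores - scores.mean(0)) / scores.std(0)  # z-standardise

pca = PCA(n_components=2).fit(scores)
print("explained variance:", pca.explained_variance_ratio_.sum().round(2))
print("loadings:\n", pca.components_.T.round(2))  # subtests x components
```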
APA, Harvard, Vancouver, ISO, and other styles
43

Haris, Erum, and Keng Hoon Gan. "Extraction and Visualization of Tourist Attraction Semantics from Travel Blogs." ISPRS International Journal of Geo-Information 10, no. 10 (October 18, 2021): 710. http://dx.doi.org/10.3390/ijgi10100710.

Full text
Abstract:
Travel blogs are a significant source for modeling human travelling behavior and characterizing tourist destinations owing to the presence of rich geospatial and thematic content. However, the bulk of unstructured text requires extensive processing for an efficient transformation of data into knowledge. Existing works have studied tourist places, but their results lack a coherent outline and visualization of the semantic knowledge associated with tourist attractions. Hence, this work proposes place-semantics extraction based on a fusion of content analysis and natural language processing (NLP) techniques. A weighted-sum equation model is then employed to construct a points-of-interest graph (POI graph) that integrates the extracted semantics with conventional frequency-based weighting of tourist spots and routes. The framework distils and visualizes massive volumes of blog text in a comprehensible manner, helping individuals make travel decisions and enabling tourism managers to devise effective destination planning and management strategies.
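The weighted-sum step can be pictured as follows: each edge between two points of interest gets a score mixing visit frequency with a semantic weight derived from the extracted place descriptions. The alpha/beta mix, the sample data and the networkx representation are illustrative assumptions, not the paper's exact model.

```python
import networkx as nx

# Illustrative inputs: trip counts between POIs and a semantic score per
# POI pair (e.g., derived from shared themes in the blog text).
freq = {("Temple", "Old Market"): 34, ("Old Market", "Harbour"): 21}
semantic = {("Temple", "Old Market"): 0.8, ("Old Market", "Harbour"): 0.4}

ALPHA, BETA = 0.6, 0.4   # assumed mixing weights
max_freq = max(freq.values())

G = nx.DiGraph()
for pair in freq:
    # Weighted sum of normalised frequency and semantic relatedness.
    w = ALPHA * freq[pair] / max_freq + BETA * semantic[pair]
    G.add_edge(*pair, weight=round(w, 3))

for u, v, d in G.edges(data=True):
    print(u, "->", v, d["weight"])
```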
APA, Harvard, Vancouver, ISO, and other styles
44

Huff, S. M., P. J. Haug, D. A. Evans, B. E. Bray, and R. A. Rocha. "Evaluation of a Semantic Data Model for Chest Radiology: Application of aNew Methodology." Methods of Information in Medicine 37, no. 04/05 (October 1998): 477–90. http://dx.doi.org/10.1055/s-0038-1634549.

Full text
Abstract:
An essential step toward the effective processing of medical language is the development of representational models that formalize the language's semantics. These models, also known as semantic data models, help to unlock the meaning of descriptive expressions, making them accessible to computer systems. The present study seeks to determine the quality of a semantic data model created to encode chest radiology findings. The evaluation methodology relied on the ability of physicians to extract information from textual and encoded representations of chest X-ray reports while answering questions associated with each report. The evaluation demonstrated that the encoded reports appeared to have the same information content as the original textual reports. The methodology generated useful data regarding the quality of the data model, demonstrating that certain segments were creating ambiguous representations and that some details were not being represented.
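To make "encoded representation" concrete, a frame-like encoding of a free-text finding might look like the sketch below; the slot names and values are invented for illustration and are not the study's actual model.

```python
# Free-text finding and a hypothetical frame-like encoding of it.
report_text = "Small right pleural effusion, unchanged from prior exam."

encoded_finding = {
    "finding": "pleural effusion",   # invented slot names, for illustration
    "laterality": "right",
    "size": "small",
    "change": "stable",
}

def render(frame: dict) -> str:
    """Regenerate a readable sentence from the encoded slots."""
    return (f"{frame['size'].capitalize()} {frame['laterality']} "
            f"{frame['finding']}, {frame['change']} over time.")

print(render(encoded_finding))
```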
APA, Harvard, Vancouver, ISO, and other styles
45

Jajaga, Edmond, and Lule Ahmedi. "C-SWRL: A Unique Semantic Web Framework for Reasoning Over Stream Data." International Journal of Semantic Computing 11, no. 03 (September 2017): 391–409. http://dx.doi.org/10.1142/s1793351x17400165.

Full text
Abstract:
The synergy of Data Stream Management Systems and Semantic Web applications has steered towards a new paradigm known as Stream Reasoning. The Semantic Web standards for knowledge base modeling and querying, namely RDF, OWL and SPARQL, have been used extensively by the Stream Reasoning community. However, the Semantic Web rule languages, such as SWRL and RIF, have never been used in stream data applications; different non-Semantic Web rule systems have been adopted instead. Since RIF is primarily intended for exchanging rules among systems, we focus on SWRL applications with stream data. This proves difficult given SWRL's open-world semantics. To overcome SWRL's expressivity issues we propose an infrastructure extension that enables SWRL reasoning with stream data: a query processing system, C-SPARQL, is layered under SWRL to support closed-world and time-aware reasoning. Moreover, OWLAPI constructs are utilized to enable non-monotonicity, while SPARQL constructs are used to enable negation as failure. Water quality monitoring is used as the validation domain of the proposed system.
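A continuous query of the kind layered under the rule engine might look like the following C-SPARQL registration; the stream URI, the vocabulary prefix and the turbidity threshold are invented for the water-quality example.

```python
# A C-SPARQL continuous query (embedded as a string for illustration).
# Stream URI, vocabulary and threshold are assumptions for the example.
csparql_query = """
REGISTER QUERY HighTurbidity AS
PREFIX wq: <http://example.org/waterquality#>
SELECT ?sensor ?value
FROM STREAM <http://example.org/streams/river1> [RANGE 10m STEP 1m]
WHERE {
  ?obs wq:observedBy ?sensor ;
       wq:turbidity  ?value .
  FILTER (?value > 5.0)
}
"""
# The window clause [RANGE 10m STEP 1m] re-evaluates the query every
# minute over the last ten minutes of observations: the time-aware,
# closed-world snapshot that a SWRL rule layer can then reason over.
print(csparql_query)
```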
APA, Harvard, Vancouver, ISO, and other styles
46

Zhang, Kaile, and Ichiro Koshijima. "Trend analysis of online travel review text mining over time." Journal of Modelling in Management 15, no. 2 (November 18, 2019): 491–508. http://dx.doi.org/10.1108/jm2-10-2018-0178.

Full text
Abstract:
Purpose Online tourism reviews have not been exploited effectively, because the text data they produce is enormous and in-depth research on it is still in its infancy. Such text data can be processed by text mining methods to surface the implicit information it contains. The purpose of this paper is to help tourism practitioners and tourists make convenient use of review texts through appropriate visualization and processing techniques. In particular, reviews that change over time can be used to reflect changes in tourists' feedback and concerns. Design/methodology/approach Latent semantic analysis is a new branch of semantics. Every term in a document can be regarded as a point in a multi-dimensional space. When a meaningful document is mapped into this space, its distribution is not random but obeys some type of semantic structure. Findings First, the approach grasps the big data as a whole. Second, a direct method is proposed that allows researchers and practitioners outside natural language processing to use the data. Lastly, the results of changes across different spans of time are investigated. Originality/value This paper proposes an approach to mining a significant number of travel comments from different years, which may generate new ideas for tourism. The authors put forward a processing approach for large volumes of comment text. Using the case study of Mt. Lushan, the various changes in travel reviews over the years are successfully visualized and displayed.
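Latent semantic analysis as described above is typically realised as a truncated SVD of a weighted term-document matrix; here is a minimal sketch with scikit-learn, using made-up review snippets in place of scraped comments.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD

# Made-up review snippets standing in for scraped travel comments.
reviews = [
    "beautiful mountain scenery and waterfalls",
    "the mountain trail and waterfalls were stunning",
    "crowded cable car, long queue at the ticket office",
    "ticket prices high and the queue for the cable car endless",
]

X = TfidfVectorizer().fit_transform(reviews)      # term-document weights
lsa = TruncatedSVD(n_components=2, random_state=0)
doc_coords = lsa.fit_transform(X)                 # documents in semantic space

# Reviews about the same topic land close together in the reduced space.
for text, coord in zip(reviews, doc_coords.round(2)):
    print(coord, text[:40])
```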
APA, Harvard, Vancouver, ISO, and other styles
47

Аkanova, Akerke, Nazira Ospanova, Saltanat Sharipova, Gulalem Мauina, and Zhanat Abdugulova. "Development of a thematic and neural network model for data learning." Eastern-European Journal of Enterprise Technologies 4, no. 2(118) (August 31, 2022): 40–50. http://dx.doi.org/10.15587/1729-4061.2022.263421.

Full text
Abstract:
Research in the field of semantic text analysis begins with the study of the structure of natural language. The Kazakh language is unique in that it belongs to the agglutinative languages and requires careful study. The object of this study is text in the Kazakh language. Existing approaches to semantic analysis of Kazakh text do not consider analysis that combines topic modeling with neural network learning. The purpose of this study is to determine the quality of a topic model based on the LDA (Latent Dirichlet Allocation) method with Gibbs sampling, through neural network learning. The LDA model can determine the semantic probability of the keywords of a single document and assign them a rating score. To build the neural network, the widely used LSTM architecture was chosen, which has proven itself in NLP (Natural Language Processing) tasks. The learning results show to what extent the text was trained and how well the semantic analysis of Kazakh text performed. The system, developed on the basis of the LDA model and neural network learning, combines the detected keywords into separate topics. Overall, the experimental results showed that the use of deep neural networks yields the expected quality of the LDA model in processing the Kazakh language. The developed neural network model contributes to assessing the accuracy of the semantics of the analysed Kazakh text. The results obtained can be applied in text-processing systems, for example, when checking whether the topic and content of submitted texts (abstracts, term papers, theses, and other works) correspond.
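To show the topic-modelling half of the pipeline, here is a minimal LDA run with gensim on toy tokenised documents. Note that gensim's LdaModel trains with online variational Bayes rather than the Gibbs sampling used in the paper, so this is an approximation of the step, not a reproduction.

```python
from gensim import corpora, models

# Toy tokenised documents (real input would be lemmatised Kazakh text).
texts = [
    ["river", "water", "flood", "rain"],
    ["rain", "water", "storm", "river"],
    ["match", "goal", "team", "score"],
    ["team", "player", "goal", "league"],
]

dictionary = corpora.Dictionary(texts)
corpus = [dictionary.doc2bow(t) for t in texts]

# Note: gensim uses online variational Bayes, not Gibbs sampling as in
# the paper -- an assumption-level substitution for this sketch.
lda = models.LdaModel(corpus, num_topics=2, id2word=dictionary,
                      random_state=0, passes=10)
for topic_id, words in lda.print_topics(num_words=4):
    print(topic_id, words)
```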
APA, Harvard, Vancouver, ISO, and other styles
48

Rao, Yunbo, Menghan Zhang, Zhanglin Cheng, Junmin Xue, Jiansu Pu, and Zairong Wang. "Semantic Point Cloud Segmentation Using Fast Deep Neural Network and DCRF." Sensors 21, no. 8 (April 13, 2021): 2731. http://dx.doi.org/10.3390/s21082731.

Full text
Abstract:
Accurate segmentation of entity categories is a critical step in 3D scene understanding. This paper presents a fast deep neural network model with a Dense Conditional Random Field (DCRF) as a post-processing method, which can perform accurate semantic segmentation of 3D point cloud scenes. On this basis, a compact but flexible framework is introduced for segmenting the semantics of point clouds concurrently, contributing to more precise segmentation. Moreover, based on the semantic labels, a novel DCRF model is elaborated to refine the segmentation result. In addition, without any sacrifice of accuracy, we optimize the original point cloud data, allowing the network to handle less data. In the experiments, our proposed method is evaluated comprehensively on four evaluation indicators, demonstrating its superiority.
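Dense CRF post-processing can be approximated in a few lines: each point's class probabilities are pulled toward those of spatially close points through a Gaussian kernel and then renormalised, i.e., one mean-field-style update. The numpy sketch below is a drastically simplified stand-in for the paper's DCRF, with made-up hyper-parameters.

```python
import numpy as np

def crf_refine(points: np.ndarray, probs: np.ndarray,
               sigma: float = 0.5, pair_weight: float = 0.7) -> np.ndarray:
    """One simplified mean-field update of a dense CRF over a point cloud.

    points: (N, 3) xyz coordinates; probs: (N, C) per-class probabilities.
    sigma and pair_weight are illustrative hyper-parameters (assumptions).
    """
    # Pairwise Gaussian affinities from squared Euclidean distances.
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    k = np.exp(-d2 / (2 * sigma ** 2))
    np.fill_diagonal(k, 0.0)                    # no self-message
    msg = k @ probs / k.sum(1, keepdims=True)   # neighbour consensus
    refined = (1 - pair_weight) * probs + pair_weight * msg
    return refined / refined.sum(1, keepdims=True)

pts = np.array([[0, 0, 0], [0.1, 0, 0], [5, 5, 5]], float)
p = np.array([[0.6, 0.4], [0.4, 0.6], [0.1, 0.9]])
print(crf_refine(pts, p).round(2))  # close points pull labels together
```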
APA, Harvard, Vancouver, ISO, and other styles
49

PAUL, MANOJ, and S. K. GHOSH. "A SERVICE-ORIENTED APPROACH FOR INTEGRATING HETEROGENEOUS SPATIAL DATA SOURCES REALIZATION OF A VIRTUAL GEO-DATA REPOSITORY." International Journal of Cooperative Information Systems 17, no. 01 (March 2008): 111–53. http://dx.doi.org/10.1142/s0218843008001774.

Full text
Abstract:
Searching and accessing geospatial information in the open and distributed environments of geospatial information systems poses several challenges due to the heterogeneity of geospatial data. Geospatial data is highly heterogeneous, both at the syntactic and the semantic level. The need for an integration architecture for seamless access to geospatial data has been raised over the past decades. The paper proposes a service-based model for geospatial integration in which each geospatial data provider is exposed on the web as a service. The interfaces of these services are described with Open Geospatial Consortium (OGC) service standards. A catalog service provides service descriptions so that services can be discovered. The semantics of each service description are captured in the form of an ontology. The similarity assessment method proposed in this paper, which matches a requested service against candidate services, is aimed at resolving the heterogeneity in the semantics of locational terms in service descriptions. In effect, we propose an architecture for an enterprise geographic information system (E-GIS), an organization-wide approach to GIS integration, operation, and management. A query processing mechanism for accessing geospatial information in the service-based distributed environment is also discussed with the help of a case study.
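A bare-bones version of service matching by locational semantics: compare the concept sets attached to the requested and candidate service descriptions and rank by overlap. Jaccard similarity and the sample concept sets below are illustrative stand-ins for the paper's ontology-based assessment.

```python
# Illustrative concept sets extracted from service descriptions.
request = {"river", "watershed", "district:north"}
candidates = {
    "WFS-hydro":   {"river", "stream", "watershed", "district:north"},
    "WFS-roads":   {"road", "bridge", "district:north"},
    "WMS-landuse": {"parcel", "landuse", "district:south"},
}

def jaccard(a: set, b: set) -> float:
    """Set-overlap similarity in [0, 1]."""
    return len(a & b) / len(a | b)

ranked = sorted(candidates, key=lambda name: -jaccard(request, candidates[name]))
for name in ranked:
    print(name, round(jaccard(request, candidates[name]), 2))
```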
APA, Harvard, Vancouver, ISO, and other styles
50

González, Víctor, Laura Martín, Juan Ramón Santana, Pablo Sotres, Jorge Lanza, and Luis Sánchez. "Reshaping Smart Cities through NGSI-LD Enrichment." Sensors 24, no. 6 (March 14, 2024): 1858. http://dx.doi.org/10.3390/s24061858.

Full text
Abstract:
The vast amount of information stemming from the deployment of the Internet of Things and open data portals is poised to provide significant benefits for both the private and public sectors, such as the development of value-added services or an increase in the efficiency of public services. This is further enhanced by the potential of semantic information models such as NGSI-LD, which enable the enrichment and linkage of semantic data, reinforced by the contextual information they carry by definition. In this scenario, advanced data processing techniques need to be defined and developed for the processing of harmonised datasets and data streams. Our work is based on a structured approach that leverages the principles of linked-data modelling and semantics, as well as a data enrichment toolchain framework developed around NGSI-LD. Within this framework, we reveal the potential for enrichment and linkage techniques to reshape how data are exploited in smart cities, with a particular focus on citizen-centred initiatives. Moreover, we showcase the effectiveness of these data processing techniques through specific examples of entity transformations. The findings, which focus on improving data comprehension and bolstering smart city advancements, set the stage for the future exploration and refinement of the symbiosis between semantic data and smart city ecosystems.
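For readers unfamiliar with the model, an NGSI-LD entity is JSON-LD with typed Property and Relationship attributes; a minimal enriched entity might look like the sketch below. The identifiers, attribute names and unit code are illustrative, not taken from the paper's deployment.

```python
import json

# A minimal NGSI-LD entity (illustrative values; the Property /
# Relationship structure follows the ETSI NGSI-LD information model).
entity = {
    "id": "urn:ngsi-ld:AirQualityObserved:station-001",
    "type": "AirQualityObserved",
    "NO2": {
        "type": "Property",
        "value": 22.0,
        "unitCode": "GQ",  # UN/CEFACT code assumed here for ug/m3
        "observedAt": "2024-03-14T10:00:00Z",
    },
    # Enrichment: link the observation to the producing device entity.
    "refDevice": {
        "type": "Relationship",
        "object": "urn:ngsi-ld:Device:sensor-042",
    },
    "@context": [
        "https://uri.etsi.org/ngsi-ld/v1/ngsi-ld-core-context.jsonld"
    ],
}
print(json.dumps(entity, indent=2))
```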
APA, Harvard, Vancouver, ISO, and other styles