Academic literature on the topic 'RDF dataset characterization and classification'

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'RDF dataset characterization and classification.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Press it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Journal articles on the topic "RDF dataset characterization and classification"

1

Gupta, Rupal, and Sanjay Kumar Malik. "A classification using RDFLIB and SPARQL on RDF dataset." Journal of Information and Optimization Sciences 43, no. 1 (January 2, 2022): 143–54. http://dx.doi.org/10.1080/02522667.2022.2039461.

2

Xiao, Changhu, Yuan Guo, Kaixuan Zhao, Sha Liu, Nongyue He, Yi He, Shuhong Guo, and Zhu Chen. "Prognostic Value of Machine Learning in Patients with Acute Myocardial Infarction." Journal of Cardiovascular Development and Disease 9, no. 2 (February 11, 2022): 56. http://dx.doi.org/10.3390/jcdd9020056.

Abstract:
(1) Background: Patients with acute myocardial infarction (AMI) still experience many major adverse cardiovascular events (MACEs), including myocardial infarction, heart failure, kidney failure, coronary events, cerebrovascular events, and death. This retrospective study aims to assess the prognostic value of machine learning (ML) for the prediction of MACEs. (2) Methods: Five-hundred patients diagnosed with AMI and who had undergone successful percutaneous coronary intervention were included in the study. Logistic regression (LR) analysis was used to assess the relevance of MACEs and 24 selected clinical variables. Six ML models were developed with five-fold cross-validation in the training dataset and their ability to predict MACEs was compared to LR with the testing dataset. (3) Results: The MACE rate was calculated as 30.6% after a mean follow-up of 1.42 years. Killip classification (Killip IV vs. I class, odds ratio 4.386, 95% confidence interval 1.943–9.904), drug compliance (irregular vs. regular compliance, 3.06, 1.721–5.438), age (per year, 1.025, 1.006–1.044), and creatinine (1 µmol/L, 1.007, 1.002–1.012) and cholesterol levels (1 mmol/L, 0.708, 0.556–0.903) were independent predictors of MACEs. In the training dataset, the best performing model was the random forest (RDF) model with an area under the curve of (0.749, 0.644–0.853) and accuracy of (0.734, 0.647–0.820). In the testing dataset, the RDF showed the most significant survival difference (log-rank p = 0.017) in distinguishing patients with and without MACEs. (4) Conclusions: The RDF model has been identified as superior to other models for MACE prediction in this study. ML methods can be promising for improving optimal predictor selection and clinical outcomes in patients with AMI.
3

Sarquah, Khadija, Satyanarayana Narra, Gesa Beck, Uduak Bassey, Edward Antwi, Michael Hartmann, Nana Sarfo Agyemang Derkyi, Edward A. Awafo, and Michael Nelles. "Characterization of Municipal Solid Waste and Assessment of Its Potential for Refuse-Derived Fuel (RDF) Valorization." Energies 16, no. 1 (December 24, 2022): 200. http://dx.doi.org/10.3390/en16010200.

Abstract:
Reuse and recycling are preferred strategies in waste management to ensure the high position of waste resources in the waste management hierarchy. However, challenges are still pronounced in many developing countries, where disposal as a final solution is prevalent, particularly for municipal solid waste. On the other hand, refuse-derived fuel as a means of energy recovery provides a sustainable option for managing mixed, contaminated and residual municipal solid waste (MSW). This study provides one of the earliest assessments of refuse-derived fuel (RDF) from MSW in Ghana through a case study in the cities of Accra and Kumasi. The residual/reject fractions (RFs) of MSW material recovery were characterized for thermochemical energy purposes. The studied materials had the potential to be used as RDF. The combustible portions from the residual fractions formed good alternative fuel, RDF, under the class I, II-III classification of the EN 15359:2011 standards. The RDF produced from only combustible mixed materials such as plastics, paper and wood showed a significant increase in the lower heating value (28.66–30.24 MJ/kg) compared to the bulk RF containing organics (19.73 to 23.75 MJ/kg). The chlorine and heavy metal content met the limits set by various standards. An annual RDF production of 12 to 57 kilotons is possible from the two cities. This could offset 10–30% of the present industrial coal consumption, corresponding to about 180 kiloton/yr of CO2 eq emissions and a net cost saving of USD 8.7 million per year. The market for RDF as an industrial alternative fuel is developing in Ghana and similar jurisdictions in this context. Therefore, this study provides insights into the potential for RDF in integrated waste management system implementation for socioeconomic and environmental benefits. This supports efforts towards achieving the Sustainable Development Goals (SDGs) and a circular economy.
4

Seydou, Sangare, Konan Marcellin Brou, Kouame Appoh, and Kouadio Prosper Kimou. "HYBRID MODEL FOR THE CLASSIFICATION OF QUESTIONS EXPRESSED IN NATURAL LANGUAGE." International Journal of Advanced Research 10, no. 09 (September 30, 2022): 202–12. http://dx.doi.org/10.21474/ijar01/15343.

Abstract:
Question-answering systems rely on unstructured text corpora or a knowledge base to answer user questions. Most of these systems store knowledge in multiple repositories, including RDF. SPARQL is the most convenient formal language for accessing this type of repository, but it is complex, so questions expressed in natural language by users must be transformed into SPARQL queries, and several approaches have been proposed for this transformation. However, identifying the question type remains a serious problem, and question classification plays a key role at this level. Machine learning algorithms, including neural networks, are used for this classification. As the volume of data increases, neural networks generally perform better than classical machine learning algorithms, although the latter also remain good classifiers. For greater efficiency, a combination of a convolutional neural network with these algorithms is suggested in this paper. The BICNN-SVM combination obtained good scores not only on a small dataset, with a precision of 96.60%, but also on a large dataset, with 94.05%.
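Illustrative sketch (not from the paper): the BICNN-SVM hybrid above is not reproduced here, but the general idea of question-type classification feeding SPARQL template selection can be pictured with a TF-IDF feature extractor and a linear SVM as a simplified stand-in for the convolutional feature extractor; the questions, labels, and type names below are invented for illustration.

```python
# Simplified question-type classifier: TF-IDF features + linear SVM.
# Stand-in for the BICNN-SVM hybrid described in the abstract; the
# questions, labels, and type names are hypothetical examples.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

questions = [
    "Who wrote The Hobbit?",
    "Where was Marie Curie born?",
    "When did World War II end?",
    "Who directed Pulp Fiction?",
    "Where is the Eiffel Tower located?",
    "When was the Berlin Wall built?",
]
labels = ["person", "place", "date", "person", "place", "date"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
clf.fit(questions, labels)

# The predicted question type can then guide which SPARQL template to fill.
print(clf.predict(["Who painted the Mona Lisa?"]))  # expected: ['person']
```

In a full question-answering pipeline, the predicted type would select a SPARQL template whose slots are filled from the entities recognized in the question.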
5

Liang, Haobang, Jiao Li, Hejun Wu, Li Li, Xinrui Zhou, and Xinhua Jiang. "Mammographic Classification of Breast Cancer Microcalcifications through Extreme Gradient Boosting." Electronics 11, no. 15 (August 4, 2022): 2435. http://dx.doi.org/10.3390/electronics11152435.

Abstract:
In this paper, we proposed an effective and efficient approach to the classification of breast cancer microcalcifications and evaluated the mathematical model for calcification on mammography with a large medical dataset. We employed several semi-automatic segmentation algorithms to extract 51 calcification features from mammograms, including morphologic and textural features. We adopted extreme gradient boosting (XGBoost) to classify microcalcifications. Then, we compared other machine learning techniques, including k-nearest neighbor (kNN), adaboostM1, decision tree, random decision forest (RDF), and gradient boosting decision tree (GBDT), with XGBoost. XGBoost showed the highest accuracy (90.24%) for classifying microcalcifications, and kNN demonstrated the lowest accuracy. This result demonstrates that it is essential for the classification of microcalcification to use the feature engineering method for the selection of the best composition of features. One of the contributions of this study is to present the best composition of features for efficient classification of breast cancers. This paper finds a way to select the best discriminative features as a collection to improve the accuracy. This study showed the highest accuracy (90.24%) for classifying microcalcifications with AUC = 0.89. Moreover, we highlighted the performance of various features from the dataset and found ideal parameters for classifying microcalcifications. Furthermore, we found that the XGBoost model is suitable both in theory and practice for the classification of calcifications on mammography.
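Illustrative sketch (not from the paper): the model comparison above can be pictured by training XGBoost and a k-nearest-neighbor baseline on synthetic tabular data; the 51 synthetic features merely stand in for the morphologic and textural calcification features, and the xgboost package is assumed to be installed.

```python
# Hedged sketch: comparing XGBoost with kNN on synthetic tabular features,
# standing in for the 51 calcification features used in the paper.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score, roc_auc_score
from xgboost import XGBClassifier

X, y = make_classification(n_samples=1000, n_features=51, n_informative=15,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "xgboost": XGBClassifier(n_estimators=200, max_depth=4, eval_metric="logloss"),
    "knn": KNeighborsClassifier(n_neighbors=5),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    proba = model.predict_proba(X_te)[:, 1]
    print(name,
          "acc=%.3f" % accuracy_score(y_te, model.predict(X_te)),
          "auc=%.3f" % roc_auc_score(y_te, proba))
```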
6

Sliwinski, Jakub, Martin Strobel, and Yair Zick. "Axiomatic Characterization of Data-Driven Influence Measures for Classification." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 718–25. http://dx.doi.org/10.1609/aaai.v33i01.3301718.

Abstract:
We study the following problem: given a labeled dataset and a specific datapoint x, how did the i-th feature influence the classification for x? We identify a family of numerical influence measures — functions that, given a datapoint x, assign a numeric value φ_i(x) to every feature i, corresponding to how altering i's value would influence the outcome for x. This family, which we term monotone influence measures (MIM), is uniquely derived from a set of desirable properties, or axioms. The MIM family constitutes a provably sound methodology for measuring feature influence in classification domains; the values generated by MIM are based on the dataset alone, and do not make any queries to the classifier. While this requirement naturally limits the scope of our framework, we demonstrate its effectiveness on data.
7

Devdatt Kawathekar, Ishan, and Anu Shaju Areeckal. "Performance analysis of texture characterization techniques for lung nodule classification." Journal of Physics: Conference Series 2161, no. 1 (January 1, 2022): 012045. http://dx.doi.org/10.1088/1742-6596/2161/1/012045.

Abstract:
Lung cancer ranks very high on a global index for cancer-related casualties. With early detection of lung cancer, the rate of survival increases to 80-90%. The standard method for diagnosing lung cancer from Computed Tomography (CT) scans is by manual annotation and detection of the cancerous regions, which is a tedious task for radiologists. This paper proposes a machine learning approach for multi-class classification of the lung nodules into solid, semi-solid, and Ground Glass Object texture classes. We employ feature extraction techniques, such as gray-level co-occurrence matrix, Gabor filters, and local binary pattern, and validate the performance on the LNDb dataset. The best performing classifier displays an accuracy of 94% and an F1-score of 0.92. The proposed approach was compared with related work using the same dataset. The results are promising, and the proposed method can be used to diagnose lung cancer accurately.
8

Scime, Anthony, Nilay Saiya, Gregg R. Murray, and Steven J. Jurek. "Classification Trees as Proxies." International Journal of Business Analytics 2, no. 2 (April 2015): 31–44. http://dx.doi.org/10.4018/ijban.2015040103.

Abstract:
In data analysis, when data are unattainable, it is common to select a closely related attribute as a proxy. But sometimes substitution of one attribute for another is not sufficient to satisfy the needs of the analysis. In these cases, a classification model based on one dataset can be investigated as a possible proxy for another closely related domain's dataset. If the model's structure is sufficient to classify data from the related domain, the model can be used as a proxy tree. Such a proxy tree also provides an alternative characterization of the related domain. Just as important, if the original model does not successfully classify the related domain data the domains are not as closely related as believed. This paper presents a methodology for evaluating datasets as proxies along with three cases that demonstrate the methodology and the three types of results.
9

Stork, Christopher L., and Michael R. Keenan. "Advantages of Clustering in the Phase Classification of Hyperspectral Materials Images." Microscopy and Microanalysis 16, no. 6 (October 22, 2010): 810–20. http://dx.doi.org/10.1017/s143192761009402x.

Abstract:
Despite the many demonstrated applications of factor analysis (FA) in analyzing hyperspectral materials images, FA does have inherent mathematical limitations, preventing it from solving certain materials characterization problems. A notable limitation of FA is its parsimony restriction, referring to the fact that in FA the number of components cannot exceed the chemical rank of a dataset. Clustering is a promising alternative to FA for the phase classification of hyperspectral materials images. In contrast with FA, the phases extracted by clustering do not have to be parsimonious. Clustering has an added advantage in its insensitivity to spectral collinearity that can result in phase mixing using FA. For representative energy dispersive X-ray spectroscopy materials images, namely a solder bump dataset and a braze interface dataset, clustering generates phase classification results that are superior to those obtained using representative FA-based methods. For the solder bump dataset, clustering identifies a Cu-Sn intermetallic phase that cannot be isolated using FA alone due to the parsimony restriction. For the braze interface sample that has collinearity among the phase spectra, the clustering results do not exhibit the physically unrealistic phase mixing obtained by multivariate curve resolution, a commonly utilized FA algorithm.
10

Bougacha, Aymen, Ines Njeh, Jihene Boughariou, Omar Kammoun, Kheireddine Ben Mahfoudh, Mariem Dammak, Chokri Mhiri, and Ahmed Ben Hamida. "Rank-Two NMF Clustering for Glioblastoma Characterization." Journal of Healthcare Engineering 2018 (October 23, 2018): 1–7. http://dx.doi.org/10.1155/2018/1048164.

Abstract:
This study investigates a novel classification method for 3D multimodal MRI glioblastoma tumor characterization. We formulate our segmentation problem as a linear mixture model (LMM). Thus, we provide a nonnegative matrix M from every MRI slice in every step of the segmentation process. This matrix is used as an input for the first segmentation process to extract the edema region from the T2 and FLAIR modalities. After that, in the remaining segmentation processes, we extract the edema region from the T1c modality, generate the matrix M, and segment the necrosis, enhanced tumor, and non-enhanced tumor regions. In the segmentation process, we apply rank-two NMF clustering. We executed our tumor characterization method on the BraTS 2015 challenge dataset. Quantitative and qualitative evaluations over the publicly available training and testing datasets from the MICCAI 2015 multimodal brain segmentation challenge (BraTS 2015) attested that the proposed algorithm yields a competitive performance for brain glioblastoma characterization (necrosis, tumor core, and edema) among several competing methods.
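Illustrative sketch (not from the paper): rank-two NMF clustering, as used above, can be pictured on random data by factoring a nonnegative matrix into two components and assigning each row to the component with the larger weight; the matrix below is a synthetic stand-in for per-voxel multimodal intensities.

```python
# Illustration of rank-two NMF clustering: factor a nonnegative matrix M
# into W (n x 2) and H (2 x d), then assign each row to the dominant component.
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)
M = rng.random((500, 4))          # stand-in for per-voxel multimodal MRI intensities

nmf = NMF(n_components=2, init="nndsvda", max_iter=500, random_state=0)
W = nmf.fit_transform(M)          # mixing weights, shape (500, 2)

clusters = W.argmax(axis=1)       # each row goes to the component with larger weight
print(np.bincount(clusters))      # cluster sizes
```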

Dissertations / Theses on the topic "RDF dataset characterization and classification"

1

Sherif, Mohamed Ahmed Mohamed. "Automating Geospatial RDF Dataset Integration and Enrichment." Doctoral thesis, Universitätsbibliothek Leipzig, 2016. http://nbn-resolving.de/urn:nbn:de:bsz:15-qucosa-215708.

Abstract:
Over the last years, the Linked Open Data (LOD) has evolved from a mere 12 to more than 10,000 knowledge bases. These knowledge bases come from diverse domains including (but not limited to) publications, life sciences, social networking, government, media, and linguistics. Moreover, the LOD cloud also contains a large number of cross-domain knowledge bases such as DBpedia and Yago2. These knowledge bases are commonly managed in a decentralized fashion and contain partly overlapping information. This architectural choice has led to knowledge pertaining to the same domain being published by independent entities in the LOD cloud. For example, information on drugs can be found in Diseasome as well as DBpedia and Drugbank. Furthermore, certain knowledge bases such as DBLP have been published by several bodies, which in turn has led to duplicated content in the LOD. In addition, large amounts of geo-spatial information have been made available with the growth of the heterogeneous Web of Data. The concurrent publication of knowledge bases containing related information promises to become a phenomenon of increasing importance with the growth of the number of independent data providers. Enabling the joint use of the knowledge bases published by these providers for tasks such as federated queries, cross-ontology question answering and data integration is most commonly tackled by creating links between the resources described within these knowledge bases. Within this thesis, we spur the transition from isolated knowledge bases to enriched Linked Data sets where information can be easily integrated and processed. To achieve this goal, we provide concepts, approaches and use cases that facilitate the integration and enrichment of information with other data types that are already present on the Linked Data Web, with a focus on geo-spatial data. The first challenge that motivates our work is the lack of measures that use the geographic data for linking geo-spatial knowledge bases. This is partly due to the geo-spatial resources being described by means of vector geometry. In particular, discrepancies in granularity and error measurements across knowledge bases render the selection of appropriate distance measures for geo-spatial resources difficult. We address this challenge by evaluating existing literature for point set measures that can be used to measure the similarity of vector geometries. Then, we present and evaluate the ten measures that we derived from the literature on samples of three real knowledge bases. The second challenge we address in this thesis is the lack of automatic Link Discovery (LD) approaches capable of dealing with geo-spatial knowledge bases with missing and erroneous data. To this end, we present Colibri, an unsupervised approach that allows discovering links between knowledge bases while improving the quality of the instance data in these knowledge bases. A Colibri iteration begins by generating links between knowledge bases. Then, the approach makes use of these links to detect resources with probably erroneous or missing information. This erroneous or missing information detected by the approach is finally corrected or added. The third challenge we address is the lack of scalable LD approaches for tackling big geo-spatial knowledge bases. Thus, we present Deterministic Particle-Swarm Optimization (DPSO), a novel load balancing technique for LD on parallel hardware based on particle-swarm optimization.
We combine this approach with the Orchid algorithm for geo-spatial linking and evaluate it on real and artificial data sets. The lack of approaches for automatic updating of links of an evolving knowledge base is our fourth challenge. This challenge is addressed in this thesis by the Wombat algorithm. Wombat is a novel approach for the discovery of links between knowledge bases that relies exclusively on positive examples. Wombat is based on generalisation via an upward refinement operator to traverse the space of Link Specifications (LS). We study the theoretical characteristics of Wombat and evaluate it on different benchmark data sets. The last challenge addressed herein is the lack of automatic approaches for geo-spatial knowledge base enrichment. Thus, we propose Deer, a supervised learning approach based on a refinement operator for enriching Resource Description Framework (RDF) data sets. We show how we can use exemplary descriptions of enriched resources to generate accurate enrichment pipelines. We evaluate our approach against manually defined enrichment pipelines and show that our approach can learn accurate pipelines even when provided with a small number of training examples. Each of the proposed approaches is implemented and evaluated against state-of-the-art approaches on real and/or artificial data sets. Moreover, all approaches are peer-reviewed and published in a conference or a journal paper. Throughout this thesis, we detail the ideas, implementation and the evaluation of each of the approaches. Moreover, we discuss each approach and present lessons learned. Finally, we conclude this thesis by presenting a set of possible future extensions and use cases for each of the proposed approaches.
2

Arndt, Natanael, and Norman Radtke. "Quit diff: calculating the delta between RDF datasets under version control." Universität Leipzig, 2016. https://ul.qucosa.de/id/qucosa%3A15780.

Abstract:
Distributed actors working on a common RDF dataset regularly encounter the issue of comparing the status of one graph with another or, more generally, of synchronizing copies of a dataset. A versioning system helps to synchronize the copies of a dataset; combined with a difference calculation system, it is also possible to compare versions in a log and to determine in which version a certain statement was introduced or removed. In this demo we present Quit Diff, a tool to compare versions of a Git versioned quad store, while it is also applicable to simple unversioned RDF datasets. We are following an approach to abstract from differences on a syntactical level to differences on the level of the RDF data model, while we leave further semantic interpretation on the schema and instance level to specialized applications. Quit Diff can generate patches in various output formats and can be directly integrated in the distributed version control system Git, which provides a foundation for a comprehensive co-evolution work flow on RDF datasets.
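Illustrative sketch (not the Quit Diff implementation): the underlying idea of a delta at the level of the RDF data model, rather than the syntactical level, can be shown with rdflib by parsing two versions of a graph and reporting added and removed triples; the file names are placeholders.

```python
# Minimal sketch of an RDF-level delta between two versions of a dataset,
# assuming rdflib is installed; "v1.ttl" and "v2.ttl" are placeholder file names.
from rdflib import Graph

old, new = Graph(), Graph()
old.parse("v1.ttl", format="turtle")
new.parse("v2.ttl", format="turtle")

added = set(new) - set(old)       # triples present only in the new version
removed = set(old) - set(new)     # triples present only in the old version

# Note: blank nodes are compared by identifier, so bnode-heavy graphs would
# need an isomorphism-aware diff instead of plain set difference.
for s, p, o in added:
    print("+", s, p, o)
for s, p, o in removed:
    print("-", s, p, o)
```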
3

Arndt, Natanael, Norman Radtke, and Michael Martin. "Distributed collaboration on RDF datasets using Git." Universität Leipzig, 2016. https://ul.qucosa.de/id/qucosa%3A15781.

Abstract:
Collaboration is one of the most important topics regarding the evolution of the World Wide Web and thus also for the Web of Data. In scenarios of distributed collaboration on datasets it is necessary to provide support for multiple different versions of datasets to exist simultaneously, while also providing support for merging diverged datasets. In this paper we present an approach that uses SPARQL 1.1 in combination with the version control system Git, that creates commits for all changes applied to an RDF dataset containing multiple named graphs. Further the operations provided by Git are used to distribute the commits among collaborators and merge diverged versions of the dataset. We show the advantages of (public) Git repositories for RDF datasets and how this represents a way to collaborate on RDF data and consume it. With SPARQL 1.1 and Git in combination, users are given several opportunities to participate in the evolution of RDF data.
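Illustrative sketch (not the authors' tooling): the combination of SPARQL 1.1 Update and Git described above can be approximated by applying an update to a dataset of named graphs, re-serializing it, and committing the file; rdflib and an existing local Git repository are assumed, and the paths and IRIs are placeholders.

```python
# Hedged sketch: apply a SPARQL 1.1 Update to a dataset with named graphs,
# serialize it, and record the change as a Git commit. Paths/IRIs are placeholders.
import subprocess
from rdflib import Dataset

ds = Dataset()
ds.parse("data.trig", format="trig")      # dataset with multiple named graphs

ds.update("""
    PREFIX ex: <http://example.org/>
    INSERT DATA {
        GRAPH ex:people { ex:alice ex:knows ex:bob . }
    }
""")

ds.serialize(destination="data.trig", format="trig")
subprocess.run(["git", "add", "data.trig"], check=True)
subprocess.run(["git", "commit", "-m", "Insert ex:alice ex:knows ex:bob"], check=True)
```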
4

Soderi, Mirco. "Semantic models for the modeling and management of big data in a smart city environment." Doctoral thesis, 2021. http://hdl.handle.net/2158/1232245.

Abstract:
The overall purpose of this research has been the building or the improvement of semantic models for the representation of data related to smart cities and smart industries, in such a way that it could also be possible to build context-rich, user-oriented, efficient and effective applications based on such data. In some more detail, one of the key purposes has been the modelling of structural and functioning aspects of urban mobility and the production of instances exploiting Open Street Map, which, once integrated with traffic sensor data, has led to the building and displaying of real-time traffic reconstructions at a city level. A second key purpose has been the modelling of the Internet of Things, which today allows sensing devices that are deployed in a given area or along a given path and that are of a given type to be identified seamlessly and efficiently, and the real-time data that they produce to be inspected through a user-oriented Web application, namely the Service Map. A pragmatic approach to the modelling has been followed, always taking into consideration the best practices of semantic modelling on one side, so that a clean, comprehensive and understandable model could result, and the reality of the data at hand and of the applicative requirements on the other side. The identification of architectures and methods that could grant efficiency and scalability in data access has also been a primary purpose of this research, which has led to the definition and implementation of a federation of Service Maps, namely the Super Service Map. The architecture is fully distributed: each Super Service Map has a local list of the actual Service Maps with relevant metadata, exposes the same interface as actual Service Maps, and forwards requests and builds merged responses, also implementing security and caching mechanisms. The identification of technologies, tools, and methods for presenting the data in a user-friendly manner has also been a relevant part of this research, and it has led, among other things, to the definition and implementation of a client-server architecture and a Web interface in the Snap4City platform for the building, management, and displaying of synoptic templates and instances, thanks to which users can securely display and interact with different types of data. Finally, some effort has been made towards the automatic classification of RDF datasets as to their structures and purposes, based on the computation of metrics through SPARQL queries and on the application of dimensionality reduction and clustering techniques. A Web portal is available where directories, datasets, metrics, and computations can be inspected, even in real time.
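Illustrative sketch (not from the thesis): the final part of the abstract, classifying RDF datasets from SPARQL-computed metrics followed by clustering, is the most directly related to the topic of this list. A minimal version with three toy metrics and placeholder file names could look as follows, assuming rdflib and scikit-learn.

```python
# Hedged sketch: characterize RDF datasets by simple SPARQL-derived metrics
# (triples, distinct classes, distinct predicates) and cluster the metric vectors.
# File names are placeholders; the thesis uses a richer metric set.
import numpy as np
from rdflib import Graph
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

METRIC_QUERIES = {
    "triples":    "SELECT (COUNT(*) AS ?n) WHERE { ?s ?p ?o }",
    "classes":    "SELECT (COUNT(DISTINCT ?c) AS ?n) WHERE { ?s a ?c }",
    "predicates": "SELECT (COUNT(DISTINCT ?p) AS ?n) WHERE { ?s ?p ?o }",
}

def metrics(path):
    g = Graph()
    g.parse(path, format="turtle")
    # Each query returns a single row whose first binding is the count.
    return [int(next(iter(g.query(q)))[0]) for q in METRIC_QUERIES.values()]

paths = ["dataset_a.ttl", "dataset_b.ttl", "dataset_c.ttl", "dataset_d.ttl"]
X = StandardScaler().fit_transform(np.array([metrics(p) for p in paths]))

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
for path, label in zip(paths, labels):
    print(path, "-> cluster", label)
```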

Book chapters on the topic "RDF dataset characterization and classification"

1

Murtovi, Alnis, Alexander Bainczyk, and Bernhard Steffen. "Forest GUMP: A Tool for Explanation." In Tools and Algorithms for the Construction and Analysis of Systems, 314–31. Cham: Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-030-99527-0_17.

Abstract:
In this paper, we present Forest GUMP (for Generalized, Unifying Merge Process), a tool for providing tangible experience with three concepts of explanation. Besides the well-known model explanation and outcome explanation, Forest GUMP also supports class characterization, i.e., the precise characterization of all samples with the same classification. Key technology to achieve these results is algebraic aggregation, i.e., the transformation of a Random Forest into a semantically equivalent, concise white-box representation in terms of Algebraic Decision Diagrams (ADDs). The paper sketches the method and illustrates the use of Forest GUMP with an illustrative example taken from the literature. This way, readers should acquire an intuition about the tool and the way it should be used to increase the understanding not only of the considered dataset, but also of the character of Random Forests and the ADD technology, here enriched to comprise infeasible path elimination.
2

García-Álvarez, David, Javier Lara Hinojosa, and Jaime Quintero Villaraso. "Global General Land Use Cover Datasets with a Single Date." In Land Use Cover Datasets and Validation Tools, 269–86. Cham: Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-030-90998-7_14.

Abstract:
Global general Land Use and Land Cover (LUC) datasets map all land uses and covers across the globe, without focusing on any specific use or cover. This chapter only reviews those datasets available for one single date, which have not been updated over time. Seven different datasets are described in detail. Two others were identified but are not included in this review because of their coarseness, which limits their utility: Mathews Global Vegetation/Land Use and GMRCA LULC. The first experiences in global LUC mapping date back to the 1990s, when leading research groups in the field produced the first global LUC maps at fine scales of 1 km spatial resolution: the UMD LC Classification and the Global Land Cover Characterization. Not long afterwards, in an attempt to build on these experiences and take them a stage further, an international partnership produced GLC2000 for the reference year 2000. These initial LUC mapping projects produced maps for just one reference year and were not continued or updated over time. Subsequent projects have mostly focused on the production of time series of global LUC maps, which allow us to study LUC change over time (see Chapter “Global General Land Use Cover Datasets with a Time Series of Maps”). As a result, there are relatively few single-date global LUC maps for recent years of reference. The latest projects and initiatives producing global LUC maps for single dates have focused on improving the accuracy of global LUC mapping and the use of crowdsourcing production strategies. The Geo-Wiki Hybrid and GLC-SHARE datasets built on the previous research in a bid to obtain more accurate global LUC maps by merging the data from existing datasets. OSM LULC is an ongoing test project that is trying to produce a global LUC map cheaply, using crowdsourced information provided by the Open Street Maps community. The other dataset reviewed here is the LADA LUC Map, which was developed for a specific thematic project (Land Degradation Assessment in Drylands). This dataset is not comparable to the others reviewed in this chapter in terms of its purpose and nature, as is clear from its coarse spatial resolution (5 arc minutes). We therefore believe that this dataset should not be considered part of initiatives to produce more accurate, more detailed land use maps at a global level.
3

Van Miert, Sabine, Jan Creylman, and Geert R. Verheyen. "Mining a Nanoparticle Dataset, Compiled Within the MODENA-COST Action." In Research Anthology on Synthesis, Characterization, and Applications of Nanomaterials, 1706–24. IGI Global, 2021. http://dx.doi.org/10.4018/978-1-7998-8591-7.ch071.

Abstract:
Engineered nanomaterials (ENM) have new or enhanced physico-chemical properties compared to their micron-sized counterparts, but may also have an increased toxic potential. Animal and in vitro testing are typically employed to investigate the toxic effects of (nano)materials. The sheer number of ENMs and their physico-chemical parameters make it impossible to rely only on in vivo and in vitro testing, and modelling technologies are also deployed to find relationships between ENM parameters and toxicity. A heterogeneous dataset containing information on 192 nanoparticle endpoints was compiled within the MODENA COST-Action consortium. Here, the available data was mined to identify relationships between nanoparticle properties and cell death as measured with four cytotoxicity assays. ANOVA, collinearity analyses, and classification and regression trees gave indications of potential relations between the NP properties and toxicity, but could not deliver a robust model. More information and datapoints are necessary to build well-validated models.
4

Fatima, Kiran, and Hammad Majeed. "Texture-Based Evolutionary Method for Cancer Classification in Histopathology." In Medical Imaging, 558–72. IGI Global, 2017. http://dx.doi.org/10.4018/978-1-5225-0571-6.ch021.

Abstract:
Real-world histology tissue textures, owing to their non-homogeneous nature and unorganized spatial intensity variations, are complex to analyze and classify. The major challenge in solving pathological problems is the inherent complexity due to high intra-class variability and low inter-class variation in the texture of histology samples. The development of computational methods to assist pathologists in the characterization of these tissue samples would have great diagnostic and prognostic value. In this chapter, an optimized texture-based evolutionary framework is proposed to provide assistance to pathologists for the classification of benign and pre-malignant tumors. The proposed framework investigates the imperative role of RGB color channels for the discrimination of cancer grades or subtypes, explores higher-order statistical features at the image level, and implements an evolution-based optimization scheme for feature selection and classification. The highest classification accuracy of 99.06% is achieved on the meningioma dataset and 90% on the breast cancer dataset through a Quadratic SVM classifier.
5

Fatima, Kiran, and Hammad Majeed. "Texture-Based Evolutionary Method for Cancer Classification in Histopathology." In Advances in Data Mining and Database Management, 55–69. IGI Global, 2016. http://dx.doi.org/10.4018/978-1-4666-9767-6.ch004.

Abstract:
Real-world histology tissue textures, owing to their non-homogeneous nature and unorganized spatial intensity variations, are complex to analyze and classify. The major challenge in solving pathological problems is the inherent complexity due to high intra-class variability and low inter-class variation in the texture of histology samples. The development of computational methods to assist pathologists in the characterization of these tissue samples would have great diagnostic and prognostic value. In this chapter, an optimized texture-based evolutionary framework is proposed to provide assistance to pathologists for the classification of benign and pre-malignant tumors. The proposed framework investigates the imperative role of RGB color channels for the discrimination of cancer grades or subtypes, explores higher-order statistical features at the image level, and implements an evolution-based optimization scheme for feature selection and classification. The highest classification accuracy of 99.06% is achieved on the meningioma dataset and 90% on the breast cancer dataset through a Quadratic SVM classifier.
6

Zhirnov, Ivan, and Dean Kouprianoff. "Acoustic Diagnostic of Laser Powder Bed Fusion Processes." In Advances in Transdisciplinary Engineering. IOS Press, 2022. http://dx.doi.org/10.3233/atde220173.

Abstract:
Online monitoring of Laser Powder Bed Fusion is critical to advance the technology and its applications. Many studies have shown that the acoustic signal from the laser powder bed fusion process contains a large amount of information about the process condition. In this research, we used an acoustic system for the in-situ characterization of a wide variety of different single-track geometries. The internal acoustic system includes a microphone and an accelerometer. The melting mode, cross-sectional shape and dimensions of Ti6Al4V single tracks at different process parameters are presented. We have established a correlation between track geometry, internal defects and acoustic signals. The parameters are varied and tested against the acoustic frequency measurements to determine the sensitivity. We determined the patterns of signal behaviour in the event of anomalies (spatter, balling, pores, undercut). The characteristic features of the process are traced to a commercial machine. A well-described dataset with correlated monitoring data and signal and track properties was obtained and can be used for building classification models and quality prediction. All this is aimed at creating a database of experimental data that will be key for LPBF digitalization and control, allowing real-time control of the process to optimize part quality and, more importantly, help with decision-making algorithms.

Conference papers on the topic "RDF dataset characterization and classification"

1

Katterbauer, Klemens, Alberto Marsala, Yanhui Zhang, and Ibrahim Hoteit. "Artificial Intelligence Aided Geologic Facies Classification in Complex Carbonate Reservoirs." In SPE Middle East Oil & Gas Show and Conference. SPE, 2021. http://dx.doi.org/10.2118/204705-ms.

Abstract:
Facies classification for complex reservoirs is an important step in characterizing reservoir heterogeneity and determining reservoir properties and fluid flow patterns. Predicting rock facies automatically and reliably from well log and associated reservoir measurements is therefore essential to obtain accurate reservoir characterization for field development in a timely manner. In this study, we present an artificial intelligence (AI) aided rock facies classification framework for complex reservoirs based on well log measurements. We generalize the AI-aided classification workflow into five major steps including data collection, preprocessing, feature engineering, model learning cycle, and model prediction. In particular, we automate the process of facies classification focusing on the use of a deep learning technique, convolutional neural network, which has shown outstanding performance in many scientific applications involving pattern recognition and classification. For performance analysis, we also compare the developed model with a support vector machine approach. We examine the AI-aided workflow on a large open dataset acquired from a real complex reservoir in Alberta. The dataset contains a collection of well-log measurements over a couple of thousands of wells. The experimental results demonstrate the high efficiency and scalability of the developed framework for automatic facies classification with reasonable accuracy. This is particularly useful when quick facies prediction is necessary to support real-time decision making. The AI-aided framework is easily implementable and expandable to other reservoir applications.
2

Lefranc, Marie, Zikri Bayraktar, Morten Kristensen, Hedi Driss, Isabell Le Nir, Philippe Marza, and Josselin Kherroubi. "DEEP-LEARNING-BASED AUTOMATED SEDIMENTARY GEOMETRY CHARACTERIZATION FROM BOREHOLE IMAGES." In 2021 SPWLA 62nd Annual Logging Symposium Online. Society of Petrophysicists and Well Log Analysts, 2021. http://dx.doi.org/10.30632/spwla-2021-0082.

Abstract:
Sedimentary geometry on borehole images usually summarizes the arrangement of bed boundaries, erosive surfaces, cross bedding, sedimentary dip, and/or deformed beds. The interpretation, very often manual, requires a good level of expertise, is time consuming, can suffer from user bias, and become very challenging when dealing with highly deviated wells. Bedform geometry interpretation from crossbed data is rarely completed from a borehole image. The purpose of this study is to develop an automated method to interpret sedimentary structures, including the bedform geometry, from borehole images. Automation is achieved in this unique interpretation methodology using deep learning. The first task comprised the creation of a training dataset of 2D borehole images. This library of images was then used to train machine learning (ML) models. Testing different architectures of convolutional neural networks (CNN) showed the ResNet architecture to give the best performance for the classification of the different sedimentary structures. The validation accuracy was very high, in the range of 93–96%. To test the developed method, additional logs of synthetic data were created as sequences of different sedimentary structures (i.e., classes) associated with different well deviations, with addition of gaps. The model was able to predict the proper class and highlight the transitions accurately.
3

Masoud, Mohamed, W. Scott Meddaugh, Masoud Eljaroshi, and Khaled Elghanduri. "Enhanced and Rock Typing-Based Reservoir Characterization of the Palaeocene Harash Carbonate Reservoir-Zelten Field-Sirte Basin-Libya." In SPE Annual Technical Conference and Exhibition. SPE, 2021. http://dx.doi.org/10.2118/205971-ms.

Abstract:
The Harash Formation was previously known as the Ruaga A and is considered to be one of the most productive reservoirs in the Zelten field in terms of reservoir quality, areal extent, and hydrocarbon quantity. To date, nearly 70 wells have been drilled targeting the Harash reservoir. A few wells initially produced naturally, but most had to be stimulated, which reflected the field drilling and development plan. Identification of the Harash reservoir rock types was essential for understanding the reservoir geology, implementing the reservoir development drilling program, constructing representative reservoir models, calculating hydrocarbon volumes, and matching historical pressure-production data in the flow modelling processes. The objectives of this study are to predict the permeability at un-cored wells and unsampled locations, to classify the reservoir rocks into main rock types, to build robust reservoir property models in which static petrophysical properties and fluid properties are assigned to each identified rock type, and to assess the existing vertical and lateral heterogeneity within the Palaeocene Harash carbonate reservoir. Initially, an objective-based workflow was developed by generating a training dataset from the open-hole logs and core samples of six wells, analyzed with conventional and special core analysis. The developed dataset was used to predict permeability at cored wells through a K-mod model that applies Neural Network Analysis (NNA) and Declustering (DC) algorithms to generate representative permeability and electro-facies. Equal statistical weights were given to log responses without analytical supervision, taking into account the significant log response variations. The core data were grouped on a petrophysical basis to compute pore throat size, aiming at deriving and extending the interpretation process from the core to the log domain using an Indexation and Probabilities of Self-Organized Maps (IPSOM) classification model to develop a reliable representation of rock type classification at the well scale. Permeability and rock types derived from the open-hole logs and core sample analysis are the main outputs of the K-mod and IPSOM classification models. The results were propagated to more than 70 un-cored wells. Rock typing techniques were also conducted to classify the Harash reservoir rocks in a consistent manner. Depositional rock typing using a stratigraphic modified Lorenz plot and electro-facies suggests three different rock types that are probably linked to three flow zones. The defined rock types are dominated by specific reservoir parameters. Electro-facies enable subdivision of the formation into petrophysical groups to which properties are assigned and which are characterized by dynamic behavior and rock-fluid interaction. Capillary pressure and relative permeability data proved the complexity in rock capillarity. Consequently, Swc is strongly rock-type dependent. The use of a consistent, representative petrophysical rock type classification led to a significant improvement of the geological and flow models.
4

Abbas, Mohammed A., and Watheq J. Al-Mudhafar. "Lithofacies Classification of Carbonate Reservoirs Using Advanced Machine Learning: A Case Study from a Southern Iraqi Oil Field." In Offshore Technology Conference. OTC, 2021. http://dx.doi.org/10.4043/31114-ms.

Abstract:
Estimating rock facies from petrophysical logs in non-cored wells in complex carbonates represents a crucial task for improving reservoir characterization and field development. Thus, it is essential to identify the lithofacies that discriminate the reservoir intervals based on their flow and storage capacity. In this paper, an innovative procedure is adopted for lithofacies classification using data-driven machine learning in a well from the Mishrif carbonate reservoir in the giant Majnoon oil field, Southern Iraq. The Random Forest method was adopted for lithofacies classification using well logging data in a cored well to predict their distribution in other non-cored wells. Furthermore, three advanced statistical algorithms (Logistic Boosting Regression, Bagging Multivariate Adaptive Regression Spline, and Generalized Boosting Modeling) were implemented and compared to the Random Forest approach to attain the most realistic lithofacies prediction. The dataset includes the measured discrete lithofacies distribution and the original log curves of caliper, gamma ray, neutron porosity, bulk density, sonic, and deep and shallow resistivity, all available over the entire reservoir interval. Prior to applying the four classification algorithms, random subsampling cross-validation was conducted on the dataset to produce training and testing subsets for modeling and prediction, respectively. After predicting the discrete lithofacies distribution, the Confusion Table and the Correct Classification Rate Index (CCI) were employed as further criteria to analyze and compare the effectiveness of the four classification algorithms. The results of this study revealed that Random Forest was more accurate in lithofacies classification than the other techniques. It led to excellent matching between the observed and predicted discrete lithofacies, attaining 100% CCI on the training subset and 96.67% CCI on the validation subset. Further validation of the resulting facies model was conducted by comparing each of the predicted discrete lithofacies with the available ranges of porosity and permeability obtained from the NMR log. We observed that the rudist-dominated lithofacies correlate with rocks of higher porosity and permeability, whereas the argillaceous lithofacies correlate with rocks of lower porosity and permeability. Additionally, these high and low ranges of permeability were later compared with the oil rate obtained from the PLT log data and were found to correlate well with the high- and low-oil-rate logs, respectively. In conclusion, high-quality estimation of lithofacies in non-cored intervals and wells is a crucial reservoir characterization task in order to obtain meaningful permeability-porosity relationships and capture realistic reservoir heterogeneity. The application of machine learning techniques drives down costs, provides time savings, and allows for uncertainty mitigation in lithofacies classification and prediction. The entire workflow was done in R, an open-source statistical computing language. It can easily be applied to other reservoirs to attain a similarly improved overall reservoir characterization.
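Illustrative sketch (not from the paper, which used R): the loop of random subsampling, Random Forest classification, confusion table, and correct classification rate described above can be pictured with scikit-learn on synthetic stand-ins for the listed well-log curves.

```python
# Hedged sketch of lithofacies classification with a Random Forest:
# synthetic stand-ins for the well-log curves named in the abstract.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix, accuracy_score

rng = np.random.default_rng(1)
logs = ["CALI", "GR", "NPHI", "RHOB", "DT", "RDEEP", "RSHAL"]
X = pd.DataFrame(rng.normal(size=(600, len(logs))), columns=logs)
y = rng.integers(0, 3, size=600)           # three hypothetical lithofacies codes

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=1)
rf = RandomForestClassifier(n_estimators=300, random_state=1).fit(X_tr, y_tr)

pred = rf.predict(X_te)
print(confusion_matrix(y_te, pred))                         # the "confusion table"
print("CCI = %.2f%%" % (100 * accuracy_score(y_te, pred)))  # correct classification rate
```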
5

Luna, Santiago, and Christian Salamea. "Noise pattern definition methodology for noise cancellation in coughs signals using an adaptive filter." In Intelligent Human Systems Integration (IHSI 2022) Integrating People and Intelligent Systems. AHFE International, 2022. http://dx.doi.org/10.54941/ahfe100983.

Abstract:
This work proposes a methodology to create a reference signal (noise pattern) that can be used in adaptive filtering to minimize the noise produced in a cough recording. This noise pattern is able to incorporate information on all the types of noise that contaminate a recorded cough signal. The reference signal has been created using a dataset of cough audio signals. The signal-to-noise ratio (SNR) has been used as the evaluation metric for the filtering quality. A system able to minimize the noise across all the recorded cough files has been created using this methodology with an adaptive filtering technique, obtaining results close to 0 dB and demonstrating the efficiency and generalization of the proposed technique, which is part of the preprocessing phase in a system for the characterization and classification of cough recordings.
6

Katterbauer, Klemens, Waleed Dokhon, Fahmi Aulia, and Mohanad Fahmi. "A Novel Corrosion Monitoring and Prediction System Utilizing Advanced Artificial Intelligence." In SPE Middle East Oil & Gas Show and Conference. SPE, 2021. http://dx.doi.org/10.2118/204580-ms.

Abstract:
Corrosion in pipes is a major challenge for the oil and gas industry, as metal loss and solid buildup in the pipe may impede flow assurance or hinder well performance. Therefore, managing well integrity by stringently monitoring and predicting corrosion of the well is quintessential for maximizing the productive life of the wells and minimizing the risk of well control issues, which in turn minimizes costs related to corrosion log allocation and workovers. We present a novel supervised learning method for a corrosion monitoring and prediction system in real time. The system analyzes in real time various parameters covering major causes of corrosion, such as salt water, hydrogen sulfide, CO2, well age, fluid rate, metal losses, and other parameters. The data are preprocessed with a filter to remove outliers and inconsistencies. The filter cross-correlates the various parameters to determine the input weights for the deep learning classification techniques. The wells are classified by the framework, based on the data, in terms of their need for a workover, utilizing a two-dimensional segmentation approach for the severity as well as the risk of each well. The framework was trialed on a probabilistically determined large dataset of a group of wells with an assumed metal loss. The framework was first trained on the training dataset and then evaluated on a different test well set. The training results were robust, with a strong ability to estimate metal losses and corrosion classification. Segmentation on the test wells showed strong capabilities, while facing challenges when the quantified risk for a well is medium. The novel framework presents a data-driven approach to the fast and efficient characterization of wells as potential candidates for corrosion logs and workovers. The framework can be easily expanded with new well data to improve classification.
7

Lin, Tao, Mokhles Mezghani, Chicheng Xu, and Weichang Li. "Machine Learning for Multiple Petrophysical Properties Regression Based on Core Images and Well Logs in a Heterogenous Reservoir." In SPE Annual Technical Conference and Exhibition. SPE, 2021. http://dx.doi.org/10.2118/206089-ms.

Abstract:
Reservoir characterization requires accurate prediction of multiple petrophysical properties such as bulk density (or acoustic impedance), porosity, and permeability. However, this remains a big challenge in heterogeneous reservoirs due to significant diagenetic impacts, including dissolution, dolomitization, cementation, and fracturing. Most well logs lack the resolution to obtain detailed rock properties in a heterogeneous formation. Therefore, it is pertinent to integrate core images into the prediction workflow. This study presents a new approach to the problem of obtaining high-resolution multiple petrophysical properties by combining machine learning (ML) algorithms and computer vision (CV) techniques. The methodology can be used to automate the process of core data analysis with a minimum number of plugs, thus reducing human effort and cost and improving accuracy. The workflow consists of conditioning and extracting features from core images, correlating well logs and core analysis with those features to build ML models, and applying the models on new cores for petrophysical property predictions. The core images are preprocessed and analyzed using color models and texture recognition to extract image characteristics and core textures. The image features are then aggregated into a profile in depth, resampled, and aligned with well logs and core analysis. The ML regression models, including classification and regression trees (CART) and a deep neural network (DNN), are trained and validated on the filtered training samples of relevant features and target petrophysical properties. The models are then tested on a blind test dataset to evaluate the prediction performance for the target petrophysical properties of grain density, porosity, and permeability. The histogram profile of each target property is computed to analyze the data distribution. The feature vectors are extracted from CV analysis of core images and gamma ray logs. The importance of each feature for each individual target is generated by the CART model, which may be used to reduce model complexity in future model building. The model performances are evaluated and compared for each target. We achieved reasonably good correlation and accuracy with the models, for example, porosity R2=49.7% and RMSE=2.4 p.u., and logarithmic permeability R2=57.8% and RMSE=0.53. The field case demonstrates that inclusion of core image attributes can improve petrophysical regression in heterogeneous reservoirs. It can be extended to a multi-well setting to generate vertical distributions of petrophysical properties, which can be integrated into reservoir modeling and characterization. Machine learning algorithms can help automate the workflow and can be flexibly adjusted to take various inputs for prediction.