
Journal articles on the topic 'Metadata Features'



Consult the top 50 journal articles for your research on the topic 'Metadata Features.'


You can also download the full text of each publication as a PDF and read its abstract online whenever these are available in the metadata.

Browse journal articles from a wide variety of disciplines and organise your bibliography correctly.

1

Odier, Jérôme, Fabian Lambert, and Jérôme Fulachier. "The ATLAS Metadata Interface (AMI) 2.0 metadata ecosystem: new design principles and features." EPJ Web of Conferences 214 (2019): 05046. http://dx.doi.org/10.1051/epjconf/201921405046.

Full text
Abstract:
ATLAS Metadata Interface (AMI) is a generic ecosystem for metadata aggregation, transformation and cataloging. Benefiting from 18 years of feedback in the LHC context, the second major version was recently released. This paper describes the design choices and their benefits for providing high-level metadata-dedicated features. In particular, it describes the Metadata Querying Language (MQL), a domain-specific language that allows querying databases without knowing the relations between entities, and the AMI Web framework.
2

Li, Yaping. "Glowworm Swarm Optimization Algorithm- and K-Prototypes Algorithm-Based Metadata Tree Clustering." Mathematical Problems in Engineering 2021 (February 9, 2021): 1–10. http://dx.doi.org/10.1155/2021/8690418.

Full text
Abstract:
The main objective of this paper is to present a new clustering algorithm for metadata trees based on the K-prototypes algorithm, the GSO (glowworm swarm optimization) algorithm, and maximal frequent path (MFP). Metadata tree clustering involves computing a feature vector for each metadata tree and then clustering those feature vectors, so traditional data clustering methods are not directly applicable to metadata trees. As the main method for computing feature vectors, MFP suffers from high computational complexity and loss of key information. The K-prototypes algorithm is generally suitable for clustering mixed-attribute data such as these feature vectors, but it is sensitive to the initial cluster centers. Compared with other swarm intelligence algorithms, the GSO algorithm offers a more efficient global search, is well suited to multimodal problems, and is therefore useful for optimizing the K-prototypes algorithm. To address the clustering of metadata tree structures in terms of clustering accuracy and high data dimensionality, this paper combines GSO, K-prototypes, and MFP to design a new metadata-structure clustering method. Firstly, MFP is used to describe metadata tree features, and a key parameter for categorical data is introduced into the MFP feature vector to improve its accuracy in describing the metadata tree; secondly, GSO is combined with K-prototypes to design GSOKP for clustering feature vectors that contain both numeric and categorical data, so as to improve clustering accuracy; finally, tests are conducted on a set of metadata trees. The experimental results show that the designed metadata tree clustering method GSOKP-FP has advantages in both clustering accuracy and time complexity.
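
The pipeline the abstract describes can be sketched compactly. Below is a minimal, illustrative rendering of the GSOKP idea in Python (not the authors' code, and omitting the MFP feature-extraction step): a glowworm-style search picks good initial centers, which then seed K-prototypes iterations over mixed numeric/categorical feature vectors. Categorical attributes are assumed to be integer-encoded; the fitness, step rule and parameter values are simplifying assumptions.

```python
import numpy as np

def kproto_cost(X_num, X_cat, centers_num, centers_cat, gamma=1.0):
    # mixed-type distance: squared Euclidean (numeric) + gamma * mismatch count (categorical)
    d_num = ((X_num[:, None, :] - centers_num[None, :, :]) ** 2).sum(-1)
    d_cat = (X_cat[:, None, :] != centers_cat[None, :, :]).sum(-1)
    d = d_num + gamma * d_cat
    return d.min(1).sum(), d.argmin(1)

def gso_init(X_num, X_cat, k, n_worms=20, iters=30, step=0.1, seed=0):
    # glowworm-style search over candidate center sets; luciferin = -clustering cost
    rng = np.random.default_rng(seed)
    idx = rng.integers(0, len(X_num), (n_worms, k))
    pos, cat = X_num[idx].astype(float), X_cat[idx]
    lucif = np.array([-kproto_cost(X_num, X_cat, p, c)[0] for p, c in zip(pos, cat)])
    for _ in range(iters):
        for i in range(n_worms):
            brighter = np.flatnonzero(lucif > lucif[i])
            if brighter.size:
                j = rng.choice(brighter)              # move toward a brighter glowworm
                pos[i] += step * (pos[j] - pos[i])
                lucif[i] = -kproto_cost(X_num, X_cat, pos[i], cat[i])[0]
    best = lucif.argmax()
    return pos[best].copy(), cat[best].copy()

def gsokp(X_num, X_cat, k, iters=20):
    centers_num, centers_cat = gso_init(X_num, X_cat, k)
    for _ in range(iters):
        _, labels = kproto_cost(X_num, X_cat, centers_num, centers_cat)
        for c in range(k):
            members = labels == c
            if members.any():
                centers_num[c] = X_num[members].mean(0)
                # mode of each categorical column becomes the new prototype value
                centers_cat[c] = [np.bincount(col).argmax() for col in X_cat[members].T]
    return labels

rng = np.random.default_rng(1)
X_num = rng.normal(size=(60, 2))
X_cat = rng.integers(0, 3, size=(60, 2))
print(gsokp(X_num, X_cat, k=3)[:10])
```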
3

Scherp, Ansgar, Carsten Saathoff, and Stefan Scheglmann. "A Pattern System for Describing the Semantics of Structured Multimedia Documents." International Journal of Semantic Computing 06, no. 03 (September 2012): 263–88. http://dx.doi.org/10.1142/s1793351x12400089.

Full text
Abstract:
Today's metadata models and metadata standards often focus on a specific media type only, lack combinability with other metadata models, or are limited with respect to the features they support. They are thus not sufficient to describe the semantics of rich, structured multimedia documents. To overcome these limitations, we have developed a comprehensive model for representing multimedia metadata, the Multimedia Metadata Ontology (M3O). The M3O was developed through an extensive analysis of related work and abstracts from the features of existing metadata models and metadata standards. It is based on the foundational ontology DOLCE+DnS Ultralight and makes use of ontology design patterns. The M3O serves as a generic modeling framework for integrating existing metadata models and metadata standards rather than replacing them. As such, the M3O can be used internally as a semantic data model within complex multimedia applications such as authoring tools or multimedia management systems. To make use of the M3O in concrete multimedia applications, a generic application programming interface (API) has been implemented on top of a sophisticated persistence layer that provides explicit support for ontology design patterns. To demonstrate the applicability of the M3O API, we have integrated and applied it with our SemanticMM4U framework for the multi-channel generation of semantically annotated multimedia documents.
4

Rastogi, Ajay, Monica Mehrotra, and Syed Shafat Ali. "Effective Opinion Spam Detection: A Study on Review Metadata Versus Content." Journal of Data and Information Science 5, no. 2 (May 20, 2020): 76–110. http://dx.doi.org/10.2478/jdis-2020-0013.

Full text
Abstract:
Purpose: This paper aims to analyze the effectiveness of two major types of features—metadata-based (behavioral) and content-based (textual)—in opinion spam detection.

Design/methodology/approach: Based on spam-detection perspectives, our approach works in three settings: review-centric (spam detection), reviewer-centric (spammer detection) and product-centric (spam-targeted product detection). Besides this, to negate any kind of classifier bias, we employ four classifiers to get a better and unbiased reflection of the obtained results. In addition, we propose a new set of features, which are compared against those of some well-known related works. The experiments performed on two real-world datasets show the effectiveness of different features in opinion spam detection.

Findings: Our findings indicate that behavioral features are more efficient as well as more effective than textual ones in detecting opinion spam across all three settings. In addition, models trained on hybrid features produce results much closer to those trained on behavioral features than on textual ones, further establishing the superiority of behavioral features as dominating indicators of opinion spam. The features used in this work provide an improvement over features utilized in other related works. Furthermore, a computation time analysis of the feature extraction phase shows the better cost efficiency of behavioral features over textual ones.

Research limitations: The analyses conducted in this paper are limited to two well-known datasets, viz., YelpZip and YelpNYC of Yelp.com.

Practical implications: The results obtained in this paper can be used to improve the detection of opinion spam; researchers may work on improving and developing feature engineering and selection techniques focused more on metadata information.

Originality/value: To the best of our knowledge, this study is the first of its kind to consider three perspectives (review-, reviewer- and product-centric) and four classifiers in analyzing the effectiveness of opinion spam detection using two major types of features. This study also introduces some novel features, which help to improve the performance of opinion spam detection methods.
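
As a concrete illustration of the behavioral-versus-textual comparison (a toy sketch with invented data and feature names, not the paper's pipeline or datasets), one can train several classifiers on each feature set and on their combination:

```python
import numpy as np
import scipy.sparse as sp
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

reviews = ["great product, works fine", "good value", "BUY NOW cheap cheap deal",
           "solid build quality", "amazing!!! best!!! buy!!!", "arrived on time"]
labels = np.array([0, 0, 1, 0, 1, 0])                     # 1 = spam (toy labels)
# illustrative behavioral features: reviews per day, rating deviation, burst flag
behavioral = sp.csr_matrix([[1, 0.2, 0], [2, 0.1, 0], [9, 2.5, 1],
                            [1, 0.3, 0], [8, 2.1, 1], [1, 0.4, 0]])
textual = TfidfVectorizer().fit_transform(reviews)        # content-based features

feature_sets = {"behavioral": behavioral, "textual": textual,
                "hybrid": sp.hstack([behavioral, textual]).tocsr()}
for name, X in feature_sets.items():
    for clf in (LogisticRegression(max_iter=1000), RandomForestClassifier(n_estimators=50)):
        f1 = cross_val_score(clf, X, labels, cv=2, scoring="f1").mean()
        print(f"{name:10s} {type(clf).__name__:22s} F1={f1:.2f}")
```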
5

Li, Chunqiu, and Shigeo Sugimoto. "Provenance description of metadata application profiles for long-term maintenance of metadata schemas." Journal of Documentation 74, no. 1 (January 8, 2018): 36–61. http://dx.doi.org/10.1108/jd-03-2017-0042.

Full text
Abstract:
Purpose: Provenance information is crucial for the consistent maintenance of metadata schemas over time. The purpose of this paper is to propose a provenance model named DSP-PROV to keep track of structural changes of metadata schemas.

Design/methodology/approach: The DSP-PROV model is developed by applying the general provenance description standard PROV of the World Wide Web Consortium to the Dublin Core Application Profile. The Metadata Application Profile of the Digital Public Library of America is selected as a case study for applying the DSP-PROV model. Finally, the paper evaluates the proposed model by comparing formal provenance description in DSP-PROV with semi-formal change-log description in English.

Findings: Formal provenance description in the DSP-PROV model has advantages over semi-formal provenance description in English for keeping metadata schemas consistent over time.

Research limitations/implications: The DSP-PROV model is applicable to tracking the structural changes of a metadata schema over time. Provenance description of other features of a metadata schema, such as vocabulary and encoding syntax, is not covered.

Originality/value: This study proposes a simple model for provenance description of the structural features of metadata schemas based on a few standards widely accepted on the Web, and shows the advantage of the proposed model over conventional semi-formal provenance description.
6

Kim, Jihyeok, Reinald Kim Amplayo, Kyungjae Lee, Sua Sung, Minji Seo, and Seung-won Hwang. "Categorical Metadata Representation for Customized Text Classification." Transactions of the Association for Computational Linguistics 7 (November 2019): 201–15. http://dx.doi.org/10.1162/tacl_a_00263.

Full text
Abstract:
The performance of text classification has improved tremendously with intelligently engineered neural models, especially those injecting categorical metadata as additional information, e.g., using user/product information for sentiment classification. This information has been used to modify parts of the model (e.g., word embeddings, attention mechanisms) so that results can be customized according to the metadata. We observe that current representation methods for categorical metadata, which are devised for human consumption, are not as effective as claimed in popular classification methods, and are outperformed even by a simple concatenation of categorical features in the final layer of the sentence encoder. We conjecture that categorical features are harder to represent for machine use, as the available context only indirectly describes the category, and even such context is often scarce (for tail categories). To this end, we propose using basis vectors to effectively incorporate categorical metadata in various parts of a neural model. This additionally decreases the number of parameters dramatically, especially when the number of categorical features is large. Extensive experiments on various datasets with different properties show that our method represents categorical metadata more effectively, customizes parts of the model (including unexplored ones), and greatly increases the performance of the model.
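
The basis-vector idea admits a very small sketch. The following (my paraphrase in numpy, not the released implementation; sizes and names are arbitrary) shows why it saves parameters: each category learns only mixture weights over a small shared basis instead of a free embedding:

```python
import numpy as np

n_categories, n_bases, dim = 10_000, 16, 128
rng = np.random.default_rng(0)
bases = rng.normal(size=(n_bases, dim))             # shared basis vectors (trainable in the real model)
logits = rng.normal(size=(n_categories, n_bases))   # per-category mixture logits (trainable)

def category_vector(cat_id):
    w = np.exp(logits[cat_id] - logits[cat_id].max())
    w /= w.sum()                                    # softmax over the bases
    return w @ bases                                # customized (dim,) vector for this category

# parameter count: 10_000*16 + 16*128 = 162_048 vs 10_000*128 = 1_280_000 for free embeddings
print(category_vector(42).shape)                    # (128,)
```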
7

Li, Fang, and Jie Zhang. "Case study: a metadata scheme for multi-type manuscripts for the T.D. Lee Archives Online." Library Hi Tech 32, no. 2 (June 10, 2014): 219–28. http://dx.doi.org/10.1108/lht-11-2013-0149.

Full text
Abstract:
Purpose – The purpose of this paper is to propose a solution for designing a metadata scheme for multi-type manuscripts, based on a comparison of various existing metadata schemes. Design/methodology/approach – The diversity of manuscript types is analysed. A descriptive scheme based on machine-readable MARC and a descriptive scheme based on metadata specifications are compared. User tasks and resource features are analysed. Several challenges are posed and resolved through the design and establishment of a metadata scheme for the T.D. Lee Archives Online. Findings – The paper clarifies an approach to developing a metadata scheme for multi-type manuscripts. Originality/value – From a multi-type perspective, this study designs a metadata scheme, establishes the element set and expands elements by studying a typical practice case. Useful suggestions for libraries, archives and museums are provided.
8

Gong, Minseo, Jae-Yoon Cheon, Young-Suk Park, Jeawon Park, and Jaehyun Choi. "User Musical Taste Prediction Technique Using Music Metadata and Features." International Journal of Multimedia and Ubiquitous Engineering 11, no. 8 (August 31, 2016): 163–70. http://dx.doi.org/10.14257/ijmue.2016.11.8.18.

Full text
9

Ahmed, Muhammad Waqas, and Muhammad Tanvir Afzal. "FLAG-PDFe: Features Oriented Metadata Extraction Framework for Scientific Publications." IEEE Access 8 (2020): 99458–69. http://dx.doi.org/10.1109/access.2020.2997907.

Full text
10

Tali, Dmitry, and Oleg Finko. "Cryptographic Recursive Control of Integrity of Metadata Electronic Documents. Part 2. Complex of Algorithms." Voprosy kiberbezopasnosti, no. 6(40) (2020): 32–47. http://dx.doi.org/10.21681/2311-3456-2020-06-32-47.

Full text
Abstract:
The purpose of the study is to develop a set of algorithms to increase the level of security of electronic document metadata under destructive influences from authorized users (insiders). Research methods: the principle of chained data recording, methods of the theory of algorithms, and theoretical provisions for building automated information systems for legally significant electronic document management. The result of the research: a complex of algorithms for cryptographic recursive 2-D control of the integrity of electronic document metadata has been developed. It is distinguished by the following features: 1) localization of modified metadata records (those showing signs of integrity violation); 2) identification of authorized users (insiders) who have carried out unauthorized modifications of electronic document metadata; 3) detection of collusion between trusted parties through mutual control of the results of their actions. The proposed solution makes it possible to implement cryptographic recursive two-dimensional control of the integrity of electronic document metadata. The use of chained data recording at the heart of the presented solution follows from the way departmental automated electronic document management systems operate.
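
The chained-recording principle behind these algorithms can be illustrated with a toy hash chain (an assumption-laden sketch, not the paper's scheme: SHA-256 HMAC, one shared key, and a single chain instead of the recursive two-dimensional construction):

```python
import hashlib, hmac, json

def add_record(chain, metadata, key):
    # each record binds its metadata to the previous digest
    prev = chain[-1]["digest"] if chain else b"genesis".hex()
    payload = json.dumps(metadata, sort_keys=True).encode()
    digest = hmac.new(key, payload + bytes.fromhex(prev), hashlib.sha256).hexdigest()
    chain.append({"meta": metadata, "prev": prev, "digest": digest})

def verify(chain, key):
    # returns the index of the first tampered record, localizing the violation
    prev = b"genesis".hex()
    for i, rec in enumerate(chain):
        payload = json.dumps(rec["meta"], sort_keys=True).encode()
        good = hmac.new(key, payload + bytes.fromhex(prev), hashlib.sha256).hexdigest()
        if rec["prev"] != prev or rec["digest"] != good:
            return i
        prev = rec["digest"]
    return None

chain, key = [], b"per-user-secret"
add_record(chain, {"doc": 17, "editor": "user_a", "ts": "2020-06-01"}, key)
add_record(chain, {"doc": 17, "editor": "user_b", "ts": "2020-06-02"}, key)
chain[0]["meta"]["editor"] = "insider"      # unauthorized modification by an authorized user
print(verify(chain, key))                   # -> 0: the modified record is localized
```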
11

Liu, Zheng, Hua Yan, and Zhen Li. "An Efficient Graph-Based Flickr Photo Clustering Algorithm." Applied Mechanics and Materials 29-32 (August 2010): 2649–55. http://dx.doi.org/10.4028/www.scientific.net/amm.29-32.2649.

Full text
Abstract:
Traditional image clustering methods depend mainly on visual features alone. Due to the well-known "semantic gap", visual features can hardly describe the semantics of images on their own. In the case of Web images, apart from visual features, there is rich metadata that can enhance the performance of image clustering, such as time information, GPS coordinates and initial annotations. This paper proposes an efficient Flickr photo clustering algorithm that simultaneously integrates multiple types of information related to Flickr photos using k-partite graph partitioning. For a personal Flickr collection, we first determine the value of k, the number of data types used. Secondly, the heterogeneous metadata are mapped to vertices of a k-partite graph, and relationships between the heterogeneous metadata are represented as edge weights. Finally, Flickr photos can be clustered by partitioning the k-partite graph. Experiments conducted on Flickr photos demonstrate the effectiveness of the proposed algorithm.
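
A toy version of the graph construction (illustrative only; Louvain community detection stands in here for the paper's k-partite graph partitioning) can be put together with networkx:

```python
import networkx as nx

photos = {
    "p1": {"time": "2010-07-am", "gps": "cell_12", "tag": "beach"},
    "p2": {"time": "2010-07-am", "gps": "cell_12", "tag": "sea"},
    "p3": {"time": "2010-12-pm", "gps": "cell_40", "tag": "snow"},
}

G = nx.Graph()
for pid, meta in photos.items():
    G.add_node(pid, part="photo")
    for kind, value in meta.items():            # one partition per metadata type
        node = f"{kind}:{value}"
        G.add_node(node, part=kind)
        G.add_edge(pid, node, weight=1.0)       # edge weight = strength of association

clusters = nx.community.louvain_communities(G, seed=0)
print([sorted(c) for c in clusters])            # photos sharing metadata fall together
```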
12

Fugazza, Cristiano, Monica Pepe, Alessandro Oggioni, Paolo Tagliolato, and Paola Carrara. "Raising Semantics-Awareness in Geospatial Metadata Management." ISPRS International Journal of Geo-Information 7, no. 9 (September 7, 2018): 370. http://dx.doi.org/10.3390/ijgi7090370.

Full text
Abstract:
Geospatial metadata are often encoded in formats that either are not aimed at efficient retrieval of resources or are plainly outdated. In particular, the quantum leap represented by the Semantic Web has so far not induced a consistent, interlinked baseline in the geospatial domain. Datasets, the scientific literature related to them, and ultimately the researchers behind these products are only loosely connected; the corresponding metadata are intelligible only to humans and duplicated across different systems, seldom consistently. We address these issues by relating metadata items to resources that represent keywords, institutes, researchers, toponyms, and virtually any RDF data structure made available over the Web via SPARQL endpoints. Essentially, our methodology fosters delegated metadata management, as the entities referred to in metadata are independent, decentralized data structures with their own life cycle. Our example implementation of delegated metadata envisages: (i) editing via customizable web-based forms (including injection of semantic information); (ii) encoding of records in any XML metadata schema; and (iii) translation into RDF. Among the semantics-aware features that this practice enables, we present a worked-out example focusing on automatic updates of metadata descriptions. Our approach, demonstrated in the context of INSPIRE metadata (the ISO 19115/19119 profile eliciting integration of European geospatial resources), is also applicable to a broad range of metadata standards, including non-geospatial ones.
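
The delegation idea is easy to demonstrate: a metadata record keeps only the URI of an entity, and human-readable fields are resolved live from the entity's SPARQL endpoint, so they never go stale. A small sketch (endpoint, property and resource choices are illustrative assumptions):

```python
from SPARQLWrapper import SPARQLWrapper, JSON

def fetch_label(resource_uri, endpoint="https://dbpedia.org/sparql"):
    sparql = SPARQLWrapper(endpoint)
    sparql.setQuery(f"""
        SELECT ?label WHERE {{
          <{resource_uri}> rdfs:label ?label .
          FILTER (lang(?label) = "en")
        }} LIMIT 1""")
    sparql.setReturnFormat(JSON)
    rows = sparql.query().convert()["results"]["bindings"]
    return rows[0]["label"]["value"] if rows else None

# e.g. a keyword stored in a metadata record as a URI rather than a literal string
print(fetch_label("http://dbpedia.org/resource/Metadata"))
```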
13

Gupta, Anand. "Identification of Image Spam by Using Low Level & Metadata Features." International Journal of Network Security & Its Applications 4, no. 2 (March 31, 2012): 163–78. http://dx.doi.org/10.5121/ijnsa.2012.4213.

Full text
14

Gafurova, Polina, Alexander Elizarov, and Evgeny Konstantinovich Lipachev. "Basic Services of Factory Metadata Digital Mathematical Library Lobachevskii-Dml." Russian Digital Libraries Journal 23, no. 3 (May 9, 2020): 336–81. http://dx.doi.org/10.26907/1562-5419-2020-23-3-336-381.

Full text
Abstract:
A number of problems related to the construction of the metadata factory of the digital mathematical library Lobachevskii-DML have been solved. By metadata factory we mean a system of interconnected software tools aimed at creating, processing, storing and managing metadata of digital library objects, allowing the created electronic collections to be integrated into aggregating digital scientific libraries. In order to select the optimal software tools from among existing ones and to modernize them: we discussed the features of the presentation of document metadata in various electronic collections, related both to the formats used and to changes in the composition and completeness of the metadata set over the publication history of the corresponding scientific journal; we presented and characterized software tools for managing scientific content and methods for organizing the automated integration of repositories of mathematical documents with other information systems; and we discussed an important function of the digital library metadata factory, the normalization of metadata in accordance with the formats of other aggregating libraries. As a result of developing the metadata factory of the digital mathematical library Lobachevskii-DML, we proposed a system of services for the automated generation of metadata for electronic mathematical collections; developed an XML metadata presentation language based on the Journal Archiving and Interchange Tag Suite (NISO JATS); created software tools for normalizing the metadata of electronic collections of scientific documents to the formats developed by international aggregators of resources in mathematics and computer science; developed an algorithm for converting metadata to the oai_dc format and generating the archive structure for import into DSpace digital storage; and proposed and implemented methods for integrating the electronic mathematical collections of Kazan University into domestic and foreign digital mathematical libraries.
15

Yang, Ru, Yuhui Deng, Yi Zhou, and Ping Huang. "Boosting the Restoring Performance of Deduplication Data by Classifying Backup Metadata." ACM/IMS Transactions on Data Science 2, no. 2 (April 20, 2021): 1–16. http://dx.doi.org/10.1145/3437261.

Full text
Abstract:
Restoring data is the main purpose of data backup in storage systems. The fragmentation issue, caused by physically scattering logically continuous data across a variety of disk locations, poses a negative impact on the restore performance of a deduplication system. Rewriting algorithms are used to alleviate the fragmentation problem by improving the restore speed of a deduplication system. However, rewriting methods sacrifice a great deal of deduplication ratio, leading to huge storage space waste. Furthermore, traditional backup approaches treat file metadata and chunk metadata the same way, which causes frequent on-disk metadata accesses. In this article, we start by analyzing the storage characteristics of backup metadata. An intriguing finding shows that with 10 million files, the file metadata merely takes up approximately 340 MB. Motivated by this finding, we propose a Classified-Metadata based Restoring method (CMR) that classifies backup metadata into file metadata and chunk metadata. Because the file metadata takes up a meager amount of space, CMR maintains all file metadata in memory, whereas chunk metadata are aggressively prefetched to memory in a greedy manner. A deduplication system with CMR in place exhibits three salient features: (i) it avoids rewriting algorithms' additional overhead by reducing the number of disk reads in a restore process, (ii) it increases restore throughput without sacrificing the deduplication ratio, and (iii) it thoroughly leverages the hardware resources to boost the restore performance. To quantitatively evaluate the performance of CMR, we compare it against two state-of-the-art approaches, namely, a history-aware rewriting method (HAR) and a context-based rewriting scheme (CAP). The experimental results show that compared to HAR and CAP, CMR reduces the restore time by 27.2% and 29.3%, respectively. Moreover, the deduplication ratio is improved by 1.91% and 4.36%, respectively.
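
The core mechanism is simple enough to sketch (toy structures and names, not the authors' implementation): file recipes stay pinned in memory, while chunk metadata is prefetched one container at a time, so a restore pays one metadata read per container rather than per chunk:

```python
from collections import OrderedDict

file_meta = {"/backup/a.txt": ["fp1", "fp2", "fp3"]}    # file recipes: small, kept in RAM
container_of = {"fp1": "c0", "fp2": "c0", "fp3": "c1"}  # container holding each chunk
on_disk_index = {"c0": {"fp1": (0, 4096), "fp2": (4096, 4096)},
                 "c1": {"fp3": (0, 8192)}}              # stands in for on-disk chunk metadata

class ChunkMetaCache:
    """Greedy prefetch: one (simulated) disk read pulls in a whole container's index."""
    def __init__(self, capacity=2):
        self.cache, self.capacity, self.disk_reads = OrderedDict(), capacity, 0

    def lookup(self, fp):
        cid = container_of[fp]
        if cid not in self.cache:
            self.disk_reads += 1                        # would be an on-disk metadata read
            self.cache[cid] = on_disk_index[cid]
            if len(self.cache) > self.capacity:
                self.cache.popitem(last=False)          # evict least-recently-used container
        self.cache.move_to_end(cid)
        return self.cache[cid][fp]

cache = ChunkMetaCache()
for fp in file_meta["/backup/a.txt"]:                   # restore path
    offset, length = cache.lookup(fp)
print("metadata disk reads:", cache.disk_reads)         # 2 (per container), not 3 (per chunk)
```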
16

Sefid, Athar, Jian Wu, Allen C. Ge, Jing Zhao, Lu Liu, Cornelia Caragea, Prasenjit Mitra, and C. Lee Giles. "Cleaning Noisy and Heterogeneous Metadata for Record Linking across Scholarly Big Datasets." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 9601–6. http://dx.doi.org/10.1609/aaai.v33i01.33019601.

Full text
Abstract:
Automatically extracted metadata from scholarly documents in PDF format is usually noisy and heterogeneous, often containing incomplete fields and erroneous values. One common way of cleaning metadata is to use a bibliographic reference dataset. The challenge is to match records between corpora with high precision. The existing solution, which is based on information retrieval and string similarity on titles, works well only if the titles are clean. We introduce a system designed to match scholarly document entities with noisy metadata against a reference dataset. The blocking function uses the classic BM25 algorithm to find matching candidates from the reference data, which has been indexed by ElasticSearch. The core components use supervised methods that combine features extracted from all available metadata fields. The system also leverages available citation information to match entities. The combination of metadata and citations achieves high accuracy and significantly outperforms the baseline method on the same test dataset. We apply this system to match the database of CiteSeerX against Web of Science, PubMed, and DBLP. The method will be deployed in the CiteSeerX system to clean metadata and link records to other scholarly big datasets.
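
The blocking step maps naturally onto a few lines of the ElasticSearch client (a sketch under assumptions: a local cluster with a reference index named "wos" holding a "title" field; the trained matcher is only indicated in a comment):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

def candidates(noisy_title, k=10):
    # ElasticSearch ranks with BM25 by default, so a match query acts as the blocker
    resp = es.search(index="wos", query={"match": {"title": noisy_title}}, size=k)
    return [hit["_source"] for hit in resp["hits"]["hits"]]

for cand in candidates("cleaning noisy and heterogeneous metadata"):
    # a supervised matcher would score (query, candidate) pairs using title,
    # author, year and venue similarity features plus citation overlap
    print(cand.get("title"))
```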
17

Hume, Samuel, Surendra Sarnikar, and Cherie Noteboom. "Enhancing Traceability in Clinical Research Data through a Metadata Framework." Methods of Information in Medicine 59, no. 02/03 (May 2020): 075–85. http://dx.doi.org/10.1055/s-0040-1714393.

Full text
Abstract:
Background: The clinical research data lifecycle, from data collection to analysis results, functions in silos that restrict traceability. Traceability is a requirement for regulated clinical research studies and an important attribute of nonregulated studies. Current clinical research software tools provide limited metadata traceability capabilities and are unable to query variables across all phases of the data lifecycle.

Objectives: To develop a metadata traceability framework that can help query and visualize traceability metadata, identify traceability gaps, and validate metadata traceability to improve data lineage and reproducibility within clinical research studies.

Methods: This research follows the design science research paradigm, where the objective is to create and evaluate an information technology (IT) artifact that explicitly addresses an organizational problem or opportunity. The implementation and evaluation of the IT artifact demonstrate the feasibility of both the design process and the final designed product.

Results: We present Trace-XML, a metadata traceability framework that extends standard clinical research metadata models and adapts graph traversal algorithms to provide clinical research study traceability queries, validation, and visualization. Trace-XML was evaluated using analytical and qualitative methods. The analytical methods show that Trace-XML accurately and completely assesses metadata traceability within a clinical research study. A qualitative study used thematic analysis of interview data to show that Trace-XML adds utility to a researcher's ability to evaluate metadata traceability within a study.

Conclusion: Trace-XML benefits include features that (1) identify traceability gaps in clinical study metadata, (2) validate metadata traceability within a clinical study, and (3) query and visualize traceability metadata. The key themes that emerged from the qualitative evaluation affirm that Trace-XML adds utility to the task of creating and assessing end-to-end clinical research study traceability.
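
The graph-traversal portion can be pictured with a miniature lineage graph (variable names follow CDISC conventions loosely but are invented for illustration; this is not Trace-XML itself): an analysis variable with no path back to a data-collection field is reported as a traceability gap.

```python
import networkx as nx

G = nx.DiGraph()
# edges point from a source variable to the variable derived from it
G.add_edge("CRF.VSORRES", "SDTM.VS.VSSTRESN", step="standardize units")
G.add_edge("SDTM.VS.VSSTRESN", "ADaM.ADVS.AVAL", step="derive analysis value")
G.add_node("ADaM.ADVS.CHG")                      # derived variable with undocumented lineage

def trace(var):
    return nx.ancestors(G, var)                  # everything var is derived from

def traceability_gaps(graph):
    collected = {n for n in graph if n.startswith("CRF.")}
    return [n for n in graph
            if not n.startswith("CRF.") and not (nx.ancestors(graph, n) & collected)]

print(trace("ADaM.ADVS.AVAL"))   # {'CRF.VSORRES', 'SDTM.VS.VSSTRESN'}
print(traceability_gaps(G))      # ['ADaM.ADVS.CHG'] - no lineage back to data collection
```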
18

Martín, Ignacio, José Alberto Hernández, Alfonso Muñoz, and Antonio Guzmán. "Android Malware Characterization Using Metadata and Machine Learning Techniques." Security and Communication Networks 2018 (July 8, 2018): 1–11. http://dx.doi.org/10.1155/2018/5749481.

Full text
Abstract:
Android malware has emerged as a consequence of the increasing popularity of smartphones and tablets. While most previous work focuses on inherent characteristics of Android apps to detect malware, this study analyses indirect features and metadata to identify patterns in malware applications. Our experiments show the following: (1) the permissions used by an application offer only moderate performance results; (2) other features publicly available at Android markets are more relevant in detecting malware, such as the application developer and certificate issuer; and (3) compact and efficient classifiers can be constructed for the early detection of malware applications prior to code inspection or sandboxing.
19

Kim, JeongYeon. "Protecting Metadata of Access Indicator and Region of Interests for Image Files." Security and Communication Networks 2020 (January 22, 2020): 1–10. http://dx.doi.org/10.1155/2020/4836109.

Full text
Abstract:
With the popularity of social network services, security and privacy issues over shared content have received much attention. In addition, multimedia files raise concerns about copyright violation or illegal sharing over communication networks. For image file management, the JPEG group has developed a new image file format to enhance security and privacy features. Adopting a box structure with different application markers, the new standards for privacy and security provide a concept of replacement, substituting a private part of the original image or metadata with alternative public data. In this paper, we extend the data protection features of the new JPEG formats to remote access control stored as metadata. By keeping the location of access control data as metadata in image files, the image owner can allow or deny others' data consumption regardless of where the media file is. Licensing issues can also be resolved by applying the new access control schemes, and we show how the new formats protect commercial image files against unauthorized access.
20

Arnomo, Ilham. "Studi Banding Perangkat Lunak Aplikasi Ganesha Digital Library (GDL) sebagai Repository Institusi Berbasis Open Source [A Comparative Study of the Ganesha Digital Library (GDL) Application Software as an Open-Source Institutional Repository]." JURNAL TEKNIK INFORMATIKA 12, no. 1 (June 20, 2019): 21–30. http://dx.doi.org/10.15408/jti.v12i1.8632.

Full text
Abstract:
The purpose of the research is to analyze the technical features of the Ganesha Digital Library (GDL) application against DSpace and EPrints, in order to establish technically whether the GDL application meets the standards and criteria for institutional repository software. The research uses an experimental approach: installing the GDL, DSpace and EPrints applications and then analyzing and comparing the technical features of the three. The results show that the GDL application meets the standards and criteria for an institutional repository application, since it has most of the technical features of institutional repositories, including support for OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting), the Dublin Core metadata standard, and an open-source GPL (General Public License) license, which is needed for the adoption, use and development of institutional repository applications according to the needs of a higher-education institution.
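
The OAI-PMH feature that all three platforms share is the crux of interoperability, and it is easy to exercise with plain HTTP (the base URL below is a placeholder, not a real GDL endpoint):

```python
import requests
import xml.etree.ElementTree as ET

BASE = "https://repository.example.ac.id/oai"    # hypothetical repository endpoint
ns = {"oai": "http://www.openarchives.org/OAI/2.0/",
      "dc": "http://purl.org/dc/elements/1.1/"}

resp = requests.get(BASE, params={"verb": "ListRecords", "metadataPrefix": "oai_dc"})
root = ET.fromstring(resp.content)
for rec in root.iterfind(".//oai:record", ns):   # harvested Dublin Core records
    title = rec.find(".//dc:title", ns)
    print(title.text if title is not None else "(no title)")
```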
21

Morgan-Lopez, Antonio A., Annice E. Kim, Robert F. Chew, and Paul Ruddle. "Predicting age groups of Twitter users based on language and metadata features." PLOS ONE 12, no. 8 (August 29, 2017): e0183537. http://dx.doi.org/10.1371/journal.pone.0183537.

Full text
22

Ghosh, Isha, and Vivek Singh. "Phones, privacy, and predictions." Online Information Review 44, no. 2 (October 23, 2018): 483–502. http://dx.doi.org/10.1108/oir-03-2018-0112.

Full text
Abstract:
Purpose: Mobile phones have become one of the most favored devices for maintaining social connections as well as for logging digital information about personal lives. The privacy of the metadata generated in this process has been a topic of intense debate over the last few years, but most of the debate has focused on stonewalling such data. At the same time, such metadata is already being used to automatically infer a user's preferences for commercial products, media, or political agencies. The purpose of this paper is to understand the predictive power of phone usage features on individual privacy attitudes.

Design/methodology/approach: The present study uses a mixed-method approach involving analysis of mobile phone metadata, a self-reported survey on privacy attitudes and semi-structured interviews. The paper analyzes the interconnections between users' social and behavioral data, as obtained via their phones, and their self-reported privacy attitudes, and interprets them based on the semi-structured interviews.

Findings: The findings suggest that an analysis of mobile phone metadata reveals vital clues to a person's privacy attitudes. The study finds that multiple phone signals have significant predictive power over an individual's privacy attitudes. The results motivate a new direction of automatically inferring a user's privacy attitudes by leveraging phone usage information.

Practical implications: An ability to automatically infer a user's privacy attitudes could allow users to utilize their own phone metadata to get automatic recommendations for privacy settings appropriate to them. This study offers information scientists, government agencies and mobile app developers an understanding of user privacy needs, helping them create apps that take these traits into account.

Originality/value: The primary value of this paper lies in providing a better understanding of the predictive power of phone usage features on individual privacy attitudes.
23

Chang, Hsuan-Pu, and Jason C. Hung. "Comparison of the Features of EPUB E-Book and SCORM E-Learning Content Model." International Journal of Distance Education Technologies 16, no. 2 (April 2018): 1–17. http://dx.doi.org/10.4018/ijdet.2018040101.

Full text
Abstract:
E-books have greatly evolved in presentation and functionality; however, their features for education need to be investigated, because people accustomed to printed books may consider and approach e-books in the same way as they do printed ones. The authors therefore compared the EPUB e-book content model with the SCORM e-learning content model with respect to content presentation, metadata and package structure. They found that: 1) EPUB has the potential to implement the advantages of content sharing and reuse; 2) EPUB e-books can present educational materials with multimedia and interactive components based on web technology, although content creators should beware of the limited set of supported media types; 3) EPUB lacks dedicated educational metadata; and 4) EPUB e-books have a content-reflow mechanism to adjust layouts to fit small-screen devices and are able to use all resources offline. Finally, the authors determined research issues and strategies worthy of further investigation and development for EPUB e-books in education based on these findings.
24

Quezada-Naquid, Moisés, Ricardo Marcelín-Jiménez, and José Luis González-Compeán. "Babel." International Journal of Web Services Research 13, no. 4 (October 2016): 36–53. http://dx.doi.org/10.4018/ijwsr.2016100103.

Full text
Abstract:
The Babel File System is a dependable, scalable and flexible storage system. Among its main features, the authors highlight the availability of different types of data redundancy, a careful decoupling of data and metadata, a middleware that enforces metadata consistency, and its own load-balancing and allocation procedure, which adapts to the number and capacities of the underlying storage devices. It can be deployed over different hardware platforms, including commodity hardware. The proposal is designed to allow developers to settle a trade-off between price and performance, depending on their particular applications.
25

Nesterova, E. I. "The Specific Features of Source Data and Metadata Ontology in Virtual Reality Systems." Automatic Documentation and Mathematical Linguistics 53, no. 6 (November 2019): 309–14. http://dx.doi.org/10.3103/s0005105519060025.

Full text
26

Liu, Z., J. Sun, M. Smith, L. Smith, and R. Warr. "Incorporating clinical metadata with digital image features for automated identification of cutaneous melanoma." British Journal of Dermatology 169, no. 5 (October 31, 2013): 1034–40. http://dx.doi.org/10.1111/bjd.12550.

Full text
27

Ma, Anqi, Yu Liu, Xiujuan Xu, and Tao Dong. "A deep-learning based citation count prediction model with paper metadata semantic features." Scientometrics 126, no. 8 (June 5, 2021): 6803–23. http://dx.doi.org/10.1007/s11192-021-04033-7.

Full text
28

Felicetti, Achille, and Matteo Lorenzini. "Metadata and Tools for Integration and Preservation of Cultural Heritage 3D Information." Geoinformatics FCE CTU 6 (December 21, 2011): 118–24. http://dx.doi.org/10.14311/gi.6.16.

Full text
Abstract:
In this paper we investigate the various storage, portability and interoperability issues that arise among archaeologists and cultural heritage professionals when dealing with 3D technologies. On the one hand, the available digital repositories often prove unable to guarantee adequate features for managing 3D models and their metadata; on the other hand, most of the available data formats for 3D encoding seem unsatisfactory for the portability that 3D information nowadays requires across different systems. We propose a set of possible solutions to show how integration can be achieved through the use of well-known and widely accepted standards for data encoding and data storage. Using a set of 3D models acquired during various archaeological campaigns and a number of open source tools, we have implemented a straightforward encoding process to generate meaningful semantic data and metadata. We also present the interoperability process carried out to integrate the encoded 3D models with the geographic features produced by the archaeologists. Finally we report on the preliminary (rather encouraging) development of a semantics-enabled, persistent digital repository, where 3D models (but also any kind of digital data and metadata) can easily be stored, retrieved and shared with the content of other digital archives.
29

Xia, Bohui, Hiroyuki Seshime, Xueting Wang, and Toshihiko Yamasaki. "Click-Through Rate Prediction of Online Banners Featuring Multimodal Analysis." International Journal of Semantic Computing 14, no. 01 (March 2020): 71–91. http://dx.doi.org/10.1142/s1793351x20400048.

Full text
Abstract:
As the online advertisement industry continues to grow, it is predicted that online advertisement will account for about 45% of global advertisement spending by 2020. Thus, predicting the click-through rates (CTRs) of advertisements is increasingly crucial for the advertisement industry. Many studies have already addressed CTR prediction. However, most tried to solve the problem using only metadata, such as user id, URL of the landing page, business category and device type, and did not include multimedia content such as images or texts. Using these multimedia features with deep learning techniques, we propose a method to effectively predict CTRs for online banners, a popular form of online advertisement. We show that the multimedia features of advertisements are useful for the task at hand. In our previous work [1], we proposed a CTR prediction model which outperformed the state-of-the-art method that uses the three features mentioned above, and we also introduced an attention network for visualizing how much each feature affected the prediction result. In this work, we introduce another text analysis technique and more detailed metadata. As a result, we achieve much better performance compared to our previous work. In addition, to better analyze our model, we introduce another visualization technique that shows the regions in an image that make its CTR better or worse. Our prediction model yields useful suggestions for improving advertisement designs to achieve higher CTRs.
30

Oukhouya, Lamya, Anass El haddadi, Brahim Er-raha, and Hiba Asri. "A generic metadata management model for heterogeneous sources in a data warehouse." E3S Web of Conferences 297 (2021): 01069. http://dx.doi.org/10.1051/e3sconf/202129701069.

Full text
Abstract:
For more than three decades, data warehouses were considered the only business intelligence storage system for enterprises. However, with the advent of big data, they have been modernized to support the variety and dynamics of data by adopting the data lake as a centralized store for heterogeneous sources. The data lake is characterized by its flexibility and performance when storing and analyzing data. However, the absence of a schema on data during ingestion increases the risk of the data lake degenerating into a data swamp, so metadata management is essential for exploiting the data lake. In this paper, we present a conceptual metadata management model for the data lake. Our solution is based on a functional architecture of the data lake and on a set of features that make the metadata model generic. Furthermore, we present a set of transformation rules allowing us to translate our conceptual model into an OWL ontology.
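
The final translation step can be pictured with rdflib (a toy sketch; the class and property names are invented stand-ins, not the authors' ontology):

```python
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import OWL, RDF, RDFS

DL = Namespace("http://example.org/datalake#")
g = Graph()
g.bind("dl", DL)

g.add((DL.Dataset, RDF.type, OWL.Class))                       # schema level
g.add((DL.sourceSystem, RDF.type, OWL.DatatypeProperty))
g.add((DL.sales_2021, RDF.type, DL.Dataset))                   # instance-level metadata
g.add((DL.sales_2021, DL.sourceSystem, Literal("CRM export")))
g.add((DL.sales_2021, RDFS.label, Literal("2021 sales, raw CSV ingest")))

print(g.serialize(format="turtle"))                            # queryable catalog entry
```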
31

Hong, Jung-Hong, and Yi-Tin Shi. "3D Perspective towards the Development of a Metadata-Driven Sharing Mechanism for Heterogeneous CCTV Systems." ISPRS International Journal of Geo-Information 10, no. 8 (August 15, 2021): 550. http://dx.doi.org/10.3390/ijgi10080550.

Full text
Abstract:
The installation of closed-circuit television (CCTV) monitors has increased rapidly ever since the 11 September attacks. With the advantage of direct visual inspection, CCTV systems are widely used on occasions that require instantaneous and long-term monitoring. For emergency response tasks especially, the prompt availability of CCTV offers EOC (Emergency Operation Center) commanders a much better reference for acting on reported incidents. However, heterogeneity among CCTV systems impedes the effective and efficient use and sharing of CCTV services hosted by different stakeholders; individual CCTV systems often operate on their own, which restricts the possibility of taking best advantage of the huge number of existing systems. This research proposes a metadata-driven approach to facilitate a cross-domain sharing mechanism for heterogeneous CCTV systems. The CCTV metadata comprises a set of enriched descriptions based on an analysis of Who, When, Where, What, Why and How (5W1H) for CCTV. Sharing mechanisms based on standardised CCTV metadata can then satisfy the need to query and select CCTV across heterogeneous systems according to the task at hand. One distinguishing design choice is the modelling of the field of view (FOV) of CCTV from the 3D perspective. By integrating with 3D feature-based city model data, the 3D FOV information not only provides better visualisation of the spatial coverage of the CCTV systems but also enables 3D visibility analysis of CCTV based on individual features, so that the selection decision can be further improved by indexing CCTV against features. As the number and variety of CCTV systems continuously grow, the proposed mechanism has great potential to serve as a solid collaborative foundation for integrating heterogeneous CCTV systems for applications that demand comprehensive and instantaneous understanding of the dynamically changing world, e.g., smart cities, disaster management, criminal investigation, etc.
32

Hong, Seong-Yong, and Sung-Joon Lee. "An Intelligent Web Digital Image Metadata Service Platform for Social Curation Commerce Environment." Modelling and Simulation in Engineering 2015 (2015): 1–10. http://dx.doi.org/10.1155/2015/651428.

Full text
Abstract:
Information management includes multimedia data management, knowledge management, collaboration, and agents, all of which are supporting technologies for XML. XML technologies have an impact on multimedia databases as well as on collaborative technologies and knowledge management. E-commerce documents, for instance, are encoded in XML and are gaining much popularity for business-to-business and business-to-consumer transactions. Internet sites, such as e-commerce and shopping mall sites, now deal with a great deal of image and multimedia information. This paper proposes an intelligent web digital image information retrieval platform that adopts XML technology for a social curation commerce environment. To support object-based content retrieval over product catalog images containing multiple objects, we describe multilevel metadata structures representing the local features, global features, and semantics of image data. To enable semantic-based and content-based retrieval over such image data, we design an XML Schema for the proposed metadata. We also describe how to automatically transform retrieval results into forms suitable for various user environments, such as a web browser or mobile device, using XSLT. The proposed scheme can be utilized for efficient e-catalog metadata sharing between systems, and it will contribute to improved retrieval correctness and user satisfaction in semantic-based web digital image information retrieval.
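
The XSLT delivery step the abstract mentions can be shown in miniature (element names and the stylesheet are invented for illustration; lxml does the transformation):

```python
from lxml import etree

meta = etree.XML(
    "<image id='cat-001'>"
    "  <global><title>Leather handbag</title></global>"
    "  <object region='10,20,200,300'><label>buckle</label></object>"
    "</image>")

xslt = etree.XML("""
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/image">
    <html><body>
      <h1><xsl:value-of select="global/title"/></h1>
      <ul><xsl:for-each select="object">
        <li><xsl:value-of select="label"/></li>
      </xsl:for-each></ul>
    </body></html>
  </xsl:template>
</xsl:stylesheet>""")

html = etree.XSLT(xslt)(meta)        # the same metadata could feed a mobile stylesheet instead
print(etree.tostring(html, pretty_print=True).decode())
```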
33

Vellino, André. "Harmonizing the Metadata Among Diverse Climate Change Datasets." International Journal of Digital Curation 10, no. 1 (May 14, 2015): 268–79. http://dx.doi.org/10.2218/ijdc.v10i1.367.

Full text
Abstract:
One of the critical problems in the curation of research data is the harmonization of its internal metadata schemata. The value of harmonizing such data is well illustrated by the Berkeley Earth project, which successfully integrated into one metadata schema the raw climate datasets from a wide variety of geographical sources and time periods (250 years). Doing so enabled climate scientists to calculate a more accurate estimate of the recent changes in Earth's average land surface temperatures and to ascertain the extent to which climate change is anthropogenic. This paper surveys some of the approaches that have been taken to the integration of data schemata in general and examines some of the specific metadata features of the source surface temperature datasets that were harmonized by Berkeley Earth. The conclusion drawn from this analysis is that the original source data and the Berkeley Earth common format provide a promising training set on which to apply machine learning methods for replicating the human data integration process. This paper describes research in progress on a domain-independent approach to the metadata harmonization problem that could be applied to other fields of study and be incorporated into a data portal to enhance the discoverability and reuse of data from a broad range of data sources.
34

Shankaranarayanan, G., and Bin Zhu. "Enhancing decision-making with data quality metadata." Journal of Systems and Information Technology 23, no. 2 (August 18, 2021): 199–217. http://dx.doi.org/10.1108/jsit-08-2020-0153.

Full text
Abstract:
Purpose: Data quality metadata (DQM) is a set of quality measurements associated with the data. Prior research in data quality has shown that DQM improves decision performance. The same research has also shown that DQM overloads the cognitive capacity of decision-makers. Visualization is a proven technique to reduce cognitive overload in decision-making. This paper aims to describe a prototype decision support system with a visual interface and examine its efficacy in reducing cognitive overload in the context of decision-making with DQM.

Design/methodology/approach: The authors describe the salient features of the prototype and, following the design science paradigm, evaluate its usefulness in an experimental setting.

Findings: The authors find that the interface not only reduced perceived mental demand but also improved decision performance despite the added task complexity due to the presence of DQM.

Research limitations/implications: A drawback of this study is the sample size. With a sample size of 51, the power of the model to draw conclusions is weakened.

Practical implications: In today's decision environments, decision-makers deal with extraordinary volumes of data whose quality is unknown or not determinable with any certainty. The interface and its evaluation offer insights into the design of decision support systems that reduce the complexity of the data and facilitate the integration of DQM into decision tasks.

Originality/value: To the best of the authors' knowledge, this is the only research to build and evaluate a decision-support prototype for structured decision-making with DQM.
35

Audeh, Bissan, Michel Beigbeder, Christine Largeron, and Diana Ramírez-Cifuentes. "Improving exploratory information retrieval for neophytes." ACM SIGAPP Applied Computing Review 20, no. 4 (January 12, 2021): 50–64. http://dx.doi.org/10.1145/3447332.3447336.

Full text
Abstract:
Digital libraries have become an essential tool for researchers in all scientific domains. With almost unlimited storage capacities, current digital libraries hold a tremendous number of documents. Though some efforts have been made to facilitate access to documents relevant to a specific information need, such a task remains a real challenge for a new researcher. Indeed neophytes do not necessarily use appropriate keywords to express their information need and they might not be qualified enough to evaluate correctly the relevance of documents retrieved by the system. In this study, we suppose that to better meet the needs of neophytes, the information retrieval system in a digital library should take into consideration features other than content-based relevance. To test this hypothesis, we use machine learning methods and build new features from several metadata related to documents. More precisely, we propose to consider as features for machine learning: content-based scores, scores based on the citation graph and scores based on metadata extracted from external resources. As acquiring such features is not a trivial task, we analyze their usefulness and their capacity to detect relevant documents. Our analysis concludes that the use of these additional features improves the performance of the system for a neophyte. In fact, by adding the new features we find more documents suitable for neophytes within the results returned by the system than when using content-based features alone.
36

Vaishnavi, M. S., and A. Vijayalakshmi. "Age Estimation Using OLPP Features." Oriental journal of computer science and technology 10, no. 1 (March 23, 2017): 238–48. http://dx.doi.org/10.13005/ojcst/10.01.33.

Full text
Abstract:
Aging face recognition poses a key difficulty in facial recognition. It refers to the identification of a person's face over varied ages and includes issues like age estimation, progression and verification. The non-availability of facial aging databases makes it harder for any system to achieve good accuracy, as there are no good training sets available. Age estimation, when done correctly, has a number of real-life applications, such as age-specific vending machines, age-specific access control and finding missing children. This paper implements age estimation using the Park Aging Mind Laboratory face database, which contains metadata and 293 unique images of 293 individuals. Ages range from 19 to 45 with a median age of 32. Race is classified into two categories, African-American and Caucasian, giving an accuracy of 98%. Sobel edge detection and orthogonal locality preservation projection (OLPP) were used as the dominant features for the training and testing of age estimation. Multi-stage binary classification using a support vector machine was used to classify images into an age group and thereafter predict an individual's age. The effectiveness of this method can be increased by using a larger dataset with a wider age range.
37

Chen, Chia-Huang, and Yasufumi Takama. "Identification of Season-Dependent Sightseeing Spots Based on Metadata-Derived Features and Image Processing." Journal of Advanced Computational Intelligence and Intelligent Informatics 18, no. 3 (May 20, 2014): 353–60. http://dx.doi.org/10.20965/jaciii.2014.p0353.

Full text
Abstract:
Sharing travel experiences and photos on social network services or Web albums has become increasingly popular. Good sightseeing photos taken in specific situations, such as at sunset or in spring, can impress tourists and serve as clues for deciding where and when to go sightseeing. Among the situations to be identified, this paper focuses on season. Compared with situations related to weather or time of day (e.g., sunrise/sunset), whether different seasons yield different scenery depends on the sightseeing spot. Therefore, classifying sightseeing spots as season-dependent or season-independent is required as preprocessing for season-based classification of sightseeing photos. This paper proposes a hybrid approach for identifying season-dependent sightseeing spots: the first phase applies machine learning to statistical features of sightseeing photos obtained from metadata; to improve precision, the second phase applies color-based classification to the spots identified as season-dependent in the first phase. The experimental results show the effectiveness of the proposed method.
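
The first phase's metadata-derived features can be approximated from photo timestamps alone (a simplified sketch of one reading of the approach; the threshold and the season mapping are assumptions):

```python
from collections import Counter

def season(month):                    # Northern-hemisphere convention
    return {12: "winter", 1: "winter", 2: "winter",
            3: "spring", 4: "spring", 5: "spring",
            6: "summer", 7: "summer", 8: "summer"}.get(month, "autumn")

def season_histogram(photo_months):
    counts = Counter(season(m) for m in photo_months)
    total = sum(counts.values())
    return {s: counts[s] / total for s in ("spring", "summer", "autumn", "winter")}

def looks_season_dependent(photo_months, threshold=0.5):
    # a spot whose photos pile up in one season is a season-dependent candidate,
    # to be confirmed by the second, color-based phase
    return max(season_histogram(photo_months).values()) >= threshold

print(looks_season_dependent([1, 1, 2, 12, 12, 3]))   # True  (e.g. a ski slope)
print(looks_season_dependent([1, 4, 7, 10, 2, 8]))    # False (e.g. an indoor museum)
```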
38

Messina, Pablo, Vicente Dominguez, Denis Parra, Christoph Trattner, and Alvaro Soto. "Content-based artwork recommendation: integrating painting metadata with neural and manually-engineered visual features." User Modeling and User-Adapted Interaction 29, no. 2 (July 27, 2018): 251–90. http://dx.doi.org/10.1007/s11257-018-9206-9.

Full text
39

Gundersen, Sveinung, Sanjay Boddu, Salvador Capella-Gutierrez, Finn Drabløs, José M. Fernández, Radmila Kompova, Kieron Taylor, Dmytro Titov, Daniel Zerbino, and Eivind Hovig. "Recommendations for the FAIRification of genomic track metadata." F1000Research 10 (April 1, 2021): 268. http://dx.doi.org/10.12688/f1000research.28449.1.

Full text
Abstract:
Background: Many types of data from genomic analyses can be represented as genomic tracks, i.e. features linked to the genomic coordinates of a reference genome. Examples of such data are epigenetic DNA methylation data, ChIP-seq peaks, germline or somatic DNA variants, as well as RNA-seq expression levels. Researchers often face difficulties in locating, accessing and combining relevant tracks from external sources, as well as locating the raw data, reducing the value of the generated information. Description of work: We propose to advance the application of FAIR data principles (Findable, Accessible, Interoperable, and Reusable) to produce searchable metadata for genomic tracks. Findability and Accessibility of metadata can then be ensured by a track search service that integrates globally identifiable metadata from various track hubs in the Track Hub Registry and other relevant repositories. Interoperability and Reusability need to be ensured by the specification and implementation of a basic set of recommendations for metadata. We have tested this concept by developing such a specification in a JSON Schema, called FAIRtracks, and have integrated it into a novel track search service, called TrackFind. We demonstrate practical usage by importing datasets through TrackFind into existing examples of relevant analytical tools for genomic tracks: EPICO and the GSuite HyperBrowser. Conclusion: We here provide a first iteration of a draft standard for genomic track metadata, as well as the accompanying software ecosystem. It can easily be adapted or extended to future needs of the research community regarding data, methods and tools, balancing the requirements of both data submitters and analytical end-users.
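
The mechanism FAIRtracks builds on is ordinary JSON Schema validation, which the jsonschema package illustrates (the tiny schema below is a stand-in, not the actual FAIRtracks specification):

```python
import jsonschema

schema = {
    "type": "object",
    "required": ["file_name", "genome_assembly", "experiment_type"],
    "properties": {
        "file_name": {"type": "string"},
        "genome_assembly": {"enum": ["GRCh37", "GRCh38"]},
        "experiment_type": {"type": "string"},
    },
}

track = {"file_name": "H3K4me3_peaks.bed",
         "genome_assembly": "GRCh38",
         "experiment_type": "ChIP-seq"}
jsonschema.validate(track, schema)    # raises ValidationError on non-compliant metadata
print("track metadata is valid")
```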
APA, Harvard, Vancouver, ISO, and other styles
40

Powell, Christian D., and Hunter N. B. Moseley. "The mwtab Python Library for RESTful Access and Enhanced Quality Control, Deposition, and Curation of the Metabolomics Workbench Data Repository." Metabolites 11, no. 3 (March 12, 2021): 163. http://dx.doi.org/10.3390/metabo11030163.

Full text
Abstract:
The Metabolomics Workbench (MW) is a public scientific data repository consisting of experimental data and metadata from metabolomics studies collected with mass spectroscopy (MS) and nuclear magnetic resonance (NMR) analyses. MW has been constantly evolving; updating its ‘mwTab’ text file format, adding a JavaScript Object Notation (JSON) file format, implementing a REpresentational State Transfer (REST) interface, and nearly quadrupling the number of datasets hosted on the repository within the last three years. In order to keep up with the quickly evolving state of the MW repository, the ‘mwtab’ Python library and package have been continuously updated to mirror the changes in the ‘mwTab’ and JSONized formats and contain many new enhancements including methods for interacting with the MW REST interface, enhanced format validation features, and advanced features for parsing and searching for specific metabolite data and metadata. We used the enhanced format validation features to evaluate all available datasets in MW to facilitate improved curation and FAIRness of the repository. The ‘mwtab’ Python package is now officially released as version 1.0.1 and is freely available on GitHub and the Python Package Index (PyPI) under a Clear Berkeley Software Distribution (BSD) license with documentation available on ReadTheDocs.
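A brief usage sketch along the lines described above; the calls follow the package's documented interface, but treat the exact names and return values as assumptions rather than a guaranteed API.

```python
# Hedged sketch: fetch a Metabolomics Workbench entry and run the format
# validation described in the abstract. Exact signatures are assumptions.
import mwtab
from mwtab.validator import validate_file

# read_files() is documented to accept file paths, URLs, or analysis IDs
# resolved through the MW REST interface.
mwfile = next(mwtab.read_files("2"))        # e.g. analysis 2
print(mwfile.study_id, mwfile.analysis_id)  # attribute names assumed

result = validate_file(mwfile)  # return shape may vary between versions
```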
APA, Harvard, Vancouver, ISO, and other styles
41

Morris, Jeremy W. "Making music behave: Metadata and the digital music commodity." New Media & Society 14, no. 5 (February 28, 2012): 850–66. http://dx.doi.org/10.1177/1461444811430645.

Full text
Abstract:
This article offers a case study of the Compact Disc Database and ID3 tags, two instrumental information technologies for digital music on computers. Using an interpretive analysis of the technical and cultural features of the Compact Disc Database and ID3 tags as well as press releases and journalistic accounts detailing the rise of these services, this article places digital metadata within the broader history of recorded music specifically and digital objects more generally. Started as hobby projects, the Compact Disc Database and ID3 tags have evolved into central components of the digital music ecosystem. As keystone technologies, they contributed to the emergence of a digital music commodity. Since both technologies derive much of their value from user contributions, this article also contributes to current theorization on the role of users in the production of digital commodities.
APA, Harvard, Vancouver, ISO, and other styles
42

Sefton, Peter, Ian Barnes, Ron Ward, and Jim Downing. "Embedding Metadata and Other Semantics in Word Processing Documents." International Journal of Digital Curation 4, no. 2 (October 15, 2009): 93–106. http://dx.doi.org/10.2218/ijdc.v4i2.96.

Full text
Abstract:
This paper describes a technique for embedding document metadata, and potentially other semantic references, inline in word processing documents, which the authors have implemented with the help of a software development team. Several assumptions underlie the approach: it must be available across computing platforms and work with both Microsoft Word (because of its user base) and OpenOffice.org (because of its free availability). Further, the application needs to be acceptable to and usable by users, so the initial implementation covers only a small number of features, which will be extended only after user testing. Within these constraints, the system provides a mechanism not only for encoding simple metadata, but for inferring hierarchical relationships between metadata elements from a ‘flat’ word processing file. The paper includes links to open source code implementing the techniques as part of a broader suite of tools for academic writing. This work addresses tools and software, the semantic web and data curation, and the integration of curation into research workflows, and will provide a platform for integrating work on ontologies, vocabularies and folksonomies into word processing tools.
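The authors' mechanism encodes metadata inline in the document body; as a simpler, related illustration only, the sketch below writes document-level metadata into a .docx file with python-docx, which is a different (property-based) mechanism than the inline encoding the paper describes.

```python
# Writing document-level metadata with python-docx; a simpler mechanism
# than the paper's inline encoding, shown here only for orientation.
from docx import Document

doc = Document()
doc.add_paragraph("Draft manuscript text.")

props = doc.core_properties
props.title = "Embedding Metadata in Word Processing Documents"
props.author = "Jane Researcher"          # placeholder values
props.keywords = "metadata; semantic web; data curation"

doc.save("manuscript_with_metadata.docx")
```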
APA, Harvard, Vancouver, ISO, and other styles
43

Wu, H., and K. Fu. "A MANAGEMENT OF REMOTE SENSING BIG DATA BASE ON STANDARD METADATA FILE AND DATABASE MANAGEMENT SYSTEM." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLII-3/W10 (February 7, 2020): 653–57. http://dx.doi.org/10.5194/isprs-archives-xlii-3-w10-653-2020.

Full text
Abstract:
Abstract. As an information carrier with high capacity, remarkable reliability, ease of acquisition, and other advantages, remote sensing image data is widely used in natural resources survey, monitoring, planning, disaster prevention, and other fields (Huang, Jie, et al., 2008). Considering the daily application scenarios of remote sensing images in professional departments, this paper analyses their requirements for using and managing remote sensing big data. Respecting existing working habits, it takes a standard for remote sensing image metadata as the reference index and proposes a time-serialized management method for big data based on remote sensing image files and a database management system. The method realizes the design of metadata-standard products as well as the indexed storage and database management of massive remote sensing imagery based on the standard metadata content.
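A minimal sketch of the general pattern, keeping the image files on disk while indexing standard metadata fields in a database; the field names are illustrative, not the metadata standard referenced by the paper.

```python
# Illustrative file-plus-DBMS management: images stay on disk, standard
# metadata fields go into an indexed relational table for retrieval.
import sqlite3

conn = sqlite3.connect("rs_catalog.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS scene_metadata (
        scene_id     TEXT PRIMARY KEY,
        satellite    TEXT,
        acquired_utc TEXT,     -- ISO-8601 acquisition time
        cloud_cover  REAL,
        min_lon REAL, min_lat REAL, max_lon REAL, max_lat REAL,
        file_path    TEXT      -- image kept on disk, metadata in the DB
    )""")
conn.execute(
    "CREATE INDEX IF NOT EXISTS idx_time ON scene_metadata(acquired_utc)")

conn.execute(
    "INSERT OR REPLACE INTO scene_metadata VALUES (?,?,?,?,?,?,?,?,?)",
    ("GF1_2019_0001", "GF-1", "2019-06-01T03:12:00Z", 0.08,
     116.0, 39.5, 117.0, 40.5, "/data/gf1/GF1_2019_0001.tiff"),
)
conn.commit()

# Time-serialized retrieval: scenes from 2019 with low cloud cover.
rows = conn.execute(
    "SELECT scene_id, file_path FROM scene_metadata "
    "WHERE acquired_utc BETWEEN ? AND ? AND cloud_cover < ?",
    ("2019-01-01", "2019-12-31", 0.2),
).fetchall()
```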
APA, Harvard, Vancouver, ISO, and other styles
44

Summann, Friedrich, Andreas Czerniak, Jochen Schirrwagen, and Dirk Pieper. "Data Science Tools for Monitoring the Global Repository Eco-System and its Lines of Evolution." Publications 8, no. 2 (June 22, 2020): 35. http://dx.doi.org/10.3390/publications8020035.

Full text
Abstract:
The global network of scholarly repositories for the publication and dissemination of scientific publications and related materials can already look back on a history of more than twenty years. During this period, there have been many developments in terms of technical optimization and the growth of content. It is crucial to observe and analyze this evolution in order to draw conclusions for the further development of repositories. The basis for such an analysis is data. The Open Archives Initiative (OAI) service provider Bielefeld Academic Search Engine (BASE) started indexing repositories in 2004 and has also collected metadata on the repositories themselves. This paper presents the main features of a planned repository monitoring system. Data have been collected since 2004 and include basic repository metadata as well as publication metadata of a repository. This information allows an in-depth analysis of many indicators in different logical combinations. This paper outlines the systems approach and the integration of data science techniques. It describes the intended monitoring system and shows the first results.
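Repository metadata of this kind is typically harvested over OAI-PMH, the protocol BASE builds on. The sketch below uses the third-party `sickle` client; the endpoint URL is a placeholder, not necessarily the service the authors used.

```python
# Harvesting Dublin Core records from a repository over OAI-PMH with the
# 'sickle' client. The endpoint is a placeholder example.
from sickle import Sickle

sickle = Sickle("https://pub.uni-bielefeld.de/oai")  # placeholder endpoint
for i, record in enumerate(sickle.ListRecords(metadataPrefix="oai_dc")):
    md = record.metadata  # Dublin Core fields as lists of strings
    print(md.get("title"), md.get("date"))
    if i >= 9:            # sample only the first ten records
        break
```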
APA, Harvard, Vancouver, ISO, and other styles
45

Zhu, Nan, Yangdi Lu, Wenbo He, Hua Yu, and Jike Ge. "Towards Update-Efficient and Parallel-Friendly Content-Based Indexing Scheme in Cloud Computing." International Journal of Semantic Computing 12, no. 02 (June 2018): 191–213. http://dx.doi.org/10.1142/s1793351x1840010x.

Full text
Abstract:
The sheer volume of content generated by today’s Internet services is stored in the cloud. An effective indexing method is important to provide the content to users on demand. Indexing methods that associate user-generated metadata with the content are vulnerable to inaccuracy caused by the low quality of the metadata. While content-based indexing does not depend on the error-prone metadata, state-of-the-art research focuses on developing descriptive features and misses the system-oriented considerations when incorporating these features into practical cloud computing systems. We propose an Update-Efficient and Parallel-Friendly content-based indexing system, called Partitioned Hash Forest (PHF). The PHF system incorporates state-of-the-art content-based indexing models and multiple system-oriented optimizations. PHF contains an approximate content-based index and leverages the hierarchical memory system to support the high volume of updates. Additionally, the content-aware data partitioning and lock-free concurrency management module enable parallel processing of concurrent user requests. We evaluate PHF in terms of indexing accuracy and system efficiency by comparing it with the state-of-the-art content-based indexing algorithm and its variants. We achieve significantly better accuracy with less resource consumption, around 37% faster update processing, and up to 2.5× throughput speedup on a multi-core platform compared to other parallel-friendly designs.
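The following sketch is a much-simplified, single-threaded illustration of the partitioned-hash idea, not the actual PHF structure: content fingerprints are routed to disjoint partitions so that an update or lookup touches only one partition, which is what makes per-partition or lock-free concurrency practical.

```python
# Simplified partitioned hash index (illustration of the partitioning
# idea only; the real PHF adds approximate indexing and a memory
# hierarchy, which are omitted here).
import hashlib
from collections import defaultdict

NUM_PARTITIONS = 16

def fingerprint(content: bytes) -> int:
    """64-bit content fingerprint derived from a SHA-1 digest."""
    return int.from_bytes(hashlib.sha1(content).digest()[:8], "big")

class PartitionedHashIndex:
    def __init__(self):
        self.partitions = [defaultdict(list) for _ in range(NUM_PARTITIONS)]

    def insert(self, content: bytes, object_id: str) -> None:
        fp = fingerprint(content)
        # Content-aware routing: each fingerprint maps to one partition,
        # so concurrent updates to different partitions never conflict.
        self.partitions[fp % NUM_PARTITIONS][fp].append(object_id)

    def lookup(self, content: bytes) -> list:
        fp = fingerprint(content)
        return self.partitions[fp % NUM_PARTITIONS].get(fp, [])

index = PartitionedHashIndex()
index.insert(b"some uploaded content", "obj-42")
print(index.lookup(b"some uploaded content"))  # ['obj-42']
```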
APA, Harvard, Vancouver, ISO, and other styles
46

Mazrouee, Sepideh, Susan J. Little, and Joel O. Wertheim. "Incorporating metadata in HIV transmission network reconstruction: A machine learning feasibility assessment." PLOS Computational Biology 17, no. 9 (September 22, 2021): e1009336. http://dx.doi.org/10.1371/journal.pcbi.1009336.

Full text
Abstract:
HIV molecular epidemiology estimates transmission patterns by clustering genetically similar viruses. The process involves connecting genetically similar genotyped viral sequences in a network, implying epidemiological transmissions. This technique relies on genotype data, which is collected only from HIV-diagnosed and in-care populations, and leaves many persons with HIV (PWH) who have no access to consistent care out of the tracking process. We use machine learning algorithms to learn the non-linear correlation patterns between patient metadata and transmissions between HIV-positive cases. This enables us to expand the transmission network reconstruction beyond the molecular network. We employed multiple commonly used supervised classification algorithms to analyze the San Diego Primary Infection Resource Consortium (PIRC) cohort dataset, consisting of genotypes and nearly 80 additional non-genetic features. First, we trained classification models to distinguish genetically unrelated individuals from related ones. Our results show that random forest and decision tree achieved over 80% in accuracy, precision, recall, and F1-score using only a subset of meta-features, including age, birth sex, sexual orientation, race, transmission category, estimated date of infection, and first viral load date, besides genetic data. Additionally, both algorithms achieved approximately 80% sensitivity and specificity. The Area Under Curve (AUC) is reported as 97% and 94% for the random forest and decision tree classifiers, respectively. Next, we extended the models to identify clusters of similar viral sequences. A support vector machine demonstrated an order-of-magnitude improvement in the accuracy of assigning sequences to the correct cluster compared to a dummy uniform random classifier. These results confirm that metadata carry important information about the dynamics of HIV transmission as embedded in transmission clusters. Hence, novel computational approaches are needed to apply the non-trivial knowledge collected from inter-individual genetic information to metadata from PWH in order to expand the estimated transmissions. We note that feature extraction alone will not be effective in identifying patterns of transmission and will result in random clustering of the data, but its utilization in conjunction with genetic data and the right algorithm can contribute to the expansion of the reconstructed network beyond individuals with genetic data.
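A sketch of the first classification step described above, using scikit-learn's random forest on metadata-style features; the dataset here is synthetic and the feature encodings are assumptions, not the PIRC cohort data.

```python
# Predicting whether a pair of cases is genetically linked from metadata
# features. Feature names mirror those listed in the abstract; the data
# is synthetic for illustration.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

rng = np.random.default_rng(0)
n = 1000
X = np.column_stack([
    rng.integers(18, 70, n),   # age
    rng.integers(0, 2, n),     # birth sex (encoded)
    rng.integers(0, 5, n),     # transmission category (encoded)
    rng.normal(4.5, 1.0, n),   # log10 first viral load (assumed encoding)
])
y = rng.integers(0, 2, n)      # 1 = genetically linked pair

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print(classification_report(y_te, clf.predict(X_te)))
```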
APA, Harvard, Vancouver, ISO, and other styles
47

Liakos, Panagiotis, Panagiota Koltsida, George Kakaletris, Peter Baumann, Yannis Ioannidis, and Alex Delis. "A Distributed Infrastructure for Earth-Science Big Data Retrieval." International Journal of Cooperative Information Systems 24, no. 02 (June 2015): 1550002. http://dx.doi.org/10.1142/s0218843015500021.

Full text
Abstract:
Earth-Science data are composite, multi-dimensional and of significant size, and as such, continue to pose a number of ongoing problems regarding their management. With new and diverse information sources emerging and rates of generated data continuously increasing, a persistent challenge becomes more pressing: to make the information existing in multiple heterogeneous resources readily available. The widespread use of the XML data-exchange format has enabled the rapid accumulation of semi-structured metadata for Earth-Science data. In this paper, we exploit this popular use of XML and present the means for querying metadata emanating from multiple sources in a succinct and effective way, releasing the user from the very tedious and time-consuming task of examining individual XML descriptions one by one. Our approach, termed Meta-Array Data Search (MAD Search), brings together diverse data sources while enhancing the user-friendliness of the underlying information sources. We gather metadata using different standards and construct an amalgamated service with the help of tools that discover and harvest such metadata; this service facilitates the end-user by offering easy and timely access to all metadata. The main contribution of our work is a novel query language termed xWCPS, which builds on top of two widely-adopted standards: XQuery and the Web Coverage Processing Service (WCPS). xWCPS furnishes a rich set of features regarding the way scientific data can be queried. Our proposed unified language allows for requesting metadata while also giving processing directives. Consequently, the xWCPS-enabled MAD Search helps in both the retrieval and processing of large data sets hosted in a heterogeneous infrastructure. We demonstrate the effectiveness of our approach through diverse use-cases that provide insights into the syntactic power and overall expressiveness of xWCPS. We evaluate MAD Search in a distributed environment comprising five high-volume array databases whose sizes range between 20 and 100 GB, ascertaining the applicability and potential of our proposal.
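As illustration only, the sketch below submits a plain WCPS processing expression to a coverage service over HTTP; the endpoint is a placeholder and the query uses standard WCPS, not the paper's extended xWCPS syntax, which is not reproduced here.

```python
# Submitting a WCPS-style processing expression over HTTP with 'requests'.
# Placeholder endpoint and coverage name; plain WCPS, not xWCPS.
import requests

query = """
for $c in (AvgLandTemp)
return encode($c[ansi("2014-01":"2014-12")], "csv")
"""

resp = requests.post(
    "https://example.org/rasdaman/ows",  # placeholder endpoint
    data={"service": "WCS", "version": "2.0.1",
          "request": "ProcessCoverages", "query": query},
    timeout=60,
)
resp.raise_for_status()
print(resp.text[:200])
```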
APA, Harvard, Vancouver, ISO, and other styles
48

Kuźma, Marta, and Albina Mościcka. "Accessibility evaluation of topographic maps in the National Library of Poland." Abstracts of the ICA 1 (July 15, 2019): 1. http://dx.doi.org/10.5194/ica-abs-1-201-2019.

Full text
Abstract:
Abstract. Digital libraries are created and managed mainly by traditional libraries, archives and museums. They collect, process, and make available digitized collections and data about them. These collections often constitute cultural heritage and include, among others: books (including old prints), magazines, manuscripts, photographs, maps, atlases, postcards and graphics. An example of such a library is the National Library of Poland, which collects and provides digitally available data on about 55,000 maps.
The effective use of cultural heritage resources and information from the National Library of Poland creates both prerequisites and challenges for multidisciplinary research and cross-sectoral cooperation. These resources are an unlimited source of knowledge, constituting value in themselves but also providing data for many new studies, including interdisciplinary studies of the past. The information necessary for such research is usually distributed across a wide spectrum of fields, formats and languages, reflecting different points of view, and the key task is to find it in digital libraries.
The growth of digital library collections requires high-quality metadata to make the materials collected by libraries fully accessible and to enable their integration and sharing between institutions. Consequently, three main metadata quality criteria have been defined to enable metadata management and evaluation: accuracy, consistency, and completeness (Park, 2009; Park and Tosaka, 2010). Different aspects of metadata quality can also be defined as: accessibility, accuracy, availability, compactness, comprehensiveness, content, consistency, cost, data structure, ease of creation, ease of use, cost efficiency, flexibility, fitness for use, informativeness, quantity, reliability, standard, timeliness, transfer, usability (Moen et al., 1998). This list tells us where errors in metadata occur, which can hinder or completely disable access to materials available through a digital library.
Archival maps have always been present in libraries. In the digital age, geographical space has come to exist in libraries in two aspects: as collections of old maps, and as a geographic reference for sources other than cartographic materials. Despite much experience in this field, the authors emphasize that the main problem is that most libraries do not add coordinates to the metadata, which is required to enable and support geographical search (Southall and Pridal, 2012).
At this stage, the research concept is formed and the source materials necessary for its realization are collected. When using archival maps for such studies, detailed literature studies are important, covering the cartographic assumptions, the course and accuracy of the cartographic work, the printing method, the scope of updates in subsequent editions, and the period in which a given map was created. The ability to use cartographic materials also depends on the map's intended purpose. Awareness of the above issues allows researchers to avoid errors frequently made by non-cartographers, i.e., to prevent comparing maps at different scales and treating them as a basis for formulating very detailed yet unfortunately erroneous conclusions. Thus, one of the key tasks is to find materials that are comparable in terms of scale and that cover the same area and space in the historical period of interest.
The research aim is to evaluate the quality of topographic map metadata provided by the National Library of Poland, which is the basis for effective access to cartographic resources.
The first research question is: how should topographic maps be described in metadata to enable finding them in the National Library of Poland? In other words, what kind of map-specific information should be saved in metadata (and in what way) to properly characterize the spatially-related object?
The second research question is: which topographic maps have the best metadata, giving users the best chance of finding the cartographic materials necessary for their research?
The paper presents the results of research on defining criteria and features for metadata evaluation, i.e., how archival maps are described. For maps, this is a set of map features collected in the metadata, including the geographic location, map scale, map orientation, and cartographic presentation methods. The conducted evaluation refers to the quality of metadata or, in other words, the accessibility of archival cartographic resources.
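A tiny sketch of the completeness criterion applied to the map-specific feature set named in the abstract (geographic location, map scale, map orientation, cartographic presentation method); the field names and records are illustrative.

```python
# Scoring each map record by the share of map-specific metadata fields
# it fills in; a minimal take on the completeness criterion.
REQUIRED_FIELDS = ["geographic_location", "map_scale",
                   "map_orientation", "presentation_method"]

def completeness(record: dict) -> float:
    filled = sum(1 for f in REQUIRED_FIELDS if record.get(f))
    return filled / len(REQUIRED_FIELDS)

records = [
    {"geographic_location": "52.2N 21.0E", "map_scale": "1:100000"},
    {"geographic_location": "50.1N 19.9E", "map_scale": "1:25000",
     "map_orientation": "north", "presentation_method": "hachures"},
]
for r in records:
    print(f"{completeness(r):.0%} of map-specific fields present")
```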
APA, Harvard, Vancouver, ISO, and other styles
49

Susuri, Arsim, Mentor Hamiti, and Agni Dika. "Detection of Vandalism in Wikipedia using Metadata Features – Implementation in Simple English and Albanian sections." Advances in Science, Technology and Engineering Systems Journal 2, no. 4 (March 2017): 1–7. http://dx.doi.org/10.25046/aj020401.

Full text
APA, Harvard, Vancouver, ISO, and other styles
50

Al-Fatlawii, Talib, and Abbas AL_Bakery. "Distributed Agents for Web Content Filtering." Iraqi Journal for Computers and Informatics 42, no. 1 (December 31, 2016): 1–4. http://dx.doi.org/10.25195/ijci.v42i1.77.

Full text
Abstract:
This paper describes a Web content filtering system aimed at blocking offensive material using distributed agents. The proposed system uses the FCM algorithm together with other page features (title, metadata, warning message) to classify candidate websites into two types: white, considered acceptable, and black, containing harmful material, taking English pornographic websites as a case study.
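Reading "FCM" as fuzzy C-means, the sketch below is a compact NumPy implementation of that clustering step; the page features (title, metadata, warning message) would first be encoded as the numeric vectors in X. This is not the paper's exact pipeline.

```python
# Compact fuzzy C-means (FCM) in NumPy; cluster memberships are soft,
# and the final hard labels could map to white/black lists.
import numpy as np

def fcm(X, c=2, m=2.0, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    U = rng.dirichlet(np.ones(c), size=len(X))  # soft membership matrix
    for _ in range(iters):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-9
        inv = d ** (-2.0 / (m - 1.0))           # standard FCM update
        U = inv / inv.sum(axis=1, keepdims=True)
    return centers, U

# Toy data: two well-separated groups of 3-dimensional feature vectors.
X = np.vstack([np.random.rand(20, 3), np.random.rand(20, 3) + 2])
centers, U = fcm(X)
labels = U.argmax(axis=1)  # 0/1 -> e.g. "white" vs "black" list
```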
APA, Harvard, Vancouver, ISO, and other styles