Journal articles on the topic 'Cassandra database system'

To see the other types of publications on this topic, follow the link: Cassandra database system.



Consult the top 45 journal articles for your research on the topic 'Cassandra database system.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse journal articles on a wide variety of disciplines and organise your bibliography correctly.

1

Andor, C. F. "Performance Benchmarking for NoSQL Database Management Systems." Studia Universitatis Babeș-Bolyai Informatica 66, no. 1 (July 1, 2021): 23. http://dx.doi.org/10.24193/subbi.2021.1.02.

Abstract:
NoSQL database management systems are very diverse and are known to evolve very fast. With so many NoSQL database options available nowadays, it is getting harder to make the right choice for certain use cases. Also, even for a given NoSQL database management system, performance may vary significantly between versions. Database performance benchmarking shows the actual performance for different scenarios on different hardware configurations in a straightforward and precise manner. This paper presents a NoSQL database performance study in which two of the most popular NoSQL database management systems (MongoDB and Apache Cassandra) are compared, with throughput as the analyzed metric. Results show that Apache Cassandra outperforms MongoDB in an update-heavy scenario only when the number of operations is high. Also, for a read-intensive scenario, Apache Cassandra outperforms MongoDB only when both the number of operations and the degree of parallelism are high.
2

Gorbenko, Anatoliy, and Olga Tarasyuk. "EXPLORING TIMEOUT AS A PERFORMANCE AND AVAILABILITY FACTOR OF DISTRIBUTED REPLICATED DATABASE SYSTEMS." RADIOELECTRONIC AND COMPUTER SYSTEMS, no. 4 (November 27, 2020): 98–105. http://dx.doi.org/10.32620/reks.2020.4.09.

Abstract:
A concept of distributed replicated data storages like Cassandra, HBase, and MongoDB has been proposed to effectively manage Big Data sets whose volume, velocity, and variability are difficult to deal with using traditional Relational Database Management Systems. Trade-offs between consistency, availability, partition tolerance, and latency are intrinsic to such systems. Although relations between these properties have been previously identified by the well-known CAP theorem in qualitative terms, it is still necessary to quantify how different consistency and timeout settings affect system latency. The paper reports the results of Cassandra's performance evaluation using the YCSB benchmark and experimentally demonstrates how read latency depends on the consistency settings and the current database workload. These results clearly show that stronger data consistency increases system latency, which is in line with the qualitative implication of the CAP theorem. Moreover, Cassandra latency and its variation considerably depend on the system workload. The distributed nature of such a system does not always guarantee that the client receives a response from the database within a finite time. When a response is received too late, or is not received at all, a so-called timing failure occurs. The paper also considers the role of the application timeout, which is a fundamental part of all distributed fault-tolerance mechanisms working over the Internet, serves as their main error-detection mechanism, and is the main determinant in the interplay between system availability and responsiveness. It is quantitatively shown how different timeout settings affect system availability and the average servicing and waiting time. Although many modern distributed systems, including Cassandra, use static timeouts, the most promising approach is shown to be setting timeouts dynamically at run time to balance performance and availability and to improve the efficiency of fault-tolerance mechanisms.
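The dynamic-timeout idea discussed in this abstract can be illustrated with a small sketch. This is my own illustration under simple assumptions (the timeout is derived from the mean and standard deviation of a recent latency window), not the algorithm from the paper:

```python
import statistics

def adaptive_timeout(recent_latencies_ms, factor=1.5, floor_ms=50.0):
    """Return the next request timeout in milliseconds.

    Hypothetical sketch: timeout = mean + factor * stdev over a
    sliding window of observed latencies, never below floor_ms.
    An illustration of dynamic timeouts in general, not the
    paper's method.
    """
    if len(recent_latencies_ms) < 2:
        return floor_ms
    mean = statistics.mean(recent_latencies_ms)
    spread = statistics.stdev(recent_latencies_ms)
    return max(floor_ms, mean + factor * spread)

# A lightly loaded replica keeps the floor timeout; a heavily loaded
# one relaxes the timeout instead of reporting spurious timing failures.
light = adaptive_timeout([10, 12, 11, 13, 12])    # clamped to floor_ms
heavy = adaptive_timeout([80, 200, 150, 300, 120])
```

The point of the sketch is the trade-off the paper quantifies: a tight static timeout improves responsiveness but produces timing failures under load, while an adaptive one follows the observed latency distribution.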
3

Wang, Bo-Qian, Qi Yu, Xin Liu, Li Shen, and Zhi-ying Wang. "A System Performance Estimation Model for Cassandra Database." International Journal of Database Theory and Application 9, no. 3 (March 31, 2016): 123–36. http://dx.doi.org/10.14257/ijdta.2016.9.3.14.

4

Paralakshmi, Hima S., and Surekha Mariam Varghese. "Cassandra a distributed NoSQL database for Hotel Management System." International Journal on Cybernetics & Informatics 5, no. 2 (April 30, 2016): 109–16. http://dx.doi.org/10.5121/ijci.2016.5212.

5

Béjar-Martos, Juan A., Antonio J. Rueda-Ruiz, Carlos J. Ogayar-Anguita, Rafael J. Segura-Sánchez, and Alfonso López-Ruiz. "Strategies for the Storage of Large LiDAR Datasets—A Performance Comparison." Remote Sensing 14, no. 11 (May 31, 2022): 2623. http://dx.doi.org/10.3390/rs14112623.

Abstract:
The widespread use of LiDAR technologies has led to an ever-increasing volume of captured data that poses a continuous challenge for its storage and organization, so that it can be efficiently processed and analyzed. Although the use of system files in formats such as LAS/LAZ is the most common solution for LiDAR data storage, databases are gaining in popularity due to their evident advantages: centralized and uniform access to a collection of datasets; better support for concurrent retrieval; distributed storage in database engines that allow sharding; and support for metadata or spatial queries by adequately indexing or organizing the data. The present work evaluates the performance of four popular NoSQL and relational database management systems with large LiDAR datasets: Cassandra, MongoDB, MySQL, and PostgreSQL. To perform a realistic assessment, we integrate these database engines in a repository implementation with an elaborate data model that enables metadata and spatial queries and progressive/partial data retrieval. Our experimentation concludes that, as expected, there is a modest but significant performance difference in favor of the NoSQL databases, and that Cassandra provides the best overall database solution for LiDAR data.
6

Gorbenko, Anatoliy, Andrii Karpenko, and Olga Tarasyuk. "Performance evaluation of various deployment scenarios of the 3-replicated Cassandra NoSQL cluster on AWS." RADIOELECTRONIC AND COMPUTER SYSTEMS, no. 4 (November 29, 2021): 157–65. http://dx.doi.org/10.32620/reks.2021.4.13.

Abstract:
A concept of distributed replicated NoSQL data storages such as Cassandra, HBase, and MongoDB has been proposed to effectively manage Big Data sets whose volume, velocity, and variability are difficult to deal with using traditional Relational Database Management Systems. Trade-offs between consistency, availability, partition tolerance, and latency are intrinsic to such systems. Although relations between these properties have been previously identified by the well-known CAP and PACELC theorems in qualitative terms, it is still necessary to quantify how different consistency settings, deployment patterns, and other properties affect system performance. This experience report analyses the performance of a Cassandra NoSQL database cluster and studies the trade-off between data consistency guarantees and performance in distributed data storages. The primary focus is on investigating the quantitative interplay between Cassandra response time, throughput, and its consistency settings, considering different single- and multi-region deployment scenarios. The study uses the YCSB benchmarking framework and reports the results of read and write performance tests of a three-replicated Cassandra cluster deployed on Amazon AWS. We also put forward a notation that can be used to formally describe the distributed deployment of a Cassandra cluster and its nodes relative to each other and to a client application. We present quantitative results showing how different consistency settings and deployment patterns affect Cassandra performance under different workloads. In particular, our experiments show that strong consistency costs up to 22 % of performance in the case of a centralized Cassandra cluster deployment and can cause a 600 % increase in the latency of read/write requests when Cassandra replicas and their clients are globally distributed across different AWS Regions.
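The consistency settings studied above follow standard replica-quorum arithmetic: a read acknowledged by R replicas and a write acknowledged by W replicas are strongly consistent when the two sets must overlap, i.e. R + W > N. A minimal sketch of that rule (my illustration, not code from the paper):

```python
def quorum(n_replicas):
    # QUORUM consistency level: a majority of the N replicas
    return n_replicas // 2 + 1

def is_strongly_consistent(n_replicas, r, w):
    # Reads see the latest write iff every read set must
    # intersect every write set: R + W > N
    return r + w > n_replicas

# Three-replica cluster: QUORUM reads + QUORUM writes (2 + 2 > 3)
# give strong consistency; ONE + ONE (1 + 1 <= 3) does not.
```

Raising R or W is exactly what the abstract's "stronger consistency" means, and each extra required acknowledgement adds latency, especially across AWS Regions.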
7

Aniceto, Rodrigo, Rene Xavier, Valeria Guimarães, Fernanda Hondo, Maristela Holanda, Maria Emilia Walter, and Sérgio Lifschitz. "Evaluating the Cassandra NoSQL Database Approach for Genomic Data Persistency." International Journal of Genomics 2015 (2015): 1–7. http://dx.doi.org/10.1155/2015/502795.

Abstract:
Rapid advances in high-throughput sequencing techniques have created interesting computational challenges in bioinformatics. One of them is the management of the massive amounts of data generated by automatic sequencers. We need to deal with the persistency of genomic data, particularly storing and analyzing these large-scale processed data. Finding an alternative to the frequently considered relational database model has become a compelling task. Other data models may be more effective when dealing with very large amounts of nonconventional data, especially for writing and retrieving operations. In this paper, we discuss the Cassandra NoSQL database approach for storing genomic data. We perform an analysis of persistency and I/O operations with real data, using the Cassandra database system. We also compare the results obtained with a classical relational database system and another NoSQL database approach, MongoDB.
8

Stetsyk, Oleksii, and Svitlana Terenchuk. "COMPARATIVE ANALYSIS OF NOSQL DATABASES ARCHITECTURE." Management of Development of Complex Systems, no. 47 (September 27, 2021): 78–82. http://dx.doi.org/10.32347/2412-9933.2021.47.78-82.

Abstract:
This article is devoted to the study of problematic issues caused by the growing scale of, and requirements for, modern high-load distributed systems. The relevance of the work stems from the fact that an important component of every such system is a database. The paper highlights the main problems associated with the use of relational databases in many high-load distributed systems. The main focus is on the study of such properties as data consistency, availability, and stability of the system. Basic information about the architecture and purpose of wide-column non-relational databases, key-value databases, and document-oriented databases is provided. The advantages and disadvantages of non-relational databases of different types are shown, which manifest themselves in solving different problems depending on the purpose and features of the system. The choice of non-relational databases of different types for comparative analysis is substantiated. Databases such as Cassandra, Redis, and Mongo, which have long been used in high-load distributed systems and have already proven themselves among users, have been studied in detail. The main task addressed in this article was to answer the question of the feasibility of using non-relational databases with the architectures of Cassandra, Redis, and Mongo, depending on the characteristics of the system and whether it mainly reads or records information. Based on the analysis, options for using these databases in systems with a high number of requests to read or write information are proposed.
9

Milošević, Danijela, Selver Pepić, Muzafer Saračević, and Milan Tasić. "Weighted Moore–Penrose generalized matrix inverse: MySQL vs. Cassandra database storage system." Sādhanā 41, no. 8 (August 2016): 837–46. http://dx.doi.org/10.1007/s12046-016-0523-6.

10

Deng, Lu, Wei Hong Han, Dong Liu, and Ying Xiong. "The Model of Fuzzy Retrieval Based on External Index." Applied Mechanics and Materials 380-384 (August 2013): 1605–8. http://dx.doi.org/10.4028/www.scientific.net/amm.380-384.1605.

Abstract:
With the rapid development of the Internet, NoSQL databases are getting more and more attention for their advantages of high concurrency, high scalability, and high availability. To solve the problem of how to retrieve data from such databases efficiently and accurately, a model of fuzzy retrieval based on an external index is proposed in this paper. With this external index, the retrieval efficiency of NoSQL databases is greatly improved when the queried column is not specified. As an example, the Cassandra database is adopted to store the data, and the external index is stored in relational databases. A community information management system is used to show the feasibility of the model. The results show that the model can save a lot of time in retrieval when the column is not specified. Moreover, this model can also be applied to other NoSQL databases.
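The external-index model described above can be sketched as an inverted index kept outside the main store: it maps every token appearing in any column of a row to that row's key, so a keyword lookup works without naming a column. A toy in-memory illustration (my own sketch with invented names; the paper stores its index in a separate database):

```python
from collections import defaultdict

class ExternalIndex:
    """Toy inverted index over a column-family store: token -> row keys."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, row_key, row):
        # Index every column value of the row, token by token
        for value in row.values():
            for token in str(value).lower().split():
                self._postings[token].add(row_key)

    def search(self, term):
        # Keys of all rows containing the term in any column
        return self._postings.get(term.lower(), set())

index = ExternalIndex()
index.add("row1", {"name": "Community Center", "city": "Springfield"})
index.add("row2", {"name": "Sports Center"})
```

The gain the abstract reports comes from exactly this shape: without such an index, a query over an unspecified column forces a scan of every row.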
11

Dahunsi, F. M., A. J. Joseph, O. A. Sarumi, and O. O. Obe. "Database management system for mobile crowdsourcing applications." Nigerian Journal of Technology 40, no. 4 (October 27, 2021): 713–27. http://dx.doi.org/10.4314/njt.v40i4.18.

Abstract:
The evaluation of mobile crowdsourcing activities and reports requires a viable and large volume of data. These data are gathered in real time, from a large number of paid or unpaid volunteers, over a period of time. A high volume of quality data from smartphones or mobile devices is pivotal to the accuracy and validity of the results. Therefore, there is a need for a robust and scalable database structure that can effectively manage and store the large volumes of data collected from various volunteers without compromising the integrity of the data. An in-depth review of various database designs is presented, to select the most suitable one for a real-time, robust, large-volume volunteer data handling system. A non-relational database, Google Cloud Firestore, was proposed for the mobile-end database, specifically due to its support for mobile client implementation; this choice also makes the integration of data from the mobile end-users to the cloud-hosted database relatively easy, with all proposed services being part of the Google Cloud Platform, although it is not as popular as some other database services. Separate comparative reviews of Database Management System (DBMS) performance demonstrated that MongoDB (a non-relational database) performed better when reading large datasets and performing full-text queries, while MySQL (relational) and Cassandra (non-relational) performed much better for data insertion. Google BigQuery was proposed as an appropriate data warehouse solution. It will provide continuity and direct integration with Cloud Firestore and its Application Programming Interface (API) for data migration from Cloud Firestore to BigQuery and the local server. Google BigQuery also provides machine learning support for data analytics.
12

Jansen, Gregory, Aaron Coburn, Adam Soroka, Will Thomas, and Richard Marciano. "DRAS-TIC Linked Data: Evenly Distributing the Past." Publications 7, no. 3 (July 4, 2019): 50. http://dx.doi.org/10.3390/publications7030050.

Abstract:
Memory institutions must be able to grow a fully-functional repository incrementally as collections grow, without expensive enterprise storage, massive data migrations, and the performance limits that stem from the vertical storage strategies. The Digital Repository at Scale that Invites Computation (DRAS-TIC) Fedora research project, funded by a two-year National Digital Platform grant from the Institute for Museum and Library Services (IMLS), is producing open-source software, tested cluster configurations, documentation, and best-practice guides that enable institutions to manage linked data repositories with petabyte-scale collections reliably. DRAS-TIC is a research initiative at the University of Maryland (UMD). The first DRAS-TIC repository system, named Indigo, was developed in 2015 and 2016 through a collaboration between U.K.-based storage company, Archive Analytics Ltd., and the UMD iSchool Digital Curation Innovation Center (DCIC), through funding from an NSF DIBBs (Data Infrastructure Building Blocks) grant (NCSA “Brown Dog”). DRAS-TIC Indigo leverages industry standard distributed database technology, in the form of Apache Cassandra, to provide open-ended scaling of repository storage without performance degradation. With the DRAS-TIC Fedora initiative, we make use of the Trellis Linked Data Platform (LDP), developed by Aaron Coburn at Amherst College, to add the LDP API over similar Apache Cassandra storage. This paper will explain our partner use cases, explore the system components, and showcase our performance-oriented approach, with the most emphasis given to performance measures available through the analytical dashboard on our testbed website.
13

Nagarakshitha, B. R., K. S. Lohith, K. P. Aarthy, Arjun Gopkumar, and Uma Satya Ranjan. "Application of NoSQL Technology to Facilitate Storing and Retrieval of Clinical Data Using IndexedDb in Offline Conditions." Journal of Computational and Theoretical Nanoscience 17, no. 9 (July 1, 2020): 4012–15. http://dx.doi.org/10.1166/jctn.2020.9010.

Abstract:
Data collection is a very important aspect of any research, especially when dealing with the collection of clinical data. This paper presents a way to collect and manage clinical data using a web application with offline functionality. The whole application is an end-to-end PWA providing an interface to collect, store, and query the data. The available data is very large and unstructured; to store it, a NoSQL database such as Cassandra is most suitable. The data will mostly be used by an OLAP system for querying, cleaning, and analysis. During data collection, the application has to work under low- or no-bandwidth conditions, for which IndexedDB, a NoSQL system provided by most web browsers, can be used to store data locally under offline conditions.
14

Cereda, Stefano, Stefano Valladares, Paolo Cremonesi, and Stefano Doni. "CGPTuner." Proceedings of the VLDB Endowment 14, no. 8 (April 2021): 1401–13. http://dx.doi.org/10.14778/3457390.3457404.

Abstract:
Properly selecting the configuration of a database management system (DBMS) is essential to increase performance and reduce costs. However, the task is astonishingly tricky due to a large number of tunable configuration parameters and their inter-dependencies. Also, the optimal configuration depends upon the workload to which the DBMS is exposed. To extract the full potential of a DBMS, we must also consider the entire IT stack on which the DBMS is running, comprising layers like the Java virtual machine, the operating system and the physical machine. Each layer offers a multitude of parameters that we should take into account. The available parameters vary as new software versions are released, making it impractical to rely on historical knowledge bases. We present a novel tuning approach for the DBMS configuration auto-tuning that quickly finds a well-performing configuration of an IT stack and adapts it to workload variations, without having to rely on a knowledge base. We evaluate the proposed approach using the Cassandra and MongoDB DBMSs, showing that it adjusts the suggested configuration to the observed workload and is portable across different IT applications. We try to minimise the memory consumption without increasing the response time, showing that the proposed approach reduces the response time and increases the memory requirements only under heavy-load conditions, reducing it again when the load decreases.
15

Das, Moumita, Jack C. P. Cheng, and Kincho H. Law. "An ontology-based web service framework for construction supply chain collaboration and management." Engineering, Construction and Architectural Management 22, no. 5 (September 21, 2015): 551–72. http://dx.doi.org/10.1108/ecam-07-2014-0089.

Abstract:
Purpose – The purpose of this paper is to present a framework for integrating construction supply chain in order to resolve the data heterogeneity and data sharing problems in the construction industry. Design/methodology/approach – Standardized web service technology is used in the proposed framework for data specification, transfer, and integration. Open standard SAWSDL is used to annotate web service descriptions with pointers to concepts defined in ontologies. NoSQL database Cassandra is used for distributed data storage among construction supply chain stakeholders. Findings – Ontology can be used to support heterogeneous data transfer and integration through web services. Distributed data storage facilitates data sharing and enhances data control. Practical implications – This paper presents examples of two ontologies for expressing construction supply chain information – ontology for material and ontology for purchase order. An example scenario is presented to demonstrate the proposed web service framework for material procurement process involving three parties, namely, project manager, contractor, and material supplier. Originality/value – The use of web services is not new to construction supply chains (CSCs). However, it still faces problems in channelizing information along CSCs due to data heterogeneity. Trust issue is also a barrier to information sharing for integrating supply chains in a centralized collaboration system. In this paper, the authors present a web service framework, which facilitates storage and sharing of information on a distributed manner mediated through ontology-based web services. Security is enhanced with access control. A data model for the distributed databases is also presented for data storage and retrieval.
16

Ferencz, Katalin. "Overview of Modern Nosql Database Management Systems. Case Study: Apache Cassandra." Műszaki Tudományos Közlemények 9, no. 1 (October 1, 2018): 83–86. http://dx.doi.org/10.33894/mtk-2018.09.16.

Abstract:
The wide spread of IoT devices makes it possible to collect enormous amounts of sensor data. Traditional SQL (structured query language) database management systems are not the most appropriate for storing this type of data; for this task, distributed database management systems are the most adequate. Apache Cassandra is open source, distributed database server software that stores large amounts of data on low-cost servers, providing high availability. Cassandra uses the gossip protocol to exchange information between the distributed servers. The query language used is CQL (Cassandra Query Language). In this paper we present an alternative solution to traditional SQL-based database management systems, the so-called NoSQL database management systems, summarize the main types of these systems, and provide a detailed description of the installation, configuration, and operation of the Apache Cassandra open source distributed database server.
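The gossip protocol mentioned above spreads cluster state epidemically: each node periodically exchanges its view with a randomly chosen peer, so information reaches all nodes in roughly logarithmic time. A toy push-gossip simulation (my own illustration, far simpler than Cassandra's actual gossiper, which exchanges versioned endpoint state):

```python
import random

def gossip_rounds(n_nodes, seed=0):
    """Count push-gossip rounds until every node knows a new fact.

    Toy model: one node starts informed; each round, every informed
    node pushes the fact to one uniformly random peer.
    """
    rng = random.Random(seed)
    informed = {0}
    rounds = 0
    while len(informed) < n_nodes:
        for _ in list(informed):
            informed.add(rng.randrange(n_nodes))
        rounds += 1
    return rounds

# Coverage grows roughly exponentially, so even a large cluster
# converges in a handful of rounds.
```

This is why gossip-based membership scales to the low-cost, many-server deployments the abstract describes: no node needs a global view, yet the cluster converges quickly.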
17

Lu, Wenjuan, Aiguo Liu, and Chengcheng Zhang. "Research and implementation of big data visualization based on WebGIS." Proceedings of the ICA 2 (July 10, 2019): 1–6. http://dx.doi.org/10.5194/ica-proc-2-79-2019.

Abstract:
With the development of geographic information technology, the ways of obtaining geographical information are constantly expanding and the volume of spatio-temporal data is exploding, so more and more scholars have entered the field of spatio-temporal data processing and analysis. Traditional data visualization technology is popular, simple, and easy to understand: simple pie charts and histograms can reveal and analyze the characteristics of the data itself, but they still cannot be combined with a map to display the hidden temporal and spatial information and realize its application value. How to fully explore the spatio-temporal information contained in massive data and accurately explore the spatial distribution and variation rules of geographical things and phenomena is a key research problem at present. Based on this, this paper designed and constructed a universal thematic data visual analysis system that supports the full set of functions of data warehousing, data management, data analysis, and data visualization. Taking Weifang city as the research area, and starting from rainfall interpolation analysis and comprehensive population analysis of Weifang, the authors achieve fast and efficient display over big data sets and fully display the characteristics of spatial and temporal data through the visualization of thematic data. At the same time, the Cassandra distributed database is adopted in this research, which can store, manage, and analyze big data. To a certain extent, it reduces the pressure of front-end map drawing, and it has good query analysis efficiency and fast processing ability.
18

Karpenko, Andrii, Olga Tarasyuk, and Anatoliy Gorbenko. "Research consistency and perfomance of nosql replicated databases." Advanced Information Systems 5, no. 3 (October 18, 2021): 66–75. http://dx.doi.org/10.20998/2522-9052.2021.3.09.

Abstract:
This paper evaluates the performance of distributed fault-tolerant computer systems and replicated NoSQL databases and studies the impact of data consistency on performance and throughput, using a three-replicated Cassandra cluster as an example. The paper presents results of heavy-load testing (benchmarking) of the read and write performance of a Cassandra cluster whose replicas were deployed on the Amazon EC2 cloud. The presented quantitative results show how different consistency settings affect the performance of a Cassandra cluster under different workloads, considering two deployment scenarios: when all cluster replicas are located in the same data center, and when they are geographically distributed across different data centers (i.e., Amazon availability zones). We propose a new method of minimizing Cassandra response time while ensuring strong data consistency, based on optimizing consistency settings depending on the current workload and the proportion between read and write operations.
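The optimization described in this abstract, choosing consistency settings from the read/write mix while preserving strong consistency, can be caricatured in a few lines. This is my own simplified sketch with a linear cost model and a hypothetical function name, not the authors' method:

```python
def pick_consistency(n_replicas, read_fraction):
    """Choose read/write consistency levels (R, W) that keep strong
    consistency (R + W > N) while minimizing the expected number of
    replica acknowledgements per operation for the given workload.
    """
    best = None
    for r in range(1, n_replicas + 1):
        w = n_replicas + 1 - r   # smallest W that still overlaps every R
        cost = read_fraction * r + (1 - read_fraction) * w
        if best is None or cost < best[0]:
            best = (cost, r, w)
    _, r, w = best
    return r, w

# Read-heavy workloads get cheap reads (R=1, W=ALL); write-heavy
# workloads get the reverse.
```

The sketch captures the paper's premise: the (R, W) pair that minimizes response time is not fixed, but depends on the current proportion of reads to writes.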
19

Chang, Bao Rong, Hsiu-Fen Tsai, and Cin-Long Guo. "Applying Intelligent Adaptation to Remote Cloud Datacenter Backup." Journal of Advanced Computational Intelligence and Intelligent Informatics 20, no. 6 (November 20, 2016): 928–40. http://dx.doi.org/10.20965/jaciii.2016.p0928.

Abstract:
HBase and Cassandra are two of the most commonly used large-scale distributed NoSQL database management systems, especially applicable to processing large amounts of data. Regarding remote data backup, each kind of datacenter has its own backup strategy to prevent the risk of data loss. Using Thrift Java, this paper aims to implement highly efficient in-cloud remote datacenter backup for in-cloud NoSQL databases like HBase and Cassandra. The binary communications protocol technology from Apache Thrift is employed to establish a graphical user interface instead of a command-line interface, so as to ease data manipulation. In order to control the network traffic flow smoothly, intelligent adaptation using ANFIS and PSO is employed to tune the parameters of the NoSQL databases during remote data backup, to improve QoS in the network. A stress test involving strict data reading/writing and remote backup of a huge amount of data was undertaken to verify the effectiveness. Finally, the performance of a variety of benchmark databases was evaluated using a performance index. As a result, the proposed HBase approach outperforms the other databases.
20

Silva-Muñoz, Moisés, Alberto Franzin, and Hugues Bersini. "Automatic configuration of the Cassandra database using irace." PeerJ Computer Science 7 (August 5, 2021): e634. http://dx.doi.org/10.7717/peerj-cs.634.

Abstract:
Database systems play a central role in modern data-centered applications. Their performance is thus a key factor in the efficiency of data-processing pipelines. Modern database systems expose several parameters that users and database administrators can configure to tailor the database settings to the specific application considered. While this task has traditionally been performed manually, in recent years several methods have been proposed to automatically find the best parameter configuration for a database. Many of these methods, however, use statistical models that require large amounts of data and fail to represent all the factors that impact the performance of a database, or they implement complex algorithmic solutions. In this work we study the potential of a simple model-free general-purpose configuration tool to automatically find the best parameter configuration of a database. We use the irace configurator to automatically find the best parameter configuration for the Cassandra NoSQL database using the YCSB benchmark under different scenarios. We establish a reliable experimental setup, obtain speedups of up to 30% over the default configuration in terms of throughput, and provide an analysis of the configurations obtained.
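Model-free configuration search of the kind irace performs can be sketched as: sample configurations, measure a score, keep the best (irace itself adds racing and statistical elimination of poor candidates on top of this loop). A toy sketch with hypothetical Cassandra knob names and an invented throughput model, purely for illustration; real tuning would run YCSB against a live cluster:

```python
import random

def random_search(space, score, n_trials=50, seed=42):
    """Model-free configuration search: sample, evaluate, keep the best."""
    rng = random.Random(seed)
    best_cfg, best_score = None, float("-inf")
    for _ in range(n_trials):
        cfg = {knob: rng.choice(values) for knob, values in space.items()}
        s = score(cfg)
        if s > best_score:
            best_cfg, best_score = cfg, s
    return best_cfg, best_score

# Hypothetical knobs and an invented throughput model (illustration only)
space = {"concurrent_writes": [16, 32, 64, 128],
         "memtable_mb": [256, 512, 1024]}

def toy_throughput(cfg):
    return cfg["concurrent_writes"] * 10 - abs(cfg["memtable_mb"] - 512) // 4

best_cfg, best_score = random_search(space, toy_throughput)
```

The appeal the abstract highlights is precisely this model-freeness: no performance model of the database is needed, only the ability to run a benchmark and compare scores.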
21

Matallah, Houcine, Ghalem Belalem, and Karim Bouamrane. "Evaluation of NoSQL Databases." International Journal of Software Science and Computational Intelligence 12, no. 4 (October 2020): 71–91. http://dx.doi.org/10.4018/ijssci.2020100105.

Abstract:
The explosion of data quantities, which reflects the scaling of volumes, numbers, and types, has resulted in the development of new techniques for locating and accessing data. The latest steps in this evolution have produced new technologies: cloud computing and big data. The new requirements, and the difficulties encountered in the management of data classified as "big data", have given rise to NoSQL and NewSQL systems. This paper develops a comparative study of the performance of six NoSQL solutions employed by important companies in the IT sector: MongoDB, Cassandra, HBase, Redis, Couchbase, and OrientDB. To compare the performance of these NoSQL systems, the authors use a very powerful tool called YCSB: the Yahoo! Cloud Serving Benchmark. The contribution is to provide some answers for choosing the appropriate NoSQL system for the type of data used and the type of processing performed on that data.
22

Saundatti, Sujay. "Databases In The 21’st Century." International Journal for Research in Applied Science and Engineering Technology 10, no. 6 (June 30, 2022): 1440–44. http://dx.doi.org/10.22214/ijraset.2022.43982.

Abstract:
NoSQL databases are 21st-century databases created to overcome the disadvantages of RDBMS. The objective of NoSQL is to provide scalability and availability and to meet the various requirements of distributed computing. The main motivations for NoSQL database systems are scalability and failover needs. In the vast majority of NoSQL database systems, data is partitioned and replicated across numerous nodes. Inherently, most of them utilize either Google's MapReduce, the Hadoop Distributed File System, or Hadoop MapReduce for data collection. Cassandra, HBase, and MongoDB are the most widely used, and they can be regarded as representatives of the NoSQL world.
23

Kaur, Harpreet. "Analysis of Nosql Database State-of-The-Art Techniques and their Security Issues." Turkish Journal of Computer and Mathematics Education (TURCOMAT) 12, no. 2 (April 11, 2021): 467–71. http://dx.doi.org/10.17762/turcomat.v12i2.852.

Full text
Abstract:
NoSQL database systems are extremely optimized for performing retrieval and append operations on large quantities of data, whereas relational models are comparatively inefficient at this. They are used mainly for real-time applications and for statistically analyzing growing amounts of data. NoSQL databases emerging in the market claim to outperform SQL databases. In the present age of technology, everyone wants to save and secure their data so that no one can view their information without permission. However, there are multifarious security issues which are yet to be resolved. In this paper, we discuss and review NoSQL databases and the most popular security issues of two of them (Cassandra and MongoDB).
APA, Harvard, Vancouver, ISO, and other styles
24

Diogo, Miguel, Bruno Cabral, and Jorge Bernardino. "Consistency Models of NoSQL Databases." Future Internet 11, no. 2 (February 14, 2019): 43. http://dx.doi.org/10.3390/fi11020043.

Full text
Abstract:
The Internet has become so widespread that most popular websites are accessed by hundreds of millions of people on a daily basis. Monolithic architectures, which were frequently used in the past, were mostly composed of traditional relational database management systems, but have quickly become incapable of sustaining the high data traffic that is common these days. Meanwhile, NoSQL databases have emerged to provide some properties missing in relational databases, like schema-less design, horizontal scaling, and eventual consistency. This paper analyzes and compares the consistency model implementations of five popular NoSQL databases: Redis, Cassandra, MongoDB, Neo4j, and OrientDB. All of them offer at least eventual consistency, and some have the option of supporting strong consistency. However, imposing strong consistency results in less availability when subject to network partition events.
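The strong-vs-eventual trade-off is often explained with read/write quorums: with N replicas, choosing R + W > N forces every read set to overlap the latest write set. A toy simulation (replica counts and values are illustrative):

```python
class Replica:
    def __init__(self):
        self.value, self.version = None, 0

def write(replicas, w, value, version):
    # a write completes once w replicas have acknowledged it
    for replica in replicas[:w]:
        replica.value, replica.version = value, version

def read(replicas, r):
    # contact r replicas and keep the freshest version seen
    return max(replicas[:r], key=lambda rep: rep.version).value

N = 3
replicas = [Replica() for _ in range(N)]
write(replicas, w=N, value="v1", version=1)   # v1 fully replicated

# A later write reaches only one replica (W = 1).
write(replicas, w=1, value="v2", version=2)

# R + W > N: the read set must overlap the write set, so it sees v2.
strong = read(replicas, r=3)

# R + W <= N: a one-replica read can miss the write and return stale v1.
stale = read(replicas[::-1], r=1)
```

Cassandra exposes this knob per operation (consistency levels ONE, QUORUM, ALL), which is why it can sit anywhere on the spectrum the paper describes.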
APA, Harvard, Vancouver, ISO, and other styles
25

Gupta, Sangeeta. "Performance Evaluation of Unstructured PBRA for Bigdata with Cassandra and MongoDB in Cloud." International Journal of Cloud Applications and Computing 8, no. 3 (July 2018): 48–59. http://dx.doi.org/10.4018/ijcac.2018070104.

Full text
Abstract:
In this article, a performance evaluation of web collection data in data stores such as NoSQL Cassandra and MongoDB is presented, yielding scalability of applications. In addition to scalability, the security of NoSQL databases remains largely unproven. Notably, existing works in the area of cloud with NoSQL focus on either scalability or security, but not both, and security, if provided, is only at minor interface levels. In this article, the PBRA system is designed to deal with highly unstructured big data emerging from the Twitter social networking service, and is new of its kind in strengthening big data security. PBRA is a Passphrase-Based REST API model in which the REST API methods are integrated with a user-generated passphrase, in addition to the private key, for a user-chosen number of records before storing into the Cassandra and MongoDB databases. Results are presented for nearly 1 million records, and the efficiency of Cassandra over MongoDB is observed: although the time taken to load and retrieve bulk records is higher when dealing with ciphertext, Cassandra performs better than MongoDB with the proposed security model.
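The paper does not spell out PBRA's internals here; as a loose illustration of passphrase-derived protection for a batch of records, one could derive a key with PBKDF2 and tag each batch with an HMAC before storage. All function names and parameters below are assumptions for the sketch, not the authors' design:

```python
import hashlib
import hmac
import os

def derive_key(passphrase, salt, iterations=100_000):
    # PBKDF2 stretches a user passphrase into a fixed-length key
    return hashlib.pbkdf2_hmac("sha256", passphrase.encode(), salt, iterations)

def seal_batch(records, key):
    # tag a batch of records so tampering is detectable before/after storage
    payload = "\n".join(records).encode()
    tag = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return payload, tag

def verify_batch(payload, tag, key):
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, tag)

salt = os.urandom(16)
key = derive_key("correct horse battery staple", salt)
payload, tag = seal_batch(["tweet-1: ...", "tweet-2: ..."], key)
ok = verify_batch(payload, tag, key)
tampered = verify_batch(payload + b"x", tag, key)
```

A real deployment would also encrypt the payload; the point here is only that a passphrase, combined with a salt or private key, can bind integrity to user-chosen record batches.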
APA, Harvard, Vancouver, ISO, and other styles
26

Antas, João, Rodrigo Rocha Silva, and Jorge Bernardino. "Assessment of SQL and NoSQL Systems to Store and Mine COVID-19 Data." Computers 11, no. 2 (February 21, 2022): 29. http://dx.doi.org/10.3390/computers11020029.

Full text
Abstract:
COVID-19 has provoked enormous negative impacts on human lives and the world economy. In order to help in the fight against this pandemic, this study evaluates different database systems and selects the most suitable for storing, handling, and mining COVID-19 data. We evaluate different SQL and NoSQL database systems using the following metrics: query runtime, memory used, CPU used, and storage size. The database systems assessed were Microsoft SQL Server, MongoDB, and Cassandra. We also evaluate Data Mining algorithms, including Decision Trees, Random Forest, Naive Bayes, and Logistic Regression, using Orange Data Mining software classification tests. Classification tests were performed using cross-validation on a table with about 3 M records, including COVID-19 exams with patients' symptoms. The Random Forest algorithm obtained the best average accuracy, recall, precision, and F1 score in the COVID-19 predictive model built in the mining stage. In the performance evaluation, MongoDB presented the best results for almost all tests with a large data volume.
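The cross-validation procedure used in such studies can be sketched without any ML library: split the data into k folds, hold each out in turn, train on the rest, and average the scores. The synthetic 70/30 label balance and the trivial majority-class baseline below are illustrative assumptions, not the paper's data or models:

```python
import random

def k_fold_indices(n, k, seed=7):
    # shuffle once, then deal indices into k folds round-robin
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def majority_class(labels):
    # baseline "classifier": always predict the most common training label
    return max(set(labels), key=labels.count)

def cross_validate(labels, k=5):
    folds = k_fold_indices(len(labels), k)
    scores = []
    for i, test_idx in enumerate(folds):
        train = [labels[j] for f, fold in enumerate(folds) if f != i for j in fold]
        prediction = majority_class(train)
        hits = sum(1 for j in test_idx if labels[j] == prediction)
        scores.append(hits / len(test_idx))
    return sum(scores) / k

# Hypothetical class balance: 70 negative / 30 positive exam outcomes.
labels = [0] * 70 + [1] * 30
mean_accuracy = cross_validate(labels, k=5)
```

Any real model (Random Forest, Logistic Regression, etc.) slots into the `prediction` step; the folding logic is what keeps the accuracy estimate honest.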
APA, Harvard, Vancouver, ISO, and other styles
27

Elghamrawy, Sally M., and Aboul Ella Hassanien. "A partitioning framework for Cassandra NoSQL database using Rendezvous hashing." Journal of Supercomputing 73, no. 10 (April 6, 2017): 4444–65. http://dx.doi.org/10.1007/s11227-017-2027-5.

Full text
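Rendezvous (highest-random-weight) hashing, which the title refers to, fits in a few lines: every client hashes each (node, key) pair and the highest score wins, so no shared lookup table is needed and removing a node only moves the keys it owned. Node names and keys below are illustrative:

```python
import hashlib

def score(node, key):
    # highest-random-weight: hash the (node, key) pair into a comparable score
    return hashlib.sha256(f"{node}:{key}".encode()).digest()

def owner(nodes, key):
    # every client independently computes the same winner
    return max(nodes, key=lambda node: score(node, key))

nodes = ["cass-1", "cass-2", "cass-3"]
placement = {k: owner(nodes, k) for k in ("row-a", "row-b", "row-c", "row-d")}

# Remove a node: only keys that node owned get reassigned (minimal disruption).
survivors = [n for n in nodes if n != "cass-2"]
moved = {k for k in placement if owner(survivors, k) != placement[k]}
```

The minimal-disruption property follows directly: for keys whose winner survives, the surviving scores are unchanged, so the winner cannot change.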
APA, Harvard, Vancouver, ISO, and other styles
28

Pop, Claudia, Marcel Antal, Tudor Cioara, Ionut Anghel, David Sera, Ioan Salomie, Giuseppe Raveduto, Denisa Ziu, Vincenzo Croce, and Massimo Bertoncini. "Blockchain-Based Scalable and Tamper-Evident Solution for Registering Energy Data." Sensors 19, no. 14 (July 10, 2019): 3033. http://dx.doi.org/10.3390/s19143033.

Full text
Abstract:
Nowadays, it has been recognized that blockchain can provide the technological infrastructure for developing decentralized, secure, and reliable smart energy grid management systems. However, an open issue that slows the adoption of blockchain technology in the energy sector is the low scalability and high processing overhead when dealing with the real-time energy data collected by smart energy meters. Thus, in this paper, we propose a scalable second-tier solution which combines the blockchain ledger with distributed queuing systems and NoSQL (Not Only SQL) databases to allow energy transactions to be registered less frequently on the chain without losing the tamper-evident benefits brought by the blockchain technology. At the same time, we propose a technique for tamper-evident registration of smart meters' energy data and associated energy transactions using digital fingerprinting, which allows each energy transaction to be linked hashed-back on-chain while the sensor data is stored off-chain. A prototype was implemented using Ethereum and smart contracts for the on-chain components, while for the off-chain components we used the Cassandra database and the RabbitMQ messaging broker. The prototype proved to be effective in managing a settlement of energy imbalances use-case, and the evaluation conducted in a simulated environment shows promising results in terms of scalability, throughput, and tamper-evidence for the energy data sampled by smart energy meters.
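The off-chain/on-chain split described above can be mimicked with plain hashing: bulky readings stay in the database, and only a digest is anchored on the ledger, so any later tampering with the off-chain copy is detectable. The dict "ledger", batch IDs, and record layout below are illustrative, not the paper's Ethereum/Cassandra prototype:

```python
import hashlib
import json

def fingerprint(records):
    # deterministic digest of a batch of meter readings
    blob = json.dumps(records, sort_keys=True).encode()
    return hashlib.sha256(blob).hexdigest()

off_chain = {}   # stands in for the Cassandra store (bulky sensor data)
on_chain = {}    # stands in for the ledger (only digests are anchored)

def register(batch_id, records):
    off_chain[batch_id] = records
    on_chain[batch_id] = fingerprint(records)   # hash-back link to the chain

def audit(batch_id):
    # recompute the off-chain digest and compare it with the anchored one
    return fingerprint(off_chain[batch_id]) == on_chain[batch_id]

register("batch-001", [{"meter": "m1", "kwh": 3.2}, {"meter": "m2", "kwh": 1.7}])
clean = audit("batch-001")
off_chain["batch-001"][0]["kwh"] = 9.9   # tamper with the off-chain copy
tampered = audit("batch-001")
```

This is why registering transactions "less frequently on the chain" loses no integrity: one on-chain digest covers an arbitrarily large off-chain batch.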
APA, Harvard, Vancouver, ISO, and other styles
29

Khashan, Eman, Ali Eldesouky, and Sally Elghamrawy. "An adaptive spark-based framework for querying large-scale NoSQL and relational databases." PLOS ONE 16, no. 8 (August 19, 2021): e0255562. http://dx.doi.org/10.1371/journal.pone.0255562.

Full text
Abstract:
The growing popularity of big data analysis and cloud computing has created new big data management standards. Sometimes, programmers may interact with a number of heterogeneous data stores depending on the information they are responsible for: SQL and NoSQL data stores. Interacting with heterogeneous data models via numerous APIs and query languages imposes challenging tasks on multi-data processing developers. Indeed, complex queries concerning homogenous data structures cannot currently be performed in a declarative manner when found in single data storage applications and therefore require additional development efforts. Many models have been presented to address complex queries via multistore applications. Some of these models implement a complex, unified, and fast model, while others are not efficient enough for this type of complex database query. This paper provides an automated, fast, and easy unified architecture to solve simple and complex SQL and NoSQL queries over heterogeneous data stores (CQNS). The proposed framework can be used in cloud environments or for any big data application to automatically help developers manage basic and complicated database queries. CQNS consists of three layers: the matching selector layer, the processing layer, and the query execution layer. The matching selector layer is the heart of this architecture: user queries are examined against queries stored in an engine library within the architecture, and a proposed algorithm directs each query to the right SQL or NoSQL database engine. Furthermore, CQNS deals with many NoSQL databases, like MongoDB, Cassandra, Riak, CouchDB, and Neo4j. This paper presents a Spark framework that can handle both SQL and NoSQL databases.
Four benchmark scenario datasets are used to evaluate the proposed CQNS for querying different NoSQL databases in terms of optimization process performance and query execution time. The results show that CQNS achieves the best latency and throughput in less time among the compared systems.
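The matching-selector idea — inspect an incoming query and route it to the right engine — can be caricatured with simple prefix dispatch. This is a toy stand-in, not CQNS's actual algorithm, which the abstract does not detail:

```python
def pick_engine(query):
    # toy "matching selector": inspect the query text and route it to an engine
    q = query.strip().lower()
    if q.startswith(("select", "insert", "update", "delete")):
        return "sql"        # relational engine
    if q.startswith("db."):
        return "mongodb"    # MongoDB shell-style query
    if q.startswith("match"):
        return "neo4j"      # Cypher graph query
    return "unknown"

routes = {
    "SELECT name FROM exams": "sql",
    "db.exams.find({})": "mongodb",
    "MATCH (p:Patient) RETURN p": "neo4j",
}
```

A production router would parse the query and consult a catalog of registered engines, but the layering is the same: classify first, then hand off to the executable engine.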
APA, Harvard, Vancouver, ISO, and other styles
30

Kim, Jinsul, Akm Ashiquzzaman, Van Quan Nguyen, and Sang Woo Kim. "Hybrid Mobile-App. on Multi-MEC Platforms in NFV Environment." International Journal of Engineering & Technology 7, no. 4.38 (December 3, 2018): 383. http://dx.doi.org/10.14419/ijet.v7i4.38.24587.

Full text
Abstract:
In recent times, the practicality of web applications has become more reliant upon big-data-oriented materials such as 4K videos, high-definition images, lossless audio, and massive texts. Structured Query Language (SQL) databases face compatibility issues with large-scale data. Because of this data storage problem, NoSQL databases are used for storing big data. NoSQL databases have recently been gaining traction, with many options such as MongoDB, CouchDB, Redis, and Apache Cassandra. One of the major restrictions that companies, enterprises, and developers encounter during application development is the multiplicative cost of building a native program across different platforms. Besides, Network Function Virtualization (NFV) plays a vital role in providing services for utilizing such applications at a larger and more effective scale. Hence, in this paper, we discuss our main motivation behind selecting the Ionic Framework, a hybrid system for rapid development of real-time applications based on Firebase, in an NFV environment cooperating with Mobile Edge Computing (MEC). As a result, this approach provides comparatively flexible features.
APA, Harvard, Vancouver, ISO, and other styles
31

Mazurova, Oksana, Mykola Andrushchenko, and Mariya Shirokopetleva. "RESEARCH OF METHODS OF SOFTWARE IMPLEMENTATION OF THE COSMOS DB API ON THE .NET PLATFORM." Innovative Technologies and Scientific Solutions for Industries, no. 2 (24) (August 5, 2023): 118–30. http://dx.doi.org/10.30837/itssi.2023.24.118.

Full text
Abstract:
Today, a large number of developers use the .NET platform to create applications that work with databases. In turn, Cosmos DB is becoming an increasingly popular choice as NoSQL storage for such databases. Cosmos DB is a flexible and scalable system, and the correct selection of the appropriate API during software implementation can significantly affect the performance of the programs themselves. Cosmos DB provides different APIs for working with different types of databases, such as SQL databases or databases running MongoDB or Cassandra. In turn, each of these APIs can be used through various methods of software implementation. The subject of the research is software implementations on the .NET platform for the various Cosmos DB APIs. When choosing the most suitable Cosmos DB API on the .NET platform, developers can be helped not only by the documentation but also by the results of experimental studies of these APIs, which in turn will improve the quality of the code and the performance of the systems themselves. The goal of the work is to increase the efficiency of software development on the .NET platform using the Cosmos DB API, by developing recommendations for selecting software implementation methods for these APIs based on the results of their experimental study. The tasks: to investigate and compare the methods of software implementation of the Cosmos DB API through an experimental study of the performance of different types of queries on these software solutions; and to analyze the obtained results and develop recommendations for the use of the methods. Methods: multi-criteria analysis of the Cosmos DB APIs, logical data modeling, experimental research. Results: software solutions were developed based on CosmosClient and Entity Framework Core for the Cosmos DB API for NoSQL, and on MongoClient for the Cosmos DB API for MongoDB.
A series of experiments and measurements of performance metrics for each of the software solutions were conducted, the obtained results were analyzed, and recommendations were offered for using the considered methods of software implementations of the Cosmos DB API on the .NET platform. Conclusion: In general, the choice of software approach depends on the specific task, but experiments have shown that CosmosDB API for NoSQL using CosmosClient is the best choice for small projects, and using the Entity Framework Core Cosmos is suitable for more complex projects with larger volumes of data and complex queries. If MongoDB is used in the project, then the corresponding solution using MongoClient is a better option than Cosmos DB API for NoSQL.
APA, Harvard, Vancouver, ISO, and other styles
32

Burmester, G. R., L. Coates, S. B. Cohen, Y. Tanaka, I. Vranic, E. Nagy, A. S. Chen, et al. "POS0232 POST-MARKETING SAFETY SURVEILLANCE OF TOFACITINIB OVER 9 YEARS IN PATIENTS WITH RHEUMATOID ARTHRITIS AND PSORIATIC ARTHRITIS." Annals of the Rheumatic Diseases 82, Suppl 1 (May 30, 2023): 347–48. http://dx.doi.org/10.1136/annrheumdis-2023-eular.1722.

Full text
Abstract:
Background: The safety of tofacitinib in patients (pts) with rheumatoid arthritis (RA) and psoriatic arthritis (PsA) has been demonstrated in clinical studies with up to 9.5 and 4 years (yrs) of observation, respectively. Real-world post-marketing surveillance (PMS) safety data comprised of spontaneous and voluntary adverse event (AE) reports for tofacitinib have been published for RA and ulcerative colitis, but not PsA.
Objectives: To further characterise the real-world safety profile of tofacitinib in RA and PsA.
Methods: AE reports were collected from 6 Nov 2012–6 Nov 2021 (RA) and 14 Dec 2017–6 Nov 2021 (PsA) from the Pfizer safety database. Tofacitinib was approved in the US for RA on 6 Nov 2012 (immediate release [IR]) and 24 Feb 2016 (extended release [XR]), and for PsA on 14 Dec 2017 (IR and XR). Safety endpoints included AEs, serious AEs (SAEs), AEs of special interest (AESI) and fatal cases. Pt years (PY) of exposure were estimated from IQVIA commercial sales data from 61 countries and 1 region. Number (N), frequency and reporting rates (RR; number of events/100 PY of estimated exposure) for each endpoint were summarised by indication (RA/PsA) and formulation (IR [5 or 10 mg twice daily], XR [11 mg once daily] or all tofacitinib [IR+XR]). A sensitivity analysis truncated the analysis period to the first 4 yrs post approval for RA (2012–16), to align with the duration of PsA data.
Results: Of the 73 525 case reports (68 131 RA/5394 PsA), 4239/368 (6.2%/6.8%) did not report a formulation and were excluded. Most AE reports were for females (RA: 81.8%/PsA: 71.3%); around half were submitted by healthcare professionals (49.8%/63.1%) and the majority were from North America (80.1%/82.3%). Almost all XR reports (RA/PsA: 93.0%/97.0%) originated from North America (IR reports: 72.3%/68.6%). For both indications, the RR for AEs was higher with XR vs IR; RR and frequency of SAEs, AESIs and fatal cases were mostly similar between XR and IR (Table 1). The most frequently reported AEs in RA and PsA by Preferred Term included drug ineffective, pain, condition aggravated, headache and arthralgia (Figure 1). Off-label use was more frequently reported in PsA than RA (Figure 1). In the first 4 yrs post approval of the IR formulation for RA (IR/XR: 49 439/2000 PY), the RRs for AEs, SAEs and fatal cases were 95.9/147.0, 19.1/24.5 and 0.4/0.4, respectively.
Table 1. Safety summary (N, %, RR per formulation)
RA (tofacitinib IR 312 632 PY; XR 126 738 PY; all tofacitinib 439 370 PY):
- AEs: IR 137 476, RR 44.0; XR 82 153, RR 64.8; all 219 629, RR 50.0
- SAEs: IR 24 966, 18.2%, RR 8.0; XR 11 978, 14.6%, RR 9.5; all 36 944, 16.8%, RR 8.4
- Serious infections: IR 4944, 3.6%, RR 1.6; XR 2467, 3.0%, RR 2.0; all 7411, 3.4%, RR 1.7
- HZ: IR 1194, 0.9%, RR 0.4; XR 529, 0.6%, RR 0.4; all 1723, 0.8%, RR 0.4
- CV events(a): IR 773, 0.6%, RR 0.3; XR 413, 0.5%, RR 0.3; all 1186, 0.5%, RR 0.3
- Malignancy(b): IR 941, 0.7%, RR 0.3; XR 429, 0.5%, RR 0.3; all 1370, 0.6%, RR 0.3
- VTE: IR 318, 0.2%, RR 0.1; XR 150, 0.2%, RR 0.1; all 468, 0.2%, RR 0.1
- Fatal cases: IR 839, 2.1%(c), RR 0.3; XR 279, 1.2%(c), RR 0.2; all 1118, 1.8%(c), RR 0.3
PsA (tofacitinib IR 14 000 PY; XR 6706 PY; all tofacitinib 20 706 PY):
- AEs: IR 8349, RR 59.6; XR 7602, RR 113.4; all 15 951, RR 77.0
- SAEs: IR 1136, 13.6%, RR 8.1; XR 912, 12.0%, RR 13.6; all 2048, 12.8%, RR 9.9
- Serious infections: IR 239, 2.9%, RR 1.7; XR 200, 2.6%, RR 3.0; all 439, 2.8%, RR 2.1
- HZ: IR 49, 0.6%, RR 0.4; XR 35, 0.5%, RR 0.5; all 84, 0.5%, RR 0.4
- CV events(a): IR 44, 0.5%, RR 0.3; XR 25, 0.3%, RR 0.4; all 69, 0.4%, RR 0.3
- Malignancy(b): IR 30, 0.4%, RR 0.2; XR 27, 0.4%, RR 0.4; all 57, 0.4%, RR 0.3
- VTE: IR 27, 0.3%, RR 0.2; XR 12, 0.2%, RR 0.2; all 39, 0.2%, RR 0.2
- Fatal cases: IR 22, 0.9%(c), RR 0.2; XR 19, 0.8%(c), RR 0.3; all 41, 0.8%(c), RR 0.2
All cases reported ≥1 AE and ≥0 SAE. (a) Includes Standardised MedDRA Queries: central nervous system vascular disorders, myocardial infarction and associated terms, ischaemic heart disease and associated terms; and Preferred Terms: cardiac death, cardiac failure congestive, sudden cardiac death and pulmonary embolism. (b) Excluding non-melanoma skin cancer. (c) Based on total case reports by formulation: RA, 39 744 IR/24 148 XR; PsA, 2601 IR/2425 XR. CV, cardiovascular; HZ, herpes zoster; MedDRA, Medical Dictionary for Regulatory Activities; VTE, venous thromboembolism.
Conclusion: Tofacitinib PMS safety data from submitted AE reports were consistent for RA and PsA and aligned with the established safety profile.
Reporting bias, reporter identity, regional differences in formulation use and exposure data (lower XR vs IR; estimation from commercial sales data) limit interpretation.
Acknowledgements: This study was sponsored by Pfizer. Medical writing support, under the direction of the authors, was provided by Julia King, PhD, CMC Connect, a division of IPG Health Medical Communications, and was funded by Pfizer, New York, NY, USA, in accordance with Good Publication Practice (GPP 2022) guidelines (Ann Intern Med 2022; 175: 1298-1304).
Disclosure of Interests: Gerd Rüdiger Burmester Speakers bureau: AbbVie, Amgen, Bristol Myers Squibb, Eli Lilly, Janssen, Galapagos, Novartis, Pfizer Inc and Sanofi, Consultant of: AbbVie, Amgen, Bristol Myers Squibb, Eli Lilly, Janssen, Galapagos, Novartis, Pfizer Inc and Sanofi, Laura Coates Speakers bureau: AbbVie, Amgen, Biogen, Celgene, Eli Lilly, Galapagos, Gilead Sciences, GSK, Janssen, Medac, Novartis, Pfizer Inc and UCB, Consultant of: AbbVie, Amgen, Bristol Myers Squibb, Celgene, Eli Lilly, Galapagos, Gilead Sciences, Janssen, MoonLake, Novartis, Pfizer Inc and UCB, Grant/research support from: AbbVie, Amgen, Celgene, Eli Lilly, Janssen, Novartis, Pfizer Inc and UCB, Stanley B.
Cohen Consultant of: AbbVie, Amgen, Boehringer Ingelheim, Gilead Sciences, Merck and Pfizer Inc, Yoshiya Tanaka Speakers bureau: AbbVie, AstraZeneca, Boehringer Ingelheim, Bristol Myers Squibb, Chugai, Daiichi Sankyo, Eisai, Eli Lilly, Gilead Sciences, GSK, Mitsubishi-Tanabe and Pfizer Inc, Grant/research support from: AbbVie, Asahi-Kasei, Boehringer Ingelheim, Chugai, Daiichi Sankyo, Eisai and Takeda, Ivana Vranic Shareholder of: Pfizer Inc, Employee of: Pfizer Inc, Edward Nagy Shareholder of: Pfizer Ltd, Employee of: Pfizer Ltd, All-shine Chen Shareholder of: Pfizer Inc, Employee of: Pfizer Inc, Irina Lazariciu Employee of: IQVIA, who were paid contractors to Pfizer Inc in the development of this abstract and in providing statistical support, Kenneth Kwok Shareholder of: Pfizer Inc, Employee of: Pfizer Inc, Lara Fallon Shareholder of: Pfizer Inc, Employee of: Pfizer Inc, Cassandra Kinch Shareholder of: Pfizer Inc, Employee of: Pfizer Inc.
APA, Harvard, Vancouver, ISO, and other styles
33

Tigua Moreira, Sonia, Edison Cruz Navarrete, and Geovanny Cordova Perez. "Big Data: paradigm in construction in the face of the challenges and challenges of the financial sector in the 21st century." Universidad Ciencia y Tecnología 25, no. 110 (August 26, 2021): 127–37. http://dx.doi.org/10.47460/uct.v25i110.485.

Full text
Abstract:
The world of finance is immersed in multiple controversies, laden with the contradictions and uncertainties typical of a social ecosystem, generating dynamic changes that lead to significant transformations, where the discussion of Big Data becomes crucial for real-time logical decision-making. This article sits in that field of knowledge; its general objective is to explore the strengths, weaknesses, and future trends of Big Data in the financial sector, using as an exploration methodology a scientific approach with the bibliographic tools Scopus and SciELO, with "Big Data" as the search equation, delimited to the financial sector. The findings showed the growing importance of gaining knowledge from the huge amount of financial data generated daily worldwide, developing predictive capacity towards creating scenarios inclined to find solutions and make timely decisions. Keywords: Big Data, financial sector, decision-making. References [1]D. Reinsel, J. Gantz y J. Rydning, «Data Age 2025: The Evolution of Data to Life-Critical,» IDC White Pape, 2017. [2]R. Barranco Fragoso, «Que es big data IBM Developer works,» 18 Junio 2012. [Online]. Available: https://developer.ibm.com/es/articles/que-es-big-data/. [3]IBM, «IBM What is big data? - Bringing big data to the enterprise,» 2014. [Online]. Available: http://www.ibm.com/big-data/us/en/. [4]IDC, «Resumen Ejecutivo -Big Data: Un mercado emergente.,» Junio 2012. [Online]. Available: https://www.diarioabierto.es/wp-content/uploads/2012/06/Resumen-Ejecutivo-IDC-Big-Data.pdf. [5]Factor humano Formación, «Factor humano formación escuela internacional de postgrado.,» 2014. [Online]. Available: http//factorhumanoformación.com/big-data-ii/. [6]J. Luna, «Las tecnologías Big Data,» 23 Mayo 2018. [Online].
Available: https://www.teldat.com/blog/es/procesado-de-big-data-base-de-datos-de-big-data-clusters-nosql-mapreduce/#:~:text=Tecnolog%C3%ADas%20de%20procesamiento%20Big%20Data&text=De%20este%20modo%20es%20posible,las%20necesidades%20de%20procesado%20disminuyan. [7]T.A.S Foundation, "Apache cassandra 2015", The apache cassandra project, 2015. [8]E. Dede, B. Sendir, P. Kuzlu, J. Hartog y M. Govindaraju, «"An Evaluation of Cassandra for Hadoop",» de 2013 IEEE Sixth International Conference on Cloud Computing, Santa Clara, CA, USA, 2013. [9]The Apache Software Foundation, «"Apache HBase",» 04 Agosto 2017. [Online]. Available: http://hbase.apache.org/. [10]G. Deka, «"A Survey of Cloud Database Systems",» IT Professional, vol. 16, nº 02, pp. 50-57, 2014. [11]P. Dueñas, «Introducción al sistema financiero y bancario,» Bogotá. Politécnico Grancolombiano, 2008. [12]V. Mesén Figueroa, «Contabilización de CONTRATOS de FUTUROS, OPCIONES, FORWARDS y SWAPS,» Tec Empresarial, vol. 4, nº 1, pp. 42-48, 2010. [13] A. Castillo, «Cripto educación es lo que se necesita para entender el mundo de la Cripto-Alfabetización,» Noticias Artech Digital , 04 Junio 2018. [Online].Available: https://www.artechdigital.net/cripto-educacion-cripto-alfabetizacion/. [14]Conceptodefinicion.de, «Definicion de Cienciometría,» 16 Diciembre 2020. [Online]. Available: https://conceptodefinicion.de/cienciometria/. [15]Elsevier, «Scopus The Largest database of peer-reviewed literature» https//www.elsevier.com/solutions/scopus., 2016. [16]J. Russell, «Obtención de indicadores bibliométricos a partir de la utilización de las herramientas tradicionales de información,» de Conferencia presentada en el Congreso Internacional de información-INFO 2004, La Habana, Cuba, 2004. [17]J. Durán, Industrialized and Ready for Digital Transformation?, Barcelona: IESE Business School, 2015. [18]P. Orellana, «Omnicanalidad,» 06 Julio 2020. [Online]. Available: https://economipedia.com/definiciones/omnicanalidad.html. [19]G. 
Electrics, «Innovation Barometer,» 2018. [20]D. Chicoma y F. Casafranca, Interviewees, Entrevista a Daniel Chicoma y Fernando Casafranca, docentes del PADE Internacional en Gerencia de Tecnologías de la Información en ESAN. [Entrevista]. 2018. [21]L.R. La república, «La importancia del mercadeo en la actualidad,» 21 Junio 2013. [Online]. Available: https://www.larepublica.co/opinion/analistas/la-importancia-del-mercadeo-en-la-actualidad-2041232#:~:text=El%20mercadeo%20es%20cada%20d%C3%ADa,en%20los%20mercados%20(clientes). [22]UNED, «Acumulación de datos y Big data: Las preguntas correctas,» 10 Noviembre 2017. [Online]. Available: https://www.masterbigdataonline.com/index.php/en-el-blog/150-el-big-data-y-las-preguntas-correctas. [23]J. García, Banca aburrida: el negocio bancario tras la crisis económica, Fundacion Funcas - economía y sociedad, 2015, pp. 101 - 150. [24]G. Cutipa, «Las 5 principales ventajas y desventajas de bases de datos relacionales y no relacionales: NoSQL vs SQL,» 20 Abril 2020. [Online]. Available: https://guidocutipa.blog.bo/principales-ventajas-desventajas-bases-de-datos-relacionales-no-relacionales-nosql-vs-sql/. [25]R. Martinez, «Jornadas Big Data ANALYTICS,»19 Septiembre 2019. [Online]. Available: https://www.cfp.upv.es/formacion-permanente/curso/jornada-big-data-analytics_67010.html. [26]J. Rifkin, The End of Work: The Decline of the Global Labor Force and the Dawn of the Post-Market Era, Putnam Publishing Group, 1995. [27]R. Conde del Pozo, «Los 5 desafíos a los que se enfrenta el Big Data,» 13 Agosto 2019. [Online]. Available: https://diarioti.com/los-5-desafios-a-los-que-se-enfrenta-el-big-data/110607.
APA, Harvard, Vancouver, ISO, and other styles
34

J, Jyothi. "Cassandra is a Better Option for Handling Big Data in a No-SQL Database." International Journal of Research Publication and Reviews, September 17, 2022, 880–83. http://dx.doi.org/10.55248/gengpi.2022.3.9.27.

Full text
Abstract:
With no single point of failure and the flexibility to handle enormous volumes of data across numerous commodity servers, Apache Cassandra is an open-source distributed database management system. With asynchronous masterless replication, Cassandra provides strong support for clusters spanning several datacenters and enables low-latency operations for all clients. Like skilled carpenters, data engineers are aware that certain tasks call for different tools; choosing the appropriate tools and being knowledgeable about their use can be the most crucial aspect of any job. A distributed database for managing massive amounts of structured data across many commodity servers, Apache Cassandra, a top-level Apache project born at Facebook and built on Google's Bigtable and Amazon's Dynamo, offers a highly available service with no single point of failure. In comparison to other NoSQL databases and relational databases, Cassandra offers features including continuous availability, linear scale performance, operational simplicity, and straightforward data distribution over multiple data centers and cloud availability zones. Cassandra's ability to scale, perform, and supply continuous uptime is due to its design: instead of a traditional master-slave architecture or a manual and labor-intensive sharded system, Cassandra uses an elegant, easy-to-set-up, and easy-to-maintain masterless "ring" design. With continuous availability, linear scale performance, operational simplicity, and simple data distribution over numerous data centers and cloud availability zones, Apache Cassandra is a massively scalable open-source non-relational database. Cassandra was initially created at Facebook, open-sourced in 2008, and elevated to the status of a top-level Apache project in 2010.
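The masterless "ring" design described above places data by hashing each key onto a ring of node tokens and walking clockwise to find the replicas, so any node can coordinate any request. The token scheme, node names, and replication factor below are illustrative, not Cassandra's actual implementation:

```python
import bisect
import hashlib

def token(name):
    # ring position: first 8 bytes of a SHA-256 digest as an integer
    return int.from_bytes(hashlib.sha256(name.encode()).digest()[:8], "big")

class Ring:
    def __init__(self, nodes, rf=3):
        self.rf = rf
        self.ring = sorted((token(n), n) for n in nodes)

    def replicas(self, key):
        # walk clockwise from the key's token; the next rf nodes hold copies
        positions = [t for t, _ in self.ring]
        start = bisect.bisect(positions, token(key)) % len(self.ring)
        return [self.ring[(start + i) % len(self.ring)][1] for i in range(self.rf)]

ring = Ring(["n1", "n2", "n3", "n4", "n5"], rf=3)
homes = ring.replicas("sensor-42")   # three distinct nodes, deterministically
```

Because placement is a pure function of the key and the token set, there is no master to consult and no single point of failure — the property the abstract emphasizes.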
APA, Harvard, Vancouver, ISO, and other styles
35

Singh, Sandeep Kumar, and Mamata Jenamani. "Cassandra-based data repository design for food supply chain traceability." VINE Journal of Information and Knowledge Management Systems ahead-of-print, ahead-of-print (March 9, 2020). http://dx.doi.org/10.1108/vjikms-08-2019-0119.

Full text
Abstract:
Purpose: The purpose of this paper is to design a supply chain database schema for Cassandra to store the real-time data generated by Radio Frequency IDentification (RFID) technology in a traceability system. Design/methodology/approach: The real-time data generated in such traceability systems are of high frequency and volume, making them difficult to handle with traditional relational database technologies. To overcome this difficulty, a NoSQL database repository based on Cassandra is proposed. The efficacy of the proposed schema is compared with two databases suitable for storing traceability data: document-based MongoDB and column-family-based Cassandra. Findings: The proposed Cassandra-based data repository outperforms the traditional Structured Query Language-based and MongoDB systems from the literature in terms of concurrent reading, and works on par with respect to writing and updating tracing queries. Originality/value: The proposed schema is able to store the real-time data generated in a supply chain with low latency. To test the performance of the Cassandra-based data repository, a test-bed is designed in the lab and supply chain operations of the Indian Public Distribution System are simulated to generate data.
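The wide-row layout such a schema typically relies on — partition key per tag, events clustered by time — can be emulated in memory to show why tracing queries are cheap. The column names (`epc`, `read_time`, `location`) are assumptions for the sketch, not the paper's schema:

```python
from bisect import insort
from collections import defaultdict

# Emulates a wide-row table: PRIMARY KEY ((epc), read_time) --
# one partition per tag, events kept sorted by time within it.
table = defaultdict(list)

def record_read(epc, read_time, location):
    insort(table[epc], (read_time, location))  # insert keeps the partition ordered

def trace(epc, start, end):
    # a time-range scan inside a single partition: the cheap query shape here
    return [(t, loc) for t, loc in table[epc] if start <= t <= end]

record_read("epc-001", 3, "warehouse")
record_read("epc-001", 1, "farm")
record_read("epc-001", 7, "retailer")
path = trace("epc-001", 1, 5)   # the item's movements between times 1 and 5
```

In Cassandra the same shape means one partition read with a clustering-key range, which is why concurrent tracing reads scale well in the paper's tests.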
APA, Harvard, Vancouver, ISO, and other styles
36

Khashan, Eman A., Ali I. El Desouky, and Sally M. Elghamrawy. "A Framework for Executing Complex Querying for Relational and NoSQL Databases (CQNS)." European Journal of Electrical Engineering and Computer Science 4, no. 5 (September 22, 2020). http://dx.doi.org/10.24018/ejece.2020.4.5.195.

Full text
Abstract:
The increase of data on the web poses major challenges. The amount of stored data and the ability to query data sources have become essential features of huge data systems. A large number of platforms are used to handle the NoSQL database model, such as Spark, H2O, and Hadoop HDFS/MapReduce, which are suitable for controlling and managing big data volumes. Interacting with mixed data models through different APIs and query languages imposes difficult tasks on developers of multi-datastore applications. In this paper, a Complex Querying framework for relational and NoSQL databases (CQNS) is proposed, which acts as an interpreter that sends complex queries received from any application to the corresponding executable engine. The proposed framework supports application queries and database transformation at the same time, which in turn speeds up the process. Moreover, CQNS handles many NoSQL databases, like MongoDB and Cassandra. This paper provides a Spark framework that can handle SQL and NoSQL databases. This work also examines the importance of MongoDB block sharding and composition. For the Cassandra database, two types of partitioning are considered: vertex and edge partitioning. Four benchmark scenario datasets are used to evaluate the proposed CQNS for querying the various NoSQL databases in terms of optimization performance and query execution time. The results show that, among the compared systems, CQNS achieves optimal latency and throughput in less time.
APA, Harvard, Vancouver, ISO, and other styles
37

"A Scalable and Fault Tolerant Health Risk Predictor using Bigdata Process Systems." International Journal of Innovative Technology and Exploring Engineering 8, no. 9S3 (August 23, 2019): 609–14. http://dx.doi.org/10.35940/ijitee.i3122.0789s319.

Full text
Abstract:
This paper presents a scalable and fault tolerant system for real-time analytics in different health care applications. Users can get their health condition analysis report from the system by sending their health records in real time. Occurrences of health conditions can be considered complex events, and the approach may be extended to different heterogeneous scenarios. Based on scalability and availability requirements, the system is developed using Kafka, Spark Streaming and Cassandra and implemented in Scala. The system is capable of both event stream processing and event batch processing. Users send health data to Kafka through their producer clients in real time, and Spark Streaming processes the data from Kafka over windows of different sizes, analyzing the health conditions. In another scenario, user requests are stored in a Cassandra database and processed asynchronously by Spark Streaming. The system is tested with the use case of heart attack hazard and stress prediction on different health datasets. Keywords: Healthcare, Bigdata, Spark Streaming, Kafka, Cassandra, Heart failure Prediction, Stress Index analysis
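The windowed analysis this abstract describes can be sketched very roughly in plain Python. This only illustrates fixed-window averaging over a stream of readings, not the paper's actual Kafka/Spark pipeline; the field names and the threshold are invented:

```python
from collections import defaultdict

# Illustrative stand-in for windowed stream analysis: readings arrive as
# (user, timestamp_seconds, heart_rate) tuples and are grouped into fixed
# time windows, mimicking a streaming window over a Kafka topic.
def window_risk(readings, window_s=10, hr_threshold=120):
    windows = defaultdict(list)
    for user, ts, hr in readings:
        # Integer division assigns each reading to its window bucket.
        windows[(user, ts // window_s)].append(hr)
    # Flag a window if its average heart rate crosses the threshold.
    return {
        key: sum(hrs) / len(hrs) > hr_threshold
        for key, hrs in windows.items()
    }

events = [("u1", 1, 130), ("u1", 4, 140), ("u1", 12, 80)]
print(window_risk(events))  # {('u1', 0): True, ('u1', 1): False}
```

In the paper's architecture this logic would run inside Spark Streaming, consuming from a Kafka topic and writing results to Cassandra.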
APA, Harvard, Vancouver, ISO, and other styles
38

Li, Jiaxin, Yiming Zhang, Shan Lu, Haryadi S. Gunawi, Xiaohui Gu, Feng Huang, and Dongsheng Li. "Performance Bug Analysis and Detection for Distributed Storage and Computing Systems." ACM Transactions on Storage, January 18, 2023. http://dx.doi.org/10.1145/3580281.

Full text
Abstract:
This paper systematically studies 99 distributed performance bugs from five widely-deployed distributed storage and computing systems (Cassandra, HBase, HDFS, Hadoop MapReduce and ZooKeeper). We present the TaxPerf database, which collectively organizes the analysis results as over 400 classification labels and over 2,500 lines of bug re-description. TaxPerf is classified into six bug categories (and 18 bug subcategories) by their root causes, namely, resource, blocking, synchronization, optimization, configuration, and logic. TaxPerf can be used as a benchmark for performance bug studies and debug tool designs. Although it is impractical to automatically detect all categories of performance bugs in TaxPerf, fortunately we find that an important category of blocking bugs can be effectively solved by analysis tools. We analyze the cascading nature of blocking bugs and design an automatic detection tool called PCatch, which (i) performs program analysis to identify code regions whose execution time can potentially increase dramatically with the workload size; (ii) adapts the traditional happens-before model to reason about software resource contention and performance dependency relationship; and (iii) uses dynamic tracking to identify whether the slowdown propagation is contained in one job. Evaluation shows that PCatch can accurately detect blocking bugs of representative distributed storage and computing systems by observing system executions under small-scale workloads.
APA, Harvard, Vancouver, ISO, and other styles
39

J, Jyothi. "MogoDB: A NoSQL Database with Amazing Advantages and Features." International Journal of Research Publication and Reviews, September 19, 2022, 1005–8. http://dx.doi.org/10.55248/gengpi.2022.3.9.30.

Full text
Abstract:
Non-relational database management systems, commonly referred to as NoSQL data stores, facilitate the management of data for internet applications, yet many do not provide built-in protection for stored data. Keywords that come to mind are data security, encryption methods, MongoDB, and NoSQL. Applying appropriate encryption methods to essential data fields provides data protection without compromising the database's performance in terms of speed or memory utilization. Databases are often used to manage document-oriented and unstructured data; such databases include Cassandra, MongoDB, CouchDB, Redis, Hypertable, and others. Because they are open source, they must provide strong security and safeguard users' private data both in use and at rest. There is a need for a single solution that can enhance data transmission security by offering an improved encryption technique that speeds up operations and consumes less memory when managing databases. Today's IT growth is driven by the development of ever more data-intensive applications. Due to stringent constraints on data structure, data relations, and other factors, relational databases do not allow us to work with huge data sets or maintain high-volume databases. Unstructured data from many industries, in a variety of forms, must be processed and stored in databases. NoSQL systems thus offer solutions to a number of problems associated with huge databases. Largely because of benefits like flexibility and horizontal scalability, NoSQL (Not Only SQL) is in demand. This essay explores several NoSQL database types, such as MongoDB, CouchDB, HBase, Cassandra, etc.
APA, Harvard, Vancouver, ISO, and other styles
40

J, Jyothi. "MogoDB: A NoSQL Database with Amazing Advantages and Features." International Journal of Research Publication and Reviews, October 20, 2022, 1468–71. http://dx.doi.org/10.55248/gengpi.2022.3.10.50.

Full text
Abstract:
Non-relational database management systems, commonly referred to as NoSQL data stores, facilitate the management of data for internet applications, yet many do not provide built-in protection for stored data. Keywords that come to mind are data security, encryption methods, MongoDB, and NoSQL. Applying appropriate encryption methods to essential data fields provides data protection without compromising the database's performance in terms of speed or memory utilization. Databases are often used to manage document-oriented and unstructured data; such databases include Cassandra, MongoDB, CouchDB, Redis, Hypertable, and others. Because they are open source, they must provide strong security and safeguard users' private data both in use and at rest. There is a need for a single solution that can enhance data transmission security by offering an improved encryption technique that speeds up operations and consumes less memory when managing databases. Today's IT growth is driven by the development of ever more data-intensive applications. Due to stringent constraints on data structure, data relations, and other factors, relational databases do not allow us to work with huge data sets or maintain high-volume databases. Unstructured data from many industries, in a variety of forms, must be processed and stored in databases. NoSQL systems thus offer solutions to a number of problems associated with huge databases. Largely because of benefits like flexibility and horizontal scalability, NoSQL (Not Only SQL) is in demand. This essay explores several NoSQL database types, such as MongoDB, CouchDB, HBase, Cassandra, etc.
APA, Harvard, Vancouver, ISO, and other styles
41

Kanchan, Shivangi, Parmeet Kaur, and Pranjal Apoorva. "Empirical Evaluation of NoSQL and Relational Database Systems." Recent Advances in Computer Science and Communications 13 (June 12, 2020). http://dx.doi.org/10.2174/2666255813999200612113208.

Full text
Abstract:
Aim: To evaluate the performance of relational and NoSQL databases in terms of execution time and memory consumption during operations involving structured data.
Objective: To outline the criteria that decision makers should consider while choosing the database most suited to an application.
Methods: Extensive experiments were performed on MySQL, MongoDB, Cassandra and Redis using data for an IMDB movies schema prorated into four datasets of 1000, 10000, 25000 and 50000 records. The experiments involved typical database operations of insertion, deletion, update and read of records, with and without indexing, as well as aggregation operations. The databases' performance has been evaluated by measuring the time taken for operations and computing memory usage.
Results:
* Redis provides the best performance for write, update and delete operations in terms of elapsed time and memory usage, whereas MongoDB gives the worst performance as the size of data increases, due to its locking mechanism.
* For read operations, Redis provides better performance in terms of latency than Cassandra and MongoDB; MySQL shows the worst performance due to its relational architecture. On the other hand, MongoDB shows the best performance among all databases in terms of efficient memory usage.
* Indexing improves the performance of any database only for covered queries.
* Redis and MongoDB give good performance for range-based queries and for fetching complete data in terms of elapsed time, whereas MySQL gives the worst performance.
* MySQL provides better performance for aggregate functions; NoSQL is not suitable for complex queries and aggregate functions.
Conclusion: The extensive empirical analysis shows that NoSQL outperforms SQL-based systems in terms of basic read and write operations. However, SQL-based systems are better if queries on the dataset mainly involve aggregation operations.
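The benchmarking method described (timing typical operations over datasets of increasing size) can be sketched with the stdlib sqlite3 module as a stand-in engine. The schema and workload below are illustrative only, not the paper's actual MySQL/MongoDB/Cassandra/Redis setup:

```python
import sqlite3
import time

# Minimal timing harness in the spirit of the evaluation above: wrap each
# database operation and report its elapsed wall-clock time.
def timed(label, fn):
    start = time.perf_counter()
    fn()
    print(f"{label}: {time.perf_counter() - start:.4f}s")

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE movies (id INTEGER PRIMARY KEY, title TEXT, year INT)")
rows = [(i, f"movie-{i}", 1900 + i % 120) for i in range(10000)]

timed("insert 10k", lambda: db.executemany(
    "INSERT INTO movies VALUES (?, ?, ?)", rows))
timed("read all", lambda: db.execute("SELECT * FROM movies").fetchall())

# Per the paper's observation, indexing mainly helps covered/selective
# queries, such as a range scan on the indexed column.
db.execute("CREATE INDEX idx_year ON movies (year)")
timed("range query", lambda: db.execute(
    "SELECT COUNT(*) FROM movies WHERE year > 2000").fetchall())
```

Repeating such a harness across engines and dataset sizes is essentially how elapsed-time comparisons like the ones above are produced.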
APA, Harvard, Vancouver, ISO, and other styles
42

Eshtay, Mohammed, Azzam Sleit, and Monther Aldwairi. "IMPLEMENTING BI-TEMPORAL PROPERTIES INTO VARIOUS NOSQL DATABASE CATEGORIES." International Journal of Computing, March 31, 2019, 45–52. http://dx.doi.org/10.47839/ijc.18.1.1272.

Full text
Abstract:
NoSQL database systems have emerged and developed at an accelerating rate in recent years. Attractive properties such as scalability and performance, which are needed by many applications today, have contributed to their increasing popularity. Time is a very important aspect of many applications, yet many NoSQL database systems do not offer built-in management of temporal properties. In this paper, we discuss how temporal properties can be embedded in NoSQL databases. We review and differentiate between the most popular NoSQL stores. Moreover, we propose various modifications to data models for embedding bitemporal properties in two of the most popular categories of NoSQL databases (key-value stores and column stores). In addition, we give examples of how to represent bitemporal properties using the Redis key-value store and the Cassandra column-oriented store. This work can be used as a basis for designing and implementing temporal operators and temporal data management in NoSQL databases.
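The idea of embedding bitemporal properties (valid time plus transaction time) in a key-value model can be sketched in plain Python. This is an assumption-laden illustration with invented field names and API, not the paper's actual Redis or Cassandra design:

```python
import time

# Sketch of bitemporal key-value storage: each logical key holds a list of
# versions, each carrying a valid-time interval (when the fact holds in
# reality) and a transaction time (when the database recorded it).
store = {}

def put(key, value, valid_from, valid_to):
    versions = store.setdefault(key, [])
    versions.append({
        "value": value,
        "valid_from": valid_from,  # valid time: start
        "valid_to": valid_to,      # valid time: end (exclusive)
        "tx_time": time.time(),    # transaction time: when it was written
    })

def get_as_of(key, valid_at):
    # Return the value whose valid-time interval covers valid_at,
    # preferring the most recently recorded version.
    for v in reversed(store.get(key, [])):
        if v["valid_from"] <= valid_at < v["valid_to"]:
            return v["value"]
    return None

put("salary:alice", 50000, valid_from=2020, valid_to=2022)
put("salary:alice", 60000, valid_from=2022, valid_to=9999)
print(get_as_of("salary:alice", 2021))  # 50000
print(get_as_of("salary:alice", 2023))  # 60000
```

In a real key-value store such as Redis, the version list would typically become a sorted set or a composite key encoding the interval bounds.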
APA, Harvard, Vancouver, ISO, and other styles
43

Dettki, Holger, Debora Arlt, Johan Bäckman, and Mathieu Blanchet. "Key-value pairs and NoSQL databases: a novel concept to manage biologging data in data repositories." Biodiversity Information Science and Standards 7 (August 21, 2023). http://dx.doi.org/10.3897/biss.7.111438.

Full text
Abstract:
Traditional data scheme concepts for biologging data previously relied on relational databases and fixed normalized tables. In practice, this means that a repository contains separate, fixed table structures for each type of sensor data. Prominent examples are the current Wireless Remote Animal Monitoring (WRAM) data schema and the now discontinued ZoaTrack approach. While the traditional approach worked fine as long as few sensors with fixed data types were used, rapid technological development continuously introduces new sensor types and more advanced sensor platforms. This means more data providers, new data types, and rapidly increasing amounts of data. Storage solutions using relational data models generate constant requirements for additional tables, changes to existing table structures, and, as a consequence, changes to the overall data scheme in the repository. Further, it becomes very difficult to adapt to emerging international standards, as any change in a particular data field in a single table may have wide-ranging consequences for the overall database structure. A concept better suited to deal with the growing number of sensors and sensor types is the Key-Value Pair (KVP) concept: a KVP is a data type that couples a key identifier with an associated value. The KVP concept has long been used in data exchange and transport (e.g., JavaScript Object Notation (JSON), XML). Today, very good database solutions exist (e.g., MongoDB, Apache Cassandra, Apache HBase) that use KVPs directly as the data store. Within a KVP, there are two related data elements: the first element, the key, is a constant used to identify the data type; the other element is a value, a variable representing the actual measurement of that data type.
In other words, instead of using two separate tables with different structures to store data from, e.g., an acceleration sensor and a GPS (Global Positioning System) sensor, we simply define key IDs representing the different data types of a GPS sensor and store their associated measurement values, for example longitude, latitude, date, and time. We can then store the key IDs for 'Acceleration' in the same table with their associated values without requiring any change to the overall data scheme (Fig. 1). The data is stored in a key-value store: a non-relational or NoSQL database specifically designed to handle key-value pairs. The obvious advantage is flexibility: a key-value store allows any new sensor type to be added to the repository without requiring any structural change. Furthermore, this concept allows for scalability, speed, and optimization of storage space. While the traditional concept required the input of 'null' for optional values, key-value stores simply skip such optional values, resulting in smaller storage requirements. Biologging datasets also differ from more classical 'biodiversity' datasets in size: a single standard 3-axis acceleration sensor measuring at 30 Hz (30 measurements per second) produces about one billion measurements per axis per year for a single individual. Thus, high scalability is a necessity when serving modern sensor systems that accumulate these vast amounts of data. Databases like MongoDB are easy to design as distributed systems. High performance comes from the flexible data structures, e.g., the possibility of storing large structures of data in a single document, which allows performance-critical queries to be made in a single request, but also from horizontal scalability, which allows load distribution across multiple hardware systems.
In 2021 the former CAnMove (Center for Animal Movement) initiative at Lund University, Sweden, which previously adapted the ZoaTrack application for Swedish needs, and the WRAM biotelemetry e-infrastructure at the Swedish University of Agricultural Sciences (SLU) joined forces within the Swedish Biodiversity Data Infrastructure (SBDI) to develop a new data model based on the KVP concept. We started by analyzing the data and sensor types used in the WRAM and CAnMove repositories and constructed KVPs that can cover all current data. We also added metadata descriptions for the projects, datasets and sensors used (Fig. 2). The concept is currently being tested in an implementation on MongoDB. While different tables are used to identify the project, dataset or sampling event in a one-to-many relationship (Fig. 2), the KVP table 'Record' contains the actual measurements. In order to identify which sensor can take which measurements, the KVP table 'Sensor' serves as a look-up table. Hence, to add new sensor types to the repository, only records in the KVP table 'Sensor' have to be added for the repository to handle and store their data. Data in a KVP model are easy to parse, and since we strictly use open standards when available, such as Darwin Core, it is relatively easy to publish and exchange data in other formats. As NoSQL databases are now mature products with many proven use cases, there is no reason to hesitate to build production systems for biologging repositories based on them. Further work will ensure coherence with the emerging standards for biologging data to enable seamless data sharing with other biologging repositories, such as Movebank, and data aggregation into the Global Biodiversity Information Facility (GBIF).
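The KVP storage idea described above can be sketched in a few lines of Python: heterogeneous sensor measurements share one record structure, so adding a new sensor type needs no schema change. Record and key names here are illustrative, not the actual WRAM/SBDI model:

```python
# One uniform record shape for all sensors: an event reference, a key
# identifying the data type, and the measured value. A new sensor type is
# just a new set of keys in the same store; no table change is needed.
records = [
    {"event": 1, "key": "gps.longitude", "value": 13.07},
    {"event": 1, "key": "gps.latitude", "value": 55.60},
    {"event": 2, "key": "acc.x", "value": 0.12},   # new sensor type:
    {"event": 2, "key": "acc.y", "value": -0.40},  # same store, new keys
]

def values_for(key_prefix):
    # Query by key instead of by table; optional values are simply absent,
    # so no NULL padding is required.
    return [r["value"] for r in records if r["key"].startswith(key_prefix)]

print(values_for("gps."))  # [13.07, 55.6]
print(values_for("acc."))  # [0.12, -0.4]
```

In a document store such as MongoDB, each record dictionary would simply be a document in one collection, with an index on the key field.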
APA, Harvard, Vancouver, ISO, and other styles
44

B, Shahida. "Using MongoDB to Understand the Underlying Methods Techniques Encryption in NoSQL database." International Journal of Research Publication and Reviews, September 17, 2022, 862–66. http://dx.doi.org/10.55248/gengpi.2022.3.9.23.

Full text
Abstract:
Appropriate encryption methods, adhered to by the parties concerned, are needed to provide a high level of security for secret data. In this study, an examination of different encryption algorithms and their performance in managing private data with authentication, access control, secure configuration, and data encryption is presented. It includes enhancing MongoDB's level-based access protection model and adding privacy keys for security and monitoring purposes. NoSQL data stores, i.e., non-relational database management systems that support data management for internet applications, do not currently offer such a service by default. Keywords that come to mind are data security, encryption techniques, MongoDB, and NoSQL. Applying proper encryption methods to crucial data fields provides data protection without adversely affecting the database's performance in terms of speed or memory usage. Since document-oriented and unstructured data are generally kept in databases, such stores are frequently used to handle various sorts of data; examples include MongoDB, Cassandra, CouchDB, Redis, Hypertable, and others. Since they are open source, there is a critical need to offer good security and protect users' private information in transit and at rest. There is a need for a single solution that can improve the security of data transfer by providing an improved encryption method that speeds up processing and uses less memory when maintaining databases.
APA, Harvard, Vancouver, ISO, and other styles
45

Rupali Chopade and Vinod Pachghare. "Data Tamper Detection from NoSQL Database in Forensic Environment." Journal of Cyber Security and Mobility, April 8, 2021. http://dx.doi.org/10.13052/jcsm2245-1439.1025.

Full text
Abstract:
The growth of the service sector is increasing the usage of digital applications worldwide. These digital applications make use of databases to store sensitive and secret information. As databases are distributed over the internet, cybercrime attackers may tamper with a database to attack such sensitive and confidential information. In such a scenario, maintaining the integrity of the database is a big challenge. Database tampering changes the database state through data manipulation operations such as insert, update or delete. Tamper detection techniques are useful for detecting such data tampering and play an important role in the database forensic investigation process. The use of NoSQL databases has been driven by big data requirements, yet previous research work has been limited to tamper detection in relational databases, and very little work has been found on NoSQL databases. There is therefore a need for a mechanism to detect tampering in NoSQL database systems. This article proposes an approach to tamper detection in NoSQL databases such as MongoDB and Cassandra, which are widely used document-oriented and column-based NoSQL databases, respectively. This research work proposes a tamper detection technique that works in a forensic environment to give more relevant outcomes on data tampering and to distinguish between suspicious and genuine tampering.
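The abstract does not spell out its detection mechanism; one common approach to database tamper detection, hash chaining, can be sketched as follows (all names and data are illustrative, not the authors' method):

```python
import hashlib
import json

# Hash chaining: each document's hash folds in the previous hash, so any
# later modification to a document breaks every subsequent link in the
# chain, revealing where tampering started.
def chain(docs):
    h, hashes = "genesis", []
    for doc in docs:
        h = hashlib.sha256(
            (h + json.dumps(doc, sort_keys=True)).encode()).hexdigest()
        hashes.append(h)
    return hashes

docs = [{"id": 1, "amt": 100}, {"id": 2, "amt": 250}]
baseline = chain(docs)         # stored securely at audit time

docs[0]["amt"] = 900           # simulate tampering
tampered = chain(docs)
first_bad = next(i for i, (a, b) in enumerate(zip(baseline, tampered))
                 if a != b)
print(f"tampering detected from document {first_bad}")  # document 0
```

A forensic tool would compare a freshly computed chain against hashes archived at a trusted point in time; the first mismatch localizes the earliest tampered record.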
APA, Harvard, Vancouver, ISO, and other styles