Academic literature on the topic 'Approximate record matching'

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Approximate record matching.'

Journal articles on the topic "Approximate record matching"

1

Verykios, Vassilios S., Ahmed K. Elmagarmid, and Elias N. Houstis. "Automating the approximate record-matching process." Information Sciences 126, no. 1-4 (2000): 83–98. http://dx.doi.org/10.1016/s0020-0255(00)00013-x.

2

Seleznjev, Oleg, and Bernhard Thalheim. "Random Databases with Approximate Record Matching." Methodology and Computing in Applied Probability 12, no. 1 (2008): 63–89. http://dx.doi.org/10.1007/s11009-008-9092-4.

3

Rozinek, Ondřej, Jaroslav Marek, Jan Panuš, and Jan Mareš. "Real-Time Fuzzy Record-Matching Similarity Metric and Optimal Q-Gram Filter." Algorithms 18, no. 3 (2025): 150. https://doi.org/10.3390/a18030150.

Abstract:
In this paper, we introduce an advanced Fuzzy Record Similarity Metric (FRMS) that improves approximate record matching and models human perception of record similarity. The FRMS utilizes a newly developed similarity space with favorable properties combined with a metric space, employing a bag-of-words model with general applications in text mining and cluster analysis. To optimize the FRMS, we propose a two-stage method for approximate string matching and search that outperforms baseline methods in terms of average time complexity and F measure on various datasets. In the first stage, we cons
4

Essex, Aleksander. "Secure Approximate String Matching for Privacy-Preserving Record Linkage." IEEE Transactions on Information Forensics and Security 14, no. 10 (2019): 2623–32. http://dx.doi.org/10.1109/tifs.2019.2903651.

5

J, Ujwala Rekha, and Shahu Chatrapati K. "Probabilistic multiple correlation based term weighting scheme for measuring similarity of unstructured text records." Indian Journal of Science and Technology 13, no. 11 (2020): 1276–82. https://doi.org/10.17485/IJST/v13i11.2020-31.

Abstract:
Background/Objectives: In this study, a term weighting scheme derived from probabilistic multiple correlation is defined for measuring similarity between unstructured text records. Methods: While the intra-correlation is the correlation of terms in the same record, inter-correlation is the correlation of terms that exist in different records. Probabilistic multiple correlation-based term weighting calculates the weight or relevance of a term by considering its intra-correlation with one or more terms simultaneously. Subsequently, the te
6

Vasylenko, Oleh. "ANALYSIS OF KEY METADATA FOR IDENTIFYING DUPLICATES IN BIBLIOGRAPHIC RECORDS." Cybersecurity: Education, Science, Technique 3, no. 27 (2025): 87–99. https://doi.org/10.28925/2663-4023.2025.27.700.

Abstract:
This study addresses the issue of duplicate bibliographic records in library information systems, a problem that is becoming increasingly relevant with the growth of digital catalogs. It specifically examines the key metadata fields used for comparing records and identifying duplicate entries. The analysis includes critical metadata fields such as title, ISBN, publisher, place of publication, publication date, pagination, series, and additional attributes used for identifying editions. Special attention is given to the variability of data within these fields, including issues arising from misp
7

Hanrath, Scott, and Erik Radio. "User search terms and controlled subject vocabularies in an institutional repository." Library Hi Tech 35, no. 3 (2017): 360–67. http://dx.doi.org/10.1108/lht-11-2016-0133.

Abstract:
Purpose: The purpose of this paper is to investigate the search behavior of institutional repository (IR) users in regard to subjects as a means of estimating the potential impact of applying a controlled subject vocabulary to an IR. Design/methodology/approach: Google Analytics data were used to record cases where users arrived at an IR item page from an external web search and subsequently downloaded content. Search queries were compared against the Faceted Application of Subject Terminology (FAST) schema to determine the topical nature of the queries. Queries were also compared against the it
8

Williams, Richard, David Jenkins, Thomas Bolton, et al. "Replicating a COVID-19 study in a national England database to assess the generalisability of research with regional electronic health record data." BMJ Open 15, no. 4 (2025): e093080. https://doi.org/10.1136/bmjopen-2024-093080.

Abstract:
Objectives: To assess the degree to which we can replicate a study between a regional and a national database of electronic health record data in the UK. The original study examined the risk factors associated with hospitalisation following COVID-19 infection in people with diabetes. Design: A replication of a retrospective cohort study. Setting: Observational electronic health record data from primary and secondary care sources in the UK. The original study used data from a large, urbanised region (Greater Manchester Care Record, Greater Manchester, UK; 2.8 m patients). This replication study used a n
9

Bianchi Santiago, Josie D., Héctor Colón Jordán, and Didier Valdés. "Record Linkage of Crashes with Injuries and Medical Cost in Puerto Rico." Transportation Research Record: Journal of the Transportation Research Board 2674, no. 10 (2020): 739–48. http://dx.doi.org/10.1177/0361198120935439.

Abstract:
Cost considerations are critical in the analysis and prevention of traffic crashes. Integration of cost data into crash datasets facilitates the crash-cost analyses with all their related attributes. It is, however, a challenging task because of the lack of availability of unique identifiers across the databases and because of privacy and confidentiality regulations. This study performed a record linkage comparison between the deterministic and probabilistic approaches using attributes matching techniques with numerical distance and weight patterns under the Fellegi–Sunter approach. As a resul
10

Douglas, M. M., D. Gardner, D. Hucker, and S. W. Kendrick. "Best-Link Matching of Scottish Health Data Sets." Methods of Information in Medicine 37, no. 01 (1998): 64–68. http://dx.doi.org/10.1055/s-0038-1634494.

Abstract:
The methods used to link the Community Health Index and the National Health Service Central Register (NHSCR) in Scotland, to provide a basis for a national patient index, are described. The linkage used a combination of deterministic and probability matching techniques. A best-link principle was used by which each Community Health Index record was allowed to link only to the NHSCR record with which it achieved the highest match weight. This strategy, applied in the context of two files which each covered virtually the entire population of Scotland, increased the accuracy of linkage approxim
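The best-link principle mentioned in this abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `best_links`, the `(index_id, register_id, weight)` tuple layout, the optional `min_weight` cut-off, and the sample identifiers and weights are all assumptions made for illustration.

```python
def best_links(candidates, min_weight=float("-inf")):
    """Keep, for each index record, only its highest-weight candidate link.

    candidates: iterable of (index_id, register_id, weight) tuples, where
    weight is a match weight produced by some earlier comparison step.
    min_weight: hypothetical cut-off below which a candidate is ignored.
    Returns a dict mapping index_id -> (register_id, weight).
    """
    best = {}
    for index_id, register_id, weight in candidates:
        if weight < min_weight:
            continue  # too weak to be considered a link at all
        # Replace the stored link only if this candidate scores strictly higher.
        if index_id not in best or weight > best[index_id][1]:
            best[index_id] = (register_id, weight)
    return best


# Hypothetical candidate pairs: record "c1" has two possible partners.
sample = [("c1", "n1", 12.5), ("c1", "n2", 18.0), ("c2", "n2", 7.0)]
links = best_links(sample)
```

In this sketch the selection is one-directional: each index record keeps a single partner, but two index records may still point at the same register record. Ties keep the first candidate seen.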

Dissertations / Theses on the topic "Approximate record matching"

1

Jupin, Joseph. "Temporal Graph Record Linkage and k-Safe Approximate Match." Diss., Temple University Libraries, 2016. http://cdm16002.contentdm.oclc.org/cdm/ref/collection/p245801coll10/id/412419.

Abstract:
Computer and Information Science, Ph.D. Since the advent of electronic data processing, organizations have accrued vast amounts of data contained in multiple databases with no reliable global unique identifier. These databases were developed by different departments for different purposes at different times. Organizing and analyzing these data for human services requires linking records from all sources. RL (Record Linkage) is a process that connects records that are related to the identical or a sufficiently similar entity from multiple heterogeneous databases. RL is a data and compute i
2

Tam, Siu-lung. "Linear-size indexes for approximate pattern matching and dictionary matching." Click to view the E-thesis via HKUTO, 2010. http://sunzi.lib.hku.hk/hkuto/record/B44205326.

3

Тодоріко, Ольга Олексіївна. "Моделі та методи очищення та інтеграції текстових даних в інформаційних системах" [Models and methods for cleaning and integrating text data in information systems]. Thesis, Запорізький національний університет, 2016. http://repository.kpi.kharkov.ua/handle/KhPI-Press/21856.

Abstract:
Dissertation for the degree of Candidate of Technical Sciences in specialty 05.13.06, Information Technologies. National Technical University "Kharkiv Polytechnic Institute", Kharkiv, 2016. The dissertation solves the pressing scientific and practical problem of improving the efficiency and quality of the technology for cleaning and integrating text data in reference and search information systems by using models of the inflectional paradigm and a method of building a lexeme index when organizing similarity search. Models of the inflectional paradigm have been developed that include the represent
4

Тодоріко, Ольга Олексіївна. "Моделі та методи очищення та інтеграції текстових даних в інформаційних системах" [Models and methods for cleaning and integrating text data in information systems]. Thesis, НТУ "ХПІ", 2016. http://repository.kpi.kharkov.ua/handle/KhPI-Press/21853.

Abstract:
Dissertation for the degree of Candidate of Technical Sciences in specialty 05.13.06, Information Technologies. National Technical University "Kharkiv Polytechnic Institute", Kharkiv, 2016. The dissertation solves the pressing scientific and practical problem of improving the efficiency and quality of the technology for cleaning and integrating text data in reference and search information systems by using models of the inflectional paradigm and a method of building a lexeme index when organizing similarity search. Models of the inflectional paradigm have been developed that include the represent
5

Vatsalan, Dinusha. "Scalable and approximate privacy-preserving record linkage." Phd thesis, 2014. http://hdl.handle.net/1885/12370.

Abstract:
Record linkage, the task of linking multiple databases with the aim to identify records that refer to the same entity, is occurring increasingly in many application areas. Generally, unique entity identifiers are not available in all the databases to be linked. Therefore, record linkage requires the use of personal identifying attributes, such as names and addresses, to identify matching records that need to be reconciled to the same entity. Often, it is not permissible to exchange personal identifying data across different organizations due to privacy and confidentiality concerns or regulatio
6

Dobiášovský, Jan. "Přibližná shoda znakových řetězců a její aplikace na ztotožňování metadat vědeckých publikací." Master's thesis, 2020. http://www.nusl.cz/ntk/nusl-415121.

Abstract:
The thesis explores the application of approximate string matching in the record linkage process for scientific publications. It provides an introduction to record matching along with five commonly used string distance metrics (Levenshtein, Jaro, Jaro-Winkler, and Cosine distances, and the Jaccard coefficient). These metrics are applied to publication metadata from V3S, the current research information system of the Czech Technical University in Prague. Based on the findings, optimal thresholds in the F1, F2 and F3 measures are determined for each metric.
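For readers unfamiliar with the metrics named in this abstract, the following is a minimal Python sketch of two of them (Levenshtein distance and the Jaccard coefficient) and of the F-beta measure against which thresholds are tuned. It is not the thesis's implementation: the function names, the choice of character bigrams as the Jaccard tokens, and the handling of very short strings are assumptions made for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insert, delete, substitute) using two DP rows."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def jaccard(a: str, b: str, q: int = 2) -> float:
    """Jaccard coefficient over sets of character q-grams (assumed tokens)."""
    grams_a = {a[i:i + q] for i in range(len(a) - q + 1)}
    grams_b = {b[i:i + q] for i in range(len(b) - q + 1)}
    if not grams_a and not grams_b:
        # Both strings are shorter than q: fall back to exact comparison.
        return 1.0 if a == b else 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


def f_beta(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-beta measure; beta > 1 weights recall more heavily than precision."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)
```

A threshold search in this spirit would score candidate pairs with one metric, compute precision and recall at each candidate threshold against labeled pairs, and keep the threshold with the highest `f_beta` value for the chosen beta.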

Book chapters on the topic "Approximate record matching"

1

Dong, Boxiang, and Hui Wendy Wang. "Efficient Authentication of Approximate Record Matching for Outsourced Databases." In Advances in Intelligent Systems and Computing. Springer International Publishing, 2019. http://dx.doi.org/10.1007/978-3-319-98056-0_6.

2

Grannis, Shaun J., J. Marc Overhage, and Clement McDonald. "Real World Performance of Approximate String Comparators for use in Patient Matching." In Studies in Health Technology and Informatics. IOS Press, 2004. https://doi.org/10.3233/978-1-60750-949-3-43.

Abstract:
Medical record linkage is becoming increasingly important as clinical data is distributed across independent sources. To improve linkage accuracy we studied different name comparison methods that establish agreement or disagreement between corresponding names. In addition to exact raw name matching and exact phonetic name matching, we tested three approximate string comparators. The approximate comparators included the modified Jaro-Winkler method, the longest common substring, and the Levenshtein edit distance. We also calculated the combined root-mean square of all three. We tested each name
3

Margaritis, Dimitris, Christos Faloutsos, and Sebastian Thrun. "NetCube." In Database Technologies. IGI Global, 2009. http://dx.doi.org/10.4018/978-1-60566-058-5.ch120.

Abstract:
We present a novel method for answering count queries from a large database approximately and quickly. Our method implements an approximate DataCube of the application domain, which can be used to answer any conjunctive count query that can be formed by the user. The DataCube is a conceptual device that in principle stores the number of matching records for all possible such queries. However, because its size and generation time are inherently exponential, our approach uses one or more Bayesian networks to implement it approximately. Bayesian networks are statistical graphical models that can
4

Margaritis, Dimitris, Christos Faloutsos, and Sebastian Thrun. "NetCube." In Bayesian Network Technologies. IGI Global, 2007. http://dx.doi.org/10.4018/978-1-59904-141-4.ch004.

Abstract:
We present a novel method for answering count queries from a large database approximately and quickly. Our method implements an approximate DataCube of the application domain, which can be used to answer any conjunctive count query that can be formed by the user. The DataCube is a conceptual device that in principle stores the number of matching records for all possible such queries. However, because its size and generation time are inherently exponential, our approach uses one or more Bayesian networks to implement it approximately. Bayesian networks are statistical graphical models that can

Conference papers on the topic "Approximate record matching"

1

Gollapalli, Mohammed, Xue Li, Ian Wood, and Guido Governatori. "Approximate Record Matching Using Hash Grams." In 2011 IEEE International Conference on Data Mining Workshops (ICDMW). IEEE, 2011. http://dx.doi.org/10.1109/icdmw.2011.33.

2

Dong, Boxiang, and Wendy Wang. "ARM: Authenticated Approximate Record Matching for Outsourced Databases." In 2016 IEEE 17th International Conference on Information Reuse and Integration (IRI). IEEE, 2016. http://dx.doi.org/10.1109/iri.2016.86.

3

Gonçalves, Marcos A. "Session details: Record linkage and approximate matching (DB)." In CIKM07: Conference on Information and Knowledge Management. ACM, 2007. http://dx.doi.org/10.1145/3250795.

4

Schraagen, Marijn. "Complete Coverage for Approximate String Matching in Record Linkage Using Bit Vectors." In 2011 IEEE 23rd International Conference on Tools with Artificial Intelligence (ICTAI). IEEE, 2011. http://dx.doi.org/10.1109/ictai.2011.116.

5

Jia, Dan, Yong-Yi Wang, and Steve Rapp. "Material Properties and Flaw Characteristics of Vintage Girth Welds." In 2020 13th International Pipeline Conference. American Society of Mechanical Engineers, 2020. http://dx.doi.org/10.1115/ipc2020-9658.

Abstract:
Vintage pipelines, which in the context of this paper refer to pipelines built before approximately 1970, account for a large portion of the energy pipeline systems in North America. Integrity assessment of these pipelines can sometimes present challenges due to incomplete records and lack of material property data. When material properties for the welds of interest are not available, conservative estimates based on past experience are typically used for the unknown material property values. Such estimates can be overly conservative, potentially leading to unnecessary remedial actions
6

Hitz, Arne, Anja Konzept, Benedikt Reick, and Klaus Rheinberger. "Efficient GPS Route Matching Method for Battery Electric Bus Fleets." In Conference on Sustainable Mobility. SAE International, 2024. http://dx.doi.org/10.4271/2024-24-0026.

Abstract:
A challenge of public transportation GPS data is the frequent utilization of monitoring systems with low sampling rates, primarily driven by the high costs associated with cellular data transmission of large datasets. Altitude data is often imprecise or not recorded at all in regions without large elevation changes. The low data quality limits the use of the data for further detailed investigations like a realistic energy consumption forecast for assessing the electrical grid load resulting from charging the vehicle flee
7

Ramakrishnan, Kishore Ranganath, Shoaib Ahmed, Benjamin Wahls, et al. "Gas Turbine Combustor Liner Wall Heat Load Characterization for Different Gaseous Fuels." In ASME 2019 International Mechanical Engineering Congress and Exposition. American Society of Mechanical Engineers, 2019. http://dx.doi.org/10.1115/imece2019-11283.

Abstract:
The knowledge of detailed distribution of heat load on swirl stabilized combustor liner wall is imperative in the development of liner-specific cooling arrangements, aimed towards maintaining uniform liner wall temperatures for reduced thermal stress levels. Heat transfer and fluid flow experiments have been conducted on a swirl stabilized lean premixed combustor to understand the behavior of Methane-, Propane-, and Butane-based flames. These fuels were compared at different equivalence ratios for a matching adiabatic flame temperature of Methane at 0.65 equivalence ratio. Above exper
8

Cummings, Scott M. "Prediction of Rolling Contact Fatigue Using Instrumented Wheelsets." In ASME 2008 Rail Transportation Division Fall Technical Conference. ASMEDC, 2008. http://dx.doi.org/10.1115/rtdf2008-74013.

Abstract:
The measured wheel/rail forces from four wheels in the leading truck of a coal hopper car during one revenue service roundtrip were used by the Wheel Defect Prevention Research Consortium (WDPRC) to predict rolling contact fatigue (RCF) damage. The data was recorded in March 2005 by TTCI for an unrelated Strategic Research Initiatives project funded by the Association of American Railroads (AAR). RCF damage was predicted in only a small portion of the approximately 4,000 km (2,500 miles) for which data was analyzed. The locations where RCF damage was predicted to occur were examined careful

Reports on the topic "Approximate record matching"

1

Day, Christopher M., Howell Li, Sarah M. L. Hubbard, and Darcy M. Bullock. Observations of Trip Generation, Route Choice, and Trip Chaining with Private-Sector Probe Vehicle GPS Data. Purdue University, 2022. http://dx.doi.org/10.5703/1288284317368.

Abstract:
This paper presents an exploratory study of GPS data from a private-sector data provider for analysis of trip generation, route choice, and trip chaining. The study focuses on travel to and from the Indianapolis International Airport. The GPS data consisted of nearly 1 billion waypoints for 12 million trips collected over a 6-week period in the state of Indiana. Within this data, there were approximately 10,000 trip records indicating travel to facilities associated with the Indianapolis airport. The analysis is based on the matching of waypoints to geographic areas that define the extents of roadwa