Journal articles on the topic 'Document image collections'

Consult the top 50 journal articles for your research on the topic 'Document image collections.'

1

Han, Yuehui, Weilan Wang, Huaming Liu, and Yiqun Wang. "A Combined Approach for the Binarization of Historical Tibetan Document Images." International Journal of Pattern Recognition and Artificial Intelligence 33, no. 14 (May 9, 2019): 1954038. http://dx.doi.org/10.1142/s0218001419540387.

Abstract:
Historical Tibetan documents held in historical collections are often poorly preserved and prone to degradation. This causes many problems, the most common of which is staining, that image binarization can address. The lack of uniform standard datasets makes it difficult to evaluate binarization results. Motivated by the poor results and the difficulty of evaluating the binarization of historical Tibetan document images, a combined approach is proposed that aims to improve overall performance. The method has the following parts. First, images are generated through standard binarization and background extraction from color images; both are used for image synthesis in preparation for evaluation. Then, preliminary binarization is performed through channel combination in the Lab color space and through local binarization. The synthetic images are used to select the coefficient for combining the channels. Next, Local Binary Pattern (hereafter LBP) processing and image smoothing are carried out after the channel combination to obtain the outline of the text area. Finally, the final binarized image is obtained by combining the preliminary binarization image and the text area contour image. In tests on a large number of synthetic images with a variety of background types, the method outperformed the other methods compared.
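The abstract does not say which local binarization rule the authors use, so the following is only a minimal sketch of the general idea, assuming a simple local-mean threshold. The function name `local_mean_binarize` and the `radius` and `offset` parameters are hypothetical choices for illustration, not taken from the paper.

```python
def local_mean_binarize(gray, radius=1, offset=10):
    """Return a 0/1 grid where 1 marks pixels darker than their local mean.

    `gray` is a list of rows of grayscale values (0 = black, 255 = white).
    This is a generic local-mean rule, not the method from the paper.
    """
    height, width = len(gray), len(gray[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Clip the neighborhood window at the image border.
            y0, y1 = max(0, y - radius), min(height, y + radius + 1)
            x0, x1 = max(0, x - radius), min(width, x + radius + 1)
            window = [gray[j][i] for j in range(y0, y1) for i in range(x0, x1)]
            mean = sum(window) / len(window)
            # A pixel counts as ink only if it is clearly darker than its surroundings.
            out[y][x] = 1 if gray[y][x] < mean - offset else 0
    return out
```

Because the threshold follows the local background, a uniformly stained region produces no ink pixels, whereas a single global threshold could mark the whole stain as text.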
2

Zhang, Xiafen, and Vijayan Sugumaran. "Content Based Search Engine for Historical Calligraphy Images." International Journal of Intelligent Information Technologies 10, no. 3 (July 2014): 1–18. http://dx.doi.org/10.4018/ijiit.2014070101.

Abstract:
Paper collections of historical calligraphy objects in libraries and museums are scanned into document images to serve the academic community. However, these digitized collections are in image format, and there is no technology for searching them by image content. This paper proposes a search engine for searching calligraphy image content. First, 2503 page images are segmented into characters and components. Second, characters are interactively labeled and features are extracted to build a calligraphy database. When an image search query is submitted, coarse features are first extracted and used to prune the long list of calligraphy characters into a shorter list. Then fine shape features are employed to determine the most similar characters. iDistance and NB-Tree are used to create the high-dimensional index. The efficiency of the algorithm has been demonstrated through experiments with 110,737 individual calligraphic character images. This research demonstrates the potential of calligraphy content search on the web.
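The coarse-then-fine retrieval described above can be sketched roughly as follows. This is an illustration under stated assumptions: each database entry is a dictionary with a scalar `coarse` feature and a `fine` feature vector, and the names `coarse_to_fine_search`, `coarse_tolerance` and `top_k` are hypothetical. The paper's actual features and its iDistance/NB-Tree index are not reproduced here; a linear scan stands in for the index.

```python
import math


def coarse_to_fine_search(query, database, coarse_tolerance=1, top_k=3):
    """Prune candidates with a cheap feature, then rank them with a finer one."""
    # Stage 1: keep only entries whose coarse feature is close to the query's.
    candidates = [
        entry for entry in database
        if abs(entry["coarse"] - query["coarse"]) <= coarse_tolerance
    ]
    # Stage 2: rank the survivors by Euclidean distance between fine features.
    ranked = sorted(
        candidates,
        key=lambda entry: math.dist(entry["fine"], query["fine"]),
    )
    return [entry["label"] for entry in ranked[:top_k]]
```

The point of the two stages is cost: the expensive distance is computed only for the short list that survives the cheap filter.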
3

Paulus, Erick, Mira Suryani, Setiawan Hadi, and Akik Hidayat. "INVESTIGASI SEGMENTASI BARIS UNTUK CITRA DOKUMEN SUNDA LAMPAU [Investigation of Line Segmentation for Old Sundanese Document Images]." JIKO (Jurnal Informatika dan Komputer) 2, no. 2 (October 12, 2017): 60. http://dx.doi.org/10.26798/jiko.2017.v2i2.65.

Abstract:
The variable image quality of old Sundanese documents poses a real challenge for text line segmentation. This paper reports an investigation of two text line segmentation methods, the projection profile method and the Seam Carving method, on several collections of Sundanese document images. The investigation covers handwritten documents written on lontar and paper media, and uses a comparative experimental study as its methodology. Both methods were tested on color images and binary images using the evaluation metrics provided in the ICDAR 2013 handwriting segmentation competition. Experimental results show that the projection profile method works best on binary images in which the writing is relatively horizontal, while the Seam Carving method is able to segment lines in a non-linear manner and achieves performance above 80%. With binarization added in the pre-processing stage, the performance of the Seam Carving method can increase up to 99%, and the number of segmented lines is close to the number of ground-truth lines.
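For readers unfamiliar with the projection profile method, a minimal sketch of the textbook version follows. It assumes a binary image given as rows of 0/1 values with 1 meaning ink; the function name `segment_lines` and the `min_ink` parameter are hypothetical, and this is not the authors' implementation.

```python
def segment_lines(binary, min_ink=1):
    """Return (first_row, last_row) pairs for each run of rows containing ink."""
    # Horizontal projection profile: amount of ink in every pixel row.
    profile = [sum(row) for row in binary]
    lines, start = [], None
    for y, ink in enumerate(profile):
        if ink >= min_ink and start is None:
            start = y                      # a text line begins
        elif ink < min_ink and start is not None:
            lines.append((start, y - 1))   # a blank row ends the line
            start = None
    if start is not None:                  # the image ends inside a text line
        lines.append((start, len(profile) - 1))
    return lines
```

The sketch also shows why the method prefers horizontal writing: a slanted or curved line never leaves a fully blank row between itself and its neighbor, so adjacent lines merge into one run.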
4

Alturki, Hend Mohammed. "Museum Collections and the Importance of Studying History." Advances in Social Sciences Research Journal 7, no. 8 (August 12, 2020): 118–27. http://dx.doi.org/10.14738/assrj.78.8771.

Abstract:
Generally speaking, museums reflect peoples' civilizations, enhance patriotism, and play a vital role in serving culture and heritage. In addition, they monitor and document the history of nations and the traditions of peoples through their cultural programs. They also provide an image of the cultural momentum and indicate how far society is aware of the importance of historical documentation. From this perspective, museum visitors have the opportunity to become acquainted with the cultures and civilizations of nations through their antiquity collections.
5

Giral, Angela. "Digital image libraries and the teaching of art and architectural history." Art Libraries Journal 23, no. 4 (1998): 18–25. http://dx.doi.org/10.1017/s0307472200011251.

Abstract:
Can museums and libraries profit from sharing their information, visual or textual? Is direct access to digital archives a more logical or economic way to develop access to images for teaching and research than assembling local collections? Recent digital image library projects in the United States, and their impact on the teaching practices of art and architectural historians, show the advantages of focusing on issues such as licensing and intellectual property, metadata and evolving cataloging practice, image quality, and the different costs of creation and delivery. But there are other potential benefits such as document delivery and the dissemination of archival information, as well as the preservation of fragile illustrated texts through digital imaging.
6

Lafia, Sara, David A. Bleckley, and J. Trent Alexander. "Digitizing and parsing semi-structured historical administrative documents from the G.I. Bill mortgage guarantee program." Journal of Documentation 79, no. 7 (July 31, 2023): 225–39. http://dx.doi.org/10.1108/jd-03-2023-0055.

Abstract:
Purpose: Many libraries and archives maintain collections of research documents, such as administrative records, with paper-based formats that limit the documents' access to in-person use. Digitization transforms paper-based collections into more accessible and analyzable formats. As collections are digitized, there is an opportunity to incorporate deep learning techniques, such as Document Image Analysis (DIA), into workflows to increase the usability of information extracted from archival documents. This paper describes the authors' approach using digital scanning, optical character recognition (OCR) and deep learning to create a digital archive of administrative records related to the mortgage guarantee program of the Servicemen's Readjustment Act of 1944, also known as the G.I. Bill.
Design/methodology/approach: The authors used a collection of 25,744 semi-structured paper-based records from the administration of G.I. Bill Mortgages from 1946 to 1954 to develop a digitization and processing workflow. These records include the name and city of the mortgagor, the amount of the mortgage, the location of the Reconstruction Finance Corporation agent, one or more identification numbers and the name and location of the bank handling the loan. The authors extracted structured information from these scanned historical records in order to create a tabular data file and link them to other authoritative individual-level data sources.
Findings: The authors compared the flexible character accuracy of five OCR methods. The authors then compared the character error rate (CER) of three text extraction approaches (regular expressions, DIA and named entity recognition (NER)). The authors were able to obtain the highest quality structured text output using DIA with the Layout Parser toolkit by post-processing with regular expressions. Through this project, the authors demonstrate how DIA can improve the digitization of administrative records to automatically produce a structured data resource for researchers and the public.
Originality/value: The authors' workflow is readily transferable to other archival digitization projects. Through the use of digital scanning, OCR and DIA processes, the authors created the first digital microdata file of administrative records related to the G.I. Bill mortgage guarantee program available to researchers and the general public. These records offer research insights into the lives of veterans who benefited from loans, the impacts on the communities built by the loans and the institutions that implemented them.
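The character error rate mentioned in the findings is commonly defined as the edit distance between the recognized text and the reference transcription, divided by the length of the reference. The sketch below follows that common definition; the abstract does not state the authors' exact normalization, and the function names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute or match
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference, hypothesis):
    """CER = edit distance / reference length (assumed definition)."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

For example, an OCR output of `1oan` against the reference `loan` has one substitution in four characters, giving a CER of 0.25.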
7

Gudonis, Vytautas. "THE IMAGE A VISUALLY IMPAIRED PERSON IN PHILATELY AS A MEANS OF FORMING AN ADEQUATE ATTITUDE TOWARDS THE BLINDNESS AND BLIND." SOCIETY. INTEGRATION. EDUCATION. Proceedings of the International Scientific Conference 4 (May 21, 2019): 390. http://dx.doi.org/10.17770/sie2019vol4.3962.

Abstract:
The topic of blindness and the image of a blind person in philately, although rarely analysed, has great informational potential. This research topic is part of our research "The Image of a Blind Man in the Cultural Heritage of Humanity." The purpose of the study is to systematize knowledge on the subject of image and blindness in philately and to consider the social aspects of this phenomenon. To collect information, the study used analysis of the literature and a search for postage stamps depicting blind people in the private collections of philatelists. The iconological method of interpreting culture and art history was also used, which made it possible to reveal the meaning of visions, symbols and their contexts. The monograph is based on the methodological assumptions of the art historians Aby Warburg (1866 – 1929) and Erwin Panofsky (1892 – 1968), who claimed that historical and social aspects could be revealed through works of art. E. Panofsky states that works of art, as human signs, can like other works be considered documents encoding the knowledge of the epoch, its culture and attitudes. The work of art is a symbol, indicating “something else” and allowing us to perceive the allegory; it is a document, telling us about certain cultural, religious, social and historic phenomena, depending on the context. The images of blind people on stamps and commemorative envelopes are divided into separate themes and analysed as social phenomena. The image of the blind and the topic of blindness in philately make it possible to learn more about the blind, their potential, embossed writing, and the specifics of their orientation and mobility, and at the same time to form positive attitudes towards visually impaired people. These findings encourage further research on the image of a blind person in other areas of cultural heritage.
8

Deju, Georgeta. "The Restoration of a Hunedoara Noble Family’s Library: the Kenderesi from Sălaşu de Sus." Études bibliologiques/Library Research Studies 3, no. 3 (2021): 29–50. http://dx.doi.org/10.33993/eb.2021.03.

Abstract:
The Kenderesi family from Sălaşu de Sus was one of the oldest families of Hunedoara County, with roots in the well-known Romanian Cânde family from Ţara Haţegului. In 1887, Mihály Kenderesi and his son Ernő donated their ‘beautiful book collection’ to the Society of History and Archaeology of Hunedoara County. The information appeared in the librarian’s report for 1887, which did not specify the titles or number of the books. Thus, a research project began with the aim of identifying the books bearing the notes and signatures of Kenderesi family members among the documents of the Deva Museum Library. The initial findings were 21 books printed in the 16th-19th centuries. After a few years, an archived document on the donation of the books was identified, listing 54 items comprising one periodical and books, some in rare editions. A comparison between the donation list and the inventories of the Deva Museum Library indicated 15 missing books, whilst the others were identified in the library. Considering the number of volumes, their age, their topics and the notes preserved on the pages of the books, the Kenderesi family donation that entered the collections of the Deva Museum Library in 1887 is significant for the region of Hunedoara and enriches the current picture of its bibliophile heritage.
9

Petrov, Nikolai I. "Sacralization of the Portrait of the Empress Elizabeth Petrovna in the Orenburg Gubernia in the 1760s." Herald of an archivist, no. 1 (2023): 144–58. http://dx.doi.org/10.28995/2073-0101-2023-1-144-158.

Abstract:
An interesting example of the Russian phenomenon of monarch sacralization is reflected in the “lowest report” (1767) sent to the Most Holy Governing Synod by the retired secretary of the Bugulma voivodeship chancellery Ivan Nikiforovich Kurcheev (see appendix). In 1762 I. N. Kurcheev was an eyewitness to a miraculous phenomenon connected with the portrait of the Russian Empress Elizabeth Petrovna: “... And from the aforesaid image of the Most Radiant Monarchess there emanated an effulgence which attained the image of the Savior stationed in my house ...” Later I. N. Kurcheev began to revere this portrait as an icon: “... I began to put candles before this image ...” This was known to the local clergy and I. N. Kurcheev himself was convinced of the permissibility of such veneration of the Empress Elizabeth’s portrait, to whose intercession he ascribed the healing of his mother, children, and ward. I. N. Kurcheev formed a local cult of the deceased Empress, convinced of her holiness and of imperishability of her “relics” (when writing of Elizabeth’s death he used the word “dormition”). This conviction was based on the connection between the icon of the Savior and the portrait of Empress Elizabeth, miraculously shown to I. N. Kurcheev. The mentioned service of supplication (moleben) to Empress Elizabeth, which was done “under the name” of her saint patroness-namesake, as she was not canonized by the Church, correlates with the image of St. Elisabeth bearing likeness of Empress Elizabeth in the original Russian worship service to Sts. Zachariah the Prophet and Elisabeth the Righteous (the 5th of September) in the late 19th century. The later archival caption of I. N. Kurcheev’s report was supplemented by a laconic note that some “stucco image” of the Empress Elizabeth was attached to the document. Apparently, this refers to the said portrait, from which, according to I. N. Kurcheev, “emanated an effulgence” in 1762. 
There is no additional information on this “stucco image.” One can assume that it was a painted bas-relief plaster portrait of the Empress Elizabeth, probably similar to mid-18th century reproductions of the lead portrait of the Empress Elizabeth by B. C. Rastrelli (1743), which are now stored in museum collections. The published document is a peculiar and striking source on the Russian tradition of sacralizing the monarch’s portrait. This phenomenon of Russian folk piety developed in the 19th – 20th centuries.
10

Brown, JP. "Introduction to 3D Imaging Using Photogrammetry." Biodiversity Information Science and Standards 2 (July 4, 2018): e27029. http://dx.doi.org/10.3897/biss.2.27029.

Abstract:
This full day workshop will provide an introduction to 3D imaging using photogrammetry. The course is designed for museum professionals who are already familiar with using digital single lens reflex (DSLR) cameras, and want to extend their practice to 3D imaging. Photogrammetry is a low-cost-of-entry 3D imaging method which can be used to produce excellent results for many different museum specimens, and scales well. From large buildings to tiny clay molds, photogrammetry has been used to successfully model and document a very wide variety of museum material in full color and in three dimensions. The technique can also be extended to multi-spectral imaging. The workshop will be hands-on and will cover camera setup, lighting, and image processing, imaging flat and contoured specimens. We will look at working at different scales, and metric photogrammetry using Agisoft Photoscan. The course will be led by a museum professional with five years of experience of using photogrammetry to image museum collections from bivalves and taxidermy to textiles, and fossils to furniture. Due to the intensive and fast-moving nature of the workshop, participation is limited to eight people. Participants will be expected to bring a DSLR and a laptop computer to the workshop.
11

Stanford, Charlotte A. "Beyond Words: New Research on Manuscripts in Boston Collections, ed. Jeffrey F. Hamburger, Lisa Fagin Davis, Anne-Marie Eze, Nancy Netzer, and William P. Stoneman. Text, Image, Context: Studies in Medieval Manuscript Illumination, 8. Toronto: Pontifical Institute of Mediaeval Studies, 2021, 361 pp, 291 col. Ill." Mediaevistik 34, no. 1 (January 1, 2021): 274–76. http://dx.doi.org/10.3726/med.2021.01.20.

Abstract:
This study stems from an exhibition/ conference of the same name, “Beyond Words,” presented in Boston in 2006; however, it goes well beyond the bounds of a conventional exhibition catalog, which was produced at the time to accompany the objects on display. The volume produced here expands these initial parameters to consider additional questions about the manuscripts held in these Boston collections, notably Houghton Library at Harvard University, McMullen Museum of Art at Boston College, and the Isabella Stewart Gardner Museum of Boston. The book is divided into four major sections, devoted respectively to monastic manuscripts (3 essays), courtly culture and patronage (5 essays), princes, patricians, prelates and pontiffs (4 essays), and illuminating history (3 essays) with a coda on manuscripts in the modern era provided by the final essay. As the editors remark in their introduction, the emphasis is Christian and central European; this is due in part to the collection parameters themselves (the above institutions have no Ethiopian or Hebrew manuscripts, for example) and in part by limitations of time and focus (there are a number of Islamic manuscripts in the Boston collections which have not been included here but would be well worth exploring in a separate study of their own). The richness and depth of the sixteen essays here offer insights into many aspects of the late medieval world. The chapter by Patricia Stirnemann on Gilbert de la Porrée traces book collection of the works of a single, theologically problematic author, and offers a valuable case study on the transmission of writings by a scholar charged (though exonerated) with heresy. 
Brigitte Miriam Bedos-Rezak demonstrates how the charters of the abbey of Sawley preserved in the Houghton library allow us to consider the “medial role” of document writing, and how this practice assisted an English Cistercian monastery to shape its own representation with its neighbors by crafting records of land ownership disputes. Kathryn M. Rudy examines manuscript workshops among nuns in Delft in the fifteenth century, providing a vivid model of book production practices in these devotional contexts.
12

Sivkov, S. I., S. P. Simakov, and A. I. Vinokur. "Algorithms for contactless scanning of book monuments." Proceedings of SPSTL SB RAS, no. 3 (September 21, 2021): 9–15. http://dx.doi.org/10.20913/2618-7575-2021-3-9-15.

Abstract:
The article is devoted to the preservation of cultural heritage through the creation of a digital collection of book monuments. The original documents are monuments of book culture, and their dilapidated state requires careful handling; splitting documents for scanning is extremely undesirable. The market does not offer equipment for contactless scanning of books without unbinding them, so an algorithm that allows book monuments to be digitized in a contactless way has been developed. The technique is based on projecting a light grid onto the scanned object. The authors propose a sequence of actions consisting of image processing and comparison of the results between two images. The first snapshot determines the initial parameters of the grid; the second snapshot determines the actual distortion of the test snapshot. Subsequent mathematical processing, using a two-dimensional array of corrections, yields scanned images free of geometric distortions of the scanned page. The application of the system has been modeled on the example of «The legend of the destruction of Siberian cities of Tara and Tyumen by the lesser Tatars / / Collection of moral stories, words, lives and other articles [hand.]». The evaluation parameters of the simulation result were text distinctness, absence of geometric distortions, color quality, uniformity of document scanning quality within a single book, etc.; experts checked these and rated them as high. The experience described opens up possibilities for digitizing book monuments using the new algorithm. Further development of the system is aimed at expanding the database of objects of material culture to be digitized, perfecting the software, improving the quality of digital images, and improving the capabilities of image recognition and of searching for the document itself and the information it contains.
13

Demner-Fushman, Dina, Marc D. Kohli, Marc B. Rosenman, Sonya E. Shooshan, Laritza Rodriguez, Sameer Antani, George R. Thoma, and Clement J. McDonald. "Preparing a collection of radiology examinations for distribution and retrieval." Journal of the American Medical Informatics Association 23, no. 2 (July 1, 2015): 304–10. http://dx.doi.org/10.1093/jamia/ocv080.

Abstract:
Objective: Clinical documents made available for secondary use play an increasingly important role in discovery of clinical knowledge, development of research methods, and education. An important step in facilitating secondary use of clinical document collections is easy access to descriptions and samples that represent the content of the collections. This paper presents an approach to developing a collection of radiology examinations, including both the images and radiologist narrative reports, and making them publicly available in a searchable database.
Materials and Methods: The authors collected 3996 radiology reports from the Indiana Network for Patient Care and 8121 associated images from the hospitals’ picture archiving systems. The images and reports were de-identified automatically and then the automatic de-identification was manually verified. The authors coded the key findings of the reports and empirically assessed the benefits of manual coding on retrieval.
Results: The automatic de-identification of the narrative was aggressive and achieved 100% precision at the cost of rendering a few findings uninterpretable. Automatic de-identification of images was not quite as perfect. Images for two of 3996 patients (0.05%) showed protected health information. Manual encoding of findings improved retrieval precision.
Conclusion: Stringent de-identification methods can remove all identifiers from text radiology reports. DICOM de-identification of images does not remove all identifying information and needs special attention to images scanned from film. Adding manual coding to the radiologist narrative reports significantly improved relevancy of the retrieved clinical documents. The de-identified Indiana chest X-ray collection is available for searching and downloading from the National Library of Medicine (http://openi.nlm.nih.gov/).
14

Noel, Steven, Chee-Hung Henry Chu, and Vijay Raghavan. "Co-Citation Count vs Correlation for Influence Network Visualization." Information Visualization 2, no. 3 (September 2003): 160–70. http://dx.doi.org/10.1057/palgrave.ivs.9500049.

Abstract:
Visualization of author or document influence networks as a two-dimensional image can provide key insights into the direct influence of authors or documents on each other in a document collection. The influence network is constructed based on the minimum spanning tree, in which the nodes are documents and an edge is the most direct influence between two documents. Influence network visualizations have typically relied on co-citation correlation as a measure of document similarity. That is, the similarity between two documents is computed by correlating the sets of citations to each of the two documents. In a different line of research, co-citation count (the number of times two documents are jointly cited) has been applied as a document similarity measure. In this work, we demonstrate the impact of each of these similarity measures on the document influence network. We provide examples, and analyze the significance of the choice of similarity measure. We show that correlation-based visualizations exhibit chaining effects (low average vertex degree), a manifestation of multiple minor variations in document similarities. These minor similarity variations are absent in count-based visualizations. The result is that count-based influence network visualizations are more consistent with the intuitive expectation of authoritative documents being hubs that directly influence large numbers of documents.
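The difference between the two similarity measures can be illustrated with a small sketch. It assumes the citation data is a list of reference sets, one per citing paper, and it reads "correlation" as the Pearson correlation between the two documents' citation indicator vectors. That reading is one plausible interpretation of the abstract, not necessarily the authors' exact formulation, and the function names are hypothetical.

```python
import math


def cocitation_count(citing_papers, a, b):
    """Number of citing papers whose reference set contains both a and b."""
    return sum(1 for refs in citing_papers if a in refs and b in refs)


def citation_correlation(citing_papers, a, b):
    """Pearson correlation of the 0/1 'is cited by this paper' vectors of a and b."""
    xs = [1.0 if a in refs else 0.0 for refs in citing_papers]
    ys = [1.0 if b in refs else 0.0 for refs in citing_papers]
    n = len(citing_papers)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for a constant vector
    return cov / math.sqrt(var_x * var_y)
```

The count is a coarse integer, so many document pairs tie, whereas the correlation takes finely graded values. That is consistent with the abstract's observation that small similarity variations appear in correlation-based networks and are absent from count-based ones.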
15

Lizunov, Petro, Andrii Biloshchytskyi, Alexander Kuchansky, and Yurii Andrashko. "Combined methods for identifying incomplete duplicates in scientific publications." Management of Development of Complex Systems, no. 48 (December 20, 2021): 85–94. http://dx.doi.org/10.32347/2412-9933.2021.48.85-94.

Abstract:
Recognition of incomplete duplicates of images and tables is considered. To recognize graphical data (for image classification and compression), wavelet analysis is used with a set of classic characteristic functions: Morlet and Haar wavelets, the Mexican hat wavelet, etc. Special types of filters based on the so-called ridgelet, curvelet and beamlet transforms are also used. The main classical methods of image collection clustering that can be used to find incomplete duplicates in the graphic data of electronic documents are considered. The Harris method, which determines the reference points of an image by measuring the intensity of its brightness, is analyzed. SIFT (scale-invariant feature transform) technology, a powerful tool for forming a system of invariant structural features, is also analyzed. Another class of methods that are easy to implement and use for detecting incomplete duplicate images, hash methods, is considered. It is described that there are three such signals for RGB images: brightness in the Red, Green and Blue channels. In signal processing and related fields that use the Fourier transform, decomposition of the signal into frequencies and amplitudes is usually considered. A method for identifying context-sensitive values and indexing textual data is considered, which helps to find incomplete duplicates in tables based on the textual and numerical representation of data. Similarly, the described method can be used to index numerical and textual data that are placed not in a table but inside the content of an electronic document. The results of the research are used in combination with a system for detecting incomplete duplicates in scientific documents, in particular degree dissertations.
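The abstract does not name a specific hash method, so the sketch below uses the widely known average hash as a stand-in. It assumes the image has already been converted to grayscale and shrunk to a small grid; the function names are hypothetical.

```python
def average_hash(gray):
    """One bit per pixel: 1 if the pixel is brighter than the image mean."""
    pixels = [value for row in gray for value in row]
    mean = sum(pixels) / len(pixels)
    return [1 if value > mean else 0 for value in pixels]


def hamming_distance(hash_a, hash_b):
    """Number of differing bits; small values suggest near-duplicate images."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return sum(bit_a != bit_b for bit_a, bit_b in zip(hash_a, hash_b))
```

Because each bit depends only on whether a pixel is above or below the image's own mean, a uniformly brightened copy yields the same hash, while an image with a different layout differs in many bits.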
16

Hansen, Matthias, André Pomp, Kemal Erki, and Tobias Meisen. "Data-Driven Recognition and Extraction of PDF Document Elements." Technologies 7, no. 3 (September 11, 2019): 65. http://dx.doi.org/10.3390/technologies7030065.

Abstract:
In the age of digitalization, the collection and analysis of large amounts of data is becoming increasingly important for enterprises to improve their businesses and processes, such as the introduction of new services or the realization of resource-efficient production. Enterprises concentrate strongly on the integration, analysis and processing of their data. Unfortunately, the majority of data analysis focuses on structured and semi-structured data, although unstructured data such as text documents or images account for the largest share of all available enterprise data. One reason for this is that most of this data is not machine-readable and requires dedicated analysis methods, such as natural language processing for analyzing textual documents or object recognition for recognizing objects in images. Especially in the latter case, the analysis methods depend strongly on the application. However, there are also data formats, such as PDF documents, which are not machine-readable and consist of many different document elements such as tables, figures or text sections. Although the analysis of PDF documents is a major challenge, they are used in all enterprises and contain various information that may contribute to analysis use cases. In order to enable their efficient retrievability and analysis, it is necessary to identify the different types of document elements so that we are able to process them with tailor-made approaches. In this paper, we propose a system that forms the basis for structuring unstructured PDF documents, so that the identified document elements can subsequently be retrieved and analyzed with tailor-made approaches. Due to the high diversity of possible document elements and analysis methods, this paper focuses on the automatic identification and extraction of data visualizations, algorithms, other diagram-like objects and tables from a mixed document body. For that, we present two different approaches. 
The first approach uses methods from the area of deep learning and rule-based image processing, whereas the second approach is purely based on deep learning. To train our neural networks, we manually annotated a large corpus of PDF documents with our own annotation tool; both the corpus and the tool are published together with this paper. The results of our extraction pipeline show that we are able to automatically extract graphical items with a precision of 0.73 and a recall of 0.8. For tables, we reach a precision of 0.78 and a recall of 0.94.
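For reference, precision and recall are computed from counts of true positives, false positives and false negatives, and the two can be combined into an F1 score. The sketch below uses the standard definitions with hypothetical function names. The F1 values noted afterwards are derived here from the reported precision and recall and are not figures stated in the abstract.

```python
def precision_recall(true_pos, false_pos, false_neg):
    """Standard definitions; an empty denominator yields 0.0."""
    predicted = true_pos + false_pos   # everything the system extracted
    actual = true_pos + false_neg      # everything it should have extracted
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    return precision, recall


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Applied to the reported figures, `f1_score(0.73, 0.8)` is roughly 0.76 for graphical items and `f1_score(0.78, 0.94)` is roughly 0.85 for tables.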
17

Wang, Qian, Lida Li, Chew Lim Tan, and Tao Xia. "Image Enhancement of Historical Documents Using Directional Wavelet." International Journal of Wavelets, Multiresolution and Information Processing 01, no. 03 (September 2003): 291–305. http://dx.doi.org/10.1142/s0219691303000190.

Abstract:
This paper proposes a novel algorithm to clean up a large collection of historical handwritten documents kept in the National Archives of Singapore. Due to the seepage of ink over a long period of storage, the front page of each document has been severely marred by the writing on the reverse side. Earlier attempts have been made to match both sides of a page to identify the offending strokes originating from the back so as to eliminate them with the aid of a wavelet transform. Perfect matching, however, is difficult due to document skews, differing resolutions, reverse sides inadvertently left out, and warped pages during image capture. A new approach is now proposed that does away with double-sided mapping by using a directional wavelet transform, which distinguishes foreground strokes from reverse-side strokes much better than the conventional wavelet transform. Experiments have shown that the directional wavelet operation significantly enhances the readability of each document without the need for mapping with its reverse side.
18

Kim, Gyuho, Jung Gon Kim, Kitaek Kang, and Woo Sik Yoo. "Image-Based Quantitative Analysis of Foxing Stains on Old Printed Paper Documents." Heritage 2, no. 3 (September 18, 2019): 2665–77. http://dx.doi.org/10.3390/heritage2030164.

Abstract:
We studied the feasibility of image-based quantitative analysis of foxing stains on collections of old (16th–20th century) European books stored in the Rare Book Library of the Seoul National University in Korea. We were able to quantitatively determine the foxing-affected areas on books from their photographs using newly developed image processing software (PicMan), which includes applications specifically for cultural property characterization. Dimensional and color analyses of the photographs were successfully carried out quantitatively. Histograms of RGB (red, green, blue) pixels of photographs clearly showed the change in color distribution of foxing stains compared to the other areas of the photographs. Several sample images of quantitative measurement of foxing stains and virtually restored images were generated to provide easy visual inspection and comparison between restored images and the original photographs. Image quality, resolution, and digital file format requirements for quantitative analysis are described. Image-based quantitative analysis of foxing stains on paper documents is found to be very promising for automating the objective characterization of photographs of cultural properties. This technique can be used to create a cultural property digital database. Quantitative and statistical analysis techniques can be introduced to monitor the effect of the storage and conservation environment on cultural properties.
19

Sari, Toufik, Abderrahmane Kefali, and Halima Bahi. "Text Extraction from Historical Document Images by the Combination of Several Thresholding Techniques." Advances in Multimedia 2014 (2014): 1–10. http://dx.doi.org/10.1155/2014/934656.

Abstract:
This paper presents a new technique for the binarization of historical document images characterized by deterioration and damage that make their automatic processing difficult at several levels. The proposed method is based on hybrid thresholding, combining the advantages of global and local methods, and on the mixture of several binarization techniques. It has two stages. In the first stage, global thresholding is applied to the entire image and two different thresholds are determined, with which most of the image pixels are classified into foreground or background. In the second stage, the remaining pixels are assigned to the foreground or background class based on local analysis. In this stage, several local thresholding methods are combined and the final binary value of each remaining pixel is chosen as the most probable one. The proposed technique has been tested on a large collection of standard and synthetic documents and compared with well-known methods using standard measures, and was shown to be more powerful.
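The two-stage idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the two global thresholds are chosen or which local methods are combined, so the parameters `t_low` and `t_high`, the three local rules (local mean, local mid-range, and a Niblack-style mean plus `k` times the standard deviation), and the majority vote are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def window(img, r, c, radius):
    """Return the gray values in the square neighbourhood around (r, c)."""
    rows, cols = len(img), len(img[0])
    return [img[i][j]
            for i in range(max(0, r - radius), min(rows, r + radius + 1))
            for j in range(max(0, c - radius), min(cols, c + radius + 1))]


def hybrid_binarize(img, t_low, t_high, radius=1, k=-0.2):
    """Binarize a grayscale image (list of rows); 1 = foreground (dark ink).

    Stage 1: pixels at or below t_low are foreground, pixels at or above
    t_high are background (hypothetical global thresholds).
    Stage 2: each remaining pixel is decided by a majority vote of three
    local thresholding rules (an assumed combination).
    """
    out = []
    for r, row in enumerate(img):
        out_row = []
        for c, v in enumerate(row):
            if v <= t_low:
                out_row.append(1)
            elif v >= t_high:
                out_row.append(0)
            else:
                w = window(img, r, c, radius)
                votes = [
                    v < mean(w),                  # local mean rule
                    v < (min(w) + max(w)) / 2,    # local mid-range rule
                    v < mean(w) + k * pstdev(w),  # Niblack-style rule
                ]
                out_row.append(1 if sum(votes) >= 2 else 0)
        out.append(out_row)
    return out
```

Calling `hybrid_binarize(image, t_low=50, t_high=180)` on a small list-of-lists image classifies clearly dark and clearly bright pixels directly and votes only on the ambiguous ones, which is the division of labour the abstract describes.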
20

Dhanikonda, Srinivasa Rao, Ponnuru Sowjanya, M. Laxmidevi Ramanaiah, Rahul Joshi, B. H. Krishna Mohan, Dharmesh Dhabliya, and N. Kannaiya Raja. "An Efficient Deep Learning Model with Interrelated Tagging Prototype with Segmentation for Telugu Optical Character Recognition." Scientific Programming 2022 (August 29, 2022): 1–10. http://dx.doi.org/10.1155/2022/1059004.

Abstract:
More than 66 million people in India speak Telugu, a language that dates back thousands of years and is widely spoken in South India. There has not been much progress reported on the advancement of Telugu text Optical Character Recognition (OCR) systems. Telugu characters can be composed of many symbols joined together. OCR is the process of turning a document image into editable text that may be used in other applications. It saves a great deal of time and effort by not having to start from scratch each time. There are hundreds of thousands of different combinations of modifiers and consonants when writing compound letters. Symbols joined to one another form a compound character. Since there are so many output classes in Telugu, there’s a lot of interclass variation. Additionally, there are no Telugu OCR systems that make use of recent breakthroughs in deep learning, which prompted us to create our own. When used in conjunction with a word processor, an OCR system has a significant impact on real-world applications. In a Telugu OCR system, we offer two ways to improve symbol or glyph segmentation. When it comes to Telugu OCR, the ability to recognise Telugu text is crucial. In a picture, connected components are collections of identical pixels that are connected to one another by either 4- or 8-pixel connectivity. These connected components are known as glyphs in Telugu. In the proposed research, an efficient deep learning model with Interrelated Tagging Prototype with Segmentation for Telugu Text Recognition (ITP-STTR) is introduced. The proposed model is compared with the existing model, and the results show that the proposed model’s performance in text recognition is high.
21

Zahra, Syeda Binish. "Desktop Based: Off-line Information Retrieval System." International Journal for Electronic Crime Investigation 2, no. 4 (December 7, 2018): 4. http://dx.doi.org/10.54692/ijeci.2018.020423.

Abstract:
Information retrieval (IR) is a rapidly developing field, and changes to traditional IR techniques are introduced day by day. An IR system is intended to evaluate, examine and accumulate sources of information and return those that match the user's requirement. Today's fast-moving life demands maximum information in minimum time, which requires more effort. The main function of IR is to provide access to documents (which may be part of a collection of thousands or millions). With the help of an appropriate description, a user can recover any one document from a collection of documents. In this paper I describe my IR system, which retrieves information from any directory; this information may be in image, audio or text form. The selection of good features also allows the space, time and costs of the retrieval process to be reduced. Two documents may be considered similar in this system if they have the same name and are placed in different folders or directories. To explore the retrieval process of that system, I used Apache Lucene with Java, implemented in IntelliJ.
22

Zahra, Syeda Binish. "Desktop Based: Off-line Information Retrieval System." International Journal for Electronic Crime Investigation 2, no. 4 (December 7, 2018): 4. http://dx.doi.org/10.54692/ijeci.2019.030123.

Abstract:
Information retrieval (IR) is a rapidly developing field, and changes to traditional IR techniques are introduced day by day. An IR system is intended to evaluate, examine and accumulate sources of information and return those that match the user's requirement. Today's fast-moving life demands maximum information in minimum time, which requires more effort. The main function of IR is to provide access to documents (which may be part of a collection of thousands or millions). With the help of an appropriate description, a user can recover any one document from a collection of documents. In this paper I describe my IR system, which retrieves information from any directory; this information may be in image, audio or text form. The selection of good features also allows the space, time and costs of the retrieval process to be reduced. Two documents may be considered similar in this system if they have the same name and are placed in different folders or directories. To explore the retrieval process of that system, I used Apache Lucene with Java, implemented in IntelliJ.
23

Edwards, E. "Nigerian Collections in Pitt Rivers Museum Archives, University of Oxford." African Research & Documentation 55 (1991): 51–52. http://dx.doi.org/10.1017/s0305862x00015892.

Abstract:
Pitt Rivers Museum is one of the major anthropological museums in the world and as such has considerable object collections from Nigeria. Less known is its archive collection which contains a small but interesting collection of material relating to Nigeria. The Museum has been collecting archival material since its foundation in 1884 and the collections are still growing annually as more material is donated. At present the entire collection stands in the region of sixty manuscript collections of varying sizes and about 70,000 photographic images. The archive collections do not document specific objects in the museum collections (any material of this nature belongs with specific object records) but the broader historical and intellectual contexts which shaped anthropology in general and the Museum's collection in particular. The Nigerian material, although it is somewhat uneven, typifies this collecting policy and comprises both manuscripts and photographs.
24

Kalinowski, P., F. Both, T. Luhmann, and U. Warnke. "DATA FUSION OF HISTORICAL PHOTOGRAPHS WITH MODERN 3D DATA FOR AN ARCHAEOLOGICAL EXCAVATION – CONCEPT AND FIRST RESULTS." International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLIII-B2-2021 (June 28, 2021): 571–76. http://dx.doi.org/10.5194/isprs-archives-xliii-b2-2021-571-2021.

Abstract:
Through the destruction of war, most of the documents of an archaeological excavation of a megalithic tomb in north-west Germany from 1934–1939 have been irretrievably destroyed. Fortunately, more than 500 historical pictures have been preserved, which visually document the excavation situation at that time. Parts of the image collection are preserved on fragile glass plates that are difficult to preserve and have to be digitised urgently. A method for digitising these glass plates will be presented first. With the help of the digitised historical images, the excavation situation at that time shall be reconstructed. Since a reconstruction based only on the historical images is not possible, the current state of the megalithic tombs has been recorded with modern measuring technology and a 3D model has been calculated. The aim is to fuse the historical images with the modern 3D model. For this purpose, different possibilities of linking the data are presented. As first results, point clouds calculated by Structure from Motion and the orientation of historical images in relation to the modern 3D model using direct linear transformation are shown. The hybrid model of historical and modern data will be used for archaeological interpretations of the excavation.
25

Perry, Barbara. "Image as Document: The Pictorial Collection of the National Library of Australia." Australian Academic & Research Libraries 22, no. 4 (January 1991): 81–87. http://dx.doi.org/10.1080/00048623.1991.10754741.

26

Benabdelaziz, Ryma, Djamel Gaceb, and Mohammed Haddad. "Word Spotting Based on Bispace Similarity for Visual Information Retrieval in Handwritten Document Images." International Journal of Computer Vision and Image Processing 9, no. 3 (July 2019): 38–58. http://dx.doi.org/10.4018/ijcvip.2019070103.

Abstract:
Retrieving information from a huge collection of ancient handwritten documents is important for indexing, interpreting, browsing, and searching documents in various domains. Word spotting approaches are widely used in this context but have several limitations related to the complex properties of handwriting. These can appear at several steps: interest point detection, description, and matching. This article proposes a new word spotting approach for the word retrieval in handwritten document, which mainly leverages the properties of image gradients for visual features detection and description. The proposed approach is based on the combination of spatial relationships with textural information to design a more accurate matching. The experimental results of the proposed approach demonstrate a higher performance over the Jeremy Bentham dataset, evaluated following the recent benchmarks of ICDAR 2015 Competition on Keyword Spotting for Handwritten Documents.
27

Mikhaylov, D. V., and G. M. Emelyanov. "Ranking of documents of topical corpus according to their mutual relevance in the problem of estimating of affinity of a text to the sense standard." Journal of Physics: Conference Series 2052, no. 1 (November 1, 2021): 012027. http://dx.doi.org/10.1088/1742-6596/2052/1/012027.

Abstract:
This paper is devoted to the problem of the oneness and integrity of the image for the semantic pattern (i.e., sense standard) revealed phrase by phrase for some text within a topical collection. One phrase corresponds here to an extended natural-language sentence. The basis for estimating affinity to the standard is the classification of the words of each phrase in a text according to their TF-IDF values relative to some text corpus. Texts for the corpus are pre-selected by an expert. The essence of the problem is that, for each phrase, its maximal affinity to the sense standard is achieved with respect to an individual corpus document; consequently, it is necessary to estimate the mutual relevance of such documents with respect to different phrases of the analyzed text. Based on distances between TF-IDF vectors for the words of a single phrase, obtained relative to different corpus documents, a significance estimate for each such document is introduced in order to choose a pair of mutually relevant documents.
28

Samosir, Ridha Sefina. "Filtering and Wavelet Transform Algorithm for Old Document Image Restoration." ComTech: Computer, Mathematics and Engineering Applications 8, no. 3 (September 30, 2017): 177. http://dx.doi.org/10.21512/comtech.v8i3.3995.

Abstract:
The aim of this research was to develop an image restoration system using filtering and a wavelet transform algorithm. Data were collected through observation, and the system was developed using the prototyping model. The result of this research is a computer-based system that restores images containing noise. Based on the research process, filtering and the wavelet transform algorithm can be used to restore old document images affected by interference (noise).
29

Ben Arbia, Ines Ben Messaoud, Haikal El Abed, Volker Märgner, and Hamid Amiri. "Collaborative Access to Ancient Documents." International Journal of Mobile Computing and Multimedia Communications 4, no. 3 (July 2012): 34–53. http://dx.doi.org/10.4018/jmcmc.2012070103.

Abstract:
With the evolution of next-generation networks, several applications have emerged for use through the web, including applications for the analysis and recognition of documents. The output of document pre-processing may affect the efficiency of document analysis and recognition systems. In order to improve the efficiency of such systems, an objective evaluation of pre-processing steps is necessary. The authors propose a new framework for the automatic evaluation of binarization approaches. The evaluation of binarization approaches is based on the comparison between binary images and their ground truths. For that reason, a method for the generation of ground truth is proposed. This method is evaluated using the benchmarking dataset DIBCO 2009. The comparison between the results of binarization methods is based on several evaluation metrics, such as FM and PSNR. The proposed framework is tested on printed images (250 images) from the Google-Books collection and on handwritten images (60 images) from the IAM historical database.
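As a rough illustration of comparing a binary image with its ground truth, the following Python sketch computes the F-measure (FM) and PSNR in the form commonly used in binarization benchmarks. It is a sketch under stated assumptions, not the framework from the paper: the function name `fm_and_psnr`, the list-of-rows image representation, and the convention that 1 marks foreground are all hypothetical choices.

```python
import math


def fm_and_psnr(pred, truth):
    """Compare two binary images (lists of rows, 1 = foreground).

    Returns (F-measure, PSNR in dB). For binary images the peak
    difference between foreground and background is taken as 1.
    """
    tp = fp = fn = errors = total = 0
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            total += 1
            if p != t:
                errors += 1
            if p == 1 and t == 1:
                tp += 1
            elif p == 1 and t == 0:
                fp += 1
            elif p == 0 and t == 1:
                fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    fm = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    mse = errors / total
    # PSNR is unbounded when the two images are identical
    psnr = math.inf if mse == 0 else 10 * math.log10(1 / mse)
    return fm, psnr
```

A higher FM and a higher PSNR both indicate that the binarized output is closer to the ground truth.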
30

Purba, Angga Maulana, Agus Harjoko, and Mohammad Edi Wibowo. "Text Detection In Indonesian Identity Card Based On Maximally Stable Extremal Regions." IJCCS (Indonesian Journal of Computing and Cybernetics Systems) 13, no. 2 (April 30, 2019): 177. http://dx.doi.org/10.22146/ijccs.41259.

Abstract:
Most Indonesian organizations, whether governmental or non-governmental, sometimes require their members to provide their identity card (E-KTP) as a legal document for collection in their database. This collection of images is usually used for manual verification. Because each person acquires the document image with their own device, the images are captured from varying angles. This situation creates problems for text recognition by OCR software, especially in the text detection stage, where orientation and noise affect accuracy. These cases make text detection more complex, and it cannot be solved by a simple vertical projection profile of black pixels. This research proposes a method to improve text detection in identity documents by fixing the orientation first and then using MSER regions to form text regions. We fix the orientation using the lines found by the Progressive Probabilistic Hough Transform. We then use MSER to obtain all candidate regions, with horizontal RLSA acting as a connector between those candidates. The orientation-fixing strategy reaches an average margin of error of 0.377° (in a 360° system), and the text detection method reaches 84.49% accuracy under the best conditions.
31

Mikhaylov, Andrey Anatolievitch. "Automatic data labeling for document image segmentation using deep neural networks." Proceedings of the Institute for System Programming of the RAS 34, no. 6 (2022): 137–46. http://dx.doi.org/10.15514/ispras-2022-34(6)-10.

Abstract:
The article proposes a new method for automatic data annotation for solving the problem of document image segmentation using deep object detection neural networks. The format of marked PDF files is considered as the initial data for markup. The peculiarity of this format is that it includes hidden marks that describe the logical and physical structure of the document. To extract them, a tool has been developed that simulates the operation of a stack-based printing machine according to the PDF format specification. For each page of the document, an image and annotation are generated in PASCAL VOC format. The classes and coordinates of the bounding boxes are calculated during the interpretation of the labeled PDF file based on the labels. To test the method, a collection of marked up PDF files was formed from which images of document pages and annotations for three segmentation classes (text, table, figure) were automatically obtained. Based on these data, a neural network of the EfficientDet D2 architecture was trained. The model was tested on manually labeled data from the same domain, which confirmed the effectiveness of using automatically generated data for solving applied problems.
32

Cardall, Anna Catherine, Riley Chad Hales, Kaylee Brooke Tanner, Gustavious Paul Williams, and Kel N. Markert. "LASSO (L1) Regularization for Development of Sparse Remote-Sensing Models with Applications in Optically Complex Waters Using GEE Tools." Remote Sensing 15, no. 6 (March 20, 2023): 1670. http://dx.doi.org/10.3390/rs15061670.

Abstract:
Remote-sensing data are used extensively to monitor water quality parameters such as clarity, temperature, and chlorophyll-a (chl-a) content. This is generally achieved by collecting in situ data coincident with satellite data collections and then creating empirical water quality models using approaches such as multi-linear regression or step-wise linear regression. These approaches, which require modelers to select model parameters, may not be well suited for optically complex waters, where interference from suspended solids, dissolved organic matter, or other constituents may act as “confusers”. For these waters, it may be useful to include non-standard terms, which might not be considered when using traditional methods. Recent machine-learning work has demonstrated an ability to explore large feature spaces and generate accurate empirical models that do not require parameter selection. However, these methods, because of the large number of included terms involved, result in models that are not explainable and cannot be analyzed. We explore the use of Least Absolute Shrinkage and Selection Operator (LASSO), or L1, regularization to fit linear regression models and produce parsimonious models with limited terms to enable interpretation and explainability. We demonstrate this approach with a case study in which chl-a models are developed for Utah Lake, Utah, USA, an optically complex freshwater body, and compare the resulting model terms to model terms from the literature. We discuss trade-offs between interpretability and model performance while using L1 regularization as a tool. The resulting model terms are both similar to and distinct from those in the literature, thereby suggesting that this approach is useful for the development of models for optically complex water bodies where standard model terms may not be optimal.
We investigate the effect of non-coincident data, that is, the length of time between satellite image collection and in situ sampling, on model performance. We find that, for Utah Lake (for which there are extensive data available), three days is the limit, but 12 h provides the best trade-off. This value is site-dependent, and researchers should use site-specific numbers. To document and explain our approach, we provide Colab notebooks for compiling near-coincident data pairs of remote-sensing and in situ data using Google Earth Engine (GEE) and a second notebook implementing L1 model creation using scikit-learn. The second notebook includes data-engineering routines with which to generate band ratios, logs, and other combinations. The notebooks can be easily modified to adapt them to other locations, sensors, or parameters.
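To illustrate how L1 regularization yields a sparse model, the sketch below fits a LASSO regression by cyclic coordinate descent in plain Python. It is not the authors' notebook code (the abstract says theirs uses scikit-learn); the function names, the absence of an intercept, and the objective (1/(2n))·||y − Xw||² + alpha·||w||₁ are assumptions chosen for a compact, self-contained example.

```python
def soft_threshold(rho, alpha):
    """Shrink rho toward zero by alpha; values within [-alpha, alpha] become 0."""
    if rho > alpha:
        return rho - alpha
    if rho < -alpha:
        return rho + alpha
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=100):
    """Minimize (1/(2n)) * ||y - Xw||^2 + alpha * ||w||_1 without an intercept.

    X is a list of rows, y a list of targets. Each pass updates one
    weight at a time while the others are held fixed.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, alpha) / (z / n) if z else 0.0
    return w
```

With a weakly informative second feature, the soft-thresholding step sets its coefficient exactly to zero, which is the mechanism that produces the parsimonious, interpretable models the abstract describes.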
33

Behrend, Dawn. "Sex & Sexuality Module I: Research Collections from the Kinsey Institute Library & Special Collections." Charleston Advisor 21, no. 4 (April 1, 2020): 40–44. http://dx.doi.org/10.5260/chara.21.4.40.

Abstract:
Sex & Sexuality, Module I: Research Collections from the Kinsey Institute Library & Special Collections published by Adam Matthew Digital is a collection of digitized primary sources obtained exclusively from the Kinsey Institute Library & Special Collections dedicated to the study of human sexuality throughout the twentieth century. The collection makes use of the artificial intelligence capabilities of Handwritten Text Recognition (HTR) to enable keyword searching of handwritten documents. The documents and images in the collection have been meticulously digitized by Adam Matthew Digital making them discoverable, visually appealing, and adjustable. The proprietary interface is intuitive to navigate with the product being compatible with a range of browsers and electronic devices. Contract provisions are standard to the product and permit for use across locations and interlibrary loan sharing. As pricing is primarily determined by size and enrollment, the collection may be affordable for libraries of varying sizes. Users seeking more current research on gender and women’s studies may find ProQuest’s GenderWatch a more suitable choice, while those seeking information on sexuality from the sixteenth to mid-twentieth centuries may prefer Part III of Gale’s Archives of Sexuality & Gender with both resources providing access to a range of sources beyond that of the Kinsey Institute.
34

Santosh, K. C., Naved Alam, Partha Pratim Roy, Laurent Wendling, Sameer Antani, and George R. Thoma. "A Simple and Efficient Arrowhead Detection Technique in Biomedical Images." International Journal of Pattern Recognition and Artificial Intelligence 30, no. 05 (April 21, 2016): 1657002. http://dx.doi.org/10.1142/s0218001416570020.

Abstract:
In biomedical documents/publications, medical images tend to be complex by nature and often contain several regions that are annotated using arrows. In this context, automated arrowhead detection is a critical precursor to region-of-interest (ROI) labeling and image content analysis. To detect arrowheads, in this paper, images are first binarized using a fuzzy binarization technique to segment a set of candidates based on the connected component (CC) principle. To select arrow candidates, we use convexity defect-based filtering, which is followed by template matching via dynamic time warping (DTW). The DTW similarity score confirms the presence of arrows in the image. Our test results on biomedical images from the imageCLEF 2010 collection show the value of the technique and can be compared with previously reported state-of-the-art results.
35

McGlamery, Patrick, and Robert G. Cromley. "The Map Library’s Emerging Role in the Dissemination of Cartographic Information on the Internet." Cartographic Perspectives, no. 39 (June 1, 2001): 4–11. http://dx.doi.org/10.14714/cp39.605.

Abstract:
The Internet is allowing a range of cartographic products from images of map documents to numerical databases of cartographic content to be transmitted to a global user community. This technological innovation is forcing map libraries to rethink the manner in which to provide their services since libraries have traditionally had the responsibility for the storage of, and access to, information by society. The functions of a map library that allow a patron to search the holdings, go to the storage location, browse the document, and ultimately copy it in-house or check out the document can now be provided online. This paper describes the efforts and problems of collection development, assessment of user community needs and access policies associated with an internet-based map library.
36

Moscicka, Albina. "“GEOHeritage” - GIS Based Application for Movable Heritage." Geoinformatics FCE CTU 6 (December 21, 2011): 228–32. http://dx.doi.org/10.14311/gi.6.28.

Abstract:
The paper presents the results of the research project “A methodology for mapping movable heritage”. This project, financed by the Polish Ministry of Science and Higher Education in 2008–2010, was carried out by the Institute of Geodesy and Cartography in cooperation with the Research and Academic Computer Network (portal Polska.pl), the Central Archives of Historical Records and the Department of Art History of the Wroclaw University. The idea of the project was to simplify access to digital movable cultural heritage through the use of spatial information. The main aspect of the project was to use a Geographic Information System (GIS), as a technology and as a tool, to integrate different digital collections, present their content in one space and provide online access to them from one common level: an online map. The essence of the research was to present a movable monument on the online map as a multi-spatial object. The basis of this assumption is that most monuments, especially movable ones, can have several places in geographical space connected with them (several different spatial relations). As a rule, archival documents were created in one place, describe another, and today may be kept far from the place where they were prepared; what is more, parts of the same collection can be kept in different archives. Moreover, a single document can have relations (typological, thematic, temporal, spatial) with other documents. The reason is that documents concerning various places are housed in the same archive, various documents can present the same place, or the place where a particular document was created can be the place where another is housed. In the project, the basic source material was digital collections of original records. Their metadata, defined in the international standards for the description of monuments, were used to connect digital monuments with geographic space.
With the use of these standards, the Internet application for presenting cultural heritage on the map was developed. It can be found at www.GEOHeritage.polska.pl (Polish version) and www.GEOHeritage.poland.pl (English version). The application is based on the Geographic Information System (GIS), and its functionality is mainly the elements of selecting the resources, presenting the documents on the map in different ways and finding their images. The paper will present methodological solutions necessary for building on-line map of movable heritage together with the functionality of the application.
37

Liu, Xiaoyu, Shirley J. Dyke, Chul Min Yeum, Ilias Bilionis, Ali Lenjani, and Jongseong Choi. "Automated Indoor Image Localization to Support a Post-Event Building Assessment." Sensors 20, no. 6 (March 13, 2020): 1610. http://dx.doi.org/10.3390/s20061610.

Abstract:
Image data remains an important tool for post-event building assessment and documentation. After each natural hazard event, significant efforts are made by teams of engineers to visit the affected regions and collect useful image data. In general, a global positioning system (GPS) can provide useful spatial information for localizing image data. However, it is challenging to collect such information when images are captured in places where GPS signals are weak or interrupted, such as the indoor spaces of buildings. The inability to document the images’ locations hinders the analysis, organization, and documentation of these images as they lack sufficient spatial context. In this work, we develop a methodology to localize images and link them to locations on a structural drawing. A stream of images can readily be gathered along the path taken through a building using a compact camera. These images may be used to compute a relative location of each image in a 3D point cloud model, which is reconstructed using a visual odometry algorithm. The images may also be used to create local 3D textured models for building-components-of-interest using a structure-from-motion algorithm. A parallel set of images that are collected for building assessment is linked to the image stream using time information. By projecting the point cloud model to the structural drawing, the images can be overlaid onto the drawing, providing clear context information necessary to make use of those images. Additionally, components- or damage-of-interest captured in these images can be reconstructed in 3D, enabling detailed assessments having sufficient geospatial context. The technique is demonstrated by emulating post-event building assessment and data collection in a real building.
38

Saputra, Ade Chandra, and Agus Sehatman Saragih. "IMPLEMENTASI ALGORITMA RIJNDAEL DALAM ENKRIPSI DAN DEKRIPSI GAMBAR DIGITAL BERBASIS WEB." Jurnal Teknologi Informasi Jurnal Keilmuan dan Aplikasi Bidang Teknik Informatika 14, no. 1 (January 28, 2020): 52–63. http://dx.doi.org/10.47111/jti.v14i1.609.

Abstract:
Digital images are increasingly abused: personal data or information can easily be learned from them by people who are not entitled to it. This can cause material and immaterial losses to the people whose personal information is misused. The application therefore applies the Rijndael algorithm to secure digital images that contain personal information or data. The algorithm protects the information contained in these images through processes such as SubBytes, ShiftRows, MixColumns, and AddRoundKey. The methodology combines data collection methods, such as field studies and literature studies, with the Waterfall software development method (Communication, Planning, Modeling, Construction, and Deployment) for system design. The test analysis yielded an accuracy of 100%: all 14 image files tested were successfully encrypted and then decrypted back to the original image. For further development, the application could accept other document files as input and increase the key length to 192 bits and 256 bits.
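Of the four Rijndael steps the abstract names, ShiftRows is the simplest to illustrate: row r of the 4x4 state is rotated left by r positions. The sketch below is not the authors' web application. Storing the state as a list of four rows and the function names `shift_rows` and `inv_shift_rows` are assumptions chosen for readability.

```python
def shift_rows(state):
    """Rotate row r of a 4x4 state left by r positions (row 0 is unchanged)."""
    return [row[r:] + row[:r] for r, row in enumerate(state)]


def inv_shift_rows(state):
    """Undo shift_rows by rotating row r right by r positions."""
    return [row[-r:] + row[:-r] if r else row[:] for r, row in enumerate(state)]


# Hypothetical state holding the byte values 0..15, one list per row.
state = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]

shifted = shift_rows(state)
print(shifted[1])                        # [5, 6, 7, 4]
print(inv_shift_rows(shifted) == state)  # True
```

The round trip in the last line mirrors the abstract's test criterion at a small scale: decryption must restore exactly what encryption permuted.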
39

Kimwah, Junior, Ismail Ibrahim, and Baharudin Mohd Arus. "ZOOMORPHIC IMAGES IN PREHISTORIC CAVE PAINTING OF KAIN HITAM CAVE (PAINTED CAVE) NIAH, SARAWAK." International Journal of Heritage, Art and Multimedia 3, no. 8 (March 10, 2020): 12–19. http://dx.doi.org/10.35631/ijham.38002.

Abstract:
This article discusses the zoomorphic images, or images of animals, that were produced on prehistoric cave walls. Kain Hitam Cave, also known as Painted Cave, is located within the Niah Cave complex and is believed to have been inhabited by Neolithic peoples. Inside the cave, there were a variety of artifacts including boat-shaped coffins, jewelry made of shells, bones, and ceramics. There is also a cave painting produced on the wall using hematite mixed with plant material. The main focus of this article is to record all the animal images found on the cave wall. This research also attempted to document the images digitally. The researchers reproduced the images using Adobe Photoshop. The result of this research is a more organized and detailed collection of images.
40

Sandri, Eva. "Quelles utilisations des images de l’exposition sur les sites Internet de musées ? Congruence et incohérence entre objets et images numériques." Article cinq 7, no. 2 (May 7, 2015): 95–109. http://dx.doi.org/10.7202/1030252ar.

Abstract:
In this article, the author examines the relationship between the objects of a museum collection as they are arranged in the exhibition and as they are showcased on the websites of museum institutions. Since the aim is to assess the degree of congruence between these two settings, the two media, the exhibition and the website, are compared in order to bring to light the relationship between them and to understand whether it is one of subordination or of complementarity. The analysis draws on the descriptive tools of web design, using the examples of three museum institutions visited in May 2013: Boréalis Centre d’histoire de l’industrie papetière de Trois-Rivières, the Centre d’histoire de Montréal and the Musée Grévin de Montréal. Analysing the place given to the digital surrogates of these objects on the institutions’ websites yields a typology of these sites: those that explain the image and those that simply show the exhibition venue. Most of the websites studied do not give their collection objects a central place. Their content seems more focused on evoking the venue and on practical information. There would thus be a gap in the way museum objects (particularly those of history and ethnography museums) are displayed in the two media. Moreover, the movement of the collection object between the statuses of exhibit, document, source and work complicates the creation of a website that clearly distinguishes what belongs to the collection from what belongs to the description of the collection. At a time when documents and archives are increasingly becoming part of collections, the websites observed reflect this indecision.
Looking at the approach taken by these sites, one notes that the digital images show the exhibition spaces (notably their spatial organization) more than the objects themselves.
41

Arabas, Iwona, Larysa Bondar, and Lidia Czechowicz. "Nie tylko kurs historii naturalnej. Księżnej Anny Jabłonowskiej zbiór „wszystkich przedmiotów dociekań rozumu człowieka”." Kwartalnik Historii Nauki i Techniki, no. 1 (2021): 137–60. http://dx.doi.org/10.4467/0023589xkhnt.21.005.13389.

Abstract:
Not Only a Natural History Course. Duchess Anna Jabłonowska’s “Collection of All Objects of Human Reason Inquiries”. One of the richest natural history collections in Europe at the end of the 18th century was the Cabinet of Natural History of Duchess Anna Jabłonowska née Sapieha (1728–1800) in Siemiatycze. In 1802, the collection was purchased by Tsar Alexander I and handed over to the University in Moscow (where it burned down in 1812). It was only possible to recreate the richness of the collection and the way it was taken over after the sales documents had been found in 2008 in the Archive of the Russian Academy of Sciences in St. Petersburg. However, some documents were illegible, and it was only in 2020 that the entire documentation was read. It revealed a completely different image of the collection than expected, as in one part the collection refers to cabinets of curiosities. The article is the first publication in Polish on Anna Jabłonowska’s “art cabinet”, with translations of the lists of exhibits by Count Stanisław Sołtyk (from French) and by V.M. Severgin and A.F. Sevastyanov (from Russian).
42

Paiz-Reyes, Evelyn, Mathieu Brédif, and Sidonie Christophe. "Cluttering Reduction for Interactive Navigation and Visualization of Historical Images." Proceedings of the ICA 4 (December 3, 2021): 1–7. http://dx.doi.org/10.5194/ica-proc-4-81-2021.

Abstract:
Iconographic representations, such as historical photos of geographic spaces, are precious cultural heritage resources capable of describing a particular geographical area’s evolution over time. These photographic collections may vary in size, between hundreds and thousands of items. With the advent of the digital era, many of these documents have been digitized, spatialized, and are available online. Browsing through these digital image collections presents new challenges. This paper examines the topic of historical image exploration in a virtual environment enabling the co-visualization of historical photos in a contemporary 3D scene. We address the topic of user interaction considering the potential volume of the input data. Our methodology is based on design guidelines that rely on visual perception techniques to ease visual complexity and improve saliency on specific cues. The designs are additionally implemented following an image-based rendering approach and evaluated with a group of users. Overall, these propositions may be a notable addition to creating innovative ways to visualize and discover historical images in a virtual geographic environment.
43

Fioretti, Giovanna, Pasquale Acquafredda, Alessandro Monno, Vincenza Montenegro, and Ruggero Francescangeli. "Roman Marble Collections in the Earth Sciences Museum of the University of Bari (Italy): A Valuable Heritage to Support Provenance Studies." Heritage 6, no. 5 (April 30, 2023): 4054–71. http://dx.doi.org/10.3390/heritage6050213.

Abstract:
The Earth Sciences Museum of the University of Bari (Italy) holds a precious and complete nineteenth-century collection of the white marbles and colored stones that the Romans used to embellish their buildings and that were later reused in new buildings and artworks for their high symbolic and aesthetic value. This collection, arranged by Francesco and Filippo Belli, consists of 577 samples, a printed inventory and other documents, which made it possible to reconstruct the history of the collection. Another collection of 29 marble samples was donated to the museum in 2010 by the Armenise family. Both collections represent a very useful reference tool in provenance studies for marble pieces at archaeological and artistic sites and for samples of other collections. The systematic organization of these collections and their sharing among scholars, especially through the web, is clearly essential. The work presented here focuses on the most recent discoveries about Belli’s collection, on the results of the identification of Armenise’s marbles and stones, and above all, on the actions undertaken in recent years to valorize this museum’s heritage. Specifically, both collections were reorganized following novel insights about lithotypes and the provenance of each sample, a detailed database including data on each sample was created, and a website reporting information and images of the two collections was built in order to guarantee the correct dissemination of data.
44

Sudjiran, Sudjiran, and Akbar Syahbanta Limbong. "Sistem Retensi dan Alih Image Rekam Medis Inaktif RS Khusus Kanker MRCCC Siloam Semanggi." Jurnal Informatika Universitas Pamulang 6, no. 1 (March 31, 2021): 139. http://dx.doi.org/10.32493/informatika.v6i1.9638.

Abstract:
Along with the development of technology, fast data processing is needed in order to keep up with competitors. A company must have an advantage over other companies if it does not want to lose in the competition. MRCCC Siloam Semanggi is a company that provides health services for cancer patients. One of the transaction processes within the hospital is sensing data in the form of images of patient data. Image data processing activities at this hospital are not yet structured and require a database to support fast data processing. This study aims to create an image transfer system to convert physical documents into digital documents. The system helps hospital employees find documents easily for specific purposes. It is web-based and built with XAMPP, using PHP and a MySQL database. The analysis shows that problems arise in the retention system for hospital patient data. Retention data collection is usually carried out by sorting patient medical record documents from those not recorded on a computer.
45

Cârja, Ion. "Romanians in Austria-Hungary in the Years of “the Great War”. The Perspective of Visual Sources." Studia Universitatis Babeș-Bolyai Historia 66, no. 1 (February 2022): 171–86. http://dx.doi.org/10.24193/subbhist.2021.1.09.

Abstract:
The present article is based on two research experiences that resulted in the printing of two volumes of visual documents. Our aim is to present the content categories that can be found in the photographs and postcards with and about the Romanians from the Austro-Hungarian monarchy who took part in the traumatizing experience of World War I. A first theme, rich and varied, comprises the “faces” of the officers, soldiers and, last but not least, the civilians, in different situations, contexts and stances imposed by the war’s developments. There are group photographs that contain a varying number of soldiers, from two or three persons up to several dozen, along with individual photographs; in all of these photographs, there are soldiers and officers, together or separate. Next, there is a distinct category of visual materials concerning propaganda; they are mostly illustrated postcards that circulated as correspondence between the firing line and the “home front”. The symbolism of state authority, along with the image of the emperor and that of the imperial family, was a recurring presence in the imagery with which the Austro-Hungarian postcards printed during the war tried to send a loyalist message or to consolidate it in the community’s mentality. The materials related to the course of daily life near the front, as well as behind it, are particularly interesting; the photographs taken during the war usually depict non-fighting moments, the moments of rest, with varied and diverse themes. There is a special category of visual documents preserved from the time of the war that depict the suffering inflicted upon the participants and the manner in which it was “handled”. 
Among the photographs that fall in this category, we encountered those of the wounded and hospitalized soldiers, the field hospitals and the medical personnel that served near the units. Another theme directly connected to the previous one is the physical embodiment of death along the front line: photographs of funerals, graves and military cemeteries. There is a category of visual sources, from both public and private collections, that relates to the war “on the seas”: photographs and postcards of the marine troops serving in the Empire, who were stationed at Pola, on the Adriatic Sea. The photographs taken during the Great War that depict soldiers alongside civilians are of particular interest. Mostly, they show soldiers together with their own family members (mothers, wives, children, etc.) in photographs that were taken far from the front line, during leaves, when the soldiers could briefly rejoin their native communities. The Romanians who served in the war wearing the military uniform of the dual monarchy and who left its sphere of loyalty, either by becoming prisoners or by voluntary desertion, are a theme that was not overlooked by the visual sources that have survived from that period. These photographs of prisoners and Romanian volunteers from the time of the Great War are also relevant for the geographic coordinates, very far from one another, to which the course of events carried the Romanian soldiers, from France to far-away Siberia, at Vladivostok. The document images from the time of the Great War allow for a sui generis dialogue with those “who are no more”, over a temporal gap of a century. 
The camera lens often captured expressive faces, whose identity is known in the cases in which the photographs include markings and notes, along with those that offer no additional information concerning those who took the photos or their subjects; in the latter case, we can say that these images are the anonymous bearers of war’s memory. These materials offer us today the unique privilege of visually “communicating” with our forebears from a century ago, with the representatives of the humanity that plunged into the terrible adventure of World War I. Keywords: “The Great War”, Romanians, Austria-Hungary, visual sources, cultural history.
46

Zhu, Tiange, Raphaël Fournier-S’niehotta, Philippe Rigaux, and Nicolas Travers. "A Framework for Content-Based Search in Large Music Collections." Big Data and Cognitive Computing 6, no. 1 (February 23, 2022): 23. http://dx.doi.org/10.3390/bdcc6010023.

Abstract:
We address the problem of scalable content-based search in large collections of music documents. Music content is highly complex and versatile and presents multiple facets that can be considered independently or in combination. Moreover, music documents can be digitally encoded in many ways. We propose a general framework for building a scalable search engine, based on (i) a music description language that represents music content independently from a specific encoding, (ii) an extendible list of feature-extraction functions, and (iii) indexing, searching, and ranking procedures designed to be integrated into the standard architecture of a text-oriented search engine. As a proof of concept, we also detail an actual implementation of the framework for searching in large collections of XML-encoded music scores, based on the popular ElasticSearch system. It is released as open source on GitHub and is available as a ready-to-use Docker image for communities that manage large collections of digitized music documents.
47

Gatenbee, Chandler Dean, Ann-Marie Baker, Sandhya Prabhakaran, Mark Robertson-Tessi, Trevor Graham, and Alexander R. Anderson. "Abstract 2078: VALIS: Virtual Alignment of pathoLogy Image Series for multi-gigapixel whole slide images." Cancer Research 83, no. 7_Supplement (April 4, 2023): 2078. http://dx.doi.org/10.1158/1538-7445.am2023-2078.

Abstract:
Interest in spatial omics is on the rise, but generation of the highly multiplexed images used in many spatial analyses remains challenging, due to cost, expertise, methodical constraints, and/or access to technology. An alternative to performing highly multiplexed staining is to register collections of whole slide images (WSI), creating a collection of aligned images that can undergo spatial analyses. However, registration of WSI is a two-part problem, with the first being the alignment itself, and the second being the application of the transformations to huge multi-gigapixel images. To address both challenges, we have developed the Virtual Alignment of pathoLogy Image Series (VALIS) software, which enables one to rapidly and easily generate highly multiplexed images by aligning (registering) any number of multi-gigapixel whole slide images (WSI) stained using immunohistochemistry (IHC) and/or immunofluorescence (IF). Benchmarking using the publicly available 2019 ANHIR and 2022 ACROBAT Grand Challenge datasets indicates that VALIS provides state-of-the-art accuracy, being one of the most accurate publicly available methods in the ANHIR challenge, and the most accurate open-source method in the ACROBAT challenge. VALIS is able to read, register, and save multi-gigapixel images as ome.tiff, thereby addressing the second challenge. In addition to the benchmarking datasets, the generalizability of VALIS has been tested with 273 IHC samples and 340 IF samples, each of which contained between 2 and 69 images. In total, VALIS has therefore been tested with 5,138 images. The registered WSI tend to have low error and are completed within a matter of minutes. VALIS is written in Python, requires only a few lines of code for execution, is readily available and fully documented. 
VALIS therefore provides a free, open-source, flexible, scalable, robust, and easy-to-use pipeline for rigid and non-rigid registration of multi-gigapixel WSI, facilitating spatial analyses of prospective and existing datasets, breathing new life into the countless collections of brightfield and immunofluorescence images. Citation Format: Chandler Dean Gatenbee, Ann-Marie Baker, Sandhya Prabhakaran, Mark Robertson-Tessi, Trevor Graham, Alexander R. Anderson. VALIS: Virtual Alignment of pathoLogy Image Series for multi-gigapixel whole slide images [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2023; Part 1 (Regular and Invited Abstracts); 2023 Apr 14-19; Orlando, FL. Philadelphia (PA): AACR; Cancer Res 2023;83(7_Suppl):Abstract nr 2078.
48

Canedo, Daniel, Pedro Fonseca, Petia Georgieva, and António J. R. Neves. "A Deep Learning-Based Dirt Detection Computer Vision System for Floor-Cleaning Robots with Improved Data Collection." Technologies 9, no. 4 (December 1, 2021): 94. http://dx.doi.org/10.3390/technologies9040094.

Abstract:
Floor-cleaning robots are becoming increasingly more sophisticated over time and with the addition of digital cameras supported by a robust vision system they become more autonomous, both in terms of their navigation skills but also in their capabilities of analyzing the surrounding environment. This document proposes a vision system based on the YOLOv5 framework for detecting dirty spots on the floor. The purpose of such a vision system is to save energy and resources, since the cleaning system of the robot will be activated only when a dirty spot is detected and the quantity of resources will vary according to the dirty area. In this context, false positives are highly undesirable. On the other hand, false negatives will lead to a poor cleaning performance of the robot. For this reason, a synthetic data generator found in the literature was improved and adapted for this work to tackle the lack of real data in this area. This synthetic data generator allows for large datasets with numerous samples of floors and dirty spots. A novel approach in selecting floor images for the training dataset is proposed. In this approach, the floor is segmented from other objects in the image such that dirty spots are only generated on the floor and do not overlap those objects. This helps the models to distinguish between dirty spots and objects in the image, which reduces the number of false positives. Furthermore, a relevant dataset of the Automation and Control Institute (ACIN) was found to be partially labelled. Consequently, this dataset was annotated from scratch, tripling the number of labelled images and correcting some poor annotations from the original labels. Finally, this document shows the process of generating synthetic data which is used for training YOLOv5 models. These models were tested on a real dataset (ACIN) and the best model attained a mean average precision (mAP) of 0.874 for detecting solid dirt. 
These results further show that our proposal is able to use synthetic data for the training step and effectively detect dirt on real data. To our knowledge, there are no previous works reporting the use of YOLOv5 models in this application.
49

Loda, Liubov, Iryna Pigel, and Lesia Dzendzeluk. "A Study on the state of photographic documents in Vasyl Stefanyk National Scientific Library of Ukraine in Lviv." Proceedings of Vasyl Stefanyk National Scientific Library of Ukraine in Lviv, no. 11(27) (2019): 195–208. http://dx.doi.org/10.37222/2524-0315-2019-11(27)-12.

Abstract:
Collections of photographic documents constitute a considerable informational, documentary and artistic heritage. Photographs of art and architectural objects that no longer exist are of particular importance. The task of their keepers is to preserve all visual information and to ensure accessibility to users. The paper’s purpose is to study the state of photographic documents kept in Vasyl Stefanyk National Scientific Library of Ukraine in Lviv. The article provides an in-depth analysis of the types of damage, the factors that caused them, and measures for preservation. The authors highlight the role of a stable indoor climate in the safety of photographic documents. Photographs are multi-component objects and consist of several layers, each carrying out certain functions. Photographic documents have low light resistance, so improper storage may cause irreversible fading. Damage may be caused by physical, chemical and biological factors. Monitoring makes it possible to record any damage and to identify destructive processes that are just beginning. Using photographs dating from the 1860s to the 1930s from the museum collections of Shevchenko Scientific Society, Ossolinski National Institute and People’s House in Lviv, the state of the paper base and the clarity of images were assessed. The results of the study were recorded in tables, where the state of conservation was specified in detail. The authors applied a five-level system to assess damage to photographic documents. The paper describes three kinds of damage: mechanical (loss, deformation, breaks), physical and chemical (fading, color changes, spots), and biological (pigmentation, contamination by microorganisms, insects, etc.). The study made it possible to refine conservation measures and to develop means of minimizing the influence of harmful factors. Researchers’ growing interest in photographic documents has increased their usage in the Library. 
Maintaining safe storage conditions and access for research allows photographic documents to be used widely for history and art studies, cataloging and exhibitions. Keywords: photographical documents, photo collections, Shevchenko Scientific Society, People’s House, Petrushevych Museum, preservation, conservation.
50

Wasi, Md Adnan, Rakesh Das, Purnendu Sarkar, Suvajit Singha, Tanmay Barman, Sourov Kumar Kundu, Moloy Dhar, and Sayan Roy Chaudhuri. "Image Captioning Using Deep Learning." International Journal for Research in Applied Science and Engineering Technology 11, no. 6 (June 30, 2023): 521–25. http://dx.doi.org/10.22214/ijraset.2023.53625.

Abstract:
This paper focuses on developing an image captioning system using deep learning techniques. The paper aims to generate descriptive textual captions for images, enabling machines to understand and communicate the content of visual data. The methodology involves leveraging convolutional neural networks (CNNs) for image feature extraction and recurrent neural networks (RNNs) for sequential language generation. The paper includes steps such as dataset collection, data preprocessing, CNN feature extraction, RNN-based captioning model implementation, model evaluation using metrics like BLEU score and METEOR, and presenting the results obtained. The expected deliverables include a functional image captioning system, comprehensive documentation, and a well-documented codebase. Through this paper, students gain practical experience in deep learning, computer vision, and natural language processing, contributing to advancements in image understanding and human-machine interaction with visual data.