Dissertations / Theses on the topic 'Image retrieval'


Consult the top 50 dissertations / theses for your research on the topic 'Image retrieval.'


1

Ahmad, Fauzi Mohammad Faizal. "Content-based image retrieval of museum images." Thesis, University of Southampton, 2004. https://eprints.soton.ac.uk/261546/.

Abstract:
Content-based image retrieval (CBIR) is becoming increasingly important with the advance of multimedia and imaging technology. Among the many retrieval features associated with CBIR, texture retrieval is one of the most difficult, mainly because no satisfactory quantitative definition of texture exists at this time, and because of the complex nature of texture itself. Another difficult problem in CBIR is query by low-quality images, that is, attempts to retrieve images using a poor-quality image as a query. Few content-based retrieval systems have addressed this problem. Wavelet analysis is a relatively new and promising tool for signal and image analysis. Its time-scale representation provides both spatial and frequency information, giving extra information compared with other image representation schemes. This research aims to address some of the problems of query by texture and query by low-quality images by exploiting the advantages that wavelet analysis has to offer, particularly in the context of museum image collections. A novel query-by-low-quality-images algorithm is presented as a solution to the poor retrieval performance of conventional methods. For the query-by-texture problem, the thesis provides a comprehensive evaluation of wavelet-based texture methods as well as a comparison with other techniques. A novel automatic texture segmentation algorithm and an improved block-oriented decomposition are proposed for use in query by texture. Finally, all the proposed techniques are integrated in a content-based image retrieval application for museum image collections.
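As an illustration of the kind of wavelet-based texture description evaluated in this thesis, the following is a minimal sketch (not the author's method) that uses PyWavelets to summarise an image by per-subband energies and rank a database by Euclidean distance; the wavelet, decomposition level and function names are illustrative assumptions.

```python
# Minimal sketch of wavelet-energy texture features (illustrative, not the
# thesis's exact algorithm). Assumes greyscale images as 2-D numpy arrays.
import numpy as np
import pywt  # PyWavelets

def wavelet_texture_features(image, wavelet="db2", levels=3):
    """Describe an image by the energy of each wavelet subband."""
    coeffs = pywt.wavedec2(image, wavelet, level=levels)
    features = [np.mean(np.abs(coeffs[0]))]           # approximation band
    for detail in coeffs[1:]:                         # (cH, cV, cD) per level
        features.extend(np.sqrt(np.mean(band ** 2)) for band in detail)
    return np.asarray(features)

def rank_by_texture(query, database):
    """Rank database images by distance between texture feature vectors."""
    q = wavelet_texture_features(query)
    dists = [np.linalg.norm(q - wavelet_texture_features(img)) for img in database]
    return list(np.argsort(dists))
```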
2

Gibson, Stuart Edward. "Sieves for image retrieval." Thesis, University of East Anglia, 2004. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.405401.

3

Nahar, Vikas. "Content based image retrieval for bio-medical images." Diss., Rolla, Mo. : Missouri University of Science and Technology, 2010. http://scholarsmine.mst.edu/thesis/pdf/Nahar_09007dcc80721e0b.pdf.

Abstract:
Thesis (M.S.)--Missouri University of Science and Technology, 2010.
Vita. The entire thesis text is included in file. Title from title screen of thesis/dissertation PDF file (viewed Dec. 23, 2009). Includes bibliographical references (p. 82-83).
4

Saavedra, Rondo José Manuel. "Image Descriptions for Sketch Based Image Retrieval." Tesis, Universidad de Chile, 2013. http://www.repositorio.uchile.cl/handle/2250/112670.

Abstract:
Doctor of Science, Computer Science
Owing to the massive use of the Internet and the proliferation of devices capable of generating multimedia information, content-based image search and retrieval have become active research areas in computer science. However, content-based search requires an example image as the query, which can often be a serious problem that undermines the usability of an application: users commonly turn to an image search engine precisely because they do not have the desired image. An alternative way for users to express what they are looking for is a free-hand drawing composed simply of strokes, a sketch, which leads to sketch-based image retrieval. Queries of this kind are further supported by the growing availability of touch devices, which make them easy to formulate. In this work, two methods for sketch-based image retrieval are proposed. The first is a global method that computes a histogram of orientations using squared gradients; it performs outstandingly with respect to other global methods. To date, no methods have exploited the principal characteristic of sketches, their structural information: sketches lack colour and texture and mainly represent the structure of the objects being sought. A second method is therefore proposed, based on representing the structure of images by a set of primitive shapes called keyshapes. The results of this proposal have been compared with those of current methods, showing a significant increase in retrieval effectiveness. Moreover, since the keyshape-based proposal exploits a novel feature, it can be combined with other techniques to increase effectiveness further; combining it with the Bag-of-Words-based method of Eitz et al. yields an increase in effectiveness of almost 22%. Finally, two applications illustrate the potential of the proposal. The first addresses the retrieval of 3D models using a hand drawing as the query, where our results are competitive with the state of the art. The second exploits structure-based object search to improve segmentation, in particular hand segmentation in semi-controlled environments.
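The global method named above, a histogram of orientations computed from squared gradients, can be sketched as follows; this is an assumed reconstruction from the abstract, not the author's code, and the bin count is illustrative.

```python
# Hedged sketch: global orientation histogram from squared gradients.
# Doubling the gradient angle identifies opposite directions (theta and
# theta + pi), which suits stroke-like sketch content.
import numpy as np

def squared_gradient_histogram(img, bins=36):
    gy, gx = np.gradient(img.astype(float))
    gxx = gx * gx - gy * gy                     # squared-gradient components
    gxy = 2.0 * gx * gy
    angle = 0.5 * np.arctan2(gxy, gxx)          # orientation in (-pi/2, pi/2]
    magnitude = np.hypot(gx, gy)
    hist, _ = np.histogram(angle, bins=bins,
                           range=(-np.pi / 2, np.pi / 2), weights=magnitude)
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist
```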
5

Ingratta, Donato. "Texture image retrieval using fuzzy image subdivision." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1998. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape10/PQDD_0012/MQ52743.pdf.

6

Ren, Feng Hui. "Multi-image query content-based image retrieval." Access electronically, 2006. http://www.library.uow.edu.au/adt-NWU/public/adt-NWU20070103.143624/index.html.

7

Nanayakkara, Wasam Uluwitige Dinesha Chathurani. "Content based image retrieval with image signatures." Thesis, Queensland University of Technology, 2017. https://eprints.qut.edu.au/104286/1/Dinesha_Chathurani_Nanayakkara_Thesis.pdf.

Abstract:
This thesis develops a system that searches for relevant images when a user supplies a particular image as a query. The concept is similar to text search in Google or Yahoo; however, understanding image content is more difficult than understanding text. The system provides a method to retrieve images similar to the query easily and quickly, and allows end users to refine the original query iteratively, since they otherwise have no effective way to reformulate an image query. The results of empirical evaluations suggest that the system is fast and provides a broad spectrum of images even in the presence of underlying changes.
8

Larsson, Jimmy. "Taxonomy Based Image Retrieval : Taxonomy Based Image Retrieval using Data from Multiple Sources." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-180574.

Abstract:
With a multitude of images available on the Internet, how do we find what we are looking for? This project tries to determine how much the precision and recall of search queries are improved by applying a word taxonomy to traditional Text-Based Image Search and Content-Based Image Search. By applying a word taxonomy to different data sources, a strong keyword filter and a keyword extender were implemented and tested. The results show that, depending on the implementation, either the precision or the recall can be increased. By using a similar approach in real-life implementations, it is possible to push images with higher precision to the front while keeping a high recall value, thus increasing the experienced relevance of image search.
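The keyword filter and extender mentioned in the abstract might look like the following minimal sketch; the toy taxonomy and function names are invented for illustration.

```python
# Illustrative sketch of a taxonomy-based keyword filter and extender.
# The toy taxonomy below is made up for the example.
taxonomy = {
    "animal": ["dog", "cat", "bird"],
    "dog": ["terrier", "poodle"],
}

def filter_keywords(keywords):
    """Keep only terms known to the taxonomy (raises precision)."""
    known = set(taxonomy) | {t for kids in taxonomy.values() for t in kids}
    return [k for k in keywords if k in known]

def extend_keywords(keywords):
    """Add direct hyponyms of each query term (raises recall)."""
    extended = list(keywords)
    for k in keywords:
        extended.extend(taxonomy.get(k, []))
    return extended

print(extend_keywords(["dog"]))  # ['dog', 'terrier', 'poodle']
```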
9

U, Leong Hou. "Web image clustering and retrieval." Thesis, University of Macau, 2005. http://umaclib3.umac.mo/record=b1445902.

10

Manja, Philip. "Image Retrieval within Augmented Reality." Master's thesis, Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden, 2017. http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-229922.

Abstract:
The present work investigates the potential of augmented reality for improving the image retrieval process. Design and usability challenges were identified for both fields of research in order to formulate design goals for the development of concepts. A taxonomy for image retrieval within augmented reality was elaborated based on research work and used to structure related work and basic ideas for interaction. Based on the taxonomy, application scenarios were formulated as further requirements for concepts. Using the basic interaction ideas and the requirements, two comprehensive concepts for image retrieval within augmented reality were elaborated. One of the concepts was implemented using a Microsoft HoloLens and evaluated in a user study. The study showed that the concept was rated generally positive by the users, and it provided insight into different spatial behavior and search strategies when practicing image retrieval in augmented reality.
11

Zhang, Dengsheng 1963. "Image retrieval based on shape." Monash University, School of Computing and Information Technology, 2002. http://arrow.monash.edu.au/hdl/1959.1/8688.

12

Torres, Jose Roberto Perez. "Image retrieval using semantic trees." Thesis, University of East Anglia, 2008. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.493013.

13

Mohamed, Aamer S. S. "From content-based to semantic image retrieval. Low level feature extraction, classification using image processing and neural networks, content based image retrieval, hybrid low level and high level based image retrieval in the compressed DCT domain." Thesis, University of Bradford, 2010. http://hdl.handle.net/10454/4438.

Abstract:
Digital image archiving urgently requires advanced techniques for more efficient storage and retrieval because of the increasing volume of digital images. Although JPEG supplies systems to compress image data efficiently, the problems of how to organise the image database structure for efficient indexing and retrieval, how to index and retrieve image data from the DCT compressed domain, and how to interpret image data semantically are major obstacles to the further development of digital image database systems. In content-based image retrieval, image analysis is the primary step for extracting useful information from image databases. The difficulty in content-based image retrieval is how to summarise the low-level features into high-level or semantic descriptors to facilitate the retrieval procedure. Such a shift toward semantic visual data learning, or the detection of semantic objects, generates an urgent need to link low-level features with a semantic understanding of the observed visual information. To solve this 'semantic gap' problem, an efficient way is to develop a number of classifiers to identify the presence of semantic image components that can be connected to semantic descriptors. Among various semantic objects, the human face is a very important example, which is usually also the most significant element in many images and photos. The presence of faces can usually be correlated to specific scenes, with semantic inference according to a given ontology. Therefore, face detection can be an efficient tool to annotate images with semantic descriptors. In this thesis, a paradigm to process, analyse and interpret digital images is proposed. In order to speed up access to desired images, image features are extracted and presented for analysis. This analysis provides not only a structure for content-based image retrieval but also the basic units for high-level semantic image interpretation. Finally, images are interpreted and classified into semantic categories by a semantic object detection and categorisation algorithm.
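As a rough illustration of compressed-domain indexing of the kind the abstract describes, a low-level feature can be taken directly from the 8x8 block DCT that underlies JPEG; this sketch is an assumption built from the abstract, with an illustrative truncation of each block to its low-frequency corner.

```python
# Hedged sketch: low-level features from the 8x8 block DCT (JPEG's transform).
# Assumes a 2-D greyscale image; 'keep' (coefficients kept per block) is an
# illustrative choice, not the thesis's setting.
import numpy as np
from scipy.fftpack import dct

def block_dct_features(img, block=8, keep=4):
    h, w = img.shape
    h, w = h - h % block, w - w % block
    feats = []
    for y in range(0, h, block):
        for x in range(0, w, block):
            b = img[y:y + block, x:x + block].astype(float)
            c = dct(dct(b, axis=0, norm="ortho"), axis=1, norm="ortho")
            feats.append(c[:keep, :keep].ravel())     # low-frequency corner
    return np.mean(feats, axis=0)                     # average over blocks
```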
14

Dey, Sounak. "Mapping between Images and Conceptual Spaces: Sketch-based Image Retrieval." Doctoral thesis, Universitat Autònoma de Barcelona, 2020. http://hdl.handle.net/10803/671082.

Abstract:
The deluge of visual content on the Internet, from user-generated content to commercial image collections, motivates intuitive new methods for searching digital image content: how can we find certain images in a database of millions? Sketch-based image retrieval (SBIR) is an emerging research topic in which a free-hand drawing can be used to visually query photographic images. SBIR is aligned with emerging trends for visual content consumption on mobile touch-screen devices, for which gestural interactions such as sketching are a natural alternative to textual input. This thesis presents several contributions to the SBIR literature. First, we propose a cross-modal learning framework that maps both sketches and text into a joint embedding space invariant to depictive style, while preserving semantics. The resulting embedding enables direct comparison and search between sketches/text and images and is based upon a multi-branch convolutional neural network (CNN) trained using unique training schemes. The deeply learned embedding is shown to yield state-of-the-art retrieval performance on several SBIR benchmarks. Second, we propose an approach for multi-modal image retrieval in multi-labelled images. A multi-modal deep network architecture is formulated to jointly model sketches and text as input query modalities in a common embedding space, which is then further aligned with the image feature space. Our architecture also relies on salient object detection through a supervised LSTM-based visual attention model learned from convolutional features. Both the alignment between the queries and the image and the supervision of the attention on the images are obtained by generalizing the Hungarian algorithm using different loss functions. This permits encoding the object-based features and their alignment with the query irrespective of the availability of the co-occurrence of different objects in the training set. We validate the performance of our approach on standard single/multi-object datasets, showing state-of-the-art performance in every SBIR dataset. Third, we investigate the problem of zero-shot sketch-based image retrieval (ZS-SBIR), where human sketches are used as queries to retrieve photos from unseen categories. We advance prior art by proposing a novel ZS-SBIR scenario that represents a firm step forward in its practical application. The new setting uniquely recognizes two important yet often neglected challenges of practical ZS-SBIR: (i) the large domain gap between amateur sketch and photo, and (ii) the necessity of moving towards large-scale retrieval. We first contribute to the community a novel ZS-SBIR dataset, QuickDraw-Extended, consisting of 330,000 sketches and 204,000 photos spanning 110 categories. Highly abstract amateur human sketches are purposefully sourced to maximize the domain gap, instead of the often semi-photorealistic sketches included in existing datasets. We then formulate a ZS-SBIR framework to jointly model sketches and photos in a common embedding space. A novel strategy to mine the mutual information among domains is specifically engineered to alleviate the domain gap. External semantic knowledge is further embedded to aid semantic transfer. We show that, rather surprisingly, retrieval performance that significantly outperforms the state of the art on existing datasets can already be achieved with a reduced version of our model, and we further demonstrate the superior performance of our full model by comparing with a number of alternatives on the newly proposed dataset.
15

Palma, Alberto de Jesus Pastrana. "Feature Extraction, Correspondence Regions and Image Retrieval using Structured Images." Thesis, University of East Anglia, 2008. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.502556.

Abstract:
This thesis is about image descriptors, image retrieval and correspondence regions. The advantages of using scale-space for image descriptors are first discussed, and a novel implementation of the sieve algorithm is introduced, which we call 'the Structured Image'. It is shown how this implementation decomposes the image into a tree hierarchy, collecting colour and texture descriptors throughout scale-space while remaining of nearly linear order complexity. The algorithm is evaluated for correspondence repeatability rates and content-based image retrieval, and the results confirm the effectiveness of the implementation for both applications. We have also developed a graphical user interface to enable relevance feedback in our image retrieval model. The model deals with segmentations of images rather than global attributes of the image, and it has been tested using two types of segmentation. Results in terms of precision rates are presented for different iterations of relevance feedback.
16

Elliott, Desmond. "Structured representation of images for language generation and image retrieval." Thesis, University of Edinburgh, 2015. http://hdl.handle.net/1842/10524.

Abstract:
A photograph typically depicts an aspect of the real world, such as an outdoor landscape, a portrait, or an event. The task of creating abstract digital representations of images has received a great deal of attention in the computer vision literature because it is rarely useful to work directly with the raw pixel data. The challenge of working with raw pixel data is that small changes in lighting can result in different digital images, which is not typically useful for downstream tasks such as object detection. One approach to representing an image is automatically extracting and quantising visual features to create a bag-of-terms vector. The bag-of-terms vector helps overcome the problems with raw pixel data but this unstructured representation discards potentially useful information about the spatial and semantic relationships between the parts of the image. The central argument of this thesis is that capturing and encoding the relationships between parts of an image will improve the performance of extrinsic tasks, such as image description or search. We explore this claim in the restricted domain of images representing events, such as riding a bicycle or using a computer. The first major contribution of this thesis is the Visual Dependency Representation: a novel structured representation that captures the prominent region–region relationships in an image. The key idea is that images depicting the same events are likely to have similar spatial relationships between the regions contributing to the event. This representation is inspired by dependency syntax for natural language, which directly captures the relationships between the words in a sentence. We also contribute a data set of images annotated with multiple human-written descriptions, labelled image regions, and gold-standard Visual Dependency Representations, and explain how the gold-standard representations can be constructed by trained human annotators. The second major contribution of this thesis is an approach to automatically predicting Visual Dependency Representations using a graph-based statistical dependency parser. A dependency parser is typically used in Natural Language Processing to automatically predict the dependency structure of a sentence. In this thesis we use a dependency parser to predict the Visual Dependency Representation of an image because we are working with a discrete image representation – that of image regions. Our approach can exploit features from the region annotations and the description to predict the relationships between objects in an image. In a series of experiments using gold-standard region annotations, we report significant improvements in labelled and unlabelled directed attachment accuracy over a baseline that assumes there are no relationships between objects in an image. Finally, we find significant improvements in two extrinsic tasks when we represent images as Visual Dependency Representations predicted from gold-standard region annotations. In an image description task, we show significant improvements in automatic evaluation measures and human judgements compared to state-of-the-art models that use either external text corpora or region proximity to guide the generation process. In the query-by-example image retrieval task, we show a significant improvement in Mean Average Precision and the precision of the top 10 images compared to a bag-of-terms approach. 
We also perform a correlation analysis of human judgements against automatic evaluation measures for the image description task. The automatic measures are standard measures adopted from the machine translation and summarization literature. The main finding of the analysis is that unigram BLEU is less correlated with human judgements than Smoothed BLEU, Meteor, or skip-bigram ROUGE.
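The core idea of the Visual Dependency Representation, an image as directed relationships between regions, can be caricatured in a few lines; the relation vocabulary and the Jaccard comparison below are illustrative assumptions, not the thesis's annotation scheme or evaluation.

```python
# Toy sketch of images as sets of (head region, relation, dependent region)
# triples, compared by Jaccard overlap. Relations here are invented labels.
vdr_a = {("man", "beside", "bicycle"), ("man", "above", "road")}
vdr_b = {("man", "beside", "bicycle"), ("dog", "beside", "man")}

def vdr_similarity(a, b):
    return len(a & b) / len(a | b) if a | b else 0.0

print(round(vdr_similarity(vdr_a, vdr_b), 3))  # 0.333
```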
17

Ozcanli-ozbay, Ozge Can. "Image Retrieval Based On Region Classification." Master's thesis, METU, 2004. http://etd.lib.metu.edu.tr/upload/12605082/index.pdf.

Abstract:
In this thesis, a content-based image retrieval (CBIR) system to query the objects in an image database is proposed. Images are represented as collections of regions after being segmented with the Normalized Cuts algorithm. MPEG-7 content descriptors are used to encode regions in a 239-dimensional feature space. The user of the proposed CBIR system decides which objects to query and labels exemplar regions to train the system using a graphical interface. The fuzzy ARTMAP algorithm is used to learn the mapping between feature vectors and binary-coded class identification numbers, and preliminary recognition experiments prove its power as a region classifier. After training, the features of all regions in the database are extracted and classified, and simple index files enabling fast access to all regions of a given class are prepared for use in the querying phase. To retrieve images containing a particular object, the user opens an image and selects a query region together with a label in the graphical interface. The system then ranks all regions in the indexed set of the query class with respect to their L2 (Euclidean) distance to the query region and displays the resulting images. During retrieval experiments, class precisions comparable to exhaustive search of the database are maintained, which demonstrates the effectiveness of the classifier in narrowing down the search space.
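The querying phase described above, restricting search to the index of the query's class and ranking by L2 distance, reduces to a few lines; the arrays below are stand-ins (the thesis uses 239-dimensional MPEG-7 region features).

```python
# Hedged sketch of class-indexed querying: only regions pre-classified into
# the query's class are ranked, by Euclidean distance to the query region.
import numpy as np

features = np.random.rand(1000, 239)        # stand-in for all region features
class_index = {"car": [3, 17, 42, 99]}      # region ids per learned class

def query(region_feature, label, top_k=10):
    ids = class_index[label]
    dists = np.linalg.norm(features[ids] - region_feature, axis=1)
    return [ids[i] for i in np.argsort(dists)][:top_k]
```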
18

Yan, Bin. "Web Recommendation System with Image Retrieval." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2011. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-156430.

Abstract:
The amount of information on the Internet has increased dramatically in recent years, causing a problem known as "information overload" that search engines can only partially solve. Although there is a considerable literature on search engines and information overload, the problem has still not been completely overcome, owing to commercial interests, individual differences between users, and the objectivity of the search process. Addressing these concerns, recommendation systems, which are information-filtering systems that can recommend information without the explicit participation of the user, were designed to tackle these problems. A recommendation system collects the interests of users to create an independent profile for each user, compares the user profile to some reference characteristics, and recommends information of potential interest to the user. Such systems redress the shortcomings of search engines, since they focus on the specific characteristics of each user. Unlike previous literature that focuses on text, this thesis presents an improved recommendation system that also considers the information stored in images. Methods of user modelling and user-profile expression are analysed, and a new design for user profiles, joined with methods for content-based image retrieval, is presented. In this design, the new user profile contains information from images on web pages to increase the accuracy of the recommendation. Furthermore, algorithms for updating the user model according to user feedback are introduced, so that the user model can reflect changes in the user's interests. Using a real-world deployment, the thesis shows that the new system achieves better accuracy than existing text-only methods given a small amount of data. Finally, the thesis argues that feature selection in image analysis is the bottleneck for the recommendation system: it appears very hard to significantly improve the existing system without new features and semantic analysis.
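One simple way such a feedback-driven profile update could work is an exponential moving average over item feature vectors; this is purely an illustrative guess at the mechanism, not the thesis's algorithm.

```python
# Illustrative guess at a feedback-driven profile update (not the thesis's
# algorithm): move the profile toward liked items and away from disliked ones.
import numpy as np

def update_profile(profile, item_features, feedback, rate=0.1):
    """feedback in [-1, 1]; positive pulls the profile toward the item."""
    return profile + rate * feedback * (item_features - profile)
```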
19

Su, Hongjiang. "Shoeprint image noise reduction and retrieval." Thesis, Queen's University Belfast, 2008. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.486207.

Abstract:
A shoeprint is a mark made when the sole of a shoe comes into contact with a surface. People committing crimes inevitably leave their shoe marks at the crime scene, and one study suggests that footwear impressions can be located and retrieved at approximately 35% of all crime scenes. More and more shoeprint images have been collected, leading to a number of shoeprint image databases. The constantly increasing size of these databases means it takes too much time to classify or retrieve them manually. In addition, when a shoeprint is actually being made, distortion, capture-device-dependent noise and cutting-out can be introduced. This thesis deals with the problems involved in developing an automated shoeprint image classification/retrieval system. Firstly, it investigates noise and artefact reduction and the segmentation of a shoeprint from a noisy background, aiming to provide a software package for pre-processing an input shoeprint image from a variety of sources. Secondly, it develops and investigates robust descriptors for a shoeprint image, and addresses the problem of matching shoeprint images using these descriptors. In this thesis, novel techniques for image quality measurement, Gaussian-noise and germ-grain-noise reduction, pattern segmentation and screening have been developed. In addition, several low-level image feature descriptors, pattern and topological spectra, and local image features have been proposed for indexing and searching a shoeprint image dataset. The thesis also develops a prototype system to demonstrate the proposed algorithms and their application in forensic science. Shoeprint image retrieval tests on several datasets (more than 15,000 images in total) suggest that local image features, compared with other shoeprint image descriptors, have great potential for application in real-world forensic investigations.
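The pattern spectra proposed above belong to the family of morphological granulometries; a minimal sketch of such a descriptor (an assumed reconstruction, with illustrative opening sizes) follows.

```python
# Hedged sketch of a pattern spectrum (granulometry): how much image "mass"
# each successively larger morphological opening removes. Sizes illustrative.
import numpy as np
from scipy import ndimage

def pattern_spectrum(img, sizes=range(1, 8)):
    volumes = [float(img.sum())]
    volumes += [float(ndimage.grey_opening(img, size=(s, s)).sum())
                for s in sizes]
    return -np.diff(np.asarray(volumes)) / max(volumes[0], 1.0)
```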
20

Awg Iskandar, Dayang Nurfatimah. "Image Retrieval using Automatic Region Tagging." RMIT University. Computer Science and Information Technology, 2008. http://adt.lib.rmit.edu.au/adt/public/adt-VIT20090302.155704.

Abstract:
The task of automatically tagging, annotating or labelling image content with semantic keywords is a challenging problem, and tagging images semantically based on the objects that they contain is essential for image retrieval. In addressing these problems, we explore techniques that combine textual descriptions of images with visual features, automatic region tagging, and region-based ontology image retrieval. To evaluate the techniques, we use three corpora: Lonely Planet travel guide articles with images, Wikipedia articles with images, and Goats comic strips. In searching for similar images or textual information specified in a query, we explore the unification of textual descriptions and visual features (such as colour and texture) of the images. We compare the effectiveness of different retrieval similarity measures for the textual component, and analyse the effectiveness of different visual features extracted from the images. We then investigate the best weight combination of textual and visual features. Using the queries from the Multimedia Track of INEX 2005 and 2006, we find that the best weight combination significantly improves the effectiveness of the retrieval system. Our findings suggest that image regions are better at capturing semantics, since we can identify specific regions of interest in an image. In this context, we develop a technique to tag image regions with high-level semantics, by combining several shape feature descriptors and colour using an equal-weight linear combination. We experimentally compare this technique with more complex machine-learning algorithms, and show that the equal-weight linear combination of shape features is simpler and at least as effective as using a machine-learning algorithm. We focus on the synergy between ontology and image annotations, with the aim of reducing the gap between image features and high-level semantics. Ontologies ease information retrieval; they are used to mine, interpret and organise knowledge. An ontology may be seen as a knowledge base that can be used to improve the image retrieval process, and conversely keywords obtained from automatic tagging of image regions may be useful for creating an ontology. We engineer an ontology that surrogates concepts derived from image feature descriptors, and test its usability by querying it via the Visual Ontology Query Interface, which has a formally specified grammar known as the Visual Ontology Query Language. We show that synergy between ontology and image annotations is possible, and that this method can reduce the gap between image features and high-level semantics by providing the relationships between objects in the image. In this thesis, we conclude that suitable techniques for image retrieval include fusing the text accompanying images with visual features, automatic region tagging, and using an ontology to enrich the semantic meaning of the tagged image regions.
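The weighted fusion of textual and visual evidence investigated above, and the equal-weight linear combination used for region tagging, reduce to one-liners; the 0.7/0.3 weights are purely illustrative, the tuned combination being exactly what the experiments determine.

```python
# Hedged sketch of score fusion; the weights are the quantity tuned in the
# experiments (0.7/0.3 here is illustrative only).
def fused_score(text_sim, visual_sim, w_text=0.7, w_visual=0.3):
    return w_text * text_sim + w_visual * visual_sim

# Equal-weight linear combination of per-feature distances, as used for
# region tagging with shape and colour features:
def equal_weight_distance(feature_dists):
    return sum(feature_dists) / len(feature_dists)
```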
21

Li, Fang. "Content-based retrieval for image databases." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1999. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape7/PQDD_0019/MQ48276.pdf.

22

Kochakornjarupong, Paijit. "Trademark image retrieval by local features." Thesis, University of Glasgow, 2011. http://theses.gla.ac.uk/2677/.

Abstract:
The challenge of abstract trademark image retrieval as a test of machine vision algorithms has attracted considerable research interest in the past decade. Current operational trademark retrieval systems involve manual annotation of the images (the current 'gold standard'); accordingly, they require a substantial amount of time and labour to access, and are therefore expensive to operate. This thesis focuses on the development of algorithms that mimic aspects of human visual perception in order to retrieve similar abstract trademark images automatically. A significant category of trademark images are highly stylised, comprising a collection of distinctive graphical elements that often include geometric shapes. Therefore, in order to compare the similarity of such images, the principal aim of this research has been to develop a method for solving the partial matching and shape perception problem. There are few useful techniques for partial shape matching in the context of trademark retrieval, because existing techniques tend not to support multi-component retrieval. When this work was initiated, most trademark image retrieval systems represented images by means of global features, which are not suited to solving the partial matching problem. Instead, the author has investigated the use of local image features as a means of finding similarities between trademark images that only partially match in terms of their subcomponents. During the course of this work, it was established that the Harris and Chabat detectors could potentially perform sufficiently well to serve as the basis for local feature extraction in trademark image retrieval. Early findings indicated that the well-established SIFT (Scale Invariant Feature Transform) local features, based on the Harris detector, could serve as an adequate underlying local representation for matching trademark images. Few researchers have used mechanisms based on human perception for trademark image retrieval, implying that the shape representations utilised in the past do not necessarily reflect the shapes contained in these images as characterised by human perception. In response, a practical approach to trademark image retrieval by perceptual grouping has been developed, based on defining meta-features that are calculated from the spatial configurations of SIFT local image features. This new technique measures certain visual properties of the appearance of images containing multiple graphical elements, and supports perceptual grouping by exploiting the non-accidental properties of their configuration. Our validation experiments indicated that we were indeed able to capture and quantify the differences in the global arrangement of sub-components evident when comparing stylised images in terms of their visual appearance properties. Such visual appearance properties, measured using 17 proposed meta-features, include relative sub-component proximity, similarity, rotation and symmetry. Similar work on meta-features, based on the above Gestalt proximity, similarity and simplicity groupings of local features, had not been reported in the computer vision literature at the time this work was undertaken. We adopted relevance feedback to allow the visual appearance properties of relevant and non-relevant images returned in response to a query to be determined by example.
Since limited training data are available when constructing a relevance classifier from user-supplied relevance feedback, the intrinsically non-parametric machine-learning algorithm ID3 (Iterative Dichotomiser 3) was selected to construct decision trees by means of dynamic rule induction. We believe this approach to capturing high-level visual concepts, encoded by means of meta-features specified by example through relevance feedback and decision-tree classification, to support flexible trademark image retrieval, to be wholly novel. The retrieval performance of the above system was compared with two other state-of-the-art trademark image retrieval systems: Artisan, developed by Eakins (Eakins et al., 1998), and a system developed by Jiang (Jiang et al., 2006). Using relevance feedback, our system achieves higher average normalised precision than either of the systems developed by Eakins or Jiang. However, while our trademark image query and database set is based on an image dataset used by Eakins, we employed different numbers of images, and it was not possible to access the same query set and image database used in the evaluation of Jiang's trademark image retrieval system. Despite these differences in evaluation methodology, our approach would appear to have the potential to improve retrieval effectiveness.
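A flavour of the meta-features computed from spatial configurations of SIFT keypoints, here only a proximity-style summary, is sketched below; the thesis's 17 meta-features are richer, and this reconstruction is an assumption.

```python
# Hedged sketch of one proximity-style meta-feature family: statistics of
# pairwise keypoint distances, normalised by the image diagonal. Assumes
# at least two keypoints.
import numpy as np

def proximity_meta_features(keypoints, diagonal):
    """keypoints: (n, 2) array of (x, y) positions of local features."""
    d = np.linalg.norm(keypoints[:, None, :] - keypoints[None, :, :], axis=-1)
    pairwise = d[np.triu_indices(len(keypoints), k=1)] / diagonal
    return np.array([pairwise.mean(), pairwise.std(),
                     pairwise.min(), pairwise.max()])
```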
23

Stathopoulos, Vassilios. "Generative probabilistic models for image retrieval." Thesis, University of Glasgow, 2012. http://theses.gla.ac.uk/3360/.

Abstract:
Searching for information is a recurring problem that almost everyone has faced at some point: being in a library looking for a book, searching through newspapers and magazines for an old article, or searching through emails for an old conversation with a colleague. These are some of the many situations where someone (the "user") has some vague idea of the information he is looking for (an "information need") and is searching through a large number of documents, emails or articles ("information items") to find the item most "relevant" to his purpose. In this thesis we study the problem of retrieving images from large image archives. We consider two different approaches to image retrieval: content-based image retrieval, where the user searches using a query image, and semantic retrieval, where the user expresses his query using keywords. We propose a unified framework that treats both approaches using generative probabilistic models in order to rank and classify images with respect to user queries. The methodology presented in this thesis is evaluated on a real image collection and compared against state-of-the-art methods.
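Generative ranking of the kind described can be sketched as query-likelihood scoring under a smoothed multinomial per image; Jelinek-Mercer smoothing is one standard choice and is assumed here for illustration, the thesis developing its own models.

```python
# Hedged sketch of generative query-likelihood ranking: score each image by
# the log-probability of the query's (visual or textual) term counts under a
# smoothed multinomial model of that image.
import numpy as np

def log_likelihood(query_counts, image_counts, collection_freq, lam=0.8):
    p_image = image_counts / max(image_counts.sum(), 1)
    p = lam * p_image + (1 - lam) * collection_freq   # Jelinek-Mercer smoothing
    return float(np.sum(query_counts * np.log(p + 1e-12)))
```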
24

Shao, Ling. "Invariant salient regions based image retrieval." Thesis, University of Oxford, 2007. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.497094.

25

Rickman, Richard Matthew. "Image database retrieval using neural networks." Thesis, Brunel University, 1993. http://bura.brunel.ac.uk/handle/2438/4319.

Abstract:
The broad objective of this work has been to achieve retrieval of images from large unconstrained databases using image content. The problem is typified by the need to locate a target image within a database where no numerical indexing terms exist. Here, retrieval is based on important features within an image and uses sample images or user sketches to specify a query; a typical query might be framed as "Find all images similar to this one". The aim of this work has been to show how neural networks can provide a practical, flexible and robust solution to this problem. A neural network is basically an adaptive information filter which can be used to extract the salient characteristics of a data set during a training phase. The transformation learnt by the network can map images into compact indices which support very rapid fuzzy matching of images across the database, and this learning process optimises the performance of the code with respect to the contents of the database. We assess the applicability of several neural network architectures and learning rules for a practical coding scheme and investigate how the system parameters affect performance. We introduce a novel learning law which has a number of advantages over existing paradigms. In-depth mathematical analysis and extensive empirical tests are used to corroborate the arguments presented throughout. This thesis aims to show the nature of the image retrieval problem, how current research trends attempt to tackle it, and how neural networks can offer a real alternative to conventional approaches.
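The compact-index idea, mapping images to short codes that support rapid fuzzy matching, can be caricatured with a linear projection and Hamming distance; the random matrix below merely stands in for the transformation the network learns.

```python
# Hedged sketch: short binary codes via a linear map (a stand-in for the
# learned network transform) and fuzzy matching by Hamming distance.
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((32, 512))              # 512-d feature -> 32-bit code

def encode(feature):
    return (W @ feature > 0).astype(np.uint8)   # sign thresholding

def hamming(code_a, code_b):
    return int(np.count_nonzero(code_a != code_b))  # small = similar
```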
26

Lai, Ting-Sheng. "CHROMA : a photographic image retrieval system." Thesis, University of Sunderland, 2000. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.301314.

27

Kesorn, Kraisak. "Multi modal multi-semantic image retrieval." Thesis, Queen Mary, University of London, 2010. http://qmro.qmul.ac.uk/xmlui/handle/123456789/438.

Abstract:
The rapid growth in the volume of visual information, e.g. images and video, can overwhelm users' ability to find and access the specific visual information of interest to them. In recent years, ontology knowledge-based (KB) image information retrieval techniques have been adopted in order to extract knowledge from these images and enhance retrieval performance. A KB framework is presented to promote semi-automatic annotation and semantic image retrieval using multimodal cues (visual features and text captions). In addition, a hierarchical structure for the KB allows metadata to be shared that supports multiple semantics (polysemy) for concepts. The framework builds up an effective knowledge base pertaining to a domain-specific image collection, e.g. sports, and is able to disambiguate and assign high-level semantics to 'unannotated' images. Local feature analysis of visual content, namely Scale Invariant Feature Transform (SIFT) descriptors, is deployed in the 'bag of visual words' model (BVW) as an effective method to represent visual content information and to enhance its classification and retrieval. Local features are more useful than global features, e.g. colour, shape or texture, as they are invariant to image scale, orientation and camera angle. An innovative approach is proposed for the representation, annotation and retrieval of visual content using a hybrid technique based upon an unstructured visual-word model and a (structured) hierarchical ontology KB model. The structural model facilitates the disambiguation of unstructured visual words and a more effective classification of visual content, compared to a vector space model, by exploiting local conceptual structures and their relationships. The key contributions of this framework in using local features for image representation are as follows. First, a method to generate visual words using the semantic local adaptive clustering (SLAC) algorithm, which takes term weights and the spatial locations of keypoints into account, so that semantic information is preserved. Second, a technique to detect domain-specific 'non-informative visual words', which are ineffective at representing the content of visual data and degrade its categorisation ability. Third, a method to combine an ontology model with a visual-word model to resolve synonym (visual heterogeneity) and polysemy problems. The experimental results show that this approach can discover semantically meaningful visual content descriptions and recognise specific events, e.g. sports events, depicted in images efficiently. Since discovering the semantics of an image is an extremely challenging problem, one promising approach to enhancing visual content interpretation is to use any associated textual information that accompanies an image as a cue to predict its meaning, by transforming this textual information into a structured annotation, e.g. using XML, RDF, OWL or MPEG-7. Although text and image are distinct types of information representation and modality, there are some strong, invariant, implicit connections between images and any accompanying textual information. Semantic analysis of image captions can be used by image retrieval systems to retrieve selected images more precisely. To do this, Natural Language Processing (NLP) is first exploited to extract concepts from image captions.
Next, an ontology-based knowledge model is deployed in order to resolve natural-language ambiguities. To deal with the accompanying textual information, two methods to extract knowledge from it have been proposed. First, metadata can be extracted automatically from text captions and restructured with respect to a semantic model. Second, the use of LSI in relation to a domain-specific ontology-based knowledge model enables the combined framework to tolerate ambiguities and variations (incompleteness) of metadata. The ontology-based knowledge model allows the system to find indirectly relevant concepts in image captions and thus leverage them to represent the semantics of images at a higher level. Experimental results show that the proposed framework significantly enhances image retrieval and leads to a narrowing of the semantic gap between lower-level machine-derived and higher-level human-understandable conceptualisation.
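The bag-of-visual-words representation at the heart of this framework can be sketched in a few lines with scikit-learn; the vocabulary size is illustrative, and the thesis's SLAC clustering additionally weights terms and keypoint positions.

```python
# Hedged sketch of a bag-of-visual-words pipeline: cluster local descriptors
# (e.g. SIFT) into a vocabulary, then describe each image as a normalised
# histogram of word assignments. k is illustrative.
import numpy as np
from sklearn.cluster import KMeans

def build_vocabulary(all_descriptors, k=500):
    return KMeans(n_clusters=k, n_init=4, random_state=0).fit(all_descriptors)

def bovw_histogram(descriptors, vocabulary):
    words = vocabulary.predict(descriptors)
    hist = np.bincount(words, minlength=vocabulary.n_clusters).astype(float)
    return hist / max(hist.sum(), 1.0)
```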
APA, Harvard, Vancouver, ISO, and other styles
29

Tieu, Kinh H. (Kinh Han) 1976. "Boosting sparse representations for image retrieval." Thesis, Massachusetts Institute of Technology, 2000. http://hdl.handle.net/1721.1/86431.

Full text
APA, Harvard, Vancouver, ISO, and other styles
30

Mensink, Thomas. "Learning Image Classification and Retrieval Models." Thesis, Grenoble, 2012. http://www.theses.fr/2012GRENM113/document.

Full text
Abstract:
We are currently experiencing an exceptional growth of visual data; for example, millions of photos are shared daily on social networks. Image understanding methods aim to facilitate access to this visual data in a semantically meaningful manner. In this dissertation, we define several detailed goals which are of interest for the image understanding tasks of image classification and retrieval, which we address in three main chapters. First, we aim to exploit the multi-modal nature of many databases, wherein documents consist of images with a form of textual description. In order to do so we define similarities between the visual content of one document and the textual description of another document. These similarities are computed in two steps: first we find the visually similar neighbors in the multi-modal database, and then use the textual descriptions of these neighbors to define a similarity to the textual description of any document. Second, we introduce a series of structured image classification models, which explicitly encode pairwise label interactions. These models are more expressive than independent label predictors, and lead to more accurate predictions, especially in an interactive prediction scenario where a user provides the values of some of the image labels. Such an interactive scenario offers an interesting trade-off between accuracy and manual labeling effort. We explore structured models for multi-label image classification, for attribute-based image classification, and for optimizing specific ranking measures. Finally, we explore k-nearest-neighbor and nearest-class-mean classifiers for large-scale image classification. We propose efficient metric learning methods to improve classification performance, and use these methods to learn on a data set of more than one million training images from one thousand classes. Since both classification methods allow for the incorporation of classes not seen during training at near-zero cost, we also study their generalization performance. We show that the nearest-class-mean classifier can generalize from one thousand to ten thousand classes at negligible cost, and still perform competitively with the state of the art.
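As a minimal sketch of the nearest-class-mean idea discussed above (plain Euclidean distance; the thesis learns a metric on top of this), consider:

    import numpy as np

    class NearestClassMean:
        # Assign each sample to the class whose mean feature vector is closest.
        def fit(self, X, y):
            self.classes_ = np.unique(y)
            self.means_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
            return self

        def predict(self, X):
            # squared Euclidean distance from every sample to every class mean
            d = ((X[:, None, :] - self.means_[None, :, :]) ** 2).sum(axis=2)
            return self.classes_[d.argmin(axis=1)]

Adding a class not seen during training only requires computing its mean, which is the near-zero cost the abstract refers to.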
APA, Harvard, Vancouver, ISO, and other styles
31

Hare, Jonathon S. "Saliency for image description and retrieval." Thesis, University of Southampton, 2006. https://eprints.soton.ac.uk/262437/.

Full text
Abstract:
We live in a world where we are surrounded by ever-increasing numbers of images. More often than not, these images have very little metadata by which they can be indexed and searched. In order to avoid information overload, techniques need to be developed to enable these image collections to be searched by their content. Much of the previous work on image retrieval has used global features such as colour and texture to describe the content of the image. However, these global features are insufficient to accurately describe the image content when different parts of the image have different characteristics. This thesis initially discusses how this problem can be circumvented by using salient interest regions to select the areas of the image that are most interesting, and by generating local descriptors to describe the image characteristics in those regions. The thesis discusses a number of saliency detectors that are suitable for robust retrieval purposes and compares several of these region detectors. It then discusses how salient regions can be used for image retrieval through a number of techniques, most importantly two techniques inspired by the field of textual information retrieval. Using these robust retrieval techniques, a new paradigm in image retrieval is discussed, whereby the retrieval takes place on a mobile device using a query image captured by a built-in camera. This paradigm is demonstrated in the context of an art gallery, in which the device can be used to find more information about particular images. The final chapter of the thesis discusses some approaches to bridging the semantic gap in image retrieval. The chapter explores ways in which un-annotated image collections can be searched by keyword. Two techniques are discussed: the first explicitly attempts to annotate the un-annotated images automatically so that the applied annotations can be used for searching; the second does not try to annotate images explicitly but, through the use of linear algebra, attempts to create a semantic space in which images and keywords are positioned such that images are close to the keywords that represent them.
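The last approach mentioned, placing images and keywords in a shared semantic space via linear algebra, can be sketched with a truncated SVD of an image-by-keyword occurrence matrix (a simplified stand-in; the thesis's construction differs in detail):

    import numpy as np

    def semantic_space(occurrence, k=2):
        # occurrence[i, j] = 1 if keyword j annotates image i, else 0
        U, s, Vt = np.linalg.svd(occurrence.astype(float), full_matrices=False)
        image_coords = U[:, :k] * s[:k]    # images in the k-dim latent space
        keyword_coords = Vt[:k].T * s[:k]  # keywords in the same space
        return image_coords, keyword_coords

Un-annotated images projected into this space can then be searched by measuring their distance to keyword positions.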
APA, Harvard, Vancouver, ISO, and other styles
32

Rodhetbhai, Wasara. "Preprocessing for content-based image retrieval." Thesis, University of Southampton, 2009. https://eprints.soton.ac.uk/66393/.

Full text
Abstract:
The research focuses on image retrieval problems where the query is formed as an image of a specific object of interest. The broad aim is to investigate pre-processing for the retrieval of images of objects when an example image containing the object is given. The object may be set against a variety of backgrounds. Given the assumption that the object of interest is fairly centrally located in the image, normalized-cut segmentation and region-growing segmentation are investigated to segment the object from the background, but with limited success. An alternative approach comes from identifying salient regions in the image and extracting local features as a representation of those regions. The experiments show an improvement for retrieval by local features when compared with retrieval using global features from the whole image. For situations where object retrieval is required and where the foreground and background can be assumed to have different characteristics, it is useful to exclude salient regions that are characteristic of the background if they can be identified before matching is undertaken. This thesis proposes techniques to filter out salient regions believed to be associated with the background area. Background filtering using background clusters is the first technique, proposed for the situation where only background information is available for training. The second technique is K-NN classification based on foreground and background probabilities. In the last chapter, a support vector machine (SVM) with PCA-SIFT descriptors is applied in an attempt to improve classification into foreground and background salient-region classes, as sketched below. Retrieval comparisons show that the use of salient-region background filtering gives an improvement in performance when compared with the unfiltered method.
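A minimal sketch of that final filtering step (illustrative names and labels; the thesis uses PCA-SIFT descriptors of salient regions, with 1 for foreground and 0 for background here):

    from sklearn.svm import SVC

    def train_region_filter(descriptors, labels):
        # labels: 1 for foreground regions, 0 for background regions
        return SVC(kernel='rbf').fit(descriptors, labels)

    def keep_foreground(model, query_descriptors):
        # discard salient regions classified as background before matching
        return query_descriptors[model.predict(query_descriptors) == 1]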
APA, Harvard, Vancouver, ISO, and other styles
33

Torres, Fernandez Sara. "Designing Variational Autoencoders for Image Retrieval." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-234931.

Full text
Abstract:
The explosive growth of acquired visual data on the Internet has raised interest in developing advanced image retrieval systems. The main problem lies in the search for a specific image among large collections or databases, an issue shared by many users from a variety of domains, such as crime prevention, medicine or journalism. To deal with this situation, this project focuses on variational autoencoders for image retrieval. Variational autoencoders (VAE) are neural networks used for the unsupervised learning of complicated distributions through stochastic variational inference. Traditionally, they have been used for image reconstruction or generation. The goal of this thesis, however, is to test variational autoencoders for the classification and retrieval of different images from a database. This thesis investigates several methods to achieve the best performance for image retrieval applications. We use the latent variables in the bottleneck stage of the VAE as the learned features for the image retrieval task. In order to achieve fast retrieval, we focus on discrete latent features. Specifically, the sigmoid function for binarization and the Gumbel-Softmax method for discretization are investigated. The tests show that using the mean of the latent variables as features generally gives better performance than their stochastic representations. Further, discrete features that use the Gumbel-Softmax method in the latent space show good performance, close to the maximum a posteriori performance achieved with a continuous latent space.
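A minimal NumPy sketch of the Gumbel-Softmax relaxation mentioned above (the temperature tau controls how close samples are to one-hot; training details are omitted):

    import numpy as np

    def gumbel_softmax(logits, tau=0.5, rng=np.random.default_rng()):
        g = -np.log(-np.log(rng.uniform(size=logits.shape)))  # Gumbel(0,1) noise
        z = (logits + g) / tau
        e = np.exp(z - z.max(axis=-1, keepdims=True))         # stable softmax
        return e / e.sum(axis=-1, keepdims=True)

As tau decreases, the output approaches a one-hot code, giving the discrete latent features used for fast retrieval.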
APA, Harvard, Vancouver, ISO, and other styles
34

Ozendi, Mustafa. "Viewpoint Independent Image Classification and Retrieval." The Ohio State University, 2010. http://rave.ohiolink.edu/etdc/view?acc_num=osu1285011830.

Full text
APA, Harvard, Vancouver, ISO, and other styles
35

Wong, Chun Fan. "Automatic semantic image annotation and retrieval." HKBU Institutional Repository, 2010. http://repository.hkbu.edu.hk/etd_ra/1188.

Full text
APA, Harvard, Vancouver, ISO, and other styles
36

Janicki, James H. "Retrieval from an image knowledge base /." Online version of thesis, 1993. http://hdl.handle.net/1850/12196.

Full text
APA, Harvard, Vancouver, ISO, and other styles
37

Alaei, Fahimeh. "Texture Feature-based Document Image Retrieval." Thesis, Griffith University, 2019. http://hdl.handle.net/10072/385939.

Full text
Abstract:
Storing and manipulating documents in digital form, as a contribution to a paperless society, has been a consistent tendency of emerging technology. There has been notable growth in the variety and quantity of digitised documents, which have often been scanned or photographed and archived as images without any labelling or sufficient index information. The growth of these kinds of document images will undoubtedly continue with new technology. To provide an effective way of retrieving and organising these document images, many techniques have been implemented in the literature. However, designing automated systems that accurately retrieve document images from archives remains a challenging problem. Finding discriminative and effective features is the fundamental task in developing an efficient retrieval system. An overview of the literature reveals that research on document image retrieval (DIR) using texture-based features has not yet been broadly investigated. Texture features are suitable for large-volume data and are generally fast to compute. In this study, the effectiveness of more than 50 different texture-based feature extraction methods from four categories of texture features (statistical, transform-based, model-based, and structural approaches) is investigated in order to propose a more accurate method for document image retrieval. Moreover, the influence of resolution and of similarity metrics on document image retrieval is examined. The MTDB, ITESOFT, and CLEF_IP datasets, which are heterogeneous datasets providing a great variety of page layouts and contents, are considered for experimentation, and the results are computed in terms of retrieval precision, recall, and F-score. By considering the performance, time complexity, and memory usage of different texture features on the three datasets, the best category of texture features for obtaining the best retrieval results is discussed, and the effectiveness of the transform-based category over the other categories in obtaining higher retrieval results is demonstrated. Many new feature extraction and document image retrieval methods are proposed in this research. To attain fast document image retrieval, the number of extracted features and the time complexity play a significant role in the retrieval process. Thus, a fast, non-parametric texture feature extraction method based on summarising the local grey-level structure of the image is further proposed. The proposed fast local binary pattern provided promising results, with lower computing time and smaller memory consumption compared to other variants of local binary pattern-based methods. A challenge arises in DIR systems when the document images in queries are of different resolutions from the document images used to train the system; in addition, only a small number of document image samples with a particular resolution may be available for training. To investigate these two issues, an under-sampling concept is considered to generate under-sampled images and to improve the retrieval results. In order to use more than one characteristic of document images for retrieval, two different texture-based features are used for feature extraction: the fast local binary pattern method, as a statistical approach, and a wavelet analysis technique, as a transform-based approach, giving two feature vectors for every document image.
A classifier fusion method using the weighted average of the distance measures obtained for each feature vector is then proposed to improve document image retrieval results. To extract features similar to human visual system perception, an appearance-based feature extraction method for document images is also proposed. In the proposed method, the Gist operator is employed on the sub-images obtained from the wavelet transform. Thereby, a set of global features is extracted from the original image as well as from the sub-images. Wavelet-based features are also considered as the second feature set. The classifier fusion technique is finally employed to find similarity distances between the features extracted using Gist and the wavelet transform from a given query and the knowledge base. Higher document image retrieval results have been obtained from this proposed system compared to the other systems in the literature. The other appearance-based document image retrieval system proposed in this research is based on the use of a saliency map derived from human visual attention. The saliency map obtained from the input document image is used to form a weighted document image. Features are then extracted from the weighted document images using the Gist operator. The proposed retrieval system provided the best document image retrieval results compared to the results reported for other systems. Further research could be undertaken to combine the properties of other approaches to improve retrieval results. Since a priori knowledge regarding document image layout and content has not been considered in the conducted experiments, prior knowledge about the document classes may also be integrated into the feature set to further improve retrieval performance.
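A minimal 8-neighbour local binary pattern sketch (plain NumPy; the thesis's fast LBP summarises local grey-level structure similarly but with its own optimisations):

    import numpy as np

    def lbp_histogram(gray):
        c = gray[1:-1, 1:-1]
        neighbours = [gray[:-2, :-2], gray[:-2, 1:-1], gray[:-2, 2:],
                      gray[1:-1, 2:], gray[2:, 2:], gray[2:, 1:-1],
                      gray[2:, :-2], gray[1:-1, :-2]]
        code = np.zeros_like(c, dtype=int)
        for bit, n in enumerate(neighbours):
            code |= (n >= c).astype(int) << bit   # one bit per neighbour
        hist = np.bincount(code.ravel(), minlength=256).astype(float)
        return hist / hist.sum()                  # 256-bin texture signature

The classifier-fusion step can then be as simple as a weighted average of the distances computed from each feature vector, e.g. d = w * d_lbp + (1 - w) * d_wavelet.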
Thesis (PhD Doctorate)
Doctor of Philosophy (PhD)
School of Info & Comm Tech
Science, Environment, Engineering and Technology
Full Text
APA, Harvard, Vancouver, ISO, and other styles
38

Zhu, Bin, and Hsinchun Chen. "Validating a Geographic Image Retrieval System." Wiley Periodicals, Inc, 2000. http://hdl.handle.net/10150/105934.

Full text
Abstract:
Artificial Intelligence Lab, Department of MIS, University of Arizona
This paper summarizes a prototype geographical image retrieval system that demonstrates how to integrate image processing and information analysis techniques to support large-scale content-based image retrieval. By using an image as its interface, the prototype system addresses a troublesome aspect of traditional retrieval models, which require users to have complete knowledge of the low-level features of an image. In addition, we describe an experiment to validate the performance of this image retrieval system against that of human subjects, in an effort to address the scarcity of research evaluating the performance of an algorithm against that of human beings. The results of the experiment indicate that the system could do as well as human subjects in accomplishing the tasks of similarity analysis and image categorization. We also found that under some circumstances the texture features of an image are insufficient to represent a geographic image. We believe, however, that our image retrieval system provides a promising approach to integrating image processing techniques and information retrieval algorithms.
APA, Harvard, Vancouver, ISO, and other styles
39

Reddy, Vishwanath Reddy Keshi, and Praveen Bandikolla. "Image Retrieval Using a Combination of Keywords and Image Features." Thesis, Blekinge Tekniska Högskola, Avdelningen för för interaktion och systemdesign, 2008. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-3372.

Full text
Abstract:
Information retrieval systems play an important role in our day-to-day life in getting the required information. Many text retrieval systems are available and work successfully, but even though the Internet is full of other media such as images, audio and video, retrieval systems for these media are rare and have not achieved the success of text retrieval systems. Image retrieval systems are useful in many applications, and there is a high demand for effective and efficient tools to organise and retrieve images according to users' needs. Image retrieval approaches are classified into text-based and content-based image retrieval. We propose a text-based image retrieval system that makes use of an ontology to make the retrieval process intelligent, working on the domain of the Cricket World Cup 2007, and we combine this text-based approach with content-based image retrieval, which uses color and texture as basic low-level features.
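A hedged sketch of the combination step (the weighting scheme is illustrative, not the thesis's exact formulation): a keyword-match score from the ontology side is fused with a colour-histogram similarity from the content side.

    import numpy as np

    def histogram_intersection(h1, h2):
        # similarity of two L1-normalised colour histograms, in [0, 1]
        return np.minimum(h1, h2).sum()

    def combined_score(keyword_score, visual_score, alpha=0.6):
        # alpha weights the keyword/ontology evidence against the visual one
        return alpha * keyword_score + (1 - alpha) * visual_score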
APA, Harvard, Vancouver, ISO, and other styles
40

姚景岳. "Pistol Image Retrieval." Thesis, 2003. http://ndltd.ncl.edu.tw/handle/88x78j.

Full text
APA, Harvard, Vancouver, ISO, and other styles
41

Tieu, Kinh, and Paul Viola. "Boosting Image Database Retrieval." 1999. http://hdl.handle.net/1721.1/5927.

Full text
Abstract:
We present an approach for image database retrieval using a very large number of highly-selective features and simple on-line learning. Our approach is predicated on the assumption that each image is generated by a sparse set of visual "causes" and that images which are visually similar share causes. We propose a mechanism for generating a large number of complex features which capture some aspects of this causal structure. Boosting is used to learn simple and efficient classifiers in this complex feature space. Finally we will describe a practical implementation of our retrieval system on a database of 3000 images.
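A minimal scikit-learn stand-in for the boosted learner described above (default decision stumps; the complex feature generator is the paper's contribution and is not reproduced here, and the data shapes are illustrative):

    import numpy as np
    from sklearn.ensemble import AdaBoostClassifier

    rng = np.random.default_rng(0)
    X_pos = rng.random((20, 500))    # features of query-relevant images
    X_neg = rng.random((200, 500))   # features of other database images
    X = np.vstack([X_pos, X_neg])
    y = np.array([1] * 20 + [0] * 200)

    clf = AdaBoostClassifier(n_estimators=20).fit(X, y)
    scores = clf.decision_function(X)  # rank database images by this score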
APA, Harvard, Vancouver, ISO, and other styles
42

Cheng, Hsiang-Fen, and 鄭翔芬. "Image Clustering and Retrieval." Thesis, 2009. http://ndltd.ncl.edu.tw/handle/59657803465700576860.

Full text
Abstract:
Master's thesis
National Taiwan University of Science and Technology
Department of Information Management
97
Nowadays, due to the rapid growth of the World Wide Web (WWW), a large amount of multimedia data is generated on the Internet, usually compressed in JPEG format so that it can be transmitted and stored efficiently. However, current approaches to content-based image retrieval mostly focus on uncompressed images. They need to decode images to the spatial domain first, which consumes a great deal of computation and search time. Performing feature extraction and image retrieval directly in the compressed domain can therefore shorten retrieval time considerably. In addition, values obtained by only partial decoding can still represent an image's characteristics explicitly. Nevertheless, most approaches in the compressed domain still select a large number of coefficients to represent image features, or process those coefficients in additional steps to obtain features; in this way, the search time increases dramatically with the size of the image database. Hence, the purpose of this thesis is to extract only a few representative features from the compressed domain and to use these features effectively in an image retrieval system, so that the images requested by users can be retrieved efficiently. This thesis proposes efficient image clustering and retrieval approaches that improve search time and effectively retrieve similar images. Using the bisecting K-means algorithm, the images in a compressed image database are first separated according to their content, so the retrieval stage does not need to search all images in the database. Moreover, DC (Direct Current) coefficients are extracted directly from the DCT (Discrete Cosine Transform) domain without fully decoding the compressed images. Therefore, the time for similarity measurement decreases, and the features extracted from the image database are easy to manage. In addition, by using DC features in both the clustering stage and the similarity computation stage, the proposed approach can efficiently retrieve the images that match the user's demand. Experimental results reveal that the proposed approach achieves highly efficient response times and improves image retrieval performance.
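A hedged sketch of the clustering stage (simplified: always split the largest cluster, assuming each split cluster has at least two points; DC-coefficient extraction from the JPEG bitstream is not shown):

    import numpy as np
    from sklearn.cluster import KMeans

    def bisecting_kmeans(X, k):
        clusters = [np.arange(len(X))]       # start with one cluster of all images
        while len(clusters) < k:
            i = max(range(len(clusters)), key=lambda j: len(clusters[j]))
            idx = clusters.pop(i)            # split the largest cluster in two
            labels = KMeans(n_clusters=2, n_init=10).fit_predict(X[idx])
            clusters += [idx[labels == 0], idx[labels == 1]]
        return clusters

At query time, only the cluster whose centroid is nearest to the query's DC features needs to be searched.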
APA, Harvard, Vancouver, ISO, and other styles
43

傅德瑋. "Region-Based Image Retrieval." Thesis, 2003. http://ndltd.ncl.edu.tw/handle/09033606120251965326.

Full text
Abstract:
Master's thesis
National Taiwan University
Graduate Institute of Computer Science and Information Engineering
91
A content-based image region retrieval (CBRR) system can retrieve the desired image regions for a user from a large database, based on image region content. There are some differences between CBRR and a traditional content-based image retrieval (CBIR) system: CBIR focuses on global image similarity over classified category images, whereas CBRR focuses on local image region similarity. To obtain the image regions, each image must be segmented into regions; in this thesis we apply watershed segmentation, as sketched below.
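A minimal watershed sketch with scikit-image (the marker choice here is an illustrative default, not the thesis's scheme):

    import numpy as np
    from skimage.filters import sobel
    from skimage.segmentation import watershed

    def segment_regions(gray):
        edges = sobel(gray)                        # gradient-magnitude surface
        markers = np.zeros_like(gray, dtype=int)   # seed clearly dark/bright pixels
        markers[gray < 0.5 * gray.mean()] = 1
        markers[gray > 1.5 * gray.mean()] = 2
        return watershed(edges, markers)           # label image of regions

Each labelled region can then be described and indexed independently for region-level matching.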
APA, Harvard, Vancouver, ISO, and other styles
44

Wu, Jui-Chien, and 吳瑞千. "Template-based Image Retrieval." Thesis, 1998. http://ndltd.ncl.edu.tw/handle/64925670330552846623.

Full text
Abstract:
Master's thesis
National Tsing Hua University
Department of Computer Science
86
A template-based image retrieval system is proposed. The user can specify a small template and let the system find which image(s) in the database contain this template. The system stores the projections of edges as features and uses different similarity measures, such as the sum of absolute differences, variance, and elastic distance, to deal with templates with different distortions. Experiments with a distributed implementation show that the proposed method can retrieve the desired image(s) in minutes from a database of thousands of images and can tolerate minor distortions.
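A hedged sketch of the projection features and the simplest of the three similarity measures (template and window are assumed to be same-sized binary edge maps):

    import numpy as np

    def edge_projections(edge_map):
        # concatenate column sums and row sums of the edge map
        return np.concatenate([edge_map.sum(axis=0), edge_map.sum(axis=1)])

    def sad(feat_a, feat_b):
        # sum of absolute differences: smaller means more similar
        return np.abs(feat_a - feat_b).sum()

Scanning a database image then amounts to sliding a window, computing its projections, and keeping the positions with the lowest SAD against the template's projections.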
APA, Harvard, Vancouver, ISO, and other styles
45

Wu, Wei-Liang, and 吳韋良. "Semantics-based Image Retrieval." Thesis, 2016. http://ndltd.ncl.edu.tw/handle/52979211332423742224.

Full text
Abstract:
Master's thesis
National Chiao Tung University
Institute of Computer Science and Engineering
105
Image search is an important technique in multimedia applications, and image retrieval is a common technology behind it: given a query image, the goal is to retrieve relevant images from an image database. Most previous studies rely on features extracted from images to calculate the similarity of two images. One drawback of this approach is that it focuses on image-specific features without considering the semantics of the images, so images that are semantically related to the query but differ greatly in image features will not be candidates for retrieval. Many photo websites allow users to provide descriptions or tags for the photos they upload, which inspires us to use both the image itself and its description in a semantics-based image retrieval framework built with machine learning techniques. The key idea behind the proposed method is to extract important objects from the query image and classify the extracted objects into predefined labels for that query image. Then, we project the labels and the descriptions in the data set into the same latent space using deep neural networks, and calculate semantic similarity in the latent space. This thesis conducts experiments on a Flickr data set and evaluates the results by the average number of irrelevant images in the search results. The experimental results indicate that when only an image is used as the query, the retrieved results are considerably more acceptable than those of other methods.
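As a hedged stand-in for the latent-space matching (the thesis uses deep neural networks; TF-IDF vectors with cosine similarity only illustrate the ranking step, and the strings are made up):

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    descriptions = ["a dog running on the beach", "sunset over the sea"]
    query_labels = "dog beach"            # labels predicted from the query image
    vec = TfidfVectorizer().fit(descriptions + [query_labels])
    sims = cosine_similarity(vec.transform([query_labels]),
                             vec.transform(descriptions))[0]
    ranked = sims.argsort()[::-1]         # most semantically similar first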
APA, Harvard, Vancouver, ISO, and other styles
46

Jueng, Cheng-Yuan, and 鄭承淵. "Texture Image Retrieval System." Thesis, 2014. http://ndltd.ncl.edu.tw/handle/rv59ng.

Full text
Abstract:
Master's thesis
Kun Shan University
Graduate Institute of Computer and Communication
102
When designers look for leathers or cloths, the most common way to search for materials is to browse vendor catalogs. Thanks to the advance of web technologies, they can also use Internet search engines to browse image databases by giving either keywords or pictures of samples. Sometimes, however, what a designer has in mind is just a style or a concept, which is difficult to describe literally and to match against the annotations of images in the database. Using pictures of samples for image retrieval, so-called content-based image retrieval, might be a better way to solve the problems caused by poor literal descriptions. Current public search engines that support content-based image retrieval, such as Google or TinEye, retrieve images mainly based on the similarity of color histograms between the user's uploaded query image and each image in the database. However, fabric materials with the same texture might appear in different colors, so color features alone are not suitable for comparing fabric materials. In this study, we develop a content-based image retrieval system for leather and fabric materials based not only on the color histogram but also on features derived from LBP (local binary pattern), FAST (Features from Accelerated Segment Test) and the Haar wavelet. These features are able to discriminate not only textures but also pattern styles. Fabrics from 45 categories with various textures, colors and patterns are tested in our experiments, with the top nine retrieved images considered as candidates. When the search scope is set to ten categories, our system reaches an 87.37% retrieval rate; when all 45 categories are tested, it reaches a 41.27% retrieval rate. In the future, we will continue to improve the algorithms to enhance the overall retrieval rate.
APA, Harvard, Vancouver, ISO, and other styles
47

Kakde, Bhavana. "Content-Based Image Retrieval." Thesis, 2018. http://ethesis.nitrkl.ac.in/9891/1/2018_MT_216EC6250_B_Kakde_Content.pdf.

Full text
Abstract:
Content-Based Image Retrieval (CBIR) plays a useful role in image retrieval frameworks. It is a widely accepted solution for looking up a required image in a large database efficiently and effectively, without human interaction. CBIR is needed because the growth of digital images, driven by the widespread capture of images with web cameras, camera-enabled mobile phones, and digital cameras, is rendering the management of image databases tedious. It can also be used in other applications such as web search engines and social media, which store large numbers of images and require fast retrieval of the image selected by the user. Feature extraction is a crucial step in CBIR, whose effectiveness depends upon the strategy used to extract features from the given images. These features can be arranged in classes such as histogram, spatial layout, shape, texture, and color. CBIR uses these features to index and retrieve images from the database. The objective is to find a representation that is robust to variations, because users capture images under various conditions such as occlusion and varying illumination. The feature extraction methods used for CBIR here are local tetra patterns (LTrP) and dual-cross patterns (DCP). LTrP acquires more detailed information by using the four possible directions of every center pixel in an image, calculated from first-order derivatives in the horizontal and vertical directions. DCP encodes second-order information in the vertical, horizontal, and diagonal directions by encoding sample points in the local neighbourhood of every center pixel. Simulations are performed in Matlab 8.6 to evaluate the performance of the retrieval framework using LTrP and DCP, with the Corel 900 database used for this purpose. The performance of the presented techniques is evaluated in terms of average recall and average precision: for LTrP, the average recall is 42% and the average precision is 63%; for DCP, the average recall is 35% and the average precision is 61.05%. Our extensive simulations on the Corel database show that the DCP technique has lower computational complexity than the LTrP method.
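A minimal sketch of the evaluation protocol used above (per query, against ground-truth category labels):

    import numpy as np

    def precision_recall(retrieved_labels, query_label, n_relevant_in_db):
        hits = np.sum(np.asarray(retrieved_labels) == query_label)
        precision = hits / len(retrieved_labels)  # fraction of retrieved that are relevant
        recall = hits / n_relevant_in_db          # fraction of relevant that are retrieved
        return precision, recall

Averaging these over all queries yields the average precision and recall figures quoted above.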
APA, Harvard, Vancouver, ISO, and other styles
48

Tan, Hsiao-lan, and 譚曉蘭. "Similarity Retrieval for Rotated Images in Image Database." Thesis, 1999. http://ndltd.ncl.edu.tw/handle/13849315357878412082.

Full text
Abstract:
Master's thesis
Tamkang University
Department of Information Management
87
The spatial relationship between objects is one of the important characteristics of an image. In an image database system, spatial reasoning and similarity retrieval are often performed based on these spatial relationships. Hence, how to use a spatial data structure to represent the spatial relationships within an image has been discussed in the related research. The 2D string, and the strings extended from it, have been used as data structures to represent the spatial relationships between objects. However, these strings operate in the Cartesian coordinate system, and the query image must have the same orientation as the images in the image database. Consequently, a database image will not be retrieved by a query image when the two have the same spatial relationships between objects but different orientations. The RS-string has been proposed to address this problem; however, it is based on the polar coordinate system, in which the description of spatial relationships between objects differs from that in the Cartesian coordinate system. In this thesis, we propose an approach to retrieve database images when the query image provided by the user is in a different orientation, based on the spatial relationships between objects in the Cartesian coordinate system. In the proposed approach, the change in the spatial relationship between each pair of objects as the query image is rotated from 0° to 360° is recorded in a relation table, which is then compared with the spatial relationships between objects of the images in the image database. For an image in the database, if it has the same spatial relationship between every pair of objects as the query image rotated to some orientation or range of orientations, it is a similar image. Hence, similarity retrieval can be performed in Cartesian coordinates even when the images have different orientations. Finally, a prototype is presented.
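A hedged NumPy sketch of the underlying observation: rotating an image shifts the direction between every pair of object centroids by the same angle, so a match can be sought over all orientations (the thesis's relation table is discretised; whole degrees are used here for illustration):

    import numpy as np

    def pairwise_angles(centroids):
        c = np.asarray(centroids, dtype=float)
        d = c[None, :, :] - c[:, None, :]
        return np.degrees(np.arctan2(d[..., 1], d[..., 0])) % 360

    def matching_rotation(query_c, db_c, tol=5.0):
        q, db = pairwise_angles(query_c), pairwise_angles(db_c)
        for theta in range(360):                      # try every orientation
            diff = np.abs((q + theta - db + 180) % 360 - 180)
            if np.all(diff <= tol):
                return theta                          # orientation that matches
        return None                                   # no similar orientation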
APA, Harvard, Vancouver, ISO, and other styles
49

Chen, Bo-Rong, and 陳柏融. "Content-Based Image Retrieval System for Real Images." Thesis, 2016. http://ndltd.ncl.edu.tw/handle/77260057315471754913.

Full text
Abstract:
Master's thesis
National Yunlin University of Science and Technology
Department of Computer Science and Information Engineering
104
With the rapid progress of network technologies and the growth of multimedia data, information retrieval techniques have gradually become content-based rather than purely text-based. In this thesis, we propose a content-based image retrieval system to query similar images in a real image database. First, we employ segmentation and main-object detection to separate the main object from an image. Then, we extract MPEG-7 features from the object and select relevant features using the SAHS algorithm. Next, two approaches, "one-against-all" and "one-against-one", are used to build SVM-based classifiers. To further reduce indexing complexity, K-means clustering is used to generate MPEG-7 signatures. We then combine the classes predicted by the classifiers with the results based on the MPEG-7 signatures to find the images similar to a query image. Finally, the experimental results show that our method is feasible for image search in a real image database and more effective than other methods.
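A minimal sketch of the two classifier layouts named above, using scikit-learn's wrappers (feature vectors and labels are illustrative stand-ins for the MPEG-7 features):

    import numpy as np
    from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)
    X = rng.random((60, 16))                # e.g. MPEG-7 feature vectors
    y = rng.integers(0, 3, size=60)         # three object classes

    ova = OneVsRestClassifier(SVC()).fit(X, y)   # "one-against-all"
    ovo = OneVsOneClassifier(SVC()).fit(X, y)    # "one-against-one"
    print(ova.predict(X[:5]), ovo.predict(X[:5]))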
APA, Harvard, Vancouver, ISO, and other styles
50

Chang, Yih-Cheng, and 張亦塵. "Image Sense Disambiguation in Web Image Retrieval." Thesis, 2008. http://ndltd.ncl.edu.tw/handle/07023748217771211093.

Full text
Abstract:
Master's thesis
National Taiwan University
Graduate Institute of Computer Science and Information Engineering
96
In recent years, the number of images on the web has increased explosively, and image retrieval for web images has become more and more important. Image sense disambiguation/discrimination (ISD) is the task of disambiguating or discriminating the senses of retrieved web images. This technology can be used to improve the performance of web image retrieval, or applied in image annotation or object recognition tasks to help collect training samples. ISD is a new task that has not been well studied but may become important in the future. In this thesis, we analyze and discuss the ISD problem and propose a method to find the senses of web images, since many senses on the web are not included in dictionaries. For each sense, we collect sample images and pages without human annotation. Unlike previous approaches that use clustering methods for ISD, we use classification methods instead. Four kinds of classifiers and a merge method are proposed in this thesis. Each step of our method is evaluated and discussed, and at the end of the thesis we summarize our work and discuss some interesting directions for future work.
APA, Harvard, Vancouver, ISO, and other styles