Journal articles on the topic 'Image representation'


Consult the top 50 journal articles for your research on the topic 'Image representation.'


1

Li, Hong, Jin Ping Zhang, Fen Xia Wu, and Cong E. Tan. "Image Fusion with Sparse Representation." Advanced Materials Research 798-799 (September 2013): 737–40. http://dx.doi.org/10.4028/www.scientific.net/amr.798-799.737.

Abstract:
Sparse representation is a recently developed image representation theory that can accurately capture image information. In this paper, a novel fusion scheme using sparse representation is proposed. The sparse representation is computed on overlapping patches: each source image is divided into patches, and all patches are reshaped into vectors. The vectors are decomposed into their sparse representations using orthogonal matching pursuit, and the sparse coefficients are fused with the maximum-absolute-value rule. Simulation results show that the proposed method produces high-quality fused images.
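The pipeline described above (patch vectors, OMP sparse coding, maximum-absolute-value fusion) can be sketched as follows; the identity dictionary and the two hand-made "patch" vectors are illustrative stand-ins, not the paper's learned dictionary or data:

```python
import numpy as np

def omp(D, x, k):
    """Greedy orthogonal matching pursuit: pick k atoms of dictionary D
    (columns, assumed unit-norm) that best explain signal x."""
    residual, support = x.astype(float), []
    coef = np.zeros(D.shape[1])
    for _ in range(k):
        support.append(int(np.argmax(np.abs(D.T @ residual))))
        sub = D[:, support]
        ls, *_ = np.linalg.lstsq(sub, x, rcond=None)
        residual = x - sub @ ls
    coef[support] = ls
    return coef

def fuse_max_abs(a, b):
    """Keep, coefficient-wise, whichever source has the larger magnitude."""
    return np.where(np.abs(a) >= np.abs(b), a, b)

# Toy demo: identity dictionary, two 'patch' vectors from two source images.
D = np.eye(4)
c1 = omp(D, np.array([3.0, 0.0, 1.0, 0.0]), k=2)
c2 = omp(D, np.array([0.0, 2.0, -4.0, 0.0]), k=2)
fused = fuse_max_abs(c1, c2)
patch = D @ fused   # reconstructed fused patch
```

In a real fusion system the patches would be re-assembled (with averaging over the overlaps) into the output image.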
2

HU, CHAO, LI LIU, BO SUN, and MAX Q. H. MENG. "COMPACT REPRESENTATION AND PANORAMIC REPRESENTATION FOR CAPSULE ENDOSCOPE IMAGES." International Journal of Information Acquisition 06, no. 04 (December 2009): 257–68. http://dx.doi.org/10.1142/s0219878909001989.

Abstract:
A capsule endoscope robot is a miniature medical instrument for inspection of the gastrointestinal tract. In this paper, we present compact-representation and preliminary panoramic-representation methods for capsule endoscope images. First, the characteristics of capsule endoscopic images are investigated and different coordinate representations of the circular image are discussed. Second, effective compact representation methods, including a specialised DPCM scheme and wavelet compression, are applied to the endoscopic images to achieve a high compression ratio and signal-to-noise ratio. Finally, a preliminary approach to the panoramic representation of endoscopic images is presented.
3

Gavaler, Chris. "Refining the Comics Form." European Comic Art 10, no. 2 (September 1, 2017): 1–23. http://dx.doi.org/10.3167/eca.2017.100202.

Abstract:
Setting aside historical factors and focusing exclusively on a formal definition of comics as juxtaposed images, comics may be further refined by analysing the divisions, orders and relationships of those images. The images may also have both representational and abstract levels that together produce narrative’s intrinsic patterns and its extrinsic feeling of story. Although narrative comics and abstract comics sound like opposites, a representational narrative may be understood non-representationally because it is composed of abstract marks, and a sequence of abstract images can still create the experience of story through implied conflict and transformation. Analysed according to image representation, image relation, and image order, comics divide into six formally distinct categories: representational and abstract narratives; representational and abstract arrangements; and representational and abstract non sequiturs.
4

Song, Lijuan. "Image Segmentation Based on Supervised Discriminative Learning." International Journal of Pattern Recognition and Artificial Intelligence 32, no. 10 (June 20, 2018): 1854027. http://dx.doi.org/10.1142/s0218001418540277.

Abstract:
In view of the complex backgrounds of images and the difficulty of segmentation, sparse representation and supervised discriminative learning were applied to image segmentation. A sparse, over-complete representation can describe images in a compact and efficient manner: most atom coefficients are zero, only a few are large, and the nonzero coefficients reveal the intrinsic structures and essential properties of images. Sparse representations are therefore beneficial to subsequent image-processing applications. We first describe sparse representation theory. The study revolves around three aspects: a trained dictionary, greedy algorithms, and the application of the sparse representation model to image segmentation based on supervised discriminative learning. Finally, we performed image segmentation experiments on standard and natural image datasets. The main focus of this study was supervised discriminative learning, and the experimental results showed that the proposed algorithm is sparse, efficient, and achieved the best results among the compared methods.
5

Tian, Chunwei, Qi Zhang, Jian Zhang, Guanglu Sun, and Yuan Sun. "2D-PCA Representation and Sparse Representation for Image Recognition." Journal of Computational and Theoretical Nanoscience 14, no. 1 (January 1, 2017): 829–34. http://dx.doi.org/10.1166/jctn.2017.6281.

Abstract:
The two-dimensional principal component analysis (2D-PCA) method has been widely applied in image classification, computer vision, signal processing and pattern recognition. The 2D-PCA algorithm performs satisfactorily in both theoretical research and real-world applications: it retains the main information of the original face images while reducing their dimensionality. In this paper, we integrate 2D-PCA with the sparse representation classification (SRC) method to distinguish face images, which performs very well in face recognition. The novel representation of the original face image obtained using 2D-PCA is complementary to the original image, so fusing them can clearly improve face recognition accuracy. This is also attributable to the fact that features obtained using 2D-PCA are usually more robust than the original face image matrices. Face recognition experiments demonstrate that combining original face images with their new 2D-PCA representations is more effective than using the original images alone. In particular, the simultaneous use of 2D-PCA and sparse representation can greatly improve classification accuracy. The adaptive weighted fusion scheme proposed in this paper obtains optimal weights automatically and requires no parameters. The proposed method is simple and easy to implement, yet achieves high face recognition accuracy.
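The 2D-PCA projection step the authors build on can be sketched as below; the random 8 x 6 "images" and the dimension d are arbitrary illustrative choices:

```python
import numpy as np

def two_d_pca(images, d):
    """2D-PCA: `images` is a list of m x n matrices; returns an n x d
    projection W built from the top-d eigenvectors of the image scatter
    matrix, plus the projected features Y_i = A_i @ W."""
    mean = np.mean(images, axis=0)
    G = sum((A - mean).T @ (A - mean) for A in images) / len(images)
    vals, vecs = np.linalg.eigh(G)          # eigenvalues in ascending order
    W = vecs[:, ::-1][:, :d]                # keep the top-d directions
    return W, [A @ W for A in images]

rng = np.random.default_rng(0)
imgs = [rng.standard_normal((8, 6)) for _ in range(10)]
W, feats = two_d_pca(imgs, d=2)
# Each 8x6 image is reduced to an 8x2 feature matrix; in the paper these
# features would then be fed, alongside the raw images, to an SRC classifier.
```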
6

Younas, Junaid, Shoaib Ahmed Siddiqui, Mohsin Munir, Muhammad Imran Malik, Faisal Shafait, Paul Lukowicz, and Sheraz Ahmed. "Fi-Fo Detector: Figure and Formula Detection Using Deformable Networks." Applied Sciences 10, no. 18 (September 16, 2020): 6460. http://dx.doi.org/10.3390/app10186460.

Abstract:
We propose a novel hybrid approach that fuses traditional computer vision techniques with deep learning models to detect figures and formulas in document images. The proposed approach first fuses different computer-vision-based image representations, i.e., color transform, connected component analysis, and distance transform, into what we term the Fi-Fo image representation. The Fi-Fo image representation is then fed to deep models for further refined representation learning for detecting figures and formulas in document images. The proposed approach is evaluated on the publicly available ICDAR-2017 Page Object Detection (POD) dataset and its corrected version, and produces state-of-the-art results for formula and figure detection in document images with f1-scores of 0.954 and 0.922, respectively. Ablation study results reveal that the Fi-Fo image representation achieves superior performance compared to the raw image representation, and also establish that the hybrid approach helps deep models learn more discriminating and refined features.
7

RIZO-RODRÍGUEZ, DAYRON, HEYDI MÉNDEZ-VAZQUEZ, and EDEL GARCÍA-REYES. "ILLUMINATION INVARIANT FACE RECOGNITION IN QUATERNION DOMAIN." International Journal of Pattern Recognition and Artificial Intelligence 27, no. 03 (May 2013): 1360004. http://dx.doi.org/10.1142/s0218001413600045.

Abstract:
The performance of face recognition systems tends to decrease when images are affected by illumination. Feature extraction is one of the main steps of a face recognition process, and it is there that illumination effects on face images can be alleviated. To increase the accuracy of recognition tasks, different methods for obtaining illumination-invariant features have been developed. The aim of this work is to compare two ways of representing face image descriptions in terms of their illumination-invariance properties for face recognition. The first representation follows the structure of complex numbers and the second is based on quaternions. Both representations are constructed from four different face description approaches, transformed into the frequency domain, and expressed in polar coordinates. The most illumination-invariant component of each frequency-domain representation is determined and used as the representative information of the face image. Verification and identification experiments are then performed to compare the discriminative power of the selected components; the representative component of the quaternion representation outperformed that of the complex representation.
8

Sun, Bo, Abdullah M. Iliyasu, Fei Yan, Fangyan Dong, and Kaoru Hirota. "An RGB Multi-Channel Representation for Images on Quantum Computers." Journal of Advanced Computational Intelligence and Intelligent Informatics 17, no. 3 (May 20, 2013): 404–17. http://dx.doi.org/10.20965/jaciii.2013.p0404.

Abstract:
An RGB multi-channel representation is proposed for images on quantum computers (MCQI) that captures the colors (RGB channels) and their corresponding positions in an image in a normalized quantum state. The proposed representation makes it possible to store the RGB information of an image simultaneously, using 2n+3 qubits to encode a 2^n × 2^n-pixel image, whereas pixel-wise processing is necessary in many other quantum image representations, e.g., qubit lattice, grid qubit, and quantum lattice. Simulation of the storage and retrieval of MCQI images using human facial images demonstrated that 15 qubits are required to encode 64 × 64 color images, and that the encoded information is retrieved by measurement. Prospects for designing quantum image operators based on the MCQI representation are also discussed, e.g., channel of interest, channel swapping, and a restricted version of color transformation.
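The qubit count stated in the abstract (2n+3 qubits for a 2^n × 2^n image, hence 15 qubits for 64 × 64) can be checked with a one-liner; `mcqi_qubits` is a hypothetical helper name, not from the paper:

```python
def mcqi_qubits(side):
    """Qubits needed by the MCQI encoding of a side x side image,
    where side must be a power of two (image is 2^n x 2^n pixels):
    2n position qubits plus 3 qubits for the RGB channels."""
    n = side.bit_length() - 1
    assert side == 2 ** n, "image side must be a power of two"
    return 2 * n + 3

# 64 x 64 -> n = 6 -> 2*6 + 3 = 15 qubits, matching the abstract.
```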
9

Mihálik, A., and R. Ďurikovič. "Image-based BRDF Representation." Journal of Applied Mathematics, Statistics and Informatics 11, no. 2 (December 1, 2015): 47–56. http://dx.doi.org/10.1515/jamsi-2015-0011.

Abstract:
To achieve a certain level of photorealism in computer graphics, it is necessary to analyze how materials scatter incident light. In this work, we propose a method for the direct rendering of an isotropic bidirectional reflectance distribution function (BRDF) from a small set of images. Image-based rendering aims to synthesize, as accurately as possible, scenes composed of natural and artificial objects. Realistic image synthesis from BRDF data requires evaluating radiance over multiple directions of light incident on and scattered from the surface. In our approach the images capture only the material reflectance, while the shape is represented by the object geometry. We store the BRDF representation, acquired from a material sample, in a number of two-dimensional textures containing images of spheres lit from multiple directions. To render a particular material, we interpolate between textures in a way similar to image morphing. Our method allows real-time rendering of tabulated BRDF data on low-memory devices such as mobile phones.
10

Zheng, Shijun, Yongjun Zhang, Wenjie Liu, and Yongjie Zou. "Improved image representation and sparse representation for image classification." Applied Intelligence 50, no. 6 (February 10, 2020): 1687–98. http://dx.doi.org/10.1007/s10489-019-01612-3.

11

Chen, Yuhao, Alexander Wong, Yuan Fang, Yifan Wu, and Linlin Xu. "Deep Residual Transform for Multi-scale Image Decomposition." Journal of Computational Vision and Imaging Systems 6, no. 1 (January 15, 2021): 1–5. http://dx.doi.org/10.15353/jcvis.v6i1.3537.

Abstract:
Multi-scale image decomposition (MID) is a fundamental task in computer vision and image processing that transforms an image into a hierarchical representation comprising different levels of visual granularity, from coarse structures to fine details. A well-engineered MID disentangles the image signal into meaningful components that can be used in a variety of applications such as image denoising, image compression, and object classification. Traditional MID approaches such as wavelet transforms tackle the problem through carefully designed basis functions under rigid decomposition-structure assumptions. However, because the information distribution varies from one type of image content to another, rigid decomposition assumptions lead to inefficient representations, i.e., some scales can contain little to no information. To address this issue, we present the Deep Residual Transform (DRT), a data-driven MID strategy in which the input signal is transformed into a hierarchy of non-linear representations at different scales, with each representation independently learned as the representational residual of the previous scales at a user-controlled detail level. The proposed DRT thus progressively disentangles scale information from the original signal by sequentially learning residual representations. The flexibility of this decomposition allows representations to be tailored to specific types of image content, yielding greater representational efficiency and compactness. In this study, we realize the proposed transform with a hierarchy of sequentially trained autoencoders. To explore the efficacy of the proposed DRT, we use two datasets with very different types of image content: 1) CelebFaces and 2) Cityscapes. Experimental results show that the proposed DRT achieves highly efficient information decomposition on both datasets despite their very different visual-granularity characteristics.
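The residual hierarchy the DRT builds can be illustrated with a toy stand-in: here a fixed average-pool/upsample operator plays the role of each trained autoencoder, so the sketch demonstrates only the sequential residual decomposition, not the learning:

```python
import numpy as np

def coarse(img, f):
    """Stand-in for a learned autoencoder: f x f average-pool, then
    nearest-neighbour upsample back to full resolution."""
    h, w = img.shape
    pooled = img.reshape(h // f, f, w // f, f).mean(axis=(1, 3))
    return np.kron(pooled, np.ones((f, f)))

def residual_decompose(img, factors=(8, 4, 2)):
    """Each scale approximates the residual left by the coarser scales;
    the final level is whatever detail remains."""
    levels, residual = [], img.astype(float)
    for f in factors:
        approx = coarse(residual, f)
        levels.append(approx)
        residual = residual - approx
    levels.append(residual)
    return levels

rng = np.random.default_rng(2)
img = rng.standard_normal((16, 16))
levels = residual_decompose(img)
recon = sum(levels)   # summing all levels reconstructs the input exactly
```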
12

Lu, Jiahao, Johan Öfverstedt, Joakim Lindblad, and Nataša Sladoje. "Is image-to-image translation the panacea for multimodal image registration? A comparative study." PLOS ONE 17, no. 11 (November 28, 2022): e0276196. http://dx.doi.org/10.1371/journal.pone.0276196.

Abstract:
Despite current advancement in the field of biomedical image processing, propelled by the deep learning revolution, multimodal image registration, due to its several challenges, is still often performed manually by specialists. The recent success of image-to-image (I2I) translation in computer vision applications and its growing use in biomedical areas provide a tempting possibility of transforming the multimodal registration problem into a, potentially easier, monomodal one. We conduct an empirical study of the applicability of modern I2I translation methods for the task of rigid registration of multimodal biomedical and medical 2D and 3D images. We compare the performance of four Generative Adversarial Network (GAN)-based I2I translation methods and one contrastive representation learning method, subsequently combined with two representative monomodal registration methods, to judge the effectiveness of modality translation for multimodal image registration. We evaluate these method combinations on four publicly available multimodal (2D and 3D) datasets and compare with the performance of registration achieved by several well-known approaches acting directly on multimodal image data. Our results suggest that, although I2I translation may be helpful when the modalities to register are clearly correlated, registration of modalities which express distinctly different properties of the sample is not well handled by the I2I translation approach. The evaluated representation learning method, which aims to find abstract image-like representations of the information shared between the modalities, manages better, and so does the Mutual Information maximisation approach, acting directly on the original multimodal images. We share our complete experimental setup as open-source (https://github.com/MIDA-group/MultiRegEval), including method implementations, evaluation code, and all datasets, for further reproducing and benchmarking.
13

Thayer, Colette, and Laura Skufca. "Media Image Landscape: Age Representation in Online Images." Innovation in Aging 4, Supplement_1 (December 1, 2020): 101. http://dx.doi.org/10.1093/geroni/igaa057.332.

Abstract:
This study examined the extent to which the 50-plus population is portrayed in media images online. A random sample of images was drawn from 2.7 million images downloaded from professional and semiprofessional domains and social distributions for brands and thought leaders. Natural language processing technology was employed to find images using topical guides chosen to be reflective of online images. The results show that while some media have moved toward more positive visual representation of older people, the 50-plus population is still not accurately portrayed in the media. For example, while nearly half of the U.S. adult population is age 50-plus, only 15% of images containing adults include people of this age. In addition, when the 50-plus are shown, they are more likely to be portrayed negatively than those under age 50. The 50-plus population is often portrayed as dependent and disconnected from the rest of the world, although most are actively engaged in their communities, and they are rarely shown with technology or in work settings. Furthermore, while a myriad of vibrant personalities comes across in images of adults under age 50, the representation of people 50-plus homogenizes and exaggerates stereotypical and outdated physical-appearance characteristics. This study demonstrates the need for visual representations that reflect greater diversity and authenticity of the 50-plus population, as these images affect the attitudes, expectations, and behaviors of older and younger people alike. Keywords: ageism, reframing aging, media image representation
14

Mykhailov, Dmytro. "Postphenomenological variation of instrumental realism on the "problem of representation"." Prometeica - Revista de Filosofía y Ciencias, Especial (August 11, 2022): 64–78. http://dx.doi.org/10.34024/prometeica.2022.especial.13520.

Abstract:
In the present paper, I take findings from the postphenomenological variation of instrumental realism to develop an ‘environmental framework’ that provides a philosophical answer to the ‘problem of representation.’ The framework focuses on three elements of the representational environment: image-making technology, the image as a representational device, and the scientific hermeneutic strategies at work in the image-interpretation process in the laboratory setting. The central idea is that scientific images do not produce meaning without their instrumental environment, or, put differently, that an image becomes representational through the interplay of the three framework elements. In the second part of the paper, I apply the framework to contemporary debates on fMRI imaging and show that fMRI images receive meaning not in isolation but within a complex instrumental environment.
15

Xu, Yong, Bob Zhang, and Zuofeng Zhong. "Multiple representations and sparse representation for image classification." Pattern Recognition Letters 68 (December 2015): 9–14. http://dx.doi.org/10.1016/j.patrec.2015.07.032.

16

Mancas, Matei, Bernard Gosselin, and Benoît Macq. "Perceptual Image Representation." EURASIP Journal on Image and Video Processing 2007, no. 1 (2007): 098181. http://dx.doi.org/10.1186/1687-5281-2007-098181.

17

Mancas, Matei, Bernard Gosselin, and Benoît Macq. "Perceptual Image Representation." EURASIP Journal on Image and Video Processing 2007 (2007): 1–9. http://dx.doi.org/10.1155/2007/98181.

18

Gao, Ruiqi, Jianwen Xie, Siyuan Huang, Yufan Ren, Song-Chun Zhu, and Ying Nian Wu. "Learning V1 Simple Cells with Vector Representation of Local Content and Matrix Representation of Local Motion." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 6 (June 28, 2022): 6674–84. http://dx.doi.org/10.1609/aaai.v36i6.20622.

Abstract:
This paper proposes a representational model for image pairs such as consecutive video frames that are related by local pixel displacements, in the hope that the model may shed light on motion perception in primary visual cortex (V1). The model couples the following two components: (1) the vector representations of local contents of images and (2) the matrix representations of local pixel displacements caused by the relative motions between the agent and the objects in the 3D scene. When the image frame undergoes changes due to local pixel displacements, the vectors are multiplied by the matrices that represent the local displacements. Thus the vector representation is equivariant as it varies according to the local displacements. Our experiments show that our model can learn Gabor-like filter pairs of quadrature phases. The profiles of the learned filters match those of simple cells in Macaque V1. Moreover, we demonstrate that the model can learn to infer local motions in either a supervised or unsupervised manner. With such a simple model, we achieve competitive results on optical flow estimation.
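The vector-matrix coupling described above can be illustrated with plain 2D rotation matrices standing in for the learned displacement matrices; the equivariance property is simply that successive displacements compose by matrix multiplication (a toy illustration, not the paper's learned model):

```python
import numpy as np

def rot(theta):
    """2x2 rotation matrix: a minimal stand-in for the matrix
    representation of a local pixel displacement."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

# v is the vector representation of a local patch's content; a local
# displacement d acts on it as v -> M(d) @ v.
v = np.array([1.0, 0.0])
v_moved = rot(np.pi / 2) @ v

# Equivariance: applying two displacements in sequence is the same as
# applying the matrix of their composition.
both = rot(0.3) @ rot(0.2) @ v
same = rot(0.5) @ v
```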
19

Nagy, Marius, and Naya Nagy. "Image processing: why quantum?" Quantum Information and Computation 20, no. 7&8 (June 2020): 616–26. http://dx.doi.org/10.26421/qic20.7-8-6.

Abstract:
Quantum Image Processing has exploded in recent years with dozens of papers trying to take advantage of quantum parallelism in order to offer a better alternative to how current computers are dealing with digital images. The vast majority of these papers define or make use of quantum representations based on very large superposition states spanning as many terms as there are pixels in the image they try to represent. While such a representation may apparently offer an advantage in terms of space (number of qubits used) and speed of processing (due to quantum parallelism), it also harbors a fundamental flaw: only one pixel can be recovered from the quantum representation of the entire image, and even that one is obtained non-deterministically through a measurement operation applied on the superposition state. We investigate in detail this measurement bottleneck problem by looking at the number of copies of the quantum representation that are necessary in order to recover various fractions of the original image. The results clearly show that any potential advantage a quantum representation might bring with respect to a classical one is paid for dearly with the huge amount of resources (space and time) required by a quantum approach to image processing.
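One simple way to model the measurement bottleneck is a coupon-collector argument: if each measurement of a fresh copy returns one uniformly random pixel, the expected number of copies needed to observe a given fraction of the pixels follows directly. This uniform-measurement assumption is an illustration, not the paper's exact analysis:

```python
def expected_copies(pixels, fraction):
    """Expected number of measured copies needed to observe `fraction`
    of the `pixels` distinct basis states, assuming each measurement
    returns a uniformly random pixel (coupon-collector argument)."""
    need = int(pixels * fraction)
    return sum(pixels / (pixels - k) for k in range(need))

# Even this optimistic model grows super-linearly in coverage:
# recovering every pixel of an N-pixel image costs ~ N * ln(N) copies.
```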
20

Choi, Jaewoong, Daeha Kim, and Byung Cheol Song. "Style-Guided and Disentangled Representation for Robust Image-to-Image Translation." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 1 (June 28, 2022): 463–71. http://dx.doi.org/10.1609/aaai.v36i1.19924.

Abstract:
Recently, various image-to-image translation (I2I) methods have improved mode diversity and visual quality through new network architectures or regularization terms. However, conventional I2I methods rely on a static decision boundary, and their encoded representations are entangled with each other, so they often suffer from the ‘mode collapse’ phenomenon. To mitigate mode collapse, 1) we design a so-called style-guided discriminator that guides an input image toward the target image style using a flexible decision boundary, and 2) we make the encoded representations include independent domain attributes. Based on these two ideas, this paper proposes Style-Guided and Disentangled Representation for Robust Image-to-Image Translation (SRIT). SRIT improves FID by 8%, 22.8%, and 10.1% on the CelebA-HQ, AFHQ, and Yosemite datasets, respectively, and its translated images successfully reflect the styles of the target domain, indicating better mode diversity than previous works.
21

Nguyen, Nhu Van, Alain Boucher, and Jean-Marc Ogier. "Keyword Visual Representation for Image Retrieval and Image Annotation." International Journal of Pattern Recognition and Artificial Intelligence 29, no. 06 (August 12, 2015): 1555010. http://dx.doi.org/10.1142/s0218001415550101.

Abstract:
Keyword-based image retrieval is more comfortable for users than content-based image retrieval. Because images lack semantic descriptions, image annotation is often performed a priori by learning the association between semantic concepts (keywords) and images (or image regions). This association problem is particularly difficult but interesting, because it can be used for annotating images and also for multimodal image retrieval. However, most association models are unidirectional, from image to keywords; in addition, existing models rely on a fixed image database and prior knowledge. In this paper, we propose an original association model that provides bidirectional image-keyword transformation. Built on the state-of-the-art Bag of Words model for image representation and including a strategy of interactive incremental learning, our model works well with a zero-or-weak-knowledge image database and evolves with it. Objective quantitative and qualitative evaluations of the model are presented to highlight the relevance of the method.
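The Bag of Words image representation the model builds on can be sketched as follows; the tiny codebook and descriptors are illustrative stand-ins for a learned visual vocabulary and real local features:

```python
import numpy as np

def bow_histogram(descriptors, codebook):
    """Quantise each local descriptor to its nearest visual word and
    return the normalised word-frequency histogram of the image."""
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    words = d2.argmin(axis=1)                     # nearest codeword index
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()

# Toy demo: a 2-word codebook and four local descriptors.
codebook = np.array([[0.0, 0.0], [10.0, 10.0]])
descriptors = np.array([[0.0, 0.5], [1.0, 1.0], [9.0, 9.0], [10.0, 10.5]])
hist = bow_histogram(descriptors, codebook)
```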
22

Peng, Feng, and Kai Li. "Deep Image Clustering Based on Label Similarity and Maximizing Mutual Information across Views." Applied Sciences 13, no. 1 (January 3, 2023): 674. http://dx.doi.org/10.3390/app13010674.

Abstract:
Most existing deep image clustering methods use only class-level representations for clustering. However, the class-level representation alone is not sufficient to describe the differences between images belonging to the same cluster, which may lead to large intra-class representation differences that harm clustering performance. To address this problem, this paper proposes a clustering model named Deep Image Clustering based on Label Similarity and Maximizing Mutual Information Across Views (DCSM). DCSM consists of a backbone network and class-level and instance-level mapping blocks. The class-level mapping block learns discriminative class-level features by selecting similar (dissimilar) pairs of samples. The proposed extended mutual information maximizes the mutual information between features extracted from views of the same image obtained by data augmentation, and serves as a constraint on the instance-level mapping block. This forces the instance-level mapping block to capture high-level features shared across multiple views of the same image, thus reducing intra-class differences. Experiments on four representative datasets show that the proposed model is superior to current advanced image clustering models.
23

Javid, Tariq, Muhammad Faris, and Pervez Akhtar. "Integrated representation for discrete Fourier and wavelet transforms using vector notation." Mehran University Research Journal of Engineering and Technology 41, no. 3 (July 1, 2022): 175–84. http://dx.doi.org/10.22581/muet1982.2203.18.

Abstract:
Many mathematical operations are implemented easily through transform-domain operations, and large, complex applications often use multiple transform domains independently, so integrated representations for multiple transform-domain operations are needed. This paper presents an integrated mathematical representation for the discrete Fourier transform and the discrete wavelet transform. The proposed combined representation uses the powerful vector notation. A mathematical operator, called the star operator, is formulated that merges coefficients from different transform domains; it implements both convolution and correlation in a weighted fashion to compute the aggregated representation. The application of the proposed formulation is demonstrated by merging transform-domain representations of time-domain and image-domain data, using heart sound signals and magnetic resonance images to describe the merging applications. The significance of the proposed technique is that time-domain and image-domain representations are merged in a single stage, which may be implemented as the primary processing engine inside a typical digital image processing and analysis system.
24

Martey, Ezekiel Mensah, Hang Lei, Xiaoyu Li, and Obed Appiah. "Image Representation Using Stacked Colour Histogram." Algorithms 14, no. 8 (July 30, 2021): 228. http://dx.doi.org/10.3390/a14080228.

Abstract:
Image representation plays a vital role in the realisation of a Content-Based Image Retrieval (CBIR) system. The representation is performed because pixel-by-pixel matching for image retrieval is impracticable as a result of the rigid nature of such an approach. In CBIR, therefore, colour, shape, texture and other visual features are used to represent images for an effective retrieval task. Among these visual features, colour and texture are particularly remarkable in defining the content of the image. However, combining these features does not necessarily guarantee better retrieval accuracy, due to image transformations such as rotation, scaling, and translation that an image may have gone through. Moreover, feature vector representations that take up ample memory space affect the running time of the retrieval task. To address these problems, we propose a new colour scheme called the Stacked Colour Histogram (SCH), which inherently extracts colour and neighbourhood information into a descriptor for indexing images. SCH performs recurrent mean filtering of the image to be indexed. The recurrent blurring in this proposed method works by repeatedly filtering (transforming) the image: the output of one transformation serves as the input for the next, and in each case a histogram is generated. The histograms are summed up bin-by-bin and the resulting vector is used to index the image. Because the blurring process uses each pixel's neighbourhood information, the proposed SCH exhibits the inherent textural information of the image that has been indexed. The SCH was extensively tested on the Coil100, Outext, Batik and Corel10K datasets. The Coil100, Outext, and Batik datasets are generally used to assess image texture descriptors, while Corel10K is used for heterogeneous descriptors.
The experimental results show that our proposed descriptor significantly improves retrieval and classification rates when compared with CMTH, MTH, TCM, CTM and NRFUCTM, which are the state-of-the-art descriptors for images with textural features.
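The stacking procedure the abstract describes (repeated mean filtering, one histogram per pass, bin-by-bin summation) can be sketched as follows. This is a minimal single-channel illustration, not the authors' implementation: the 3x3 kernel, three passes, and 64 bins are assumptions chosen for the example.

```python
import numpy as np

def mean_filter3x3(img):
    """3x3 mean (blur) filter with edge padding."""
    padded = np.pad(img, 1, mode="edge")
    rows, cols = img.shape
    acc = np.zeros((rows, cols), dtype=np.float64)
    for di in range(3):
        for dj in range(3):
            acc += padded[di:di + rows, dj:dj + cols]
    return acc / 9.0

def stacked_colour_histogram(image, passes=3, bins=64):
    """Each filtering pass feeds the next; one histogram per pass,
    summed bin-by-bin into a single index vector."""
    img = image.astype(np.float64)
    descriptor = np.zeros(bins, dtype=np.float64)
    for _ in range(passes):
        img = mean_filter3x3(img)                     # recurrent blurring
        hist, _ = np.histogram(img, bins=bins, range=(0.0, 256.0))
        descriptor += hist                            # bin-by-bin summation
    return descriptor / descriptor.sum()              # normalised descriptor

# toy single-channel image; a colour image would be processed per channel
rng = np.random.default_rng(0)
d = stacked_colour_histogram(rng.integers(0, 256, size=(32, 32)))
```

Because each histogram is taken after a neighbourhood-dependent blur, the summed vector mixes colour and local texture information, which is the intuition behind the descriptor.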
APA, Harvard, Vancouver, ISO, and other styles
25

Yu, Siquan, Jiaxin Liu, Zhi Han, Yong Li, Yandong Tang, and Chengdong Wu. "Representation Learning Based on Autoencoder and Deep Adaptive Clustering for Image Clustering." Mathematical Problems in Engineering 2021 (January 9, 2021): 1–11. http://dx.doi.org/10.1155/2021/3742536.

Full text
Abstract:
Image clustering is a complex procedure, which is significantly affected by the choice of image representation. Most of the existing image clustering methods treat representation learning and clustering separately, which usually brings two problems. On the one hand, image representations are difficult to select and the learned representations are not suitable for clustering. On the other hand, such methods inevitably involve a clustering step, which may introduce errors and hurt the clustering results. To tackle these problems, we present a new clustering method that efficiently builds an image representation and precisely discovers cluster assignments. For this purpose, the image clustering task is regarded as a binary pairwise classification problem with local structure preservation. Specifically, we propose an approach for image clustering based on a fully convolutional autoencoder and deep adaptive clustering (DAC). To extract the essential representation and maintain the local structure, a fully convolutional autoencoder is applied. To map features into the clustering space and obtain a suitable image representation, the DAC algorithm participates in the training of the autoencoder. Our method can learn an image representation that is suitable for clustering and discover the precise clustering label for each image. A series of real-world image clustering experiments verify the effectiveness of the proposed algorithm.
APA, Harvard, Vancouver, ISO, and other styles
26

Nasrun, Rully Charitas Indra Prahmana, and Irwan Akib. "The Students’ Representative Processes in Solving Mathematical Word Problems." Knowledge 3, no. 1 (January 28, 2023): 70–79. http://dx.doi.org/10.3390/knowledge3010006.

Full text
Abstract:
Representation in mathematics is essential as a basis for students to be able to understand and apply mathematical ideas. This study aims to describe how students produce different representations in solving word problems. In solving word problems, students make verbal–written representations, image representations, and symbol representations. This research uses a qualitative descriptive study involving 75 fifth-grade students at a private school in Makassar, Indonesia. Setting and participants: two subjects were chosen from the 75 participants based on the completion of word problems that resulted in different representations, including verbal–written, picture, and symbol representations, although some other students used only one or two forms of mathematical representation. The instruments used were word problems and interview sheets. The results of this study indicate that, across the different representations produced (verbal–written, image, and symbol), students carry out the processes of translation, integration, solution, and evaluation until finding answers. In addition, another finding was the students' mathematical literacy, which immensely helped their representation process in solving word problems. Three forms of representation were found to be produced by students: verbal–written, image, and symbol. Furthermore, the three forms of representation were created through carrying out four representation processes, namely translation, integration, solution, and evaluation.
APA, Harvard, Vancouver, ISO, and other styles
27

Lu, Xuchao, Li Song, Rong Xie, Xiaokang Yang, and Wenjun Zhang. "Deep Binary Representation for Efficient Image Retrieval." Advances in Multimedia 2017 (2017): 1–10. http://dx.doi.org/10.1155/2017/8961091.

Full text
Abstract:
With the fast-growing number of images uploaded every day, efficient content-based image retrieval becomes important. Hashing methods, which represent images as binary codes and use the Hamming distance to judge similarity, are widely accepted for their advantages in storage and search speed. A good binary representation method for images is the determining factor of image retrieval. In this paper, we propose a new deep hashing method for efficient image retrieval. We propose an algorithm to calculate the target hash code, which indicates the relationship between images of different contents. The target hash code is then fed to the deep network for training. Two variants of the deep network, DBR and DBR-v3, are proposed for different sizes and scales of image database. After training, our deep network can produce hash codes with large Hamming distances for images of different contents. Experiments on standard image retrieval benchmarks show that our method outperforms other state-of-the-art methods, including unsupervised, supervised, and deep hashing methods.
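The core retrieval step in any hashing scheme like the one above is ranking database codes by Hamming distance to the query code. A minimal illustration (the 8-bit codes and the `retrieve` helper are invented for the example, not part of the paper):

```python
import numpy as np

def hamming_distance(a, b):
    """Hamming distance: the number of bit positions where two codes differ."""
    return int(np.count_nonzero(a != b))

def retrieve(query, database, k=2):
    """Return the indices of the k database codes closest to the query."""
    dists = [hamming_distance(query, code) for code in database]
    return np.argsort(dists, kind="stable")[:k]

query = np.array([1, 0, 1, 1, 0, 0, 1, 0])
db = np.array([[1, 0, 1, 1, 0, 0, 1, 1],   # distance 1
               [0, 1, 0, 0, 1, 1, 0, 1],   # distance 8
               [1, 0, 1, 1, 0, 0, 1, 0]])  # distance 0 (exact match)
print(retrieve(query, db))  # → [2 0]
```

Because the distance is a bit count, production systems pack codes into machine words and use popcount instructions, which is what makes binary representations so fast to search.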
APA, Harvard, Vancouver, ISO, and other styles
28

Zhang, Kai Song, Luo Zhong, and Xuan Ya Zhang. "Image Restoration via Group l2,1 Norm-Based Structural Sparse Representation." International Journal of Pattern Recognition and Artificial Intelligence 32, no. 04 (December 13, 2017): 1854008. http://dx.doi.org/10.1142/s0218001418540083.

Full text
Abstract:
Sparse representation has recently been extensively studied in the field of image restoration. Many sparsity-based approaches enforce sparse coding on patches with certain constraints. However, extracting structural information is a challenging task in the field of image restoration. Motivated by the fact that the structured sparse representation (SSR) method can capture the inner characteristics of image structures, which helps in finding sparse representations of nonlinear features or patterns, we propose an SSR approach for image restoration. Specifically, a generalized model is developed using a structural constraint: the group l2,1-norm of the coefficient matrix is introduced into the traditional sparse representation, minimizing the differences within classes and maximizing the differences between classes, and its applications to image restoration are also explored. The sparse coefficients of SSR are obtained through an iterative optimization approach. Experimental results show that the proposed SSR technique can deliver reconstructed images of high quality, which manifests the effectiveness of our approach in both peak signal-to-noise ratio performance and visual perception.
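For concreteness, the group l2,1 norm named in the title is commonly defined as the sum of the l2 norms of the rows (groups) of a coefficient matrix; penalising it drives entire rows to zero, which is the structural-sparsity effect the abstract refers to. A small sketch under that standard definition (the example matrix is invented):

```python
import numpy as np

def l21_norm(X):
    """Group l2,1 norm: sum of the l2 norms of the rows of X.

    Penalising this value encourages whole rows (groups of
    coefficients) to vanish, unlike the plain l1 norm, which
    zeroes entries individually."""
    return float(np.sum(np.linalg.norm(X, axis=1)))

X = np.array([[3.0, 4.0],    # row norm 5
              [0.0, 0.0],    # an all-zero group contributes nothing
              [0.0, 2.0]])   # row norm 2
print(l21_norm(X))  # 7.0
```

In a regularised objective such as min ||Y - DA||_F^2 + λ||A||_{2,1}, this term selects which groups of atoms participate at all, rather than which individual coefficients.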
APA, Harvard, Vancouver, ISO, and other styles
29

Zheng, Yun Ping, Zu Jia Li, Mudar Sarem, Qing Hong Yang, and Xiu Xiu Liao. "An Improved RNAMC Image Representation." Applied Mechanics and Materials 143-144 (December 2011): 746–49. http://dx.doi.org/10.4028/www.scientific.net/amm.143-144.746.

Full text
Abstract:
In this paper, by controlling the ratio of the length to the width of a homogeneous block, we propose an improved algorithm for gray image representation using the Rectangular Non-symmetry and Anti-packing Model Coding (RNAMC) and an extended shading approach, which is called the IRNAMC image representation method. We also present an IRNAMC representation algorithm for gray images. Comparing our proposed IRNAMC method with the conventional S-Tree Coding (STC) method, the experimental results presented in this paper show that the former can significantly reduce the bit rate and the number of homogeneous blocks relative to the latter while maintaining satisfactory image quality. The experimental results also show that by controlling the ratio of the length to the width, we can improve the reconstructed image quality of the RNAMC method.
APA, Harvard, Vancouver, ISO, and other styles
30

Tian, Chunwei, Guanglu Sun, Qi Zhang, Weibing Wang, Teng Chen, and Yuan Sun. "Integrating Sparse and Collaborative Representation Classifications for Image Classification." International Journal of Image and Graphics 17, no. 02 (April 2017): 1750007. http://dx.doi.org/10.1142/s0219467817500073.

Full text
Abstract:
Collaborative representation classification (CRC) is an important sparse method, which is easy to carry out and uses a linear combination of training samples to represent a test sample. CRC method utilizes the offset between representation result of each class and the test sample to implement classification. However, the offset usually cannot well express the difference between every class and the test sample. In this paper, we propose a novel representation method for image recognition to address the above problem. This method not only fuses sparse representation and CRC method to improve the accuracy of image recognition, but also has novel fusion mechanism to classify images. The implementations of the proposed method have the following steps. First of all, it produces collaborative representation of the test sample. That is, a linear combination of all the training samples is first determined to represent the test sample. Then, it gets the sparse representation classification (SRC) of the test sample. Finally, the proposed method respectively uses CRC and SRC representations to obtain two kinds of scores of the test sample and fuses them to recognize the image. The experiments of face recognition show that the combination of CRC and SRC has satisfactory performance for image classification.
APA, Harvard, Vancouver, ISO, and other styles
31

WANG, P. S. P. "PARALLEL OBJECT REPRESENTATION AND RECOGNITION." Parallel Processing Letters 03, no. 03 (September 1993): 279–90. http://dx.doi.org/10.1142/s0129626493000320.

Full text
Abstract:
A parallel method for 3-dimensional (3d) object recognition is introduced, using the concepts of coordinated graphs, layered graph representation, and parallel matching techniques; it significantly reduces the time required for dealing with 3-dimensional image analysis problems. Their fundamental parallel properties and the concept of finite representations are investigated, and several interesting examples, including curved and disconnected images, are illustrated. In addition to its importance in theoretical parallel study, the method can also be applied to 3-d parallel object recognition, image processing and computer vision.
APA, Harvard, Vancouver, ISO, and other styles
32

Zhou, Jianhang, and Bob Zhang. "Collaborative Representation Using Non-Negative Samples for Image Classification." Sensors 19, no. 11 (June 8, 2019): 2609. http://dx.doi.org/10.3390/s19112609.

Full text
Abstract:
Collaborative representation based classification (CRC) is an efficient classifier in image classification. By using l2 regularization, the collaborative representation based classifier holds competitive performance compared with the sparse representation based classifier while using less computational time. However, all of the elements calculated from the training samples are utilized for representation without selection, which can lead to poor performance in some classification tasks. To resolve this issue, in this paper we propose a novel collaborative representation that directly uses non-negative representations to represent a test sample collaboratively, termed the Non-negative Collaborative Representation-based Classifier (NCRC). To collect all non-negative collaborative representations, we introduce a Rectified Linear Unit (ReLU) function to filter the coefficients obtained by l2 minimization according to CRC's objective function. Next, we represent the test sample by using a linear combination of these representations. Lastly, the nearest subspace classifier is used to classify the test samples. The experiments performed on four different databases, including face and palmprint, showed the promising results of the proposed method. Accuracy comparisons with other state-of-the-art sparse representation-based classifiers demonstrated the effectiveness of NCRC at image classification. In addition, the proposed NCRC consumes less computational time, further illustrating its efficiency.
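The pipeline the abstract outlines (l2-regularised coding, ReLU filtering of the coefficients, then a nearest-subspace decision) can be sketched as follows. This is a toy reconstruction under common CRC conventions, not the authors' code; the regularisation value and the toy data are assumptions.

```python
import numpy as np

def ncrc_classify(X, labels, y, lam=0.01):
    """1) ridge (l2-regularised) collaborative coding of y over all
       training columns of X, 2) ReLU keeps the non-negative coefficients,
       3) nearest-subspace rule picks the class that best reconstructs y.
       lam is an illustrative value, not the paper's setting."""
    n = X.shape[1]
    # collaborative coding: argmin_a ||y - X a||^2 + lam ||a||^2
    a = np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)
    a = np.maximum(a, 0.0)                               # ReLU filtering
    best_class, best_err = None, np.inf
    for c in sorted(set(labels)):
        idx = [i for i, lab in enumerate(labels) if lab == c]
        err = np.linalg.norm(y - X[:, idx] @ a[idx])     # class residual
        if err < best_err:
            best_class, best_err = c, err
    return best_class

# toy data: columns are training samples, two classes
X = np.array([[1.0, 0.9, 0.0, 0.0],
              [0.0, 0.1, 1.0, 0.9],
              [0.0, 0.0, 0.0, 0.1]])
labels = [0, 0, 1, 1]
y = np.array([1.0, 0.05, 0.0])       # close to the class-0 subspace
pred = ncrc_classify(X, labels, y)   # → 0
```

The closed-form ridge solve is what makes CRC-style methods cheap compared with iterative l1 solvers; the ReLU step is the selection mechanism the paper adds on top.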
APA, Harvard, Vancouver, ISO, and other styles
33

Zhang, Yiyi, Li Niu, Ziqi Pan, Meichao Luo, Jianfu Zhang, Dawei Cheng, and Liqing Zhang. "Exploiting Motion Information from Unlabeled Videos for Static Image Action Recognition." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 07 (April 3, 2020): 12918–25. http://dx.doi.org/10.1609/aaai.v34i07.6990.

Full text
Abstract:
Static image action recognition, which aims to recognize action based on a single image, usually relies on expensive human labeling effort such as adequate labeled action images and large-scale labeled image dataset. In contrast, abundant unlabeled videos can be economically obtained. Therefore, several works have explored using unlabeled videos to facilitate image action recognition, which can be categorized into the following two groups: (a) enhance visual representations of action images with a designed proxy task on unlabeled videos, which falls into the scope of self-supervised learning; (b) generate auxiliary representations for action images with the generator learned from unlabeled videos. In this paper, we integrate the above two strategies in a unified framework, which consists of Visual Representation Enhancement (VRE) module and Motion Representation Augmentation (MRA) module. Specifically, the VRE module includes a proxy task which imposes pseudo motion label constraint and temporal coherence constraint on unlabeled videos, while the MRA module could predict the motion information of a static action image by exploiting unlabeled videos. We demonstrate the superiority of our framework based on four benchmark human action datasets with limited labeled data.
APA, Harvard, Vancouver, ISO, and other styles
34

Fell Contreras, Stephannie. "Technologies of Representation." Materia Arquitectura, no. 20 (December 25, 2020): 116. http://dx.doi.org/10.56255/ma.v0i20.486.

Full text
Abstract:
“In teaching us a new visual code”, Susan Sontag wrote more than forty years ago, “photographs alter and enlarge our notions of what is worth looking at and what we have a right to observe” (Sontag, 1977, p. 3). Since the publication of Sontag’s ‘Plato’s Cave’, the most radical change in this visual code has been the pace and breadth of its reach. We carry image-making and image-sharing devices in a pocket. We can easily upload a photograph to a search engine and call up millions of images based on visual or conceptual similarity. But even with the leveling of tools for making and distributing images, Sontag’s empowering ‘ethics of seeing’ is today a territory in dispute. Millions of image-makers and instant sharing capabilities are met by algorithmic filter-bubbles and widespread misinformation campaigns. The sheer quantity of image circulation did not amount to improved visibility, let alone mutual understanding. After 18-O (18 October) in Chile and the Covid-19 pandemic, it became clear that any discussion on representation should address the unrealized promise of the ‘ethics of images’. This implies not forgetting that, as Gayatri Spivak proposes, whenever we use the word ‘representation’ we are compounding its two meanings: to ‘re-present’ as in art or philosophy, and to ‘speak for’ as in politics (1988, p. 275). Architects are familiar with images that ‘do’ things for us: renders pre-visualize, orthographic projections measure, collages hint at experiences of space. Architectural drawings can be translated, as Robin Evans (1997) put it, into buildings and urban plans. But what is architecture’s relationship to other kinds of images, those which were never meant to become buildings? This issue of Materia Arquitectura was a call to explore the agency of images in the construction of realities, the imposition of borders and the narration of stories that are political.
Not only because their object is the polis, but because they alter our relationship with the built environment and consequently, the way in which we understand, imagine and shape, as a society, the common territory that is at the base of the exercise of public power.
APA, Harvard, Vancouver, ISO, and other styles
35

Alahmadi, Mohammad D. "Medical Image Segmentation with Learning Semantic and Global Contextual Representation." Diagnostics 12, no. 7 (June 25, 2022): 1548. http://dx.doi.org/10.3390/diagnostics12071548.

Full text
Abstract:
Automatic medical image segmentation is an essential step toward accurate disease diagnosis and designing a follow-up treatment. This assistive method facilitates the cancer detection process and provides a benchmark to highlight the affected area. The U-Net model has become the standard design choice. Although the symmetrical structure of the U-Net model enables this network to encode rich semantic representation, the intrinsic locality of the CNN layers limits the network's capability in modeling long-range contextual dependency. On the other hand, sequence-to-sequence Transformer models with a multi-head attention mechanism can effectively model global contextual dependency. However, the lack of low-level information stemming from the Transformer architecture limits its performance in capturing local representation. In this paper, we propose a model with two parallel encoders, where in the first path a CNN module captures the local semantic representation, whereas the second path deploys a Transformer module to extract the long-range contextual representation. Next, by adaptively fusing these two feature maps, we encode both representations into a single representative tensor to be further processed by the decoder block. An experimental study demonstrates that our design can provide rich and generic representation features which are highly efficient for a fine-grained semantic segmentation task.
APA, Harvard, Vancouver, ISO, and other styles
36

Zhang, Jianfu, Yuanyuan Huang, Yaoyi Li, Weijie Zhao, and Liqing Zhang. "Multi-Attribute Transfer via Disentangled Representation." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 9195–202. http://dx.doi.org/10.1609/aaai.v33i01.33019195.

Full text
Abstract:
Recent studies show significant progress in image-to-image translation task, especially facilitated by Generative Adversarial Networks. They can synthesize highly realistic images and alter the attribute labels for the images. However, these works employ attribute vectors to specify the target domain which diminishes image-level attribute diversity. In this paper, we propose a novel model formulating disentangled representations by projecting images to latent units, grouped feature channels of Convolutional Neural Network, to disassemble the information between different attributes. Thanks to disentangled representation, we can transfer attributes according to the attribute labels and moreover retain the diversity beyond the labels, namely, the styles inside each image. This is achieved by specifying some attributes and swapping the corresponding latent units to “swap” the attributes appearance, or applying channel-wise interpolation to blend different attributes. To verify the motivation of our proposed model, we train and evaluate our model on face dataset CelebA. Furthermore, the evaluation of another facial expression dataset RaFD demonstrates the generalizability of our proposed model.
APA, Harvard, Vancouver, ISO, and other styles
37

Zhang, Li Liang, Xi Ling Liu, and Shi Liang Zhang. "An Algorithm for Image Enhancement via Sparse Representation." Applied Mechanics and Materials 556-562 (May 2014): 4806–10. http://dx.doi.org/10.4028/www.scientific.net/amm.556-562.4806.

Full text
Abstract:
This paper presents an approach to enhancing the subjective visual quality of images, based on image sparse representation. First, we compare and analyse the performance of several currently popular image denoising methods on two kinds of image content, and, using the K-SVD, MB3D and CSR algorithms, we obtain clean images, namely the images with noise removed. Then, the denoised image is decomposed into cartoon and texture components by the Morphological Component Analysis (MCA) method; the cartoon part is super-resolved and the contrast of the texture in the image is enhanced. Finally, fusing the cartoon and texture components yields the desired image.
APA, Harvard, Vancouver, ISO, and other styles
38

Pu, Tao, Tianshui Chen, Hefeng Wu, and Liang Lin. "Semantic-Aware Representation Blending for Multi-Label Image Recognition with Partial Labels." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 2 (June 28, 2022): 2091–98. http://dx.doi.org/10.1609/aaai.v36i2.20105.

Full text
Abstract:
Training the multi-label image recognition models with partial labels, in which merely some labels are known while others are unknown for each image, is a considerably challenging and practical task. To address this task, current algorithms mainly depend on pre-training classification or similarity models to generate pseudo labels for the unknown labels. However, these algorithms depend on sufficient multi-label annotations to train the models, leading to poor performance especially with low known label proportion. In this work, we propose to blend category-specific representation across different images to transfer information of known labels to complement unknown labels, which can get rid of pre-training models and thus does not depend on sufficient annotations. To this end, we design a unified semantic-aware representation blending (SARB) framework that exploits instance-level and prototype-level semantic representation to complement unknown labels by two complementary modules: 1) an instance-level representation blending (ILRB) module blends the representations of the known labels in an image to the representations of the unknown labels in another image to complement these unknown labels. 2) a prototype-level representation blending (PLRB) module learns more stable representation prototypes for each category and blends the representation of unknown labels with the prototypes of corresponding labels to complement these labels. Extensive experiments on the MS-COCO, Visual Genome, Pascal VOC 2007 datasets show that the proposed SARB framework obtains superior performance over current leading competitors on all known label proportion settings, i.e., with the mAP improvement of 4.6%, 4.6%, 2.2% on these three datasets when the known label proportion is 10%. Codes are available at https://github.com/HCPLab-SYSU/HCP-MLR-PL.
APA, Harvard, Vancouver, ISO, and other styles
39

Shi, Li, Xiao Ke Niu, Zhi Zhong Wang, and Hui Ge Shi. "A Study on Image Representation Method Based on Biological Visual Mechanism." Applied Mechanics and Materials 249-250 (December 2012): 1283–88. http://dx.doi.org/10.4028/www.scientific.net/amm.249-250.1283.

Full text
Abstract:
Image representation is a key issue among many image processing tasks. Considering the problems faced by current general image representation methods, such as excessive computational cost, sensitivity to noise, and lack of self-adaptability, a novel image representation method based on biological visual mechanisms is proposed in this paper. By simulating the primary visual cortex to realize a sparse representation of the input image, the method also introduces a synchronization mechanism to make it more consistent with the visual system. Finally, the presented method was verified by applying it to compress natural images and digital literature images respectively. The results showed that this new representation method is better than the general sparse representation method in both compression ratio and noise sensitivity.
APA, Harvard, Vancouver, ISO, and other styles
40

Liang, Dong Tai. "Color Image Denoising Using Gaussian Multiscale Multivariate Image Analysis." Applied Mechanics and Materials 37-38 (November 2010): 248–52. http://dx.doi.org/10.4028/www.scientific.net/amm.37-38.248.

Full text
Abstract:
Inspired by the human vision system, a new image representation and analysis model based on Gaussian multiscale multivariate image analysis (MIA) is proposed. The multiscale color texture representations of the original image are used to constitute the multivariate image, each channel of which represents a perceptual observation at a different scale. The MIA then decomposes this multivariate image into multiscale color texture perceptual features (the principal component score images). These score images can be interpreted as 1) the output of three color opponent channels: black versus white, red versus green and blue versus yellow, 2) the edge information, and 3) higher-order Gaussian derivatives. Finally, a color image denoising approach based on the model is presented. Experiments show that this denoising method significantly improves on Gaussian filters by preserving more edge information.
APA, Harvard, Vancouver, ISO, and other styles
41

Di Sciascio, E., F. M. Donini, and M. Mongiello. "Structured Knowledge Representation for Image Retrieval." Journal of Artificial Intelligence Research 16 (April 1, 2002): 209–57. http://dx.doi.org/10.1613/jair.902.

Full text
Abstract:
We propose a structured approach to the problem of retrieval of images by content and present a description logic that has been devised for the semantic indexing and retrieval of images containing complex objects. As other approaches do, we start from low-level features extracted with image analysis to detect and characterize regions in an image. However, in contrast with feature-based approaches, we provide a syntax to describe segmented regions as basic objects and complex objects as compositions of basic ones. Then we introduce a companion extensional semantics for defining reasoning services, such as retrieval, classification, and subsumption. These services can be used for both exact and approximate matching, using similarity measures. Using our logical approach as a formal specification, we implemented a complete client-server image retrieval system, which allows a user to pose both queries by sketch and queries by example. A set of experiments has been carried out on a testbed of images to assess the retrieval capabilities of the system in comparison with expert users ranking. Results are presented adopting a well-established measure of quality borrowed from textual information retrieval.
APA, Harvard, Vancouver, ISO, and other styles
42

Dewa Ayu Ketut Septika. "Image-making and Architecture: A Digital Medium for Qualitative Design Representative." Built Environment Studies 1, no. 1 (October 22, 2020): 29–36. http://dx.doi.org/10.22146/best.v1i1.504.

Full text
Abstract:
Architecture is inseparable from visual aspects in the form of representation as a way to establish communication. Due to digitalization and the rapid development of technology, there has been a shift in the paradigm of image representation. Manual images have become digital images, from sketching and drawing to image-making and rendering. Digital rendering is considered to be object-oriented and quantitative. It has the characteristics of being precise, fast, and reflective of form and materiality, with a photorealistic image as a result. There is no visible involvement of subjects such as architects, image-makers, and observers in the process, because the image represents the final outcome. Here, the role of representation in evoking imagination and deeper interpretation is lost, because it does not leave room for intervention and contemplation of the design. That is why another kind of digital method emerged in the world of representation: instead of “rendering” the design as an image, it turns to the act of “making” an image as design representation. Its characteristic as a representation method gives the produced images the ability to be evocative, making viewers contemplate and interpret the design, which makes it a qualitative representation. The aim of this paper is to understand the act of image-making through digital collage as a medium for qualitative representation in architectural design. By comparing digital rendering and collage images obtained through literature studies, this paper offers the author's viewpoint on the qualities and experiences brought by architectural representation.
APA, Harvard, Vancouver, ISO, and other styles
43

Jin, Xu, Teng Huang, Ke Wen, Mengxian Chi, and Hong An. "HistoSSL: Self-Supervised Representation Learning for Classifying Histopathology Images." Mathematics 11, no. 1 (December 26, 2022): 110. http://dx.doi.org/10.3390/math11010110.

Full text
Abstract:
The success of image classification depends on copious annotated images for training. Annotating histopathology images is costly and laborious. Although several successful self-supervised representation learning approaches have been introduced, they are still insufficient to consider the unique characteristics of histopathology images. In this work, we propose the novel histopathology-oriented self-supervised representation learning framework (HistoSSL) to efficiently extract representations from unlabeled histopathology images at three levels: global, cell, and stain. The model transfers remarkably to downstream tasks: colorectal tissue phenotyping on the NCTCRC dataset and breast cancer metastasis recognition on the CAMELYON16 dataset. HistoSSL achieved higher accuracies than state-of-the-art self-supervised learning approaches, which proved the robustness of the learned representations.
APA, Harvard, Vancouver, ISO, and other styles
44

Zheng, Min, Yangliao Geng, and Qingyong Li. "Revisiting Local Descriptors via Frequent Pattern Mining for Fine-Grained Image Retrieval." Entropy 24, no. 2 (January 20, 2022): 156. http://dx.doi.org/10.3390/e24020156.

Full text
Abstract:
Fine-grained image retrieval aims at searching relevant images among fine-grained classes given a query. The main difficulty of this task derives from the small interclass distinction and the large intraclass variance of fine-grained images, posing severe challenges to the methods that only resort to global or local features. In this paper, we propose a novel fine-grained image retrieval method, where global–local aware feature representation is learned. Specifically, the global feature is extracted by selecting the most relevant deep descriptors. Meanwhile, we explore the intrinsic relationship of different parts via the frequent pattern mining, thus obtaining the representative local feature. Further, an aggregation feature that learns global–local aware feature representation is designed. Consequently, the discriminative ability among different fine-grained classes is enhanced. We evaluate the proposed method on five popular fine-grained datasets. Extensive experimental results demonstrate that the performance of fine-grained image retrieval is improved with the proposed global–local aware representation.
APA, Harvard, Vancouver, ISO, and other styles
45

Sahoo, Arabinda, and Pranati Das. "Dictionary based Image Compression via Sparse Representation." International Journal of Electrical and Computer Engineering (IJECE) 7, no. 4 (August 1, 2017): 1964. http://dx.doi.org/10.11591/ijece.v7i4.pp1964-1972.

Full text
Abstract:
Nowadays image compression has become a necessity due to the large volume of images. For efficient use of storage space and data transmission, it becomes essential to compress the image. In this paper, we propose a dictionary-based image compression framework via sparse representation, with the construction of a trained over-complete dictionary. The over-complete dictionary is trained using the intra-prediction residuals obtained from different images and is applied for sparse representation. In this method, the current image block is first predicted from its spatially neighboring blocks, and then the prediction residuals are encoded via sparse representation. A sparse approximation algorithm and the trained over-complete dictionary are applied for the sparse representation of prediction residuals. The detail coefficients obtained from the sparse representation are used for encoding. Experimental results show that the proposed method yields both improved coding efficiency and image quality compared to some state-of-the-art image compression methods.
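The sparse-approximation step used by frameworks like the one above is typically Orthogonal Matching Pursuit (OMP), a greedy solver that picks dictionary atoms one at a time. A minimal sketch; the random Gaussian dictionary here merely stands in for the paper's trained over-complete dictionary, and the signal is constructed to be 2-sparse:

```python
import numpy as np

def omp(D, y, k):
    """Orthogonal Matching Pursuit: greedy k-sparse coding of y over a
    dictionary D whose columns (atoms) are assumed l2-normalised."""
    residual, support = y.copy(), []
    coeffs = np.zeros(D.shape[1])
    for _ in range(k):
        # pick the atom most correlated with the current residual
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j not in support:
            support.append(j)
        # least-squares refit on the chosen support, then update residual
        sol, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ sol
    coeffs[support] = sol
    return coeffs

# toy setup: a random dictionary stands in for the trained one
rng = np.random.default_rng(1)
D = rng.normal(size=(16, 32))
D /= np.linalg.norm(D, axis=0)          # normalise the atoms
y = 2.0 * D[:, 3] - 1.5 * D[:, 10]      # y is 2-sparse by construction
a = omp(D, y, k=2)
```

In a compression pipeline, only the support indices and their coefficients would be entropy-coded, which is where the bit-rate savings come from.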
APA, Harvard, Vancouver, ISO, and other styles
46

Silva, Samuel Henrique, Arun Das, Adel Aladdini, and Peyman Najafirad. "Adaptive Clustering of Robust Semantic Representations for Adversarial Image Purification on Social Networks." Proceedings of the International AAAI Conference on Web and Social Media 16 (May 31, 2022): 968–79. http://dx.doi.org/10.1609/icwsm.v16i1.19350.

Full text
Abstract:
Advances in Artificial Intelligence (AI) have made it possible to automate human-level visual search and perception tasks on the massive sets of image data shared on social media on a daily basis. However, AI-based automated filters are highly susceptible to deliberate image attacks that can lead to content misclassification of cyberbullying, child sexual abuse material (CSAM), adult content, and deepfakes. One of the most effective methods to defend against such disturbances is adversarial training, but this comes at the cost of generalization for unseen attacks and transferability across models. In this article, we propose a robust defense against adversarial image attacks, which is model agnostic and generalizable to unseen adversaries. We begin with a baseline model, extracting the latent representations for each class and adaptively clustering the latent representations that share a semantic similarity. Next, we obtain the distributions for these clustered latent representations along with their originating images. We then learn semantic reconstruction dictionaries (SRD). We adversarially train a new model, constraining the latent space representation to minimize the distance between the adversarial latent representation and the true cluster distribution. To purify the image, we decompose the input into low- and high-frequency components. The high-frequency component is reconstructed based on the best SRD from the clean dataset. To select the best SRD, we rely on the distance between the robust latent representations and the semantic cluster distributions. The output is a purified image with no perturbations. Evaluations using comprehensive datasets, including image benchmarks and social media images, demonstrate that our proposed purification approach considerably guards and enhances the accuracy of AI-based image filters for unlawful and harmful perturbed images.
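The low/high-frequency decomposition used in the purification step can be illustrated with an ideal FFT band split. The circular cutoff mask is a simplifying assumption for this sketch, not the paper's exact filter; the key property is that the two bands sum back to the input:

```python
import numpy as np

def split_frequencies(img, cutoff):
    """Split an image into low/high-frequency parts with an ideal FFT mask."""
    f = np.fft.fftshift(np.fft.fft2(img))
    rows, cols = img.shape
    yy, xx = np.ogrid[:rows, :cols]
    dist = np.sqrt((yy - rows / 2) ** 2 + (xx - cols / 2) ** 2)
    mask = dist <= cutoff                      # circular low-pass region
    low = np.fft.ifft2(np.fft.ifftshift(f * mask)).real
    high = np.fft.ifft2(np.fft.ifftshift(f * (~mask))).real
    return low, high

img = np.add.outer(np.arange(16.0), np.arange(16.0))  # smooth toy "image"
low, high = split_frequencies(img, cutoff=4)
# the decomposition is exact: the two bands reconstruct the original image
assert np.allclose(low + high, img, atol=1e-10)
```

In the proposed pipeline only the high-frequency band, where adversarial perturbations tend to concentrate, would be replaced by an SRD reconstruction.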
APA, Harvard, Vancouver, ISO, and other styles
47

LABBI, ABDERRAHIM, HOLGER BOSCH, and CHRISTIAN PELLEGRINI. "HIGH ORDER STATISTICS FOR IMAGE CLASSIFICATION." International Journal of Neural Systems 11, no. 04 (August 2001): 371–77. http://dx.doi.org/10.1142/s0129065701000837.

Full text
Abstract:
This paper addresses the problem of image classification using local information which is aggregated to provide a global representation of different image classes. Local information is adaptively extracted from an image database using Independent Component Analysis (ICA), which provides a set of localized, oriented, and band-pass filters selective to independent features of the images. Local representation using ICA techniques has been previously investigated by several researchers. However, very little work has been done on further use of these representations to provide more complex and global descriptions of images. In this paper, we present an algorithm which uses the energy of a minimal set of ICA filters to provide class-specific signatures which are shown to be strongly discriminant. Computer simulations are carried out on two image databases, one consisting of five classes, referred to as categories (buildings, rooms, mountains, forests, and beaches), and one consisting of a set of 30 objects from multiple views for viewpoint-invariant object recognition. The classification performance of the algorithm using both Independent and Principal Component Analyses is reported and discussed.
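The signature idea, aggregating per-filter response energies into a compact global descriptor, can be sketched as below. The random filters here merely stand in for the learned ICA filter bank, and the circular FFT convolution is a simplification:

```python
import numpy as np

def energy_signature(img, filters):
    """Mean squared response per filter: a compact global image signature."""
    sig = []
    for f in filters:
        # circular convolution via the FFT (filter zero-padded to image size)
        resp = np.fft.ifft2(np.fft.fft2(img) * np.fft.fft2(f, s=img.shape)).real
        sig.append(np.mean(resp ** 2))
    return np.array(sig)

rng = np.random.default_rng(0)
img = rng.random((16, 16))
bank = [rng.standard_normal((5, 5)) for _ in range(4)]  # stand-ins for ICA filters
sig = energy_signature(img, bank)
assert sig.shape == (4,) and np.all(sig >= 0)  # one nonnegative energy per filter
```

Classification would then compare these fixed-length signatures across classes, e.g. with a nearest-centroid or discriminant classifier.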
APA, Harvard, Vancouver, ISO, and other styles
48

Zare, Mohammad Reza, Woo Chaw Seng, and Ahmed Mueen. "Automatic Classification Of Medical X-Ray Images." Malaysian Journal of Computer Science 26, no. 1 (March 1, 2013): 9–22. http://dx.doi.org/10.22452/mjcs.vol26no1.2.

Full text
Abstract:
Image representation is one of the major aspects of automatic classification algorithms. In this paper, different feature extraction techniques have been utilized to represent medical X-ray images. They are categorized into two groups: (i) low-level image representations such as the Gray Level Co-occurrence Matrix (GLCM), the Canny edge operator, Local Binary Patterns (LBP), and raw pixel values, and (ii) local patch-based image representations such as Bag of Words (BoW). These features have been exploited in different algorithms for automatic classification of medical X-ray images. We then analyzed the classification performance obtained with regard to the image representation techniques used. The experiments were evaluated on the ImageCLEF 2007 database, which consists of 11000 medical X-ray images in 116 classes. Experimental results showed that the classification performance obtained by exploiting LBP and BoW outperformed the other algorithms with respect to the image representation techniques used.
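A minimal version of the LBP descriptor mentioned above can be written in plain NumPy. This is the basic 8-neighbour, 256-bin variant; library implementations add uniform and rotation-invariant mappings on top of it:

```python
import numpy as np

def lbp_histogram(img):
    """Basic 8-neighbour LBP codes plus a 256-bin normalized histogram."""
    c = img[1:-1, 1:-1]                       # interior pixels (the centres)
    # 8 neighbours, ordered clockwise from the top-left pixel
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros(c.shape, dtype=int)
    for bit, (dy, dx) in enumerate(shifts):
        nb = img[1 + dy:img.shape[0] - 1 + dy, 1 + dx:img.shape[1] - 1 + dx]
        code |= (nb >= c).astype(int) << bit  # one bit per neighbour comparison
    hist = np.bincount(code.ravel(), minlength=256).astype(float)
    return hist / hist.sum()

img = np.full((6, 6), 7, dtype=np.uint8)
h = lbp_histogram(img)
assert h[255] == 1.0   # a flat image maps every pixel to the all-ones code
```

The normalized 256-bin histogram is the fixed-length texture feature that a classifier would consume.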
APA, Harvard, Vancouver, ISO, and other styles
49

Wundrich, Ingo J., Christoph von der Malsburg, and Rolf P. Würtz. "Image Representation by Complex Cell Responses." Neural Computation 16, no. 12 (December 1, 2004): 2563–75. http://dx.doi.org/10.1162/0899766042321760.

Full text
Abstract:
We present an analysis of the representation of images as the magnitudes of their transform with complex-valued Gabor wavelets. Such a representation is a model for complex cells in the early stage of visual processing and of high technical usefulness for image understanding, because it makes the representation insensitive to small local shifts. We show that if the images are band limited and of zero mean, then reconstruction from the magnitudes is unique up to the sign for almost all images.
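The magnitude-of-Gabor-response representation can be sketched as follows. The FFT implements circular convolution, so shifting the input exactly shifts the magnitude map; this equivariance, combined with the smoothness of the magnitude, is what makes the representation insensitive to small local shifts. Parameter values below are illustrative:

```python
import numpy as np

def gabor_magnitude(img, freq=0.2, theta=0.0, sigma=3.0):
    """Magnitude of the circular convolution with a complex Gabor wavelet."""
    h, w = img.shape
    y, x = np.mgrid[:h, :w]
    y = (y + h // 2) % h - h // 2   # centre the kernel coordinates on the torus
    x = (x + w // 2) % w - w // 2
    envelope = np.exp(-(x**2 + y**2) / (2 * sigma**2))
    carrier = np.exp(2j * np.pi * freq * (x * np.cos(theta) + y * np.sin(theta)))
    kernel = envelope * carrier
    response = np.fft.ifft2(np.fft.fft2(img) * np.fft.fft2(kernel))
    return np.abs(response)

rng = np.random.default_rng(1)
img = rng.random((32, 32))
mag = gabor_magnitude(img)
mag_shifted = gabor_magnitude(np.roll(img, (2, 3), axis=(0, 1)))
# circular convolution commutes with shifts, so the magnitude map just shifts too
assert np.allclose(np.roll(mag, (2, 3), axis=(0, 1)), mag_shifted)
```

The paper's uniqueness result concerns the inverse direction: recovering a band-limited, zero-mean image (up to sign) from such magnitude maps alone.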
APA, Harvard, Vancouver, ISO, and other styles
50

Zhou, Caiyue, Yanfen Kong, Chuanyong Zhang, Lin Sun, Dongmei Wu, and Chongbo Zhou. "A Hybrid Sparse Representation Model for Image Restoration." Sensors 22, no. 2 (January 11, 2022): 537. http://dx.doi.org/10.3390/s22020537.

Full text
Abstract:
Group-based sparse representation (GSR) uses the image nonlocal self-similarity (NSS) prior to group similar image patches, and then performs sparse representation. However, the traditional GSR model restores the image by training on degraded images, which leads to inevitable over-fitting of the data in the training model, resulting in poor image restoration results. In this paper, we propose a new hybrid sparse representation (HSR) model for image restoration. The proposed HSR model is improved in two aspects. On the one hand, it exploits the NSS priors of both the degraded image and external image datasets, making the model complementary in both the feature space and the image plane. On the other hand, we introduce a joint sparse representation model to make better use of the local sparsity and NSS characteristics of the images. This joint model integrates the patch-based sparse representation (PSR) model and the GSR model while retaining the advantages of both, so that the sparse representation model is unified. Extensive experimental results show that the proposed hybrid model outperforms several existing image recovery algorithms in both objective and subjective evaluations.
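The patch-grouping step that the NSS prior drives in GSR can be sketched as brute-force block matching: each patch is grouped with its nearest neighbours in Euclidean distance. The sparse-coding stage applied to each group is omitted here, and the patch/stride sizes are toy values:

```python
import numpy as np

def group_similar_patches(img, patch=4, stride=4, k=3):
    """Group each patch with its k nearest neighbours (nonlocal self-similarity)."""
    h, w = img.shape
    coords = [(i, j) for i in range(0, h - patch + 1, stride)
                     for j in range(0, w - patch + 1, stride)]
    patches = np.stack([img[i:i + patch, j:j + patch].ravel() for i, j in coords])
    # pairwise squared distances between flattened patches
    d2 = ((patches[:, None, :] - patches[None, :, :]) ** 2).sum(-1)
    groups = np.argsort(d2, axis=1)[:, :k]   # each row: indices of the k closest
    return patches, groups

img = np.zeros((8, 8))
img[:, 4:] = 1.0                             # two flat regions -> two patch clusters
patches, groups = group_similar_patches(img)
# every patch's nearest neighbour comes from the same flat region
assert all(np.allclose(patches[g[0]], patches[g[1]]) for g in groups)
```

In a full GSR pipeline each group's patch matrix would then be sparsely coded (e.g. via a low-rank or dictionary model) and the patches aggregated back into the image.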
APA, Harvard, Vancouver, ISO, and other styles