Journal articles on the topic 'Image structure representation'

Consult the top 50 journal articles for your research on the topic 'Image structure representation.'

1

Chen, Yuhao, Alexander Wong, Yuan Fang, Yifan Wu, and Linlin Xu. "Deep Residual Transform for Multi-scale Image Decomposition." Journal of Computational Vision and Imaging Systems 6, no. 1 (January 15, 2021): 1–5. http://dx.doi.org/10.15353/jcvis.v6i1.3537.

Abstract:
Multi-scale image decomposition (MID) is a fundamental task in computer vision and image processing that involves the transformation of an image into a hierarchical representation comprising different levels of visual granularity from coarse structures to fine details. A well-engineered MID disentangles the image signal into meaningful components which can be used in a variety of applications such as image denoising, image compression, and object classification. Traditional MID approaches such as wavelet transforms tackle the problem through carefully designed basis functions under rigid decomposition structure assumptions. However, as the information distribution varies from one type of image content to another, rigid decomposition assumptions lead to inefficient representation, i.e., some scales can contain little to no information. To address this issue, we present Deep Residual Transform (DRT), a data-driven MID strategy where the input signal is transformed into a hierarchy of non-linear representations at different scales, with each representation being independently learned as the representational residual of previous scales at a user-controlled detail level. As such, the proposed DRT progressively disentangles scale information from the original signal by sequentially learning residual representations. The decomposition flexibility of this approach allows for highly tailored representations that cater to specific types of image content, and results in greater representational efficiency and compactness. In this study, we realize the proposed transform by leveraging a hierarchy of sequentially trained autoencoders. To explore the efficacy of the proposed DRT, we leverage two datasets comprising very different types of image content: 1) CelebFaces and 2) Cityscapes. Experimental results show that the proposed DRT achieved highly efficient information decomposition on both datasets amid their very different visual granularity characteristics.
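Illustrative note: the residual idea in this abstract (each scale models what the previous scales left unexplained) can be sketched on a 1-D signal. This is not the authors' DRT. A simple moving-average filter stands in for each trained autoencoder, and the names `box_smooth`, `residual_decompose`, and `reconstruct` are hypothetical.

```python
def box_smooth(signal, radius):
    """Moving average with clamped edges (assumed stand-in for a learned autoencoder)."""
    n = len(signal)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        window = signal[lo:hi]
        out.append(sum(window) / len(window))
    return out


def residual_decompose(signal, radii):
    """Split a signal into coarse-to-fine components.

    Each level approximates the residual left by the previous levels;
    the final component is whatever detail remains unexplained.
    """
    residual = list(signal)
    components = []
    for radius in radii:  # large radius first (coarse), smaller ones later (fine)
        approx = box_smooth(residual, radius)
        components.append(approx)
        residual = [r - a for r, a in zip(residual, approx)]
    components.append(residual)
    return components


def reconstruct(components):
    """Summing all components recovers the original signal."""
    return [sum(values) for values in zip(*components)]


signal = [0.0, 1.0, 4.0, 9.0, 16.0, 9.0, 4.0, 1.0, 0.0]
components = residual_decompose(signal, radii=[4, 1])
restored = reconstruct(components)
```

Because every level subtracts its own approximation before passing the residual on, the components always sum back to the input, whatever approximator is used at each scale.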
2

RIZO-RODRÍGUEZ, DAYRON, HEYDI MÉNDEZ-VAZQUEZ, and EDEL GARCÍA-REYES. "ILLUMINATION INVARIANT FACE RECOGNITION IN QUATERNION DOMAIN." International Journal of Pattern Recognition and Artificial Intelligence 27, no. 03 (May 2013): 1360004. http://dx.doi.org/10.1142/s0218001413600045.

Abstract:
The performance of face recognition systems tends to decrease when images are affected by illumination. Feature extraction is one of the main steps of a face recognition process, where it is possible to alleviate the illumination effects on face images. In order to increase the accuracy of recognition tasks, different methods for obtaining illumination invariant features have been developed. The aim of this work is to compare two different ways to represent face image descriptions in terms of their illumination invariant properties for face recognition. The first representation is constructed following the structure of complex numbers and the second one is based on quaternion numbers. Using four different face description approaches, both representations are constructed, transformed into the frequency domain and expressed in polar coordinates. The most illumination invariant component of each frequency domain representation is determined and used as the representative information of the face image. Verification and identification experiments are then performed in order to compare the discriminative power of the selected components. The representative component of the quaternion representation outperformed that of the complex one.
3

Fu, Y., Y. Ye, G. Liu, B. Zhang, and R. Zhang. "ROBUST MULTIMODAL IMAGE MATCHING BASED ON MAIN STRUCTURE FEATURE REPRESENTATION." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLIII-B3-2020 (August 21, 2020): 583–89. http://dx.doi.org/10.5194/isprs-archives-xliii-b3-2020-583-2020.

Abstract:
Image matching is a crucial procedure for multimodal remote sensing image processing. However, the performance of conventional methods is often degraded in matching multimodal images due to significant nonlinear intensity differences. To address this problem, this letter proposes a novel image feature representation named Main Structure with Histogram of Orientated Phase Congruency (M-HOPC). M-HOPC is able to precisely capture similar structure properties between multimodal images by reinforcing the main structure information for the construction of the phase congruency feature description. Specifically, each pixel of an image is assigned an independent weight for the feature descriptor according to the main structure such as large contours and edges. Then M-HOPC is integrated as the similarity measure for correspondence detection by a template matching scheme. Three pairs of multimodal images including optical, LiDAR, and SAR data have been used to evaluate the proposed method. The results show that M-HOPC is robust to nonlinear intensity differences and achieves superior matching performance compared with other state-of-the-art methods.
4

WANG, ZHIYONG, ZHERU CHI, DAGAN FENG, and AH CHUNG TSOI. "CONTENT-BASED IMAGE RETRIEVAL WITH RELEVANCE FEEDBACK USING ADAPTIVE PROCESSING OF TREE-STRUCTURE IMAGE REPRESENTATION." International Journal of Image and Graphics 03, no. 01 (January 2003): 119–43. http://dx.doi.org/10.1142/s0219467803000944.

Abstract:
Content-based image retrieval has become an essential technique in multimedia data management. However, due to the difficulties and complications involved in the various image processing tasks, a robust semantic representation of image content is still very difficult (if not impossible) to achieve. In this paper, we propose a novel content-based image retrieval approach with relevance feedback using adaptive processing of tree-structure image representation. In our approach, each image is first represented with a quad-tree, which is segmentation free. Then a neural network model with the Back-Propagation Through Structure (BPTS) learning algorithm is employed to learn the tree-structure representation of the image content. This approach that integrates image representation and similarity measure in a single framework is applied to the relevance feedback of the content-based image retrieval. In our approach, an initial ranking of the database images is first carried out based on the similarity between the query image and each of the database images according to global features. The user is then asked to categorize the top retrieved images into similar and dissimilar groups. Finally, the BPTS neural network model is used to learn the user's intention for a better retrieval result. This process continues until satisfactory retrieval results are achieved. In the refining process, a fine similarity grading scheme can also be adopted to improve the retrieval performance. Simulations on texture images and scenery pictures have demonstrated promising results which compare favorably with the other relevance feedback methods tested.
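Illustrative note: a segmentation-free quad-tree of the kind this abstract mentions can be built by recursively splitting an image into four quadrants down to a fixed depth. The sketch below is a minimal assumption-laden example, not the paper's implementation: the per-node feature (mean intensity), the dict layout, and the names `build_quadtree` and `count_nodes` are all hypothetical.

```python
def build_quadtree(image, top, left, height, width, depth):
    """Describe a block of `image` and, if depth allows, its four quadrants."""
    pixels = [
        image[r][c]
        for r in range(top, top + height)
        for c in range(left, left + width)
    ]
    node = {
        "box": (top, left, height, width),
        "mean": sum(pixels) / len(pixels),  # hypothetical per-node feature
        "children": [],
    }
    if depth > 0 and height >= 2 and width >= 2:
        h2, w2 = height // 2, width // 2
        quadrants = [
            (top, left, h2, w2),                             # top-left
            (top, left + w2, h2, width - w2),                # top-right
            (top + h2, left, height - h2, w2),               # bottom-left
            (top + h2, left + w2, height - h2, width - w2),  # bottom-right
        ]
        node["children"] = [
            build_quadtree(image, t, l, h, w, depth - 1)
            for t, l, h, w in quadrants
        ]
    return node


def count_nodes(node):
    """Total number of nodes in the tree rooted at `node`."""
    return 1 + sum(count_nodes(child) for child in node["children"])


image = [
    [0, 0, 8, 8],
    [0, 0, 8, 8],
    [4, 4, 2, 2],
    [4, 4, 2, 2],
]
tree = build_quadtree(image, 0, 0, 4, 4, depth=2)
```

The split positions depend only on the image size and the chosen depth, never on image content, which is what makes the structure segmentation free. A structure-aware learner would then consume the node features bottom-up.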
5

Yu, Siquan, Jiaxin Liu, Zhi Han, Yong Li, Yandong Tang, and Chengdong Wu. "Representation Learning Based on Autoencoder and Deep Adaptive Clustering for Image Clustering." Mathematical Problems in Engineering 2021 (January 9, 2021): 1–11. http://dx.doi.org/10.1155/2021/3742536.

Abstract:
Image clustering is a complex procedure, which is significantly affected by the choice of image representation. Most of the existing image clustering methods treat representation learning and clustering separately, which usually brings two problems. On the one hand, image representations are difficult to select and the learned representations are not suitable for clustering. On the other hand, they inevitably involve some clustering step, which may bring some error and hurt the clustering results. To tackle these problems, we present a new clustering method that efficiently builds an image representation and precisely discovers cluster assignments. For this purpose, the image clustering task is regarded as a binary pairwise classification problem with local structure preservation. Specifically, we propose here such an approach for image clustering based on a fully convolutional autoencoder and deep adaptive clustering (DAC). To extract the essential representation and maintain the local structure, a fully convolutional autoencoder is applied. To map features to the clustering space and obtain a suitable image representation, the DAC algorithm participates in the training of the autoencoder. Our method can learn an image representation that is suitable for clustering and discover the precise clustering label for each image. A series of real-world image clustering experiments verify the effectiveness of the proposed algorithm.
6

CHEN, XIAOWU, BIN ZHOU, FANG XU, and QINPING ZHAO. "AUTOMATIC IMAGE COMPLETION WITH STRUCTURE PROPAGATION AND TEXTURE SYNTHESIS." International Journal of Software Engineering and Knowledge Engineering 20, no. 08 (December 2010): 1097–117. http://dx.doi.org/10.1142/s0218194010005055.

Abstract:
In this paper, we present a novel automatic image completion solution in a greedy manner inspired by a primal sketch representation model. Firstly, an image is divided into structure (sketchable) components and texture (non-sketchable) components, and the missing structures, such as curves and corners, are predicted by tensor voting. Secondly, the textures along structural sketches are synthesized with the sampled patches of some known structure components. Then, using the texture completion priorities decided by the confidence term, data term and distance term, the similar image patches of some known texture components are found by selecting a point with the maximum priority on the boundary of hole region. Finally, these image patches inpaint the missing textures of hole region seamlessly through graph cuts. The characteristics of this solution include: (1) introducing the primal sketch representation model to guide completion for visual consistency; (2) achieving fully automatic completion. The experiments on natural images illustrate satisfying image completion results.
7

Li, Wei, Yuxiang Zhang, Na Liu, Qian Du, and Ran Tao. "Structure-Aware Collaborative Representation for Hyperspectral Image Classification." IEEE Transactions on Geoscience and Remote Sensing 57, no. 9 (September 2019): 7246–61. http://dx.doi.org/10.1109/tgrs.2019.2912507.

8

Li, Zhao, Le Wang, Tao Yu, and Bing Liang Hu. "Image Super-Resolution via Low-Rank Representation." Applied Mechanics and Materials 568-570 (June 2014): 652–55. http://dx.doi.org/10.4028/www.scientific.net/amm.568-570.652.

Abstract:
This paper presents a novel method for solving single-image super-resolution problems, based upon low-rank representation (LRR). Given a set of low-resolution image patches, LRR seeks the lowest-rank representation among all the candidates that represent all patches as the linear combination of the patches in a low-resolution dictionary. By jointly training two dictionaries for the low-resolution and high-resolution images, we can enforce the similarity of LRRs between the low-resolution and high-resolution image pair with respect to their own dictionaries. Therefore, the LRR of a low-resolution image can be applied with the high-resolution dictionary to generate a high-resolution image. Unlike the well-known sparse representation, which computes the sparsest representation of each image patch individually, LRR aims at finding the lowest-rank representation of a collection of patches jointly. LRR better captures the global structure of the image. Experiments show that our method gives good results both visually and quantitatively.
9

Dong, Bin, Songlei Jian, and Kai Lu. "Learning Multimodal Representations by Symmetrically Transferring Local Structures." Symmetry 12, no. 9 (September 13, 2020): 1504. http://dx.doi.org/10.3390/sym12091504.

Abstract:
Multimodal representations play an important role in multimodal learning tasks, including cross-modal retrieval and intra-modal clustering. However, existing multimodal representation learning approaches focus on building one common space by aligning different modalities and ignore the complementary information across the modalities, such as the intra-modal local structures. In other words, they only focus on the object-level alignment and ignore structure-level alignment. To tackle the problem, we propose a novel symmetric multimodal representation learning framework by transferring local structures across different modalities, namely MTLS. A customized soft metric learning strategy and an iterative parameter learning process are designed to symmetrically transfer local structures and enhance the cluster structures in intra-modal representations. The bidirectional retrieval loss based on multi-layer neural networks is utilized to align two modalities. MTLS is instantiated with image and text data and shows its superior performance on image-text retrieval and image clustering. MTLS outperforms the state-of-the-art multimodal learning methods by up to 32% in terms of R@1 on text-image retrieval and 16.4% in terms of AMI on clustering.
10

Berg, A. P., and W. B. Mikhael. "An efficient structure and algorithm for image representation using nonorthogonal basis images." IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing 44, no. 10 (1997): 818–28. http://dx.doi.org/10.1109/82.633439.

11

Guru, D. S., K. B. Nagasundara, S. Manjunath, and R. Dinesh. "An Approach for Hand Vein Representation and Indexing." International Journal of Digital Crime and Forensics 3, no. 2 (April 2011): 1–15. http://dx.doi.org/10.4018/jdcf.2011040101.

Abstract:
This paper proposes a model for representing and indexing of hand vein images. The proposed representation model identifies the junction points and perceives the spatial relationships existing among all junction points in hand vein images by the use of triangular spatial relationship (TSR). The model preserves the TSR among the junction points in a symbolic hand vein image by the use of quadruples and for each quadruple, a unique TSR key is generated. A novel methodology to label the junction points based on graph properties of junction points is also proposed. A Symbolic Hand Vein Image Database (SHVID) is created through the construction of B-tree, an efficient multilevel indexing structure. A methodology to retrieve similar symbolic hand vein images for a given query image is also presented. The proposed methodology has shown promising results.
12

Fei, Yin, Gao Wei, and Song Zongxi. "Medical Image Fusion Based on Feature Extraction and Sparse Representation." International Journal of Biomedical Imaging 2017 (2017): 1–11. http://dx.doi.org/10.1155/2017/3020461.

Abstract:
As a novel multiscale geometric analysis tool, sparse representation has shown many advantages over the conventional image representation methods. However, the standard sparse representation does not take intrinsic structure and its time complexity into consideration. In this paper, a new fusion mechanism for multimodal medical images based on sparse representation and decision map is proposed to deal with these problems simultaneously. Three decision maps are designed including structure information map (SM) and energy information map (EM) as well as structure and energy map (SEM) to make the results reserve more energy and edge information. SM contains the local structure feature captured by the Laplacian of a Gaussian (LOG) and EM contains the energy and energy distribution feature detected by the mean square deviation. The decision map is added to the normal sparse representation based method to improve the speed of the algorithm. Proposed approach also improves the quality of the fused results by enhancing the contrast and reserving more structure and energy information from the source images. The experiment results of 36 groups of CT/MR, MR-T1/MR-T2, and CT/PET images demonstrate that the method based on SR and SEM outperforms five state-of-the-art methods.
13

Zhang, Yongqin, Jiaying Liu, Wenhan Yang, and Zongming Guo. "Image Super-Resolution Based on Structure-Modulated Sparse Representation." IEEE Transactions on Image Processing 24, no. 9 (September 2015): 2797–810. http://dx.doi.org/10.1109/tip.2015.2431435.

14

Xu, Peng, Man Guo, Lei Chen, Weifeng Hu, Qingshan Chen, and Yujun Li. "No-Reference Stereoscopic Image Quality Assessment Based on Binocular Statistical Features and Machine Learning." Complexity 2021 (January 28, 2021): 1–14. http://dx.doi.org/10.1155/2021/8834652.

Abstract:
Learning a deep structure representation for complex information networks is a vital research area, and assessing the quality of stereoscopic images or videos is challenging due to complex 3D quality factors. In this paper, we explore how to extract effective features to enhance the prediction accuracy of perceptual quality assessment. Inspired by the structure representation of the human visual system and the machine learning technique, we propose a no-reference quality assessment scheme for stereoscopic images. More specifically, the statistical features of the gradient magnitude and Laplacian of Gaussian responses are extracted to form binocular quality-predictive features. After feature extraction, these features of distorted stereoscopic image and its human perceptual score are used to construct a statistical regression model with the machine learning technique. Experimental results on the benchmark databases show that the proposed model generates image quality prediction well correlated with the human visual perception and delivers highly competitive performance with the typical and representative methods. The proposed scheme can be further applied to the real-world applications on video broadcasting and 3D multimedia industry.
15

Cadieu, Charles F., and Bruno A. Olshausen. "Learning Intermediate-Level Representations of Form and Motion from Natural Movies." Neural Computation 24, no. 4 (April 2012): 827–66. http://dx.doi.org/10.1162/neco_a_00247.

Abstract:
We present a model of intermediate-level visual representation that is based on learning invariances from movies of the natural environment. The model is composed of two stages of processing: an early feature representation layer and a second layer in which invariances are explicitly represented. Invariances are learned as the result of factoring apart the temporally stable and dynamic components embedded in the early feature representation. The structure contained in these components is made explicit in the activities of second-layer units that capture invariances in both form and motion. When trained on natural movies, the first layer produces a factorization, or separation, of image content into a temporally persistent part representing local edge structure and a dynamic part representing local motion structure, consistent with known response properties in early visual cortex (area V1). This factorization linearizes statistical dependencies among the first-layer units, making them learnable by the second layer. The second-layer units are split into two populations according to the factorization in the first layer. The form-selective units receive their input from the temporally persistent part (local edge structure) and after training result in a diverse set of higher-order shape features consisting of extended contours, multiscale edges, textures, and texture boundaries. The motion-selective units receive their input from the dynamic part (local motion structure) and after training result in a representation of image translation over different spatial scales and directions, in addition to more complex deformations. These representations provide a rich description of dynamic natural images and testable hypotheses regarding intermediate-level representation in visual cortex.
16

Thiedmann, Ralf, Henrik Hassfeld, Ole Stenzel, L. Jan Anton Koster, Stefan D. Oosterhout, Svetlana S. Van Bavel, Martijn M. Wienk, Joachim Loos, Rene A. J. Janssen, and Volker Schmidt. "A MULTISCALE APPROACH TO THE REPRESENTATION OF 3D IMAGES, WITH APPLICATION TO POLYMER SOLAR CELLS." Image Analysis & Stereology 30, no. 1 (March 1, 2011): 19. http://dx.doi.org/10.5566/ias.v30.p19-30.

Abstract:
A multiscale approach to the description of geometrically complex 3D image data is proposed which distinguishes between morphological features on a ‘macro-scale’ and a ‘micro-scale’. Since our method is mainly tailored to nanostructures observed in composite materials consisting of two different phases, an appropriate binarization of grayscale images is required first. Then, a morphological smoothing is applied to extract the structural information from binarized image data on the ‘macro-scale’. A stochastic algorithm is developed for the morphologically smoothed images whose goal is to find a suitable representation of the macro-scale structure by unions of overlapping spheres. Such representations can be interpreted as marked point patterns. They lead to an enormous reduction of data and allow the application of well-known tools from point-process theory for their analysis and structural modeling. All those voxels which have been ‘misspecified’ by the morphological smoothing and subsequent representation by unions of overlapping spheres are interpreted as ‘micro-scale’ structure. The exemplary data sets considered in this paper are 3D grayscale images of photoactive layers in hybrid solar cells gained by electron tomography. These composite materials consist of two phases: a polymer phase and a zinc oxide phase. The macro-scale structure of the latter is represented by unions of overlapping spheres.
17

Carlson, Eric S. "Representation and Structure Conflict in the Digital Age." Advances in Archaeological Practice 2, no. 4 (November 2014): 269–84. http://dx.doi.org/10.7183/2326-3768.2.4.269.

Abstract:
Digital imaging technologies have enhanced archaeological research and profoundly expanded the scale of the discipline’s potentialities. As illustrators and archaeologists move away from using hand-drawn images (of hand-held, real-life objects) to depict artifacts and other archaeological information, certain capabilities of the traditional illustrative process are lost. One such loss is the ability to present a complete and informed representation of an artifact free of the distortions and visual limitations that single-perspective (i.e., digital or photographic) imagery produces. This is accomplished by the illustrator through the unification of multiple views of the artifact from various perspectives into a single two-dimensional image that communicates to the viewer important attributes of the artifact, free of distortion and remaining true to the measured, analytical conventions of the illustrative process. Liberation from the single-viewpoint perspective was one of the fundamental elements of the Cubist movement. Traditional archaeological illustrators utilize Cubist principles to communicate visually to the viewer a complete, accurate, and undistorted package of information about an artifact. The supplanting of hand-drawn illustrations by digital images in today’s archaeological publications threatens to revert the visual representation of data back to uninformed, surficial “snapshots” of incomplete objects.
18

Dhaya, R. "Comparative Analysis of an Efficient Image Denoising Method for Wireless Multimedia Sensor Network Images in Transform Domain." Journal of Soft Computing Paradigm 3, no. 3 (September 23, 2021): 218–33. http://dx.doi.org/10.36548/jscp.2021.3.007.

Abstract:
In recent years, there has been an increasing research interest in image de-noising due to an emphasis on sparse representation. When sparse representation theory is compared to transform domain-based image de-noising, the former indicates that the images have more information. It contains structural characteristics that are quite similar to the structure of dictionary-based atoms. This structure and the dictionary-based method is highly unsuccessful. However, image representation assumes that the noise lacks such a feature. The dual-tree complex wavelet transform incorporates an increase in transform data density to reduce the effects of sparse data. This technique has been developed to decrease the image noise by selecting the best-predicted threshold value derived from wavelet coefficients. For our experiment, Discrete Cosine Transform (DCT) and Complex Wavelet Transform (CWT) are used to examine how the suggested technique compares with the conventional DCT and CWT on sets of realistic images. As for image quality measures, DT-CWT has leveraged superior results. In terms of processing time, DT-CWT gave better results with a wider PSNR range. Further, the proposed model is tested with a standard digital image named Lena and multimedia sensor images for the denoising algorithm. The suggested denoising technique has delivered minimal effect on the MSE value.
19

Zhu, Fuzhen, Yue Liu, Xin Huang, and Haitao Zhu. "Remote Sensing Image Super-resolution Based on Sparse Representation." MATEC Web of Conferences 232 (2018): 02037. http://dx.doi.org/10.1051/matecconf/201823202037.

Abstract:
In order to obtain higher resolution remote sensing images with more details, an improved sparse representation remote sensing image super-resolution reconstruction (SRR) algorithm is proposed. First, the remote sensing image is preprocessed to obtain the required training sample image; then, the KSVD algorithm is used for dictionary training to obtain the high-low resolution dictionary pairs; finally, the image feature extraction block is represented, which is improved by using an adaptive filtering method. At the same time, the mean value filtering method is used to improve the super-resolution reconstruction iterative calculation. Experiment results show that, compared with the most advanced sparse representation super-resolution algorithm, the improved sparse representation super-resolution method can effectively avoid the loss of edge information of the SRR image and obtain a better super-resolution reconstruction effect. The texture details are more abundant in subjective vision, the PSNR is increased by about 1 dB, and the structure similarity (SSIM) is increased by about 0.01.
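Illustrative note: the PSNR figure quoted in this abstract follows the standard definition PSNR = 10 · log10(peak² / MSE). The sketch below computes it for two small grayscale arrays, assuming 8-bit images (peak value 255). The function names and sample pixel values are hypothetical and are not taken from the paper.

```python
import math


def mse(reference, estimate):
    """Mean squared error between two equally sized 2-D images."""
    squared = [
        (a - b) ** 2
        for row_ref, row_est in zip(reference, estimate)
        for a, b in zip(row_ref, row_est)
    ]
    return sum(squared) / len(squared)


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    error = mse(reference, estimate)
    if error == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / error)


reference = [[100, 100], [100, 100]]
estimate = [[110, 90], [100, 100]]  # made-up reconstruction with two wrong pixels
score = psnr(reference, estimate)   # MSE is 50 for this pair
```

Since PSNR is logarithmic in the MSE, a gain of 1 dB corresponds to dividing the MSE by 10^0.1, or roughly a 21% reduction.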
20

Yuan, Xiaobin, Jingping Zhu, and Xiaobin Li. "Blur Kernel Estimation by Structure Sparse Prior." Applied Sciences 10, no. 2 (January 16, 2020): 657. http://dx.doi.org/10.3390/app10020657.

Abstract:
Blind image deblurring tries to recover a sharp version from a blurred image, where the blur kernel is usually unknown. Recently, sparse representation has been successfully applied to estimate the blur kernel. However, the sparse representation has not considered the structure relationships among original pixels. In this paper, a blur kernel estimation method is proposed by introducing the locality constraint into the sparse representation framework. Both the sparsity regularization and the locality constraint are incorporated to exploit the structure relationships among pixels. The proposed method was evaluated on a real-world benchmark dataset. Experimental results demonstrate that the proposed method achieves comparable performance to the state-of-the-art methods.
21

Hecht, Helge, Mhd Hasan Sarhan, and Vlad Popovici. "Disentangled Autoencoder for Cross-Stain Feature Extraction in Pathology Image Analysis." Applied Sciences 10, no. 18 (September 15, 2020): 6427. http://dx.doi.org/10.3390/app10186427.

Abstract:
A novel deep autoencoder architecture is proposed for the analysis of histopathology images. Its purpose is to produce a disentangled latent representation in which the structure and colour information are confined to different subspaces so that stain-independent models may be learned. For this, we introduce two constraints on the representation which are implemented as a classifier and an adversarial discriminator. We show how they can be used for learning a latent representation across haematoxylin-eosin and a number of immune stains. Finally, we demonstrate the utility of the proposed representation in the context of matching image patches for registration applications and for learning a bag of visual words for whole slide image summarization.
22

Fu, Lingli, Chao Ren, Xiaohai He, Xiaohong Wu, and Zhengyong Wang. "Single Remote Sensing Image Super-Resolution with an Adaptive Joint Constraint Model." Sensors 20, no. 5 (February 26, 2020): 1276. http://dx.doi.org/10.3390/s20051276.

Abstract:
Remote sensing images have been widely used in many applications. However, the resolution of the obtained remote sensing images may not meet the increasing demands for some applications. In general, the sparse representation-based super-resolution (SR) method is one of the most popular methods to solve this issue. However, traditional sparse representation SR methods do not fully exploit the complementary constraints of images. Therefore, they cannot accurately reconstruct the unknown HR images. To address this issue, we propose a novel adaptive joint constraint (AJC) based on sparse representation for the single remote sensing image SR. First, we construct a nonlocal constraint by using the nonlocal self-similarity. Second, we propose a local structure filter according to the local gradient of the image and then construct a local constraint. Next, the nonlocal and local constraints are introduced into the sparse representation-based SR framework. Finally, the parameters of the joint constraint model are selected adaptively according to the level of image noise. We utilize the alternate iteration algorithm to tackle the minimization problem in AJC. Experimental results show that the proposed method achieves good SR performance in preserving image details and significantly improves the objective evaluation indices.
23

Aghbari, Zaher Al. "Effective Image Mining by Representing Color Histograms as Time Series." Journal of Advanced Computational Intelligence and Intelligent Informatics 13, no. 2 (March 20, 2009): 109–14. http://dx.doi.org/10.20965/jaciii.2009.p0109.

Abstract:
Due to the wide spread of digital libraries and digital cameras, and the increased access to the WWW by individuals, the number of digital images that exist poses a great challenge. Easy access to such collections requires an index structure to facilitate random access to individual images and ease navigation of these images. As these images are not annotated or associated with descriptions, existing systems represent the images by their extracted low-level features. In this paper, we demonstrate two image mining tasks, namely image classification and image clustering, which are preliminary steps in facilitating indexing and navigation. These tasks are based on the extraction of color distributions of images, which are then represented as time series. To make the representation more effective and efficient for the data mining tasks, we have chosen to represent the time series by a new representation called SAX (Symbolic Aggregate approXimation) [14]. The SAX-based representation is very effective because it reduces the dimensionality and lower-bounds the distance measure. Our experiments demonstrate the feasibility of the approach.
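As an illustration of the SAX idea named in this abstract, the following is a minimal sketch, not the author's implementation. It assumes a four-letter alphabet, a series length divisible by the number of segments, and a made-up input standing in for a color histogram read as a time series; the function name `sax` is hypothetical.

```python
from statistics import mean, pstdev

# Breakpoints that cut a standard normal distribution into 4 equiprobable
# regions (its quartiles). A 4-letter alphabet is an assumption of this sketch.
BREAKPOINTS = [-0.6745, 0.0, 0.6745]
ALPHABET = "abcd"


def sax(series, n_segments):
    """Z-normalise the series, average it per segment (PAA), map means to letters."""
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        normed = [0.0 for _ in series]  # a flat series has no shape to encode
    else:
        normed = [(x - mu) / sigma for x in series]

    seg_len = len(normed) // n_segments  # assumes len(series) % n_segments == 0
    word = []
    for i in range(n_segments):
        segment = normed[i * seg_len:(i + 1) * seg_len]
        segment_mean = mean(segment)
        # The symbol index is the number of breakpoints lying below the mean.
        index = sum(1 for b in BREAKPOINTS if segment_mean > b)
        word.append(ALPHABET[index])
    return "".join(word)


# Hypothetical 8-bin histogram treated as a time series.
histogram = [1, 1, 1, 1, 10, 10, 10, 10]
print(sax(histogram, 2))
```

The resulting word is much shorter than the original series, which is the dimensionality reduction the abstract refers to; the lower-bounding distance between SAX words is not shown here.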
24

Liao, Liang, Jing Xiao, Yating Li, Mi Wang, and Ruimin Hu. "Learned Representation of Satellite Image Series for Data Compression." Remote Sensing 12, no. 3 (February 4, 2020): 497. http://dx.doi.org/10.3390/rs12030497.

Abstract:
Real-time transmission of satellite video data is one of the fundamentals in the applications of video satellite. Making use of the historical information to eliminate the long-term background redundancy (LBR) is considered to be a crucial way to bridge the gap between the compressed data rate and the bandwidth between the satellite and the Earth. The main challenge lies in how to deal with the variant image pixel values caused by the change of shooting conditions while keeping the structure of the same landscape unchanged. In this paper, we propose a representation learning based method to model the complex evolution of the landscape appearance under different conditions by making use of the historical image series. Under this representation model, the image is disentangled into the content part and the style part. The former represents the consistent landscape structure, while the latter represents the conditional parameters of the environment. To utilize the knowledge learned from the historical image series, we generate synthetic reference frames for the compression of video frames through image translation by the representation model. The synthetic reference frames can highly boost the compression efficiency by changing the original intra-frame prediction to inter-frame prediction for the intra-coded picture (I frame). Experimental results show that the proposed representation learning-based compression method can save an average of 44.22% bits over HEVC, which is significantly higher than that using references generated under the same conditions. Bitrate savings reached 18.07% when applied to satellite video data with arbitrarily collected reference images.
25

Wen, Kui, Zhaojian Zhang, Xinpeng Jiang, Jie He, and Junbo Yang. "Image representation of structure color based on edge detection algorithm." Results in Physics 19 (December 2020): 103441. http://dx.doi.org/10.1016/j.rinp.2020.103441.

26

Tao, Gao, Xiangmo Zhao, Ting Chen, Zhanwen Liu, and Si Li. "Image feature representation with orthogonal symmetric local weber graph structure." Neurocomputing 240 (May 2017): 70–83. http://dx.doi.org/10.1016/j.neucom.2017.02.047.

27

Jiang, Bo, Jin Tang, Aihua Zheng, and Bin Luo. "Image representation and matching with geometric-edge random structure graph." Pattern Recognition Letters 87 (February 2017): 20–28. http://dx.doi.org/10.1016/j.patrec.2016.07.007.

28

Movshon, J. Anthony, and Eero P. Simoncelli. "Representation of Naturalistic Image Structure in the Primate Visual Cortex." Cold Spring Harbor Symposia on Quantitative Biology 79 (2014): 115–22. http://dx.doi.org/10.1101/sqb.2014.79.024844.

29

Ahuja, Narendra. "On detection and representation of multiscale low-level image structure." ACM Computing Surveys 27, no. 3 (September 1995): 304–6. http://dx.doi.org/10.1145/212094.212099.

30

Qian, Jiansheng, Dong Wu, Leida Li, Deqiang Cheng, and Xuesong Wang. "Image quality assessment based on multi-scale representation of structure." Digital Signal Processing 33 (October 2014): 125–33. http://dx.doi.org/10.1016/j.dsp.2014.06.009.

31

Liu, Guichi, Lei Gao, and Lin Qi. "Hyperspectral Image Classification via Multi-Feature-Based Correlation Adaptive Representation." Remote Sensing 13, no. 7 (March 25, 2021): 1253. http://dx.doi.org/10.3390/rs13071253.

Abstract:
In recent years, representation-based methods have attracted increasing attention in hyperspectral image (HSI) classification. Among them, the sparse representation-based classifier (SRC) and the collaborative representation-based classifier (CRC) are the two representative methods. However, SRC focuses only on sparsity and ignores data correlation information, while CRC encourages grouping correlated variables together but lacks the ability of variable selection. As a result, SRC and CRC are incapable of producing satisfactory performance. To address these issues, in this work, a correlation adaptive representation (CAR) is proposed, enabling a CAR-based classifier (CARC). Specifically, the proposed CARC is able to explore sparsity and data correlation information jointly, generating a novel representation model that is adaptive to the structure of the dictionary. To further exploit the correlation between the test samples and the training samples effectively, a distance-weighted Tikhonov regularization is integrated into the proposed CARC. Furthermore, to handle the small training sample problem in HSI classification, a multi-feature correlation adaptive representation-based classifier (MFCARC) and MFCARC with Tikhonov regularization (MFCART) are presented to improve the classification performance by exploring the complementary information across multiple features. The experimental results show the superiority of the proposed methods over state-of-the-art algorithms.
32

Nawaz Jadoon, Rab, Waqas Jadoon, Ahmad Khan, Zia ur Rehman, Sajid Shah, Iftikhar Ahmed Khan, and WuYang Zhou. "Linear Discriminative Learning for Image Classification." Mathematical Problems in Engineering 2019 (October 20, 2019): 1–12. http://dx.doi.org/10.1155/2019/4760614.

Abstract:
In this paper, we propose a linear discriminative learning model called adaptive locality-based weighted collaborative representation (ALWCR) that formulates the image classification task as an optimization problem to reduce the reconstruction error between the query sample and its computed linear representation. The optimal linear representation for a query image is obtained by using the weighted regularized linear regression approach which incorporates intrinsic locality structure and feature variance between data into representation. The resultant representation increases the discrimination ability for correct classification. The proposed ALWCR method can be considered an extension of the collaborative representation- (CR-) based classification approach which is an alternative to the sparse representation- (SR-) based classification method. ALWCR improved the discriminant ability for classification as compared with CR original formulation and overcomes the limitations that arose due to a small training sample size and low feature dimension. Experimental results obtained using various feature dimensions on well-known publicly available face and digit datasets have verified the competitiveness of the proposed method against competing image classification methods.
33

Wallace, Luke, Bryan Hally, Samuel Hillman, Simon D. Jones, and Karin Reinke. "Terrestrial Image-Based Point Clouds for Mapping Near-Ground Vegetation Structure: Potential and Limitations." Fire 3, no. 4 (October 19, 2020): 59. http://dx.doi.org/10.3390/fire3040059.

Abstract:
Site-specific information concerning fuel hazard characteristics is needed to support wildfire management interventions and fuel hazard reduction programs. Currently, routine visual assessments provide subjective information, with the resulting estimate of fuel hazard varying due to observer experience and the rigor applied in making assessments. Terrestrial remote sensing techniques have been demonstrated to be capable of capturing quantitative information on the spatial distribution of biomass to inform fuel hazard assessments. This paper explores the use of image-based point clouds generated from imagery captured using a low-cost compact camera for describing the fuel hazard within the surface and near-surface layers. Terrestrial imagery was obtained at three distances for five target plots. Subsets of these images were then processed to determine the effect of varying overlap and distribution of image captures. The majority of the point clouds produced using this image-based technique provide an accurate representation of the 3D structure of the surface and near-surface fuels. Results indicate that high image overlap and pixel size are critical; multi-angle image capture is shown to be crucial in providing a representation of the vertical stratification of fuel. Terrestrial image-based point clouds represent a viable technique for low cost and rapid assessment of fuel structure.
34

He, Zhouyan, Yang Song, Caiming Zhong, and Li Li. "Curvature and Entropy Statistics-Based Blind Multi-Exposure Fusion Image Quality Assessment." Symmetry 13, no. 8 (August 6, 2021): 1446. http://dx.doi.org/10.3390/sym13081446.

Abstract:
The multi-exposure fusion (MEF) technique provides humans a new opportunity for natural scene representation, and the related quality assessment issues urgently need to be considered to validate the effectiveness of these techniques. In this paper, a curvature and entropy statistics-based blind MEF image quality assessment (CE-BMIQA) method is proposed to perceive the quality degradation objectively. The transformation process from multiple images with different exposure levels to the final MEF image leads to the loss of structure and detail information, so the related curvature statistics features and entropy statistics features are utilized to portray this distortion. The former features are extracted from the histogram statistics of the surface type map calculated from the mean curvature and Gaussian curvature of the MEF image. Moreover, contrast energy weighting is attached to consider the contrast variation of the MEF image. The latter features refer to spatial entropy and spectral entropy. All extracted features based on a multi-scale scheme are aggregated by training the quality regression model via random forest. Since the MEF image and its feature representation are spatially symmetric in physics, the final prediction quality is symmetric to and representative of the image distortion. Experimental results on a public MEF image database demonstrate that the proposed CE-BMIQA method outperforms state-of-the-art blind image quality assessment methods.
35

Cheng, Xi, Xiang Li, and Jian Yang. "Triple-Attention Mixed-Link Network for Single-Image Super-Resolution." Applied Sciences 9, no. 15 (July 25, 2019): 2992. http://dx.doi.org/10.3390/app9152992.

Abstract:
Single-image super-resolution is of great importance as a low-level computer-vision task. Recent approaches with deep convolutional neural networks have achieved impressive performance. However, existing architectures are limited by their less sophisticated structure and weaker representational power. In this work, to significantly enhance the feature representation, we propose a triple-attention mixed-link network (TAN), which consists of (1) three different aspects (i.e., kernel, spatial, and channel) of attention mechanisms and (2) a fusion of both powerful residual and dense connections (i.e., mixed link). Specifically, the network with multi-kernel learns multi-hierarchical representations under different receptive fields. The features are recalibrated by the effective kernel and channel attention, which filters the information and enables the network to learn more powerful representations. The features finally pass through the spatial attention in the reconstruction network, which generates a fusion of local and global information, lets the network restore more details, and improves the reconstruction quality. The proposed network structure reduces the parameter growth rate by 50% compared with previous approaches. The three attention mechanisms provide 0.49 dB, 0.58 dB, and 0.32 dB performance gain when evaluating on Set5, Set14, and BSD100. Thanks to the diverse feature recalibrations and the advanced information flow topology, our proposed model is strong enough to compete with the state-of-the-art methods on the benchmark evaluations.
36

Peng, Yong, Wanzeng Kong, Feiwei Qin, and Feiping Nie. "Manifold Adaptive Kernelized Low-Rank Representation for Semisupervised Image Classification." Complexity 2018 (2018): 1–11. http://dx.doi.org/10.1155/2018/2857594.

Abstract:
Constructing a powerful graph that can effectively depict the intrinsic connection of data points is the critical step in making graph-based semisupervised learning algorithms achieve promising performance. Among popular graph construction algorithms, low-rank representation (LRR) is a very competitive one that can simultaneously explore the global structure of data and recover the data from noisy environments. Therefore, the learned low-rank coefficient matrix in LRR can be used to construct the data affinity matrix. Considering that (1) the essentially linear property of LRR makes it inappropriate for processing the possible nonlinear structure of data and (2) learning performance can be greatly enhanced by exploring the structure information of data, we propose a new manifold kernelized low-rank representation (MKLRR) model that can perform LRR in the data manifold adaptive kernel space. Specifically, the manifold structure can be incorporated into the kernel space by using the graph Laplacian, and thus the underlying geometry of data is reflected by the warped kernel space. Experimental results of semisupervised image classification tasks show the effectiveness of MKLRR. For example, MKLRR can, respectively, obtain 96.13%, 98.09%, and 96.08% accuracies on ORL, Extended Yale B, and PIE data sets when given 5, 20, and 20 labeled face images per subject.
37

Wang, Qi, Xiang He, and Xuelong Li. "Locality and Structure Regularized Low Rank Representation for Hyperspectral Image Classification." IEEE Transactions on Geoscience and Remote Sensing 57, no. 2 (February 2019): 911–23. http://dx.doi.org/10.1109/tgrs.2018.2862899.

38

Christiansen, Mads-Peter V., Henrik Lund Mortensen, Søren Ager Meldgaard, and Bjørk Hammer. "Gaussian representation for image recognition and reinforcement learning of atomistic structure." Journal of Chemical Physics 153, no. 4 (July 28, 2020): 044107. http://dx.doi.org/10.1063/5.0015571.

39

Guo, Qin Zhen, Zhi Zeng, Shu Wu Zhang, Xiao Feng, and Hu Guan. "Simhash for Large Scale Image Retrieval." Applied Mechanics and Materials 651-653 (September 2014): 2197–200. http://dx.doi.org/10.4028/www.scientific.net/amm.651-653.2197.

Abstract:
Due to its fast query speed and reduced storage cost, hashing, which tries to learn a binary code representation for data with the expectation of preserving the neighborhood structure in the original data space, has been widely used in a large variety of applications such as image retrieval. Most existing image retrieval methods with hashing have two main steps: describe images with feature vectors, and then use hashing methods to encode the feature vectors. In this paper, we make two research contributions. First, we propose to use simhash, which can be intrinsically combined with the popular image representation method Bag-of-visual-words (BoW), for image retrieval. Second, we incorporate "locality-sensitive" hashing into simhash to take the correlation of the visual words of BoW into consideration, so that similar visual words have similar fingerprints. Extensive experiments have verified the superiority of our method over some state-of-the-art methods for the image retrieval task.
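To make the simhash step concrete, here is a minimal sketch of the standard simhash fingerprint applied to a bag of visual words. It is not the authors' method: the locality-sensitive extension described in the abstract is omitted, MD5 is an arbitrary choice of per-word hash, and the visual-word names and counts are invented for illustration.

```python
import hashlib


def simhash(weighted_features, bits=64):
    """Standard simhash: weighted bitwise vote over per-feature hashes.

    `weighted_features` maps a feature name (here, a visual word id) to its
    weight (here, its count in the image). `bits` must be a multiple of 8 and
    at most 128, because the per-feature hash is taken from an MD5 digest.
    """
    totals = [0.0] * bits
    for feature, weight in weighted_features.items():
        digest = hashlib.md5(feature.encode("utf-8")).digest()
        h = int.from_bytes(digest[:bits // 8], "big")
        for i in range(bits):
            # Each bit of the feature hash votes +weight (1) or -weight (0).
            if (h >> i) & 1:
                totals[i] += weight
            else:
                totals[i] -= weight
    fingerprint = 0
    for i, total in enumerate(totals):
        if total > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming(a, b):
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


# Hypothetical bag-of-visual-words histograms for two images.
image_a = {"word_17": 5, "word_42": 3, "word_99": 1}
image_b = {"word_17": 4, "word_42": 3, "word_7": 1}
print(hamming(simhash(image_a), simhash(image_b)))
```

Images whose histograms share heavily weighted visual words tend to produce fingerprints with a small Hamming distance, which is what makes the fingerprints usable for retrieval.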
40

Bilquees, Samina, Hassan Dawood, Hussain Dawood, Nadeem Majeed, Ali Javed, and Muhammad Tariq Mahmood. "Noise Resilient Local Gradient Orientation for Content-Based Image Retrieval." International Journal of Optics 2021 (July 14, 2021): 1–19. http://dx.doi.org/10.1155/2021/4151482.

Abstract:
In a world of multimedia information, where users seek accurate results against search query and demand relevant multimedia content retrieval, developing an accurate content-based image retrieval (CBIR) system is difficult due to the presence of noise in the image. The performance of the CBIR system is impaired by this noise. To estimate the distance between the query and database images, CBIR systems use image feature representation. The noise or artifacts present within the visual data might confuse the CBIR when retrieving relevant results. Therefore, we propose Noise Resilient Local Gradient Orientation (NRLGO) feature representation that overcomes the noise factor within the visual information and strengthens the CBIR to retrieve accurate and relevant results. The proposed NRLGO consists of three steps: estimation and removal of noise to protect the local visual structure; extraction of color, texture, and local contrast features; and, at the end, generation of microstructure for visual representation. The Manhattan distance between the query image and the database image is used to measure their similarity. The proposed technique was tested using the Corel dataset, which contains 10000 images from 100 different categories. The outcomes of the experiment signify that the proposed NRLGO has higher retrieval performance in comparison with state-of-the-art techniques.
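The similarity step in this abstract, ranking database images by Manhattan (L1) distance to the query, can be sketched as follows. The feature vectors and image names are made up, and the feature extraction that NRLGO performs is not reproduced here.

```python
def manhattan(u, v):
    """Manhattan (L1) distance between two equal-length feature vectors."""
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    return sum(abs(a - b) for a, b in zip(u, v))


def rank_by_distance(query, database):
    """Return database keys ordered from most to least similar to the query."""
    return sorted(database, key=lambda name: manhattan(query, database[name]))


# Hypothetical 3-dimensional feature vectors; real descriptors are much longer.
database = {
    "img_a": [0, 0, 0],
    "img_b": [1, 1, 1],
    "img_c": [5, 5, 5],
}
print(rank_by_distance([1, 1, 0], database))
```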
41

ZHAO, YU, and YAN QIU CHEN. "CONNECTED EQUI-LENGTH LINE SEGMENTS FOR CURVE AND STRUCTURE MATCHING." International Journal of Pattern Recognition and Artificial Intelligence 18, no. 06 (September 2004): 1019–37. http://dx.doi.org/10.1142/s0218001404003563.

Abstract:
This paper deals with the problem of matching curves and structures extracted from 2D images that are subject to translation, rotation, scaling and other geometric transformations. We present in this paper a novel approach, Connected Equi-Length Line Segments (CELLS), for curve representation and matching. In our framework, a curve is represented by a number of connected equi-length line segments and a new matrix called Orientation Difference Matrix (ODM) is constructed for the curve, which reflects the distribution of the rest of the line segments with respect to the current one using orientation differences between them. The representation is invariant to rotation, scaling and translation. The problem of structure matching is also considered in this paper and is solved based on CELLS. The matching of structures is performed by (1) detecting tri-junctions and quad-junctions on the structures, (2) representing each arch using CELLS. A practical use of the proposed approach is demonstrated by registering a SAR image of a certain area to a map.
42

Fang, Jing, Shaohai Hu, and Xiaole Ma. "A Boosting SAR Image Despeckling Method Based on Non-Local Weighted Group Low-Rank Representation." Sensors 18, no. 10 (October 13, 2018): 3448. http://dx.doi.org/10.3390/s18103448.

Abstract:
In this paper, we propose a boosting synthetic aperture radar (SAR) image despeckling method based on non-local weighted group low-rank representation (WGLRR). The spatial structure information of SAR images leads to the similarity of the patches. Furthermore, the data matrix grouped by the similar patches within the noise-free SAR image is often low-rank. Based on this, we use low-rank representation (LRR) to recover the noise-free group data matrix. To maintain the fidelity of the recovered image, we integrate the corrupted probability of each pixel into the group LRR model as a weight to constrain the fidelity of recovered noise-free patches. Each single patch might belong to several groups, so different estimations of each patch are aggregated with a weighted averaging procedure. The residual image contains signal leftovers due to the imperfect denoising, so we strengthen the signal by leveraging on the availability of the denoised image to suppress noise further. Experimental results on simulated and actual SAR images show the superior performance of the proposed method in terms of objective indicators and of perceived image quality.
43

Vida, Mark D., Adrian Nestor, David C. Plaut, and Marlene Behrmann. "Spatiotemporal dynamics of similarity-based neural representations of facial identity." Proceedings of the National Academy of Sciences 114, no. 2 (December 27, 2016): 388–93. http://dx.doi.org/10.1073/pnas.1614763114.

Abstract:
Humans’ remarkable ability to quickly and accurately discriminate among thousands of highly similar complex objects demands rapid and precise neural computations. To elucidate the process by which this is achieved, we used magnetoencephalography to measure spatiotemporal patterns of neural activity with high temporal resolution during visual discrimination among a large and carefully controlled set of faces. We also compared these neural data to lower level “image-based” and higher level “identity-based” model-based representations of our stimuli and to behavioral similarity judgments of our stimuli. Between ∼50 and 400 ms after stimulus onset, face-selective sources in right lateral occipital cortex and right fusiform gyrus and sources in a control region (left V1) yielded successful classification of facial identity. In all regions, early responses were more similar to the image-based representation than to the identity-based representation. In the face-selective regions only, responses were more similar to the identity-based representation at several time points after 200 ms. Behavioral responses were more similar to the identity-based representation than to the image-based representation, and their structure was predicted by responses in the face-selective regions. These results provide a temporally precise description of the transformation from low- to high-level representations of facial identity in human face-selective cortex and demonstrate that face-selective cortical regions represent multiple distinct types of information about face identity at different times over the first 500 ms after stimulus onset. These results have important implications for understanding the rapid emergence of fine-grained, high-level representations of object identity, a computation essential to human visual expertise.
44

Bouarara, Hadj Ahmed, and Yasmin Bouarara. "Swarm Intelligence Methods for Unsupervised Images Classification." International Journal of Organizational and Collective Intelligence 6, no. 2 (April 2016): 50–74. http://dx.doi.org/10.4018/ijoci.2016040104.

Abstract:
Google estimates that there are now more than 1000 billion images on the internet, and the classification of this type of data represents a big problem for the scientific community. Several techniques belonging to the world of image mining have been proposed. The substance of our work is the application of swarm intelligence methods to the unsupervised image classification (UIC) problem, following four steps: image digitalization, by developing a new representation approach that transforms each image into a set of terms (sets of pixels); and image clustering using three methods: first, a distances combination by social worker bees (DC-SWBs), based on the principle of filtering, where each image must successfully pass three filters; second, the artificial social spiders (ASS) method, based on the silky structure and the principle of weaving; and third, a method called artificial immune system (AIS). For their experiments, the authors use the MuHavi benchmark, changing the configuration (image representation, distance measures, and threshold) for each test.
45

Tong, Zhe, Wei Li, Fan Jiang, Zhencai Zhu, and Gongbo Zhou. "Bearing fault diagnosis based on spectrum image sparse representation of vibration signal." Advances in Mechanical Engineering 10, no. 9 (September 2018): 168781401879778. http://dx.doi.org/10.1177/1687814018797788.

Abstract:
Bearings are crucial for industrial production and susceptible to malfunction in rotating machines. Image analysis can give a comprehensive description of a vibration signal, and it has therefore attracted much more attention recently in the fault diagnosis field. However, a single spectrum image matrix carries a lot of redundant information alongside rich fault information, and massive numbers of spectrum image samples exacerbate this situation, which readily reduces accuracy when diagnosing bearings with multiple local defects. To solve this issue, a novel feature extraction method based on image sparse representation is proposed. Original spectrum images are acquired through the fast Fourier transform. The sparse coefficient that reveals the underlying structure of the spectrum image based on raw signals is extracted as the feature by implementing the orthogonal matching pursuit and K-singular value decomposition algorithms strategically, and then two-dimensional principal component analysis is applied for further processing of these features. Finally, fault types are identified based on a minimum distance strategy. Experimental results are given to demonstrate the effectiveness of the proposed method.
46

Lin, Tiffany Ying-Yu, and I.-Hsuan Chen. "How Semantics is Embodied through Visual Representation: Image Schemas in the Art of Chinese Calligraphy." Annual Meeting of the Berkeley Linguistics Society 38 (September 25, 2012): 328. http://dx.doi.org/10.3765/bls.v38i0.3338.

Abstract:
This study aims to investigate abstract reasoning and embodied cognition through the analysis of image schemas and conceptual metaphors in the interplay of art and language. Chinese calligraphy is noteworthy due to its unique embodied characteristics and image-schematic representations of visual art and language. The art of Chinese calligraphy not only represents the visual forms of Chinese characters but also conveys meanings, emotion, and style, demonstrating the aesthetics of language and art. By analyzing image schemas and metaphors in classical works of art, this paper shows how semantics is conceptualized and embodied through visual representation of Chinese calligraphy. In this study, we examine how semantics is visualized within the topological structure of cognitive mechanisms of a CONTAINER schema, the crucial image schema that structures the conceptualization of spatial relation concepts. This paper proposes that the CONTAINER schema, the BALANCE schema, the FORCE schema, as well as the metaphors SIGNIFICANCE IS SIZE and MIND IS A BODY, which may motivate the calligrapher’s creative process, underlie the art of Chinese calligraphy.
47

Ma, Changxia, Heng Zhang, and Bing Keong Li. "Shadow Separation of Pavement Images Based on Morphological Component Analysis." Journal of Control Science and Engineering 2021 (January 15, 2021): 1–10. http://dx.doi.org/10.1155/2021/8828635.

Abstract:
The shadow of pavement images will affect the accuracy of road crack recognition and increase the rate of error detection. A shadow separation algorithm based on morphological component analysis (MCA) is proposed herein to solve the shadow problem of road imaging. The main assumption of MCA is that the image geometric structure and texture structure components are sparse within a class under a specific base or overcomplete dictionary, while the base or overcomplete dictionaries of each sparse representation of morphological components are incoherent. Thereafter, the corresponding image signal is transformed according to the dictionary to obtain the sparse representation coefficients of each part of the information, and the coefficients are shrunk by soft thresholding to obtain new coefficients. Experimental results show the effectiveness of the shadow separation method proposed in this paper.
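The shrinkage step mentioned in this abstract can be illustrated with the standard soft-thresholding operator, sign(c) * max(|c| - t, 0), applied to each coefficient. This is a minimal sketch of that operator only, not the authors' MCA pipeline; the coefficient values and the threshold are made up.

```python
import math


def soft_threshold(coeffs, t):
    """Shrink each coefficient toward zero by t; values within [-t, t] become zero."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]


# Hypothetical sparse-representation coefficients and threshold.
coefficients = [3.0, -2.0, 0.5, -0.25]
print(soft_threshold(coefficients, 1.0))
```

Small coefficients are zeroed while large ones are kept with reduced magnitude, which favors a sparse set of coefficients for each morphological component.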
48

TANG, XIN, PATRICK S. WANG, and GUOCAN FENG. "A NOVEL SUPERVISED STRUCTURE DICTIONARY LEARNING FOR CLASSIFICATION BASED ON SPARSE REPRESENTATION." International Journal of Pattern Recognition and Artificial Intelligence 26, no. 07 (November 2012): 1255012. http://dx.doi.org/10.1142/s0218001412550129.

Abstract:
Sparse representation based classification has led to interesting image recognition results, and the dictionary used for sparse coding plays a key role in it. This paper presents a novel supervised structure dictionary learning (SSDL) algorithm to learn a discriminative and block-structured dictionary. We associate label information with each dictionary item and make each class-specific sub-dictionary in the whole structured dictionary have good representation ability for the training samples from the associated class. More specifically, we learn a structured dictionary and a multiclass classifier simultaneously. Adding an inhomogeneous representation term to the objective function and considering the independence of the class-specific sub-dictionaries improve the discrimination capabilities of the sparse coordinates. An iterative optimization method is proposed to solve the new formulation. Experimental results on four face databases demonstrate that our algorithm outperforms recently proposed competing sparse coding methods.
49

Lu, Shichen, Ruimin Hu, Jing Liu, Longteng Guo, and Fei Zheng. "Structure Preserving Convolutional Attention for Image Captioning." Applied Sciences 9, no. 14 (July 19, 2019): 2888. http://dx.doi.org/10.3390/app9142888.

Abstract:
In the task of image captioning, learning the attentive image regions is necessary to adaptively and precisely focus on the object semantics relevant to each decoded word. In this paper, we propose a convolutional attention module that can preserve the spatial structure of the image by performing the convolution operation directly on the 2D feature maps. The proposed attention mechanism contains two components: convolutional spatial attention and cross-channel attention, aiming to determine the intended regions to describe the image along the spatial and channel dimensions, respectively. Both of the two attentions are calculated at each decoding step. In order to preserve the spatial structure, instead of operating on the vector representation of each image grid, the two attention components are both computed directly on the entire feature maps with convolution operations. Experiments on two large-scale datasets (MSCOCO and Flickr30K) demonstrate the outstanding performance of our proposed method.
50

MA, MATTHEW Y., JINHONG K. GUO, and PATRICK S. P. WANG. "FROM PIXELS TO TRUE XML STRUCTURES IN DIGITAL DOCUMENT IMAGES." International Journal of Pattern Recognition and Artificial Intelligence 18, no. 06 (September 2004): 1057–69. http://dx.doi.org/10.1142/s0218001404003575.

Abstract:
XML has been widely used as metadata for image retrieval. As a standard, it makes it easier to index and retrieve information across different platforms. However, how to automatically convert an image into XML format remains a challenge. In this paper, a system for generating structured documents in XML from digitally captured document images is presented. The system is aimed at providing an easy-to-use tool for average users without requiring in-depth knowledge of document processing. Further, an XML/XSL generator is developed to accurately represent a document in an XML structure, yet in a representation that reflects its original layout.
