Journal articles on the topic 'Image classification tasks'

Consult the top 50 journal articles for your research on the topic 'Image classification tasks.'


1

Wang, Liangliang, and Deepu Rajan. "An image similarity descriptor for classification tasks." Journal of Visual Communication and Image Representation 71 (August 2020): 102847. http://dx.doi.org/10.1016/j.jvcir.2020.102847.

2

Li, Chuanlong, Xiufen Ye, Jier Xi, and Yunpeng Jia. "A Texture Feature Removal Network for Sonar Image Classification and Detection." Remote Sensing 15, no. 3 (January 20, 2023): 616. http://dx.doi.org/10.3390/rs15030616.

Abstract:
Deep neural networks (DNNs) have been applied to sonar image target recognition tasks, but it is very difficult to obtain enough sonar images that contain a target; as a result, directly training a DNN on a small amount of data will cause overfitting and other problems. Transfer learning is the most effective way to address such scenarios. However, there is a large domain gap between optical images and sonar images, and common transfer learning methods may not be able to handle it effectively. In this paper, we propose a transfer learning method for sonar image classification and object detection called the texture feature removal network. We regard the texture features of an image as domain-specific features and narrow the domain gap by discarding them, which makes knowledge transfer easier. Our method can be easily embedded into other transfer learning methods, making it applicable to different scenarios. Experimental results show that our method is effective in side-scan sonar image classification tasks and forward-looking sonar image detection tasks. For side-scan sonar image classification, the classification accuracy of our method is enhanced by 4.5% in a supervised learning experiment, and for forward-looking sonar detection, the average precision (AP) is also significantly improved.
3

Zhang, Taohong, Suli Fan, Junnan Hu, Xuxu Guo, Qianqian Li, Ying Zhang, and Aziguli Wulamu. "A Feature Fusion Method with Guided Training for Classification Tasks." Computational Intelligence and Neuroscience 2021 (April 14, 2021): 1–11. http://dx.doi.org/10.1155/2021/6647220.

Abstract:
In this paper, a feature fusion method with guided training (FGT-Net) is constructed to fuse image data and numerical data for recognition tasks that cannot be classified accurately from images alone. The proposed structure is divided into a shared-weight network part, a feature-fusion layer part, and a classification layer part. First, a guided training method is proposed to optimize the training process: representative images and training images are fed into the shared-weight network so that it learns to extract image features better. Then, the image features and numerical features are fused together in the feature-fusion layer and passed to the classification layer for the classification task. The loss is calculated from the outputs of both the shared-weight network and the classification layer. Experiments are carried out to verify the effectiveness of the proposed model. The results show that FGT-Net achieves an accuracy of 87.8%, which is 15% higher than the ShuffleNetv2 CNN model (which can process image data only) and 9.8% higher than the DNN method (which processes structured data only).
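The fusion step described in this abstract is straightforward to prototype. Below is a minimal PyTorch sketch of the general idea, concatenating CNN image features with a numerical feature vector before a shared classifier; the `FusionNet` name, backbone, and layer sizes are illustrative assumptions, not the paper's actual architecture.

```python
import torch
import torch.nn as nn

class FusionNet(nn.Module):
    """Illustrative image + numerical feature fusion (sizes are assumptions)."""
    def __init__(self, num_numeric: int, num_classes: int):
        super().__init__()
        # Small CNN backbone standing in for the paper's shared-weight network.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),  # -> (batch, 64)
        )
        # Feature-fusion layer: concatenate image and numerical features.
        self.classifier = nn.Sequential(
            nn.Linear(64 + num_numeric, 128), nn.ReLU(),
            nn.Linear(128, num_classes),
        )

    def forward(self, image, numeric):
        fused = torch.cat([self.backbone(image), numeric], dim=1)
        return self.classifier(fused)

model = FusionNet(num_numeric=10, num_classes=5)
logits = model(torch.randn(4, 3, 64, 64), torch.randn(4, 10))
print(logits.shape)  # torch.Size([4, 5])
```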
4

Tang, Chaohui, Qingxin Zhu, Wenjun Wu, Wenlin Huang, Chaoqun Hong, and Xinzheng Niu. "PLANET: Improved Convolutional Neural Networks with Image Enhancement for Image Classification." Mathematical Problems in Engineering 2020 (March 11, 2020): 1–10. http://dx.doi.org/10.1155/2020/1245924.

Abstract:
In the past few years, deep learning has become a research hotspot and has had a profound impact on computer vision. Deep CNNs have been proven to be the most important and effective models for image processing, but due to the lack of training samples and the huge number of learning parameters, they tend to overfit. In this work, we propose a new two-stage CNN image classification network, named “Improved Convolutional Neural Networks with Image Enhancement for Image Classification” (PLANET for short), which uses a new image data enhancement method called InnerMove to enhance images and augment the number of training samples. InnerMove is inspired by the “object movement” scene in computer vision and can improve the generalization ability of deep CNN models for image classification tasks. Extensive experimental results show that PLANET with InnerMove outperforms the comparative algorithms, and that InnerMove has a more significant effect than the comparative data enhancement methods for image classification tasks.
5

Zhou, Lanfeng, Ziwei Liu, and Wenfeng Wang. "Terrain Classification Algorithm for Lunar Rover Using a Deep Ensemble Network with High-Resolution Features and Interdependencies between Channels." Wireless Communications and Mobile Computing 2020 (October 13, 2020): 1–14. http://dx.doi.org/10.1155/2020/8842227.

Abstract:
For terrain classification tasks, previous methods extracted image features at a single scale or with a single model, used high-to-low-resolution networks, or used networks that modeled no relationships between channels. These choices lead to inadequate features and, consequently, reduced classification accuracy. The samples in terrain classification differ from those in other image classification tasks: the differences between samples are subtler than in other image-level classification tasks, and the colours of the samples are similar. We therefore need to maintain high-resolution features and establish interdependencies between the channels to highlight the image features; networks of this kind can improve classification accuracy. To overcome these challenges, this paper presents a terrain classification algorithm for a lunar rover using a deep ensemble network. We optimize the activation function and the structure of the convolutional neural network so that it better extracts fine image features and infers the terrain category of the image. In particular, this paper makes several contributions: establishing interdependencies between channels to highlight features, and maintaining a high-resolution representation throughout the process to ensure the extraction of fine features. Multi-model collaborative judgment helps compensate for the shortcomings of any single model's design, puts the models in a competitive relationship, and improves accuracy. The overall classification accuracy of this method reaches 91.57% on our dataset, and the accuracy is higher on some terrains.
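The "interdependencies between channels" idea is commonly implemented as channel attention. The following is a minimal squeeze-and-excitation-style block in PyTorch, offered as one standard way to model such interdependencies; it is an assumption for illustration, not the authors' exact design.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation-style channel reweighting (illustrative)."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):
        # Squeeze: global average pool to one value per channel.
        w = x.mean(dim=(2, 3))            # (batch, channels)
        # Excite: learn per-channel weights, then rescale the feature map.
        w = self.fc(w)[:, :, None, None]  # (batch, channels, 1, 1)
        return x * w

feat = torch.randn(2, 64, 32, 32)
print(ChannelAttention(64)(feat).shape)  # torch.Size([2, 64, 32, 32])
```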
6

Melekhin, V. B., and V. M. Khachumov. "Stable descriptors in image recognition tasks." Herald of Dagestan State Technical University. Technical Sciences 47, no. 3 (October 1, 2020): 93–100. http://dx.doi.org/10.21822/2073-6185-2020-47-3-93-100.

Abstract:
Objective. The objective of the study is to determine various stable characteristics of images (semi-invariants and invariants) as descriptors necessary for forming a feature space of standards intended for recognizing images of different natures belonging to different classes of objects. Methods. The authors propose metrics for evaluating the proximity of a recognized image to a given standard in the space of covariance matrices, based on the obtained descriptors, as a methodological basis for constructing image recognition methods. Results. The content of the main stages of selecting descriptors for a given class of objects is developed, taking into account the varying illumination of the recognized images. The effectiveness of the results is confirmed by experimental studies on the recognition of special images (facies). Conclusions. Defining stable image descriptors as invariants or semi-invariants to zoom and brightness transformations makes it possible to solve facies classification problems under unstable imaging conditions; the images can be rotated and shifted arbitrarily. In general, the proposed approach allows the development of an effective image recognition system in the presence of various types of interference in the recognized images.
7

Singh, Ankita, and Pawan Singh. "Image Classification: A Survey." Journal of Informatics Electrical and Electronics Engineering (JIEEE) 1, no. 2 (November 19, 2020): 1–9. http://dx.doi.org/10.54060/jieee/001.02.002.

Abstract:
The classification of images is a paramount topic in artificial vision systems and has drawn a notable amount of interest over the past years. This field aims to classify an input image based on its visual content. Until recently, most approaches relied on hand-crafted features to describe an image in a particular way; learnable classifiers, such as random forests and decision trees, were then applied to the extracted features to reach a final decision. The problem arises when large numbers of images are concerned: finding features for them becomes too difficult. This is one of the reasons the deep neural network model was introduced. Owing to deep learning, it becomes feasible to represent the hierarchical nature of features using multiple layers and the weights associated with them. Existing image classification methods have gradually been applied to real-world problems, but various problems arise in their application, such as unsatisfactory performance, extremely low classification accuracy, and weak adaptive ability. Models based on deep learning have strong learning ability and combine feature extraction and classification into a whole to complete an image classification task, which can improve image classification accuracy effectively. Convolutional neural networks are a powerful deep neural network technique. These networks preserve the spatial structure of a problem and were built for object recognition tasks such as classifying an image into respective classes. Neural networks are widely known because they achieve state-of-the-art results on complex computer vision and natural language processing tasks, and convolutional neural networks have been used extensively.
8

Yan, Yang, Wen Bo Huang, Yun Ji Wang, and Na Li. "Image Labeling Model Based on Conditional Random Fields." Advanced Materials Research 756-759 (September 2013): 3869–73. http://dx.doi.org/10.4028/www.scientific.net/amr.756-759.3869.

Abstract:
We present conditional random fields (CRFs), a framework for building probabilistic models to segment and label sequence data, and use CRFs to label pixels in an image. CRFs provide a discriminative framework for incorporating spatial dependencies in an image, which is more appropriate for classification tasks than a generative framework. In this paper we apply CRFs to an image classification task: an image labeling problem (man-made vs. natural regions in the MSRC 21-object-class dataset). Parameter learning is performed using the contrastive divergence (CD) algorithm to maximize an approximation to the conditional likelihood. We focus on two aspects of the classification task: feature extraction and classifier design. We present classification results on sample images from the MSRC 21-object-class dataset.
9

Elizarov, Artem Aleksandrovich, and Evgenii Viktorovich Razinkov. "Image Classification Using Reinforcement Learning." Russian Digital Libraries Journal 23, no. 6 (May 12, 2020): 1172–91. http://dx.doi.org/10.26907/1562-5419-2020-23-6-1172-1191.

Abstract:
Recently, reinforcement learning has been an actively developing area of machine learning. As a consequence, attempts are being made to use reinforcement learning for solving computer vision problems, in particular the problem of image classification; computer vision tasks are currently among the most pressing tasks in artificial intelligence. The article proposes a method for image classification in the form of a deep neural network using reinforcement learning. The idea of the developed method comes down to solving a contextual multi-armed bandit problem using various strategies for balancing exploration and exploitation together with reinforcement learning algorithms. Strategies such as ε-greedy, ε-softmax, and ε-decay-softmax and the UCB1 method are considered, along with reinforcement learning algorithms such as DQN, REINFORCE, and A2C. The influence of various parameters on the efficiency of the method is analyzed, and options for further development of the method are proposed.
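The ε-greedy strategy mentioned above is simple to illustrate. Below is a toy (non-contextual) bandit loop in Python showing the exploration/exploitation trade-off; the reward probabilities and the incremental-mean value update are illustrative assumptions, not the article's setup.

```python
import numpy as np

rng = np.random.default_rng(0)

def epsilon_greedy(q_values: np.ndarray, epsilon: float) -> int:
    """Pick a random arm (explore) with prob. epsilon, else the best (exploit)."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q_values)))
    return int(np.argmax(q_values))

# Toy bandit loop: in the classification setting the 'arms' are class labels
# and the reward is 1 for a correct prediction.
n_arms, n_steps = 3, 1000
q = np.zeros(n_arms)       # running value estimate per arm
counts = np.zeros(n_arms)
true_best = 2              # hypothetical arm that pays off most often
for t in range(n_steps):
    a = epsilon_greedy(q, epsilon=0.1)
    reward = float(rng.random() < (0.9 if a == true_best else 0.3))
    counts[a] += 1
    q[a] += (reward - q[a]) / counts[a]  # incremental mean update
print(q.round(2))
```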
10

Yan, Yang, Wen Bo Huang, and Yun Ji Wang. "Image Classification Based on Conditional Random Fields." Applied Mechanics and Materials 556-562 (May 2014): 4901–5. http://dx.doi.org/10.4028/www.scientific.net/amm.556-562.4901.

Abstract:
We use conditional random fields (CRFs) to classify regions in an image. CRFs provide a discriminative framework for incorporating spatial dependencies in an image, which is more appropriate for classification tasks than a generative framework. In this paper we apply CRFs to the image multi-classification task, focusing on three aspects: feature extraction, clustering of the original features based on K-means, and feature vector modeling based on CRFs to obtain multiclass classification. We present classification results on sample images from the Cambridge (MSRC) database, and the experimental results show that the presented method can classify the images accurately.
11

Salk, Carl, Elena Moltchanova, Linda See, Tobias Sturn, Ian McCallum, and Steffen Fritz. "How many people need to classify the same image? A method for optimizing volunteer contributions in binary geographical classifications." PLOS ONE 17, no. 5 (May 19, 2022): e0267114. http://dx.doi.org/10.1371/journal.pone.0267114.

Abstract:
Involving members of the public in image classification tasks that can be tricky to automate is increasingly recognized as a way to complete large amounts of these tasks and promote citizen involvement in science. While this labor is usually provided for free, it is still limited, making it important for researchers to use volunteer contributions as efficiently as possible. Using volunteer labor efficiently becomes complicated when individual tasks are assigned to multiple volunteers to increase confidence that the correct classification has been reached. In this paper, we develop a system to decide when enough information has been accumulated to confidently declare an image to be classified and remove it from circulation. We use a Bayesian approach to estimate the posterior distribution of the mean rating in a binary image classification task. Tasks are removed from circulation when user-defined certainty thresholds are reached. We demonstrate this process using a set of over 4.5 million unique classifications by 2783 volunteers of over 190,000 images assessed for the presence/absence of cropland. If the system outlined here had been implemented in the original data collection campaign, it would have eliminated the need for 59.4% of volunteer ratings. Had this effort been applied to new tasks, it would have allowed an estimated 2.46 times as many images to have been classified with the same amount of labor, demonstrating the power of this method to make more efficient use of limited volunteer contributions. To simplify implementation of this method by other investigators, we provide cutoff value combinations for one set of confidence levels.
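A minimal sketch of this kind of Bayesian stopping rule, using a Beta posterior over the probability that the true label is positive. The uniform Beta(1, 1) prior and the 0.99 certainty threshold are illustrative assumptions; the paper's actual model and cutoff combinations may differ.

```python
from scipy.stats import beta

def decide(positives: int, negatives: int, threshold: float = 0.99):
    """Retire a task once the posterior is confident enough either way.

    positives/negatives are the volunteer vote counts so far.
    """
    posterior = beta(1 + positives, 1 + negatives)  # Beta(1, 1) prior
    p_positive = 1 - posterior.cdf(0.5)             # P(p > 0.5 | votes)
    if p_positive > threshold:
        return "positive"
    if p_positive < 1 - threshold:
        return "negative"
    return "keep collecting votes"

print(decide(7, 0))  # positive
print(decide(3, 2))  # keep collecting votes
```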
12

Dai, Yin, Yifan Gao, and Fayu Liu. "TransMed: Transformers Advance Multi-Modal Medical Image Classification." Diagnostics 11, no. 8 (July 31, 2021): 1384. http://dx.doi.org/10.3390/diagnostics11081384.

Abstract:
Over the past decade, convolutional neural networks (CNNs) have shown very competitive performance in medical image analysis tasks, such as disease classification, tumor segmentation, and lesion detection. CNNs have great advantages in extracting local features of images; however, due to the locality of the convolution operation, they cannot deal well with long-range relationships. Recently, transformers have been applied to computer vision and have achieved remarkable success on large-scale datasets. Compared with natural images, multi-modal medical images have explicit and important long-range dependencies, and effective multi-modal fusion strategies can greatly improve the performance of deep models. This prompts us to study transformer-based structures and apply them to multi-modal medical images. Existing transformer-based network architectures require large-scale datasets to achieve better performance, but medical imaging datasets are relatively small, which makes it difficult to apply pure transformers to medical image analysis. Therefore, we propose TransMed for multi-modal medical image classification. TransMed combines the advantages of CNNs and transformers to efficiently extract low-level image features and establish long-range dependencies between modalities. We evaluated our model on two datasets, parotid gland tumor classification and knee injury classification. Combining our contributions, we achieve improvements of 10.1% and 1.9% in average accuracy, respectively, outperforming other state-of-the-art CNN-based models. The results of the proposed method are promising and have tremendous potential to be applied to a large number of medical image analysis tasks. To the best of our knowledge, this is the first work to apply transformers to multi-modal medical image classification.
13

Li, Haifeng, Xin Dou, Chao Tao, Zhixiang Wu, Jie Chen, Jian Peng, Min Deng, and Ling Zhao. "RSI-CB: A Large-Scale Remote Sensing Image Classification Benchmark Using Crowdsourced Data." Sensors 20, no. 6 (March 12, 2020): 1594. http://dx.doi.org/10.3390/s20061594.

Abstract:
Image classification is a fundamental task in remote sensing image processing. In recent years, deep convolutional neural networks (DCNNs) have achieved significant breakthroughs in natural image recognition. The remote sensing field, however, still lacks a large-scale benchmark similar to ImageNet. In this paper, we propose a remote sensing image classification benchmark (RSI-CB) based on massive, scalable, and diverse crowdsourced data. Using crowdsourced data such as Open Street Map (OSM) data, ground objects in remote sensing images can be annotated effectively using points of interest, vector data from OSM, or other crowdsourced data, and these annotated images can then be used in remote sensing image classification tasks. Based on this method, we construct a worldwide large-scale benchmark for remote sensing image classification with broad geographical distribution and a large total number of images: it contains six categories with 35 sub-classes and more than 24,000 images of 256 × 256 pixels. This classification system of ground objects is defined according to the national standard of land-use classification in China and is inspired by the hierarchy mechanism of ImageNet. Finally, we conduct numerous experiments comparing RSI-CB with the SAT-4, SAT-6, and UC-Merced data sets. The experiments show that RSI-CB is more suitable as a benchmark for remote sensing image classification tasks than other benchmarks in the big data era and has many potential applications.
14

Chen, Shizhao, Qian Zhou, and Hua Zou. "A Novel Un-Supervised GAN for Fundus Image Enhancement with Classification Prior Loss." Electronics 11, no. 7 (March 24, 2022): 1000. http://dx.doi.org/10.3390/electronics11071000.

Abstract:
Fundus images captured for clinical diagnosis usually suffer from degradation due to variation in equipment, operators, or environment. These degraded fundus images need to be enhanced to achieve better diagnosis and improve the results of downstream tasks. As there are no paired low- and high-quality fundus images, existing methods mainly focus on supervised or semi-supervised learning for color fundus image enhancement (CFIE) tasks using synthetic image pairs; consequently, domain gaps between real images and synthetic images arise. With existing unsupervised methods, the most important small-scale pathological features and structural information in degraded fundus images are prone to being erased after enhancement. To solve these problems, an unsupervised GAN is proposed for CFIE tasks, utilizing adversarial training to enhance low-quality fundus images; synthetic image pairs are no longer required during training. A specially designed U-Net with skip connections in our enhancement network can effectively remove degradation factors while preserving pathological features and structural information. The global and local discriminators adopted in the GAN lead to better illumination uniformity in the enhanced fundus image. To further improve the visual quality of enhanced fundus images, a novel non-reference loss function based on a pretrained fundus image quality classification network was designed to guide the enhancement network to produce high-quality images. Experiments demonstrated that our method can effectively remove degradation factors in low-quality fundus images and produce competitive results compared with previous methods in both quantitative and qualitative metrics.
15

Endo, Takeru, and Mitsuharu Matsumoto. "Aurora Image Classification with Deep Metric Learning." Sensors 22, no. 17 (September 3, 2022): 6666. http://dx.doi.org/10.3390/s22176666.

Abstract:
In recent years, neural networks have been increasingly used for classifying aurora images; convolutional neural networks in particular have been actively studied. However, there are not many studies applying deep learning techniques that take into account the characteristics of aurora images. Therefore, in this study, we propose deep metric learning as a suitable method for aurora image classification. Deep metric learning is a deep learning technique originally developed to distinguish human faces. Identifying human faces is more difficult than standard classification tasks because it is characterized by a small number of sample images per class and little feature variation between classes. We considered the face identification task similar to aurora image classification in that the number of labeled images is relatively small and the feature differences between classes are small, and we therefore studied the application of deep metric learning to aurora image classification. Our experiments showed that deep metric learning improves the accuracy of aurora image classification by nearly 10% compared to previous studies.
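Deep metric learning is often trained with a triplet objective. The sketch below shows a standard triplet loss in PyTorch as a generic illustration; the aurora study may use a different metric-learning variant.

```python
import torch
import torch.nn.functional as F

def triplet_loss(anchor, positive, negative, margin: float = 0.2):
    """Pull same-class embeddings together, push other classes apart.

    One standard deep-metric-learning objective; treat it as illustrative
    rather than the paper's exact loss.
    """
    d_pos = F.pairwise_distance(anchor, positive)
    d_neg = F.pairwise_distance(anchor, negative)
    return F.relu(d_pos - d_neg + margin).mean()

# Stand-in embeddings; a real pipeline would produce these with a CNN encoder.
emb = lambda n: F.normalize(torch.randn(n, 128), dim=1)
print(triplet_loss(emb(8), emb(8), emb(8)))
```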
16

Yang, Yadong, Xiaofeng Wang, and Hengzheng Zhang. "Local Importance Representation Convolutional Neural Network for Fine-Grained Image Classification." Symmetry 10, no. 10 (October 11, 2018): 479. http://dx.doi.org/10.3390/sym10100479.

Abstract:
Compared with ordinary image classification tasks, fine-grained image classification is closer to real-life scenes. Its key challenge is finding local areas with sufficient discriminative power and performing effective feature learning. Based on the bilinear convolutional neural network (B-CNN), this paper designs a local importance representation convolutional neural network (LIR-CNN) model, which can be divided into three parts. First, the super-pixel segmentation convolution method is used in the input layer of the model, allowing the model to receive images of different sizes and fully accounting for complex geometric deformation of the images. Then, the standard convolution of B-CNN is replaced with the proposed local importance representation convolution, which learns to score each local area of the image according to its importance. Finally, channelwise convolution is proposed, which plays an important role in balancing network lightness against classification accuracy. Experimental results on benchmark datasets (e.g., CUB-200-2011, FGVC-Aircraft, and Stanford Cars) showed that the LIR-CNN model performs well in fine-grained image classification tasks.
17

Zhao, Zhicheng, Ze Luo, Jian Li, Can Chen, and Yingchao Piao. "When Self-Supervised Learning Meets Scene Classification: Remote Sensing Scene Classification Based on a Multitask Learning Framework." Remote Sensing 12, no. 20 (October 9, 2020): 3276. http://dx.doi.org/10.3390/rs12203276.

Abstract:
In recent years, the development of convolutional neural networks (CNNs) has promoted continuous progress in scene classification of remote sensing images. Compared with natural image datasets, however, the acquisition of remote sensing scene images is more difficult, and consequently the scale of remote sensing image datasets is generally small. In addition, many problems related to small objects and complex backgrounds arise in remote sensing image scenes, presenting great challenges for CNN-based recognition methods. In this article, to improve the feature extraction ability and generalization ability of such models and to enable better use of the information contained in the original remote sensing images, we introduce a multitask learning framework which combines the tasks of self-supervised learning and scene classification. Unlike previous multitask methods, we adopt a new mixup loss strategy to combine the two tasks with dynamic weight. The proposed multitask learning framework empowers a deep neural network to learn more discriminative features without increasing the amounts of parameters. Comprehensive experiments were conducted on four representative remote sensing scene classification datasets. We achieved state-of-the-art performance, with average accuracies of 94.21%, 96.89%, 99.11%, and 98.98% on the NWPU, AID, UC Merced, and WHU-RS19 datasets, respectively. The experimental results and visualizations show that our proposed method can learn more discriminative features and simultaneously encode orientation information while effectively improving the accuracy of remote sensing scene classification.
18

Seong-Yoon Shin, Gwanghyun Jo, and Guangxing Wang. "A Novel Method for Fashion Clothing Image Classification Based on Deep Learning." Journal of Information and Communication Technology 22, no. 1 (January 19, 2023): 127–48. http://dx.doi.org/10.32890/jict2023.22.1.6.

Abstract:
Image recognition and classification is a significant research topic in computer vision and a widely used computer technology. The methods often used in image classification and recognition tasks are based on deep learning, such as convolutional neural networks (CNNs), LeNet, and long short-term memory networks (LSTM). Unfortunately, the classification accuracy of these methods is unsatisfactory. In recent years, large-scale deep learning networks such as VGG16 and Residual Networks (ResNet) have been used to improve classification accuracy in image recognition and classification. However, due to their deep network hierarchies and complex parameter settings, these models take more time in the training phase, especially when the number of samples is small, which can easily lead to overfitting. This paper suggests a deep learning-based image classification technique based on a CNN model with improved convolutional and pooling layers. Furthermore, the study adopts an approximate dynamic learning rate update algorithm during model training to make the learning rate self-adapting, ensure rapid convergence, and shorten the training time. Using the proposed model, an experiment was conducted on the Fashion-MNIST dataset, taking 6,000 images as the training dataset and 1,000 images as the testing dataset. In the experiments, the classification accuracy of the suggested method was 93 percent, 4.6 percent higher than that of the basic CNN model. The study also compared the influence of the training batch size on classification accuracy. The experimental outcomes showed that the model generalizes well in fashion clothing image classification tasks.
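A minimal Keras sketch of this kind of pipeline on Fashion-MNIST, using the 6,000/1,000 split mentioned above. The paper's "approximate dynamic learning rate update algorithm" is not specified here, so `ReduceLROnPlateau` stands in as a common adaptive-learning-rate callback; the architecture is likewise illustrative.

```python
import tensorflow as tf

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.fashion_mnist.load_data()
x_train, x_test = x_train[..., None] / 255.0, x_test[..., None] / 255.0

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])

# Stand-in for the paper's dynamic learning-rate update: shrink the rate
# whenever the validation loss stops improving.
reduce_lr = tf.keras.callbacks.ReduceLROnPlateau(monitor="val_loss",
                                                 factor=0.5, patience=2)
model.fit(x_train[:6000], y_train[:6000], validation_split=0.1,
          epochs=10, callbacks=[reduce_lr])
print(model.evaluate(x_test[:1000], y_test[:1000]))
```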
19

Peng, Yingshu, and Yi Wang. "An industrial-grade solution for agricultural image classification tasks." Computers and Electronics in Agriculture 187 (August 2021): 106253. http://dx.doi.org/10.1016/j.compag.2021.106253.

20

N. Sultani, Zainab, and Ban N. Dhannoon. "Modified Bag of Visual Words Model for Image Classification." Al-Nahrain Journal of Science 24, no. 2 (June 1, 2021): 78–86. http://dx.doi.org/10.22401/anjs.24.2.11.

Abstract:
Image classification is acknowledged as one of the most critical and challenging tasks in computer vision. The bag of visual words (BoVW) model has proven to be very efficient for image classification tasks since it can effectively represent distinctive image features in vector space. In this paper, BoVW using Scale-Invariant Feature Transform (SIFT) and Oriented FAST and Rotated BRIEF (ORB) descriptors is adapted for image classification. We propose a novel image classification system using local feature information obtained from both SIFT and ORB descriptors. The constructed SO-BoVW model presents highly discriminative features, enhancing classification performance. Experiments on the Caltech-101 and Flowers datasets prove the effectiveness of the proposed method.
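A minimal sketch of the BoVW pipeline using ORB descriptors with OpenCV and a K-means vocabulary. The paper additionally fuses SIFT descriptors, and the vocabulary size here is an arbitrary illustrative choice.

```python
import cv2
import numpy as np
from sklearn.cluster import KMeans

def bovw_histograms(images, vocab_size: int = 100):
    """Build bag-of-visual-words histograms from ORB descriptors.

    `images` is a list of BGR arrays; returns one fixed-length feature
    vector per image, suitable for any standard classifier.
    """
    orb = cv2.ORB_create()
    per_image, all_desc = [], []
    for img in images:
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        _, desc = orb.detectAndCompute(gray, None)
        desc = desc if desc is not None else np.zeros((0, 32), np.uint8)
        per_image.append(desc.astype(np.float32))
        all_desc.append(desc.astype(np.float32))
    # Cluster all descriptors into a visual vocabulary.
    vocab = KMeans(n_clusters=vocab_size, n_init=10).fit(np.vstack(all_desc))
    hists = []
    for desc in per_image:
        words = vocab.predict(desc) if len(desc) else np.array([], int)
        hists.append(np.bincount(words, minlength=vocab_size).astype(float))
    return np.array(hists)
```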
21

Yasmin, Romena, Joshua T. Grassel, Md Mahmudulla Hassan, Olac Fuentes, and Adolfo R. Escobedo. "Enhancing Image Classification Capabilities of Crowdsourcing-Based Methods through Expanded Input Elicitation." Proceedings of the AAAI Conference on Human Computation and Crowdsourcing 9 (October 4, 2021): 166–78. http://dx.doi.org/10.1609/hcomp.v9i1.18949.

Abstract:
This study investigates how different forms of input elicitation obtained from crowdsourcing can be utilized to improve the quality of inferred labels for image classification tasks, where an image must be labeled as either positive or negative depending on the presence/absence of a specified object. Three types of input elicitation methods are tested: binary classification (positive or negative); level of confidence in binary response (on a scale from 0-100%); and what participants believe the majority of the other participants' binary classification is. We design a crowdsourcing experiment to test the performance of the proposed input elicitation methods and use data from over 200 participants. Various existing voting and machine learning (ML) methods are applied and others developed to make the best use of these inputs. In an effort to assess their performance on classification tasks of varying difficulty, a systematic synthetic image generation process is developed. Each generated image combines items from the MPEG-7 Core Experiment CE-Shape-1 Test Set into a single image using multiple parameters (e.g., density, transparency, etc.) and may or may not contain a target object. The difficulty of these images is validated by the performance of an automated image classification method. Experimental results suggest that more accurate classifications can be achieved when using the average of the self-reported confidence values as an additional attribute for ML algorithms relative to what is achieved with more traditional approaches. Additionally, they demonstrate that other performance metrics of interest, namely reduced false-negative rates, can be prioritized through special modifications of the proposed aggregation methods that leverage the variety of elicited inputs.
22

Liu, Min, Yu He, Minghu Wu, and Chunyan Zeng. "Breast Histopathological Image Classification Method Based on Autoencoder and Siamese Framework." Information 13, no. 3 (February 24, 2022): 107. http://dx.doi.org/10.3390/info13030107.

Abstract:
The automated classification of breast cancer histopathological images is one of the important tasks in computer-aided diagnosis (CAD) systems. Due to the small inter-class and large intra-class variances in breast cancer histopathological images, extracting features for breast cancer classification is difficult. To address this problem, an improved autoencoder (AE) network using a Siamese framework, which can learn effective features from histopathological images for CAD breast cancer classification tasks, was designed. First, the input image is processed at multiple scales using a Gaussian pyramid to obtain multi-scale features. Second, in the feature extraction stage, a Siamese framework is used to constrain the pre-trained AE so that the extracted features have smaller intra-class variance and larger inter-class variance. Experimental results show that the proposed method achieved a classification accuracy as high as 97.8% on the BreakHis dataset. Compared with algorithms commonly used in breast cancer histopathological classification, this method offers superior and faster performance.
23

Singh, A. Buboo, Kh Manglem Singh, Y. Jina Chanu, Khelchandra Thongam, and Kh Johnson Singh. "An Improved Image Spam Classification Model Based on Deep Learning Techniques." Security and Communication Networks 2022 (August 2, 2022): 1–11. http://dx.doi.org/10.1155/2022/8905424.

Abstract:
Image spam is a type of spam that has text embedded in an image. Classification of image spam is done using various machine learning approaches based on a broad set of features extracted from the image. Owing to their remarkable results, convolutional neural networks (CNNs) are widely used in image classification as well as feature extraction tasks. In this research, we analyze image spam using a CNN model based on deep learning techniques. The proposed model is fine-tuned and optimized for both feature extraction and classification. We also evaluated our proposed model on the "Improved" and "Challenge" image spam datasets, which were developed to increase the difficulty of the classification task. Our model significantly improves the accuracy of the classification task compared to other approaches on the same datasets.
24

Sharan, Roneel V., Hao Xiong, and Shlomo Berkovsky. "Benchmarking Audio Signal Representation Techniques for Classification with Convolutional Neural Networks." Sensors 21, no. 10 (May 14, 2021): 3434. http://dx.doi.org/10.3390/s21103434.

Abstract:
Audio signal classification finds various applications in detecting and monitoring health conditions in healthcare. Convolutional neural networks (CNNs) have produced state-of-the-art results in image classification and are increasingly being used in other tasks, including signal classification. However, audio signal classification using CNNs presents various challenges. In image classification tasks, raw images of equal dimensions can be used as a direct input to a CNN; raw time-domain signals, on the other hand, can be of varying dimensions. In addition, the temporal signal often has to be transformed to the frequency domain to reveal unique spectral characteristics, therefore requiring signal transformation. In this work, we overview and benchmark various audio signal representation techniques for classification using CNNs, including approaches that deal with signals of different lengths and combine multiple representations to improve the classification accuracy. Hence, this work surfaces important empirical evidence that may guide future works deploying CNNs for audio signal classification purposes.
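One simple strategy such a benchmark covers is converting variable-length audio into a fixed-size time-frequency "image" for a CNN. A minimal sketch with librosa, where the 4-second window and mel settings are illustrative assumptions:

```python
import librosa
import numpy as np

def mel_image(path: str, seconds: float = 4.0, sr: int = 22050):
    """Turn a variable-length audio file into a fixed-size log-mel image."""
    y, _ = librosa.load(path, sr=sr)
    y = librosa.util.fix_length(y, size=int(seconds * sr))  # pad or trim
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=64)
    return librosa.power_to_db(mel, ref=np.max)  # (64, time) array
```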
25

Tegmark, Max, and Tailin Wu. "Pareto-Optimal Data Compression for Binary Classification Tasks." Entropy 22, no. 1 (December 19, 2019): 7. http://dx.doi.org/10.3390/e22010007.

Abstract:
The goal of lossy data compression is to reduce the storage cost of a data set X while retaining as much information as possible about something (Y) that you care about. For example, what aspects of an image X contain the most information about whether it depicts a cat? Mathematically, this corresponds to finding a mapping X → Z ≡ f(X) that maximizes the mutual information I(Z, Y) while the entropy H(Z) is kept below some fixed threshold. We present a new method for mapping out the Pareto frontier for classification tasks, reflecting the tradeoff between retained entropy and class information. We first show how a random variable X (an image, say) drawn from a class Y ∈ {1, …, n} can be distilled into a vector W = f(X) ∈ ℝ^(n−1) losslessly, so that I(W, Y) = I(X, Y); for example, for a binary classification task of cats and dogs, each image X is mapped into a single real number W retaining all information that helps distinguish cats from dogs. For the n = 2 case of binary classification, we then show how W can be further compressed into a discrete variable Z = g_β(W) ∈ {1, …, m_β} by binning W into m_β bins, in such a way that varying the parameter β sweeps out the full Pareto frontier, solving a generalization of the discrete information bottleneck (DIB) problem. We argue that the most interesting points on this frontier are “corners” maximizing I(Z, Y) for a fixed number of bins m = 2, 3, …, which can conveniently be found without multiobjective optimization. We apply this method to the CIFAR-10, MNIST and Fashion-MNIST datasets, illustrating how it can be interpreted as an information-theoretically optimal image clustering algorithm. We find that these Pareto frontiers are not concave, and that recently reported DIB phase transitions correspond to transitions between these corners, changing the number of clusters.
26

Kowsari, Kamran, Rasoul Sali, Lubaina Ehsan, William Adorno, Asad Ali, Sean Moore, Beatrice Amadi, Paul Kelly, Sana Syed, and Donald Brown. "HMIC: Hierarchical Medical Image Classification, A Deep Learning Approach." Information 11, no. 6 (June 12, 2020): 318. http://dx.doi.org/10.3390/info11060318.

Abstract:
Image classification is central to the big data revolution in medicine. Improved information processing methods for the diagnosis and classification of digital medical images have proven successful via deep learning approaches. As this field is explored, limitations to the performance of traditional supervised classifiers emerge. This paper outlines an approach that differs from current medical image classification work, which views the issue as multi-class classification: we perform hierarchical classification using our Hierarchical Medical Image Classification (HMIC) approach. HMIC uses stacks of deep learning models to provide particular comprehension at each level of the clinical picture hierarchy. To test performance, we use biopsy images of the small bowel with three categories at the parent level (celiac disease, environmental enteropathy, and histologically normal controls). At the child level, celiac disease severity is classified into four classes (I, IIIa, IIIb, and IIIc).
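The hierarchical dispatch itself is simple to express in code. Below is a sketch in the spirit of HMIC, assuming already-trained models exposing a `predict` method; the names and the two-level structure are illustrative, not the authors' implementation.

```python
def hierarchical_predict(image, parent_model, child_models):
    """Two-level prediction: a parent class, then an optional refinement.

    parent_model picks the top-level category; if that category has its
    own child classifier, the child model refines the prediction.
    """
    parent_label = parent_model.predict(image)
    child = child_models.get(parent_label)
    return (parent_label, child.predict(image)) if child else (parent_label, None)

# Here child_models might map "Celiac Disease" to a 4-class severity
# classifier (I, IIIa, IIIb, IIIc) while other parent classes have none.
```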
27

Duarte, D., F. Nex, N. Kerle, and G. Vosselman. "SATELLITE IMAGE CLASSIFICATION OF BUILDING DAMAGES USING AIRBORNE AND SATELLITE IMAGE SAMPLES IN A DEEP LEARNING APPROACH." ISPRS Annals of Photogrammetry, Remote Sensing and Spatial Information Sciences IV-2 (May 28, 2018): 89–96. http://dx.doi.org/10.5194/isprs-annals-iv-2-89-2018.

Abstract:
The localization and detailed assessment of damaged buildings after a disastrous event is of utmost importance to guide response operations, recovery tasks, or insurance purposes. Several remote sensing platforms and sensors are currently used for the manual detection of building damage; however, there is broad interest in using automated methods to perform this task, regardless of the platform. Owing to its synoptic coverage and predictable availability, satellite imagery is currently used as input for the identification of building damage by the International Charter and by the Copernicus Emergency Management Service for the production of damage grading and reference maps. Recently proposed methods for image classification of building damage rely on convolutional neural networks (CNNs), usually trained with only satellite image samples in a binary classification problem; the number of samples derived from these images is often limited, affecting the quality of the classification results. The use of up/down-sampled image samples during the training of a CNN has been demonstrated to improve several image recognition tasks in remote sensing, but it is currently unclear whether this multi-resolution information can also be captured from images with different spatial resolutions, such as satellite and airborne imagery (from both manned and unmanned platforms). In this paper, a CNN framework using residual connections and dilated convolutions, considering both manned and unmanned aerial image samples, is used to perform satellite image classification of building damage. Three network configurations trained with multi-resolution image samples are compared against two benchmark networks that use only satellite image samples. Combining feature maps generated from airborne and satellite image samples, and refining these using only the satellite image samples, improved the overall satellite image classification of building damage by nearly 4%.
28

Coleman, Matthew, Joanna F. Dipnall, Myong Jung, and Lan Du. "PreRadE: Pretraining Tasks on Radiology Images and Reports Evaluation Framework." Mathematics 10, no. 24 (December 8, 2022): 4661. http://dx.doi.org/10.3390/math10244661.

Abstract:
Recently, self-supervised pretraining of transformers has gained considerable attention in analyzing electronic medical records. However, a systematic evaluation of different pretraining tasks in radiology applications using both images and radiology reports is still lacking. We propose PreRadE, a simple proof-of-concept framework that enables novel evaluation of pretraining tasks in a controlled environment. We investigated the three most commonly used pretraining tasks (MLM: Masked Language Modelling, MFR: Masked Feature Regression, and ITM: Image to Text Matching) and their combinations against downstream radiology classification on MIMIC-CXR, a medical chest X-ray imaging and radiology text report dataset. Our experiments in the multimodal setting show that (1) pretraining with MLM yields the greatest benefit to classification performance, largely due to the task-relevant information learned from the radiology reports, and (2) pretraining with only a single task can introduce variation in classification performance across different fine-tuning episodes, suggesting that composite task objectives incorporating both image and text modalities are better suited to generating reliably performant models.
29

Cheng, H. D., and Rutvik Desai. "Scene Classification by Fuzzy Local Moments." International Journal of Pattern Recognition and Artificial Intelligence 12, no. 07 (November 1998): 921–38. http://dx.doi.org/10.1142/s0218001498000506.

Abstract:
The identification of images irrespective of their location, size, and orientation is one of the important tasks in pattern analysis, and the use of global moment features has been one of the most popular techniques for this purpose. We present a simple and effective method for gray-level image representation and identification which utilizes fuzzy radial moments of image segments (local moments) as features, as opposed to global features. A multilayer perceptron neural network is employed for classification, and a fuzzy entropy measure is applied to optimize the parameters of the membership function. The technique does not require translation, scaling, or rotation of the image; furthermore, it is suitable for parallel implementation, which is an advantage for real-time applications. The classification capability and robustness of the technique are demonstrated by experiments on scaled, rotated, and noisy gray-level images of uppercase and lowercase characters and digits of the English alphabet, as well as images of a set of tools. The proposed approach can handle rotation, scale, and translation invariance, noise, and fuzziness simultaneously.
30

Shi, Rui. "GSAIC: GeoScience Articles Illustration and Caption Dataset." Highlights in Science, Engineering and Technology 9 (September 30, 2022): 289–97. http://dx.doi.org/10.54097/hset.v9i.1858.

Abstract:
The scientific investigation of geoscience includes data collection, sample classification, and semantics, involving a large number of images; an image-text search model can greatly assist geoscience research. However, existing image-text datasets mainly cover daily life, and academic image-text datasets are lacking. In order to help geoscience researchers investigate through images and text, and to provide a new benchmark for researchers in data mining and information retrieval, this paper proposes GSAIC, a novel parallel corpus of geoscience academic illustrations and captions based on GAKG, containing over 900,000 illustrations from earth science papers and their corresponding captions. GSAIC selects high-quality illustrations and captions through a classifier, with the support of expert annotations. GSAIC supports several tasks for geoscience scenarios, including text search for images, retrieval of corresponding images or papers based on academic image descriptions, and academic illustration classification. Both the GSAIC benchmark and the classifier are publicly accessible.
31

Alpert, S. I. "The basic arithmetic operations on fuzzy numbers and new approaches to the theory of fuzzy numbers under the classification of space images." Mathematical machines and systems 3 (2020): 49–59. http://dx.doi.org/10.34121/1028-9763-2020-3-49-59.

Abstract:
Classification in remote sensing is a very difficult procedure because it involves many steps and much data preprocessing. Fuzzy set theory plays a very important role in classification problems because the fuzzy approach can capture the structure of the image. Most concepts are fuzzy in nature, and fuzzy sets allow us to deal with uncertain and imprecise data. Many classification problems are formalized using fuzzy concepts, because crisp classes represent an oversimplification of reality, leading to wrong classification results. Fuzzy set theory is an important mathematical tool for processing complex and fuzzy data, and it is suitable for high-resolution remote sensing image classification. Fuzzy sets and fuzzy numbers are used to determine basic probability assignments, and fuzzy numbers are used to detect the optimal number of clusters in fuzzy clustering methods. An image is modeled as a fuzzy graph when we represent the dissimilarity between pixels in some classification tasks. Fuzzy sets are also applied in various tasks of processing digital optical images, and they play an important role in analyzing classification results when different agreement measures between the reference data and the final classification are considered. In this work, arithmetic operations on fuzzy numbers using the alpha-cut method were considered: addition, subtraction, multiplication, and division of fuzzy numbers, as well as the square root of a fuzzy number, were described and illustrated with examples. Fuzzy set theory and fuzzy numbers can be applied to the analysis and classification of hyperspectral satellite images, solving ecological tasks, vegetation classification, and remote searching for minerals.
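The alpha-cut arithmetic described above reduces to interval arithmetic on each cut. A minimal numeric sketch for triangular fuzzy numbers, showing addition and multiplication only; representing a fuzzy number as a (left, peak, right) triple is an assumption for illustration.

```python
import numpy as np

def alpha_cut(tri, alpha):
    """Alpha-cut interval [left, right] of a triangular fuzzy number (a, b, c)."""
    a, b, c = tri
    return np.array([a + alpha * (b - a), c - alpha * (c - b)])

def add(t1, t2, alpha):
    # Addition of fuzzy numbers: add the alpha-cut intervals endpoint-wise.
    return alpha_cut(t1, alpha) + alpha_cut(t2, alpha)

def multiply(t1, t2, alpha):
    # Multiplication: take min/max over all endpoint products of the cuts.
    x, y = alpha_cut(t1, alpha), alpha_cut(t2, alpha)
    prods = np.outer(x, y)
    return np.array([prods.min(), prods.max()])

A, B = (1, 2, 3), (2, 4, 6)       # (left, peak, right) triples
print(add(A, B, alpha=0.5))       # -> [4.5 7.5]
print(multiply(A, B, alpha=0.5))  # -> [ 4.5 12.5]
```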
32

Koh, Joshua C. O., German Spangenberg, and Surya Kant. "Automated Machine Learning for High-Throughput Image-Based Plant Phenotyping." Remote Sensing 13, no. 5 (February 25, 2021): 858. http://dx.doi.org/10.3390/rs13050858.

Abstract:
Automated machine learning (AutoML) has been heralded as the next wave in artificial intelligence with its promise to deliver high-performance end-to-end machine learning pipelines with minimal effort from the user. However, despite AutoML showing great promise for computer vision tasks, to the best of our knowledge, no study has used AutoML for image-based plant phenotyping. To address this gap in knowledge, we examined the application of AutoML for image-based plant phenotyping using wheat lodging assessment with unmanned aerial vehicle (UAV) imagery as an example. The performance of an open-source AutoML framework, AutoKeras, in image classification and regression tasks was compared to transfer learning using modern convolutional neural network (CNN) architectures. For image classification, which classified plot images as lodged or non-lodged, transfer learning with Xception and DenseNet-201 achieved the best classification accuracy of 93.2%, whereas AutoKeras had a 92.4% accuracy. For image regression, which predicted lodging scores from plot images, transfer learning with DenseNet-201 had the best performance (R2 = 0.8303, root mean-squared error (RMSE) = 9.55, mean absolute error (MAE) = 7.03, mean absolute percentage error (MAPE) = 12.54%), followed closely by AutoKeras (R2 = 0.8273, RMSE = 10.65, MAE = 8.24, MAPE = 13.87%). In both tasks, AutoKeras models had up to 40-fold faster inference times compared to the pretrained CNNs. AutoML has significant potential to enhance plant phenotyping capabilities applicable in crop breeding and precision agriculture.
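For reference, the AutoKeras image-classification workflow used in such comparisons is only a few lines. The sketch below substitutes random placeholder arrays for the UAV plot images and uses a deliberately small search budget:

```python
import autokeras as ak
import numpy as np

# x: plot images as an (n, height, width, 3) array; y: lodged / non-lodged labels.
x = np.random.rand(100, 64, 64, 3).astype("float32")  # placeholder data
y = np.random.randint(0, 2, size=100)

# AutoKeras searches CNN architectures automatically; max_trials bounds the
# number of candidate models tried (a small value here just for illustration).
clf = ak.ImageClassifier(max_trials=3, overwrite=True)
clf.fit(x, y, epochs=5)
print(clf.evaluate(x, y))
```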
33

Wen, Juan, Yangjing Shi, Xiaoshi Zhou, and Yiming Xue. "Crop Disease Classification on Inadequate Low-Resolution Target Images." Sensors 20, no. 16 (August 16, 2020): 4601. http://dx.doi.org/10.3390/s20164601.

Abstract:
Currently, various agricultural image classification tasks are carried out on high-resolution images. However, in some cases we cannot obtain enough high-resolution images for classification, which significantly affects classification performance. In this paper, we design a crop disease classification network based on Enhanced Super-Resolution Generative Adversarial Networks (ESRGAN) for situations in which only an insufficient number of low-resolution target images are available. First, ESRGAN is used to recover super-resolution crop images from low-resolution images, with transfer learning applied in model training to compensate for the lack of training samples. Then, we test the performance of the generated super-resolution images on the crop disease classification task. Extensive experiments show that using the fine-tuned ESRGAN model can recover realistic crop information and improve the accuracy of crop disease classification compared with four other image super-resolution methods.
34

Jabari, S., F. Fathollahi, and Y. Zhang. "APPLICATION OF SENSOR FUSION TO IMPROVE UAV IMAGE CLASSIFICATION." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLII-2/W6 (August 23, 2017): 153–56. http://dx.doi.org/10.5194/isprs-archives-xlii-2-w6-153-2017.

Abstract:
Image classification is one of the most important tasks of remote sensing projects, including those based on UAV images. Improving the quality of UAV images directly affects the classification results and can save a huge amount of time and effort in this area. In this study, we show that sensor fusion can improve image quality, which in turn increases the accuracy of image classification. Here, we tested two sensor fusion configurations using a panchromatic (Pan) camera along with either a colour camera or a four-band multi-spectral (MS) camera: the Pan camera contributes its higher sensitivity, and the colour or MS camera contributes its spectral properties. The resulting images are then compared to those acquired by a high-resolution single Bayer-pattern colour camera (referred to here as HRC). We assessed the quality of the output images by performing image classification tests. The outputs prove that the proposed sensor fusion configurations can achieve higher accuracies than the images of the single Bayer-pattern colour camera. Therefore, incorporating a Pan camera on board in UAV missions and performing image fusion can help achieve higher-quality images and, accordingly, higher-accuracy classification results.
35

Pan, Sai, Yibing Fu, Pu Chen, Jiaona Liu, Weicen Liu, Xiaofei Wang, Guangyan Cai, et al. "Multi-Task Learning-Based Immunofluorescence Classification of Kidney Disease." International Journal of Environmental Research and Public Health 18, no. 20 (October 15, 2021): 10798. http://dx.doi.org/10.3390/ijerph182010798.

Abstract:
Chronic kidney disease is one of the most important causes of mortality worldwide, but a shortage of nephrology pathologists has led to delays or errors in its diagnosis and treatment. Immunofluorescence (IF) images of patients with IgA nephropathy (IgAN), membranous nephropathy (MN), diabetic nephropathy (DN), and lupus nephritis (LN) were obtained from the General Hospital of Chinese PLA. The data were divided into training and test data. To simulate the inaccurate focus of the fluorescence microscope, the Gaussian method was employed to blur the IF images. We proposed a novel multi-task learning (MTL) method for image quality assessment, de-blurring, and disease classification tasks. A total of 1608 patients’ IF images were included—1289 in the training set and 319 in the test set. For non-blurred IF images, the classification accuracy of the test set was 0.97, with an AUC of 1.000. For blurred IF images, the proposed MTL method had a higher accuracy (0.94 vs. 0.93, p < 0.01) and higher AUC (0.993 vs. 0.986) than the common MTL method. The novel MTL method not only diagnosed four types of kidney diseases through blurred IF images but also showed good performance in two auxiliary tasks: image quality assessment and de-blurring.
36

Yang, Qiu Xia, Chuan Wen Luo, and Tian Kai Chen. "Remote Sensing Image Classification Based on Object-Oriented Method and Support Vector Machine: A Case Study in Harbin City." Advanced Materials Research 912-914 (April 2014): 1331–34. http://dx.doi.org/10.4028/www.scientific.net/amr.912-914.1331.

Full text
Abstract:
Remote sensing classification, as an important means of urban planning and construction, has received wide attention. Urban land use classification is an extremely challenging task because some land covers are spectrally too similar to be separated using only the spectral information of a remote sensing image. Object-oriented remote sensing image classification overcomes the drawbacks of the traditional pixel-based method: by combining the spectral, spatial-structure, and texture features of the images, it can effectively avoid the phenomena of "different objects sharing the same spectrum" and "the same objects differing in spectrum". The Support Vector Machine (SVM) is an excellent tool for remote sensing classification, and combining the two approaches exploits the advantages of each for high-resolution remote sensing image classification. Using a public image of Harbin city as an example, classification based on the object-oriented method and SVM achieved better results than the traditional pixel-based method.
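As a rough illustration of the per-object SVM step, a scikit-learn sketch with synthetic stand-in features; real inputs would be per-segment spectral, texture, and shape statistics:

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical object-level features: one row per image segment, combining
# mean band values with texture/shape statistics (random stand-ins here).
rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(200, 10)), rng.integers(0, 5, 200)
X_test = rng.normal(size=(50, 10))

clf = SVC(kernel="rbf", C=10.0, gamma="scale")   # per-object SVM classifier
clf.fit(X_train, y_train)
pred = clf.predict(X_test)                       # land-cover label per segment
```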
APA, Harvard, Vancouver, ISO, and other styles
37

Arcadia, Christopher E., Amanda Dombroski, Kady Oakley, Shui Ling Chen, Hokchhay Tann, Christopher Rose, Eunsuk Kim, Sherief Reda, Brenda M. Rubenstein, and Jacob K. Rosenstein. "Leveraging autocatalytic reactions for chemical domain image classification." Chemical Science 12, no. 15 (2021): 5464–72. http://dx.doi.org/10.1039/d0sc05860b.

Full text
Abstract:
Kinetic models of autocatalytic reactions have mathematical forms similar to activation functions used in artificial neural networks. Inspired by these similarities, we use a copper-catalyzed reaction to perform digital image recognition tasks.
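The similarity the authors exploit can be made concrete: the rate law of a simple autocatalytic step integrates to a logistic, i.e. sigmoid-shaped, curve. A short NumPy illustration (parameter values are arbitrary):

```python
import numpy as np

# An autocatalytic step A + X -> 2X obeys dx/dt = k * x * (a0 - x); its
# closed-form solution is the logistic curve, the same shape as the
# sigmoid activation 1 / (1 + exp(-z)) used in neural networks.
def autocatalytic_x(t, a0=1.0, k=5.0, t0=0.5):
    return a0 / (1.0 + np.exp(-a0 * k * (t - t0)))

t = np.linspace(0.0, 1.0, 101)
x = autocatalytic_x(t)   # sigmoid-shaped product concentration over time
```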
APA, Harvard, Vancouver, ISO, and other styles
38

Han, Binbin, Ping Han, and Zheng Cheng. "Object-Oriented Unsupervised Classification of PolSAR Images Based on Image Block." Remote Sensing 14, no. 16 (August 14, 2022): 3953. http://dx.doi.org/10.3390/rs14163953.

Full text
Abstract:
Land Use and Land Cover (LULC) classification is one of the tasks of Polarimetric Synthetic Aperture Radar (PolSAR) image interpretation, and the classification performance of existing algorithms is highly sensitive to the number of classes, which conflicts with the reality that LULC classification of the same image may be needed at multiple levels of detail. Therefore, an object-oriented unsupervised classification algorithm for PolSAR images based on image blocks is proposed. First, the image is divided into multiple non-overlapping blocks, and h/q/gray-Wishart classification is performed within each block. Second, each resulting cluster is regarded as an object, and the affinity matrix of the objects is computed over the global image. Finally, the objects are merged into the specified number of classes by density peak clustering (DPC), and adjacent objects at block boundaries are checked and, where appropriate, forcibly merged. Experiments were carried out with measured data from the airborne AIRSAR and E-SAR systems and the spaceborne GF-3. The results show that the proposed algorithm achieves good classification under a variety of class numbers.
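For the final merging step, here is a minimal sketch of the generic density-peak statistics (Rodriguez and Laio, 2014) over a precomputed object distance matrix; the paper's affinity construction and boundary handling are not reproduced:

```python
import numpy as np

def density_peaks(D, dc):
    """Minimal density-peak statistics over a square distance matrix D.
    Cluster centers are objects with both large rho and large delta."""
    rho = (D < dc).sum(axis=1) - 1                 # local density (self excluded)
    order = np.argsort(-rho)                       # objects by decreasing density
    delta = np.full(len(D), float(D.max()))        # densest object keeps the max
    for i in range(1, len(order)):
        p = order[i]
        delta[p] = D[p, order[:i]].min()           # distance to a denser object
    return rho, delta
```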
APA, Harvard, Vancouver, ISO, and other styles
39

Amao, Abduljamiu O. "Automating taxonomic and systematic search of benthic foraminifera in an online database." Micropaleontology 67, no. 6 (2021): 601–8. http://dx.doi.org/10.47894/mpal.67.6.06.

Full text
Abstract:
Recent advances in the application of deep neural networks to computer vision tasks such as image classification have driven a tremendous surge of interest. Several image classification algorithms can now be leveraged to automate some of the tedious tasks associated with benthic foraminifera research, especially sample picking, taxonomy, and systematics. In this study, a small image identification model was built with 414 SEM micrographs representing twenty-one species of benthic foraminifera, using a convolutional neural network that achieved 84% model accuracy and 75% validation accuracy on previously unseen images. The model was also deployed through a web application to demonstrate how it may be useful for augmenting online databases such as the Ellis & Messina catalogue and the World Register of Marine Species. These services, although very valuable, can be modernized with image search functionalities to enhance their perpetual usefulness and continuity.
APA, Harvard, Vancouver, ISO, and other styles
40

Zhang, Xin, Hangzhi Jiang, Nuo Xu, Lei Ni, Chunlei Huo, and Chunhong Pan. "MsIFT: Multi-Source Image Fusion Transformer." Remote Sensing 14, no. 16 (August 19, 2022): 4062. http://dx.doi.org/10.3390/rs14164062.

Full text
Abstract:
Multi-source image fusion is very important for improving image representation ability, since its essence relies on the complementarity between multi-source information. However, feature-level image fusion methods based on convolutional neural networks are affected by spatial misalignment between image pairs, which biases the semantics of the merged features and destroys the representation of regions of interest. In this paper, a novel multi-source image fusion transformer (MsIFT) is proposed. Owing to the transformer's inherent global attention mechanism, the MsIFT has non-local fusion receptive fields and is more robust to spatial misalignment. Furthermore, multiple classification-based downstream tasks (e.g., pixel-wise classification, image-wise classification, and semantic segmentation) are unified in the proposed MsIFT framework, and the fusion module architecture is shared across tasks. The MsIFT achieved state-of-the-art performance on the image-wise classification dataset VAIS, the semantic segmentation dataset SpaceNet 6, and the pixel-wise classification dataset GRSS-DFC-2013. The code and trained model are to be released upon publication.
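The non-local fusion idea can be sketched with a standard cross-attention layer; this is our illustrative PyTorch reading, not the MsIFT module itself:

```python
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    """Transformer-style fusion sketch: tokens of one modality attend to all
    tokens of the other, so the receptive field is non-local and moderate
    spatial misalignment does not break feature merging."""
    def __init__(self, dim=256, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, tok_a, tok_b):            # (B, N, dim) token sequences
        fused, _ = self.attn(query=tok_a, key=tok_b, value=tok_b)
        return self.norm(tok_a + fused)         # residual fusion of B into A
```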
APA, Harvard, Vancouver, ISO, and other styles
41

Hu, Wei, Yangyu Huang, Li Wei, Fan Zhang, and Hengchao Li. "Deep Convolutional Neural Networks for Hyperspectral Image Classification." Journal of Sensors 2015 (2015): 1–12. http://dx.doi.org/10.1155/2015/258619.

Full text
Abstract:
Recently, convolutional neural networks have demonstrated excellent performance on various visual tasks, including the classification of common two-dimensional images. In this paper, deep convolutional neural networks are employed to classify hyperspectral images directly in the spectral domain. More specifically, the proposed classifier contains five layers with weights: the input layer, a convolutional layer, a max-pooling layer, a fully connected layer, and the output layer. These layers are applied to each spectral signature to discriminate it from the others. Experimental results on several hyperspectral image datasets demonstrate that the proposed method achieves better classification performance than traditional methods such as support vector machines and conventional deep learning-based methods.
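The described layout maps naturally onto a 1D CNN over each spectrum; the layer sizes below are illustrative guesses, not the paper's:

```python
import torch
import torch.nn as nn

class SpectralCNN(nn.Module):
    """Sketch of the input -> conv -> max-pool -> fully-connected -> output
    layout applied to a single spectral signature."""
    def __init__(self, n_bands=200, n_classes=16):
        super().__init__()
        self.conv = nn.Conv1d(1, 20, kernel_size=11)   # spectral convolution
        self.pool = nn.MaxPool1d(3)
        flat = 20 * ((n_bands - 11 + 1) // 3)          # flattened feature size
        self.fc = nn.Linear(flat, 100)
        self.out = nn.Linear(100, n_classes)

    def forward(self, x):                              # x: (B, 1, n_bands)
        x = self.pool(torch.tanh(self.conv(x)))
        x = torch.flatten(x, 1)
        return self.out(torch.tanh(self.fc(x)))
```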
APA, Harvard, Vancouver, ISO, and other styles
42

Xu, Meng, Yuanyuan Zhao, Yajun Liang, and Xiaorui Ma. "Hyperspectral Image Classification Based on Class-Incremental Learning with Knowledge Distillation." Remote Sensing 14, no. 11 (May 26, 2022): 2556. http://dx.doi.org/10.3390/rs14112556.

Full text
Abstract:
By virtue of their wide spatial coverage and high-resolution spectral information, hyperspectral images make many fine-grained, mapping-based remote sensing applications possible. However, due to the inconsistency of land-cover types between different images, most hyperspectral image classification methods remain effective only by training on every image and saving all classification models and training samples, which limits the promotion of related remote sensing tasks. To deal with these issues, this paper proposes a hyperspectral image classification method based on class-incremental learning that learns new land-cover types without forgetting the old ones, enabling a single final model to classify all land-cover types. Specifically, when learning new classes, a knowledge distillation strategy is designed to recall the information of old classes by transferring knowledge to the newly trained network, and a linear correction layer is proposed to relax the heavy bias towards new classes by reapportioning information between classes. Additionally, the proposed method introduces a channel attention mechanism to effectively utilize spatial–spectral information through a recalibration strategy. Experimental results on three widely used hyperspectral images demonstrate that the proposed method can identify both new and old land-cover types with high accuracy, which suggests it is more practical for large-coverage remote sensing tasks.
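A knowledge distillation term for retaining old classes typically looks like the following PyTorch sketch; the temperature and reduction choices here are ours, not necessarily the paper's:

```python
import torch.nn.functional as F

def distillation_loss(new_logits, old_logits, T=2.0):
    """Keep the new network's outputs on OLD classes close to the frozen old
    network's outputs, so old land-cover types are not forgotten.
    new_logits: new network's logits restricted to the old classes
    old_logits: frozen old network's logits on the same inputs"""
    p_old = F.softmax(old_logits / T, dim=1)
    log_p_new = F.log_softmax(new_logits / T, dim=1)
    return F.kl_div(log_p_new, p_old, reduction="batchmean") * T * T
```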
APA, Harvard, Vancouver, ISO, and other styles
43

Yang, Kaiwen, Aiga Suzuki, Jiaxing Ye, Hirokazu Nosato, Ayumi Izumori, and Hidenori Sakanashi. "CTG-Net: Cross-task guided network for breast ultrasound diagnosis." PLOS ONE 17, no. 8 (August 11, 2022): e0271106. http://dx.doi.org/10.1371/journal.pone.0271106.

Full text
Abstract:
Deep learning techniques have achieved remarkable success in lesion segmentation and in classifying benign versus malignant tumors in breast ultrasound images. However, existing studies predominantly focus on devising efficient neural network structures to tackle specific tasks individually. By contrast, in clinical practice, sonographers perform segmentation and classification as a whole: they investigate the border contours of the tissue while detecting abnormal masses and performing diagnostic analysis. Performing multiple cognitive tasks simultaneously in this manner facilitates exploitation of the commonalities and differences between tasks. Inspired by this unified recognition process, this study proposes a novel learning scheme, called the cross-task guided network (CTG-Net), for efficient ultrasound breast image understanding. CTG-Net integrates the two most significant tasks in computerized breast lesion pattern investigation: lesion segmentation and tumor classification. Further, it enables the learning of efficient feature representations across tasks from ultrasound images, along with the task-specific discriminative features that can greatly facilitate lesion detection. This is achieved using task-specific attention models to share prediction results between tasks. Then, following the guidance of task-specific attention soft masks, the joint feature responses are efficiently calibrated through iterative model training. Finally, a simple feature fusion scheme is used to aggregate the attention-guided features for efficient ultrasound pattern analysis. We performed extensive experimental comparisons on multiple ultrasound datasets. Compared to state-of-the-art multi-task learning approaches, the proposed approach improves the Dice coefficient, true-positive rate of segmentation, AUC, and sensitivity of classification by 11%, 17%, 2%, and 6%, respectively. The results demonstrate that the proposed cross-task guided feature learning framework can effectively fuse the complementary information of ultrasound image segmentation and classification tasks to achieve accurate tumor localization, and can thus aid sonographers in detecting and diagnosing breast cancer.
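One common way to realize such cross-task guidance, sketched here under our own assumptions rather than as the CTG-Net design, is to let the segmentation head's soft mask re-weight the classification features:

```python
import torch
import torch.nn as nn

class TaskAttentionGate(nn.Module):
    """Cross-task guidance sketch: a soft lesion mask derived from the
    segmentation features acts as spatial attention on the classification
    features, calibrating the joint feature responses."""
    def __init__(self, channels=64):
        super().__init__()
        self.to_mask = nn.Conv2d(channels, 1, 1)   # 1x1 conv -> soft mask

    def forward(self, seg_feat, cls_feat):
        mask = torch.sigmoid(self.to_mask(seg_feat))
        return cls_feat * (1.0 + mask)             # emphasize lesion regions
```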
APA, Harvard, Vancouver, ISO, and other styles
44

Murray, R., and L. Pritchett. "A classification-image-like method reveals strategies in 2afc tasks." Journal of Vision 14, no. 10 (August 22, 2014): 389. http://dx.doi.org/10.1167/14.10.389.

Full text
APA, Harvard, Vancouver, ISO, and other styles
45

Jiang, Yuanyuan, Jinyang Xie, and Dong Zhang. "An Adaptive Offset Activation Function for CNN Image Classification Tasks." Electronics 11, no. 22 (November 18, 2022): 3799. http://dx.doi.org/10.3390/electronics11223799.

Full text
Abstract:
The performance of the activation function in a convolutional neural network is directly related to the model's image classification accuracy. The rectified linear unit (ReLU) activation function has been used extensively in image classification models but has significant shortcomings, including limiting classification accuracy. The strong performance of a series of parametric activation functions has made parameter addition a popular research avenue for improving activation functions in recent years, and excellent progress has been achieved. Existing parametric activation functions typically focus on assigning a different slope to the negative part of the activation function and still process negative values in isolation, without considering how linking the negative and positive parts affects performance. This work therefore proposes a novel parametric right-shift activation function, the adaptive offset activation function (AOAF). By inserting an adaptive parameter (the mean value of the input feature tensor) and two custom ReLU parameters, negative inputs previously driven to zero by ReLU can be turned into small positive activations that participate in CNN feature extraction. We compared the performance of the proposed activation function to that of a selection of typical activation functions on four public datasets. Compared with ReLU, the average classification accuracy of the proposed activation function improved by 3.82%, 0.6%, 1.02%, and 4.8% on the four datasets, respectively.
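The published formula is not given in the abstract; what follows is one plausible reading of the described mechanism as a hedged PyTorch sketch, and the actual AOAF definition may differ:

```python
import torch

def aoaf(x, alpha=0.1, beta=0.1):
    """Speculative adaptive-offset activation in the spirit of AOAF: shift
    the ReLU threshold by the feature mean so former negatives can still
    contribute small positive responses. alpha and beta stand in for the
    two custom parameters; this is NOT the published formula."""
    mu = x.mean()                              # adaptive offset: tensor mean
    return torch.clamp(x - alpha * mu, min=0.0) + beta * mu
```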
APA, Harvard, Vancouver, ISO, and other styles
46

Choung, Yun Jae, and Myung Hee Jo. "Surface Material Classification Using Landsat-8 OLI Image Acquired in Ulsan by the Different Machine Learning Techniques." Applied Mechanics and Materials 865 (June 2017): 650–56. http://dx.doi.org/10.4028/www.scientific.net/amm.865.650.

Full text
Abstract:
Surface material classification is an important task for the preservation of land properties and the management of land development plans. Remotely sensed images make surface material classification possible without physical access to the site. This research aims to select the most appropriate machine learning technique for surface material classification from remotely sensed images. Three machine learning techniques (MD (Minimum Distance), MLC (Maximum Likelihood Classification), and SVM (Support Vector Machine)) were applied to a Landsat-8 OLI (Operational Land Imager) image acquired over Ulsan, South Korea, in the following steps. First, training samples for each land cover in the Landsat image were selected manually. Next, the three techniques were each applied to the image to carry out the surface material classification. The accuracies of the three resulting land-cover classification maps were then assessed against ground truth. Finally, the accuracies were compared to select the most suitable approach for classifying the various surface materials in Ulsan. The statistical results show that the SVM classifier is superior to the MD and MLC classifiers for surface material classification on the given Landsat-8 OLI image.
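The three classifiers have close scikit-learn analogues (NearestCentroid for MD, quadratic discriminant analysis for MLC with Gaussian class models); a runnable comparison skeleton with synthetic stand-in pixels, where the band count and labels are placeholders:

```python
import numpy as np
from sklearn.neighbors import NearestCentroid                             # ~ MD
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis   # ~ MLC
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Random stand-ins for per-pixel band values and land-cover labels.
rng = np.random.default_rng(1)
X_train, y_train = rng.normal(size=(400, 7)), rng.integers(0, 4, 400)
X_test, y_test = rng.normal(size=(100, 7)), rng.integers(0, 4, 100)

for name, clf in [("MD", NearestCentroid()),
                  ("MLC", QuadraticDiscriminantAnalysis()),
                  ("SVM", SVC(kernel="rbf", gamma="scale"))]:
    clf.fit(X_train, y_train)
    print(name, accuracy_score(y_test, clf.predict(X_test)))
```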
APA, Harvard, Vancouver, ISO, and other styles
47

Ren, Yi, Mengzhen Nie, Shichao Li, and Chuankun Li. "Single Image De-Raining via Improved Generative Adversarial Nets." Sensors 20, no. 6 (March 12, 2020): 1591. http://dx.doi.org/10.3390/s20061591.

Full text
Abstract:
Capturing images on rainy days degrades visual quality and hampers analysis tasks such as object detection and classification; image de-raining has therefore attracted much attention in recent years. In this paper, an improved generative adversarial network for single-image de-raining is proposed. Following the principle of divide and conquer, we split the de-raining task into rain-locating, rain-removing, and detail-refining sub-tasks. A multi-stream DenseNet, termed the Rain Estimation Network, is proposed to estimate the rain location map; a generative adversarial network removes the rain streaks; and a Refinement Network refines the details. Experiments on two synthetic datasets and real-world images demonstrate that the proposed method outperforms state-of-the-art de-raining methods in both objective and subjective measures.
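The three-stage decomposition can be expressed structurally as below; the tiny convolutional stages are placeholders for illustration, not the paper's Rain Estimation / GAN / Refinement networks:

```python
import torch
import torch.nn as nn

class DerainPipeline(nn.Module):
    """Structural sketch of the divide-and-conquer design: locate rain,
    remove it, then refine details."""
    def __init__(self):
        super().__init__()
        self.rain_net = nn.Conv2d(3, 1, 3, padding=1)   # rain location map
        self.generator = nn.Conv2d(4, 3, 3, padding=1)  # rain removal (GAN G)
        self.refiner = nn.Conv2d(3, 3, 3, padding=1)    # detail refinement

    def forward(self, rainy):
        rain_map = torch.sigmoid(self.rain_net(rainy))
        coarse = self.generator(torch.cat([rainy, rain_map], dim=1))
        return self.refiner(coarse)
```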
APA, Harvard, Vancouver, ISO, and other styles
48

Shen, Yuxin, Minn N. Yoon, Silvia Ortiz, Reid Friesen, and Hollis Lai. "Evaluating Classification Consistency of Oral Lesion Images for Use in an Image Classification Teaching Tool." Dentistry Journal 9, no. 8 (August 12, 2021): 94. http://dx.doi.org/10.3390/dj9080094.

Full text
Abstract:
A web-based image classification tool (DiLearn) was developed to facilitate active learning in the oral health professions. Students engage with oral lesion images using swipe gestures to classify each image into pre-determined categories (e.g., left for refer and right for no intervention). To assemble the training modules and provide feedback to students, DiLearn requires each oral lesion image to be classified according to the various features displayed in it. Collecting accurate meta-information is a crucial step in enabling the self-directed active learning approach taken in DiLearn. The purpose of this study is to evaluate the classification consistency of features in oral lesion images by experts and students for use in the learning tool. Twenty oral lesion images from DiLearn's image bank were classified by three oral lesion experts and two senior dental hygiene students using the same rubric containing eight features. Classification agreement among and between raters was evaluated using Fleiss' and Cohen's Kappa. Agreement among the three experts ranged from perfect (Fleiss' Kappa = 1) for "clinical action" to slight for "border regularity" (Fleiss' Kappa = 0.136), with the majority of categories showing fair to moderate agreement (Fleiss' Kappa = 0.332–0.545). Including the two student raters alongside the experts yielded fair to moderate overall classification agreement (Fleiss' Kappa = 0.224–0.554), with the exception of "morphology". The feature of clinical action could be classified consistently, while other anatomical features indirectly related to diagnosis had lower classification consistency. The findings suggest that one oral lesion expert or two student raters can provide fairly consistent meta-information for selected categories of features implicated in the creation of image classification tasks in DiLearn.
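Fleiss' kappa itself is straightforward to compute from an items-by-categories count table; a self-contained NumPy sketch (the toy table below is ours, not the study's data):

```python
import numpy as np

def fleiss_kappa(counts):
    """Fleiss' kappa from an (items x categories) table of rating counts,
    assuming the same number of raters per item."""
    counts = np.asarray(counts, dtype=float)
    n = counts.sum(axis=1)[0]                                # raters per item
    p_i = ((counts ** 2).sum(axis=1) - n) / (n * (n - 1))    # per-item agreement
    P_bar = p_i.mean()                                       # observed agreement
    p_j = counts.sum(axis=0) / counts.sum()                  # category marginals
    P_e = (p_j ** 2).sum()                                   # chance agreement
    return (P_bar - P_e) / (1 - P_e)

# Toy example: 4 images, 3 raters, 2 categories (e.g., refer / no intervention).
print(fleiss_kappa([[3, 0], [2, 1], [0, 3], [3, 0]]))
```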
APA, Harvard, Vancouver, ISO, and other styles
49

Dinesh Kumar, R., E. Golden Julie, Y. Harold Robinson, S. Vimal, Gaurav Dhiman, and Murugesh Veerasamy. "Deep Convolutional Nets Learning Classification for Artistic Style Transfer." Scientific Programming 2022 (January 10, 2022): 1–9. http://dx.doi.org/10.1155/2022/2038740.

Full text
Abstract:
Humans have mastered the skill of creativity for many decades. Recently, this mechanism has been replicated using neural networks that mimic the functioning of the human brain, where each unit represents a neuron that transmits messages to other neurons to perform subconscious tasks. There are established methods for rendering an input image in the style of famous artworks; this problem of generating art is normally called non-photorealistic rendering. Previous approaches rely on directly manipulating the pixel representation of the image, whereas this paper, using deep neural networks built for image recognition, operates in a feature space representing the higher-level content of the image. Deep neural networks have previously been used for object recognition and style recognition to categorize artworks by their creation time. This paper uses the Visual Geometry Group (VGG16) network to replicate this dormant human task. Two images are given as input: a content image containing the features to retain in the output, and a style reference image containing the patterns of famous paintings. The network blends them to produce a new image in which the content is preserved but "sketched" in the manner of the style image.
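In this line of work, the style side of the transfer is usually captured by Gram-matrix statistics of VGG16 feature maps (Gatys et al.); a minimal PyTorch sketch of that loss, offered as a generic illustration rather than this paper's exact formulation:

```python
import torch

def gram_matrix(feat):
    """Gram matrix of a (B, C, H, W) feature map; channel correlations
    summarize an image's style independently of spatial layout."""
    b, c, h, w = feat.shape
    f = feat.view(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

def style_loss(gen_feat, style_feat):
    """Mean squared distance between Gram matrices of the generated and
    style images at one VGG16 layer."""
    return torch.mean((gram_matrix(gen_feat) - gram_matrix(style_feat)) ** 2)
```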
APA, Harvard, Vancouver, ISO, and other styles
50

Ji, Lipeng, Xiaohui Hu, and Mingye Wang. "Saliency Preprocessing Locality-Constrained Linear Coding for Remote Sensing Scene Classification." Electronics 7, no. 9 (August 30, 2018): 169. http://dx.doi.org/10.3390/electronics7090169.

Full text
Abstract:
Locality-constrained Linear Coding (LLC) shows superior image classification performance owing to its underlying properties of local smoothness, sparsity, and good reconstruction. It encodes the visual features in remote sensing images, modeling human visual perception of an image on a computer. However, it does not consider the saliency preprocessing performed by the human visual system, even though saliency detection preprocessing can effectively enhance a computer's perception of remote sensing images. To better address remote sensing image scene classification, this paper proposes a new approach combining saliency detection preprocessing with LLC. The saliency detection preprocessing is realized using spatial-pyramid Gaussian kernel density estimation. Experiments show that the proposed method achieves better performance on remote sensing scene classification tasks.
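For reference, the analytical LLC encoding step (Wang et al., 2010) can be sketched in a few lines of NumPy; codebook learning and the saliency preprocessing described above are out of scope here:

```python
import numpy as np

def llc_code(x, B, k=5, lam=1e-4):
    """Minimal LLC encoding: approximate descriptor x with its k nearest
    codebook atoms B (n_atoms x dim), with codes constrained to sum to 1."""
    d = np.linalg.norm(B - x, axis=1)
    idx = np.argsort(d)[:k]                    # locality: k nearest atoms
    z = B[idx] - x                             # shift atoms to the origin
    C = z @ z.T + lam * np.eye(k)              # regularized local covariance
    w = np.linalg.solve(C, np.ones(k))
    w /= w.sum()                               # enforce the sum-to-one constraint
    code = np.zeros(len(B))
    code[idx] = w                              # sparse code over the codebook
    return code
```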
APA, Harvard, Vancouver, ISO, and other styles