Journal articles on the topic 'Object recognition from optical images'

Consult the top 50 journal articles for your research on the topic 'Object recognition from optical images.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Browse journal articles from a wide variety of disciplines and organise your bibliography correctly.

1

Loo, Chu Kiong. "Simulated Quantum-Optical Object Recognition from High-Resolution Images." Optics and Spectroscopy 99, no. 2 (2005): 218. http://dx.doi.org/10.1134/1.2034607.

2

Rasheed, Nada Abdullah, and Wessam Lahmod Nados. "Object Segmentation from Background of 2D Image." JOURNAL OF UNIVERSITY OF BABYLON for Pure and Applied Sciences 26, no. 5 (March 12, 2018): 204–15. http://dx.doi.org/10.29196/jub.v26i5.913.

Abstract:
Accurately segmenting an object from the background of a 2D image remains a difficult, unsolved task in the image processing field. A new method has therefore been proposed for segmenting the object from its background in order to enhance the images and obtain the characteristics of the object without the rest of the image region. This process is important for optimal classification in pattern recognition. The proposed method includes several tasks: after loading the six image files, it applies a segmentation algorithm that depends on the border and the color of the object. Finally, a 2D median filtering algorithm is employed to remove noisy objects of various shapes and sizes. The algorithm was tested on a variety of images and achieved high precision; in other words, the proposed method is able to segment objects from the background with promising results.
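For readers who want to try the pipeline this abstract outlines, here is a minimal Python sketch of color-rule segmentation followed by 2D median filtering; the HSV bounds, kernel size, and file name are illustrative assumptions, not values from the paper.

```python
import cv2
import numpy as np

def segment_object(image_bgr, hsv_lo=(0, 60, 40), hsv_hi=(30, 255, 255)):
    # Color rule: keep pixels whose hue/saturation fall inside the assumed range
    hsv = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, np.array(hsv_lo, np.uint8), np.array(hsv_hi, np.uint8))
    # 2D median filter: removes small noisy regions of various shapes and sizes
    mask = cv2.medianBlur(mask, 5)
    # Border rule: keep only the largest closed contour as the object
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    clean = np.zeros_like(mask)
    if contours:
        cv2.drawContours(clean, [max(contours, key=cv2.contourArea)], -1, 255,
                         thickness=cv2.FILLED)
    return cv2.bitwise_and(image_bgr, image_bgr, mask=clean)

segmented = segment_object(cv2.imread("sample.jpg"))
```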
3

Morozov, A. N., A. L. Nazolin, and I. L. Fufurin. "Optical and Spectral Methods for Detection and Recognition of Unmanned Aerial Vehicles." Radio Engineering, no. 2 (May 17, 2020): 39–50. http://dx.doi.org/10.36027/rdeng.0220.0000167.

Abstract:
The paper considers the problem of detecting and identifying unmanned aerial vehicles (UAVs) against animate and inanimate objects, and of identifying their load, by optical and spectral optical methods. The state-of-the-art analysis has shown that, when using radar methods to detect small UAVs, there is a dead zone at distances of 250-700 m, so it is important to use optical methods for detecting UAVs. The application possibilities and improvements of the optical scheme for detecting UAVs at long distances of about 1-2 km are considered. Location is performed by the intrinsic infrared (IR) radiation of an object using IR cameras and thermal imagers, as well as a laser rangefinder (LIDAR). The paper gives examples of successful dynamic detection and recognition of objects from video images by methods of graph theory and neural networks using the Faster R-CNN, YOLO, and SSD network models, including from a single received frame. The possibility of using available spectral optical methods to analyze the chemical composition of materials, which can be employed for remote identification of UAV coating materials as well as for detecting trace amounts of matter on the UAV surface, has been studied. The advantages and disadvantages of luminescent spectroscopy with UV illumination, Raman spectroscopy, differential absorption spectroscopy based on a tunable UV laser, spectral imaging methods (hyper-/multispectral images), and diffuse reflectance laser spectroscopy using infrared tunable quantum cascade lasers (QCL) are shown. To assess the limiting distances for detecting and identifying UAVs, as well as for identifying the chemical composition of an object by optical and spectral optical methods, the described experimental setup (a hybrid lidar UAV identification complex) is expected to be useful. The structure of the experimental setup and its performance are described. Such studies are aimed at developing the scientific basis for remote detection, identification, tracking, and determination of UAV parameters and of UAV membership in different groups by optical location and spectroscopy methods, as well as at automatic optical UAV recognition in various environments against a background of moving wildlife. The proposed solution is to combine optical location and spectral analysis methods with methods of statistics, graph theory, deep learning, neural networks, and automatic control, which is an interdisciplinary fundamental scientific task.
4

FINLAYSON, GRAHAM D., and GUI YUN TIAN. "COLOR NORMALIZATION FOR COLOR OBJECT RECOGNITION." International Journal of Pattern Recognition and Artificial Intelligence 13, no. 08 (December 1999): 1271–85. http://dx.doi.org/10.1142/s0218001499000720.

Abstract:
Color images depend on the color of the capture illuminant and on object reflectance. As such, image colors are not stable features for object recognition; however, stability is necessary, since perceived colors (the colors we see) are illuminant independent and do correlate with object identity. Before the colors in images can be compared, they must first be preprocessed to remove the effect of illumination. Two types of preprocessing have been proposed: first, run a color constancy algorithm, or second, apply an invariant normalization. In color constancy preprocessing the illuminant color is estimated and then, at a second stage, the image colors are corrected to remove the color bias due to illumination. In color invariant normalization, image RGBs are redescribed, in an illuminant-independent way, relative to the context in which they are seen (e.g. RGBs might be divided by a local RGB average). In theory the color constancy approach is superior, since it works scene-independently: a color invariant normalization can be calculated post-color constancy, but the converse is not true. In practice, however, color invariant normalization usually supports better indexing. In this paper we ask whether color constancy algorithms will ever deliver better indexing than color normalization. The main result of this paper is to demonstrate an equivalence between color constancy and color invariant computation. The equivalence is empirically derived from color object recognition experiments: colorful objects are imaged under several different colors of light. To remove the dependency due to illumination, these images are preprocessed using either a perfect color constancy algorithm or the comprehensive color image normalization. In the perfect color constancy algorithm the illuminant is measured rather than estimated. The import of this is that the perfect color constancy algorithm can determine the actual illuminant without error and so bounds the performance of all existing and future algorithms. After color constancy or color normalization processing, the color content is used as a cue for object recognition. Counter-intuitively, perfect color constancy does not support perfect recognition; in comparison, the color invariant normalization delivers near-perfect recognition. That the color constancy approach fails implies that the scene's effective illuminant is different from the measured illuminant. This explanation has merit, since it is well known that color constancy is more difficult in the presence of physical processes such as fluorescence and mutual illumination. Thus, in a second experiment, image colors are corrected based on a scene-dependent "effective illuminant". Here, color constancy preprocessing facilitates near-perfect recognition. Of course, if the effective light is scene dependent then optimal color constancy processing is also scene dependent and so is equally a color invariant normalization.
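The normalization the authors compare against color constancy can be sketched directly from the abstract's description: alternately cancel the illuminant per pixel and per channel until a fixed point. A minimal numpy version, with the tolerance and scaling constants as assumptions:

```python
import numpy as np

def comprehensive_normalize(img, tol=1e-6, max_iter=50):
    x = img.astype(np.float64) + 1e-12            # H x W x 3; avoid division by zero
    for _ in range(max_iter):
        prev = x.copy()
        x = x / x.sum(axis=2, keepdims=True)      # pixel-wise: cancel illuminant intensity
        x = x / (3.0 * x.mean(axis=(0, 1), keepdims=True))  # channel-wise: cancel illuminant color
        if np.abs(x - prev).max() < tol:          # fixed point reached
            break
    return x
```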
5

Slyusar, Vadym, Mykhailo Protsenko, Anton Chernukha, Pavlo Kovalov, Pavlo Borodych, Serhii Shevchenko, Oleksandr Chernikov, Serhii Vazhynskyi, Oleg Bogatov, and Kirill Khrustalev. "Improvement of the model of object recognition in aero photographs using deep convolutional neural networks." Eastern-European Journal of Enterprise Technologies 5, no. 2 (113) (October 31, 2021): 6–21. http://dx.doi.org/10.15587/1729-4061.2021.243094.

Abstract:
Detection and recognition of objects in images is the main problem to be solved by computer vision systems. As part of solving this problem, the model of object recognition in aerial photographs taken from unmanned aerial vehicles has been improved. A study of object recognition in aerial photographs using deep convolutional neural networks has been carried out. Analysis of possible implementations showed that the AlexNet 2012 model (Canada), trained on the ImageNet image set (China), is most suitable for this problem; it was used as the basic model. The object recognition error for this model on the ImageNet test set of images amounted to 15 %. To improve the effectiveness of object recognition in aerial photographs for 10 classes of images, the final fully connected layer was modified by reducing it from 1,000 to 10 neurons, followed by additional two-stage training of the resulting model. Additional training was carried out with a set of images prepared from aerial photographs at stage 1 and with the VisDrone 2021 (China) image set at stage 2. Optimal training parameters were selected: a learning rate (step) of 0.0001 and 100 epochs. As a result, a new model, named AlexVisDrone, was obtained. The effectiveness of the proposed model was checked with a test set of 100 images for each of the 10 classes, with accuracy and sensitivity chosen as the main performance indicators. An increase in recognition accuracy of 7 % (for images from aerial photographs) and 9 % (for the VisDrone 2021 set) was obtained, which indicates that the choice of neural network architecture and training parameters was correct. The proposed model makes it possible to automate object recognition in aerial photographs. In the future, it is advisable to use this model at the ground stations controlling unmanned aerial vehicle complexes when processing aerial photographs taken from unmanned aerial vehicles, in robotic systems, in video surveillance complexes, and when designing unmanned vehicle systems.
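The architectural change the abstract reports, replacing AlexNet's 1,000-way output with 10 neurons before two-stage fine-tuning, can be sketched in a few lines of PyTorch; the optimizer choice is an assumption, while the learning rate and epoch counts mirror the reported values.

```python
import torch
import torch.nn as nn
import torchvision.models as models

model = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1)
model.classifier[6] = nn.Linear(4096, 10)   # final fully connected layer: 1,000 -> 10 neurons

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)   # reported step: 0.0001
# Stage 1: additional training on the aerial-photograph set (100 epochs).
# Stage 2: additional training on the VisDrone 2021 set (100 epochs).
```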
6

Kupinski, Matthew A., Eric Clarkson, John W. Hoppin, Liying Chen, and Harrison H. Barrett. "Experimental determination of object statistics from noisy images." Journal of the Optical Society of America A 20, no. 3 (March 1, 2003): 421. http://dx.doi.org/10.1364/josaa.20.000421.

7

Kholifah, Desiana Nur, Hendri Mahmud Nawawi, and Indra Jiwana Thira. "IMAGE BACKGROUND PROCESSING FOR COMPARING ACCURACY VALUES OF OCR PERFORMANCE." Jurnal Pilar Nusa Mandiri 16, no. 1 (March 15, 2020): 33–38. http://dx.doi.org/10.33480/pilar.v16i1.1076.

Abstract:
Optical Character Recognition (OCR) is an application used to convert digital text images into text. Many documents have a background in the form of an image; visually, the background image increases the security of the document by attesting to its authenticity, but it degrades OCR performance because it makes it difficult for OCR to recognize characters overwritten by the background image. Removing the background image can maximize OCR performance compared with document images that still have a background. The thresholding method is used to eliminate background images, and recall, precision, and character recognition rate are computed to determine the performance of the three OCR engines used as the objects of research. Eliminating the background image with thresholding increased the performance of all three types of OCR studied.
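A minimal sketch of the preprocessing step evaluated here, assuming Otsu's method as the concrete threshold and pytesseract as a stand-in OCR engine (the paper compares three engines without prescribing one):

```python
import cv2
import pytesseract

gray = cv2.cvtColor(cv2.imread("document.png"), cv2.COLOR_BGR2GRAY)
# Global thresholding suppresses the background image so only text survives
_, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
text = pytesseract.image_to_string(binary)
```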
8

Lysenko, A. V., M. S. Oznobikhin, E. A. Kireev, K. S. Dubrova, and S. S. Vorobyeva. "Identification of Baikal phytoplankton inferred from computer vision methods and machine learning." Limnology and Freshwater Biology, no. 3 (2021): 1143–46. http://dx.doi.org/10.31951/2658-3518-2021-a-3-1143.

Abstract:
This study discusses the problem of phytoplankton classification using computer vision methods and convolutional neural networks. We created a system for automatic object recognition consisting of two parts: analysis and primary processing of phytoplankton images, and development of a neural network based on the obtained information about the images. We developed software that can detect particular objects in images from a light microscope. We trained a convolutional neural network by transfer learning and determined the optimal parameters of this neural network and the optimal size of the dataset used. To increase accuracy for particular groups of classes, we created three neural networks with the same structure. The accuracy obtained in the classification of Baikal phytoplankton by these neural networks was up to 80%.
9

Jung, Jae-Hyun, Tian Pu, and Eli Peli. "Comparing object recognition from binary and bipolar edge images for visual prostheses." Journal of Electronic Imaging 25, no. 6 (December 22, 2016): 061619. http://dx.doi.org/10.1117/1.jei.25.6.061619.

10

R. Sanjuna, K., and K. Dinakaran. "A Multi-Object Feature Selection Based Text Detection and Extraction Using Skeletonized Region Optical Character Recognition in-Text Images." International Journal of Engineering & Technology 7, no. 3.6 (July 4, 2018): 386. http://dx.doi.org/10.14419/ijet.v7i3.6.16009.

Abstract:
Extracting information or content from images is a crucial task for obtaining text in natural scene images. The problem arises because images vary in the attributes of the objects they contain, such as background filling, saturation, and color, and text rendered in different styles varies the essential information, which leads to misunderstandings when detecting characters; detecting text regions therefore needs more accuracy to identify the exact object. To address this problem, we propose multi-objective features for text detection and localization based on skeletonized text bounding-box regions and text confidence scores. This contributes intra-edge detection and segmentation along the skeleton of the object. The aim of the multi-objective region selection model (MSOR) is to recognize the exact character of style matches using bounding-box region analysis, which identifies the object portion to accomplish the candidate extraction model. To enclose the text region, localization in low-resolution and hazy images is handled by edge-smoothing quick guided filter methods. The regions are then skeletonized, morphing the segmented regions of the inter-segmentation to extract the text.
11

Мельник, Р. А., Р. І. Квіт, and Т. М. Сало. "Face image profiles features extraction for recognition systems." Scientific Bulletin of UNFU 31, no. 1 (February 4, 2021): 117–21. http://dx.doi.org/10.36930/40310120.

Abstract:
The object of research is the piecewise linear approximation algorithm as applied to the selection of facial features and the compression of face images. One of the problem areas is obtaining the optimal ratio between the degree of compression and the accuracy of image reproduction, as well as the accuracy of the obtained facial features, which can be used to search for people in databases. The main characteristics of the image of a face are the coordinates and sizes of the eyes, mouth, nose and other objects of attention; the dimensions, the distances between them, and their ratios also form a set of characteristics. A piecewise linear approximation algorithm is used to identify and determine these features. First, it is used to approximate the image of the face to obtain a silhouette graph from right to left and, second, to approximate fragments of the face to obtain silhouettes of the face from top to bottom. The purpose of the next stage is to implement multilevel segmentation of the approximated images to cover them with rectangles of different intensity; due to their shape, these are called barcodes. After these three stages of the algorithm, each face is represented by two barcode images, one vertical and one horizontal. This material is used to calculate facial features. The mean intensity function over a row or column is used to form an approximation object and as a tool to measure the values of facial image characteristics. Additionally, the widths of the barcodes and the distances between them are calculated. Experimental results with faces from known databases are presented. The piecewise linear approximation is also used to compress facial images, and experiments have shown how the accuracy of the approximation changes with the degree of compression of the image. The method has linear algorithmic complexity in the number of pixels in the image, which allows it to be tested on large data. Finding the coordinates of a synchronized object, such as the eyes, allows calculating all the distances between the objects of attention on the face in relative form. The developed software has control parameters for conducting research.
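The profile computation at the heart of this method is simple to reproduce: the mean intensity of each row and each column of a grayscale face image gives the two silhouette curves that are then piecewise-linearly approximated into "barcodes". A sketch of just the profile step (the approximation and barcode stages are omitted):

```python
import numpy as np

def face_profiles(gray):            # gray: 2D numpy array of a face image
    horizontal = gray.mean(axis=1)  # one value per row: top-to-bottom silhouette
    vertical = gray.mean(axis=0)    # one value per column: left-to-right silhouette
    return horizontal, vertical
```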
12

Liu, Shuhua, Huixin Xu, Qi Li, Fei Zhang, and Kun Hou. "A Robot Object Recognition Method Based on Scene Text Reading in Home Environments." Sensors 21, no. 5 (March 9, 2021): 1919. http://dx.doi.org/10.3390/s21051919.

Abstract:
To solve issues of robot object recognition in complex scenes, this paper proposes an object recognition method based on scene text reading. The proposed method simulates human-like behavior and accurately identifies objects with texts through careful reading. First, deep learning models with high accuracy are adopted to detect and recognize text in multi-view. Second, datasets including 102,000 Chinese and English scene text images and their inverses are generated. The F-measure of text detection is improved by 0.4% and the recognition accuracy is improved by 1.26% when the model is trained on these two datasets. Finally, a robot object recognition method based on scene text reading is proposed. The robot detects and recognizes texts in the image and stores the recognition results in a text file. When the user gives the robot a fetching instruction, the robot searches for the corresponding keywords in the text files and obtains the confidences of multiple objects in the scene image; the object with the maximum confidence is selected as the target. The results show that the robot can accurately distinguish objects of arbitrary shape and category, and it can effectively solve the problem of object recognition in home environments.
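The fetching logic described at the end of the abstract reduces to a keyword search over stored recognition results followed by a maximum-confidence pick. A sketch, with the data layout as an assumption:

```python
def select_target(keyword, detections):
    """detections: list of (object_id, recognized_text, confidence) tuples."""
    matches = [(oid, conf) for oid, text, conf in detections
               if keyword.lower() in text.lower()]
    # The object whose text matches the instruction with maximum confidence wins
    return max(matches, key=lambda m: m[1])[0] if matches else None

target = select_target("coffee", [(1, "Nescafe Coffee", 0.91), (2, "Green Tea", 0.88)])
```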
13

Hirabayashi, Taketsugu, Kazuki Abukawa, Tomoo Sato, Sayuri Matsumoto, and Muneo Yoshie. "First Trial of Underwater Excavator Work Supported by Acoustic Video Camera." Journal of Robotics and Mechatronics 28, no. 2 (April 19, 2016): 138–48. http://dx.doi.org/10.20965/jrm.2016.p0138.

Abstract:
[Figure: The experiment of recognition by acoustic video camera] External recognition is important for underwater machinery work. However, acquiring information about the external field from optical camera images may not be possible owing to muddiness of the water caused by such work. Furthermore, in order to improve the workability of remotely operated machines, it is important to know the positional relation between a target object and the end effector. To solve these problems, an acoustic video camera was developed and performance test experiments were conducted at a caisson dockyard. In the experiments, a prototype of the acoustic video camera was used to measure and recognize target objects and an underwater construction machine, and the feasibility of monitoring underwater construction using the acoustic videos was evaluated. As a result, it was found that, despite the lower accuracy of shape recognition on account of a resolution problem, the positional relation could be recognized satisfactorily, since the video images could be presented from an arbitrary viewpoint.
14

Cao, W., X. H. Tong, S. C. Liu, and D. Wang. "LANDSLIDES EXTRACTION FROM DIVERSE REMOTE SENSING DATA SOURCES USING SEMANTIC REASONING SCHEME." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLI-B8 (June 22, 2016): 25–31. http://dx.doi.org/10.5194/isprsarchives-xli-b8-25-2016.

Abstract:
Using high resolution satellite imagery to detect, analyse and extract landslides automatically provides increasingly strong support for rapid response after a disaster. This requires the formulation of procedures and knowledge that encapsulate the content of the disaster area in the images. The object-oriented approach has proved useful in solving this issue by partitioning land-cover parcels into objects and classifying them on the basis of expert rules. Since the landslide information present in the images is often complex, an extraction procedure based on the object-oriented approach should consider primarily the semantic aspects of the data. In this paper, we propose a scheme for recognizing landslides by using an object-oriented analysis technique and a semantic reasoning model on high spatial resolution optical imagery. Three case regions with different data sources are presented to evaluate its practicality. The procedure is designed as follows: first, the Gray Level Co-occurrence Matrix (GLCM) is used to extract texture features after image interpretation; spectral features, shape features and thematic features are then derived for semiautomatic landslide recognition; a semantic reasoning model is used afterwards to refine the classification results, representing expert knowledge as first-order logic (FOL) rules. The experimental results are essentially consistent with the experts' field interpretation, which demonstrates the feasibility and accuracy of the proposed approach. The results also show that the scheme generalizes well across diverse data sources.
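The GLCM texture step of this scheme is readily reproduced with scikit-image; the distances, angles, and chosen statistics below are common defaults, not the paper's exact settings.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def glcm_features(gray_patch):      # 2D uint8 array for one image object/segment
    glcm = graycomatrix(gray_patch, distances=[1], angles=[0, np.pi / 2],
                        levels=256, symmetric=True, normed=True)
    return {prop: graycoprops(glcm, prop).mean()
            for prop in ("contrast", "homogeneity", "energy", "correlation")}
```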
15

ALLABERGANOV, AKHMEDZHAN. "METHOD FOR IDENTIFYING FROM DIGITAL FORM (IMAGES) OF TYPICAL INTERVENTIONS TO TEXT INFORMATION (COLOR, TYPE INK, SIGNS, VIDEO CHANGE)." Sociopolitical sciences 10, no. 2 (April 30, 2020): 107–12. http://dx.doi.org/10.33693/2223-0092-2020-10-2-107-112.

Abstract:
A method has been developed for processing and analyzing the spectral space, obtaining images, and recognizing text (textual information) by video spectral and video microscopic research methods, using a Forensic Multifunctional Complex (CMC) with forensic software, which makes it possible to determine the features of objects present in the investigated (source) image. At the same time, in the visible region of the spectrum, the optimal method of algorithmic implementation of the transformations can be chosen depending on the characteristics of the object and its parts. The proposed method covers recognition and identification, from digital form (images), of typical interventions in text information (color, ink type, signs, modification); highlighting of text elements; recognition of objects and their parts; and identification of falsification (faking) of a document or of text characters in electronic format (digital form). This method can significantly increase the effectiveness of expert activities in technical and forensic research.
16

Zhurkin, I. G., L. N. Chaban, and P. Yu Orlov. "Structurally topological algorithm for star recognition and near-Earth space object detection." Computer Optics 44, no. 3 (June 2020): 375–84. http://dx.doi.org/10.18287/2412-6179-co-597.

Abstract:
When solving a variety of celestial navigation tasks, there is a problem of determining the parameters of spacecraft motion and onboard primary payload orientation based on the coordinates of registered star images. Furthermore, unwanted objects, like active satellites and natural or artificial space debris, that reduce the probability of correct recognition may get into the field of view of a satellite sensor. This prompts the necessity to filter such interference out of the star field images. However, if the objects under recognition are bodies located in near-Earth space, the star images themselves will act as the interference. In addition, since the detection and cataloging of these objects from the Earth's surface is complicated by their small size, atmospheric effects, and other technical difficulties, it is worthwhile to use the existing equipment onboard spacecraft to solve this task. The existing recognition algorithms for star groups, as well as their classification, are presented in this paper. Moreover, a structurally topological approach for identifying groups of stars, based on the properties of the enveloping polygons used in constructing topological star patterns, is proposed. Specific features of the construction of topological configurations on the analyzed set of points, as well as the principles of dynamic space object detection within them, are described. Results of numerical experiments performed using the developed algorithm on star field maps and model scenes are presented.
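The "enveloping polygon" idea can be illustrated with a convex hull over detected star centroids; the descriptors below (area, perimeter, normalized edge lengths) are illustrative choices for a rotation-tolerant signature, not the paper's exact construction.

```python
import numpy as np
from scipy.spatial import ConvexHull

def hull_signature(star_xy):        # star_xy: (N, 2) array of star centroids
    hull = ConvexHull(star_xy)
    closed = star_xy[np.append(hull.vertices, hull.vertices[0])]
    edge_lengths = np.sort(np.linalg.norm(np.diff(closed, axis=0), axis=1))
    # In 2D scipy, hull.volume is the polygon area and hull.area its perimeter
    return hull.volume, hull.area, edge_lengths / edge_lengths.max()
```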
17

Rouhafzay, Ghazal, and Ana-Maria Cretu. "An Application of Deep Learning to Tactile Data for Object Recognition under Visual Guidance." Sensors 19, no. 7 (March 29, 2019): 1534. http://dx.doi.org/10.3390/s19071534.

Abstract:
Drawing inspiration from haptic exploration of objects by humans, the current work proposes a novel framework for robotic tactile object recognition, where visual information in the form of a set of visually interesting points is employed to guide the process of tactile data acquisition. Neuroscience research confirms that humans integrate cutaneous data, sensed in response to surface changes, with data from joints, muscles, and bones (kinesthetic cues) for object recognition. On the other hand, psychological studies demonstrate that humans tend to follow object contours to perceive their global shape, which leads to object recognition. In compliance with these findings, a series of contours are determined around a set of 24 virtual objects, from which bimodal tactile data (kinesthetic and cutaneous) are obtained sequentially, adaptively changing the size of the sensor surface according to each object's geometry. A virtual Force Sensing Resistor (FSR) array is employed to capture cutaneous cues. Two different methods for sequential data classification are then implemented using Convolutional Neural Networks (CNN) and conventional classifiers, including support vector machines and k-nearest neighbors. In the case of conventional classifiers, we exploit the contourlet transformation to extract features from tactile images. In the case of CNN, two networks are trained for cutaneous and kinesthetic data and a novel hybrid decision-making strategy is proposed for object recognition. The proposed framework is tested both for contours determined blindly (randomly determined contours of objects) and for contours determined using a model of visual attention. Trained classifiers are tested on 4560 new sequential tactile data samples, and the CNN trained on tactile data from object contours selected by the model of visual attention yields an accuracy of 98.97%, the highest among the implemented approaches.
18

Slyusar, Vadym, Mykhailo Protsenko, Anton Chernukha, Stella Gornostal, Sergey Rudakov, Serhii Shevchenko, Oleksandr Chernikov, Nadiia Kolpachenko, Volodymyr Timofeyev, and Roman Artiukh. "Construction of an advanced method for recognizing monitored objects by a convolutional neural network using a discrete wavelet transform." Eastern-European Journal of Enterprise Technologies 4, no. 9(112) (August 31, 2021): 65–77. http://dx.doi.org/10.15587/1729-4061.2021.238601.

Abstract:
The tasks that unmanned aircraft systems solve include the detection of objects and determination of their state. This paper reports an analysis of image recognition methods aimed at automating that process. Based on the analysis, an improved method for recognizing images of monitored objects by a convolutional neural network using a discrete wavelet transform has been devised. Underlying the method is the task of automating image processing in unmanned aircraft systems. The operability of the proposed method was tested on the example of processing images (aircraft, tanks, helicopters) acquired by the optical system of an unmanned aerial vehicle. A discrete wavelet transform has been used to build a database of objects' wavelet images and to train a convolutional neural network on them, which has made it possible to improve the efficiency of recognition of monitored objects and automate the process. The effectiveness of the improved method is achieved by preliminary decomposition and approximation of the digital image of the monitored object by a discrete wavelet transform. The stages of the method include the construction of a database of wavelet images and the training of a convolutional neural network. The effectiveness of recognizing the monitored objects' images by the improved method was tested on a convolutional neural network trained with images of 300 monitored objects. In this case, the time to make a decision, based on the proposed method, decreased on average by 0.7 to 0.84 s compared with the artificial neural networks ResNet and ConvNets. The method could be used in the information processing systems of unmanned aerial vehicles that monitor objects, in robotic complexes for various purposes, and in video surveillance systems for important objects.
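The wavelet preprocessing stage the method relies on can be sketched with PyWavelets: a single-level 2D discrete wavelet transform whose approximation coefficients form the "wavelet image" fed to the network. The wavelet family is an assumption.

```python
import pywt

def wavelet_image(gray):                     # gray: 2D numpy array
    cA, (cH, cV, cD) = pywt.dwt2(gray, "haar")
    return cA                                # low-frequency approximation, half resolution
```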
19

Mancera Florez, Juan Ricardo, and Ivan Alberto Lizarazo Salcedo. "Land cover classification at three different levels of detail from optical and radar Sentinel SAR data: a case study in Cundinamarca (Colombia)." DYNA 87, no. 215 (November 5, 2020): 136–45. http://dx.doi.org/10.15446/dyna.v87n215.84915.

Abstract:
In this paper, the potential of Sentinel-1A and Sentinel-2A satellite images for land cover mapping is evaluated at three levels of spatial detail: exploratory, reconnaissance, and semi-detailed. To do so, two different image classification approaches are compared: (i) a traditional pixel-wise approach; and (ii) an object-oriented approach. In both cases, the classification task was conducted using the "RandomForest" algorithm. The case study was also intended to identify a set of radar channels, optical bands, and indices that are relevant for classification. The thematic accuracy of the classifications shows the best results for the object-oriented approach at the exploratory and reconnaissance levels. The results show that the integration of multispectral and radar data as explanatory variables for classification provides better results than the use of a single data source.
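A minimal sketch of the classification setup described here: radar channels and optical bands/indices stacked as explanatory variables for a random forest. The file names, feature columns, and split are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# X: samples x features, e.g. [VV, VH, B2, B3, B4, B8, NDVI]; y: land cover labels
X, y = np.load("features.npy"), np.load("labels.npy")
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
rf = RandomForestClassifier(n_estimators=500, random_state=0).fit(X_tr, y_tr)
print(rf.score(X_te, y_te))
print(rf.feature_importances_)   # which radar/optical variables mattered
```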
20

Long, Ningbo, Han Yan, Liqiang Wang, Haifeng Li, and Qing Yang. "Unifying Obstacle Detection, Recognition, and Fusion Based on the Polarization Color Stereo Camera and LiDAR for the ADAS." Sensors 22, no. 7 (March 23, 2022): 2453. http://dx.doi.org/10.3390/s22072453.

Abstract:
The perception module plays an important role in vehicles equipped with advanced driver-assistance systems (ADAS). This paper presents a multi-sensor data fusion system based on a polarization color stereo camera and forward-looking light detection and ranging (LiDAR), which achieves multiple-target detection, recognition, and data fusion. The You Only Look Once v4 (YOLOv4) network is utilized for object detection and recognition on the color images. The depth images are obtained from the rectified left and right images based on the epipolar constraint, and obstacles are then detected from the depth images using the MeanShift algorithm. The pixel-level polarization images are extracted from the raw polarization-grey images, allowing water hazards to be detected successfully. The PointPillars network is employed to detect objects from the point cloud. The calibration and synchronization between the sensors are accomplished. The experimental results show that the data fusion enriches the detection results, provides high-dimensional perceptual information, and extends the effective detection range. Meanwhile, the detection results are stable under diverse range and illumination conditions.
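The pixel-level polarization step mentioned above commonly reduces to computing Stokes parameters from the four analyzer angles of a division-of-focal-plane sensor; the degree of linear polarization (DoLP) is what makes water surfaces separable. A sketch under that assumption:

```python
import numpy as np

def dolp(i0, i45, i90, i135):       # four greyscale sub-images as float arrays
    s0 = i0 + i90                   # total intensity
    s1 = i0 - i90
    s2 = i45 - i135
    return np.sqrt(s1**2 + s2**2) / np.maximum(s0, 1e-6)   # DoLP in [0, 1]
```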
21

Kubera, Elżbieta, Agnieszka Kubik-Komar, Paweł Kurasiński, Krystyna Piotrowska-Weryszko, and Magdalena Skrzypiec. "Detection and Recognition of Pollen Grains in Multilabel Microscopic Images." Sensors 22, no. 7 (March 31, 2022): 2690. http://dx.doi.org/10.3390/s22072690.

Abstract:
Analysis of pollen material obtained from the Hirst-type apparatus, which is a tedious and labor-intensive process, is usually performed by hand under a microscope by specialists in palynology. This research evaluated the automatic analysis of pollen material performed based on digital microscopic photos. A deep neural network called YOLO was used to analyze microscopic images containing the reference grains of three taxa typical of Central and Eastern Europe. YOLO networks perform recognition and detection; hence, there is no need to segment the image before classification. The obtained results were compared to other deep learning object detection methods, i.e., Faster R-CNN and RetinaNet. YOLO outperformed the other methods, as it gave the mean average precision (mAP@.5:.95) between 86.8% and 92.4% for the test sets included in the study. Among the difficulties related to the correct classification of the research material, the following should be noted: significant similarities of the grains of the analyzed taxa, the possibility of their simultaneous occurrence in one image, and mutual overlapping of objects.
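A small helper clarifying the matching criterion behind the mAP@.5:.95 figures quoted above: a detection counts as correct at threshold t only if its intersection-over-union with a ground-truth box is at least t.

```python
def iou(a, b):                      # boxes as (x1, y1, x2, y2)
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0
```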
22

Tanaka, K., H. Saito, Y. Fukada, and M. Moriya. "Coding visual images of objects in the inferotemporal cortex of the macaque monkey." Journal of Neurophysiology 66, no. 1 (July 1, 1991): 170–89. http://dx.doi.org/10.1152/jn.1991.66.1.170.

Abstract:
1. The inferotemporal cortex (IT) has been thought to play an essential and specific role in visual object discrimination and recognition, because a lesion of IT in the monkey results in a specific deficit in learning tasks that require these visual functions. To understand the cellular basis of the object discrimination and recognition processes in IT, we determined the optimal stimulus of individual IT cells in anesthetized, immobilized monkeys. 2. In the posterior one-third or one-fourth of IT, most cells could be activated maximally by bars or disks just by adjusting the size, orientation, or color of the stimulus. 3. In the remaining anterior two-thirds or three-quarters of IT, most cells required more complex features for their maximal activation. 4. The critical feature for the activation of individual anterior IT cells varied from cell to cell: a complex shape in some cells and a combination of texture or color with contour-shape in other cells. 5. Cells that showed different types of complexity for the critical feature were intermingled throughout anterior IT, whereas cells recorded in single penetrations showed critical features that were related in some respects. 6. Generally speaking, the critical features of anterior IT cells were moderately complex and can be thought of as partial features common to images of several different natural objects. The selectivity to the optimal stimulus was rather sharp, although not absolute. We thus propose that, in anterior IT, images of objects are coded by combinations of active cells, each of which represents the presence of a particular partial feature in the image.
23

Hajari, Nasim, Gabriel Lugo Bustillo, Harsh Sharma, and Irene Cheng. "Marker-Less 3d Object Recognition and 6d Pose Estimation for Homogeneous Textureless Objects: An RGB-D Approach." Sensors 20, no. 18 (September 7, 2020): 5098. http://dx.doi.org/10.3390/s20185098.

Abstract:
The task of recognising an object and estimating its 6d pose in a scene has received considerable attention in recent years. The accessibility and low cost of consumer RGB-D cameras make object recognition and pose estimation feasible even for small industrial businesses. An example is the industrial assembly line, where a robotic arm should pick a small, textureless and mostly homogeneous object and place it in a designated location. Despite all the recent advancements of object recognition and pose estimation techniques in natural scenes, the problem remains challenging for industrial parts. In this paper, we present a framework to simultaneously recognise the object's class and estimate its 6d pose from RGB-D data. The proposed model adopts a global approach, where an object and the Region of Interest (ROI) are first recognised from RGB images. The object's pose is then estimated from the corresponding depth information. We train various classifiers based on extracted Histogram of Oriented Gradients (HOG) features to detect and recognize the objects. We then perform template matching on the point cloud based on surface normals and Fast Point Feature Histograms (FPFH) to estimate the pose of the object. Experimental results show that our system is quite efficient, accurate and robust to illumination and background changes, even for the challenging objects of the T-LESS dataset.
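The recognition half of the pipeline (HOG descriptors feeding a classifier) is easy to sketch with scikit-image and scikit-learn; the FPFH/point-cloud pose stage is omitted, and the HOG parameters are common defaults rather than the paper's.

```python
from skimage.feature import hog
from sklearn.svm import LinearSVC

def hog_descriptor(gray_roi):       # 2D array: one candidate region of interest
    return hog(gray_roi, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2), block_norm="L2-Hys")

# clf = LinearSVC().fit([hog_descriptor(roi) for roi in train_rois], labels)
```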
24

Zeng, Yujun, Lilin Qian, and Junkai Ren. "Evolutionary Hierarchical Sparse Extreme Learning Autoencoder Network for Object Recognition." Symmetry 10, no. 10 (October 10, 2018): 474. http://dx.doi.org/10.3390/sym10100474.

Abstract:
Extreme learning machine (ELM), characterized by its fast learning efficiency and great generalization ability, has been applied to various object recognition tasks. When extended to the stacked autoencoder network, which is a typical symmetrical representation learning model architecture, ELM manages to realize hierarchical feature extraction and classification, which is what deep neural networks usually do, but with much less training time. Nevertheless, the input weights and biases of the hidden nodes in ELM are generated according to a random distribution and may lead to the occurrence of non-optimal and redundant parameters that deteriorate discriminative features, which will have a bad influence on the final classification effect. In this paper, a novel sparse autoencoder derived from ELM and differential evolution is proposed and integrated into a hierarchical hybrid autoencoder network to accomplish the end-to-end learning with raw visible light camera sensor images and applied to several typical object recognition problems. Experimental results show that the proposed method is able to obtain competitive or better performance than current relevant methods with acceptable or less time consumption.
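An ELM autoencoder of the kind described here is compact enough to sketch in numpy: random hidden weights stay fixed and only the output weights are solved in closed form; the differential-evolution pruning of non-optimal random weights is omitted.

```python
import numpy as np

def elm_autoencoder(X, n_hidden=256, C=1e3, seed=0):
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))   # random input weights, never trained
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)                            # hidden activations
    # Closed-form, ridge-regularized output weights: beta = (H'H + I/C)^-1 H'X
    beta = np.linalg.solve(H.T @ H + np.eye(n_hidden) / C, H.T @ X)
    return X @ beta.T                                 # ELM-AE feature encoding

features = elm_autoencoder(np.random.rand(100, 784))
```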
25

Sunil, A., V. V. Sajithvariyar, V. Sowmya, R. Sivanpillai, and K. P. Soman. "IDENTIFYING OIL PADS IN HIGH SPATIAL RESOLUTION AERIAL IMAGES USING FASTER R-CNN." International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLIV-M-3-2021 (August 10, 2021): 155–61. http://dx.doi.org/10.5194/isprs-archives-xliv-m-3-2021-155-2021.

Abstract:
Deep learning (DL) methods are used for identifying objects in aerial and ground-based images. Detecting vehicles, roads, buildings, and crops are examples of object identification applications using DL methods. Identifying complex natural and man-made features continues to be a challenge. Oil pads are an example of complex built features due to their shape, size, and presence of other structures like sheds. This work applies Faster Region-based Convolutional Neural Network (R-CNN), a DL-based object recognition method, for identifying oil pads in high spatial resolution (1m), true-color aerial images. Faster R-CNN is a region-based object identification method, consisting of a Regional Proposal Network (RPN) that helps to find the area where the target can possibly be present in the images. If the target is present in the images, the Faster R-CNN algorithm will identify the area in an image as foreground and the rest as background. The algorithm was trained with oil pad locations that were manually annotated from orthorectified imagery acquired in 2017. Eighty percent of the annotated images were used for training and the number of epochs was increased from 100 to 1000 in increments of 100 with a fixed length of 1000. After determining the optimal number of epochs, the performance of the algorithm was evaluated with an independent set of validation images consisting of frames with and without oil pads. Results indicate that the Faster R-CNN algorithm can be used for identifying oil pads in aerial images.
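A sketch of the model setup used above, in torchvision terms: a Faster R-CNN whose box predictor is resized for two classes (background and oil pad). The training loop and the 1 m aerial image loader are omitted, and the pretrained-weights choice is an assumption.

```python
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
in_feats = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_feats, num_classes=2)  # bg + oil pad
```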
26

Hrabovskyi, V., and O. Kmet. "Recognition and calculation of objects in images using YOLOv3 architecture." Artificial Intelligence 26, jai2021.26(2) (December 1, 2021): 42–53. http://dx.doi.org/10.15407/jai2021.02.042.

Abstract:
A program that searches for five types of fruits in images of fruit trees, classifies them and counts their quantity is presented. Its creation took into account the requirement to be able to work both in the background and in real time and to identify the desired objects at a sufficiently high speed. The program should also be able to be trained on available computers (including laptops) within a reasonable time. In carrying out this task, the possibilities of several existing approaches to the recognition and identification of visual objects based on convolutional neural networks were analyzed. Among the considered network architectures were R-CNN, Fast R-CNN, Faster R-CNN, SSD, YOLO and some modifications based on them. Based on the analysis of the peculiarities of their work, the YOLO architecture was used to perform the task, which allows the analysis of visual objects in real time with high speed and reliability. The software product was implemented by modifying the YOLOv3 architecture implemented in TensorFlow 2.1. Object recognition in this architecture is performed using a trained Darknet-53 network, the parameters of which are freely available. The modification of the network was to replace its original classification layer. The training of the network modified in this way was carried out on the basis of transfer learning technology using the Agrilfruit Dataset. The peculiarities of the learning process of the network under different types of gradient descent (stochastic, and with batch sizes of 4 and 8) were also studied, as a result of which the optimal version of the trained network weights was selected for further use. Tests of the modified and trained network have shown that the system based on it distinguishes objects of the corresponding classes of different sizes in the image with high reliability (even when they are significantly masked) and counts their number. The ability of the program to distinguish and count individual fruits in the analyzed image can be used to visually assess the yield of fruit trees.
27

Gong and Wang. "Research on Moving Target Tracking Based on FDRIG Optical Flow." Symmetry 11, no. 9 (September 4, 2019): 1122. http://dx.doi.org/10.3390/sym11091122.

Abstract:
Aiming at the problem of moving target recognition, a moving target tracking model based on FDRIG optical flow is proposed. First, the optical flow equation is analyzed from optical flow theory. Then, through energy functional minimization, the FDRIG optical flow technique is proposed. Taking a road section of a university campus as the experimental site, 30 vehicle motion sequence images were considered as objects, forming a vehicle motion sequence with a complex background. The proposed FDRIG optical flow was used to calculate the vehicle motion optical flow field in the Halcon software. Compared with the classic Horn and Schunck (HS) and Lucas and Kanade (LK) optical flow algorithms, the monitoring results proved that FDRIG optical flow is highly precise and fast when tracking a moving target. The Ettlinger Tor traffic scene was then taken as the second experimental object, and FDRIG optical flow was used to analyze vehicle motion, further verifying its superior performance. The whole research work shows that FDRIG optical flow has good performance and speed in tracking moving targets and can be used to monitor complex target motion information in real time.
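The FDRIG formulation itself is specific to the paper, so as a runnable point of reference this sketch computes a dense optical flow field with OpenCV's Farneback method, the same kind of per-pixel output the HS and LK baselines in the comparison produce.

```python
import cv2

prev = cv2.imread("frame_00.png", cv2.IMREAD_GRAYSCALE)
curr = cv2.imread("frame_01.png", cv2.IMREAD_GRAYSCALE)
# Arguments: pyramid scale, levels, window size, iterations, poly_n, poly_sigma, flags
flow = cv2.calcOpticalFlowFarneback(prev, curr, None, 0.5, 3, 15, 3, 5, 1.2, 0)
mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])   # speed and direction per pixel
```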
28

Rombovsky, M., and R. Radchenko. "AUTOMATION OF RESEARCH AREA RECOGNITION UNDER WEAK- CONTRAST BORDERS ON A PICTURE OF TRANSPARENT OBJECTS." Theory and Practice of Forensic Science and Criminalistics 19, no. 1 (June 2, 2019): 568–80. http://dx.doi.org/10.32353/khrife.1.2019.45.

Abstract:
Recently, owing to the active development of computer technologies, the fields of science dealing with visual objects have faced the problem of creating systems for automatic image recognition. This class of problems has no universal solution, so, depending on the needs of a particular branch of science and engineering, the developers of such systems are forced to solve problems of a particular nature. The purpose of this work was to create a software product that allows automatic recognition of the study area of objects having a weakly expressed boundary between object and background, in order to further determine the values of the optical characteristics of transparent objects as common features when identifying, in trace evidence analysis, the whole by its parts. One of the main problems in detecting the boundaries between objects in an image is the poor contrast of the transition region between adjacent areas of the divided space, which makes image processing for segmentation and contrast correction necessary. To develop an effective way to limit the recognition area and identify the boundaries of the studied object under conditions of low contrast between the light background and the space occupied by a transparent medium that differs from the background in optical properties, existing image enhancement algorithms had to be analyzed and then implemented using mathematical software products for numerical analysis. Thus, the paper reviews modern image recognition methods and algorithms for changing image quality by segmentation and contrast correction. The proposed algorithms are implemented in a computer program developed by the authors at the Sumy Research Institute of the Ministry of Internal Affairs of Ukraine on the basis of the mathematical software MATLAB, which allows automatic comparative analysis of the studied samples on the basis of their digital images. Questions for further development of this topic are outlined.
29

Lee, Dae Geon, Young Ha Shin, and Dong-Cheon Lee. "Land Cover Classification Using SegNet with Slope, Aspect, and Multidirectional Shaded Relief Images Derived from Digital Surface Model." Journal of Sensors 2020 (September 12, 2020): 1–21. http://dx.doi.org/10.1155/2020/8825509.

Abstract:
Most object detection, recognition, and classification is performed using optical imagery, but images cannot fully represent the real world due to the limited range of the visible light spectrum reflected from the surfaces of objects. In this regard, physical and geometrical information from other data sources can compensate for the limitations of optical imagery and bring a synergistic effect for training deep learning (DL) models. In this paper, we propose to classify terrain features using a convolutional neural network (CNN) based SegNet model by utilizing 3D geospatial data including infrared (IR) orthoimages, a digital surface model (DSM), and derived information. The slope, aspect, and shaded relief images (SRIs) were derived from the DSM and were used as training data for the DL model. The experiments were carried out using the Vaihingen and Potsdam dataset provided by the German Society for Photogrammetry, Remote Sensing and Geoinformation (DGPF) through the International Society for Photogrammetry and Remote Sensing (ISPRS). The dataset includes IR orthoimages, DSM, airborne LiDAR data, and label data. The motivation for utilizing 3D data and derived information to train the DL model is that real-world objects are 3D features. The experimental results demonstrate that the proposed approach of utilizing and integrating various informative feature data can improve the performance of the DL model for semantic segmentation. In particular, the accuracy of building classification is higher compared with other natural objects because the derived information provides geometric characteristics. The intersection-over-union (IoU) of the buildings for the test data and for new unseen data, combining all derived data, was 84.90% and 52.45%, respectively.
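The derived layers this paper trains on (slope, aspect, shaded relief) follow from DSM gradients; a numpy sketch with an illustrative sun position (multidirectional SRIs repeat the hillshade for several azimuths):

```python
import numpy as np

def dsm_derivatives(dsm, cellsize=1.0, azimuth=315.0, altitude=45.0):
    dz_dy, dz_dx = np.gradient(dsm, cellsize)
    slope = np.arctan(np.hypot(dz_dx, dz_dy))          # radians
    aspect = np.arctan2(-dz_dx, dz_dy)
    az, alt = np.radians(azimuth), np.radians(altitude)
    shaded = (np.sin(alt) * np.cos(slope) +
              np.cos(alt) * np.sin(slope) * np.cos(az - aspect))
    return slope, aspect, np.clip(shaded, 0.0, 1.0)    # shaded relief image in [0, 1]
```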
30

Hewahi, Nabil M., and Eyad A. Alashqar. "Wrapper Feature Selection based on Genetic Algorithm for Recognizing Objects from Satellite Imagery." Journal of Information Technology Research 8, no. 3 (July 2015): 1–20. http://dx.doi.org/10.4018/jitr.2015070101.

Abstract:
Object recognition is a research area that aims to associate objects with categories or classes. The recognition of object-specific geospatial features, such as roads, buildings and rivers, from high-resolution satellite imagery is a time-consuming and expensive problem in the maintenance cycle of a Geographic Information System (GIS). Feature selection is the task of selecting a small subset of the original features that can achieve maximum classification accuracy and reduce data dimensionality. This subset of features offers important benefits: it reduces the computational complexity of learning algorithms, saves time, improves accuracy, and the selected features can be insightful for the people involved in the problem domain. This makes feature selection an indispensable part of the classification task. In this work, the authors propose a new approach that combines Genetic Algorithms (GA) with a Correlation Ranking Filter (CRF) wrapper to eliminate unimportant features and obtain a better feature set that shows better results with various classifiers such as Neural Networks (NN), K-nearest neighbor (KNN), and decision trees. The approach uses GA as an optimization algorithm to search the space of all possible subsets of the object geospatial feature set for the purpose of recognition. GA is wrapped with three different classifier algorithms, namely neural network, k-nearest neighbor and decision tree J48, as the subset evaluating mechanism. The GA-ANN, GA-KNN and GA-J48 methods are implemented using the WEKA software on a dataset that contains 38 features extracted from satellite images using the ENVI software. The proposed wrapper approach incorporates the Correlation Ranking Filter (CRF) for spatial features to remove unimportant features. Results suggest that GA-based neural classifiers using CRF for spatial features are robust and effective in finding optimal subsets of features from large data sets.
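The wrapper loop itself is easy to sketch: a genetic algorithm evolves binary feature masks and scores each mask by the cross-validated accuracy of a classifier on the selected columns. The population size, rates, and k-NN evaluator below are illustrative, not the paper's settings.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)

def fitness(mask, X, y):
    if not mask.any():
        return 0.0
    return cross_val_score(KNeighborsClassifier(), X[:, mask], y, cv=3).mean()

def ga_select(X, y, pop=20, gens=30, p_mut=0.05):
    masks = rng.random((pop, X.shape[1])) < 0.5        # random initial population
    for _ in range(gens):
        scores = np.array([fitness(m, X, y) for m in masks])
        parents = masks[np.argsort(scores)][-pop // 2:]          # keep the fitter half
        kids = []
        for _ in range(pop - len(parents)):
            a, b = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, X.shape[1])
            child = np.concatenate([a[:cut], b[cut:]])           # one-point crossover
            child ^= rng.random(X.shape[1]) < p_mut              # bit-flip mutation
            kids.append(child)
        masks = np.vstack([parents, kids])
    return masks[np.argmax([fitness(m, X, y) for m in masks])]   # best feature mask
```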
31

Pastor, Francisco, Juan M. Gandarias, Alfonso J. García-Cerezo, and Jesús M. Gómez-de-Gabriel. "Using 3D Convolutional Neural Networks for Tactile Object Recognition with Robotic Palpation." Sensors 19, no. 24 (December 5, 2019): 5356. http://dx.doi.org/10.3390/s19245356.

Abstract:
In this paper, a novel method of active tactile perception based on 3D neural networks and a high-resolution tactile sensor installed on a robot gripper is presented. A haptic exploratory procedure based on robotic palpation is performed to get pressure images at different grasping forces that provide information not only about the external shape of the object, but also about its internal features. The gripper consists of two underactuated fingers with a tactile sensor array in the thumb. A new representation of tactile information as 3D tactile tensors is described. During a squeeze-and-release process, the pressure images read from the tactile sensor are concatenated, forming a tensor that contains information about the variation of the pressure matrices along with the grasping forces. These tensors are used to feed a 3D Convolutional Neural Network (3D CNN) called 3D TactNet, which is able to classify the grasped object through active interaction. Results show that the 3D CNN performs better, providing higher recognition rates with a smaller amount of training data.
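A sketch of the data path described above: the squeeze-and-release pressure frames stack into a 3D tensor classified by a small Conv3d network. The layer sizes and tensor shape are assumptions standing in for the paper's 3D TactNet.

```python
import torch
import torch.nn as nn

class TactNet3D(nn.Module):
    def __init__(self, n_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d(2),
            nn.Conv3d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1))
        self.classifier = nn.Linear(32, n_classes)

    def forward(self, x):            # x: (batch, 1, frames, rows, cols)
        return self.classifier(self.features(x).flatten(1))

logits = TactNet3D()(torch.randn(4, 1, 20, 16, 16))   # 20 pressure frames of 16x16
```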
APA, Harvard, Vancouver, ISO, and other styles
33

Kolhe, Ashwini, R. R. Itkarkar, and Anilkumar V. Nandani. "Robust Part-Based Hand Gesture Recognition Using Finger-Earth Mover’s Distance." International Journal of Advanced Research in Computer Science and Software Engineering 7, no. 7 (July 29, 2017): 131. http://dx.doi.org/10.23956/ijarcsse/v7i7/0196.

Full text
Abstract:
Hand gesture recognition is of great importance for human-computer interaction (HCI) because of its extensive applications in virtual reality, sign language recognition, and computer games. Despite much previous work, traditional vision-based hand gesture recognition methods are still far from satisfactory for real-life applications. Because of the nature of optical sensing, the quality of the captured images is sensitive to lighting conditions and cluttered backgrounds, so optical sensor-based methods are usually unable to detect and track the hands robustly, which largely affects the performance of hand gesture recognition. Compared to the entire human body, the hand is a smaller object with more complex articulations that is more easily affected by segmentation errors, making hand gesture recognition a very challenging problem. This work focuses on building a robust part-based hand gesture recognition system. To handle the noisy hand shapes obtained from a digital camera, we propose a novel distance metric, the Finger-Earth Mover's Distance (FEMD), to measure the dissimilarity between hand shapes. As it matches only the finger parts rather than the whole hand, it can better distinguish hand gestures with slight differences. The experiments demonstrate that the proposed hand gesture recognition system achieves a mean accuracy of 80.4%, measured on a six-gesture database.
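The following is a heavily simplified stand-in for the FEMD idea: each hand shape is reduced to finger segments (angular position on the palm circle, with finger length as mass), and a 1-D earth mover's distance compares the two signatures; the published FEMD additionally performs part matching and penalizes unmatched fingers.

```python
# Simplified finger-signature distance (not the exact published FEMD).
from scipy.stats import wasserstein_distance

def femd_like(fingers_a, fingers_b):
    """fingers_*: list of (angle_deg, length) tuples, one per detected finger."""
    pos_a, w_a = zip(*fingers_a)
    pos_b, w_b = zip(*fingers_b)
    return wasserstein_distance(pos_a, pos_b, u_weights=w_a, v_weights=w_b)

# Two gestures with slightly different finger layouts:
d = femd_like([(20, 1.0), (45, 0.9), (70, 1.1)],
              [(25, 1.0), (50, 0.8)])
```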
APA, Harvard, Vancouver, ISO, and other styles
34

Hassan, Ehtesham, Yasser Khalil, and Imtiaz Ahmad. "Learning Feature Fusion in Deep Learning-Based Object Detector." Journal of Engineering 2020 (May 22, 2020): 1–11. http://dx.doi.org/10.1155/2020/7286187.

Full text
Abstract:
Object detection in real images is a challenging problem in computer vision. Despite several advancements in detection and recognition techniques, robust and accurate localization of interesting objects in images from real-life scenarios remains unsolved because of the difficulties posed by intraclass and interclass variations, occlusion, lighting, and scale changes at different levels. In this work, we present an object detection framework based on learning-based fusion of handcrafted features with deep features. Deep features characterize different regions of interest in a testing image with a rich set of statistical features. Our hypothesis is to reinforce these features with handcrafted features by learning the optimal fusion during network training. Our detection framework is based on a recent version of the YOLO object detection architecture. Experimental evaluation on the PASCAL-VOC and MS-COCO datasets achieved detection rate increases of 11.4% and 1.9% on the mAP scale in comparison with the YOLO version-3 detector (Redmon and Farhadi 2018). An important step in the proposed learning-based feature fusion strategy is to correctly identify the layer at which the new features are fed in. The present work shows a qualitative approach to identifying the best layer for fusion and design steps for feeding the additional feature sets into convolutional network-based detectors.
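A minimal sketch of learning-based fusion as described: handcrafted feature maps are concatenated with deep features at a chosen layer and mixed by a learnable 1x1 convolution. The layer choice, channel counts, and the FusionBlock name are assumptions, not the paper's exact design.

```python
# Learnable fusion of handcrafted and deep feature maps (illustrative).
import torch
import torch.nn as nn

class FusionBlock(nn.Module):
    def __init__(self, deep_ch=256, hand_ch=32):
        super().__init__()
        self.mix = nn.Conv2d(deep_ch + hand_ch, deep_ch, kernel_size=1)

    def forward(self, deep_feat, hand_feat):
        # deep_feat: (B, deep_ch, H, W); hand_feat: (B, hand_ch, H, W),
        # e.g. HOG-like responses resized to the layer's spatial size.
        fused = torch.cat([deep_feat, hand_feat], dim=1)
        return self.mix(fused)      # fusion weights are learned during training

fused = FusionBlock()(torch.randn(2, 256, 13, 13), torch.randn(2, 32, 13, 13))
```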
APA, Harvard, Vancouver, ISO, and other styles
35

Васильева, Ирина Карловна, and Анатолий Владиславович Попов. "A Method for Automatic Clustering of Remote Sensing Data" [Метод автоматической кластеризации данных дистанционного зондирования]. Aerospace technic and technology, no. 3 (July 15, 2019): 64–75. http://dx.doi.org/10.32620/aktt.2019.3.08.

Full text
Abstract:
The subject matter of the article is methods for the automatic clustering of remote sensing data under conditions of a priori uncertainty regarding the number of observed object classes and the statistical characteristics of the class signatures. The aim is to develop a method for approximating multimodal empirical distributions of observational data to construct decision rules for pixel-by-pixel statistical classification procedures, as well as to investigate the effectiveness of this method for automatically classifying objects in synthesized and real images. The tasks to be solved are: to develop and implement a procedure for splitting a mixture of basic distributions that requires no preliminary data analysis stage for selecting optimal initial approximations, converges well, and can automatically refine the list of classes by combining indistinguishable or poorly distinguishable components of the mixture into a single cluster; to synthesize test images with a specified number of objects and known data distributions for each object; to evaluate the effectiveness of the developed automatic classification method by the criterion of the probability of correct recognition; and to evaluate the results of automatic clustering of real images. The methods used are stochastic simulation, approximation of empirical distributions, statistical recognition, and methods of probability theory and mathematical statistics. The following results have been obtained. A method for automatically splitting a mixture of Gaussian distributions to construct decision thresholds according to the maximum a posteriori probability criterion was proposed. The results of the automatic formation of the list of classes and their probabilistic descriptions are given, together with the results of clustering both test and satellite images. It is shown that the developed method is quite effective and can be used to determine the number of object classes as well as a mathematical description of their stochastic characteristics for pattern recognition and cluster analysis tasks. Conclusions. The scientific novelty of the results is that the proposed approach makes it possible, directly during the "unsupervised" training procedure, to evaluate the distinguishability of classes and to exclude indistinguishable objects from the list of classes.
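As a rough sketch of the mixture-splitting idea, the snippet below fits Gaussian mixtures of increasing order, selects the order by BIC, and assigns pixels by the maximum a posteriori rule; the paper's own splitting and merging of indistinguishable components is more elaborate than this model selection loop.

```python
# Unsupervised pixel clustering with an order-selected Gaussian mixture.
import numpy as np
from sklearn.mixture import GaussianMixture

def auto_cluster(pixels, max_k=8):
    """pixels: (N, bands) array of multispectral samples."""
    best, best_bic = None, np.inf
    for k in range(2, max_k + 1):
        gmm = GaussianMixture(n_components=k, covariance_type="full",
                              random_state=0).fit(pixels)
        bic = gmm.bic(pixels)
        if bic < best_bic:
            best, best_bic = gmm, bic
    return best, best.predict(pixels)    # MAP class labels

gmm, labels = auto_cluster(np.random.rand(5000, 4))
```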
APA, Harvard, Vancouver, ISO, and other styles
36

Lantz, E. "Retrieval of a phase-and-amplitude submicrometric object from images obtained in partially coherent microscopy." Journal of the Optical Society of America A 8, no. 5 (May 1, 1991): 791. http://dx.doi.org/10.1364/josaa.8.000791.

Full text
APA, Harvard, Vancouver, ISO, and other styles
37

Alaskar, Haya, Abir Hussain, Nourah Al-Aseem, Panos Liatsis, and Dhiya Al-Jumeily. "Application of Convolutional Neural Networks for Automated Ulcer Detection in Wireless Capsule Endoscopy Images." Sensors 19, no. 6 (March 13, 2019): 1265. http://dx.doi.org/10.3390/s19061265.

Full text
Abstract:
Detection of abnormalities in wireless capsule endoscopy (WCE) images is a challenging task. Typically, these images suffer from low contrast, complex backgrounds, and variations in lesion shape and color, which affect the accuracy of their segmentation and subsequent classification. This research proposes an automated system for the detection and classification of ulcers in WCE images, based on state-of-the-art deep learning networks. Deep learning techniques, and in particular convolutional neural networks (CNNs), have recently become popular in the analysis and recognition of medical images. The medical image datasets used in this study were obtained from WCE video frames. In this work, two milestone CNN architectures, namely AlexNet and GoogLeNet, are extensively evaluated for classifying objects as ulcer or non-ulcer. Furthermore, we examine and analyze the images identified as containing ulcer objects to evaluate the efficiency of the utilized CNNs. Extensive experiments show that CNNs deliver superior performance, surpassing traditional machine learning methods by large margins, which supports their effectiveness as automated diagnosis tools.
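A minimal transfer-learning sketch in the spirit of the evaluated pipelines: a pretrained AlexNet with its final layer replaced for the two-class ulcer / non-ulcer decision; all hyperparameters are placeholders.

```python
# Fine-tuning AlexNet for binary ulcer classification (illustrative only).
import torch
import torch.nn as nn
from torchvision import models

model = models.alexnet(weights=models.AlexNet_Weights.DEFAULT)
model.classifier[6] = nn.Linear(4096, 2)          # ulcer vs. non-ulcer head

optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
criterion = nn.CrossEntropyLoss()

# One illustrative training step on a dummy batch of 224x224 WCE frames:
images, labels = torch.randn(8, 3, 224, 224), torch.randint(0, 2, (8,))
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
```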
APA, Harvard, Vancouver, ISO, and other styles
38

Bae, Joungeun, and Hoon Yoo. "Image Enhancement for Computational Integral Imaging Reconstruction via Four-Dimensional Image Structure." Sensors 20, no. 17 (August 25, 2020): 4795. http://dx.doi.org/10.3390/s20174795.

Full text
Abstract:
This paper describes image enhancement for a computational integral imaging reconstruction method via the reconstruction of a four-dimensional (4-D) image structure. Computational reconstruction of high-resolution three-dimensional (3-D) images is in high demand in 3-D applications such as 3-D visualization and 3-D object recognition. To improve the visual quality of reconstructed images, we introduce an adjustable parameter to produce a group of 3-D images from a single elemental image array. The adjustable parameter controls the overlap in back projection through a transformation that crops and translates the elemental images. It turns out that the new parameter is independent of the reconstruction position, yielding a 4-D image structure with the four axes x, y, z, and k. The 4-D image structure of the proposed method provides more visual information than existing methods. Computer simulations and optical experiments are carried out to show the feasibility of the proposed method. The results indicate that our method enhances the quality of 3-D images by providing a 4-D image structure with the adjustable parameter.
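A schematic back-projection sketch of the reconstruction described: elemental images are shifted in proportion to the reconstruction depth and averaged, with the extra parameter k modeled here as a simple scale on the shift; the paper's exact crop-and-translate geometry differs.

```python
# Schematic computational integral imaging reconstruction (assumed geometry).
import numpy as np

def reconstruct(eis, pitch, depth, k=1.0):
    """eis: (rows, cols, h, w) array of elemental images."""
    rows, cols, h, w = eis.shape
    out = np.zeros((h + rows * h, w + cols * w))
    cnt = np.zeros_like(out)
    for r in range(rows):
        for c in range(cols):
            dy = int(k * r * pitch * h / depth)   # depth-dependent shift
            dx = int(k * c * pitch * w / depth)
            out[dy:dy + h, dx:dx + w] += eis[r, c]
            cnt[dy:dy + h, dx:dx + w] += 1
    return out / np.maximum(cnt, 1)               # average overlapping pixels

plane = reconstruct(np.random.rand(5, 5, 64, 64), pitch=1.0, depth=4.0)
```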
APA, Harvard, Vancouver, ISO, and other styles
39

Marian, Nicolae, Alin Drimus, and Arne Bilberg. "Development of a Tactile Sensor Array." Solid State Phenomena 166-167 (September 2010): 277–84. http://dx.doi.org/10.4028/www.scientific.net/ssp.166-167.277.

Full text
Abstract:
Flexible grasping robots are needed to enable the automated, profitable, and competitive production of small batch sizes, including complex handling of often fragile objects. This development will create new conditions for value-adding activities in future production. The paper describes our research on sensor design, exploration, and control for a robot gripping system, analyzing the normal forces applied to the tactile pixels for gripping-force control and generating tactile images for gripper positioning and object recognition. Section 1 introduces principles and technologies in tactile sensing for robot grippers. Section 2 presents the design and characterization of the sensor cell (taxel) and array. Section 3 introduces object recognition and shape analysis ideas with a few preliminary examples in which geometrical features of small objects are identified. Slip detection for defining the optimum grasp pressure is addressed in Section 4. The paper concludes with future ideas on how to judge or forecast grasp quality from sensory information.
APA, Harvard, Vancouver, ISO, and other styles
40

Yan, Tingting, Xiaochan Wang, Heping Zhu, and Peter Ling. "Evaluation of Object Surface Edge Profiles Detected with a 2-D Laser Scanning Sensor." Sensors 18, no. 11 (November 21, 2018): 4060. http://dx.doi.org/10.3390/s18114060.

Full text
Abstract:
Canopy edge profile detection is a critical component of plant recognition in variable-rate spray control systems. The accuracy of a high-speed 270° radial laser sensor was evaluated in detecting the surface edge profiles of six complex-shaped objects. These objects were toy balls with a smooth pink surface, light brown rectangular cardboard boxes, basketballs with black and red textured surfaces, smooth white cylinders, and two artificial plants of different sizes. Evaluations included three-dimensional (3-D) images of the object surfaces reconstructed from data acquired with the laser sensor at four detection heights (0.25, 0.50, 0.75, and 1.00 m) above each object, five sensor travel speeds (1.6, 2.4, 3.2, 4.0, and 4.8 km h−1), and 8 to 15 horizontal distances to the sensor ranging from 0 to 3.5 m. Edge profiles of the six objects detected with the laser sensor were compared with images taken with a digital camera. The edge similarity score (ESS) was significantly affected by the horizontal distances of the objects, and the influence became weaker when the objects were placed closer to each other. The detection heights and travel speeds also influenced the ESS slightly. The overall average ESS ranged from 0.38 to 0.95 for all objects under all test conditions, thereby providing baseline information for integrating the laser sensor into the future development of greenhouse variable-rate spray systems to improve the efficiency of pesticide, irrigation, and nutrient application through watering booms.
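As a rough sketch of how such radial scans become a 3-D surface image, the snippet below converts per-scan ranges plus the sensor travel speed into a point cloud; the sensor geometry, field of view, and units are assumptions.

```python
# 2-D radial laser scans + travel speed -> 3-D point cloud (assumed geometry).
import numpy as np

def scans_to_points(ranges, speed_kmh, scan_hz, angle_min=-135.0, angle_max=135.0):
    """ranges: (n_scans, n_beams) distances in meters."""
    n_scans, n_beams = ranges.shape
    angles = np.deg2rad(np.linspace(angle_min, angle_max, n_beams))
    step = (speed_kmh / 3.6) / scan_hz          # travel distance per scan (m)
    pts = []
    for i, scan in enumerate(ranges):
        x = np.full(n_beams, i * step)          # along the travel direction
        y = scan * np.cos(angles)               # horizontal offset
        z = scan * np.sin(angles)               # vertical offset
        pts.append(np.stack([x, y, z], axis=1))
    return np.concatenate(pts)

cloud = scans_to_points(np.random.rand(100, 541) * 3.5, speed_kmh=2.4, scan_hz=25)
```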
APA, Harvard, Vancouver, ISO, and other styles
41

Muhammad Saqlain, Syed, Anwar Ghani, Imran Khan, Shahbaz Ahmed Khan Ghayyur, Shahaboddin Shamshirband , Narjes Nabipour, and Manouchehr Shokri. "Image Analysis Using Human Body Geometry and Size Proportion Science for Action Classification." Applied Sciences 10, no. 16 (August 7, 2020): 5453. http://dx.doi.org/10.3390/app10165453.

Full text
Abstract:
Gestures are one of the basic modes of human communication and are usually used to represent different actions. Automatic recognition of these actions forms the basis for solving more complex problems such as human behavior analysis, video surveillance, event detection, and sign language recognition. Action recognition from still images is a challenging task, as key information such as temporal data, object trajectory, and optical flow is not available. Measuring the sizes of different regions of the human body, i.e., step size, arm span, and the lengths of the arm, forearm, and hand, provides valuable clues for identifying human actions. In this article, a framework for the classification of human actions is presented in which humans are detected and localized through faster region-convolutional neural networks followed by morphological image processing techniques. Furthermore, geometric features are extracted from the human blob and incorporated into classification rules for six human actions: standing, walking, single-hand side wave, single-hand top wave, both-hands side wave, and both-hands top wave. The performance of the proposed technique has been evaluated using precision, recall, omission error, and commission error. The proposed technique has been comparatively analyzed in terms of overall accuracy with existing approaches, showing that it performs well in contrast to its counterparts.
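A schematic of the rule-based stage described above, classifying the six actions from blob geometry flags; the thresholds and the feature set are illustrative, and the Faster R-CNN localization and morphological steps are assumed to have run beforehand.

```python
# Illustrative geometric decision rules over a detected human blob.
def classify_action(bbox_w, bbox_h, left_arm_up, right_arm_up,
                    left_arm_side, right_arm_side):
    aspect = bbox_w / bbox_h
    if aspect > 0.8:                       # wide blob suggests a stride
        return "walking"
    if left_arm_up and right_arm_up:
        return "both hands top wave"
    if left_arm_side and right_arm_side:
        return "both hands side wave"
    if left_arm_up or right_arm_up:
        return "single-hand top wave"
    if left_arm_side or right_arm_side:
        return "single-hand side wave"
    return "standing"

print(classify_action(60, 180, False, True, False, False))  # single-hand top wave
```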
APA, Harvard, Vancouver, ISO, and other styles
42

Adriano, Bruno, Naoto Yokoya, Hiroyuki Miura, Masashi Matsuoka, and Shunichi Koshimura. "A Semiautomatic Pixel-Object Method for Detecting Landslides Using Multitemporal ALOS-2 Intensity Images." Remote Sensing 12, no. 3 (February 8, 2020): 561. http://dx.doi.org/10.3390/rs12030561.

Full text
Abstract:
The rapid and accurate mapping of large-scale landslides and other mass movement disasters is crucial for prompt disaster response efforts and immediate recovery planning. As such, remote sensing information, especially from synthetic aperture radar (SAR) sensors, has significant advantages over cloud-covered optical imagery and conventional field survey campaigns. In this work, we introduced an integrated pixel-object image analysis framework for landslide recognition using SAR data. The robustness of our proposed methodology was demonstrated by mapping two different source-induced landslide events, namely, the debris flows following the torrential rainfall that fell over Hiroshima, Japan, in early July 2018 and the coseismic landslide that followed the 2018 Mw6.7 Hokkaido earthquake. For both events, only a pair of SAR images acquired before and after each disaster by the Advanced Land Observing Satellite-2 (ALOS-2) was used. Additional information, such as digital elevation model (DEM) and land cover information, was employed only to constrain the damage detected in the affected areas. We verified the accuracy of our method by comparing it with the available reference data. The detection results showed an acceptable correlation with the reference data in terms of the locations of damage. Numerical evaluations indicated that our methodology could detect landslides with an accuracy exceeding 80%. In addition, the kappa coefficients for the Hiroshima and Hokkaido events were 0.30 and 0.47, respectively.
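As a condensed sketch of the pixel-level step of such a pipeline, the snippet below thresholds the log-ratio of pre- and post-event SAR intensities and constrains candidates with a DEM-derived slope mask; the thresholds and the subsequent object-based refinement stage are assumptions.

```python
# Pixel-level landslide candidates from a pre/post SAR intensity pair.
import numpy as np

def landslide_candidates(pre, post, slope_deg, ratio_thr=1.5, slope_thr=10.0):
    """pre, post: SAR intensity images; slope_deg: slope from a DEM (degrees)."""
    eps = 1e-6
    log_ratio = np.abs(np.log((post + eps) / (pre + eps)))
    changed = log_ratio > ratio_thr          # strong backscatter change
    plausible = slope_deg > slope_thr        # landslides need some slope
    return changed & plausible               # mask passed to the object stage

mask = landslide_candidates(np.random.rand(512, 512),
                            np.random.rand(512, 512),
                            np.random.rand(512, 512) * 45)
```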
APA, Harvard, Vancouver, ISO, and other styles
43

Patel, Chirag, Dulari Bhatt, Urvashi Sharma, Radhika Patel, Sharnil Pandya, Kirit Modi, Nagaraj Cholli, et al. "DBGC: Dimension-Based Generic Convolution Block for Object Recognition." Sensors 22, no. 5 (February 24, 2022): 1780. http://dx.doi.org/10.3390/s22051780.

Full text
Abstract:
Object recognition is widely used as a result of increasing CCTV surveillance and the need for automatic object or activity detection from images or video. The increasing use of various sensor networks has also raised the need for lightweight processing frameworks. Much research has been carried out in this area, but the scope remains colossal, as it deals with open-ended problems such as achieving high accuracy in little time with lightweight frameworks. Convolutional Neural Networks and their variants are widely used in various computer vision tasks, but most CNN architectures are application-specific. There is always a need for generic architectures with better performance. This paper introduces the Dimension-Based Generic Convolution Block (DBGC), which can be used with any CNN to make the architecture generic and to provide a dimension-wise selection of various height, width, and depth kernels. This single unit, which uses the separable convolution concept, provides multiple combinations of dimension-based kernels. It can be used for height-based, width-based, or depth-based dimensions; the same unit can even be used for height and width, width and depth, or depth and height, as well as for combinations involving all three dimensions. The main novelty of DBGC lies in the dimension-selector block included in the proposed architecture. The proposed unoptimized kernel dimensions reduce FLOPs by around one third but also reduce accuracy by around one half; semi-optimized kernel dimensions yield almost the same or higher accuracy with half the FLOPs of the original architecture, while optimized kernel dimensions provide 5 to 6% higher accuracy with around a 10 M reduction in FLOPs.
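A minimal sketch of dimension-wise separable convolution in the spirit of DBGC: a height-only (k x 1) kernel, a width-only (1 x k) kernel, and a pointwise (depth/channel) kernel, selectable per block; the paper's dimension-selector block and exact kernel configurations are richer than this.

```python
# Selectable height/width/depth separable convolution (illustrative).
import torch
import torch.nn as nn

def dim_conv(in_ch, out_ch, dims=("h", "w", "d"), k=3):
    layers = []
    if "h" in dims:    # height-wise depthwise kernel
        layers.append(nn.Conv2d(in_ch, in_ch, (k, 1), padding=(k // 2, 0),
                                groups=in_ch))
    if "w" in dims:    # width-wise depthwise kernel
        layers.append(nn.Conv2d(in_ch, in_ch, (1, k), padding=(0, k // 2),
                                groups=in_ch))
    if "d" in dims:    # depth (channel) mixing via pointwise convolution
        layers.append(nn.Conv2d(in_ch, out_ch, 1))
    return nn.Sequential(*layers)

block = dim_conv(32, 64, dims=("h", "d"))
y = block(torch.randn(1, 32, 56, 56))     # -> (1, 64, 56, 56)
```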
APA, Harvard, Vancouver, ISO, and other styles
44

Andriyanov, N. A., V. E. Dementiev, and A. G. Tashlinskiy. "Detection of objects in the images: from likelihood relationships towards scalable and efficient neural networks." Computer Optics 46, no. 1 (February 2022): 139–59. http://dx.doi.org/10.18287/2412-6179-co-922.

Full text
Abstract:
The relevance of the tasks of detecting and recognizing objects in images and their sequences has only increased over the years. Over the past few decades, a huge number of approaches and methods have been proposed for detecting both anomalies, that is, image areas whose characteristics differ from the predicted ones, and objects of interest, about whose properties a priori information is available, up to a library of reference templates. In this work, an attempt is made to systematically analyze trends in the development of detection approaches and methods, the reasons behind these developments, and the metrics designed to assess the quality and reliability of object detection. Detection techniques based on mathematical models of images are considered, with special attention paid to approaches based on models of random fields and likelihood ratios. The development of convolutional neural networks for recognition problems is analyzed, including a number of pre-trained architectures that provide high efficiency in solving this problem; rather than relying on mathematical models, such architectures are trained on libraries of real images. Among the characteristics for assessing detection quality, the probabilities of errors of the first and second kind, precision and recall, intersection over union, and interpolated average precision are considered. The paper also presents typical tests used to compare various neural network algorithms.
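Two of the quality metrics mentioned above are simple to state in code: intersection over union for a pair of boxes, and 11-point interpolated average precision over a precision-recall curve.

```python
# Standard detection metrics: IoU and 11-point interpolated AP.
import numpy as np

def iou(a, b):
    """Boxes as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

def interpolated_ap(recall, precision):
    """11-point interpolation (PASCAL VOC style); inputs are numpy arrays."""
    return np.mean([np.max(precision[recall >= t], initial=0.0)
                    for t in np.linspace(0, 1, 11)])

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))   # 0.1428...
```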
APA, Harvard, Vancouver, ISO, and other styles
45

Leng, Lu, Ziyuan Yang, Cheonshik Kim, and Yue Zhang. "A Light-Weight Practical Framework for Feces Detection and Trait Recognition." Sensors 20, no. 9 (May 6, 2020): 2644. http://dx.doi.org/10.3390/s20092644.

Full text
Abstract:
Fecal trait examinations are critical in the clinical diagnosis of digestive diseases, and they can effectively reveal various aspects of the health of the digestive system. An automatic feces detection and trait recognition system based on a visual sensor could greatly alleviate the burden on medical inspectors and overcome many sanitation problems, such as infections. Unfortunately, the lack of digital medical images acquired with camera sensors, due to patient privacy, has obstructed the development of fecal examinations. In general, the computing power of an automatic fecal diagnosis machine or a mobile computer-aided diagnosis device is not always enough to run a deep network. Thus, a light-weight practical framework is proposed, which consists of three stages: illumination normalization, feces detection, and trait recognition. Illumination normalization effectively suppresses the illumination variances that degrade recognition accuracy. Neither the shape nor the location of the object is fixed, so shape-based and location-based object detection methods do not work well in this task; this also makes it difficult to label images for training convolutional neural networks (CNNs) for detection. Our segmentation scheme is free from training and labeling. The feces object is accurately detected with a well-designed threshold-based segmentation scheme on a selected color component to reduce background disturbance. Finally, the preprocessed images are categorized into five classes with a light-weight shallow CNN, which is suitable for feces trait examinations in real hospital environments. The experimental results from our collected dataset demonstrate that our framework yields a satisfactory accuracy of 98.4% while requiring low computational complexity and storage.
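A minimal sketch of the training-free detection idea: threshold a selected color component and keep the largest connected component; the choice of the a* channel and Otsu thresholding are assumptions standing in for the paper's own scheme.

```python
# Training-free segmentation on a selected color component (illustrative).
import cv2
import numpy as np

def segment_object(bgr):
    lab = cv2.cvtColor(bgr, cv2.COLOR_BGR2LAB)
    chan = lab[:, :, 1]                               # a* component (assumed)
    _, mask = cv2.threshold(chan, 0, 255,
                            cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    n, labels, stats, _ = cv2.connectedComponentsWithStats(mask)
    if n <= 1:
        return mask
    largest = 1 + np.argmax(stats[1:, cv2.CC_STAT_AREA])  # skip background
    return (labels == largest).astype(np.uint8) * 255

mask = segment_object(np.random.randint(0, 255, (240, 320, 3), np.uint8))
```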
APA, Harvard, Vancouver, ISO, and other styles
46

Joseph, Seena, and Oludayo O. Olugbara. "Detecting Salient Image Objects Using Color Histogram Clustering for Region Granularity." Journal of Imaging 7, no. 9 (September 16, 2021): 187. http://dx.doi.org/10.3390/jimaging7090187.

Full text
Abstract:
Salient object detection represents a preprocessing stage of many practical image applications in the discipline of computer vision. Saliency detection is generally a complex process that mimics the human visual system in processing color images. It is a convoluted process because countless properties inherent in color images can hamper performance. Due to these diverse properties, a method that is appropriate for one category of images may not be suitable for others. The selection of image abstraction is a decisive preprocessing step in saliency computation, and region-based image abstraction has become popular because of its computational efficiency and robustness. However, the performance of existing region-based salient object detection methods depends strongly on the selection of an optimal region granularity. An incorrect selection of region granularity is prone to under- or over-segmentation of color images, which can lead to non-uniform highlighting of salient objects. In this study, color histogram clustering was used to automatically determine suitable homogeneous regions in an image. A region saliency score was computed as a function of color contrast, contrast ratio, spatial feature, and center prior. Morphological operations were ultimately performed to eliminate undesirable artifacts that may be present at the saliency detection stage. Thus, we have introduced a novel, simple, robust, and computationally efficient color histogram clustering method that combines color contrast, contrast ratio, spatial feature, and center prior for detecting salient objects in color images. Experimental validation with different categories of images selected from eight benchmarked corpora has indicated that the proposed method outperforms 30 bottom-up non-deep-learning and seven top-down deep-learning salient object detection methods based on standard performance metrics.
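A compressed sketch of the front end described above: cluster pixel colors, score each cluster by its color contrast to the others, and weight by a center prior; the contrast-ratio and spatial-feature terms of the full method are omitted.

```python
# Color-cluster saliency with a center prior (illustrative front end only).
import numpy as np
from scipy.cluster.vq import kmeans2

def region_saliency(image, k=6):
    """image: (H, W, 3) float RGB in [0, 1]."""
    h, w, _ = image.shape
    pixels = image.reshape(-1, 3)
    centers, labels = kmeans2(pixels, k, minit="++", seed=0)
    ys, xs = np.mgrid[0:h, 0:w]
    center_prior = np.exp(-(((ys - h / 2) ** 2 + (xs - w / 2) ** 2)
                            / (0.3 * (h * h + w * w)))).reshape(-1)
    saliency = np.zeros(h * w)
    for r in range(k):
        contrast = np.linalg.norm(centers[r] - centers, axis=1).sum()
        saliency[labels == r] = contrast        # cluster-level color contrast
    saliency *= center_prior                    # favor central regions
    return (saliency / saliency.max()).reshape(h, w)

sal = region_saliency(np.random.rand(60, 80, 3))
```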
APA, Harvard, Vancouver, ISO, and other styles
47

ZHOU, YU, YINFEI YANG, MENG YI, XIANG BAI, WENYU LIU, and LONGIN JAN LATECKI. "ONLINE MULTIPLE TARGETS DETECTION AND TRACKING FROM MOBILE ROBOT IN CLUTTERED INDOOR ENVIRONMENTS WITH DEPTH CAMERA." International Journal of Pattern Recognition and Artificial Intelligence 28, no. 01 (February 2014): 1455001. http://dx.doi.org/10.1142/s0218001414550015.

Full text
Abstract:
Indoor environments are a common scene in our everyday life, and detecting and tracking multiple targets in such environments is a key component of many applications. However, this task remains challenging due to limited space and intrinsic target appearance variation, e.g., full or partial occlusion, large pose deformation, and scale change. In the proposed approach, we give a novel framework for detection and tracking in indoor environments and extend it to robot navigation. One of the key components of our approach is a virtual top view created from an RGB-D camera, named the ground plane projection (GPP). The key advantage of using GPP is that intrinsic target appearance variation and extrinsic noise are far less likely to appear in GPP than in a regular side-view image. Moreover, it is a very simple task to determine free space in GPP without any appearance learning, even from a moving camera. Hence, GPP is very different from the top-view image obtained from a ceiling-mounted camera. We perform both object detection and tracking in GPP. Two kinds of GPP images are utilized: the gray GPP, which represents the maximal height of the 3D points projecting to each pixel, and the binary GPP, which is obtained by thresholding the gray GPP. For detection, simple connected component labeling is used to detect footprints of targets in the binary GPP. For tracking, a novel Pixel Level Association (PLA) strategy is proposed to link the same target in consecutive frames in the gray GPP. It utilizes optical flow in the gray GPP, which to the best of our knowledge has never been done before. Then we "back project" the detected and tracked objects in GPP to the original side-view (RGB) images; hence we are able to detect and track objects in the side-view images. Our system robustly detects and tracks multiple moving targets in real time. The detection process does not rely on any target model, meaning no training process is needed. Moreover, tracking does not require any manual initialization, since all entering objects are robustly detected. We also extend the framework to robot navigation by tracking. As our experimental results demonstrate, our approach achieves near-perfect detection and tracking results. The performance gain in comparison to state-of-the-art trackers is most significant in the presence of occlusion and background clutter.
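A minimal sketch of building a gray and binary GPP from a depth frame and labeling target footprints; the camera intrinsics, grid cell size, and height threshold are placeholder assumptions.

```python
# Ground plane projection (GPP) from a depth frame, with footprint labeling.
import numpy as np
from scipy import ndimage

def gpp_detect(depth, fx=525.0, fy=525.0, cx=320.0, cy=240.0,
               cell=0.05, h_thr=0.2):
    v, u = np.mgrid[0:depth.shape[0], 0:depth.shape[1]]
    z = depth                                   # forward distance (m)
    x = (u - cx) * z / fx                       # lateral coordinate
    y = (cy - v) * z / fy                       # height coordinate
    gx = ((x - x.min()) / cell).astype(int)     # top-view grid columns
    gz = (z / cell).astype(int)                 # top-view grid rows
    gray = np.zeros((gz.max() + 1, gx.max() + 1))
    np.maximum.at(gray, (gz.ravel(), gx.ravel()), y.ravel())  # max height/cell
    binary = gray > h_thr
    labels, n = ndimage.label(binary)           # connected-component footprints
    return gray, labels, n

gray, labels, n = gpp_detect(np.random.rand(480, 640) * 4 + 0.5)
```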
APA, Harvard, Vancouver, ISO, and other styles
48

Petrova, O., K. Bulatov, V. V. Arlazarov, and V. L. Arlazarov. "Weighted combination of per-frame recognition results for text recognition in a video stream." Computer Optics 45, no. 1 (February 2021): 77–89. http://dx.doi.org/10.18287/2412-6179-co-795.

Full text
Abstract:
The scope of applications of automated document recognition has expanded, and as a result, recognition techniques that do not require specialized equipment have become more relevant. Among such techniques, document recognition using mobile devices is of interest. However, it is not always possible to ensure controlled capture conditions and, consequently, high quality of input images. Unlike specialized scanners, mobile cameras allow using a video stream as input, thus obtaining several images of the recognized object captured with various characteristics. In this case, the problem of combining information from multiple input frames arises. In this paper, we propose a weighting model for combining per-frame recognition results, two approaches to the weighted combination of text recognition results, and two weighting criteria. The effectiveness of the proposed approaches is tested using datasets of identity documents captured with a mobile device camera in different conditions, including perspective distortion of the document image and low lighting. The experimental results show that weighted combination can improve the quality of the text recognition result in the video stream, and that the per-character weighting method with input-image focus estimation as the base criterion achieves the best results on the datasets analyzed.
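A minimal sketch of per-character weighted combination over frames, using variance of the Laplacian as a stand-in focus measure (a common choice, not necessarily the paper's exact criterion).

```python
# Focus-weighted combination of per-frame character probabilities.
import cv2
import numpy as np

def focus_score(gray):
    return cv2.Laplacian(gray, cv2.CV_64F).var()   # sharper frame -> larger score

def combine(frames_probs, frames_gray):
    """frames_probs: list of (n_chars, alphabet) arrays, one per frame."""
    weights = np.array([focus_score(g) for g in frames_gray], dtype=float)
    weights /= weights.sum()
    acc = sum(w * p for w, p in zip(weights, frames_probs))
    return acc.argmax(axis=1)                      # combined per-character decision

frames = [np.random.randint(0, 255, (32, 128), np.uint8) for _ in range(5)]
probs = [np.random.dirichlet(np.ones(36), size=8) for _ in range(5)]
text_ids = combine(probs, frames)
```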
APA, Harvard, Vancouver, ISO, and other styles
49

Zhang, Z., M. Y. Yang, and M. Zhou. "MULTI-SOURCE MULTI-SCALE HIERARCHICAL CONDITIONAL RANDOM FIELD MODEL FOR REMOTE SENSING IMAGE CLASSIFICATION." ISPRS Annals of Photogrammetry, Remote Sensing and Spatial Information Sciences II-3/W4 (March 11, 2015): 293–300. http://dx.doi.org/10.5194/isprsannals-ii-3-w4-293-2015.

Full text
Abstract:
Fusion of remote sensing images and LiDAR data provides complementary information for remote sensing applications such as object classification and recognition. In this paper, we propose a novel multi-source multi-scale hierarchical conditional random field (MSMSH-CRF) model that integrates features extracted from remote sensing images and LiDAR point cloud data for image classification. The MSMSH-CRF model is constructed to exploit the features, the category compatibility of multi-scale images, and the category consistency of multi-source data over regions. The output of the model represents the optimal image classification result. We have evaluated the precision and robustness of the proposed method on airborne data, showing that it outperforms the standard CRF method.
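As a toy illustration of a region-based CRF with two sources, the energy below combines image and LiDAR unary costs with a Potts pairwise term on adjacent regions; the multi-scale hierarchy of MSMSH-CRF is not modeled here.

```python
# Toy two-source CRF energy over regions (illustrative, single scale).
import numpy as np

def energy(labels, unary_img, unary_lidar, edges,
           w_img=1.0, w_lidar=1.0, w_pair=0.5):
    """labels: (n_regions,) ints; unary_*: (n_regions, n_classes) costs;
    edges: list of (i, j) adjacent region pairs."""
    e = sum(w_img * unary_img[i, l] + w_lidar * unary_lidar[i, l]
            for i, l in enumerate(labels))
    e += w_pair * sum(labels[i] != labels[j] for i, j in edges)  # Potts term
    return e

lbl = np.array([0, 0, 1])
print(energy(lbl, np.random.rand(3, 2), np.random.rand(3, 2), [(0, 1), (1, 2)]))
```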
APA, Harvard, Vancouver, ISO, and other styles
50

Zhan, Yantong, and Guoying Zhang. "An Improved OTSU Algorithm Using Histogram Accumulation Moment for Ore Segmentation." Symmetry 11, no. 3 (March 22, 2019): 431. http://dx.doi.org/10.3390/sym11030431.

Full text
Abstract:
When image processing technology is used to analyze mineral particle size in complex scenes, it is difficult to separate the objects from the background with traditional algorithms. This paper proposes an ore image segmentation algorithm based on histogram accumulation moments, which is applied to multi-scenario ore object location and recognition. Firstly, the multi-scale Retinex color restoration algorithm is used to improve the contrast in dark regions and eliminate the shadows generated by stacked, adhering ores. Then, the zeroth-order and first-order cumulative moments close to the selected gray level are calculated, reducing the error caused by noise. Finally, the selected gray level gradually approaches the optimal threshold, avoiding local optima. The method can segment mineral images with unimodal or weakly bimodal histograms effectively and accurately. Ore images from three different scenarios are used to verify the accuracy and effectiveness of the proposed method. The experimental results demonstrate that the proposed algorithm provides better segmentation results than other methods.
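For reference, the classic Otsu threshold can be written directly in terms of the zeroth- and first-order cumulative moments that the improved algorithm manipulates; the Retinex preprocessing and the moment-accumulation refinement of the paper are not reproduced here.

```python
# Otsu threshold from zeroth- and first-order cumulative histogram moments.
import numpy as np

def otsu(gray):
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    p = hist / hist.sum()
    omega = np.cumsum(p)                 # zeroth-order cumulative moment
    mu = np.cumsum(p * np.arange(256))   # first-order cumulative moment
    mu_t = mu[-1]
    denom = omega * (1 - omega)
    denom[denom == 0] = np.nan           # avoid division by zero
    sigma_b = (mu_t * omega - mu) ** 2 / denom   # between-class variance
    return int(np.nanargmax(sigma_b))

t = otsu(np.random.randint(0, 255, (100, 100)))
```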
APA, Harvard, Vancouver, ISO, and other styles
