Journal articles on the topic 'Computer vision-based framework'

To see the other types of publications on this topic, follow the link: Computer vision-based framework.

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the top 50 journal articles for your research on the topic 'Computer vision-based framework.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Browse journal articles on a wide variety of disciplines and organise your bibliography correctly.

1

Almaghout, K., and A. Klimchik. "Vision-Based Robotic Comanipulation for Deforming Cables." Nelineinaya Dinamika 18, no. 5 (2022): 0. http://dx.doi.org/10.20537/nd221213.

Full text
Abstract:
Although deformable linear objects (DLOs), such as cables, are widely used in many fields and activities, the robotic manipulation of these objects is considerably more complex than rigid-body manipulation and remains an open challenge. In this paper, we introduce a new framework in which two robotic arms cooperatively manipulate a DLO from an initial shape to a desired one. Based on visual servoing and computer vision techniques, a perception approach is proposed to detect and sample the DLO as a set of virtual feature points. A manipulation planning approach is then introduced that maps the motion of the manipulators' end-effectors to the DLO points via a Jacobian matrix. To avoid excessive stretching of the DLO, the planning approach generates a path for each DLO point, forming profiles between the initial and desired shapes. It is guaranteed that all these intershape profiles are reachable and maintain the cable length constraint. The framework and the aforementioned approaches are validated in real-life experiments.
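The planning step the abstract describes, mapping end-effector motion to DLO feature-point motion through a Jacobian, can be sketched roughly as follows (the random Jacobian, point count, and gain are hypothetical placeholders; the authors derive the actual mapping from their perception pipeline):

```python
import numpy as np

def step_toward_target(points, targets, J, gain=0.1):
    """One planning step: drive the DLO feature points toward their target
    profile, recovering the end-effector command from the pseudo-inverse of
    the Jacobian J that maps end-effector motion to feature-point motion."""
    error = (targets - points).ravel()            # desired point displacement
    dx_ee = np.linalg.pinv(J) @ (gain * error)    # end-effector command
    points_next = points + (J @ dx_ee).reshape(points.shape)
    return points_next, dx_ee

# toy setup: 4 feature points in 2D (8 coordinates), 2 arms with 2 DOF each
rng = np.random.default_rng(0)
J = rng.standard_normal((8, 4))   # hypothetical point/end-effector Jacobian
points = np.zeros((4, 2))
targets = np.ones((4, 2))
points, dx = step_toward_target(points, targets, J)
```

Repeating such steps along the intershape profiles would move the sampled points toward the desired shape while each increment stays small.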
APA, Harvard, Vancouver, ISO, and other styles
2

Ohta, Yuichi. "3D Image Media and Computer Vision -From CV as Robot Technology to CV as Media Technology-." Journal of Robotics and Mechatronics 9, no. 2 (April 20, 1997): 92–97. http://dx.doi.org/10.20965/jrm.1997.p0092.

Abstract:
The possibility of applying computer vision technology to the development of a new image medium is discussed. Computer vision has been studied as a sensor technology between the real world and computers. Computer graphics, on the other hand, is the interface technology between computers and human beings. The invention of "3D photography" based on computer vision technology will realize a new 3D image medium which connects the real world and human beings via computer. In such a framework, computer vision should be studied as a media technology rather than a robot technology.
3

Morley, Terence, Tim Morris, and Martin Turner. "A Computer Vision Encyclopedia-Based Framework with Illustrative UAV Applications." Computers 10, no. 3 (March 4, 2021): 29. http://dx.doi.org/10.3390/computers10030029.

Abstract:
This paper presents the structure of an encyclopedia-based framework (EbF) in which to develop computer vision systems that incorporate the principles of agile development with focussed knowledge-enhancing information. The novelty of the EbF is that it both specifies the use of drop-in modules, to enable the speedy implementation and modification of systems by the operator, and incorporates knowledge of the input image-capture devices and presentation preferences. This means that the system includes automated parameter selection and operator advice and guidance. Central to this knowledge-enhanced framework is an encyclopedia that stores all information pertaining to the current system operation and can be used by all of the imaging modules and computational runtime components, ensuring that they can adapt to changes within the system or its environment. We demonstrate the implementation of this system over three use cases in computer vision for unmanned aerial vehicles (UAVs), showing how easy it is for novice operators to control and set up utilising simple computational wrapper scripts.
4

SHA, Liang, Guijin WANG, Xinggang LIN, and Kongqiao WANG. "A Framework of Real Time Hand Gesture Vision Based Human-Computer Interaction." IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences E94-A, no. 3 (2011): 979–89. http://dx.doi.org/10.1587/transfun.e94.a.979.

5

Ivorra, Eugenio, Mario Ortega, José Catalán, Santiago Ezquerro, Luis Lledó, Nicolás Garcia-Aracil, and Mariano Alcañiz. "Intelligent Multimodal Framework for Human Assistive Robotics Based on Computer Vision Algorithms." Sensors 18, no. 8 (July 24, 2018): 2408. http://dx.doi.org/10.3390/s18082408.

Abstract:
Assistive technologies help persons with disabilities to improve their accessibility in all aspects of their lives. The AIDE European project contributes to the improvement of current assistive technologies by developing and testing a modular and adaptive multimodal interface customizable to the individual needs of people with disabilities. This paper describes the computer vision algorithms that form part of the multimodal interface developed in the AIDE European project. The main contribution of this computer vision part is its integration with the robotic system and with the other sensory systems (electrooculography (EOG) and electroencephalography (EEG)). The technical achievements solved herein are the algorithm for the selection of objects using the gaze and, especially, the state-of-the-art algorithm for the efficient detection and pose estimation of textureless objects. These algorithms were tested in real conditions and were thoroughly evaluated both qualitatively and quantitatively. The experimental results of the object selection algorithm were excellent (an object selection rate of over 90% in less than 12 s). The detection and pose estimation algorithms, evaluated using the LINEMOD database, performed similarly to the state-of-the-art method and were the most computationally efficient.
6

Ataş, Musa. "Open Cezeri Library: A novel java based matrix and computer vision framework." Computer Applications in Engineering Education 24, no. 5 (May 17, 2016): 736–43. http://dx.doi.org/10.1002/cae.21745.

7

Sharma, Rajeev, and Jose Molineros. "Computer Vision-Based Augmented Reality for Guiding Manual Assembly." Presence: Teleoperators and Virtual Environments 6, no. 3 (June 1997): 292–317. http://dx.doi.org/10.1162/pres.1997.6.3.292.

Abstract:
Augmented reality (AR) has the goal of enhancing a person's perception of the surrounding world, unlike virtual reality (VR), which aims at replacing the perception of the world with an artificial one. An important issue in AR is making the virtual world sensitive to the current state of the surrounding real world as the user interacts with it. For providing the appropriate augmentation stimulus at the right position and time, the system needs some sensor to interpret the surrounding scene. Computer vision holds great potential in providing the necessary interpretation of the scene. While a computer vision-based general interpretation of a scene is extremely difficult, constraints from the assembly domain and a specific marker-based coding scheme are used to develop an efficient and practical solution. We consider the problem of scene augmentation in the context of a human engaged in assembling a mechanical object from its components. Concepts from robot assembly planning are used to develop a systematic framework for presenting augmentation stimuli for this assembly domain. An experimental prototype system, VEGAS (Visual Enhancement for Guiding Assembly Sequences), is described that implements some of the AR concepts for guiding assembly using computer vision.
8

Saha, Sourav, Sahibjot Kaur, Jayanta Basak, and Priya Ranjan Sinha Mahapatra. "A Computer Vision Framework for Automated Shape Retrieval." American Journal of Advanced Computing 1, no. 1 (January 1, 2020): 1–15. http://dx.doi.org/10.15864/ajac.1108.

Abstract:
With the increasing number of images generated every day, textual annotation of images for image mining becomes impractical and inefficient. Thus, computer vision-based image retrieval has received considerable interest in recent years. One of the fundamental characteristics of any image representation of an object is its shape, which plays a vital role in recognizing the object at a primitive level. Keeping this view as the primary motivational focus, we propose a shape-descriptive framework using a multilevel tree-structured representation called Hierarchical Convex Polygonal Decomposition (HCPD). Such a framework explores different degrees of convexity of an object's contour-segments in the course of its construction. The convex and non-convex segments of an object's contour are discovered at every level of the HCPD-tree generation by repetitive convex-polygonal approximation of contour segments. We have also presented a novel shape-string-encoding scheme for representing the HCPD-tree which allows us to use the popular concept of string-edit distance to compute a shape similarity score between two objects. The proposed framework, when deployed for the similar-shape retrieval task, demonstrates reasonably good performance in comparison with other popular shape-retrieval algorithms.
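The shape-string comparison the abstract mentions rests on the standard string-edit (Levenshtein) distance, which can be sketched as follows (the normalization into a similarity score is an assumed illustration, not necessarily the paper's exact scheme):

```python
def edit_distance(a: str, b: str) -> int:
    """Classic Levenshtein distance via dynamic programming; shape strings
    encoding the HCPD-tree would be compared with a distance of this kind."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i
    for j in range(n + 1):
        dp[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[m][n]

def shape_similarity(s1: str, s2: str) -> float:
    """Normalize edit distance into a similarity score in [0, 1]."""
    if not s1 and not s2:
        return 1.0
    return 1.0 - edit_distance(s1, s2) / max(len(s1), len(s2))
```

Two identical shape strings score 1.0, and each edit lowers the score proportionally to the longer string's length.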
9

Farahbakhsh, Ehsan, Rohitash Chandra, Hugo K. H. Olierook, Richard Scalzo, Chris Clark, Steven M. Reddy, and R. Dietmar Müller. "Computer vision-based framework for extracting tectonic lineaments from optical remote sensing data." International Journal of Remote Sensing 41, no. 5 (October 11, 2019): 1760–87. http://dx.doi.org/10.1080/01431161.2019.1674462.

10

Zhuang, Yizhou, Weimin Chen, Tao Jin, Bin Chen, He Zhang, and Wen Zhang. "A Review of Computer Vision-Based Structural Deformation Monitoring in Field Environments." Sensors 22, no. 10 (May 16, 2022): 3789. http://dx.doi.org/10.3390/s22103789.

Abstract:
Computer vision-based structural deformation monitoring techniques have been studied in a large number of applications in the field of structural health monitoring (SHM). Numerous laboratory tests and short-term field applications have contributed to the formation of the basic framework of computer vision deformation monitoring systems, moving towards long-term stable monitoring in field environments. The major contribution of this paper is to analyze the mechanisms that influence the measuring accuracy of computer vision deformation monitoring systems from two perspectives, physical impact and target-tracking-algorithm impact, and to present the existing solutions. Physical impact includes hardware impact and environmental impact, while target-tracking-algorithm impact includes image preprocessing, measurement efficiency, and accuracy. The applicability and limitations of computer vision monitoring algorithms are summarized.
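One target-tracking building block such monitoring systems commonly rely on is template matching by normalized cross-correlation; a minimal numpy sketch, not taken from the paper itself:

```python
import numpy as np

def ncc_match(image, template):
    """Brute-force zero-normalized cross-correlation template matching,
    the kind of target-tracking step vision monitoring systems build on.
    Returns the (row, col) of the best-matching window."""
    th, tw = template.shape
    t = template - template.mean()
    tn = np.linalg.norm(t) + 1e-12
    best, best_pos = -np.inf, (0, 0)
    for r in range(image.shape[0] - th + 1):
        for c in range(image.shape[1] - tw + 1):
            w = image[r:r + th, c:c + tw]
            w = w - w.mean()
            score = (w * t).sum() / (np.linalg.norm(w) * tn + 1e-12)
            if score > best:
                best, best_pos = score, (r, c)
    return best_pos

# synthetic check: plant a textured target and locate it again
img = np.zeros((40, 40))
img[10:15, 20:25] = np.arange(25).reshape(5, 5)
tmpl = img[10:15, 20:25].copy()
```

The zero-mean normalization makes the score insensitive to uniform illumination changes, one of the environmental impacts the paper discusses.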
11

Chen, Guoming, and Qinghua Liu. "Overview of ship image recognition methods based on computer vision." Journal of Physics: Conference Series 2387, no. 1 (November 1, 2022): 012001. http://dx.doi.org/10.1088/1742-6596/2387/1/012001.

Abstract:
This paper introduces the research background of water traffic problems and analyzes the deficiencies of marine equipment in the shipping process. After introducing the concept of image recognition, it briefly reviews the development status of computer vision and computer vision-based image recognition, and summarizes recent research directions for ship image recognition methods, such as neural networks based on machine learning and convolution algorithms, moving from static image recognition to image recognition in complex environments. The advantages, disadvantages, and applicable environments of machine learning and convolutional neural network-based algorithms are analyzed, and finally the idea of building a multilateral-participation information fusion detection framework to support the safe navigation of ships is put forward.
12

DVORAK, J., and H. BUNKE. "CONCEPT AND REALIZATION OF A HYBRID AI TOOL APPLIED TO COMPUTER VISION." International Journal on Artificial Intelligence Tools 03, no. 04 (December 1994): 451–66. http://dx.doi.org/10.1142/s0218213094000261.

Abstract:
Computer vision includes a variety of tasks of different natures, and there are many applications that have a strong need for knowledge representation and use. Typical knowledge representation methods used in computer vision include frames, rules, logic, constraints, and attributed prototype graphs. Although the advantages of hybrid approaches to knowledge representation have been recognized, no hybrid tool for high-level computer vision is available yet. In this paper we first present a general framework for a hybrid knowledge representation tool. It is based on object-oriented programming and offers distinctive features such as high flexibility, coherence, and a clean integration of a collection of knowledge-based techniques. Then we give a brief overview of our computer vision tool VISTO, which was created following the framework discussed in the first part of the paper. With an application example we illustrate the use of VISTO and the advantages of hybrid knowledge representation in comparison to non-hybrid approaches.
13

Deng, Hui, Zhibin Ou, Genjie Zhang, Yichuan Deng, and Mao Tian. "BIM and Computer Vision-Based Framework for Fire Emergency Evacuation Considering Local Safety Performance." Sensors 21, no. 11 (June 2, 2021): 3851. http://dx.doi.org/10.3390/s21113851.

Abstract:
Fire hazards in public buildings may result in serious casualties due to the difficulty of evacuation caused by intricate interior spaces and the unpredictable development of fire situations. It is essential to provide safe and reliable indoor navigation for people trapped in a fire. Distinguished from global shortest-rescue-route planning, a framework focusing on local safety performance is proposed for emergency evacuation navigation. Fully utilizing the information from Building Information Modeling (BIM), this framework automatically constructs a geometry network model (GNM) through Industry Foundation Classes (IFC) and integrates computer vision for indoor positioning. Considering the available local egress time (ALET), a back propagation (BP) neural network is applied to adjust the rescue route according to the fire situation, improving the local safety performance of the evacuation. A campus building is taken as an example to prove the feasibility of the proposed framework. The result indicates that the rescue route generated by the proposed framework is secure and reasonable. The proposed framework provides an idea for using real-time images alone to automatically generate rescue routes when a fire hazard occurs, which is passive, cheap, and convenient.
14

Zhuo, Ying, Lan Yan, Wenbo Zheng, Yutian Zhang, and Chao Gou. "A Novel Vehicle Detection Framework Based on Parallel Vision." Wireless Communications and Mobile Computing 2022 (January 12, 2022): 1–11. http://dx.doi.org/10.1155/2022/9667506.

Abstract:
Autonomous driving has become a prevalent research topic in recent years, attracting the attention of many universities and commercial companies. As human drivers rely on visual information to discern road conditions and make driving decisions, autonomous driving calls for vision systems such as vehicle detection models. These vision models require a large amount of labeled data, while collecting and annotating real traffic data is time-consuming and costly. Therefore, we present a novel vehicle detection framework based on parallel vision to tackle the above issue, using specially designed virtual data to help train the vehicle detection model. We also propose a method to construct large-scale artificial scenes and generate the virtual data for vision-based autonomous driving schemes. Experimental results verify the effectiveness of our proposed framework, demonstrating that the combination of virtual and real data performs better for training the vehicle detection model than real data alone.
15

Sharma, Rajeev, and Narayan Srinivasa. "A framework for active vision-based robot control using neural networks." Robotica 16, no. 3 (May 1998): 309–27. http://dx.doi.org/10.1017/s0263574798000381.

Abstract:
Assembly robots that use an active camera system for visual feedback can achieve greater flexibility, including the ability to operate in an uncertain and changing environment. Incorporating active vision into a robot control loop involves some inherent difficulties, including calibration, and the need for redefining the servoing goal as the camera configuration changes. In this paper, we propose a novel self-organizing neural network that learns a calibration-free spatial representation of 3D point targets in a manner that is invariant to changing camera configurations. This representation is used to develop a new framework for robot control with active vision. The salient feature of this framework is that it decouples active camera control from robot control. The feasibility of this approach is established with the help of computer simulations and experiments with the University of Illinois Active Vision System (UIAVS).
16

Zhang, Xiaoming, and Hui Yin. "A Monocular Vision-Based Framework for Power Cable Cross-Section Measurement." Energies 12, no. 15 (August 6, 2019): 3034. http://dx.doi.org/10.3390/en12153034.

Abstract:
The measurement of the diameter of different layers, the thickness of different layers, and the eccentricity of the insulation layer in the cross-section of power cables is an important part of power cable testing, which currently depends on labor-intensive manual operations. To improve efficiency, automatic measurement methods are urgently needed. In this paper, a monocular vision-based framework for automatic measurement of the diameter, thickness, and eccentricity of interest in the cross-section of power cables is proposed. The proposed framework mainly consists of three steps. In the first step, images of the cable cross-section are captured and undistorted with the camera calibration parameters. In the second step, the contours of each layer are detected in the cable cross-section image. In order to detect complete and accurate contours of each layer, the structural edges in the cross-section image are first detected and divided into individual layers, then unconnected edges are connected by an arc-based method, and finally contours are refined by the proposed break detection and grouping (BDG) and linear trend-based correction (LTBC) algorithms. In the third step, the monocular vision-based cross-section dimension measurement is accomplished by placing a chessboard coplanar with the power cable cross-section plane. The homography matrix mapping pixel coordinates to chessboard world coordinates is estimated, and the diameter, thickness, and eccentricity of specific layers are calculated by the homography matrix-based measurement method. Simulated data and actual cable data are both used to validate the proposed method.
The experimental results show that diameter, minimum thickness, mean thickness and insulation eccentricity of simulated image without additive noise are measured with root mean squared error (RMSE) of 0.424, 0.103 and 0.063 mm, and 0.002, respectively, those of simulated image with additive Gaussian noise and salt and pepper noise are measured with RMSE of 0.502, 0.243 and 0.058 mm and 0.001. Diameter, minimum thickness and mean thickness of actual cable images are measured with average RMSE of 0.768, 0.308 and 0.327 mm. The measurement error of insulation eccentricity of actual cable image is comparatively large, and the measurement accuracy should be improved.
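The homography-based measurement step described above can be illustrated with a minimal direct-linear-transform sketch (the correspondences and the synthetic homography below are assumptions for demonstration; the paper's calibrated pipeline is more elaborate):

```python
import numpy as np

def estimate_homography(px, wd):
    """Direct linear transform: estimate H such that wd ~ H @ [x, y, 1]
    from at least four pixel/world correspondences, standing in for the
    coplanar-chessboard calibration step described in the abstract."""
    A = []
    for (x, y), (X, Y) in zip(px, wd):
        A.append([-x, -y, -1, 0, 0, 0, X * x, X * y, X])
        A.append([0, 0, 0, -x, -y, -1, Y * x, Y * y, Y])
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    return Vt[-1].reshape(3, 3)   # null-space vector = flattened H

def to_world(H, pt):
    """Map a pixel coordinate to world (chessboard-plane) coordinates."""
    v = H @ np.array([pt[0], pt[1], 1.0])
    return v[:2] / v[2]

# synthetic ground truth: 0.5 mm per pixel plus a shift of the origin
H_true = np.array([[0.5, 0.0, 2.0], [0.0, 0.5, -1.0], [0.0, 0.0, 1.0]])
px = [(0, 0), (100, 0), (100, 80), (0, 80), (50, 40)]
wd = [tuple(to_world(H_true, p)) for p in px]
H = estimate_homography(px, wd)
```

Once H is recovered, any contour point detected in pixels can be mapped to millimetres on the cross-section plane, which is what makes the diameter and thickness measurements metric.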
17

Das, Abhishek, and Sourav Saha. "A Computer Vision based Framework for Detecting Potholes on Asphalt-Road using Machine Learning Approach." American Journal of Advanced Computing 1, no. 4 (October 1, 2020): 1–5. http://dx.doi.org/10.15864/ajac.1402.

Abstract:
This paper proposes a pothole detection framework which may assist pedestrians in avoiding potholes on the roads by giving prior warnings. The basic idea of this framework is to detect potholes on asphalt roads by analyzing images of the road surface. The proposed framework combines image processing techniques with machine learning methods, primarily exploring edges, Histograms of Gradients, and Local Binary Patterns of an image frame to extract features for detecting the presence of potholes on the road surface. The experimental results indicate the promising potential of the proposed framework for the detection of potholes on asphalt roads.
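As a rough illustration of the feature-extraction stage such a pipeline uses, here is a tiny gradient-orientation histogram in the spirit of HOG (a simplified stand-in, not the paper's implementation):

```python
import numpy as np

def gradient_histogram(patch, bins=9):
    """A miniature stand-in for the HOG step of such a pipeline: a
    histogram of gradient orientations, weighted by gradient magnitude,
    computed over a single image patch and normalized to sum to one."""
    gy, gx = np.gradient(patch.astype(float))   # (d/drow, d/dcol)
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)     # unsigned orientation
    hist, _ = np.histogram(ang, bins=bins, range=(0, np.pi), weights=mag)
    s = hist.sum()
    return hist / s if s > 0 else hist

# a vertical edge concentrates energy in the horizontal-gradient bin
patch = np.zeros((16, 16))
patch[:, 8:] = 1.0
feat = gradient_histogram(patch)
```

A real detector would tile the road image into cells, concatenate such histograms (plus LBP codes), and feed the vector to a classifier.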
18

Guerrero-Osuna, Héctor A., Jesús Antonio Nava-Pintor, Carlos Alberto Olvera-Olvera, Teodoro Ibarra-Pérez, Rocío Carrasco-Navarro, and Luis F. Luque-Vega. "Educational Mechatronics Training System Based on Computer Vision for Mobile Robots." Sustainability 15, no. 2 (January 11, 2023): 1386. http://dx.doi.org/10.3390/su15021386.

Abstract:
Driven by the global context, several efforts have been made toward the digital transformation of education. Technology-based active learning has become pivotal in pursuing a more flexible education system. This work presents the development of an Educational Mechatronics (EM) training system based on computer vision that performs as a positioning system for mobile robots in the 2D plane. The results show that the precision, accuracy, and resolution obtained by the EM training system are suitable for robotics applications comprising position, velocity, and acceleration variables. Moreover, an instructional design aligned with the EM conceptual framework, using the EM training system and a LEGO mobile robot, is presented to construct the mechatronic concept: line segment.
19

Qi, Wen, Ning Wang, Hang Su, and Andrea Aliverti. "DCNN based human activity recognition framework with depth vision guiding." Neurocomputing 486 (May 2022): 261–71. http://dx.doi.org/10.1016/j.neucom.2021.11.044.

20

Tiwari, Rohit Kumar, and Gyanendra K. Verma. "A Computer Vision based Framework for Visual Gun Detection Using Harris Interest Point Detector." Procedia Computer Science 54 (2015): 703–12. http://dx.doi.org/10.1016/j.procs.2015.06.083.

21

Alam, Muhammad Shahab, Mansoor Alam, Muhammad Tufail, Muhammad Umer Khan, Ahmet Güneş, Bashir Salah, Fazal E. Nasir, Waqas Saleem, and Muhammad Tahir Khan. "TobSet: A New Tobacco Crop and Weeds Image Dataset and Its Utilization for Vision-Based Spraying by Agricultural Robots." Applied Sciences 12, no. 3 (January 26, 2022): 1308. http://dx.doi.org/10.3390/app12031308.

Abstract:
Selective agrochemical spraying is a highly intricate task in precision agriculture. It requires spraying equipment to distinguish between crops (plants) and weeds and perform spray operations in real time accordingly. The study presented in this paper entails the development of two convolutional neural network (CNN)-based vision frameworks, i.e., Faster R-CNN and YOLOv5, for the detection and classification of tobacco crops/weeds in real time. An essential requirement for a CNN is to pre-train it well on a large dataset to distinguish crops from weeds; later, the same trained network can be utilized in real fields. We present an open-access image dataset (TobSet) of tobacco plants and weeds acquired from local fields at different growth stages and under varying lighting conditions. The TobSet comprises 7000 images of tobacco plants and 1000 images of weeds and bare soil, taken manually with digital cameras periodically over two months. Both vision frameworks are trained and then tested using this dataset. The Faster R-CNN-based vision framework manifested supremacy over the YOLOv5-based vision framework in terms of accuracy and robustness, whereas the YOLOv5-based vision framework demonstrated faster inference. Experimental evaluation of the system was performed in tobacco fields via a four-wheeled mobile robot sprayer controlled using a computer equipped with an NVIDIA GTX 1650 GPU. The results demonstrate that the Faster R-CNN- and YOLOv5-based vision systems can analyze plants at 10 and 16 frames per second (fps) with a classification accuracy of 98% and 94%, respectively. Moreover, the precise smart application of pesticides with the proposed system offered a 52% reduction in pesticide usage by spotting the targets only, i.e., tobacco plants.
22

Belyakov, P. V., M. B. Nikiforov, E. R. Muratov, and O. V. Melnik. "Stereo vision-based variational optical flow estimation." E3S Web of Conferences 224 (2020): 01027. http://dx.doi.org/10.1051/e3sconf/202022401027.

Abstract:
Optical flow computation is one of the most important tasks in computer vision. The article deals with a modification of the variational method of optical flow computation with respect to its application in stereo vision. Such approaches are traditionally based on a brightness constancy assumption and a gradient constancy assumption during pixel motion. A smoothness assumption also restricts motion discontinuities, i.e., the smoothness of the vector field of pixel velocities is assumed. It is proposed to extend the functional of the optical flow computation in a similar way by adding the a priori known extrinsic parameters of the stereo cameras and to minimize this joint model of optical flow computation. The article presents a partial differential equations framework in image processing and a numerical scheme for its implementation. The performed experimental evaluation demonstrates that the proposed method gives smaller errors than traditional methods of optical flow computation.
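The brightness constancy assumption underlying such methods, Ix*u + Iy*v + It = 0, can be illustrated with a least-squares (Lucas-Kanade-style) solve on a synthetic pair; note the paper itself minimizes a variational functional with smoothness and stereo terms, so this is only a sketch of the shared assumption:

```python
import numpy as np

def flow_from_brightness_constancy(I0, I1):
    """Least-squares solve of Ix*u + Iy*v + It = 0 over the whole frame,
    a Lucas-Kanade-style illustration of the brightness constancy
    assumption (the paper minimizes a variational functional instead)."""
    Iy, Ix = np.gradient(I0)          # np.gradient returns (d/drow, d/dcol)
    It = I1 - I0
    A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)
    b = -It.ravel()
    (u, v), *_ = np.linalg.lstsq(A, b, rcond=None)
    return u, v

# synthetic pair: a smooth Gaussian bump translated one pixel along x
x, y = np.meshgrid(np.arange(64), np.arange(64))
I0 = np.exp(-((x - 32) ** 2 + (y - 32) ** 2) / 120.0)
I1 = np.exp(-((x - 33) ** 2 + (y - 32) ** 2) / 120.0)
u, v = flow_from_brightness_constancy(I0, I1)
```

The recovered flow is close to (1, 0), matching the imposed shift; the variational and stereo-constrained formulations refine exactly this data term.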
23

Ramesh, M., and K. Mahesh. "Sports Video Classification Framework Using Enhanced Threshold Based Keyframe Selection Algorithm and Customized CNN on UCF101 and Sports1-M Dataset." Computational Intelligence and Neuroscience 2022 (December 8, 2022): 1–15. http://dx.doi.org/10.1155/2022/3218431.

Abstract:
The computer vision community has taken a keen interest in recent developments in activity recognition and classification in sports videos. Advancements in sports have broadened the technical interest of the computer vision community in performing various types of research. Images and videos are the most frequently used components in computer vision. There are numerous models and methods that can be used to classify videos; at the same time, there is no specific framework or model for classifying and identifying sports videos. Hence, we propose a framework based on deep learning to classify sports videos with their appropriate class labels. The framework performs sports video classification using two different benchmark datasets, UCF101 and the Sports1-M dataset. The objective of the framework is to help sports players and trainers identify specific sports from a large data source, then analyze them and perform better in the future. The framework takes a sports video as input and produces the class label as output. In between, the framework has numerous intermediary processes. Preprocessing is the first step in the proposed framework, which includes frame extraction and noise reduction. Keyframe selection, carried out by candidate frame extraction and an enhanced threshold-based frame difference algorithm, is the second step. The final step of the sports video classification framework is feature extraction and classification using a CNN. The proposed framework's results are compared with pretrained neural networks such as AlexNet and GoogLeNet. Three different evaluation metrics are used to measure the accuracy and performance of the framework.
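A threshold-based frame-difference keyframe selector of the kind the second step describes might look like this (the mean-plus-k-standard-deviations threshold is an assumed variant, not the paper's exact algorithm):

```python
import numpy as np

def select_keyframes(frames, k=1.0):
    """Flag frame i as a keyframe when its mean absolute difference from
    frame i-1 exceeds mean + k*std of all frame differences (an assumed
    enhanced-threshold variant; k is a tuning constant)."""
    diffs = np.array([np.abs(b - a).mean() for a, b in zip(frames, frames[1:])])
    thresh = diffs.mean() + k * diffs.std()
    return [i + 1 for i, d in enumerate(diffs) if d > thresh]

# synthetic clip: identical frames except for a shot change at index 5
frames = [np.zeros((8, 8)) for _ in range(10)]
for f in frames[5:]:
    f += 1.0
```

Only frames that differ sharply from their predecessor survive, so the CNN in the final step sees a compact, content-bearing subset of the video.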
24

He, Zhaoliang, Hongshan Li, Zhi Wang, Shutao Xia, and Wenwu Zhu. "Adaptive Compression for Online Computer Vision: An Edge Reinforcement Learning Approach." ACM Transactions on Multimedia Computing, Communications, and Applications 17, no. 4 (November 30, 2021): 1–23. http://dx.doi.org/10.1145/3447878.

Abstract:
With the growth of computer vision-based applications, an explosive number of images has been uploaded to cloud servers that host such online computer vision algorithms, usually in the form of deep learning models. JPEG has been used as the de facto compression and encapsulation method for images. However, the standard JPEG configuration does not always perform well for compressing images that are to be processed by a deep learning model; for example, the standard quality level of JPEG leads to a 50% size overhead (compared with the best quality level selection) on ImageNet under the same inference accuracy in popular computer vision models (e.g., InceptionNet and ResNet). Knowing this, designing a better JPEG configuration for online computer vision-based services is still extremely challenging. First, cloud-based computer vision models are usually a black box to end-users; thus, it is challenging to design a JPEG configuration without knowing their model structures. Second, the "optimal" JPEG configuration is not fixed; instead, it is determined by confounding factors, including the characteristics of the input images and the model, the expected accuracy and image size, and so forth. In this article, we propose a reinforcement learning (RL)-based adaptive JPEG configuration framework, AdaCompress. In particular, we design an edge (i.e., user-side) RL agent that learns the optimal compression quality level to achieve an expected inference accuracy and upload image size, only from the online inference results, without knowing details of the model structures. Furthermore, we design an explore-exploit mechanism to let the framework quickly switch agents when it detects a performance degradation, mainly due to input change (e.g., images captured across daytime and night).
Our evaluation experiments using real-world online computer vision-based APIs from Amazon Rekognition, Face++, and Baidu Vision show that our approach outperforms existing baselines by reducing the size of images by one-half to one-third while the overall classification accuracy only decreases slightly. Meanwhile, AdaCompress adaptively re-trains or re-loads the RL agent promptly to maintain the performance.
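The core idea, an agent that learns a quality level from online feedback alone, can be caricatured with a simple epsilon-greedy bandit (a drastic simplification of AdaCompress; the reward function and quality set below are invented for illustration):

```python
import random

QUALITIES = [95, 85, 75, 65, 55]   # candidate JPEG quality levels (assumed)

def run_bandit(feedback, steps=500, eps=0.1, seed=0):
    """Epsilon-greedy bandit over quality levels. `feedback(q)` returns a
    scalar reward trading inference accuracy against upload size; the
    agent learns which level maximizes it from feedback alone."""
    rng = random.Random(seed)
    value = {q: 0.0 for q in QUALITIES}
    count = {q: 0 for q in QUALITIES}
    for _ in range(steps):
        if rng.random() < eps:
            q = rng.choice(QUALITIES)                   # explore
        else:
            q = max(QUALITIES, key=lambda x: value[x])  # exploit
        r = feedback(q)
        count[q] += 1
        value[q] += (r - value[q]) / count[q]           # incremental mean
    return max(QUALITIES, key=lambda x: value[x])

# invented stand-in environment: accuracy saturates above quality 75
# while the size penalty keeps growing, so the reward peaks at 75
def feedback(q):
    accuracy = min(1.0, 0.5 + 0.0067 * q)
    size_penalty = 0.004 * q
    return accuracy - size_penalty
```

The real system replaces this toy reward with measured inference accuracy and image size from the cloud API, and uses a full RL agent plus the explore-exploit switch described above.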
25

Liu, Yishu, and Jun Li. "Brand Marketing Decision Support System Based on Computer Vision and Parallel Computing." Wireless Communications and Mobile Computing 2022 (March 30, 2022): 1–14. http://dx.doi.org/10.1155/2022/7416106.

Full text
Abstract:
With the rapid development of information technology, decision support systems that can assist business managers in making scientific decisions have become a focus of research. At present, there are few related studies, and at the brand marketing level, even fewer combine smart technology. Based on computer vision technology and parallel computing algorithms, this paper presents an in-depth study of brand marketing decision support systems. First, computer vision technology and the Viola-Jones face detection framework are used to detect consumers’ faces, and the classic convolutional neural network model AlexNet is used for gender judgment and age prediction to analyze consumer groups. Then, parallel computing is applied to optimize the genetic algorithm and improve its running speed. The brand marketing decision support system is designed based on the above technology and algorithms, the relevant data of the L brand are analyzed, and the functional structure of the system is divided into three parts: customer market analysis, performance evaluation, and demand forecasting. The ROC curve of the Viola-Jones face detection framework shows its superior performance. After 500 iterations of the AlexNet model, the validation-set loss of the network is stable at 1.8, and the validation-set accuracy is stable at 38%. The parallel genetic algorithm runs 1.8 times faster than the serial genetic algorithm at the lowest and 9 times faster at the highest. The minimum prediction error is 0.17% and the maximum is 2%, which shows that the system can make accurate predictions based on previous years’ data. Computer vision is a technique that converts still image or video data into a decision or a new representation; all such transformations are done to accomplish a specific purpose.
Therefore, a brand marketing decision support system based on computer vision and parallel computing can help managers make scientific decisions, save production costs, reduce inventory pressure, and enhance the brand’s competitive advantage.
APA, Harvard, Vancouver, ISO, and other styles
26

Varghese, Prathibha, and Dr G. Arockia Selva Saroja. "A Novel Hexagonal Psuedo framework for Edge Detection Operators on Hexagonal Framework." International Journal of Electrical and Electronics Research 10, no. 4 (December 30, 2022): 1036–42. http://dx.doi.org/10.37391/ijeer.100446.

Full text
Abstract:
Edge detection using a gradient-based detector is a gold-standard method for identifying and analyzing different edge points in an image. A hexagonal grid structure is a powerful architecture well suited to intelligent human-computer vision: it provides the best angle resolution, good packing density, high sampling efficiency, equidistant pixels, and consistent connectivity. Edge detection on a hexagonal framework therefore allows more accurate and efficient computation. However, all available real-time hardware devices capture and display images in rectangular-shaped pixels, so an alternative software approach to mimic hexagonal pixels is modeled in this paper. In this research work, an innovative method to create a pseudo-hexagonal lattice has been simulated, and the performance of various edge detection operators on the hexagonal framework is compared using the quantitative and qualitative metrics of a grayscale image in both square and hexagonal lattices. The quantitative performance of edge detection on the hexagonal framework is compared based on the experimental facts. The pseudo-hexagonal lattice structure proves to be well aligned with human vision.
APA, Harvard, Vancouver, ISO, and other styles
27

P A, Spoorthi, Anitha G S, Bhavana S J, and Jayashree A M. "Classification of Rail Track Defects Based on Computer Vision Using DNN." International Journal for Research in Applied Science and Engineering Technology 10, no. 7 (July 31, 2022): 4983–94. http://dx.doi.org/10.22214/ijraset.2022.45968.

Full text
Abstract:
The economic status of a country depends on trading, which needs transportation. Railways are the most preferred mode of transportation in India, as most profit-oriented freight and movement of people is carried by rail. Hence, it is necessary to monitor track health frequently using an automated crack detection system. The proposed framework focuses on a Python implementation to detect track defects based on computer vision using image processing techniques. The proposed work uses a CNN algorithm through the YOLOv5 model. YOLOv5 is one of the best models for achieving high accuracy in object detection and has become an industry standard due to its speed and accuracy. The preprocessed image is fed to the CNN classifier to obtain the type of track. The proposed work also identifies the severity or non-severity of defects and suggests precautions. Automatic communication occurs, whereby a message is sent to authorized people of the railway department. The accuracy of the proposed work on the procured images is more than 95%.
APA, Harvard, Vancouver, ISO, and other styles
28

Cui, Jingwen, Jianping Zhang, Guiling Sun, and Bowen Zheng. "Extraction and Research of Crop Feature Points Based on Computer Vision." Sensors 19, no. 11 (June 4, 2019): 2553. http://dx.doi.org/10.3390/s19112553.

Full text
Abstract:
Based on computer vision technology, this paper proposes a method for identifying and locating crops in order to successfully capture them in the process of automatic crop picking. This method innovatively combines the YOLOv3 algorithm under the DarkNet framework with a point cloud image coordinate matching method, and accomplishes this aim well. Firstly, RGB (red, green, and blue channel) images and depth images are obtained using the Kinect v2 depth camera. Secondly, the YOLOv3 algorithm is used to identify the various types of target crops in the RGB images, and the feature points of the target crops are determined. Finally, the 3D coordinates of the feature points are displayed on the point cloud images. Compared with other methods, this method of crop identification has high accuracy and small positioning error, which lays a good foundation for the subsequent harvesting of crops using mechanical arms. In summary, the method used in this paper can be considered effective.
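The step from a detected feature pixel plus its depth value to a 3D coordinate, as needed when matching detections to the point cloud, can be sketched with the standard pinhole model. The intrinsics below are illustrative placeholders, not the paper's calibration:

```python
def pixel_to_3d(u, v, depth, fx, fy, cx, cy):
    """Back-project pixel (u, v) with depth Z (metres) to camera-frame XYZ
    using the pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)

# Kinect-v2-like intrinsics; illustrative values only, not a real calibration.
fx, fy, cx, cy = 365.0, 365.0, 256.0, 212.0
point = pixel_to_3d(300.0, 250.0, 1.2, fx, fy, cx, cy)
```

In practice the RGB and depth frames must first be registered to a common camera frame before this back-projection is valid.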
APA, Harvard, Vancouver, ISO, and other styles
29

Lu, Guo Liang, Yi Qi Zhou, and Xue Yong Li. "Mechanical Parts Recognition with 3D Graphical Modeling." Applied Mechanics and Materials 644-650 (September 2014): 4505–8. http://dx.doi.org/10.4028/www.scientific.net/amm.644-650.4505.

Full text
Abstract:
Computer vision-based mechanical parts recognition has received much research attention in recent years. In this paper, we present a new framework to address this problem. The framework utilizes computer graphics technology to model mechanical parts. Recognition is realized by comparing a query image to instance images using an improved affine transformation based on particle swarm optimization (PSO). Our experiments show that the proposed framework outperforms conventional invariant-moment-based recognition methods in recognition rate.
APA, Harvard, Vancouver, ISO, and other styles
30

Chakroun, Marwa, Sonda Ammar Bouhamed, Imene Khanfir Kallel, Basel Solaiman, and Houda Derbel. "Feature selection based on discriminative power under uncertainty for computer vision applications." ELCVIA Electronic Letters on Computer Vision and Image Analysis 21, no. 1 (June 28, 2022): 111–20. http://dx.doi.org/10.5565/rev/elcvia.1361.

Full text
Abstract:
Feature selection is a prolific research field, which has been widely studied in recent decades and successfully applied to numerous computer vision systems. It mainly aims to reduce dimensionality and thus system complexity. Features do not have the same importance across classes: some contribute to class representation while others contribute to class separation. In this paper, a new feature selection method based on discriminative power is proposed to select the relevant features under an uncertain framework, where the uncertainty is expressed through a possibility distribution. In an uncertain context, our method shows its ability to select features that can both represent and discriminate between classes.
APA, Harvard, Vancouver, ISO, and other styles
31

Bi, Qilin, Minling Lai, Huiling Tang, Yanyao Guo, Jinyuan Li, Xinhong Zeng, and Zhijun Liu. "Precise Inspection of Geometric Parameters for Polyvinyl Chloride Pipe Section Based on Computer Vision." Traitement du Signal 38, no. 6 (December 31, 2021): 1647–55. http://dx.doi.org/10.18280/ts.380608.

Full text
Abstract:
The precise inspection of geometric parameters is crucial for quality control in the context of Industry 4.0. The current technique of precise inspection depends on the operation of professional personnel, and the measuring accuracy is restricted by the proficiency of operators. To overcome these limitations, this paper proposes a precise inspection framework for the geometric parameters of polyvinyl chloride (PVC) pipe section (G-PVC), using low-cost visual sensors and high-precision computer vision algorithms. Firstly, a robust imaging system was built to acquire images of a PVC pipe section under irregular illumination changes. Next, an engineering semantic model was established to calculate G-PVC like inner diameter, outer diameter, wall thickness, and roundness. After that, a region-of-interest (ROI) extraction algorithm was combined with an improved edge operator to obtain the coordinates of measured points on the PVC end-face image in a stable and precise manner. Finally, our framework was proved highly precise and robust through experiments.
APA, Harvard, Vancouver, ISO, and other styles
32

Pan, Zaolin, Cheng Su, Yichuan Deng, and Jack Cheng. "Image2Triplets: A computer vision-based explicit relationship extraction framework for updating construction activity knowledge graphs." Computers in Industry 137 (May 2022): 103610. http://dx.doi.org/10.1016/j.compind.2022.103610.

Full text
APA, Harvard, Vancouver, ISO, and other styles
33

Lai, Yutao, Jianye Chen, Qi Hong, Zhekai Li, Haitian Liu, Benhao Lu, Ruihao Ma, et al. "Framework for long-term structural health monitoring by computer vision and vibration-based model updating." Case Studies in Construction Materials 16 (June 2022): e01020. http://dx.doi.org/10.1016/j.cscm.2022.e01020.

Full text
APA, Harvard, Vancouver, ISO, and other styles
34

Semeniuta, Oleksandr, and Petter Falkman. "EPypes: a framework for building event-driven data processing pipelines." PeerJ Computer Science 5 (February 11, 2019): e176. http://dx.doi.org/10.7717/peerj-cs.176.

Full text
Abstract:
Many data processing systems are naturally modeled as pipelines, where data flows through a network of computational procedures. This representation is particularly suitable for computer vision algorithms, which in most cases possess complex logic and a large number of parameters to tune. In addition, online vision systems, such as those in the industrial automation context, have to communicate with other distributed nodes. When developing a vision system, one normally proceeds from ad hoc experimentation and prototyping to highly structured system integration. The early stages of this continuum are characterized by the challenges of developing a feasible algorithm, while the latter deal with composing the vision function with other components in a networked environment. In between, one strives to manage the complexity of the developed system, as well as to preserve existing knowledge. To tackle these challenges, this paper presents EPypes, an architecture and Python-based software framework for developing vision algorithms in the form of computational graphs and their integration with distributed systems based on publish-subscribe communication. EPypes facilitates flexibility of algorithm prototyping, as well as provides a structured approach to managing algorithm logic and exposing the developed pipelines as a part of online systems.
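A minimal sketch of the computational-graph idea, using only Python's standard library, executes nodes in topological order. This is not the actual EPypes API, just an illustration of the pipeline-as-graph pattern:

```python
from graphlib import TopologicalSorter  # stdlib, Python 3.9+

def run_pipeline(graph, funcs, inputs):
    """Execute a computational graph. `graph` maps node -> set of predecessor
    nodes; `funcs` maps node -> callable taking predecessor results (sorted
    by name); `inputs` binds the source nodes to concrete data."""
    results = dict(inputs)
    for node in TopologicalSorter(graph).static_order():
        if node not in results:                 # source nodes are pre-bound
            deps = sorted(graph[node])
            results[node] = funcs[node](*[results[d] for d in deps])
    return results

# Toy vision-style pipeline on plain numbers: image -> blur -> threshold.
graph = {"image": set(), "blur": {"image"}, "thresh": {"blur"}}
funcs = {"blur": lambda x: x * 0.5, "thresh": lambda x: 1 if x > 10 else 0}
out = run_pipeline(graph, funcs, {"image": 30})
```

Because every node's result is kept in `results`, intermediate outputs remain inspectable, which is one reason the graph representation suits prototyping.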
APA, Harvard, Vancouver, ISO, and other styles
35

Balanji, Hamid Majidi, Ali Emre Turgut, and Lutfi Taner Tunc. "A novel vision-based calibration framework for industrial robotic manipulators." Robotics and Computer-Integrated Manufacturing 73 (February 2022): 102248. http://dx.doi.org/10.1016/j.rcim.2021.102248.

Full text
APA, Harvard, Vancouver, ISO, and other styles
36

Mu, Guangyu, and Tingting Li. "Video-Based Metric Learning Framework for Basketball Skill Assessment." International Journal of e-Collaboration 19, no. 5 (January 27, 2023): 1–13. http://dx.doi.org/10.4018/ijec.316875.

Full text
Abstract:
Video-based human action recognition has become one of the research hotspots in the field of computer vision in recent years and has been widely used in the fields of intelligent human-computer interaction and virtual reality. However, most of the current existing methods and public datasets are constructed for human daily activities, and the assessment of basketball skills is still a challenging problem. In order to solve the above issues, in this paper, the authors propose a coarse-to-fine video-based metric learning framework for basketball skills assessment. Specifically, they first use a variety of models to jointly represent the action video, and then the optimal distance metric between videos is learned based on the representation. Finally, based on the distance metric, a query video is coarsely classified to obtain the corresponding label of video action, and then the video is finely classified to judge whether the action is standardized. The experiments on a collected dataset show that the proposed framework can better identify and assess the non-standard actions of basketball.
APA, Harvard, Vancouver, ISO, and other styles
37

Dang, Minh. "Efficient Vision-Based Face Image Manipulation Identification Framework Based on Deep Learning." Electronics 11, no. 22 (November 17, 2022): 3773. http://dx.doi.org/10.3390/electronics11223773.

Full text
Abstract:
Image manipulation of the human face is a trending topic of image forgery, which is done by transforming or altering face regions using a set of techniques to accomplish desired outputs. Manipulated face images are spreading on the internet due to the rise of social media, causing various societal threats. It is challenging to detect manipulated face images effectively because (i) there have been a limited number of manipulated face datasets, as most datasets contain images generated by GAN models; (ii) previous studies have mainly extracted handcrafted features and fed them into machine learning algorithms to perform manipulated face detection, which was complicated, error-prone, and laborious; and (iii) previous models failed to explain why they achieved good performance. In order to address these issues, this study introduces a large face manipulation dataset containing vast variations of manipulated images created and manually validated using various manipulation techniques. The dataset is then used to train a fine-tuned RegNet model to detect manipulated face images robustly and efficiently. Finally, a manipulated region analysis technique is implemented to provide some in-depth insights into the manipulated regions. The experimental results revealed that the RegNet model showed the highest classification accuracy of 89% on the proposed dataset compared to standard deep learning models.
APA, Harvard, Vancouver, ISO, and other styles
38

Troncoso-Pastoriza, Francisco, Pablo Eguía-Oller, Rebeca Díaz-Redondo, Enrique Granada-Álvarez, and Aitor Erkoreka. "Orientation-Constrained System for Lamp Detection in Buildings Based on Computer Vision." Sensors 19, no. 7 (March 28, 2019): 1516. http://dx.doi.org/10.3390/s19071516.

Full text
Abstract:
Computer vision is used in this work to detect lighting elements in buildings with the goal of improving the accuracy of previous methods to provide a precise inventory of the location and state of lamps. Using the framework developed in our previous works, we introduce two new modifications to enhance the system: first, a constraint on the orientation of the detected poses in the optimization methods for both the initial and the refined estimates based on the geometric information of the building information modelling (BIM) model; second, an additional reprojection error filtering step to discard the erroneous poses introduced with the orientation restrictions, keeping the identification and localization errors low while greatly increasing the number of detections. These enhancements are tested in five different case studies with more than 30,000 images, with results showing improvements in the number of detections, the percentage of correct model and state identifications, and the distance between detections and reference positions.
APA, Harvard, Vancouver, ISO, and other styles
39

Yarkony, Julian, Yossiri Adulyasak, Maneesh Singh, and Guy Desaulniers. "Data Association via Set Packing for Computer Vision Applications." INFORMS Journal on Optimization 2, no. 3 (July 2020): 167–91. http://dx.doi.org/10.1287/ijoo.2019.0030.

Full text
Abstract:
Significant progress has been made in the field of computer vision because of the development of supervised machine learning algorithms, which efficiently extract information from high-dimensional data such as images and videos. Such techniques are particularly effective at recognizing the presence or absence of entities in the domains where labeled data are abundant. However, supervised learning is not sufficient in applications where one needs to annotate each unique entity in crowded scenes respecting known domain-specific structures of those entities. This problem, known as data association, provides fertile ground for the application of combinatorial optimization. In this review paper, we present a unified framework based on column generation for some computer vision applications, namely multiperson tracking, multiperson pose estimation, and multicell segmentation, which can be formulated as set packing problems with a massive number of variables. To solve them, column generation algorithms are applied to circumvent the need to enumerate all variables explicitly. To enhance the solution process, we provide a general approach for applying subset-row inequalities to tighten the formulations and introduce novel dual-optimal inequalities to reduce the dual search space. The proposed algorithms and their enhancements are successfully applied to solve the three aforementioned computer vision problems and achieve superior performance over benchmark approaches. The common framework presented allows us to leverage operations research methodologies to efficiently tackle computer vision problems.
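The set-packing formulation can be made concrete on a toy data-association instance: candidate tracks are subsets of detections with scores, and the goal is a maximum-weight collection of pairwise-disjoint tracks. The brute-force solver below (with hypothetical example data) only works for tiny instances; column generation is what makes the paper's real problems tractable:

```python
from itertools import combinations

def max_weight_set_packing(sets, weights):
    """Exact brute-force solver: pick pairwise-disjoint sets with maximum
    total weight. Exponential in len(sets); shown only to define the objective."""
    best_val, best_sel = 0, ()
    for r in range(1, len(sets) + 1):
        for sel in combinations(range(len(sets)), r):
            chosen = [sets[i] for i in sel]
            disjoint = all(chosen[a].isdisjoint(chosen[b])
                           for a in range(len(chosen))
                           for b in range(a + 1, len(chosen)))
            if disjoint:
                val = sum(weights[i] for i in sel)
                if val > best_val:
                    best_val, best_sel = val, sel
    return best_val, best_sel

# Hypothetical instance: candidate "tracks" as subsets of detections 0..4.
tracks = [{0, 1}, {1, 2}, {2, 3}, {3, 4}, {0, 4}]
scores = [5, 4, 5, 4, 3]
val, sel = max_weight_set_packing(tracks, scores)
```

Column generation avoids enumerating `tracks` up front: it starts from a small set of columns and repeatedly prices out new tracks with negative reduced cost.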
APA, Harvard, Vancouver, ISO, and other styles
40

Shaikh, Imran, and Kadam V.K. "Automatic Computer Propped Diagnosis Framework of Liver Cancer Detection using CNN LSTM." International Journal of Engineering Research in Electronics and Communication Engineering 9, no. 2 (February 28, 2022): 1–8. http://dx.doi.org/10.36647/ijerece/09.02.a001.

Full text
Abstract:
Liver cancer detection using computer vision methods and machine learning has already received significant attention from researchers for reliable diagnosis and timely medical attention. Computer-Aided Diagnosis (CAD), based on image processing, is preferred for cancer detection all over the world. Earlier CAD tools were designed using conventional, semi-automatic machine learning techniques; the modern growth of deep learning for automatic detection and classification has led to significant improvements in accuracy. This paper presents an automatic CAD framework for liver cancer detection using a Convolutional Neural Network (CNN) together with Long Short-Term Memory (LSTM). The input Computed Tomography (CT) scan images are first pre-processed for quality enhancement. After that, we apply a lightweight and accurate Region of Interest (ROI) extraction technique using dynamic binary segmentation. From the ROI images, we extract automated CNN-based appearance features and hand-crafted features; the consolidation of both forms a unique feature set for classification. The LSTM block then classifies each CT image as either normal or diseased. The CNN-LSTM model is designed to improve the accuracy of liver cancer detection compared to other deep learning solutions. The experimental results of the proposed design, using CNN-based features and hybrid hand-crafted features, outperform recent state-of-the-art methods.
APA, Harvard, Vancouver, ISO, and other styles
41

Gabbar, Hossam A., Abderrazak Chahid, Md Jamiul Alam Khan, Oluwabukola Grace Adegboro, and Matthew Immanuel Samson. "CTIMS: Automated Defect Detection Framework Using Computed Tomography." Applied Sciences 12, no. 4 (February 19, 2022): 2175. http://dx.doi.org/10.3390/app12042175.

Full text
Abstract:
Non-Destructive Testing (NDT) is one of the inspection techniques used in industrial tool inspection for quality and safety control. It is performed mainly using X-ray Computed Tomography (CT) to scan the internal structure of the tools and detect the potential defects. In this paper, we propose a new toolbox called the CT-Based Integrity Monitoring System (CTIMS-Toolbox) for automated inspection of CT images and volumes. It contains three main modules: first, the database management module, which handles the database and reads/writes queries to retrieve or save the CT data; second, the pre-processing module for registration and background subtraction; third, the defect inspection module to detect all the potential defects (missing parts, damaged screws, etc.) based on a hybrid system composed of computer vision and deep learning techniques. This paper explores the different features of the CTIMS-Toolbox, exposes the performance of its modules, compares its features to some existing CT inspection toolboxes, and provides some examples of the obtained results.
APA, Harvard, Vancouver, ISO, and other styles
42

Shashank and Indu Sreedevi. "Spatiotemporal Activity Mapping for Enhanced Multi-Object Detection with Reduced Resource Utilization." Electronics 12, no. 1 (December 22, 2022): 37. http://dx.doi.org/10.3390/electronics12010037.

Full text
Abstract:
The accuracy of data captured by sensors highly impacts the performance of a computer vision system. To derive highly accurate data, the computer vision system must be capable of identifying critical objects and activities in the field of sensors and reconfiguring the configuration space of the sensors in real time. The majority of modern reconfiguration systems rely on complex computations and thus consume lots of resources. This may not be a problem for systems with a continuous power supply, but it can be a major set-back for computer vision systems employing sensors with limited resources. Further, to develop an appropriate understanding of the scene, the computer vision system must correlate past and present events of the scene captured in the sensor’s field of view (FOV). To address the abovementioned problems, this article provides a simple yet efficient framework for a sensor’s reconfiguration. The framework performs a spatiotemporal evaluation of the scene to generate adaptive activity maps, based on which the sensors are reconfigured. The activity maps contain normalized values assigned to each pixel in the sensor’s FOV, called normalized pixel sensitivity, which represents the impact of activities or events on each pixel in the sensor’s FOV. The temporal relationship between the past and present events is developed by utilizing standard half-width Gaussian distribution. The framework further proposes a federated optical-flow-based filter to determine critical activities in the FOV. Based on the activity maps, the sensors are re-configured to align the center of the sensors to the most sensitive area (i.e., region of importance) of the field. The proposed framework is tested on multiple surveillance and sports datasets and outperforms the contemporary reconfiguration systems in terms of multi-object tracking accuracy (MOTA).
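The adaptive activity map with Gaussian temporal weighting might be sketched as follows. This is a simplified toy version: the per-pixel "activity" grids and the sigma value are invented for illustration, not taken from the paper:

```python
import math

def activity_map(frames, sigma=2.0):
    """Blend per-frame activity grids into one map, weighting each frame by a
    half-Gaussian over its age (newest frame has age 0), then normalizing the
    result so the most sensitive pixel equals 1.0."""
    h, w = len(frames[0]), len(frames[0][0])
    acc = [[0.0] * w for _ in range(h)]
    for age, frame in enumerate(reversed(frames)):   # age 0 = newest frame
        wgt = math.exp(-(age ** 2) / (2 * sigma ** 2))
        for i in range(h):
            for j in range(w):
                acc[i][j] += wgt * frame[i][j]
    peak = max(max(row) for row in acc) or 1.0       # avoid divide-by-zero
    return [[v / peak for v in row] for row in acc]  # normalized sensitivity

frames = [
    [[0, 1], [0, 0]],   # oldest frame: activity at pixel (0, 1)
    [[0, 0], [1, 0]],   # newest frame: activity at pixel (1, 0)
]
m = activity_map(frames)
```

Recent activity dominates the map, so a sensor reconfigured toward the peak of `m` centers on the most recently active region while older events still contribute.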
APA, Harvard, Vancouver, ISO, and other styles
43

Badawi, Aiman, and Muhammad Bilal. "A Hardware-Software Co-Design for Object Detection Using High-Level Synthesis Tools." International Journal of Electronics, Communications, and Measurement Engineering 8, no. 1 (January 2019): 63–73. http://dx.doi.org/10.4018/ijecme.2019010105.

Full text
Abstract:
Object detection is a vital component of modern video processing systems, and despite the availability of several efficient open-source feature-classifier frameworks and their corresponding implementation schemes, inclusion of this feature as a drop-in module in larger computer vision systems is still considered a daunting task. To this end, this work describes an open-source unified framework which can be used to train, test, and deploy an SVM-based object detector as a hardware-software co-design on FPGA using Simulink high-level synthesis tool. The proposed modular design can be seamlessly integrated within full systems developed using Simulink Computer Vision toolbox for rapid deployment. FPGA synthesis results show that the proposed hardware architecture utilizes fewer logic resources than the contemporary designs for similar operation. Moreover, experimental evidence has been provided to prove the generalization of the framework in efficiently detecting a variety of objects of interest including pedestrians, faces and traffic signs.
APA, Harvard, Vancouver, ISO, and other styles
44

Wang, Zhen, Guoshan Xu, Yong Ding, Bin Wu, and Guoyu Lu. "A vision-based active learning convolutional neural network model for concrete surface crack detection." Advances in Structural Engineering 23, no. 13 (June 8, 2020): 2952–64. http://dx.doi.org/10.1177/1369433220924792.

Full text
Abstract:
Concrete surface crack detection based on computer vision, specifically via a convolutional neural network, has drawn increasing attention for replacing manual visual inspection of bridges and buildings. This article proposes a new framework for this task and a sampling and training method based on active learning to treat class imbalances. In particular, the new framework includes a clear definition of two categories of samples, a relevant sliding window technique, data augmentation and annotation methods. The advantages of this framework are that data integrity can be ensured and a very large amount of annotation work can be saved. Training datasets generated with the proposed sampling and training method not only are representative of the original dataset but also highlight samples that are highly complex, yet informative. Based on the proposed framework and sampling and training strategy, AlexNet is re-tuned, validated, tested and compared with an existing network. The investigation revealed outstanding performances of the proposed framework in terms of the detection accuracy, precision and F1 measure due to its nonlinear learning ability, training dataset integrity and active learning strategy.
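The sliding-window step mentioned above can be sketched as a simple generator of patch coordinates; the window size and stride here are arbitrary illustrative values, not the paper's actual choices:

```python
def sliding_windows(height, width, win, stride):
    """Yield (top, left) corners of win x win patches over an image, stepping
    by `stride`; windows that would overrun the border are skipped."""
    for top in range(0, height - win + 1, stride):
        for left in range(0, width - win + 1, stride):
            yield top, left

# A 100 x 120 image with 50-pixel windows and 25-pixel stride gives a
# 3 x 3 grid of overlapping patches.
coords = list(sliding_windows(100, 120, win=50, stride=25))
```

Each patch would then be annotated (crack / no crack) and fed to the network; an active-learning sampler selects which patches are worth labeling.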
APA, Harvard, Vancouver, ISO, and other styles
45

Mneymneh, Bahaa Eddine, Mohamad Abbas, and Hiam Khoury. "Vision-Based Framework for Intelligent Monitoring of Hardhat Wearing on Construction Sites." Journal of Computing in Civil Engineering 33, no. 2 (March 2019): 04018066. http://dx.doi.org/10.1061/(asce)cp.1943-5487.0000813.

Full text
APA, Harvard, Vancouver, ISO, and other styles
46

Weikl, Korbinian, Damien Schroeder, Daniel Blau, Zhenyi Liu, and Walter Stechele. "End-to-End Imaging System Optimization for Computer Vision in Driving Automation." Electronic Imaging 2021, no. 17 (January 18, 2021): 173–1. http://dx.doi.org/10.2352/issn.2470-1173.2021.17.avm-173.

Full text
Abstract:
Full driving automation imposes to date unmet performance requirements on camera and computer vision systems, in order to replace the visual system of a human driver in any conditions. So far, the individual components of an automotive camera have mostly been optimized independently, or without taking into account the effect on the computer vision applications. We propose an end-to-end optimization of the imaging system in software, from generation of radiometric input data over physically based camera component models to the output of a computer vision system. Specifically, we present an optimization framework which extends the ISETCam and ISET3d toolboxes to create synthetic spectral data of high dynamic range, and which models a state-of-the-art automotive camera in more detail. It includes a state-of-the-art object detection system as a benchmark application. We highlight in which way the framework approximates the physical image formation process. As a result, we provide guidelines for optimization experiments involving modification of the model parameters, and show how these apply to a first experiment on high dynamic range imaging.
APA, Harvard, Vancouver, ISO, and other styles
47

Qian, Yan, Johan Barthelemy, Umair Iqbal, and Pascal Perez. "V2ReID: Vision-Outlooker-Based Vehicle Re-Identification." Sensors 22, no. 22 (November 9, 2022): 8651. http://dx.doi.org/10.3390/s22228651.

Full text
Abstract:
With the increase of large camera networks around us, it is becoming more difficult to manually identify vehicles. Computer vision enables us to automate this task. More specifically, vehicle re-identification (ReID) aims to identify cars in a camera network with non-overlapping views. Images captured of vehicles can undergo intense variations of appearance due to illumination, pose, or viewpoint. Furthermore, due to small inter-class similarities and large intra-class differences, feature learning is often enhanced with non-visual cues, such as the topology of camera networks and temporal information. These are, however, not always available or can be resource intensive for the model. Following the success of Transformer baselines in ReID, we propose for the first time an outlook-attention-based vehicle ReID framework using the Vision Outlooker as its backbone, which is able to encode finer-level features. We show that, without embedding any additional side information and using only the visual cues, we can achieve an 80.31% mAP and 97.13% R-1 on the VeRi-776 dataset. Besides documenting our research, this paper also aims to provide a comprehensive walkthrough of vehicle ReID. We aim to provide a starting point for individuals and organisations, as it is difficult to navigate through the myriad of complex research in this field.
APA, Harvard, Vancouver, ISO, and other styles
48

Liu, Xiaobo, Junya Yan, and Yuxin Zhang. "Research on Multitarget Detection and Intelligent Tracking Technology Based on Computer Vision." Mobile Information Systems 2022 (September 30, 2022): 1–9. http://dx.doi.org/10.1155/2022/4864604.

Full text
Abstract:
With image analysis as the core for multitarget detection and intelligent tracking, mostly applying the Faster R-CNN or YOLO framework, the MOTA score for multitarget tracking is low in the face of complex working environments. Therefore, further research into computer vision techniques is carried out to design new multitarget detection and intelligent tracking methods. Based on the small-aperture imaging model, the principle of lens distortion was analyzed, and a camera calibration and image calibration scheme was designed to obtain effective environmental images. The attention mechanism is introduced to optimise the structure of deep learning networks, and a computer vision detection algorithm based on this is applied to complete regional multitarget detection. The distance between each target and the body is then measured in combination with binocular vision principles. Finally, the spatiotemporal context algorithm is applied to perform simulation calculations to obtain the multitarget intelligent tracking results. The experimental results show that the mean MOTA score of the proposed technique is 0.87 in the night environment, which is 24.14% and 28.374% better than the neural-network-based and machine-vision-based tracking methods, respectively; in the daytime environment, the mean MOTA score of the technique's multitarget tracking results is 0.94, which is 28.72% and 22.34% higher than the other two methods, respectively.
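The lens-distortion analysis based on the small-aperture (pinhole) model typically uses the standard polynomial radial model; a minimal sketch with illustrative coefficients (not values from the paper):

```python
def apply_radial_distortion(x, y, k1, k2):
    """Map ideal normalized image coordinates (x, y) to distorted ones using
    the two-coefficient radial model: scale both by (1 + k1*r^2 + k2*r^4)."""
    r2 = x * x + y * y
    scale = 1 + k1 * r2 + k2 * r2 * r2
    return x * scale, y * scale

# Illustrative barrel-distortion coefficients: points are pulled inward.
xd, yd = apply_radial_distortion(0.5, 0.0, k1=-0.2, k2=0.05)
```

Calibration estimates k1 and k2 (e.g., from a checkerboard) and then inverts this mapping, usually iteratively, to undistort captured frames before detection.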
APA, Harvard, Vancouver, ISO, and other styles
49

Ficuciello, F., A. Migliozzi, G. Laudante, P. Falco, and B. Siciliano. "Vision-based grasp learning of an anthropomorphic hand-arm system in a synergy-based control framework." Science Robotics 4, no. 26 (January 30, 2019): eaao4900. http://dx.doi.org/10.1126/scirobotics.aao4900.

Full text
Abstract:
In this work, the problem of grasping novel objects with an anthropomorphic hand-arm robotic system is considered. In particular, an algorithm for learning stable grasps of unknown objects has been developed based on an object shape classification and on the extraction of some associated geometric features. Different concepts, coming from fields such as machine learning, computer vision, and robot control, have been integrated together in a modular framework to achieve a flexible solution suitable for different applications. The results presented in this work confirm that the combination of learning from demonstration and reinforcement learning can be an interesting solution for complex tasks, such as grasping with anthropomorphic hands. The imitation learning provides the robot with a good base to start the learning process that improves its abilities through trial and error. The learning process occurs in a reduced dimension subspace learned upstream from human observation during typical grasping tasks. Furthermore, the integration of a synergy-based control module allows reducing the number of trials owing to the synergistic approach.
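The "reduced dimension subspace learned upstream from human observation" refers to postural synergies: principal directions of demonstrated hand configurations, so that learning searches over a few synergy coefficients instead of the full joint space. A minimal PCA-based sketch of that idea, under the assumption of a plain data matrix of joint angles (the function names are illustrative, not from the paper):

```python
import numpy as np

def grasp_synergies(joint_angles, n_synergies=3):
    """Extract postural synergies from demonstrated grasps via PCA.

    joint_angles: (n_grasps, n_joints) hand configurations recorded from
    human demonstrations. Returns the mean posture and the top principal
    directions; any grasp is then approximated in the low-dimensional
    synergy subspace as mean + coeffs @ basis.
    """
    mean = joint_angles.mean(axis=0)
    _, _, vt = np.linalg.svd(joint_angles - mean, full_matrices=False)
    return mean, vt[:n_synergies]       # (n_joints,), (n_syn, n_joints)

def to_joint_space(mean, basis, coeffs):
    """Map low-dimensional synergy coefficients back to joint angles."""
    return mean + coeffs @ basis
```

Reinforcement learning then only has to perturb the handful of synergy coefficients per trial, which is what reduces the number of trials compared with exploring every joint independently.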
APA, Harvard, Vancouver, ISO, and other styles
50

Lu, Shida, Kai Huang, Talha Meraj, and Hafiz Tayyab Rauf. "A novel CAPTCHA solver framework using deep skipping Convolutional Neural Networks." PeerJ Computer Science 8 (April 6, 2022): e879. http://dx.doi.org/10.7717/peerj-cs.879.

Full text
Abstract:
A Completely Automated Public Turing Test to tell Computers and Humans Apart (CAPTCHA) is used in web systems for secure authentication; it can be broken using Optical Character Recognition (OCR)-style methods, and CAPTCHA breakers make web systems highly insecure. At the same time, techniques that break CAPTCHAs show designers where their designs need improvement to prevent computer vision-based malicious attacks. Prior research primarily used deep learning methods to break state-of-the-art CAPTCHA codes; however, the validation schemes and conventional Convolutional Neural Network (CNN) designs still need more reliable validation and feature schemes that cover multiple aspects. Several public datasets of text-based CAPTCHAs are available on Kaggle and other repositories, along with tools for self-generating CAPTCHA datasets. Previous studies are dataset-specific and do not perform well on other CAPTCHAs. Therefore, the proposed study uses two publicly available datasets of 4- and 5-character text-based CAPTCHA images to build a CAPTCHA solver based on a skip-connection CNN model. The proposed research employs 5-fold validation, yielding 10 different CNN models across the two datasets, with promising results compared to other studies.
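The 5-fold scheme mentioned here partitions each dataset into five disjoint folds, trains a model per fold with that fold held out, and so yields 5 models per dataset (10 across the two datasets). A minimal index-level sketch of such a split (the function name is illustrative, not from the paper):

```python
def k_fold_splits(n_samples, k=5):
    """Partition sample indices into k disjoint folds.

    Yields (train, val) index lists; each fold serves exactly once as
    the validation set while the rest train one model, so k models are
    produced per dataset.
    """
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for held_out in range(k):
        val = folds[held_out]
        train = [i for f in range(k) if f != held_out for i in folds[f]]
        yield train, val
```

Validating every fold in turn gives a variance estimate across the k models rather than a single train/test number, which addresses the "more reliable validation" concern the abstract raises.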
APA, Harvard, Vancouver, ISO, and other styles
