Journal articles on the topic 'Multi-Camera network'

To see the other types of publications on this topic, follow the link: Multi-Camera network.

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the top 50 journal articles for your research on the topic 'Multi-Camera network.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Browse journal articles from a wide variety of disciplines and organise your bibliography correctly.

1

Wu, Yi-Chang, Ching-Han Chen, Yao-Te Chiu, and Pi-Wei Chen. "Cooperative People Tracking by Distributed Cameras Network." Electronics 10, no. 15 (July 25, 2021): 1780. http://dx.doi.org/10.3390/electronics10151780.

Full text
Abstract:
In the application of video surveillance, reliable people detection and tracking are always challenging tasks. A conventional single-camera surveillance system may encounter difficulties such as a narrow angle of view and dead space. In this paper, we propose a multi-camera network architecture with an inter-camera hand-off protocol for cooperative people tracking. We use the YOLO model to detect multiple people in the video scene and incorporate the particle swarm optimization algorithm to track each person's movement. When a person leaves the area covered by one camera and enters an area covered by another, these cameras can exchange relevant information for uninterrupted tracking. A motion smoothness (MS) metric is proposed for evaluating the tracking quality of the multi-camera networking system. For experimental evaluation, we used a three-camera system to track two persons in an overlapping scene. Most per-frame tracking offsets were lower than 30 pixels, and only 0.15% of the frames showed abrupt increases in offset. The experimental results reveal that our multi-camera system achieves robust, smooth tracking performance.
APA, Harvard, Vancouver, ISO, and other styles
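The motion smoothness (MS) metric in entry 1 is not defined in the abstract; the sketch below is a minimal, assumed per-frame offset measure over a tracked pixel trajectory, with the 30-pixel threshold taken from the figures quoted above. All function and variable names are illustrative, not the authors' implementation.

```python
import numpy as np

def frame_offsets(track):
    """Per-frame pixel offsets of a tracked target.

    track: (N, 2) array of (x, y) pixel positions, one row per frame.
    Returns an (N-1,) array of Euclidean displacements between frames.
    """
    track = np.asarray(track, dtype=float)
    return np.linalg.norm(np.diff(track, axis=0), axis=1)

def smoothness_report(track, jump_threshold=30.0):
    """Summarize how smooth a trajectory is: share of frames whose offset
    stays below the threshold, and share of abrupt jumps above it."""
    offsets = frame_offsets(track)
    return {"mean_offset_px": float(offsets.mean()),
            "fraction_below_threshold": float(np.mean(offsets < jump_threshold)),
            "fraction_abrupt": float(np.mean(offsets >= jump_threshold))}

# Toy usage: a mostly smooth track with one simulated hand-off discontinuity.
if __name__ == "__main__":
    xs = np.cumsum(np.random.default_rng(0).normal(2.0, 1.0, size=200))
    ys = np.full(200, 120.0)
    track = np.stack([xs, ys], axis=1)
    track[100] += 45.0  # abrupt jump at the camera hand-off
    print(smoothness_report(track))
```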
2

Kennady, R., et al. "A Nonoverlapping Vision Field Multi-Camera Network for Tracking Human Build Targets." International Journal on Recent and Innovation Trends in Computing and Communication 11, no. 3 (March 31, 2023): 366–69. http://dx.doi.org/10.17762/ijritcc.v11i3.9871.

Full text
Abstract:
This research presents a procedure for tracking human build targets in a multi-camera network with nonoverlapping vision fields. The proposed approach consists of three main steps: single-camera target detection, single-camera target tracking, and multi-camera target association and continuous tracking. The multi-camera target association includes target characteristic extraction and the establishment of topological relations. Target characteristics are extracted based on the HSV (Hue, Saturation, and Value) values of each human build movement target, and the space-time topological relations of the multi-camera network are established using the obtained target associations. This procedure enables the continuous tracking of human build movement targets in large scenes, overcoming the limitations of monitoring within the narrow field of view of a single camera.
APA, Harvard, Vancouver, ISO, and other styles
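Entry 2 extracts target characteristics from HSV values for cross-camera association; below is a minimal sketch of one way this could be done with OpenCV hue/saturation histograms. The function names, bin counts, and the 0.6 correlation threshold are assumptions for illustration, not details taken from the paper.

```python
import cv2
import numpy as np

def hsv_signature(bgr_crop, bins=(16, 16)):
    """Hue/saturation histogram of a cropped target image, L1-normalized."""
    hsv = cv2.cvtColor(bgr_crop, cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([hsv], [0, 1], None, list(bins), [0, 180, 0, 256])
    return cv2.normalize(hist, hist, alpha=1.0, norm_type=cv2.NORM_L1)

def associate(sig_a, sig_b, threshold=0.6):
    """Decide whether two detections from different cameras are the same
    target, using histogram correlation as the similarity score."""
    score = cv2.compareHist(sig_a, sig_b, cv2.HISTCMP_CORREL)
    return score >= threshold, score

# Usage: crop the detected person from each camera frame, then compare.
# same, score = associate(hsv_signature(crop_cam1), hsv_signature(crop_cam2))
```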
3

Zhao, Guoliang, Yuxun Zhou, Zhanbo Xu, Yadong Zhou, and Jiang Wu. "Hierarchical Multi-Supervision Multi-Interaction Graph Attention Network for Multi-Camera Pedestrian Trajectory Prediction." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 4 (June 28, 2022): 4698–706. http://dx.doi.org/10.1609/aaai.v36i4.20395.

Full text
Abstract:
Pedestrian trajectory prediction has become an essential underpinning in various human-centric applications including but not limited to autonomous vehicles, intelligent surveillance systems and social robotics. Previous research endeavors mainly focus on single-camera trajectory prediction (SCTP), while the problem of multi-camera trajectory prediction (MCTP) is often overly simplified into predicting presence in the next camera. This paper addresses MCTP from a more realistic yet challenging perspective, by redefining the task as a joint estimation of both future destination and possible trajectory. As such, two major efforts are devoted to facilitating related research and advancing modeling techniques. Firstly, we establish a comprehensive multi-camera Scenes Pedestrian Trajectory Dataset (mcScenes), which is collected from a real-world multi-camera space combined with thorough human interaction annotations and carefully designed evaluation metrics. Secondly, we propose a novel joint prediction framework, namely HM3GAT, for the MCTP task by building a tailored network architecture. The core idea behind HM3GAT is a fusion of topological and trajectory information that are mutually beneficial to the prediction of each task, achieved by deeply customized networks. The proposed framework is comprehensively evaluated on the mcScenes dataset with multiple ablation experiments. State-of-the-art SCTP models are adopted as baselines to further validate the advantages of our method in terms of both information fusion and technical improvement. The mcScenes dataset, the HM3GAT, and alternative models are made publicly available for interested readers.
APA, Harvard, Vancouver, ISO, and other styles
4

Sharma, Anil, Saket Anand, and Sanjit K. Kaul. "Reinforcement Learning Based Querying in Camera Networks for Efficient Target Tracking." Proceedings of the International Conference on Automated Planning and Scheduling 29 (May 25, 2021): 555–63. http://dx.doi.org/10.1609/icaps.v29i1.3522.

Full text
Abstract:
Surveillance camera networks are a useful monitoring infrastructure that can be used for various visual analytics applications, where high-level inferences and predictions could be made based on target tracking across the network. Most multi-camera tracking works focus on re-identification problems and trajectory association problems. However, as camera networks grow in size, the volume of data generated is humongous, and scalable processing of this data is imperative for deploying practical solutions. In this paper, we address the largely overlooked problem of scheduling cameras for processing by selecting one where the target is most likely to appear next. The inter-camera handover can then be performed on the selected cameras via re-identification or another target association technique. We model this scheduling problem using reinforcement learning and learn the camera selection policy using Q-learning. We do not assume knowledge of the camera network topology, but we observe that the resulting policy implicitly learns it. We evaluate our approach using the NLPR MCT dataset, which is a real multi-camera multi-target tracking benchmark, and show that the proposed policy substantially reduces the number of frames required to be processed at the cost of a small reduction in recall.
APA, Harvard, Vancouver, ISO, and other styles
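Entry 4 learns a camera selection policy with Q-learning; the following is a toy tabular sketch under the simplifying assumption that the state is the camera where the target was last seen and the action is the camera polled next. The reward shaping, hyperparameters, and hand-off log are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_cameras = 4
Q = np.zeros((n_cameras, n_cameras))      # Q[last_seen_cam, cam_to_poll]
alpha, gamma, epsilon = 0.1, 0.9, 0.1

def reward(polled, actual_next):
    # +1 if the polled camera contains the target, small penalty otherwise.
    return 1.0 if polled == actual_next else -0.1

def choose(state):
    if rng.random() < epsilon:             # epsilon-greedy exploration
        return int(rng.integers(n_cameras))
    return int(np.argmax(Q[state]))

# Hypothetical hand-off log: (camera where target left, camera where it reappeared).
transitions = [(0, 1), (1, 2), (2, 3), (3, 0)] * 250

for state, actual_next in transitions:
    action = choose(state)
    r = reward(action, actual_next)
    # Standard Q-learning update toward the best value of the next state.
    Q[state, action] += alpha * (r + gamma * Q[actual_next].max() - Q[state, action])

print(np.argmax(Q, axis=1))  # learned next-camera-to-poll policy per state
```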
5

Li, Xiaolin, Wenhui Dong, Faliang Chang, and Peishu Qu. "Topology Learning of Non-overlapping Multi-camera Network." International Journal of Signal Processing, Image Processing and Pattern Recognition 8, no. 11 (November 30, 2015): 243–54. http://dx.doi.org/10.14257/ijsip.2015.8.11.22.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Liu, Xin, Herman G. J. Groot, Egor Bondarev, and Peter H. N. de With. "Introducing Scene Understanding to Person Re-Identification using a Spatio-Temporal Multi-Camera Model." Electronic Imaging 2020, no. 10 (January 26, 2020): 95–1. http://dx.doi.org/10.2352/issn.2470-1173.2020.10.ipas-095.

Full text
Abstract:
In this paper, we investigate person re-identification (re-ID) in a multi-camera network for surveillance applications. To this end, we create a Spatio-Temporal Multi-Camera model (ST-MC model), which exploits statistical data on a person’s entry/exit points in the multi-camera network, to predict in which camera view a person will re-appear. The created ST-MC model is used as a novel extension to the Multiple Granularity Network (MGN) [1], which is the current state of the art in person re-ID. Compared to existing approaches that are solely based on Convolutional Neural Networks (CNNs), our approach helps to improve the re-ID performance by considering not only appearance-based features of a person from a CNN, but also contextual information. The latter serves as scene understanding information complementary to person re-ID. Experimental results show that for the DukeMTMC-reID dataset [2][3], introduction of our ST-MC model substantially increases the mean Average Precision (mAP) and Rank-1 score from 77.2% to 84.1%, and from 88.6% to 96.2%, respectively.
APA, Harvard, Vancouver, ISO, and other styles
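The ST-MC model in entry 6 exploits statistics on entry/exit points to predict the camera in which a person will reappear; a minimal sketch of such an empirical transition prior, with hypothetical camera names and logged hand-offs, might look like this.

```python
from collections import Counter, defaultdict

class SpatioTemporalPrior:
    """Empirical prior P(next camera | exit camera) built from logged hand-offs."""

    def __init__(self):
        self.counts = defaultdict(Counter)

    def observe(self, exit_cam, entry_cam):
        """Record one observed hand-off from exit_cam to entry_cam."""
        self.counts[exit_cam][entry_cam] += 1

    def predict(self, exit_cam, top_k=2):
        """Return the top_k most likely next cameras with their probabilities."""
        total = sum(self.counts[exit_cam].values())
        if total == 0:
            return []
        return [(cam, n / total)
                for cam, n in self.counts[exit_cam].most_common(top_k)]

prior = SpatioTemporalPrior()
for exit_cam, entry_cam in [("cam1", "cam2"), ("cam1", "cam2"), ("cam1", "cam3")]:
    prior.observe(exit_cam, entry_cam)
print(prior.predict("cam1"))   # e.g. [('cam2', 0.667), ('cam3', 0.333)]
```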
7

He, Li, Guoliang Liu, Guohui Tian, Jianhua Zhang, and Ze Ji. "Efficient Multi-View Multi-Target Tracking Using a Distributed Camera Network." IEEE Sensors Journal 20, no. 4 (February 15, 2020): 2056–63. http://dx.doi.org/10.1109/jsen.2019.2949385.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

Li, Yun-Lun, Hao-Ting Li, and Chen-Kuo Chiang. "Multi-Camera Vehicle Tracking Based on Deep Tracklet Similarity Network." Electronics 11, no. 7 (March 24, 2022): 1008. http://dx.doi.org/10.3390/electronics11071008.

Full text
Abstract:
Multi-camera vehicle tracking at the city scale has received a lot of attention in the last few years. It involves large scale differences, frequent occlusion, and appearance differences caused by viewing-angle variation, which makes it quite challenging. In this research, we propose the Tracklet Similarity Network (TSN) for a multi-target multi-camera (MTMC) vehicle tracking system based on the evaluation of the similarity between vehicle tracklets. In addition, a novel component, Candidates Intersection Ratio (CIR), is proposed to refine the similarity. It provides an association scheme to build the multi-camera tracking results as a tree structure. Based on these components, an end-to-end vehicle tracking system is proposed. The experimental results demonstrate an 11% improvement in the evaluation score compared to the conventional similarity baseline.
APA, Harvard, Vancouver, ISO, and other styles
9

Truong, Philips, Deligiannis, Abrahamyan, and Guan. "Automatic Multi-Camera Extrinsic Parameter Calibration Based on Pedestrian Torsors †." Sensors 19, no. 22 (November 15, 2019): 4989. http://dx.doi.org/10.3390/s19224989.

Full text
Abstract:
Extrinsic camera calibration is essential for any computer vision task in a camera network. Typically, researchers place a calibration object in the scene to calibrate all the cameras in a camera network. However, when installing cameras in the field, this approach can be costly and impractical, especially when recalibration is needed. This paper proposes a novel, accurate and fully automatic extrinsic calibration framework for camera networks with partially overlapping views. The proposed method considers the pedestrians in the observed scene as the calibration objects and analyzes the pedestrian tracks to obtain extrinsic parameters. Compared to the state of the art, the new method is fully automatic and robust in various environments. Our method detects human poses in the camera images and then models walking persons as vertical sticks. We apply a brute-force method to determine the correspondence between persons in multiple camera images. This information, along with the estimated 3D locations of the top and the bottom of the pedestrians, is then used to compute the extrinsic calibration matrices. We also propose a novel method to calibrate the camera network by using only the top and centerline of the person when the bottom of the person is not visible in heavily occluded scenes. We verified the robustness of the method in different camera setups and for both single and multiple walking people. The results show that a triangulation error of a few centimeters can be obtained. Typically, it requires less than one minute of observing the walking people to reach this accuracy in controlled environments, and only a few minutes to collect enough data for the calibration in uncontrolled environments. Our proposed method performs well in various situations such as multiple persons, occlusions, or even real intersections on the street.
APA, Harvard, Vancouver, ISO, and other styles
10

Sumathy, R. "Face Recognition in Multi Camera Network with Sh Feature." International Journal of Modern Education and Computer Science 7, no. 5 (May 8, 2015): 59–64. http://dx.doi.org/10.5815/ijmecs.2015.05.08.

Full text
APA, Harvard, Vancouver, ISO, and other styles
11

Guan, Banglei, Xiangyi Sun, Yang Shang, Xiaohu Zhang, and Manuel Hofer. "Multi-camera networks for motion parameter estimation of an aircraft." International Journal of Advanced Robotic Systems 14, no. 1 (January 1, 2017): 172988141769231. http://dx.doi.org/10.1177/1729881417692312.

Full text
Abstract:
A multi-camera network is proposed to estimate an aircraft’s motion parameters relative to the reference platform in large outdoor fields. Multiple cameras are arranged to cover the aircraft’s large-scale motion space by field stitching. A camera calibration method using dynamic control points created by a multirotor unmanned aerial vehicle is presented for conditions in which the cameras’ fields of view are void of fixed control points. The relative deformation of the camera network caused by external environmental factors is measured and compensated using a combination of cameras and laser rangefinders. A series of field experiments has been carried out using a fixed-wing aircraft without artificial markers, and the accuracy is evaluated using an onboard Differential Global Positioning System. The experimental results show that the multi-camera network is precise, robust, and highly dynamic and can improve the aircraft’s landing accuracy.
APA, Harvard, Vancouver, ISO, and other styles
12

Elwarfalli, Hamed, Dylan Flaute, and Russell C. Hardie. "Exponential Fusion of Interpolated Frames Network (EFIF-Net): Advancing Multi-Frame Image Super-Resolution with Convolutional Neural Networks." Sensors 24, no. 1 (January 4, 2024): 296. http://dx.doi.org/10.3390/s24010296.

Full text
Abstract:
Convolutional neural networks (CNNs) have become instrumental in advancing multi-frame image super-resolution (SR), a technique that merges multiple low-resolution images of the same scene into a high-resolution image. In this paper, a novel deep learning multi-frame SR algorithm is introduced. The proposed CNN model, named Exponential Fusion of Interpolated Frames Network (EFIF-Net), seamlessly integrates fusion and restoration within an end-to-end network. Key features of the new EFIF-Net include a custom exponentially weighted fusion (EWF) layer for image fusion and a modification of the Residual Channel Attention Network for restoration to deblur the fused image. Input frames are registered with subpixel accuracy using an affine motion model to capture the camera platform motion. The frames are externally upsampled using single-image interpolation. The interpolated frames are then fused with the custom EWF layer, employing subpixel registration information to give more weight to pixels with less interpolation error. Realistic image acquisition conditions are simulated to generate training and testing datasets with corresponding ground truths. The observation model captures optical degradation from diffraction and detector integration from the sensor. The experimental results demonstrate the efficacy of EFIF-Net using both simulated and real camera data. The real camera results use authentic, unaltered camera data without artificial downsampling or degradation.
APA, Harvard, Vancouver, ISO, and other styles
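Entry 12's exponentially weighted fusion (EWF) layer gives more weight to pixels with less interpolation error; the exact weighting is not given in the abstract, so the sketch below assumes a simple exp(-alpha * error) weight normalized across frames. It is a NumPy illustration of the idea, not the trained layer itself.

```python
import numpy as np

def exponential_fusion(frames, interp_errors, alpha=4.0):
    """Fuse registered, interpolated frames with exponential weights.

    frames:        (K, H, W) stack of upsampled, registered frames.
    interp_errors: (K, H, W) per-pixel interpolation-error estimates.
    Pixels with lower error receive exponentially larger weight.
    """
    frames = np.asarray(frames, dtype=float)
    errors = np.asarray(interp_errors, dtype=float)
    weights = np.exp(-alpha * errors)
    weights /= weights.sum(axis=0, keepdims=True)   # normalize over the K frames
    return (weights * frames).sum(axis=0)

# Toy usage with 4 frames of a 64x64 scene and random error maps.
rng = np.random.default_rng(1)
frames = rng.random((4, 64, 64))
errors = rng.random((4, 64, 64))
fused = exponential_fusion(frames, errors)
print(fused.shape)  # (64, 64)
```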
13

Huang, Sunan, Rodney Swee Huat Teo, and William Wai Lun Leong. "Multi-Camera Networks for Coverage Control of Drones." Drones 6, no. 3 (March 3, 2022): 67. http://dx.doi.org/10.3390/drones6030067.

Full text
Abstract:
Multiple unmanned multirotor (MUM) systems are becoming a reality. They have a wide range of applications such as for surveillance, search and rescue, monitoring operations in hazardous environments and providing communication coverage services. Currently, an important issue in MUM is coverage control. In this paper, an existing coverage control algorithm has been extended to incorporate a new sensor model, which is downward facing and allows pan-tilt-zoom (PTZ). Two new constraints, namely view angle and collision avoidance, have also been included. Mobile network coverage among the MUMs is studied. Finally, the proposed scheme is tested in computer simulations.
APA, Harvard, Vancouver, ISO, and other styles
14

Huang, Zhengyue, Zhehui Zhao, Hengguang Zhou, Xibin Zhao, and Yue Gao. "DeepCCFV: Camera Constraint-Free Multi-View Convolutional Neural Network for 3D Object Retrieval." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 8505–12. http://dx.doi.org/10.1609/aaai.v33i01.33018505.

Full text
Abstract:
3D object retrieval has a compelling demand in the field of computer vision with the rapid development of 3D vision technology and increasing applications of 3D objects. 3D objects can be described in different ways such as voxel, point cloud, and multi-view. Among them, multi-view based approaches proposed in recent years show promising results. Most of them require a fixed predefined camera position setting which provides a complete and uniform sampling of views for objects in the training stage. However, this causes heavy over-fitting problems which cause the models to fail to generalize well in free camera setting applications, particularly when insufficient views are provided. Experiments show the performance drops drastically when the number of views is reduced, hindering these methods from practical applications. In this paper, we investigate the over-fitting issue and remove the constraint of the camera setting. First, two basic feature augmentation strategies, Dropout and Dropview, are introduced to solve the over-fitting issue, and a more precise and more efficient method named DropMax is proposed after analyzing the drawbacks of the basic ones. Then, by reducing the over-fitting issue, a camera constraint-free multi-view convolutional neural network named DeepCCFV is constructed. Extensive experiments on both single-modal and cross-modal cases demonstrate the effectiveness of the proposed method in free camera settings compared with existing state-of-the-art 3D object retrieval methods.
APA, Harvard, Vancouver, ISO, and other styles
15

PENALOZA, Christian, Yasushi MAE, Tatsuo ARAI, Kenichi OHARA, and Tomohito TAKUBO. "2P1-Q04 Multi-Appearance Object Modeling using Camera Network in Household Environment(Intelligent and Robotic Room)." Proceedings of JSME annual Conference on Robotics and Mechatronics (Robomec) 2011 (2011): _2P1—Q04_1—_2P1—Q04_4. http://dx.doi.org/10.1299/jsmermd.2011._2p1-q04_1.

Full text
APA, Harvard, Vancouver, ISO, and other styles
16

Cho, Myungjin, Howon Lee, Hyun-Ho Choi, and Bahram Javidi. "A Three-Dimensional Image Transmission Using In-Network Computation in Wireless Multi-Camera Networks." IEEE Journal of the Electron Devices Society 5, no. 6 (November 2017): 445–52. http://dx.doi.org/10.1109/jeds.2017.2721368.

Full text
APA, Harvard, Vancouver, ISO, and other styles
17

Guo, Yuhao, and Hui Hu. "Multi-Layer Fusion 3D Object Detection via Lidar Point Cloud and Camera Image." Applied Sciences 14, no. 4 (February 6, 2024): 1348. http://dx.doi.org/10.3390/app14041348.

Full text
Abstract:
Object detection is a key task in automatic driving, and the poor performance of small object detection is a challenge that needs to be overcome. Previously, object detection networks could detect large-scale objects in ideal environments, but detecting small objects was very difficult. To address this problem, we propose a multi-layer fusion 3D object detection network. First, a dense fusion (D-fusion) method is proposed, which is different from the traditional fusion method. By fusing the feature maps of each layer, more semantic information of the fusion network can be preserved. Secondly, in order to preserve small objects at the feature map level, we designed a feature extractor with an adaptive fusion module (AFM), which reduces the impact of the background on small objects by weighting and fusing different feature layers. Finally, an attention mechanism was added to the feature extractor to accelerate the training efficiency and convergence speed of the network by suppressing information that is irrelevant to the task. The experimental results show that our proposed approach greatly improves the baseline and outperforms most state-of-the-art methods on KITTI object detection benchmarks.
APA, Harvard, Vancouver, ISO, and other styles
18

Voulodimos, Athanasios S., Nikolaos D. Doulamis, Dimitrios I. Kosmopoulos, and Theodora A. Varvarigou. "IMPROVING MULTI-CAMERA ACTIVITY RECOGNITION BY EMPLOYING NEURAL NETWORK BASED READJUSTMENT." Applied Artificial Intelligence 26, no. 1-2 (January 2012): 97–118. http://dx.doi.org/10.1080/08839514.2012.629540.

Full text
APA, Harvard, Vancouver, ISO, and other styles
19

Bi Song and A. K. Roy-Chowdhury. "Robust Tracking in A Camera Network: A Multi-Objective Optimization Framework." IEEE Journal of Selected Topics in Signal Processing 2, no. 4 (August 2008): 582–96. http://dx.doi.org/10.1109/jstsp.2008.925992.

Full text
APA, Harvard, Vancouver, ISO, and other styles
20

Kim, Seong Hyun, and Ju Yong Chang. "Single-Shot 3D Multi-Person Shape Reconstruction from a Single RGB Image." Entropy 22, no. 8 (July 23, 2020): 806. http://dx.doi.org/10.3390/e22080806.

Full text
Abstract:
Although the performance of the 3D human shape reconstruction method has improved considerably in recent years, most methods focus on a single person, reconstruct a root-relative 3D shape, and rely on ground-truth information about the absolute depth to convert the reconstruction result to the camera coordinate system. In this paper, we propose an end-to-end learning-based model for single-shot, 3D, multi-person shape reconstruction in the camera coordinate system from a single RGB image. Our network produces output tensors divided into grid cells to reconstruct the 3D shapes of multiple persons in a single-shot manner, where each grid cell contains information about the subject. Moreover, our network predicts the absolute position of the root joint while reconstructing the root-relative 3D shape, which enables reconstructing the 3D shapes of multiple persons in the camera coordinate system. The proposed network can be learned in an end-to-end manner and process images at about 37 fps to perform the 3D multi-person shape reconstruction task in real time.
APA, Harvard, Vancouver, ISO, and other styles
21

Osiński, Piotr, Jakub Markiewicz, Jarosław Nowisz, Michał Remiszewski, Albert Rasiński, and Robert Sitnik. "A Novel Approach for Dynamic (4d) Multi-View Stereo System Camera Network Design." Sensors 22, no. 4 (February 17, 2022): 1576. http://dx.doi.org/10.3390/s22041576.

Full text
Abstract:
Image network design is a critical factor in image-based 3D shape reconstruction and data processing (especially in the application of combined SfM/MVS methods). This paper aims to present a new approach to designing and planning multi-view imaging networks for dynamic 3D scene reconstruction without preliminary information about object geometry or location. The only constraints are the size of the defined measurement volume, the required resolution, and the accuracy of geometric reconstruction. The proposed automatic camera network design method is based on the Monte Carlo algorithm and a set of prediction functions (considering accuracy, density, and completeness of shape reconstruction). This is used to determine the camera positions and orientations and makes it possible to achieve the required completeness of shape, accuracy, and resolution of the final 3D reconstruction. To assess the accuracy and efficiency of the proposed method, tests were carried out on synthetic and real data. For a set of 20 virtual images of rendered spheres, completeness of shape reconstruction was up to 92.3% while maintaining accuracy and resolution at the user-specified level. In the case of the real data, the differences between predictions and evaluations for average density were in the range of 33.8% to 45.0%.
APA, Harvard, Vancouver, ISO, and other styles
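Entry 21 designs camera networks with a Monte Carlo algorithm scored by prediction functions; the sketch below samples candidate camera positions and keeps the configuration with the best score under a crude two-camera-visibility criterion, which stands in for the paper's accuracy/density/completeness predictors. The geometry, field-of-view threshold, and trial counts are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)

def sample_camera(radius=3.0):
    """Random camera position on a hemisphere around the measurement volume."""
    theta = rng.uniform(0, 2 * np.pi)
    phi = rng.uniform(0.1, np.pi / 2)
    return radius * np.array([np.cos(theta) * np.sin(phi),
                              np.sin(theta) * np.sin(phi),
                              np.cos(phi)])

def coverage_score(cameras, points, fov_cos=0.99):
    """Fraction of volume points seen by at least two cameras (a stand-in for
    the paper's accuracy/density/completeness prediction functions)."""
    seen = np.zeros(len(points), dtype=int)
    for cam in cameras:
        view_dirs = points - cam
        view_dirs /= np.linalg.norm(view_dirs, axis=1, keepdims=True)
        optical_axis = -cam / np.linalg.norm(cam)          # camera looks at the origin
        seen += (view_dirs @ optical_axis > fov_cos).astype(int)
    return float(np.mean(seen >= 2))

volume_points = rng.uniform(-0.5, 0.5, size=(500, 3))       # defined measurement volume
best_score, best_net = -1.0, None
for _ in range(200):                                          # Monte Carlo trials
    net = [sample_camera() for _ in range(6)]
    score = coverage_score(net, volume_points)
    if score > best_score:
        best_score, best_net = score, net
print(round(best_score, 3))
```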
22

Hong, Yong, Deren Li, Shupei Luo, Xin Chen, Yi Yang, and Mi Wang. "An Improved End-to-End Multi-Target Tracking Method Based on Transformer Self-Attention." Remote Sensing 14, no. 24 (December 15, 2022): 6354. http://dx.doi.org/10.3390/rs14246354.

Full text
Abstract:
Current multi-target multi-camera tracking algorithms demand increased requirements for re-identification accuracy and tracking reliability. This study proposed an improved end-to-end multi-target tracking algorithm that adapts to multi-view multi-scale scenes based on the self-attentive mechanism of the transformer’s encoder–decoder structure. A multi-dimensional feature extraction backbone network was combined with a self-built raster semantic map which was stored in the encoder for correlation and generated target position encoding and multi-dimensional feature vectors. The decoder incorporated four methods: spatial clustering and semantic filtering of multi-view targets; dynamic matching of multi-dimensional features; space–time logic-based multi-target tracking, and space–time convergence network (STCN)-based parameter passing. Through the fusion of multiple decoding methods, multi-camera targets were tracked in three dimensions: temporal logic, spatial logic, and feature matching. For the MOT17 dataset, this study’s method significantly outperformed the current state-of-the-art method by 2.2% on the multiple object tracking accuracy (MOTA) metric. Furthermore, this study proposed a retrospective mechanism for the first time and adopted a reverse-order processing method to optimize the historical mislabeled targets for improving the identification F1-score (IDF1). For the self-built dataset OVIT-MOT01, the IDF1 improved from 0.948 to 0.967, and the multi-camera tracking accuracy (MCTA) improved from 0.878 to 0.909, which significantly improved the continuous tracking accuracy and reliability.
APA, Harvard, Vancouver, ISO, and other styles
23

Vandendriessche, Jurgen, Bruno da Silva, Lancelot Lhoest, An Braeken, and Abdellah Touhafi. "M3-AC: A Multi-Mode Multithread SoC FPGA Based Acoustic Camera." Electronics 10, no. 3 (January 29, 2021): 317. http://dx.doi.org/10.3390/electronics10030317.

Full text
Abstract:
Acoustic cameras allow the visualization of sound sources using microphone arrays and beamforming techniques. The required computational power increases with the number of microphones in the array and the acoustic image resolution, in particular when targeting real time. Such a constraint limits the use of acoustic cameras in many wireless sensor network applications (surveillance, industrial monitoring, etc.). In this paper, we propose a multi-mode System-on-Chip (SoC) Field-Programmable Gate Array (FPGA) architecture capable of satisfying the high computational demand while providing wireless communication for remote control and monitoring. This architecture produces real-time acoustic images of 240 × 180 resolution, scalable to 640 × 480, by exploiting the multithreading capabilities of the hard-core processor. Furthermore, timing costs for different operational modes and resolutions are investigated to maintain a real-time system under wireless sensor network constraints.
APA, Harvard, Vancouver, ISO, and other styles
24

Rosas-Cervantes, Vinicio, Quoc-Dong Hoang, Sooho Woo, and Soon-Geul Lee. "Mobile robot 3D trajectory estimation on a multilevel surface with multimodal fusion of 2D camera features and a 3D light detection and ranging point cloud." International Journal of Advanced Robotic Systems 19, no. 2 (March 1, 2022): 172988062210891. http://dx.doi.org/10.1177/17298806221089198.

Full text
Abstract:
Nowadays, multi-sensor fusion is a popular tool for feature recognition and object detection. Integrating various sensors allows us to obtain reliable information about the environment. This article proposes a 3D robot trajectory estimation based on a multimodal fusion of 2D features extracted from color images and 3D features from 3D point clouds. First, a set of images was collected using a monocular camera, and we trained a Faster Region Convolutional Neural Network. Using the Faster Region Convolutional Neural Network, the robot detects 2D features from camera input and 3D features using the point’s normal distribution on the 3D point cloud. Then, by matching 2D image features to a 3D point cloud, the robot estimates its position. To validate our results, we compared the trained neural network with similar convolutional neural networks. Then, we evaluated their response for the mobile robot trajectory estimation.
APA, Harvard, Vancouver, ISO, and other styles
25

Zhang, Yi, and J. Chen. "A Intelligent Wheelchair Obstacle Avoidance System Based on Multi-Sensor Fusion Technology." Key Engineering Materials 455 (December 2010): 121–26. http://dx.doi.org/10.4028/www.scientific.net/kem.455.121.

Full text
Abstract:
In this paper, an intelligent wheelchair obstacle avoidance system based on multi-sensor data fusion technology is introduced. The paper presents the hardware architecture of the wheelchair and develops a sonar and camera data acquisition system on the VC++ platform, through which sonar and camera sensor information collection and data processing are carried out. A T-S-model-based fuzzy neural network multi-sensor data fusion method is used for intelligent wheelchair obstacle avoidance. Simulations in different environments show that the method can effectively integrate the information from sonar and camera and give appropriate control signals to avoid obstacles.
APA, Harvard, Vancouver, ISO, and other styles
26

Gao, Fei, Meizhen Wang, Xuejun Liu, and Ziran Wang. "A multi-objective scheduling optimization algorithm of a camera network for directional road network coverage." PLOS ONE 13, no. 10 (October 31, 2018): e0206038. http://dx.doi.org/10.1371/journal.pone.0206038.

Full text
APA, Harvard, Vancouver, ISO, and other styles
27

He, Fangzhou. "Exploration of Multi-Node Collaborative Image Acquisition and Compression Techniques for Wireless Multimedia Sensor Networks." International Journal of Online and Biomedical Engineering (iJOE) 15, no. 01 (January 17, 2019): 196. http://dx.doi.org/10.3991/ijoe.v15i01.9787.

Full text
Abstract:
Aiming at saving energy and maximizing the network life cycle, the multi-node cooperative image acquisition and compression technology in Wireless Multimedia Sensor Networks (WMSNs) is studied in depth. The Minimum Energy Image Collection (MEIC) problem for multiple target domains over a certain period of time in the monitoring area is proposed; an integer linear programming formulation of the MEIC problem is described and proved to be NP-complete. Then, combined with the image acquisition features of camera nodes, the Local Camera Coordinative Energy-saving Strategy (LCCES) is proposed, and its performance is evaluated through extensive simulation experiments. Finally, the LBT-based Multi-node Cooperative Image Compression Scheme (LBT-MCIC) is proposed. The results show that this strategy can effectively reduce the number of active camera nodes during image acquisition, thus reducing the energy consumption of image acquisition. At the same time, it helps balance the energy consumption of camera nodes in the network, effectively solves the problem of the high cost imposed on common nodes in two-hop cluster-structure image transmission schemes, and has the characteristics of low computational complexity and high reconstructed image quality.
APA, Harvard, Vancouver, ISO, and other styles
28

Dong, Xuan, Weixin Li, and Xiaojie Wang. "Pyramid convolutional network for colorization in monochrome-color multi-lens camera system." Neurocomputing 450 (August 2021): 129–42. http://dx.doi.org/10.1016/j.neucom.2021.04.014.

Full text
APA, Harvard, Vancouver, ISO, and other styles
29

MING, An-Long, Hua-Dong MA, and Hui-Yuan FU. "Bayes Causal Network Based Method for Role Identification in Multi-Camera Surveillance." Chinese Journal of Computers 33, no. 12 (May 23, 2011): 2378–86. http://dx.doi.org/10.3724/sp.j.1016.2010.02378.

Full text
APA, Harvard, Vancouver, ISO, and other styles
30

Canedo-Rodriguez, A., C. V. Regueiro, R. Iglesias, V. Alvarez-Santos, and X. M. Pardo. "Self-organized multi-camera network for ubiquitous robot deployment in unknown environments." Robotics and Autonomous Systems 61, no. 7 (July 2013): 667–75. http://dx.doi.org/10.1016/j.robot.2012.08.014.

Full text
APA, Harvard, Vancouver, ISO, and other styles
31

Yao, Hongwei, Tong Qiao, Ming Xu, and Ning Zheng. "Robust Multi-Classifier for Camera Model Identification Based on Convolution Neural Network." IEEE Access 6 (2018): 24973–82. http://dx.doi.org/10.1109/access.2018.2832066.

Full text
APA, Harvard, Vancouver, ISO, and other styles
32

Stamatopoulos, C., and C. S. Fraser. "Automated Target-Free Network Orienation and Camera Calibration." ISPRS Annals of Photogrammetry, Remote Sensing and Spatial Information Sciences II-5 (May 28, 2014): 339–46. http://dx.doi.org/10.5194/isprsannals-ii-5-339-2014.

Full text
Abstract:
Automated close-range photogrammetric network orientation and camera calibration has traditionally been associated with the use of coded targets in the object space to allow for an initial relative orientation (RO) and subsequent spatial resection of the images. However, over the last decade, advances coming mainly from the computer vision (CV) community have allowed for fully automated orientation via feature-based matching techniques. There are a number of advantages in such methodologies for various types of applications, as well as for cases where the use of artificial targets might be not possible or preferable, for example when attempting calibration from low-level aerial imagery, as with UAVs, or when calibrating long-focal length lenses where small image scales call for inconveniently large coded targets. While there are now a number of CV-based algorithms for multi-image orientation within narrow-baseline networks, with accompanying open-source software, from a photogrammetric standpoint the results are typically disappointing as the metric integrity of the resulting models is generally poor, or even unknown. The objective addressed in this paper is target-free automatic multi-image orientation, maintaining metric integrity, within networks that incorporate wide-baseline imagery. The focus is on both the development of a methodology that overcomes the shortcomings that can be present in current CV algorithms, and on the photogrammetric priorities and requirements that exist in current processing pipelines. This paper also reports on the application of the proposed methodology to automated target-free camera self-calibration and discusses the process via practical examples.
APA, Harvard, Vancouver, ISO, and other styles
33

Bazi, Yakoub, Haikel Alhichri, Naif Alajlan, and Farid Melgani. "Scene Description for Visually Impaired People with Multi-Label Convolutional SVM Networks." Applied Sciences 9, no. 23 (November 23, 2019): 5062. http://dx.doi.org/10.3390/app9235062.

Full text
Abstract:
In this paper, we present a portable camera-based method for helping visually impaired (VI) people to recognize multiple objects in images. This method relies on a novel multi-label convolutional support vector machine (CSVM) network for coarse description of images. The core idea of CSVM is to use a set of linear SVMs as filter banks for feature map generation. During the training phase, the weights of the SVM filters are obtained using a forward-supervised learning strategy, unlike the backpropagation algorithm used in standard convolutional neural networks (CNNs). To handle multi-label detection, we introduce a multi-branch CSVM architecture, where each branch is used for detecting one object in the image. This architecture exploits the correlation between the objects present in the image by means of an opportune fusion mechanism of the intermediate outputs provided by the convolution layers of each branch. The high-level reasoning of the network is done through binary classification SVMs for predicting the presence/absence of objects in the image. Experiments on two indoor datasets and one outdoor dataset, acquired from a portable camera mounted on a lightweight shield worn by the user and connected via a USB cable to a laptop processing unit, are reported and discussed.
APA, Harvard, Vancouver, ISO, and other styles
34

Hong, Jongkwang, Bora Cho, Yong Hong, and Hyeran Byun. "Contextual Action Cues from Camera Sensor for Multi-Stream Action Recognition." Sensors 19, no. 6 (March 20, 2019): 1382. http://dx.doi.org/10.3390/s19061382.

Full text
Abstract:
In action recognition research, two primary types of information are appearance and motion information that is learned from RGB images through visual sensors. However, depending on the action characteristics, contextual information, such as the existence of specific objects or globally-shared information in the image, becomes vital information to define the action. For example, the existence of the ball is vital information distinguishing “kicking” from “running”. Furthermore, some actions share typical global abstract poses, which can be used as a key to classify actions. Based on these observations, we propose the multi-stream network model, which incorporates spatial, temporal, and contextual cues in the image for action recognition. We experimented on the proposed method using C3D or inflated 3D ConvNet (I3D) as a backbone network, regarding two different action recognition datasets. As a result, we observed overall improvement in accuracy, demonstrating the effectiveness of our proposed method.
APA, Harvard, Vancouver, ISO, and other styles
35

Li, Hongchao, Chenglong Li, Xianpeng Zhu, Aihua Zheng, and Bin Luo. "Multi-Spectral Vehicle Re-Identification: A Challenge." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 07 (April 3, 2020): 11345–53. http://dx.doi.org/10.1609/aaai.v34i07.6796.

Full text
Abstract:
Vehicle re-identification (Re-ID) is a crucial task in smart city and intelligent transportation, aiming to match vehicle images across non-overlapping surveillance camera views. Currently, most works focus on RGB-based vehicle Re-ID, which limits its capability in real-life applications under adverse conditions such as dark environments and bad weather. IR (Infrared) spectrum imaging offers complementary information to relieve the illumination issue in computer vision tasks. Furthermore, vehicle Re-ID faces the major challenge of diverse appearance across different views, especially for vehicles such as trucks. In this work, we address the RGB and IR vehicle Re-ID problem and contribute a multi-spectral vehicle Re-ID benchmark named RGBN300, including RGB and NIR (Near Infrared) vehicle images of 300 identities from 8 camera views, giving in total 50125 RGB images and 50125 NIR images respectively. In addition, we have acquired additional TIR (Thermal Infrared) data for 100 vehicles from RGBN300 to form another dataset for three-spectral vehicle Re-ID. Furthermore, we propose a Heterogeneity-collaboration Aware Multi-stream convolutional Network (HAMNet) towards automatically fusing different spectrum features in an end-to-end learning framework. Comprehensive experiments on prevalent networks show that our HAMNet can effectively integrate multi-spectral data for robust vehicle Re-ID in day and night. Our work provides a benchmark dataset for RGB-NIR and RGB-NIR-TIR multi-spectral vehicle Re-ID and a baseline network for both research and industrial communities. The dataset and baseline codes are available at: https://github.com/ttaalle/multi-modal-vehicle-Re-ID.
APA, Harvard, Vancouver, ISO, and other styles
36

Heo, Jinyeong, and Yongjin (James) Kwon. "3D Vehicle Trajectory Extraction Using DCNN in an Overlapping Multi-Camera Crossroad Scene." Sensors 21, no. 23 (November 26, 2021): 7879. http://dx.doi.org/10.3390/s21237879.

Full text
Abstract:
The 3D vehicle trajectory in complex traffic conditions such as crossroads and heavy traffic is practically very useful in autonomous driving. In order to accurately extract the 3D vehicle trajectory from a perspective camera in a crossroad where the vehicle has an angular range of 360 degrees, problems such as the narrow visual angle in single-camera scene, vehicle occlusion under conditions of low camera perspective, and lack of vehicle physical information must be solved. In this paper, we propose a method for estimating the 3D bounding boxes of vehicles and extracting trajectories using a deep convolutional neural network (DCNN) in an overlapping multi-camera crossroad scene. First, traffic data were collected using overlapping multi-cameras to obtain a wide range of trajectories around the crossroad. Then, 3D bounding boxes of vehicles were estimated and tracked in each single-camera scene through DCNN models (YOLOv4, multi-branch CNN) combined with camera calibration. Using the abovementioned information, the 3D vehicle trajectory could be extracted on the ground plane of the crossroad by calculating results obtained from the overlapping multi-camera with a homography matrix. Finally, in experiments, the errors of extracted trajectories were corrected through a simple linear interpolation and regression, and the accuracy of the proposed method was verified by calculating the difference with ground-truth data. Compared with other previously reported methods, our approach is shown to be more accurate and more practical.
APA, Harvard, Vancouver, ISO, and other styles
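Entry 36 projects per-camera vehicle positions onto the ground plane of the crossroad with a homography; a minimal OpenCV sketch of that step, assuming four known image-to-ground correspondences for one camera, is shown below. The pixel and metre coordinates are hypothetical.

```python
import cv2
import numpy as np

# Hypothetical correspondences for one camera: image pixels of ground markers
# and their known ground-plane coordinates (metres) at the crossroad.
image_pts = np.array([[320, 410], [980, 400], [1100, 700], [200, 720]], dtype=np.float32)
ground_pts = np.array([[0, 0], [12, 0], [12, 8], [0, 8]], dtype=np.float32)

H, _ = cv2.findHomography(image_pts, ground_pts)

def to_ground(pixel_xy):
    """Map an image point (e.g. the bottom-center of a vehicle's 3D box)
    to ground-plane coordinates via the homography H."""
    pt = np.array([[pixel_xy]], dtype=np.float32)          # shape (1, 1, 2)
    return cv2.perspectiveTransform(pt, H)[0, 0]

print(to_ground((660, 550)))   # approximate ground position of a detection
```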
37

Fan, Zhijie, Zhiwei Cao, Xin Li, Chunmei Wang, Bo Jin, and Qianjin Tang. "Video Surveillance Camera Identity Recognition Method Fused With Multi-Dimensional Static and Dynamic Identification Features." International Journal of Information Security and Privacy 17, no. 1 (March 9, 2023): 1–18. http://dx.doi.org/10.4018/ijisp.319304.

Full text
Abstract:
With the development of smart cities, video surveillance networks have become an important infrastructure for urban governance. However, by replacing or tampering with surveillance cameras, an important front-end device, attackers are able to access the internal network. In order to identify illegal or suspicious camera identities in advance, a camera identity identification method that incorporates multidimensional identification features is proposed. By extracting the static information of cameras and dynamic traffic information, a camera identity system that incorporates explicit, implicit, and dynamic identifiers is constructed. The experimental results show that the explicit identifiers have the highest contribution, but they are easy to forge; the dynamic identifiers rank second, but the traffic preprocessing is complex; the static identifiers rank last but are indispensable. Experiments on 40 cameras verified the effectiveness and feasibility of the proposed identifier system for camera identification, and the accuracy of identification reached 92.5%.
APA, Harvard, Vancouver, ISO, and other styles
38

Zhou, Jing, Shi Jun Li, Dong Yan Huang, Xiao Lin Chen, and Ye Chi Zhang. "Data Normalization of Single Camera Visual Measurement Network System." Applied Mechanics and Materials 263-266 (December 2012): 2381–84. http://dx.doi.org/10.4028/www.scientific.net/amm.263-266.2381.

Full text
Abstract:
With the development of visual measurement systems, single-camera visual measurement, which applies the imaging theory of optical feature points, has become widely used in modern production. Due to on-site environmental limits, a single-camera visual measurement system cannot measure objects that occlude each other. To address this problem, a measurement network based on single-camera visual measurement is presented, and the measurement network system is set up via multiple control points. The optical feature points at every network control point are measured by the single-camera visual measurement system; the measured coordinates are then normalized into the same world coordinate system to obtain global data and achieve overall measurement. A simulation experiment with a three-coordinate measuring machine (CMM) shows that the maximum absolute tolerance between the coordinates normalized via the center of the point set and those measured directly by the CMM is 0.058 mm, while the tolerance for coordinates repeatedly measured at one network control point is 0.066 mm, so the data normalization offers higher precision.
APA, Harvard, Vancouver, ISO, and other styles
39

Jiang, Mingjun, Zihan Zhang, Kohei Shimasaki, Shaopeng Hu, and Idaku Ishii. "Multi-Thread AI Cameras Using High-Speed Active Vision System." Journal of Robotics and Mechatronics 34, no. 5 (October 20, 2022): 1053–62. http://dx.doi.org/10.20965/jrm.2022.p1053.

Full text
Abstract:
In this study, we propose a multi-thread artificial intelligence (AI) camera system that can simultaneously recognize remote objects in multiple desired areas of interest (AOIs) distributed across a wide field of view (FOV), using a single image sensor. The proposed multi-thread AI camera consists of an ultrafast active vision system and a convolutional neural network (CNN)-based ultrafast object recognition system. The ultrafast active vision system can function as multiple virtual cameras with high spatial resolution by synchronizing the exposure of a high-speed camera with the movement of an ultrafast two-axis mirror device at hundreds of hertz, and the CNN-based ultrafast object recognition system simultaneously recognizes the acquired high-frame-rate images in real time. The desired AOIs for monitoring can be determined automatically after rapidly scanning pre-placed visual anchors in the wide FOV at hundreds of fps with object recognition. The effectiveness of the proposed multi-thread AI camera system was demonstrated by conducting several wide-area monitoring experiments on quick response (QR) codes and persons in spacious scenes such as a meeting room, which would otherwise be too wide for a single still camera with a wide-angle lens to acquire clear images of simultaneously.
APA, Harvard, Vancouver, ISO, and other styles
40

Perfetti, L., C. Polari, and F. Fassi. "FISHEYE MULTI-CAMERA SYSTEM CALIBRATION FOR SURVEYING NARROW AND COMPLEX ARCHITECTURES." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLII-2 (May 30, 2018): 877–83. http://dx.doi.org/10.5194/isprs-archives-xlii-2-877-2018.

Full text
Abstract:
Narrow spaces and passages are not a rare encounter in cultural heritage; the shape and extension of those areas pose a serious challenge to any technique one may choose to survey their 3D geometry, especially techniques that make use of stationary instrumentation like terrestrial laser scanning. The ratio between space extension and cross-section width of many corridors and staircases can easily lead to distortions/drift of the 3D reconstruction because of the propagation of uncertainty. This paper investigates the use of fisheye photogrammetry to produce the 3D reconstruction of such spaces and presents some tests to contain the degrees of freedom of the photogrammetric network, thereby also containing the drift of long data sets. The idea is that of employing a multi-camera system composed of several fisheye cameras and implementing distance and relative orientation constraints, as well as pre-calibration of the internal parameters for each camera, within the bundle adjustment. For the beginning of this investigation, we used the NCTech iSTAR panoramic camera as a rigid multi-camera system. The case study of the Amedeo Spire of the Milan Cathedral, which encloses a spiral staircase, is the stage for all the tests. Comparisons have been made between the results obtained with the multi-camera configuration, the auto-stitched equirectangular images, and a data set obtained with a monocular fisheye configuration using a full-frame DSLR. Results show improved accuracy, down to millimetres, using a rigidly constrained multi-camera system.
APA, Harvard, Vancouver, ISO, and other styles
41

Wang, Sijie, Qiyu Kang, Rui She, Wee Peng Tay, Andreas Hartmannsgruber, and Diego Navarro Navarro. "RobustLoc: Robust Camera Pose Regression in Challenging Driving Environments." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 5 (June 26, 2023): 6209–16. http://dx.doi.org/10.1609/aaai.v37i5.25765.

Full text
Abstract:
Camera relocalization has various applications in autonomous driving. Previous camera pose regression models consider only ideal scenarios where there is little environmental perturbation. To deal with challenging driving environments that may have changing seasons, weather, illumination, and the presence of unstable objects, we propose RobustLoc, which derives its robustness against perturbations from neural differential equations. Our model uses a convolutional neural network to extract feature maps from multi-view images, a robust neural differential equation diffusion block module to diffuse information interactively, and a branched pose decoder with multi-layer training to estimate the vehicle poses. Experiments demonstrate that RobustLoc surpasses current state-of-the-art camera pose regression models and achieves robust performance in various environments. Our code is released at: https://github.com/sijieaaa/RobustLoc
APA, Harvard, Vancouver, ISO, and other styles
42

Alsadik, Bashar, Luuk Spreeuwers, Farzaneh Dadrass Javan, and Nahuel Manterola. "Mathematical Camera Array Optimization for Face 3D Modeling Application." Sensors 23, no. 24 (December 12, 2023): 9776. http://dx.doi.org/10.3390/s23249776.

Full text
Abstract:
Camera network design is a challenging task for many applications in photogrammetry, biomedical engineering, robotics, and industrial metrology, among other fields. Many driving factors are found in camera network design, including the camera specifications, object of interest, and type of application. One interesting application is 3D face modeling and recognition, which involves recognizing an individual based on facial attributes derived from the constructed 3D model. Developers and researchers still face difficulty in reaching the required high level of accuracy and reliability needed for image-based 3D face models. This is caused, among many factors, by the hardware limitations and imperfection of the cameras and the lack of proficiency in designing the ideal camera-system configuration. Accordingly, for precise measurements, we still need engineering-based techniques to ascertain the specific level of deliverable quality. In this paper, an optimal geometric design methodology for the camera network is presented by investigating different multi-camera system configurations composed of four up to eight cameras. A mathematical nonlinear constrained optimization technique is applied to solve the problem, and each camera system configuration is tested for a facial 3D model where a quality assessment is applied to conclude the best configuration. The optimal configuration is found to be a 7-camera array, comprising a pentagon shape enclosing two additional cameras, offering high accuracy. For those who prioritize point density, a 9-camera array with a pentagon and quadrilateral arrangement in the X-Z plane is a viable choice. However, a 5-camera array offers a balance between accuracy and the number of cameras.
APA, Harvard, Vancouver, ISO, and other styles
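Entry 42 solves camera placement with nonlinear constrained optimization; as a hedged illustration only, the sketch below optimizes camera bearings on a frontal arc so that viewing rays stay close to orthogonal, a toy proxy for triangulation accuracy rather than the paper's actual objective or constraints. All numbers and names are assumptions.

```python
import numpy as np
from scipy.optimize import minimize

n_cams = 5

def uncertainty_proxy(angles):
    """Proxy cost for triangulation quality: prefer viewing rays whose
    pairwise intersection angles are close to 90 degrees."""
    dirs = np.stack([np.cos(angles), np.sin(angles)], axis=1)
    cost = 0.0
    for i in range(n_cams):
        for j in range(i + 1, n_cams):
            cost += (dirs[i] @ dirs[j]) ** 2   # zero when the two rays are orthogonal
    return cost

# Cameras constrained to a frontal arc of +/- 75 degrees around the face.
bounds = [(-np.deg2rad(75), np.deg2rad(75))] * n_cams
x0 = np.linspace(-1.0, 1.0, n_cams)            # initial, evenly spread bearings
res = minimize(uncertainty_proxy, x0, bounds=bounds, method="SLSQP")
print(np.rad2deg(res.x).round(1))              # optimized camera bearings (degrees)
```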
43

Dong, Huanan, Ming Wen, and Zhouwang Yang. "Vehicle Speed Estimation Based on 3D ConvNets and Non-Local Blocks." Future Internet 11, no. 6 (May 30, 2019): 123. http://dx.doi.org/10.3390/fi11060123.

Full text
Abstract:
Vehicle speed estimation is an important problem in traffic surveillance. Many existing approaches to this problem are based on camera calibration. Two shortcomings exist for camera calibration-based methods. First, camera calibration methods are sensitive to the environment, which means the accuracy of the results is compromised in situations where the environmental conditions are not satisfied. Furthermore, camera calibration-based methods rely on vehicle trajectories acquired by a two-stage tracking and detection process. In an effort to overcome these shortcomings, we propose an alternate end-to-end method based on 3-dimensional convolutional networks (3D ConvNets). The proposed method bases average vehicle speed estimation on information from video footage. Our methods are characterized by the following three features. First, we use non-local blocks in our model to better capture spatial–temporal long-range dependency. Second, we use optical flow as an input to the model. Optical flow includes information on the speed and direction of pixel motion in an image. Third, we construct a multi-scale convolutional network. This network extracts information on various characteristics of vehicles in motion. The proposed method shows promising experimental results on a commonly used dataset, with a mean absolute error (MAE) of 2.71 km/h and a mean square error (MSE) of 14.62.
APA, Harvard, Vancouver, ISO, and other styles
44

Liang, Hong, Zizhen Ma, and Qian Zhang. "Self-Supervised Object Distance Estimation Using a Monocular Camera." Sensors 22, no. 8 (April 12, 2022): 2936. http://dx.doi.org/10.3390/s22082936.

Full text
Abstract:
Distance estimation using a monocular camera is one of the most classic tasks for computer vision. Current monocular distance estimating methods need a lot of data collection or they produce imprecise results. In this paper, we propose a network for both object detection and distance estimation. A network-based on ShuffleNet and YOLO is used to detect an object, and a self-supervised learning network is used to estimate distance. We calibrated the camera, and the calibrated parameters were integrated into the overall network. We also analyzed the parameter variation of the camera pose. Further, a multi-scale resolution is applied to improve estimation accuracy by enriching the expression ability of depth information. We validated the results of object detection and distance estimation on the KITTI dataset and demonstrated that our approach is efficient and accurate. Finally, we construct a dataset and conduct similar experiments to verify the generality of the network in other scenarios. The results show that our proposed methods outperform alternative approaches on object-specific distance estimation.
APA, Harvard, Vancouver, ISO, and other styles
45

Choi, Doo-Hyun, and Se-Young Oh. "Real-Time Neural Network Based Camera Localization and its Extension to Mobile Robot Control." International Journal of Neural Systems 08, no. 03 (June 1997): 279–93. http://dx.doi.org/10.1142/s012906579700029x.

Full text
Abstract:
The feasibility of using neural networks for camera localization and mobile robot control is investigated here. This approach has the advantages of eliminating the laborious and error-prone process of imaging system modeling and calibration procedures. Basically, two different approaches of using neural networks are introduced of which one is a hybrid approach combining neural networks and the pinhole-based analytic solution while the other is purely neural network based. These techniques have been tested and compared through both simulation and real-time experiments and are shown to yield more precise localization than analytic approaches. Furthermore, this neural localization method is also shown to be directly applicable to the navigation control of an experimental mobile robot along the hallway purely guided by a dark wall strip. It also facilitates multi-sensor fusion through the use of multiple sensors of different types for control due to the network's capability of learning without models.
APA, Harvard, Vancouver, ISO, and other styles
46

Antuña-Sánchez, Juan C., Roberto Román, Victoria E. Cachorro, Carlos Toledano, César López, Ramiro González, David Mateos, Abel Calle, and Ángel M. de Frutos. "Relative sky radiance from multi-exposure all-sky camera images." Atmospheric Measurement Techniques 14, no. 3 (March 22, 2021): 2201–17. http://dx.doi.org/10.5194/amt-14-2201-2021.

Full text
Abstract:
All-sky cameras are frequently used to detect cloud cover; however, this work explores the use of these instruments for the more complex purpose of extracting relative sky radiances. An all-sky camera (SONA202-NF model), with three colour filters narrower than usual for this kind of camera, is configured to capture raw images at seven exposure times. A detailed camera characterization of the black level, readout noise, hot pixels and linear response is carried out. A methodology is proposed to obtain a linear high dynamic range (HDR) image and its uncertainty, which represents the relative sky radiance (in arbitrary units) maps at three effective wavelengths. The relative sky radiances are extracted from these maps and normalized by dividing every radiance of one channel by the sum of all radiances in that channel. The normalized radiances are then compared with the sky radiance measured at different sky points by a sun and sky photometer belonging to the Aerosol Robotic Network (AERONET). The camera radiances correlate with the photometer ones except at scattering angles below 10°, which is probably due to light reflections on the fisheye lens and camera dome. Camera and photometer wavelengths are not coincident; hence, camera radiances are also compared with sky radiances simulated by a radiative transfer model at the same camera effective wavelengths. This comparison reveals an uncertainty in the normalized camera radiances of about 3.3 %, 4.3 % and 5.3 % for 467, 536 and 605 nm, respectively, if specific quality criteria are applied.
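The HDR combination and per-channel normalization can be sketched as follows (the variable shapes, the 12-bit full scale and the saturation threshold are assumptions; this is not the authors' processing chain):

import numpy as np

def relative_radiance(frames, exposures, black_level=0.0, full_scale=4095):
    """Combine multi-exposure raw frames into a linear relative-radiance map."""
    num = np.zeros(frames[0].shape, dtype=float)
    den = np.zeros(frames[0].shape, dtype=float)
    for img, t in zip(frames, exposures):
        linear = img.astype(float) - black_level
        valid = (linear > 0) & (img < 0.95 * full_scale)   # drop saturated pixels
        num += np.where(valid, linear / t, 0.0)
        den += valid
    return num / np.maximum(den, 1)        # arbitrary units (relative radiance)

def normalize_channel(radiances):
    """Divide every extracted sky radiance by the sum over that channel."""
    radiances = np.asarray(radiances, dtype=float)
    return radiances / radiances.sum()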
APA, Harvard, Vancouver, ISO, and other styles
47

Deng, Xinpeng, Su Qiu, Weiqi Jin, and Jiaan Xue. "Three-Dimensional Reconstruction Method for Bionic Compound-Eye System Based on MVSNet Network." Electronics 11, no. 11 (June 5, 2022): 1790. http://dx.doi.org/10.3390/electronics11111790.

Full text
Abstract:
In practical scenarios where shooting conditions are limited, high efficiency of image capture and a high success rate of 3D reconstruction are required. To enable the application of bionic compound eyes in small portable devices for 3D reconstruction, auto-navigation, and obstacle avoidance, a deep learning method for 3D reconstruction using a bionic compound-eye system with partially overlapping fields of view was studied. We used the system to capture images of the target scene and then recovered the camera parameter matrices by solving the PnP problem. Considering the unique characteristics of the system, we designed a neural network based on the MVSNet structure, named CES-MVSNet. Feeding the captured images and camera parameters to the trained deep neural network generates 3D reconstruction results with good integrity and precision. We used both the traditional multi-view geometric method and neural networks for 3D reconstruction and analyzed the differences between the results of the two methods. The efficiency and reliability of using the bionic compound-eye system for 3D reconstruction are demonstrated.
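The camera-parameter recovery step mentioned above amounts to a standard PnP solve; a minimal sketch with placeholder 3D–2D correspondences and assumed intrinsics (not the paper's calibration data) is:

import cv2
import numpy as np

object_pts = np.array([[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0]], dtype=np.float32)
image_pts = np.array([[320, 240], [400, 238], [402, 320], [322, 322]], dtype=np.float32)
K = np.array([[800, 0, 320], [0, 800, 240], [0, 0, 1]], dtype=np.float32)  # assumed intrinsics

ok, rvec, tvec = cv2.solvePnP(object_pts, image_pts, K, None)
R, _ = cv2.Rodrigues(rvec)           # rotation matrix from rotation vector
extrinsic = np.hstack([R, tvec])     # 3x4 [R | t] for one sub-eye
projection = K @ extrinsic           # projection matrix fed to the MVS-style network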
APA, Harvard, Vancouver, ISO, and other styles
48

Makov, S., A. Minaev, A. Nikitin, V. Voronin, E. Semenishchev, and V. Marchuk. "Feature representation learning by sparse neural network for multi-camera person re-identification." Electronic Imaging 2017, no. 13 (January 29, 2017): 149–55. http://dx.doi.org/10.2352/issn.2470-1173.2017.13.ipas-211.

Full text
APA, Harvard, Vancouver, ISO, and other styles
49

Mendez, Javier, Miguel Molina, Noel Rodriguez, Manuel P. Cuellar, and Diego P. Morales. "Camera-LiDAR Multi-Level Sensor Fusion for Target Detection at the Network Edge." Sensors 21, no. 12 (June 9, 2021): 3992. http://dx.doi.org/10.3390/s21123992.

Full text
Abstract:
There have been significant advances in target detection in the autonomous vehicle context. To develop more robust systems that can overcome weather hazards as well as sensor problems, the sensor fusion approach is taking the lead in this context. Laser Imaging Detection and Ranging (LiDAR) and camera sensors are two of the most used sensors for this task, since they can accurately provide important features such as a target's depth and shape. However, most current state-of-the-art target detection algorithms for autonomous cars do not take into account the hardware limitations of the vehicle, such as its reduced computing power in comparison with cloud servers and the need for low latency. In this work, we propose Edge Computing Tensor Processing Unit (TPU) devices as hardware support, owing to their computing capabilities for machine learning algorithms and their reduced power consumption, and we develop an accurate and compact target detection model for these devices. Our proposed Multi-Level Sensor Fusion model has been optimized for the network edge, specifically for the Google Coral TPU. As a result, high accuracy is obtained on the challenging KITTI dataset while reducing the memory consumption and latency of the system.
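The geometric core of any camera-LiDAR fusion scheme is projecting LiDAR points into the image plane so that depth can be associated with camera detections; a minimal sketch (KITTI-style calibration matrices assumed, not the paper's full multi-level fusion model) is:

import numpy as np

def project_lidar_to_image(points_xyz, T_cam_lidar, K):
    """points_xyz: (N, 3) LiDAR points -> pixel coordinates and depths."""
    pts_h = np.hstack([points_xyz, np.ones((points_xyz.shape[0], 1))])  # homogeneous
    cam = (T_cam_lidar @ pts_h.T).T[:, :3]       # transform into the camera frame
    cam = cam[cam[:, 2] > 0]                     # keep points in front of the camera
    uvw = (K @ cam.T).T
    return uvw[:, :2] / uvw[:, 2:3], cam[:, 2]   # pixel coords, depth in metres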
APA, Harvard, Vancouver, ISO, and other styles
50

Linok, S. A., and D. A. Yudin. "Influence of Neural Network Receptive Field on Monocular Depth and Ego-Motion Estimation." Optical Memory and Neural Networks 32, S2 (November 28, 2023): S206–S213. http://dx.doi.org/10.3103/s1060992x23060103.

Full text
Abstract:
We present an analysis of a self-supervised learning approach for monocular depth and ego-motion estimation. This is an important problem for the computer vision systems of robots, autonomous vehicles and other intelligent agents equipped only with a monocular camera. We explore a number of neural network architectures that perform single-frame depth and multi-frame camera pose prediction to minimize the photometric error between consecutive frames in a sequence of camera images. Unlike other existing works, our proposed approach, called ERF-SfMLearner, examines the influence of the deep neural network's receptive field on the performance of depth and ego-motion estimation. To do this, we study the modification of network layers with two convolution operators with an extended receptive field: dilated and deformable convolutions. We demonstrate on the KITTI dataset that increasing the receptive field leads to better metrics and lower errors for both depth and ego-motion estimation. Code is publicly available at github.com/linukc/ERF-SfMLearner.
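The receptive-field modification studied here is easy to see in isolation; the sketch below (channel counts are arbitrary assumptions) contrasts a standard 3x3 convolution with its dilated counterpart, which covers an effective 5x5 area at the same parameter cost:

import torch
import torch.nn as nn

standard = nn.Conv2d(32, 32, kernel_size=3, padding=1)              # 3x3 receptive field
dilated = nn.Conv2d(32, 32, kernel_size=3, padding=2, dilation=2)   # effective 5x5 field

x = torch.randn(1, 32, 64, 64)
assert standard(x).shape == dilated(x).shape   # same output size, wider context for `dilated`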
APA, Harvard, Vancouver, ISO, and other styles