Journal articles on the topic "Modèle voxel"

Listed below are the top 50 journal articles for research on the topic "Modèle voxel".

1

Zhao, Lin, Siyuan Xu, Liman Liu, Delie Ming, and Wenbing Tao. "SVASeg: Sparse Voxel-Based Attention for 3D LiDAR Point Cloud Semantic Segmentation". Remote Sensing 14, no. 18 (September 7, 2022): 4471. http://dx.doi.org/10.3390/rs14184471.

Abstract:
3D LiDAR has become an indispensable sensor in autonomous driving vehicles. In LiDAR-based 3D point cloud semantic segmentation, most voxel-based 3D segmentation networks cannot efficiently capture large amounts of context information, resulting in limited receptive fields that limit their performance. To address this problem, a sparse voxel-based attention network, termed SVASeg, is introduced for 3D LiDAR point cloud semantic segmentation; it captures large amounts of context information between voxels through sparse voxel-based multi-head attention (SMHA). Traditional multi-head attention cannot be applied directly to the non-empty sparse voxels. To this end, a hash table is built over the voxel coordinates to look up the non-empty neighboring voxels of each sparse voxel. The sparse voxels are then grouped into different groups, each corresponding to a local region. Afterwards, position embedding, multi-head attention and feature fusion are performed for each group to capture and aggregate the context information. Based on the SMHA module, SVASeg can operate directly on the non-empty voxels while maintaining a computational overhead comparable to the convolutional method. Extensive experimental results on the SemanticKITTI and nuScenes datasets show the superiority of SVASeg.
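
The neighbor-lookup step described in this abstract (a hash table keyed on voxel coordinates, used to gather each sparse voxel's non-empty neighbors before grouped attention) is easy to picture in code. The snippet below is a minimal NumPy illustration, not the authors' implementation; the window size and function name are assumptions.

```python
import numpy as np

def group_nonempty_neighbors(coords, feats, window=3):
    """Gather non-empty neighboring voxel features for each sparse voxel.

    coords : (N, 3) int array of occupied voxel coordinates
    feats  : (N, C) float array of per-voxel features
    window : cubic neighborhood edge length (odd)
    Returns a list of (K_i, C) arrays, one neighbor group per voxel.
    """
    table = {tuple(c): i for i, c in enumerate(coords)}   # coordinate -> row index
    r = window // 2
    offsets = [(dx, dy, dz)
               for dx in range(-r, r + 1)
               for dy in range(-r, r + 1)
               for dz in range(-r, r + 1)]
    groups = []
    for c in coords:
        idx = [table[key] for key in
               (tuple(c + np.array(o)) for o in offsets) if key in table]
        groups.append(feats[idx])                          # only non-empty neighbors
    return groups

# toy usage: 5 occupied voxels with 4-dim features
coords = np.array([[0, 0, 0], [0, 0, 1], [0, 1, 0], [5, 5, 5], [5, 5, 6]])
feats = np.random.rand(5, 4).astype(np.float32)
groups = group_nonempty_neighbors(coords, feats)
print([g.shape for g in groups])
```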
2

Tang, Jiaxiang, Xiaokang Chen, Jingbo Wang, and Gang Zeng. "Not All Voxels Are Equal: Semantic Scene Completion from the Point-Voxel Perspective". Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 2 (June 28, 2022): 2352–60. http://dx.doi.org/10.1609/aaai.v36i2.20134.

Abstract:
We revisit Semantic Scene Completion (SSC), a useful task to predict the semantic and occupancy representation of 3D scenes, in this paper. A number of methods for this task are always based on voxelized scene representations. Although voxel representations keep local structures of the scene, these methods suffer from heavy computation redundancy due to the existence of visible empty voxels when the network goes deeper. To address this dilemma, we propose our novel point-voxel aggregation network for this task. We first transfer the voxelized scenes to point clouds by removing these visible empty voxels and adopt a deep point stream to capture semantic information from the scene efficiently. Meanwhile, a light-weight voxel stream containing only two 3D convolution layers preserves local structures of the voxelized scenes. Furthermore, we design an anisotropic voxel aggregation operator to fuse the structure details from the voxel stream into the point stream, and a semantic-aware propagation module to enhance the up-sampling process in the point stream by semantic labels. We demonstrate that our model surpasses state-of-the-arts on two benchmarks by a large margin, with only the depth images as input.
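
The first step described above, turning a voxelized scene into a point cloud by discarding empty voxels, is simple to illustrate. The NumPy sketch below is only an illustration of that conversion; the voxel size, grid origin, and occupancy layout are assumptions rather than details from the paper.

```python
import numpy as np

def occupied_voxels_to_points(occupancy, voxel_size=0.2, origin=(0.0, 0.0, 0.0)):
    """Convert an occupancy grid to a point cloud, keeping only occupied voxels.

    occupancy : (D, H, W) bool/int array, nonzero = occupied
    Returns (M, 3) array of voxel-center coordinates in metric space.
    """
    idx = np.argwhere(occupancy > 0)                      # (M, 3) integer indices
    return (idx + 0.5) * voxel_size + np.asarray(origin)  # voxel centers

# toy usage: a 32^3 grid with a handful of occupied cells
grid = np.zeros((32, 32, 32), dtype=np.uint8)
grid[4, 5, 6] = grid[10, 10, 10] = 1
print(occupied_voxels_to_points(grid).shape)   # (2, 3)
```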
3

He, Qingdong, Zhengning Wang, Hao Zeng, Yi Zeng, and Yijun Liu. "SVGA-Net: Sparse Voxel-Graph Attention Network for 3D Object Detection from Point Clouds". Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 1 (June 28, 2022): 870–78. http://dx.doi.org/10.1609/aaai.v36i1.19969.

Abstract:
Accurate 3D object detection from point clouds has become a crucial component in autonomous driving. However, the volumetric representations and the projection methods in previous works fail to establish the relationships between the local point sets. In this paper, we propose Sparse Voxel-Graph Attention Network (SVGA-Net), a novel end-to-end trainable network which mainly contains voxel-graph module and sparse-to-dense regression module to achieve comparable 3D detection tasks from raw LIDAR data. Specifically, SVGA-Net constructs the local complete graph within each divided 3D spherical voxel and global KNN graph through all voxels. The local and global graphs serve as the attention mechanism to enhance the extracted features. In addition, the novel sparse-to-dense regression module enhances the 3D box estimation accuracy through feature maps aggregation at different levels. Experiments on KITTI detection benchmark and Waymo Open dataset demonstrate the efficiency of extending the graph representation to 3D object detection and the proposed SVGA-Net can achieve decent detection accuracy.
4

Chen, Yuhong, Weilong Peng, Keke Tang, Asad Khan, Guodong Wei, and Meie Fang. "PyraPVConv: Efficient 3D Point Cloud Perception with Pyramid Voxel Convolution and Sharable Attention". Computational Intelligence and Neuroscience 2022 (May 13, 2022): 1–9. http://dx.doi.org/10.1155/2022/2286818.

Abstract:
Designing efficient deep learning models for 3D point cloud perception is becoming a major research direction. Point-voxel convolution (PVConv) Liu et al. (2019) is a pioneering research work in this topic. However, since with quite a few layers of simple 3D convolutions and linear point-voxel feature fusion operations, it still has considerable room for improvement in performance. In this paper, we propose a novel pyramid point-voxel convolution (PyraPVConv) block with two key structural modifications to address the above issues. First, PyraPVConv uses a voxel pyramid module to fully extract voxel features in the manner of feature pyramid, such that sufficient voxel features can be obtained efficiently. Second, a sharable attention module is utilized to capture compatible features between multi-scale voxels in pyramid and point cloud for aggregation, as well as to reduce the complexity via structure sharing. Extensive results on three point cloud perception tasks, i.e., indoor scene segmentation, object part segmentation and 3D object detection, validate that the networks constructed by stacking PyraPVConv blocks are efficient in terms of both GPU memory consumption and computational complexity, and are superior to the state-of-the-art methods.
5

Li, Guangping, Zuanfang Mo, and Bingo Wing-Kuen Ling. "AMFF-Net: An Effective 3D Object Detector Based on Attention and Multi-Scale Feature Fusion". Sensors 23, no. 23 (November 22, 2023): 9319. http://dx.doi.org/10.3390/s23239319.

Abstract:
With the advent of autonomous vehicle applications, the importance of LiDAR point cloud 3D object detection cannot be overstated. Recent studies have demonstrated that methods for aggregating features from voxels can accurately and efficiently detect objects in large, complex 3D detection scenes. Nevertheless, most of these methods do not filter background points well and have inferior detection performance for small objects. To ameliorate this issue, this paper proposes an Attention-based and Multiscale Feature Fusion Network (AMFF-Net), which utilizes a Dual-Attention Voxel Feature Extractor (DA-VFE) and a Multi-scale Feature Fusion (MFF) Module to improve the precision and efficiency of 3D object detection. The DA-VFE considers pointwise and channelwise attention and integrates them into the Voxel Feature Extractor (VFE) to enhance key point cloud information in voxels and refine more-representative voxel features. The MFF Module consists of self-calibrated convolutions, a residual structure, and a coordinate attention mechanism, which acts as a 2D Backbone to expand the receptive domain and capture more contextual information, thus better capturing small object locations, enhancing the feature-extraction capability of the network and reducing the computational overhead. We performed evaluations of the proposed model on the nuScenes dataset with a large number of driving scenarios. The experimental results showed that the AMFF-Net achieved 62.8% in the mAP, which significantly boosted the performance of small object detection compared to the baseline network and significantly reduced the computational overhead, while the inference speed remained essentially the same. AMFF-Net also achieved advanced performance on the KITTI dataset.
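
The DA-VFE idea of combining point-wise and channel-wise attention inside a voxel feature extractor can be illustrated with simple gating. The sketch below is a plain NumPy approximation under assumed projection shapes, not the AMFF-Net implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def dual_attention_vfe(voxel_points, w_point, w_channel):
    """Apply point-wise and channel-wise gating to per-voxel point features.

    voxel_points : (V, N, C) features of N points in each of V voxels
    w_point      : (C, 1) projection used to score each point
    w_channel    : (C, C) projection used to score each channel
    Returns re-weighted features of the same shape.
    """
    # point-wise attention: one weight per point in a voxel
    point_scores = sigmoid(voxel_points @ w_point)                    # (V, N, 1)
    # channel-wise attention: one weight per channel, from the voxel's mean feature
    channel_scores = sigmoid(voxel_points.mean(axis=1) @ w_channel)   # (V, C)
    return voxel_points * point_scores * channel_scores[:, None, :]

V, N, C = 8, 16, 32
x = np.random.randn(V, N, C).astype(np.float32)
out = dual_attention_vfe(x, np.random.randn(C, 1), np.random.randn(C, C))
print(out.shape)  # (8, 16, 32)
```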
6

Shuang, Feng, Hanzhang Huang, Yong Li, Rui Qu, and Pei Li. "AFE-RCNN: Adaptive Feature Enhancement RCNN for 3D Object Detection". Remote Sensing 14, no. 5 (February 27, 2022): 1176. http://dx.doi.org/10.3390/rs14051176.

Abstract:
The point clouds scanned by lidar are generally sparse, which can result in fewer sampling points of objects. To perform precise and effective 3D object detection, it is necessary to improve the feature representation ability to extract more feature information of the object points. Therefore, we propose an adaptive feature enhanced 3D object detection network based on point clouds (AFE-RCNN). AFE-RCNN is a point-voxel integrated network. We first voxelize the raw point clouds and obtain the voxel features through the 3D voxel convolutional neural network. Then, the 3D feature vectors are projected to the 2D bird’s eye view (BEV), and the relationship between the features in both spatial dimension and channel dimension is learned by the proposed residual of dual attention proposal generation module. The high-quality 3D box proposals are generated based on the BEV features and anchor-based approach. Next, we sample key points from raw point clouds to summarize the information of the voxel features, and obtain the key point features by the multi-scale feature extraction module based on adaptive feature adjustment. The neighboring contextual information is integrated into each key point through this module, and the robustness of feature processing is also guaranteed. Lastly, we aggregate the features of the BEV, voxels, and point clouds as the key point features that are used for proposal refinement. In addition, to ensure the correlation among the vertices of the bounding box, we propose a refinement loss function module with vertex associativity. Our AFE-RCNN exhibits comparable performance on the KITTI dataset and Waymo open dataset to state-of-the-art methods. On the KITTI 3D detection benchmark, for the moderate difficulty level of the car and the cyclist classes, the 3D detection mean average precisions of AFE-RCNN can reach 81.53% and 67.50%, respectively.
7

Bourbonne, V., V. Jaouen, M. Rehn, M. Hatt, O. Pradier, D. Visvikis, F. Lucia, and U. Schick. "Développement et validation d'un modèle basé sur l'analyse par voxel pour la prédiction de la toxicité pulmonaire aiguë chez les patients pris en charge par arcthérapie volumétrique pour un cancer du poumon localement évolué". Cancer/Radiothérapie 25, no. 6-7 (October 2021): 736. http://dx.doi.org/10.1016/j.canrad.2021.07.020.

8

Wang, Yu, and Chao Tong. "H2GFormer: Horizontal-to-Global Voxel Transformer for 3D Semantic Scene Completion". Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 6 (March 24, 2024): 5722–30. http://dx.doi.org/10.1609/aaai.v38i6.28384.

Abstract:
3D Semantic Scene Completion (SSC) has emerged as a novel task in vision-based holistic 3D scene understanding. Its objective is to densely predict the occupancy and category of each voxel in a 3D scene based on input from either LiDAR or images. Currently, many transformer-based semantic scene completion frameworks employ simple yet popular Cross-Attention and Self-Attention mechanisms to integrate and infer dense geometric and semantic information of voxels. However, they overlook the distinctions among voxels in the scene, especially in outdoor scenarios where the horizontal direction contains more variations. And voxels located at object boundaries and within the interior of objects exhibit varying levels of positional significance. To address this issue, we propose a transformer-based SSC framework called H2GFormer that incorporates a horizontal-to-global approach. This framework takes into full consideration the variations of voxels in the horizontal direction and the characteristics of voxels on object boundaries. We introduce a horizontal window-to-global attention (W2G) module that effectively fuses semantic information by first diffusing it horizontally from reliably visible voxels and then propagating the semantic understanding to global voxels, ensuring a more reliable fusion of semantic-aware features. Moreover, an Internal-External Position Awareness Loss (IoE-PALoss) is utilized during network training to emphasize the critical positions within the transition regions between objects. The experiments conducted on the SemanticKITTI dataset demonstrate that H2GFormer exhibits superior performance in both geometric and semantic completion tasks. Our code is available on https://github.com/Ryanwy1/H2GFormer.
9

Guo, Xindong, Yu Sun, and Hua Yang. "FF-Net: Feature-Fusion-Based Network for Semantic Segmentation of 3D Plant Point Cloud". Plants 12, no. 9 (May 1, 2023): 1867. http://dx.doi.org/10.3390/plants12091867.

Abstract:
Semantic segmentation of 3D point clouds has played an important role in the field of plant phenotyping in recent years. However, existing methods need to down-sample the point cloud to a relatively small size when processing large-scale plant point clouds, which contain more than hundreds of thousands of points, which fails to take full advantage of the high-resolution of advanced scanning devices. To address this issue, we propose a feature-fusion-based method called FF-Net, which consists of two branches, namely the voxel-branch and the point-branch. In particular, the voxel-branch partitions a point cloud into voxels and then employs sparse 3D convolution to learn the context features, and the point-branch learns the point features within a voxel to preserve the detailed point information. Finally, an attention-based module was designed to fuse the two branch features to produce the final segmentation. We conducted extensive experiments on two large plant point clouds (maize and tomato), and the results showed that our method outperformed three commonly used models on both datasets and achieved the best mIoU of 80.95% on the maize dataset and 86.65% on the tomato dataset. Extensive cross-validation experiments were performed to evaluate the generalization ability of the models, and our method achieved promising segmentation results. In addition, the drawbacks of the proposed method were analyzed, and the directions for future works are given.
10

Peng, Hao, Guofeng Tong, Zheng Li, Yaqi Wang, and Yuyuan Shao. "3D object detection combining semantic and geometric features from point clouds". Cobot 1 (January 12, 2022): 2. http://dx.doi.org/10.12688/cobot.17433.1.

Abstract:
Background: 3D object detection based on point clouds in road scenes has attracted much attention recently. Voxel-based methods voxelize the scene into regular grids, which can be processed with advanced convolutional feature learning frameworks for semantic feature learning. Point-based methods can extract the geometric features of points because the point coordinates are preserved. The combination of the two is effective for 3D object detection. However, current methods use a voxel-based detection head with anchors for classification and localization. Although the preset anchors cover the entire scene, they are not suitable for detection tasks with larger scenes and multiple categories of objects, due to the limitation of the voxel size. Additionally, the misalignment between the predicted confidence and the proposals during region of interest (RoI) selection hinders 3D object detection. Methods: We investigate the combination of voxel-based and point-based methods for 3D object detection. A voxel-to-point module that captures semantic and geometric features is proposed in the paper. The voxel-to-point module is conducive to the detection of small-size objects and avoids preset anchors in the inference stage. Moreover, a confidence adjustment module with center-boundary-aware confidence attention is proposed to solve the misalignment between the predicted confidence and the proposals during region of interest selection. Results: The proposed method has achieved state-of-the-art results for 3D object detection on the Karlsruhe Institute of Technology and Toyota Technological Institute (KITTI) object detection dataset. As of September 19, 2021, our method ranked 1st in the 3D and Bird's Eye View (BEV) detection of cyclists tagged with difficulty level 'easy', and ranked 2nd in the 3D detection of cyclists tagged with 'moderate'. Conclusions: We propose an end-to-end two-stage 3D object detector with a voxel-to-point module and a confidence adjustment module.
11

Zhang, Jing, Da Xu, Yunsong Li, Liping Zhao, and Rui Su. "FusionPillars: A 3D Object Detection Network with Cross-Fusion and Self-Fusion". Remote Sensing 15, no. 10 (May 22, 2023): 2692. http://dx.doi.org/10.3390/rs15102692.

Abstract:
In the field of unmanned systems, cameras and LiDAR are important sensors that provide complementary information. However, the question of how to effectively fuse data from two different modalities has always been a great challenge. In this paper, inspired by the idea of deep fusion, we propose a one-stage end-to-end network named FusionPillars to fuse multisensor data (namely LiDAR point cloud and camera images). It includes three branches: a point-based branch, a voxel-based branch, and an image-based branch. We design two modules to enhance the voxel-wise features in the pseudo-image: the Set Abstraction Self (SAS) fusion module and the Pseudo View Cross (PVC) fusion module. For the data from a single sensor, by considering the relationship between the point-wise and voxel-wise features, the SAS fusion module self-fuses the point-based branch and the voxel-based branch to enhance the spatial information of the pseudo-image. For the data from two sensors, through the transformation of the images’ view, the PVC fusion module introduces the RGB information as auxiliary information and cross-fuses the pseudo-image and RGB image of different scales to supplement the color information of the pseudo-image. Experimental results revealed that, compared to existing current one-stage fusion networks, FusionPillars yield superior performance, with a considerable improvement in the detection precision for small objects.
12

Zhao, Yuekun, Suyun Luo, Xiaoci Huang, and Dan Wei. "A Multi-Sensor 3D Detection Method for Small Objects". World Electric Vehicle Journal 15, no. 5 (May 10, 2024): 210. http://dx.doi.org/10.3390/wevj15050210.

Abstract:
In response to the limited accuracy of current three-dimensional (3D) object detection algorithms for small objects, this paper presents a multi-sensor 3D small object detection method based on LiDAR and a camera. Firstly, the LiDAR point cloud is projected onto the image plane to obtain a depth image. Subsequently, we propose a cascaded image fusion module comprising multi-level pooling layers and multi-level convolution layers. This module extracts features from both the camera image and the depth image, addressing the issue of insufficient depth information in the image feature. Considering the non-uniform distribution characteristics of the LiDAR point cloud, we introduce a multi-scale voxel fusion module composed of three sets of VFE (voxel feature encoder) layers. This module partitions the point cloud into grids of different sizes to improve detection ability for small objects. Finally, the multi-level fused point features are associated with the corresponding scale’s initial voxel features to obtain the fused multi-scale voxel features, and the final detection results are obtained based on this feature. To evaluate the effectiveness of this method, experiments are conducted on the KITTI dataset, achieving a 3D AP (average precision) of 73.81% for the hard level of cars and 48.03% for the hard level of persons. The experimental results demonstrate that this method can effectively achieve 3D detection of small objects.
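
The multi-scale voxel fusion module rests on voxelizing the same point cloud at several grid sizes. Below is a minimal sketch of that partitioning step; the voxel sizes are illustrative and the code is not tied to the paper's implementation.

```python
import numpy as np

def voxelize(points, voxel_size):
    """Assign each point to a voxel at the given grid resolution.

    points : (N, 3) xyz coordinates
    Returns (unique_voxel_coords, point_to_voxel_index).
    """
    grid = np.floor(points / voxel_size).astype(np.int64)
    voxels, inverse = np.unique(grid, axis=0, return_inverse=True)
    return voxels, inverse

points = np.random.rand(1000, 3) * 50.0          # toy LiDAR-like points
for size in (0.2, 0.4, 0.8):                     # three grid resolutions
    voxels, inv = voxelize(points, size)
    print(f"voxel size {size}: {len(voxels)} occupied voxels")
```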
13

Jiang, Haobin, Junhao Ren, and Aoxue Li. "3D Object Detection under Urban Road Traffic Scenarios Based on Dual-Layer Voxel Features Fusion Augmentation". Sensors 24, no. 11 (May 21, 2024): 3267. http://dx.doi.org/10.3390/s24113267.

Abstract:
To enhance the accuracy of detecting objects in front of intelligent vehicles in urban road scenarios, this paper proposes a dual-layer voxel feature fusion augmentation network (DL-VFFA). It aims to address the issue of objects misrecognition caused by local occlusion or limited field of view for targets. The network employs a point cloud voxelization architecture, utilizing the Mahalanobis distance to associate similar point clouds within neighborhood voxel units. It integrates local and global information through weight sharing to extract boundary point information within each voxel unit. The relative position encoding of voxel features is computed using an improved attention Gaussian deviation matrix in point cloud space to focus on the relative positions of different voxel sequences within channels. During the fusion of point cloud and image features, learnable weight parameters are designed to decouple fine-grained regions, enabling two-layer feature fusion from voxel to voxel and from point cloud to image. Extensive experiments on the KITTI dataset demonstrate the significant performance of DL-VFFA. Compared to the baseline network Second, DL-VFFA performs better in medium- and high-difficulty scenarios. Furthermore, compared to the voxel fusion module in MVX-Net, the voxel feature fusion results in this paper are more accurate, effectively capturing fine-grained object features post-voxelization. Through ablative experiments, we conducted in-depth analyses of the three voxel fusion modules in DL-VFFA to enhance the performance of the baseline detector and achieved superior results.
14

Topoliński, Tomasz, Artur Cichański, Adam Mazurkiewicz, and Krzysztof Nowicki. "The Relationship between Trabecular Bone Structure Modeling Methods and the Elastic Modulus as Calculated by FEM". Scientific World Journal 2012 (2012): 1–9. http://dx.doi.org/10.1100/2012/827196.

Abstract:
Trabecular bone cores were collected from the femoral head at the time of surgery (hip arthroplasty). Forty-two specimens from patients with osteoporosis and coxarthrosis were investigated. The cores were scanned using a computed microtomography (microCT) system at an isotropic spatial resolution of 36 microns. Image stacks were converted to finite element models via a bone voxel-to-element algorithm. The apparent modulus was calculated assuming elastic properties of E = 10 MPa and ν = 0.3. The compressive deformation as calculated by finite element (FE) analysis was 0.8%. The models were coarsened to effectively change the resolution or voxel size (from 72 microns to 288 microns or from 72 microns to 1080 microns). The aim of our study is to determine how an increase in the distance between scans changes the elastic properties as calculated by FE models. We tried to find a limiting voxel size at which the modulus values could still be calculated. As the voxel size increased, the mean voxel volume increased and the FEA-derived apparent modulus decreased. The slope of the voxel size versus modulus relationship correlated with several architectural indices of trabecular bone.
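
The coarsening procedure (merging fine microCT voxels into larger ones before rebuilding the FE model) can be mimicked by block-averaging the binary volume. The sketch below is only an illustration on a toy volume; the block factors and the re-thresholding step are assumptions, not the authors' pipeline.

```python
import numpy as np

def coarsen(volume, factor):
    """Coarsen a binary voxel volume by averaging non-overlapping blocks.

    volume : (D, H, W) array with values in {0, 1}; dimensions divisible by factor
    Returns a (D//f, H//f, W//f) array of block-wise bone fractions.
    """
    d, h, w = volume.shape
    f = factor
    blocks = volume.reshape(d // f, f, h // f, f, w // f, f)
    return blocks.mean(axis=(1, 3, 5))

# toy volume standing in for a 36-micron binary microCT stack
vol = (np.random.rand(72, 72, 72) > 0.8).astype(np.float32)
for f in (2, 4):                               # e.g. doubling and quadrupling voxel size
    coarse = coarsen(vol, f)
    coarse_binary = coarse >= 0.5              # re-threshold to rebuild a binary FE model
    print(f, coarse.shape, f"bone fraction = {coarse.mean():.3f}")
```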
15

Chen, Hongmei, Haifeng Wang, Zilong Liu, Dongbing Gu, and Wen Ye. "HP3D-V2V: High-Precision 3D Object Detection Vehicle-to-Vehicle Cooperative Perception Algorithm". Sensors 24, no. 7 (March 28, 2024): 2170. http://dx.doi.org/10.3390/s24072170.

Abstract:
Cooperative perception in the field of connected autonomous vehicles (CAVs) aims to overcome the inherent limitations of single-vehicle perception systems, including long-range occlusion, low resolution, and susceptibility to weather interference. In this regard, we propose a high-precision 3D object detection V2V cooperative perception algorithm. The algorithm utilizes a voxel grid-based statistical filter to effectively denoise point cloud data to obtain clean and reliable data. In addition, we design a feature extraction network based on the fusion of voxels and PointPillars and encode it to generate BEV features, which solves the spatial feature interaction problem lacking in the PointPillars approach and enhances the semantic information of the extracted features. A maximum pooling technique is used to reduce the dimensionality and generate pseudoimages, thereby skipping complex 3D convolutional computation. To facilitate effective feature fusion, we design a feature level-based crossvehicle feature fusion module. Experimental validation is conducted using the OPV2V dataset to assess vehicle coperception performance and compare it with existing mainstream coperception algorithms. Ablation experiments are also carried out to confirm the contributions of this approach. Experimental results show that our architecture achieves lightweighting with a higher average precision (AP) than other existing models.
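
The denoising stage, a voxel grid-based statistical filter, amounts to discarding points that land in sparsely populated cells. The snippet below shows one simple variant in NumPy; the voxel size and point-count threshold are illustrative, and the paper's filter may use different statistics.

```python
import numpy as np

def voxel_statistical_filter(points, voxel_size=0.4, min_points=3):
    """Drop points that fall in sparsely populated voxels.

    points : (N, 3) LiDAR points; voxels holding fewer than `min_points`
    points are treated as noise and removed.
    """
    grid = np.floor(points / voxel_size).astype(np.int64)
    _, inverse, counts = np.unique(grid, axis=0, return_inverse=True,
                                   return_counts=True)
    keep = counts[inverse] >= min_points
    return points[keep]

pts = np.vstack([np.random.randn(500, 3),                # dense cluster
                 np.random.uniform(-40, 40, (50, 3))])   # scattered noise
print(len(voxel_statistical_filter(pts)))
```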
16

Kim, Taeho, and Joohee Kim. "Voxel Transformer with Density-Aware Deformable Attention for 3D Object Detection". Sensors 23, no. 16 (August 17, 2023): 7217. http://dx.doi.org/10.3390/s23167217.

Abstract:
The Voxel Transformer (VoTr) is a prominent model in the field of 3D object detection, employing a transformer-based architecture to comprehend long-range voxel relationships through self-attention. However, despite its expanded receptive field, VoTr’s flexibility is constrained by its predefined receptive field. In this paper, we present a Voxel Transformer with Density-Aware Deformable Attention (VoTr-DADA), a novel approach to 3D object detection. VoTr-DADA leverages density-guided deformable attention for a more adaptable receptive field. It efficiently identifies key areas in the input using density features, combining the strengths of both VoTr and Deformable Attention. We introduce the Density-Aware Deformable Attention (DADA) module, which is specifically designed to focus on these crucial areas while adaptively extracting more informative features. Experimental results on the KITTI dataset and the Waymo Open dataset show that our proposed method outperforms the baseline VoTr model in 3D object detection while maintaining a fast inference speed.
17

Li, Zheng, Guofeng Tong, Hao Peng, and Mingwei Ma. "GAF-RCNN: Grid attention fusion 3D object detection from point cloud". Cobot 2 (February 21, 2023): 3. http://dx.doi.org/10.12688/cobot.17590.1.

Abstract:
Background: Due to the refinement of region of the interests (RoIs), two-stage 3D detection algorithms can usually obtain better performance compared with most single-stage detectors. However, most two-stage methods adopt feature connection, to aggregate the grid point features using multi-scale RoI pooling in the second stage. This connection mode does not consider the correlation between multi-scale grid features. Methods: In the first stage, we employ 3D sparse convolution and 2D convolution to fully extract rich semantic features. Then, a small number of coarse RoIs are predicted based region proposal network (RPN) on generated bird’s eye view (BEV) map. After that, we adopt voxel RoI-pooling strategy to aggregate the neighborhood nonempty voxel features of each grid point in RoI in the last two layers of 3D sparse convolution. In this way, we obtain two aggregated features from 3D sparse voxel space for each grid point. Next, we design an attention feature fusion module. This module includes a local and a global attention layer, which can fully integrate the grid point features from different voxel layers. Results: We carry out relevant experiments on the Karlsruhe Institute of Technology and Toyota Technological Institute (KITTI) dataset. The average precisions of our proposed method are 88.21%, 81.51%, 77.07% on three difficulty levels (easy, moderate, and hard, respectively) for 3D detection, and 92.30%, 90.19%, 86.00% on three difficulty levels (easy, moderate, and hard, respectively) for BEV detection. Conclusions: In this paper, we propose a novel two-stage 3D detection algorithm named Grid Attention Fusion Region-based Convolutional Neural Network (GAF-RCNN) from point cloud. Because we integrate multi-scale RoI grid features with attention mechanism in the refinement stage, different multi-scale features can be better correlated, achieving a competitive level compared with other well tested detection algorithms. This 3D object detection has important implications for robot and cobot technology.
18

Zhu, Yun, Le Hui, Yaqi Shen, and Jin Xie. "SPGroup3D: Superpoint Grouping Network for Indoor 3D Object Detection". Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 7 (March 24, 2024): 7811–19. http://dx.doi.org/10.1609/aaai.v38i7.28616.

Abstract:
Current 3D object detection methods for indoor scenes mainly follow the voting-and-grouping strategy to generate proposals. However, most methods utilize instance-agnostic groupings, such as ball query, leading to inconsistent semantic information and inaccurate regression of the proposals. To this end, we propose a novel superpoint grouping network for indoor anchor-free one-stage 3D object detection. Specifically, we first adopt an unsupervised manner to partition raw point clouds into superpoints, areas with semantic consistency and spatial similarity. Then, we design a geometry-aware voting module that adapts to the centerness in anchor-free detection by constraining the spatial relationship between superpoints and object centers. Next, we present a superpoint-based grouping module to explore the consistent representation within proposals. This module includes a superpoint attention layer to learn feature interaction between neighboring superpoints, and a superpoint-voxel fusion layer to propagate the superpoint-level information to the voxel level. Finally, we employ effective multiple matching to capitalize on the dynamic receptive fields of proposals based on superpoints during the training. Experimental results demonstrate our method achieves state-of-the-art performance on ScanNet V2, SUN RGB-D, and S3DIS datasets in the indoor one-stage 3D object detection. Source code is available at https://github.com/zyrant/SPGroup3D.
19

Zhu, Yuan, Ruidong Xu, Chongben Tao, Hao An, Huaide Wang, Zhipeng Sun, and Ke Lu. "DS-Trans: A 3D Object Detection Method Based on a Deformable Spatiotemporal Transformer for Autonomous Vehicles". Remote Sensing 16, no. 9 (April 30, 2024): 1621. http://dx.doi.org/10.3390/rs16091621.

Abstract:
Facing the significant challenge of 3D object detection in complex weather conditions and road environments, existing algorithms based on single-frame point cloud data struggle to achieve desirable results. These methods typically focus on spatial relationships within a single frame, overlooking the semantic correlations and spatiotemporal continuity between consecutive frames. This leads to discontinuities and abrupt changes in the detection outcomes. To address this issue, this paper proposes a multi-frame 3D object detection algorithm based on a deformable spatiotemporal Transformer. Specifically, a deformable cross-scale Transformer module is devised, incorporating a multi-scale offset mechanism that non-uniformly samples features at different scales, enhancing the spatial information aggregation capability of the output features. Simultaneously, to address the issue of feature misalignment during multi-frame feature fusion, a deformable cross-frame Transformer module is proposed. This module incorporates independently learnable offset parameters for different frame features, enabling the model to adaptively correlate dynamic features across multiple frames and improve the temporal information utilization of the model. A proposal-aware sampling algorithm is introduced to significantly increase the foreground point recall, further optimizing the efficiency of feature extraction. The obtained multi-scale and multi-frame voxel features are subjected to an adaptive fusion weight extraction module, referred to as the proposed mixed voxel set extraction module. This module allows the model to adaptively obtain mixed features containing both spatial and temporal information. The effectiveness of the proposed algorithm is validated on the KITTI, nuScenes, and self-collected urban datasets. The proposed algorithm achieves an average precision improvement of 2.1% over the latest multi-frame-based algorithms.
20

Yan, Xu, Jiantao Gao, Jie Li, Ruimao Zhang, Zhen Li, Rui Huang, and Shuguang Cui. "Sparse Single Sweep LiDAR Point Cloud Segmentation via Learning Contextual Shape Priors from Scene Completion". Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 4 (May 18, 2021): 3101–9. http://dx.doi.org/10.1609/aaai.v35i4.16419.

Abstract:
LiDAR point cloud analysis is a core task for 3D computer vision, especially for autonomous driving. However, due to the severe sparsity and noise interference in the single sweep LiDAR point cloud, the accurate semantic segmentation is non-trivial to achieve. In this paper, we propose a novel sparse LiDAR point cloud semantic segmentation framework assisted by learned contextual shape priors. In practice, an initial semantic segmentation (SS) of a single sweep point cloud can be achieved by any appealing network and then flows into the semantic scene completion (SSC) module as the input. By merging multiple frames in the LiDAR sequence as supervision, the optimized SSC module has learned the contextual shape priors from sequential LiDAR data, completing the sparse single sweep point cloud to the dense one. Thus, it inherently improves SS optimization through fully end-to-end training. Besides, a Point-Voxel Interaction (PVI) module is proposed to further enhance the knowledge fusion between SS and SSC tasks, i.e., promoting the interaction of incomplete local geometry of point cloud and complete voxel-wise global structure. Furthermore, the auxiliary SSC and PVI modules can be discarded during inference without extra burden for SS. Extensive experiments confirm that our JS3C-Net achieves superior performance on both SemanticKITTI and SemanticPOSS benchmarks, i.e., 4% and 3% improvement correspondingly.
21

Xu, Jinfeng, Xianzhi Li, Yuan Tang, Qiao Yu, Yixue Hao, Long Hu, and Min Chen. "CasFusionNet: A Cascaded Network for Point Cloud Semantic Scene Completion by Dense Feature Fusion". Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 3 (June 26, 2023): 3018–26. http://dx.doi.org/10.1609/aaai.v37i3.25405.

Abstract:
Semantic scene completion (SSC) aims to complete a partial 3D scene and predict its semantics simultaneously. Most existing works adopt the voxel representations, thus suffering from the growth of memory and computation cost as the voxel resolution increases. Though a few works attempt to solve SSC from the perspective of 3D point clouds, they have not fully exploited the correlation and complementarity between the two tasks of scene completion and semantic segmentation. In our work, we present CasFusionNet, a novel cascaded network for point cloud semantic scene completion by dense feature fusion. Specifically, we design (i) a global completion module (GCM) to produce an upsampled and completed but coarse point set, (ii) a semantic segmentation module (SSM) to predict the per-point semantic labels of the completed points generated by GCM, and (iii) a local refinement module (LRM) to further refine the coarse completed points and the associated labels from a local perspective. We organize the above three modules via dense feature fusion in each level, and cascade a total of four levels, where we also employ feature fusion between each level for sufficient information usage. Both quantitative and qualitative results on our compiled two point-based datasets validate the effectiveness and superiority of our CasFusionNet compared to state-of-the-art methods in terms of both scene completion and semantic segmentation. The codes and datasets are available at: https://github.com/JinfengX/CasFusionNet.
22

Pratt, Sheila R., Anne T. Heintzelman, and Susan Ensrud Deming. "The Efficacy of Using the IBM Speech Viewer Vowel Accuracy Module to Treat Young Children With Hearing Impairment". Journal of Speech, Language, and Hearing Research 36, no. 5 (October 1993): 1063–74. http://dx.doi.org/10.1044/jshr.3605.1063.

Abstract:
The efficacy of the IBM SpeechViewer's Vowel Accuracy Module for the treatment of vowel productions was evaluated in six preschool children with hearing impairment over a 4-month period. A single-subject design was used, and the vowels /a/, /i/ and /u/ were treated. Untreated sounds were also probed to monitor for carryover and developmental effects. One of the children was dismissed from the study because of noncompliance. Of the remaining five children, four exhibited a treatment effect for /u/, two for /a/, and one for /i/. Four of the children demonstrated some generalization. Developmental effects, as represented by change in /s/-cluster production, were not documented. Although treatment effects were observed, difficulties with the Vowel Accuracy Module were also observed. These included inaccuracies in the feedback on low-intensity, hypernasal, and high-pitched utterances; inability to sustain the attention of preschoolers over multiple sessions; lack of instructional feedback; and nonlinearity in the criterion-adjustment control.
23

Chen, Chen, Zhe Chen, Jing Zhang, and Dacheng Tao. "SASA: Semantics-Augmented Set Abstraction for Point-Based 3D Object Detection". Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 1 (June 28, 2022): 221–29. http://dx.doi.org/10.1609/aaai.v36i1.19897.

Abstract:
Although point-based networks are demonstrated to be accurate for 3D point cloud modeling, they are still falling behind their voxel-based competitors in 3D detection. We observe that the prevailing set abstraction design for down-sampling points may maintain too much unimportant background information that can affect feature learning for detecting objects. To tackle this issue, we propose a novel set abstraction method named Semantics-Augmented Set Abstraction (SASA). Technically, we first add a binary segmentation module as the side output to help identify foreground points. Based on the estimated point-wise foreground scores, we then propose a semantics-guided point sampling algorithm to help retain more important foreground points during down-sampling. In practice, SASA shows to be effective in identifying valuable points related to foreground objects and improving feature learning for point-based 3D detection. Additionally, it is an easy-to-plug-in module and able to boost various point-based detectors, including single-stage and two-stage ones. Extensive experiments on the popular KITTI and nuScenes datasets validate the superiority of SASA, lifting point-based detection models to reach comparable performance to state-of-the-art voxel-based methods. Code is available at https://github.com/blakechen97/SASA.
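
The semantics-guided sampling step can be pictured as weighted down-sampling driven by predicted foreground scores. The sketch below shows only that weighting idea in NumPy; SASA itself combines the scores with distance-based farthest point sampling, which is omitted here, and the parameter names are assumptions.

```python
import numpy as np

def semantics_guided_sampling(points, foreground_scores, num_samples, gamma=1.0):
    """Down-sample a point cloud with preference for likely-foreground points.

    points            : (N, 3) coordinates
    foreground_scores : (N,) scores in [0, 1] from a binary segmentation head
    num_samples       : number of points to keep
    gamma             : sharpness of the preference (gamma=0 -> uniform)
    """
    weights = np.power(foreground_scores + 1e-6, gamma)
    probs = weights / weights.sum()
    idx = np.random.choice(len(points), size=num_samples, replace=False, p=probs)
    return points[idx], idx

pts = np.random.rand(2048, 3)
scores = np.random.rand(2048)            # stand-in for predicted foreground scores
sampled, idx = semantics_guided_sampling(pts, scores, 512)
print(sampled.shape)
```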
24

Lee, Jinho, Geonkyu Bang, Takaya Shimizu, Masato Iehara, and Shunsuke Kamijo. "LiDAR-to-Radar Translation Based on Voxel Feature Extraction Module for Radar Data Augmentation". Sensors 24, no. 2 (January 16, 2024): 559. http://dx.doi.org/10.3390/s24020559.

Abstract:
In autonomous vehicles, the LiDAR and radar sensors are indispensable components for measuring distances to objects. While deep-learning-based algorithms for LiDAR sensors have been extensively proposed, the same cannot be said for radar sensors. LiDAR and radar share the commonality of measuring distances, but they are used in different environments. LiDAR tends to produce less noisy data and provides precise distance measurements, but it is highly affected by environmental factors like rain and fog. In contrast, radar is less impacted by environmental conditions but tends to generate noisier data. To reduce noise in radar data and enhance radar data augmentation, we propose a LiDAR-to-Radar translation method with a voxel feature extraction module, leveraging the fact that both sensors acquire data in a point-based manner. Because of the translation of high-quality LiDAR data into radar data, this becomes achievable. We demonstrate the superiority of our proposed method by acquiring and using data from both LiDAR and radar sensors in the same environment for validation.
25

Ning, Yaqian, Jie Cao, Chun Bao, and Qun Hao. "DVST: Deformable Voxel Set Transformer for 3D Object Detection from Point Clouds". Remote Sensing 15, no. 23 (December 3, 2023): 5612. http://dx.doi.org/10.3390/rs15235612.

Abstract:
The use of a transformer backbone in LiDAR point-cloud-based models for 3D object detection has recently gained significant interest. The larger receptive field of the transformer backbone improves its representation capability but also results in excessive attention being given to background regions. To solve this problem, we propose a novel approach called deformable voxel set attention, which we utilized to create a deformable voxel set transformer (DVST) backbone for 3D object detection from point clouds. The DVST aims to efficaciously integrate the flexible receptive field of the deformable mechanism and the powerful context modeling capability of the transformer. Specifically, we introduce the deformable mechanism into voxel-based set attention to selectively transfer candidate keys and values of foreground queries to important regions. An offset generation module was designed to learn the offsets of the foreground queries. Furthermore, a globally responsive convolutional feed-forward network with residual connection is presented to capture global feature interactions in hidden space. We verified the validity of the DVST on the KITTI and Waymo open datasets by constructing single-stage and two-stage models. The findings indicated that the DVST enhanced the average precision of the baseline model while preserving computational efficiency, achieving a performance comparable to state-of-the-art methods.
26

Wang, Jiachun, Junkui Song, Yizhe Zhang, and Hao Chen. "Design of 3D Display System for Intangible Cultural Heritage Based on Generative Adversarial Network". Scientific Programming 2022 (July 21, 2022): 1–12. http://dx.doi.org/10.1155/2022/2944750.

Abstract:
This paper designs a three-dimensional display system for intangible cultural heritage based on generative adversarial networks. The system function is realized through four modules: input module, data processing module, 3D model generation module, and model output module. Two 3D model reconstruction methods are used to realize the transformation from 2D images to 3D models. In the low-resolution Nuo surface 3D construction, multiresidual dense blocks are introduced and applied to the deep image super-resolution network. The experimental comparison results show that the quadratic optimization multifusion 3D construction model proposed in this paper can achieve considerable improvement and can improve the reconstruction accuracy by about 6.3%. In the high-resolution 3D construction of the Nuo surface, a generative adversarial network model is used to improve the generator, discriminator, and loss function of the original SRGAN model. Experimental results show that this method can generate super-resolution images with more realistic and natural depth maps. In addition, when it is used for high-resolution 3D Nuo surface sculpting, it can also generate 3D voxel Nuo surfaces with more details.
27

Xie, Liang, Chao Xiang, Zhengxu Yu, Guodong Xu, Zheng Yang, Deng Cai, and Xiaofei He. "PI-RCNN: An Efficient Multi-Sensor 3D Object Detector with Point-Based Attentive Cont-Conv Fusion Module". Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 07 (April 3, 2020): 12460–67. http://dx.doi.org/10.1609/aaai.v34i07.6933.

Abstract:
LIDAR point clouds and RGB-images are both extremely essential for 3D object detection. So many state-of-the-art 3D detection algorithms dedicate in fusing these two types of data effectively. However, their fusion methods based on Bird's Eye View (BEV) or voxel format are not accurate. In this paper, we propose a novel fusion approach named Point-based Attentive Cont-conv Fusion(PACF) module, which fuses multi-sensor features directly on 3D points. Except for continuous convolution, we additionally add a Point-Pooling and an Attentive Aggregation to make the fused features more expressive. Moreover, based on the PACF module, we propose a 3D multi-sensor multi-task network called Pointcloud-Image RCNN(PI-RCNN as brief), which handles the image segmentation and 3D object detection tasks. PI-RCNN employs a segmentation sub-network to extract full-resolution semantic feature maps from images and then fuses the multi-sensor features via powerful PACF module. Beneficial from the effectiveness of the PACF module and the expressive semantic features from the segmentation module, PI-RCNN can improve much in 3D object detection. We demonstrate the effectiveness of the PACF module and PI-RCNN on the KITTI 3D Detection benchmark, and our method can achieve state-of-the-art on the metric of 3D AP.
28

Liu, Huaijin, Jixiang Du, Yong Zhang, and Hongbo Zhang. "Enhancing Point Features with Spatial Information for Point-Based 3D Object Detection". Scientific Programming 2021 (December 21, 2021): 1–11. http://dx.doi.org/10.1155/2021/4650660.

Abstract:
Currently, there are many kinds of voxel-based multisensor 3D object detectors, while point-based multisensor 3D object detectors have not been fully studied. In this paper, we propose a new 3D two-stage object detection method based on point cloud and image fusion to improve the detection accuracy. To address the problem of insufficient semantic information of point cloud, we perform multiscale deep fusion of LiDAR point and camera image in a point-wise manner to enhance point features. Due to the imbalance of LiDAR points, the object point cloud in the long-distance area is sparse. We design a point cloud completion module to predict the spatial shape of objects in the candidate boxes and extract the structural information to improve the feature representation ability to further refine the boxes. The framework is evaluated on widely used KITTI and SUN-RGBD dataset. Experimental results show that our method outperforms all state-of-the-art point-based 3D object detection methods and has comparable performance to voxel-based methods as well.
29

Li, Yinhao, Zheng Ge, Guanyi Yu, Jinrong Yang, Zengran Wang, Yukang Shi, Jianjian Sun, and Zeming Li. "BEVDepth: Acquisition of Reliable Depth for Multi-View 3D Object Detection". Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 2 (June 26, 2023): 1477–85. http://dx.doi.org/10.1609/aaai.v37i2.25233.

Abstract:
In this research, we propose a new 3D object detector with a trustworthy depth estimation, dubbed BEVDepth, for camera-based Bird's-Eye-View~(BEV) 3D object detection. Our work is based on a key observation -- depth estimation in recent approaches is surprisingly inadequate given the fact that depth is essential to camera 3D detection. Our BEVDepth resolves this by leveraging explicit depth supervision. A camera-awareness depth estimation module is also introduced to facilitate the depth predicting capability. Besides, we design a novel Depth Refinement Module to counter the side effects carried by imprecise feature unprojection. Aided by customized Efficient Voxel Pooling and multi-frame mechanism, BEVDepth achieves the new state-of-the-art 60.9% NDS on the challenging nuScenes test set while maintaining high efficiency. For the first time, the NDS score of a camera model reaches 60%. Codes have been released.
30

Chang, Sungho, and Sang Chul Lee. "A Comparative Study on the Voxel Values in Alveolar Bones Acquired by MDCT and Newly Developed Dental Dual-Energy CBCT". Sensors 21, no. 22 (November 13, 2021): 7552. http://dx.doi.org/10.3390/s21227552.

Abstract:
The purpose of this study was to analyze the effectiveness of newly developed dental dual-energy (DE) cone-beam computed tomography (CBCT) to compare both the voxel values in hard bone tissue of DE-CBCT and multidetector computed tomography (MDCT) images, collected in a clinical trial conducted at Seoul National University Dental Hospital. A software implemented as a scripted module of a three-dimensional (3D) slicer was developed to register the volume data from the MDCT space to DE-CBCT, locate the same 3D regions of interest (ROIs) in each image space, and extract the statistics of the ROIs. The mean values were paired and used as representative values of the ROIs. A scatter plot with the line of equality and Bland–Altman (BA) plot of difference for a pair of measured means were used for statistical analysis. Of the ROI pairs, 96% were within ±15% from the identity line, and more than 95% of the measured ROI pairs were within the limits of agreement of the 95% confidence intervals (CIs), with the CI of the limits in BA plots. The newly developed dental DE-CBCT showed a level of voxel value accuracy similar to that of MDCT.
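
The agreement analysis in this study relies on Bland-Altman limits of agreement over paired ROI means. A minimal NumPy sketch of that computation follows, using synthetic stand-in values rather than the study's data.

```python
import numpy as np

def bland_altman(a, b):
    """Bland-Altman statistics for paired ROI means from two modalities.

    a, b : (N,) paired measurements (e.g. DE-CBCT vs. MDCT voxel-value means)
    Returns (bias, lower limit, upper limit) using mean difference +/- 1.96 SD.
    """
    diff = a - b
    bias = diff.mean()
    sd = diff.std(ddof=1)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

rng = np.random.default_rng(0)
mdct = rng.normal(900.0, 150.0, 60)          # toy ROI means for one modality
decbct = mdct + rng.normal(5.0, 30.0, 60)    # second modality with a small offset
bias, lo, hi = bland_altman(decbct, mdct)
print(f"bias={bias:.1f}, limits of agreement=({lo:.1f}, {hi:.1f})")
```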
31

Liu, Xinqi, Jituo Li, and Guodong Lu. "A New Volumetric Fusion Strategy with Adaptive Weight Field for RGB-D Reconstruction". Sensors 20, no. 15 (August 3, 2020): 4330. http://dx.doi.org/10.3390/s20154330.

Abstract:
High-quality 3D reconstruction results are very important in many application fields. However, current texture generation methods based on point sampling and fusion often produce blur. To solve this problem, we propose a new volumetric fusion strategy which can be embedded in the current online and offline reconstruction framework as a basic module to achieve excellent geometry and texture effects. The improvement comes from two aspects. Firstly, we establish an adaptive weight field to evaluate and adjust the reliability of data from RGB-D images by using a probabilistic and heuristic method. By using this adaptive weight field to guide the voxel fusion process, we can effectively preserve the local texture structure of the mesh, avoid wrong texture problems and suppress the influence of outlier noise on the geometric surface. Secondly, we use a new texture fusion strategy that combines replacement, integration, and fixedness operations to fuse and update voxel texture to reduce blur. Experimental results demonstrate that compared with the classical KinectFusion, our approach can significantly improve the accuracy in geometry and texture clarity, and can achieve equivalent texture reconstruction effects in real-time as the offline reconstruction methods such as intrinsic3d, even better in relief scenes.
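
The core of such volumetric fusion is a weighted running average per voxel, where the adaptive weight field decides how strongly each new observation is trusted. The sketch below shows that update rule in isolation; the weight values and cap are illustrative and do not reproduce the paper's heuristics.

```python
def fuse_voxel(value, weight, new_value, new_weight, max_weight=64.0):
    """Weighted running-average update of a voxel attribute (TSDF or color).

    value, weight         : current voxel estimate and accumulated confidence
    new_value, new_weight : incoming observation and its reliability weight
    A larger new_weight (more reliable frame data) pulls the estimate harder.
    """
    fused = (value * weight + new_value * new_weight) / (weight + new_weight)
    return fused, min(weight + new_weight, max_weight)

# toy usage: one voxel observed three times with different reliabilities
v, w = 0.0, 0.0
for obs, rel in [(0.8, 1.0), (0.7, 0.5), (-0.2, 0.05)]:   # last frame is unreliable
    v, w = fuse_voxel(v, w, obs, rel)
print(round(v, 3), round(w, 3))
```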
32

Alsadoon, Reem, and Trude Heift. "Textual Input Enhancement for Vowel Blindness: A Study with Arabic ESL Learners". Modern Language Journal 99, no. 1 (March 2015): 57–79. http://dx.doi.org/10.1111/modl.12188.

33

Wang, Likang, Yue Gong, Qirui Wang, Kaixuan Zhou, and Lei Chen. "Flora: Dual-Frequency LOss-Compensated ReAl-Time Monocular 3D Video Reconstruction". Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 2 (June 26, 2023): 2599–607. http://dx.doi.org/10.1609/aaai.v37i2.25358.

Abstract:
In this work, we propose a real-time monocular 3D video reconstruction approach named Flora for reconstructing delicate and complete 3D scenes from RGB video sequences in an end-to-end manner. Specifically, we introduce a novel method with two main contributions. Firstly, the proposed feature aggregation module retains both color and reliability in a dual-frequency form. Secondly, the loss compensation module solves missing structure by correcting losses for falsely pruned voxels. The dual-frequency feature aggregation module enhances reconstruction quality in both precision and recall, and the loss compensation module benefits the recall. Notably, both proposed contributions achieve great results with negligible inferencing overhead. Our state-of-the-art experimental results on real-world datasets demonstrate Flora's leading performance in both effectiveness and efficiency. The code is available at https://github.com/NoOneUST/Flora.
34

Ashmawy, Mostafa, Ashraf Abou-Khalaf, and Raghdaa Mostafa. "Effect of Voxel Size On The Accuracy of Nerve Tracing Module of Cone Beam Computed Tomography Images". Egyptian Dental Journal 63, no. 3 (July 1, 2017): 2403–12. http://dx.doi.org/10.21608/edj.2017.76057.

35

Liu, Zhe, Xin Zhao, Tengteng Huang, Ruolan Hu, Yu Zhou, and Xiang Bai. "TANet: Robust 3D Object Detection from Point Clouds with Triple Attention". Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 07 (April 3, 2020): 11677–84. http://dx.doi.org/10.1609/aaai.v34i07.6837.

Abstract:
In this paper, we focus on exploring the robustness of the 3D object detection in point clouds, which has been rarely discussed in existing approaches. We observe two crucial phenomena: 1) the detection accuracy of the hard objects, e.g., Pedestrians, is unsatisfactory, 2) when adding additional noise points, the performance of existing approaches decreases rapidly. To alleviate these problems, a novel TANet is introduced in this paper, which mainly contains a Triple Attention (TA) module, and a Coarse-to-Fine Regression (CFR) module. By considering the channel-wise, point-wise and voxel-wise attention jointly, the TA module enhances the crucial information of the target while suppresses the unstable cloud points. Besides, the novel stacked TA further exploits the multi-level feature attention. In addition, the CFR module boosts the accuracy of localization without excessive computation cost. Experimental results on the validation set of KITTI dataset demonstrate that, in the challenging noisy cases, i.e., adding additional random noisy points around each object, the presented approach goes far beyond state-of-the-art approaches. Furthermore, for the 3D object detection task of the KITTI benchmark, our approach ranks the first place on Pedestrian class, by using the point clouds as the only input. The running speed is around 29 frames per second.
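
The triple attention idea, jointly gating features channel-wise, point-wise and voxel-wise, can be illustrated with simple sigmoid gates. The NumPy sketch below is an approximation under assumed weight shapes, not the TANet implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def triple_attention(x, w_c, w_p, w_v):
    """Jointly re-weight voxelized point features channel-, point- and voxel-wise.

    x   : (V, N, C) features of N points in each of V voxels
    w_c : (C, C) channel scoring weights
    w_p : (C, 1) point scoring weights
    w_v : (C, 1) voxel scoring weights
    """
    channel_gate = sigmoid(x.mean(axis=1) @ w_c)[:, None, :]   # (V, 1, C)
    point_gate = sigmoid(x @ w_p)                              # (V, N, 1)
    voxel_gate = sigmoid(x.max(axis=1) @ w_v)[:, None, :]      # (V, 1, 1)
    return x * channel_gate * point_gate * voxel_gate

V, N, C = 10, 32, 64
x = np.random.randn(V, N, C).astype(np.float32)
out = triple_attention(x, np.random.randn(C, C), np.random.randn(C, 1),
                       np.random.randn(C, 1))
print(out.shape)  # (10, 32, 64)
```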
36

ABCHIR, H., e C. BLANCHET. "ON THE COMPUTATION OF THE TURAEV-VIRO MODULE OF A KNOT". Journal of Knot Theory and Its Ramifications 07, n. 07 (novembre 1998): 843–56. http://dx.doi.org/10.1142/s0218216598000437.

Testo completo
Gli stili APA, Harvard, Vancouver, ISO e altri
Abstract (sommario):
Let M be the manifold obtained by 0-framed surgery along a knot K in the 3-sphere. A Topological Quantum Field Theory assigns to a fundamental domain of the universal abelian cover of M an operator, whose non-nilpotent part is the Turaev-Viro module of K. In this paper, using surgery formulas, we give a matrix presentation for the Turaev-Viro module of any knot K, in the case of the (Vp, Zp) TQFT of Blanchet, Habegger, Masbaum and Vogel. We do the computation for a family of knots in the special case p = 8, and note the relation with the fibering question.
37

Barca, Patrizio, Daniela Marfisi, Chiara Marzi, Sabino Cozza, Stefano Diciotti, Antonio Claudio Traino e Marco Giannelli. "A Voxel-Based Assessment of Noise Properties in Computed Tomography Imaging with the ASiR-V and ASiR Iterative Reconstruction Algorithms". Applied Sciences 11, n. 14 (16 luglio 2021): 6561. http://dx.doi.org/10.3390/app11146561.

Testo completo
Gli stili APA, Harvard, Vancouver, ISO e altri
Abstract (sommario):
Given the inherent characteristics of nonlinearity and nonstationarity of iterative reconstruction algorithms in computed tomography (CT) imaging, this study aimed to perform, for the first time, a voxel-based characterization of noise properties in CT imaging with the ASiR-V and ASiR algorithms as compared with conventional filtered back projection (FBP). Multiple repeated scans of the Catphan-504 phantom were carried out. CT images were reconstructed using FBP and ASiR/ASiR-V with different blending levels of reconstruction (20%, 40%, 60%, 80%, 100%). Noise maps and their nonuniformity index (NUI) were obtained according to the approach proposed in the AAPM TG-233 report. For the homogeneous CTP486 module, ASiR-V/ASiR allowed a noise reduction of up to 63.7%/52.9% relative to FBP. While the noise reduction values of ASiR-V-/ASiR-reconstructed images ranged up to 33.8%/39.9% and 31.2%/35.5% for air and Teflon contrast objects, respectively, these values were approximately 60%/50% for other contrast objects (PMP, LDPE, polystyrene, acrylic, Delrin). Moreover, for all contrast objects but air and Teflon, ASiR-V showed a greater noise reduction potential than ASiR when the blending level was ≥40%. While noise maps of the homogeneous CTP486 module showed only a slight spatial variation of noise (NUI < 5.2%) for all reconstruction algorithms, the NUI values of iteratively reconstructed images of the nonhomogeneous CTP404 module increased nonlinearly with blending level and were 19%/15% and 6.7% for pure ASiR-V/ASiR and FBP, respectively. Overall, these results confirm the potential of ASiR-V and ASiR in reducing noise as compared with conventional FBP, suggesting, however, that the use of pure ASiR-V or ASiR might be suboptimal for specific clinical applications.
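For readers who want to reproduce a voxel-based noise analysis of this kind, the following minimal sketch estimates a noise map as the per-voxel standard deviation across repeated scans and summarizes its spatial variation with a simple nonuniformity index. The NUI formula used here (relative spread of local noise values, in percent) is an assumption standing in for the AAPM TG-233 definition applied in the paper, and the synthetic data are placeholders.

    # Hedged sketch: per-voxel noise map from repeated CT scans of a static phantom,
    # plus a simple nonuniformity index (NUI). The NUI formula here is an assumption,
    # not necessarily the exact AAPM TG-233 / paper definition.
    import numpy as np

    def noise_map(repeated_scans: np.ndarray) -> np.ndarray:
        # repeated_scans: (n_repeats, z, y, x) array of HU values
        return repeated_scans.std(axis=0, ddof=1)

    def nonuniformity_index(noise: np.ndarray, roi_mask: np.ndarray) -> float:
        vals = noise[roi_mask]
        # relative spread of local noise around its mean, in percent (assumed definition)
        return 100.0 * vals.std() / vals.mean()

    scans = np.random.normal(0.0, 10.0, size=(20, 16, 64, 64))   # synthetic repeats
    sigma = noise_map(scans)
    mask = np.ones_like(sigma, dtype=bool)                       # placeholder ROI
    print(f"NUI = {nonuniformity_index(sigma, mask):.1f}%")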
38

Wu, Lei, Jiewu Leng e Bingfeng Ju. "Digital Twins-Based Smart Design and Control of Ultra-Precision Machining: A Review". Symmetry 13, n. 9 (16 settembre 2021): 1717. http://dx.doi.org/10.3390/sym13091717.

Testo completo
Gli stili APA, Harvard, Vancouver, ISO e altri
Abstract (sommario):
Ultra-Precision Machining (UPM) is a highly accurate processing technology developed to satisfy the manufacturing requirements of high-end, cutting-edge products including nuclear energy producers, very large-scale integrated circuits, lasers, and aircraft. The information asymmetry phenomenon is widespread in the design and control of ultra-precision machining. It may lead to inconsistency between the designed and operational performance of UPM equipment in terms of stiffness, thermal stability, and motion accuracy; these properties result from the equipment's design, manufacturing, and control, and determine the form accuracy and surface roughness of machined parts. The performance of UPM equipment should therefore be improved continuously. Realizing real-time and self-adaptive control remains challenging, and building a high-fidelity, computationally efficient digital twin is a valuable solution. Nevertheless, the incorporation of digital twin technology into UPM design and control remains vague and sometimes contradictory. Based on a literature search in the Google Scholar database, this paper reviews the critical issues in UPM design and control and how digital twin technologies can promote them. Firstly, digital twins-based UPM design is reviewed, including bearings module design, spindle-drive module design, stage system module design, servo module design, and clamping module design. Secondly, digital twins-based UPM control studies are reviewed, including voxel modeling, process planning, process monitoring, vibration control, and quality prediction. The key enabling technologies and research directions of digital twins-based design and control are discussed to deal with the information asymmetry phenomenon in UPM.
39

Jinming, Chen. "Obstacle Detection Based on 3D Lidar Euclidean Clustering". Applied Science and Innovative Research 5, n. 3 (8 novembre 2021): p39. http://dx.doi.org/10.22158/asir.v5n3p39.

Testo completo
Gli stili APA, Harvard, Vancouver, ISO e altri
Abstract (sommario):
Environment perception is the basis of unmanned driving, and obstacle detection is an important research area within environment perception technology. In order to quickly and accurately identify obstacles in the direction of vehicle travel and obtain their location information, this paper designs a Euclidean-distance-based point cloud clustering obstacle detection algorithm built on the PCL (Point Cloud Library) function modules. Environmental information is obtained by 3D lidar and processed through ROI extraction, voxel-filter downsampling, outlier point filtering, ground point cloud segmentation, and Euclidean clustering, yielding a complete PCL-based 3D point cloud obstacle detection method. The experimental results show that the vehicle can effectively identify obstacles in the area and obtain their location information.
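The processing chain described above maps naturally onto off-the-shelf point cloud tooling. The paper works with the C++ PCL library; the sketch below reproduces the same steps in Python with Open3D, using DBSCAN as a stand-in for Euclidean clustering. The file name and all numeric parameters are placeholders, not values from the paper.

    # Hedged Python/Open3D sketch of the obstacle-detection pipeline described above.
    # DBSCAN stands in for PCL's Euclidean clustering; parameters are illustrative.
    import numpy as np
    import open3d as o3d

    pcd = o3d.io.read_point_cloud("scan.pcd")                     # placeholder file name

    # 1) voxel-filter downsampling
    pcd = pcd.voxel_down_sample(voxel_size=0.1)

    # 2) remove outlier points
    pcd, _ = pcd.remove_statistical_outlier(nb_neighbors=20, std_ratio=2.0)

    # 3) ground segmentation via RANSAC plane fitting
    plane, ground_idx = pcd.segment_plane(distance_threshold=0.2, ransac_n=3, num_iterations=200)
    obstacles = pcd.select_by_index(ground_idx, invert=True)

    # 4) cluster the remaining points; each cluster is a candidate obstacle
    labels = np.array(obstacles.cluster_dbscan(eps=0.5, min_points=10))
    for k in range(labels.max() + 1):
        cluster = obstacles.select_by_index(np.where(labels == k)[0])
        print(k, cluster.get_axis_aligned_bounding_box())          # obstacle location/extent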
40

Zhang, Zhikang, Zhongjie Zhu, Yongqiang Bai, Yiwen Jin e Ming Wang. "Multi-Scale Feature Fusion Point Cloud Object Detection Based on Original Point Cloud and Projection". Electronics 13, n. 11 (6 giugno 2024): 2213. http://dx.doi.org/10.3390/electronics13112213.

Testo completo
Gli stili APA, Harvard, Vancouver, ISO e altri
Abstract (sommario):
Existing point cloud object detection algorithms struggle to effectively capture spatial features across different scales, often responding poorly to changes in object size and offering limited feature extraction capability, which affects detection accuracy. To solve this problem, we present a point cloud object detection method based on multi-scale feature fusion of the original point cloud and its projection, which aims to improve the multi-scale behaviour and completeness of feature extraction in point cloud object detection. First, we design a 3D feature extraction module based on the 3D Swin Transformer. This module pre-processes the point cloud using a 3D Patch Partition approach and employs a self-attention mechanism within a 3D sliding window, along with a downsampling strategy, to effectively extract features at different scales. At the same time, we convert the 3D point cloud to a 2D image using projection and extract 2D features with the Swin Transformer. A 2D/3D feature fusion module then integrates the 2D and 3D features at the channel level through point-by-point addition and vector concatenation to improve feature completeness. Finally, the integrated feature maps are fed into the detection head for efficient object detection. Experimental results show that our method improves the average precision of vehicle detection on the KITTI dataset by 1.01% across the three difficulty levels compared to Voxel-RCNN. In addition, visualization analyses show that the proposed algorithm also exhibits superior performance in object detection.
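The channel-level fusion of 2D and 3D features by point-by-point addition and vector concatenation can be written down in a few lines. The module below is a hypothetical sketch: the linear projection used to align channel widths, the tensor shapes, and the final mixing layer are assumptions, not the paper's implementation.

    # Hedged sketch of fusing per-point 2D (projection) and 3D features by
    # element-wise addition plus channel concatenation; not the paper's exact module.
    import torch
    import torch.nn as nn

    class Fusion2D3DSketch(nn.Module):
        def __init__(self, c2d: int, c3d: int, c_out: int):
            super().__init__()
            self.align = nn.Linear(c2d, c3d)            # project 2D features to the 3D width
            self.mix = nn.Linear(2 * c3d, c_out)        # mix the concatenated result

        def forward(self, f2d, f3d):
            # f2d: (N, c2d) image features gathered per point; f3d: (N, c3d) point/voxel features
            f2d = self.align(f2d)
            added = f2d + f3d                            # point-by-point addition
            fused = torch.cat([added, f3d], dim=-1)      # vector concatenation
            return self.mix(fused)

    out = Fusion2D3DSketch(96, 64, 128)(torch.randn(1024, 96), torch.randn(1024, 64))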
41

Zheng, Wenqi, Han Xie, Yunfan Chen, Jeongjin Roh e Hyunchul Shin. "PIFNet: 3D Object Detection Using Joint Image and Point Cloud Features for Autonomous Driving". Applied Sciences 12, n. 7 (6 aprile 2022): 3686. http://dx.doi.org/10.3390/app12073686.

Testo completo
Gli stili APA, Harvard, Vancouver, ISO e altri
Abstract (sommario):
Owing to its wide range of applications, 3D object detection has attracted increasing attention in computer vision. Most existing 3D object detection methods are based on lidar point cloud data. However, these methods have limitations in localization consistency and classification confidence due to the irregularity and sparsity of Light Detection and Ranging (LiDAR) point cloud data. Inspired by the complementary characteristics of lidar and camera sensors, we propose a new end-to-end learnable framework named Point-Image Fusion Network (PIFNet) to integrate the LiDAR point cloud and camera images. To resolve the inconsistency between localization and classification, we designed an Encoder-Decoder Fusion (EDF) module that extracts image features effectively while maintaining the fine-grained localization information of objects. Furthermore, a new fusion module is proposed to integrate the color and texture features from images with the depth information from the point cloud. This module mitigates the irregularity and sparsity problems of point cloud features by capitalizing on the fine-grained information from camera images. In PIFNet, each intermediate feature map is fed into the fusion module to be integrated with its corresponding point-wise features. Point-wise features are used instead of voxel-wise features to reduce information loss. Extensive experiments on the KITTI dataset demonstrate the superiority of PIFNet over other state-of-the-art methods: compared with several of them, our approach outperforms by 1.97% in mean Average Precision (mAP) and by 2.86% in Average Precision (AP) for the hard cases on the KITTI 3D object detection benchmark.
42

Luo, Naisong, Rui Sun, Yuwen Pan, Tianzhu Zhang e Feng Wu. "Electron Microscopy Images as Set of Fragments for Mitochondrial Segmentation". Proceedings of the AAAI Conference on Artificial Intelligence 38, n. 4 (24 marzo 2024): 3981–89. http://dx.doi.org/10.1609/aaai.v38i4.28191.

Testo completo
Gli stili APA, Harvard, Vancouver, ISO e altri
Abstract (sommario):
Automatic mitochondrial segmentation enjoys great popularity with the development of deep learning. However, the coarse predictions produced by previous methods, whether 3D CNNs or vision transformers, that rely on regular 3D grids suggest a possibly sub-optimal feature arrangement. To mitigate this limitation, we interpret 3D EM image stacks as a set of interrelated 3D fragments for a better solution. However, it is non-trivial to model 3D fragments without introducing excessive computational overhead. In this paper, we design a coherent fragment vision transformer (FragViT) combined with affinity learning to manipulate features on 3D fragments while exploring mutual relationships to model fragment-wise context, enjoying a locality prior without sacrificing global reception. The proposed FragViT includes a fragment encoder and a hierarchical fragment aggregation module. The fragment encoder is equipped with affinity heads to transform the tokens into fragments with homogeneous semantics, and multi-layer self-attention is used to explicitly learn inter-fragment relations with long-range dependencies. The hierarchical fragment aggregation module progressively aggregates fragment-wise predictions back into the final voxel-wise prediction. Extensive experimental results on the challenging MitoEM, Lucchi, and AC3/AC4 benchmarks demonstrate the effectiveness of the proposed method.
43

Yoshida, Keisuke, Shijun Pan, Junichi Taniguchi, Satoshi Nishiyama, Takashi Kojima e Md Touhidul Islam. "Airborne LiDAR-assisted deep learning methodology for riparian land cover classification using aerial photographs and its application for flood modelling". Journal of Hydroinformatics 24, n. 1 (1 gennaio 2022): 179–201. http://dx.doi.org/10.2166/hydro.2022.134.

Testo completo
Gli stili APA, Harvard, Vancouver, ISO e altri
Abstract (sommario):
In response to challenges in land cover classification (LCC), many researchers have recently experimented with classification methods based on artificial intelligence techniques. For LCC mapping of the vegetated Asahi River in Japan, the current study uses the deep learning (DL)-based DeepLabV3+ module for image segmentation of aerial photographs. We modified the existing model so that the airborne laser bathymetry (ALB) dataset, including voxel-based laser points and vegetation height (i.e. digital surface model data minus digital terrain model data), is concatenated to its output. Findings revealed that the modified approach greatly improved the accuracy of LCC compared to our earlier unsupervised ALB-based method, with 25% and 35% improvements in overall accuracy and the macro F1-score, respectively, for the November 2017 dataset (no-leaf condition). Finally, by estimating flow-resistance parameters in flood modelling using LCC mapping-derived data, we conclude that the upgraded DL methodology produces a better fit between numerically analyzed and observed peak water levels.
44

Yu, Siyang, Si Sun, Wei Yan, Guangshuai Liu e Xurui Li. "A Method Based on Curvature and Hierarchical Strategy for Dynamic Point Cloud Compression in Augmented and Virtual Reality System". Sensors 22, n. 3 (7 febbraio 2022): 1262. http://dx.doi.org/10.3390/s22031262.

Testo completo
Gli stili APA, Harvard, Vancouver, ISO e altri
Abstract (sommario):
As an information-intensive 3D representation, point clouds are developing rapidly in immersive applications, which has also sparked new attention in point cloud compression. The most popular dynamic methods ignore the characteristics of point clouds and use an exhaustive neighborhood search, which seriously impacts the encoder's runtime. Therefore, we propose an improved compression method for dynamic point clouds based on curvature estimation and a hierarchical strategy to meet the demands of real-world scenarios. The method includes an initial segmentation derived from the similarity between normals, a curvature-based hierarchical refining process that iterates on the result, and image generation and video compression based on de-redundancy without performance loss. The curvature-based hierarchical refining module divides the voxelized point cloud into high-curvature and low-curvature points and optimizes the initial clusters hierarchically. The experimental results show that our method achieves improved compression performance and faster runtime than traditional video-based dynamic point cloud compression.
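The split into high- and low-curvature points can be approximated with the classic surface-variation measure computed from the eigenvalues of each point's local covariance matrix. The sketch below is a hedged illustration; the neighborhood size k and the threshold are arbitrary choices, not the paper's settings.

    # Hedged sketch: split a point cloud into high/low-curvature points using the
    # surface-variation measure lambda_min / (lambda_1 + lambda_2 + lambda_3).
    # The k and threshold values are illustrative, not the paper's parameters.
    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    def curvature_split(points: np.ndarray, k: int = 16, threshold: float = 0.02):
        nbrs = NearestNeighbors(n_neighbors=k).fit(points)
        _, idx = nbrs.kneighbors(points)
        curv = np.empty(len(points))
        for i, neigh in enumerate(idx):
            cov = np.cov(points[neigh].T)                 # 3x3 covariance of the neighborhood
            w = np.sort(np.linalg.eigvalsh(cov))          # ascending eigenvalues
            curv[i] = w[0] / max(w.sum(), 1e-12)          # surface variation in [0, 1/3]
        high = curv >= threshold
        return points[high], points[~high]

    pts = np.random.rand(2000, 3)                          # placeholder point cloud
    high_curv, low_curv = curvature_split(pts)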
45

Cui, Mingyue, Junhua Long, Mingjian Feng, Boyang Li e Huang Kai. "OctFormer: Efficient Octree-Based Transformer for Point Cloud Compression with Local Enhancement". Proceedings of the AAAI Conference on Artificial Intelligence 37, n. 1 (26 giugno 2023): 470–78. http://dx.doi.org/10.1609/aaai.v37i1.25121.

Testo completo
Gli stili APA, Harvard, Vancouver, ISO e altri
Abstract (sommario):
Point cloud compression with a higher compression ratio and tiny loss is essential for efficient data transportation. However, previous methods that depend on 3D convolutions or frequent multi-head self-attention operations incur huge computational costs. To address this problem, we propose an octree-based Transformer compression method called OctFormer, which does not rely on the occupancy information of sibling nodes. Our method uses non-overlapping context windows to construct octree node sequences and shares the result of a multi-head self-attention operation among a sequence of nodes. Besides, we introduce a local enhancement module for exploiting the sibling features and a positional encoding generator for enhancing the translation invariance of the octree node sequence. Compared to previous state-of-the-art works, our method obtains up to 17% bpp savings over the voxel-context-based baseline and saves an overall 99% of coding time compared to the attention-based baseline.
46

Qimin, Xu, Zhao Xin, Liao Longjie, Li Yameng e Li Na. "Efficient and Accurate Vehicle Localization Based on LiDAR Place Recognition". Information Technology and Control 52, n. 2 (15 luglio 2023): 562–75. http://dx.doi.org/10.5755/j01.itc.52.2.32690.

Testo completo
Gli stili APA, Harvard, Vancouver, ISO e altri
Abstract (sommario):
An efficient and accurate LiDAR place recognition methodology is proposed for vehicle localization. First, Iris-LOAM is proposed to overcome the low accuracy of loop-closure detection and the low efficiency of map construction in existing LOAM-series methods. The method integrates the LiDAR-Iris global descriptor and the Normal Distribution Transform (NDT) registration method into the loop-closure detection module of LiDAR Odometry and Mapping (LOAM), thereby improving the accuracy and efficiency of map construction. To address the low efficiency of map loading and matching, the Random Sample Consensus method is used to remove the ground point cloud, and the Voxel Grid method is used to downsample the loaded map. Finally, the NDT method is adopted for point cloud map matching to obtain the position information. Experiments show that Iris-LOAM is more efficient than SC-LeGO-LOAM, and the average time of point cloud map matching is reduced by 39.5%. The place recognition can thus be executed to achieve accurate vehicle localization.
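The map-matching side of such a pipeline (ground removal, voxel-grid downsampling of the loaded map, then scan-to-map registration) can be sketched with Open3D as below. Open3D ships no NDT matcher, so ICP is used here purely as a stand-in for the NDT step named in the paper; file names and thresholds are placeholders.

    # Hedged Open3D sketch of the map-matching side of the pipeline; ICP replaces NDT
    # purely for illustration, and the file names/thresholds are placeholders.
    import numpy as np
    import open3d as o3d

    map_cloud = o3d.io.read_point_cloud("prior_map.pcd")          # placeholder
    scan = o3d.io.read_point_cloud("current_scan.pcd")            # placeholder

    # remove ground points from the scan with RANSAC plane fitting
    _, ground_idx = scan.segment_plane(distance_threshold=0.2, ransac_n=3, num_iterations=200)
    scan = scan.select_by_index(ground_idx, invert=True)

    # voxel-grid downsample the loaded map to speed up matching
    map_cloud = map_cloud.voxel_down_sample(voxel_size=0.5)

    # register the scan against the map (ICP here; the paper uses NDT)
    result = o3d.pipelines.registration.registration_icp(
        scan, map_cloud, max_correspondence_distance=1.0,
        init=np.eye(4),
        estimation_method=o3d.pipelines.registration.TransformationEstimationPointToPoint())
    print(result.transformation)                                   # vehicle pose in the map frame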
47

Liu, Rongsheng, Xiaowei Liu, Chengfeng Peng, Anping Li e Yong Liao. "Automatic Brain Tumour Subregion Segmentation from Multimodal MRIs Fusing Muti-channel and Spatial Features". Journal of Physics: Conference Series 2449, n. 1 (1 marzo 2023): 012034. http://dx.doi.org/10.1088/1742-6596/2449/1/012034.

Testo completo
Gli stili APA, Harvard, Vancouver, ISO e altri
Abstract (sommario):
Accurately locating and segmenting brain tumours from 3D MRI images is essential for disease diagnosis, monitoring, and treatment planning. 3D segmentation from MRIs means classifying each voxel in 3D space, which is very conducive to the relevant biological measurements and further analysis of the lesion. Brain tumour segmentation from 3D biomedical images remains a challenging task worldwide due to the variousness of tumour features. The proposed network takes features from different parts of a U-Net and concatenates them after upsampling to the same scale. To grasp the channel weights and regions of interest, the bottleneck of the network is an improved dual-path attention module that combines the advantages of channel attention and spatial attention. The proposed model has been validated on the online BraTS 2018 dataset: the mean Dice score is 0.772 for enhancing tumours, 0.907 for the whole tumour, and 0.819 for the tumour core. The effectiveness of the proposed method is shown by quantitative and qualitative evaluation.
48

Gan, Xingli, Hao Shi, Shan Yang, Yao Xiao e Lu Sun. "MANet: End-to-End Learning for Point Cloud Based on Robust Pointpillar and Multiattention". Wireless Communications and Mobile Computing 2022 (14 settembre 2022): 1–12. http://dx.doi.org/10.1155/2022/6909314.

Testo completo
Gli stili APA, Harvard, Vancouver, ISO e altri
Abstract (sommario):
Detecting 3D objects in a crowd remains a challenging problem since cars and pedestrians often gather together and occlude each other in the real world. Pointpillar is a leading 3D object detector: its detection process is simple and its detection speed is fast. However, because maxpooling is used in the Voxel Feature Encode (VFE) stage to extract global features, fine-grained features are lost, leading to insufficient feature expressiveness in the feature pyramid network (FPN) stage, so the detection of small targets is not accurate enough. This paper proposes to improve the detection performance of such networks in complex environments by integrating attention mechanisms into Pointpillar. In the VFE stage of the model, a mixed-attention module (HA) is added to retain the spatial structure information of the point cloud as much as possible from three perspectives: local space, global space, and points. The Convolutional Block Attention Module (CBAM) is embedded in the FPN to mine the deep information of the pseudo-images. Experiments based on the KITTI dataset demonstrate that our method performs better than other state-of-the-art single-stage algorithms. In crowd scenes, the mean average precision (mAP) under the bird's-eye view (BEV) detection benchmark increases from 59.20% for Pointpillar and 66.19% for TANet to 69.91% for our method, the mAP under the 3D detection benchmark increases from 62% for TANet to 65.11% for ours, and the detection speed only drops from 13.1 fps for Pointpillar to 12.8 fps for ours.
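The CBAM block mentioned above is a standard channel-attention-then-spatial-attention module and is easy to sketch. The version below uses commonly cited defaults (reduction ratio 16, 7×7 spatial kernel) applied to BEV pseudo-image features; these settings and the input size are assumptions rather than the paper's exact configuration.

    # Hedged sketch of a CBAM-style block (channel attention followed by spatial
    # attention) as it could be applied to BEV pseudo-image features; the reduction
    # ratio and 7x7 kernel are common defaults, not necessarily the paper's settings.
    import torch
    import torch.nn as nn

    class CBAMSketch(nn.Module):
        def __init__(self, channels: int, reduction: int = 16):
            super().__init__()
            self.mlp = nn.Sequential(
                nn.Linear(channels, channels // reduction), nn.ReLU(),
                nn.Linear(channels // reduction, channels))
            self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3)

        def forward(self, x):                              # x: (B, C, H, W)
            avg = self.mlp(x.mean(dim=(2, 3)))             # channel attention from average pool
            mx = self.mlp(x.amax(dim=(2, 3)))              # ...and from max pool
            x = x * torch.sigmoid(avg + mx)[:, :, None, None]
            s = torch.cat([x.mean(dim=1, keepdim=True), x.amax(dim=1, keepdim=True)], dim=1)
            return x * torch.sigmoid(self.spatial(s))      # spatial attention map (B, 1, H, W)

    out = CBAMSketch(64)(torch.randn(2, 64, 248, 216))     # BEV pseudo-image sized input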
49

LUCCONI, GIULIA, CHIARA ROMEO, ROBERTO BONETTI, PATRIZIA CENNI e NICOLETTA SCRITTORI. "DIFFUSION MRI-BASED FIBER TRACKING IN HEALTHY AND BRAIN INJURY PATIENTS: A COMPARISON OF DIFFERENT SOFTWARE TOOLS". Journal of Mechanics in Medicine and Biology 15, n. 02 (aprile 2015): 1540020. http://dx.doi.org/10.1142/s0219519415400205.

Testo completo
Gli stili APA, Harvard, Vancouver, ISO e altri
Abstract (sommario):
In this work, we compared different DTI-based fibertracking software packages using deterministic and probabilistic approaches. DTI brain images of 35 healthy and five brain-injury patients were acquired with a Philips Achieva 1.5 T scanner using an EPI-SE DTI sequence with 16 diffusion directions. Images were analyzed with the Philips FiberTrack module, DTI-Studio, and FSL. We studied the corticospinal tract and corpus callosum, considering different termination criteria for the fibertracking algorithm. Group studies were performed to create a database of healthy patients. Results of FSL fibertracking with 1 or 2 fibers per voxel were not statistically different. T-tests between Philips and DTI-Studio led to p-values > 0.01 for the corticospinal tract and < 0.01 for the corpus callosum. FSL analysis led to higher ADC and lower FA values, with significant differences from the other software. In brain-injury patients we measured different fiber orientations, reduced FA, and increased ADC around the lesion. In conclusion, although DTI fibertracking is a promising non-invasive preoperative imaging tool, the outcome is strongly influenced by the algorithm used and the parameters chosen for seed generation and fiber propagation.
50

Peng, Zhao, Yu Lu, Yao Xu, Yongzhe Li, Bo Cheng, Ming Ni, Zhi Chen et al. "Development of a GPU-accelerated Monte Carlo dose calculation module for nuclear medicine, ARCHER-NM: demonstration for a PET/CT imaging procedure". Physics in Medicine & Biology 67, n. 6 (17 marzo 2022): 06NT02. http://dx.doi.org/10.1088/1361-6560/ac58dd.

Testo completo
Gli stili APA, Harvard, Vancouver, ISO e altri
Abstract (sommario):
Objective. This paper describes the development and validation of a GPU-accelerated Monte Carlo (MC) dose computing module dedicated to organ dose calculations of individual patients undergoing nuclear medicine (NM) internal radiation exposures involving PET/CT examination. Approach. This new module extends the more-than-10-years-long ARCHER project, which developed a GPU-accelerated MC dose engine, by adding dedicated NM source-definition features. To validate the code, we compared dose distributions from point ion sources, including 18F, 11C, 15O, and 68Ga, calculated in a water phantom against a well-tested MC code, GATE. To demonstrate the clinical utility and advantage of ARCHER-NM, one set of 18F-FDG PET/CT data for an adult male NM patient is calculated using the new code. Radiosensitive organs in the CT dataset are segmented using a CNN-based tool called DeepViewer. The PET image intensity maps are converted to radioactivity distributions to allow for MC radiation transport dose calculations at the voxel level. The dose rate maps and corresponding statistical uncertainties were calculated at the acquisition time of the PET image. Main results. The water-phantom results show excellent agreement, suggesting that the radiation physics module in the new NM code is adequate. The dose rate results for the 18F-FDG PET imaging patient show that ARCHER-NM agrees very well with GATE, within −2.45% to 2.58% for the 28 organs considered in this study. Most impressively, ARCHER-NM obtains these results in 22 s, while GATE takes about 180 min for the same 5 × 10^8 simulated decay events. Significance. This is the first study presenting GPU-accelerated patient-specific MC internal radiation dose rate calculations for a clinically realistic 18F-FDG PET/CT imaging case involving autosegmentation of whole-body PET/CT images. This study suggests that the proposed computing tool, ARCHER-NM, is accurate and fast enough for routine internal dosimetry in NM clinics.
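As a rough intuition for the voxel-level bookkeeping involved, the sketch below converts a PET intensity volume into an activity map and a crude dose-rate estimate under a local-energy-deposition approximation. Every constant in it (calibration factor, voxel size, locally deposited energy per decay) is an illustrative assumption, and the approximation stands in for, and is far cruder than, the full Monte Carlo radiation transport that ARCHER-NM performs.

    # Hedged sketch: convert a PET intensity map to a voxel-wise activity map and a
    # crude dose-rate estimate under a local-energy-deposition approximation. This is
    # an illustration only; ARCHER-NM computes dose with full Monte Carlo transport.
    import numpy as np

    pet = np.random.rand(64, 64, 32)             # placeholder PET intensity volume
    cal = 1.0e3                                   # Bq/mL per intensity unit (assumed calibration)
    voxel_ml = 0.2 * 0.2 * 0.2                    # voxel volume in mL (2 mm isotropic, assumed)
    density = 1.0e-3                              # kg/mL, water-equivalent tissue (assumed)

    activity = pet * cal * voxel_ml               # Bq (decays/s) per voxel
    e_local = 4.0e-14                             # J deposited locally per decay (assumed, ~0.25 MeV)
    dose_rate = activity * e_local / (density * voxel_ml)   # Gy/s per voxel, crude estimate
    print(dose_rate.mean())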
